Reinforcement Learning Diary
My Non-Linear Path to OpenAI - (Part 1 of 2)
A Journey in Endless Curiosity - The Pre-ChatGPT Era
Dec 26, 2025
•
Vignesh Ramesh
November 2025
Dissecting a Language Model
What cutting open the different layers of a large language model tells us about its real self.
Nov 1, 2025
•
Vignesh Ramesh
July 2025
The $100 Agents
A new project to train task-specific agents, powered by language models tuned with reinforcement learning, on a compute budget of $100.
Jul 13, 2025
•
Vignesh Ramesh
The Entropy Conundrum
Post-training with Reinforcement Learning and its impact on the entropy of the model
Jul 7, 2025
•
Vignesh Ramesh
Why does RLDiary exist?
Firsthand account of the challenges and insights in applying reinforcement learning to language model–based agents, with a focus on environment design…
Jul 4, 2025
•
Vignesh Ramesh