Reinforcement Learning Diary

The $100 Agents
A new project to train task-specific agents powered by reinforcement-learning-tuned language models, with a compute budget of $100.
Jul 13 • Vignesh Ramesh
The Entropy Conundrum
Post-training with reinforcement learning and its impact on the entropy of the model
Jul 7 • Vignesh Ramesh
Why does RLDiary exist?
A firsthand account of the challenges and insights in applying reinforcement learning to language model–based agents, with a focus on environment design…
Jul 4 • Vignesh Ramesh
© 2025 Vignesh Ramesh