Reinforcement Learning Diary


Dissecting a Language Model
What cutting open the different layers of a large language model tells us about its real self.
Nov 1 • Vignesh Ramesh

July 2025

The $100 Agents
A new project to train task-specific agents powered by language models tuned with reinforcement learning, on a compute budget of $100.
Jul 13 • Vignesh Ramesh
The Entropy Conundrum
Post-training with Reinforcement Learning and its impact on the entropy of the model
Jul 7 • Vignesh Ramesh
Why does RLDiary exist?
Firsthand account of the challenges and insights in applying reinforcement learning to language model–based agents, with a focus on environment design…
Jul 4 • Vignesh Ramesh
© 2025 Vignesh Ramesh