Reinforcement Learning Diary
The $100 Agents
A new project to train task-specific agents, powered by language models tuned with Reinforcement Learning, on a compute budget of $100.
Jul 13 • Vignesh Ramesh
The Entropy Conundrum
Post-training with Reinforcement Learning and its impact on the entropy of the model
Jul 7 • Vignesh Ramesh
Why does RLDiary exist?
Firsthand account of the challenges and insights in applying reinforcement learning to language model–based agents, with a focus on environment design…
Jul 4 • Vignesh Ramesh