Reinforcement Learning Diary

My Non-Linear Path to OpenAI - (Part 1 of 2)
A Journey in Endless Curiosity - The Pre-ChatGPT Era
Dec 26, 2025 • Vignesh Ramesh

November 2025

Dissecting a Language Model
What cutting open the different layers of a large language model tells us about its real self.
Nov 1, 2025 • Vignesh Ramesh

July 2025

The $100 Agents
A new project to train task-specific agents, powered by language models tuned with Reinforcement Learning, on a compute budget of $100.
Jul 13, 2025 • Vignesh Ramesh
The Entropy Conundrum
Post-training with Reinforcement Learning and its impact on the entropy of the model
Jul 7, 2025 • Vignesh Ramesh
Why does RLDiary exist?
Firsthand account of the challenges and insights in applying reinforcement learning to language model–based agents, with a focus on environment design…
Jul 4, 2025 • Vignesh Ramesh
© 2026 Vignesh Ramesh