▲
207
CS234: Reinforcement Learning Winter 2025
(web.stanford.edu)
by jonbaer |
view
|
60 comments
▲
4
TMLR: Outcome-Based Reinforcement Learning to Predict the Future
(openreview.net)
by bturtel |
view
|
1 comment
▲
3
Olympiad-level formal mathematical reasoning with reinforcement learning
(nature.com)
by mauricioc |
view
|
0 comments
▲
3
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
(github.com)
by yhzan |
view
|
1 comment
▲
2
An FAQ on Reinforcement Learning Environments
(epoch.ai)
by dcre |
view
|
0 comments
▲
2
Reinforcement Learning Control of Quantum Error Correction
(arxiv.org)
by SweetSoftPillow |
view
|
0 comments
▲
1
Solving Physics Olympiad via reinforcement learning on physics simulators
(sim2reason.github.io)
by ivansavz |
view
|
0 comments
▲
1
Autonomous Rocket Landing with Reinforcement Learning (YouTube)
(youtube.com)
by rafacm |
view
|
1 comment
▲
1
Formalizing the "generative crash" via inverse reinforcement learning
by abrahamhaskins |
view
|
0 comments
▲
1
Show HN: REST API for Gymnasium (fka OpenAI Gym) reinforcement learning library
(github.com)
by cloudkj |
view
|
0 comments
▲
1
What is reinforcement learning finetuning
(youtube.com)
by kumama |
view
|
0 comments
▲
1
How LLMs Got Good: Humility, Tools, and Reinforcement Learning
(medium.com)
by tudorhn |
view
|
0 comments
▲
1
Hamilton-Jacobi-Bellman Equation: Reinforcement Learning and Diffusion Models
(dani2442.github.io)
by sebzuddas |
view
|
0 comments
▲
1
Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents
(arxiv.org)
by brandonb |
view
|
0 comments
▲
1
Rust-accelerated reinforcement learning, 140x faster than Python
(github.com)
by wkowalpl |
view
|
1 comment
▲
1
Reinforcement Learning (I.e. Policy Gradient Algorithms)
(rlhfbook.com)
by vinhnx |
view
|
0 comments
▲
1
Reinforcement Learning environments and how to build them
(unsloth.ai)
by vinhnx |
view
|
0 comments
▲
1
AI Gold Trading Bot reinforcement learning system for autonomous XAUUSD trading
(github.com)
by solosquad |
view
|
0 comments
▲
1
How Well Does Reinforcement Learning Scale?
(tobyord.com)
by AntiDyatlov |
view
|
0 comments
▲
1
A Reinforcement Learning Environment for Automatic Code Optimization in MLIR
(arxiv.org)
by matt_d |
view
|
0 comments
▲
1
Why reinforcement learning breaks at scale, and how a new method fixes it
(techxplore.com)
by brandonb |
view
|
0 comments
▲
1
Reinforcement Learning for LLMs
(mesuvash.github.io)
by gmays |
view
|
0 comments
▲
1
Notes on Reinforcement Learning
(mattlanders.net)
by sato_sakura |
view
|
0 comments
▲
1
Intuitive Intro to Reinforcement Learning for LLMs
(mesuvash.github.io)
by mesuvash |
view
|
0 comments
▲
1
Experiential Reinforcement Learning
(arxiv.org)
by geophile |
view
|
0 comments
▲
1
Composition-RL: Compose Verifiable Prompts for Reinforcement Learning of LLMs
(arxiv.org)
by gmays |
view
|
0 comments
▲
1
Maximum Likelihood Reinforcement Learning
(zanette-labs.github.io)
by belleville |
view
|
0 comments
▲
1
A Survey of In-Context Reinforcement Learning
(arxiv.org)
by handfuloflight |
view
|
0 comments
▲
1
A reinforcement learning agent that learns to play Kung Fu Master
(shantanugoel.com)
by devnonymous |
view
|
0 comments
▲
1
Reinforcement Learning from Human Feedback
(arxiv.org)
by onurkanbkrc |
view
|
0 comments
▲
1
HelloRL: A modular framework, like Lego for Reinforcement Learning
(github.com)
by AndrewHart |
view
|
1 comment
▲
1
When deep reinforcement learning meet trading
(github.com)
by solosquad |
view
|
1 comment
▲
1
Reinforcement learning for humans – Quiz your understanding
(cramsandwich.com)
by filepod |
view
|
0 comments
▲
1
Why reinforcement learning plateaus without representation depth (NeurIPS 2025)
(venturebeat.com)
by brandonb |
view
|
0 comments
▲
1
An FAQ on Reinforcement Learning Environments
(epoch.ai)
by dcre |
view
|
0 comments
▲
1
Deep reinforcement learning trading bot 90%-120% returns yearly
(github.com)
by solosquad |
view
|
0 comments
▲
1
Show HN: Reinforcement learning tic-tac-toe in C, annotated
(github.com)
by aportnoy |
view
|
0 comments
▲
1
Prompt optimization can outperform reinforcement learning on LLMs
(sderosiaux.substack.com)
by chtefi |
view
|
0 comments
▲
1
Reinforcement learning without human annotations
(twitter.com)
by armytricks |
view
|
0 comments
▲
1
Potential-Based Reward Shaping in Reinforcement Learning
(medium.com)
by brandonb |
view
|
0 comments
▲
1
Deep reinforcement learning trading bot 70%-100% returns yearly
(github.com)
by solosquad |
view
|
0 comments
▲
1
Wii Reinforcement Learning
(github.com)
by arvindh-manian |
view
|
0 comments
▲
1
rLLM: Reinforcement Learning for Language Agents
(rllm-project.readthedocs.io)
by jonbaer |
view
|
0 comments
▲
1
Dark Forest Theory and Multi-Agent Reinforcement Learning (2023)
(hal.science)
by hamburgererror |
view
|
0 comments
▲
1
Reinforcement Learning Infrastructure for LLM Agents
(github.com)
by bakigul |
view
|
0 comments
▲
1
Bitwise Consistent On-Policy Reinforcement Learning with VLLM and TorchTitan
(blog.vllm.ai)
by brrrrrm |
view
|
0 comments