▲
207
CS234: Reinforcement Learning Winter 2025
(web.stanford.edu)
by jonbaer |
view
|
60 comments
▲
4
TMLR: Outcome-Based Reinforcement Learning to Predict the Future
(openreview.net)
by bturtel |
view
|
1 comment
▲
3
Olympiad-level formal mathematical reasoning with reinforcement learning
(nature.com)
by mauricioc |
view
|
0 comments
▲
3
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
(github.com)
by yhzan |
view
|
1 comment
▲
2
Reinforcement Learning Control of Quantum Error Correction
(arxiv.org)
by SweetSoftPillow |
view
|
0 comments
▲
1
How Well Does Reinforcement Learning Scale?
(tobyord.com)
by AntiDyatlov |
view
|
0 comments
▲
1
A Reinforcement Learning Environment for Automatic Code Optimization in MLIR
(arxiv.org)
by matt_d |
view
|
0 comments
▲
1
Why reinforcement learning breaks at scale, and how a new method fixes it
(techxplore.com)
by brandonb |
view
|
0 comments
▲
1
Reinforcement Learning for LLMs
(mesuvash.github.io)
by gmays |
view
|
0 comments
▲
1
Notes on Reinforcement Learning
(mattlanders.net)
by sato_sakura |
view
|
0 comments
▲
1
Intuitive Intro to Reinforcement Learning for LLMs
(mesuvash.github.io)
by mesuvash |
view
|
0 comments
▲
1
Experiential Reinforcement Learning
(arxiv.org)
by geophile |
view
|
0 comments
▲
1
Composition-RL: Compose Verifiable Prompts for Reinforcement Learning of LLMs
(arxiv.org)
by gmays |
view
|
0 comments
▲
1
Maximum Likelihood Reinforcement Learning
(zanette-labs.github.io)
by belleville |
view
|
0 comments
▲
1
A Survey of In-Context Reinforcement Learning
(arxiv.org)
by handfuloflight |
view
|
0 comments
▲
1
A reinforcement learning agent that learns to play Kung Fu Master
(shantanugoel.com)
by devnonymous |
view
|
0 comments
▲
1
Reinforcement Learning from Human Feedback
(arxiv.org)
by onurkanbkrc |
view
|
0 comments
▲
1
HelloRL: A modular framework, like Lego for Reinforcement Learning
(github.com)
by AndrewHart |
view
|
1 comment
▲
1
When deep reinforcement learning meet trading
(github.com)
by solosquad |
view
|
1 comment
▲
1
Reinforcement learning for humans – Quiz your understanding
(cramsandwich.com)
by filepod |
view
|
0 comments
▲
1
Why reinforcement learning plateaus without representation depth (NeurIPS 2025)
(venturebeat.com)
by brandonb |
view
|
0 comments
▲
1
An FAQ on Reinforcement Learning Environments
(epoch.ai)
by dcre |
view
|
0 comments
▲
1
Deep reinforcement learning trading bot 90%-120% returns yearly
(github.com)
by solosquad |
view
|
0 comments
▲
1
Show HN: Reinforcement learning tic-tac-toe in C, annotated
(github.com)
by aportnoy |
view
|
0 comments
▲
1
Prompt optimization can outperform reinforcement learning on LLMs
(sderosiaux.substack.com)
by chtefi |
view
|
0 comments
▲
1
Reinforcement learning without human annotations
(twitter.com)
by armytricks |
view
|
0 comments
▲
1
Potential-Based Reward Shaping in Reinforcement Learning
(medium.com)
by brandonb |
view
|
0 comments
▲
1
Deep reinforcement learning trading bot 70%-100% returns yearly
(github.com)
by solosquad |
view
|
0 comments
▲
1
Wii Reinforcement Learning
(github.com)
by arvindh-manian |
view
|
0 comments
▲
1
rLLM: Reinforcement Learning for Language Agents
(rllm-project.readthedocs.io)
by jonbaer |
view
|
0 comments
▲
1
Dark Forest Theory and Multi-Agent Reinforcement Learning (2023)
(hal.science)
by hamburgererror |
view
|
0 comments
▲
1
Reinforcement Learning Infrastructure for LLM Agents
(github.com)
by bakigul |
view
|
0 comments
▲
1
Bitwise Consistent On-Policy Reinforcement Learning with VLLM and TorchTitan
(blog.vllm.ai)
by brrrrrm |
view
|
0 comments