News
We train LLMs like dogs, not raise them: RLHF and sycophancy (old.reddit.com)
▲ 1 | by musculus | 0 comments
Show HN: ECX a 'Jail-Fix' for RLHF Neutrality Loops in LLMs (zenodo.org)
▲ 1 | by Weatherill | 0 comments
Show HN: A Homeostatic Logic-Funnel to Prevent RLHF Overrides in LLM Personas (zenodo.org)
▲ 1 | by Weatherill | 1 comment
Show HN: We filed 99 patents for deterministic AI governance (Prior Art vs. RLHF)
▲ 1 | by genesalvatore | 0 comments
The Yellow Wallpaper Problem: RLHF Safety Training as Ontology Enforcement (github.com)
▲ 1 | by palmerschallon | 0 comments
RLHF from Scratch (github.com)
▲ 1 | by onurkanbkrc | 0 comments
Reducing RLHF-induced hallucinations and sycophancy in Gemini 3-Interactive Demo (tomaszmachnik.pl)
▲ 1 | by musculus | 0 comments
RLHF Sycophancy: Gemini 3.0 discards calculated data to mimic user edits (tomaszmachnik.pl)
▲ 1 | by musculus | 0 comments
Thermodynamic Alignment: Replacing RLHF with Entropic Loss Functions (zenodo.org)
▲ 1 | by NyX_AI_ZERO_DAY | 1 comment