VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO (arxiv.org) | 1 point by timhigins | Jun 23, 2026 | 0 comments on HN