News
129 points | Transformers know more than they can tell: Learning the Collatz sequence (arxiv.org) | by Xcelerate | 45 comments
78 points | Weight-sparse transformers have interpretable circuits [pdf] (cdn.openai.com) | by 0x79de | 46 comments
50 points | Brain-IT: Image Reconstruction from fMRI via Brain-Interaction Transformer (AmitZalcher.github.io) | by SerCe | 10 comments
9 points | Out-of-Distribution Generalization in Transformers via Latent Space Reasoning (arxiv.org) | by marojejian | 1 comment
7 points | Transformers v5 Is Out (huggingface.co) | by unofficialmerve | 1 comment
6 points | Show HN: Pulse-Field – O(N) AI Architecture (12x faster than Transformers) (github.com) | by makimilan | 8 comments
5 points | Show HN: Wasda – Experience transformer attention as music (github.com) | by kinders | 0 comments
3 points | Stronger Normalization-Free Transformers (arxiv.org) | by mfiguiere | 0 comments
3 points | An AI Startup Looks Toward the Post-Transformer Era (wsj.com) | by fortran77 | 1 comment
3 points | What's Next for AI? OpenAI's Łukasz Kaiser (Transformer Co-Author) [video] (youtube.com) | by abrichr | 0 comments
3 points | Z-Image: Efficient Image Gen Model with Single-Stream Diffusion Transformer (tongyi-mai.github.io) | by SerCe | 0 comments
3 points | Parallel Loop Transformer for Efficient Test-Time Computation Scaling (arxiv.org) | by PaulHoule | 0 comments
3 points | Symmetric Power Transformers (manifestai.com) | by ashvardanian | 0 comments
3 points | The Transformer and the Hash: building blocks of 21st century political science (nothinghuman.substack.com) | by ivee | 0 comments
2 points | A biologically inspired cognitive architecture without Transformers (github.com) | by Brain_cognitive | 1 comment
2 points | Securing America's grid: a strategic transformer reserve (breakingdefense.com) | by jrpt | 0 comments
2 points | Get to Grips with Transformers and LLMs (i-programmer.info) | by aquastorm | 0 comments
2 points | 2015 radio interview: AI as "high-level algebra" before Transformers and LLMs (doomlaser.com) | by doomlaser | 0 comments
2 points | Why are Transformers replacing CNNs? [video] (youtube.com) | by chii | 0 comments
2 points | Transformers v5.0 by HuggingFace (huggingface.co) | by satvikpendem | 0 comments
2 points | Porting Nanochat to Transformers (huggingface.co) | by us321 | 0 comments
2 points | Turbine Transport Transformer (mitxela.com) | by mhb | 0 comments
2 points | Show HN: PDFClear – Browser-based PDF tools with local AI (WASM+Transformers.js) (pdfclear.com) | by aliansari22 | 1 comment
2 points | Show HN: Aion-Torch – Adaptive residual scaling for deep Transformers (github.com) | by Rioverde | 0 comments
2 points | Who Invented Transformer Neural Networks? (people.idsia.ch) | by puttycat | 0 comments
2 points | Show HN: Run HF Transformers in pure Go (10 MB binary, no Python) (github.com) | by openfluke | 0 comments
2 points | The Curved Spacetime of Transformer Architectures (arxiv.org) | by luis_likes_math | 1 comment
1 point | Show HN: Reeyee.ai – AI image style transformer (reeyee.ai) | by jokera | 0 comments
1 point | Transformers and Evolution (symmetrybroken.com) | by riemannzeta | 0 comments
1 point | Building a Minimal Transformer for 10-digit Addition (alexlitzenberger.com) | by kelseyfrog | 0 comments
1 point | Building a Minimal Transformer for 10-digit Addition (alexlitzenberger.com) | by alexlitz | 0 comments
1 point | Mixture of Experts (MoEs) in Transformers (huggingface.co) | by ibobev | 0 comments
1 point | Smallest transformer that can add two 10-digit numbers (github.com) | by ks2048 | 0 comments
1 point | TranslateGemma now runs 100% in the browser on WebGPU with Transformers.js v4 (huggingface.co) | by tzury | 1 comment
1 point | Thinking Like Transformer (srush.github.io) | by vinhnx | 0 comments
1 point | Show HN: Doppler.js – WebGPU inference, faster/simpler than transformer.js | by clocksmith | 0 comments
1 point | Build a Transformer with Jax [video] (youtube.com) | by apitman | 0 comments
1 point | Show HN: AI Timeline – 171 LLMs from Transformer (2017) to GPT-5.3 (2026) (llm-timeline.com) | by ai_bot | 0 comments
1 point | Wave Field LLM – O(nlogn) Transformer Alternative (github.com) | by mrtb | 0 comments
1 point | Your Transformer Is Secretly an EOT Solver (elonlit.com) | by Anon84 | 0 comments
1 point | LLMs create their smallest transformer for 10-digit addition (twitter.com) | by marojejian | 1 comment
1 point | Retrieval-Aware Distillation for Transformer-SSM Hybrids (arxiv.org) | by readitalready | 0 comments
1 point | Transformer-Based Memory Forecasting (novice.media) | by kirillzubovsky | 0 comments
1 point | Transformers.js v4 Preview: Now Available on NPM (huggingface.co) | by ibobev | 0 comments
1 point | Transfer learning and Transformer models (ML Tech Talks) [video] (youtube.com) | by onurkanbkrc | 0 comments
1 point | So what's the next word, then? Almost-no-math intro to transformer models (matthias-kainer.de) | by oesimania | 0 comments
1 point | End-to-End Transformer Acceleration Through Processing-in-Memory Architectures (arxiv.org) | by PaulHoule | 0 comments
1 point | Generative Pen-Trained Transformer (theodore.net) | by Twarner | 0 comments
1 point | LLatte: Scalable Transformers for Ads at Meta (twitter.com) | by LatteMetaAI | 0 comments