News
▲
124
Benchmarking leading AI agents against Google reCAPTCHA v2
(research.roundtable.ai)
by mdahardy |
view
|
97 comments
▲
53
Drawing Text Isn't Simple: Benchmarking Console vs. Graphical Rendering
(cv.co.hu)
by PaulHoule |
view
|
41 comments
▲
31
How Good Are Chinese CPUs? Benchmarking the Loongson 3A6000
(lemire.me)
by ashvardanian |
view
|
1 comment
▲
28
Benchmarking the Most Reliable Document Parsing API
(tensorlake.ai)
by calavera |
view
|
14 comments
▲
10
Benchmarking NVENC video transcoding on the Pi
(jeffgeerling.com)
by ingve |
view
|
0 comments
▲
5
Benchmarking KDB-X vs. QuestDB, ClickHouse, TimescaleDB and InfluxDB
(kx.com)
by rustc |
view
|
0 comments
▲
4
Benchmarking my Redis clone in Zig (a web dev learning systems)
(charlesfonseca.substack.com)
by barddoo |
view
|
1 comment
▲
3
CodSpeed CLI: Deterministic benchmarking for any executable
(github.com)
by art049 |
view
|
0 comments
▲
3
Benchmarking GPT-5.1 vs. Gemini 3.0 vs. Opus 4.5 across 3 Coding Tasks
(blog.kilo.ai)
by heymax054 |
view
|
0 comments
▲
3
Benchmarking LLMs at the Frontier of Physics
(artificialanalysis.ai)
by mustaphah |
view
|
0 comments
▲
3
Benchmarking Language Implementations: Am I doing it right? Get Early Feedback
(stefan-marr.de)
by speckx |
view
|
0 comments
▲
3
Powering AI at Scale: Benchmarking 1B Vectors in YugabyteDB
(yugabyte.com)
by ashvardanian |
view
|
0 comments
▲
3
Benchmarking the Cost of Java's EnumSet – A Second Look
(kinnen.de)
by birdculture |
view
|
0 comments
▲
3
Benchmarking multilingual long-context language models
(arxiv.org)
by sysoleg |
view
|
0 comments
▲
2
Reverse Benchmarking
(dominiknitsch.com)
by wseqyrku |
view
|
0 comments
▲
2
Benchmarking node collision algorithms for React/Svelte Flow
(xyflow.com)
by moklick |
view
|
0 comments
▲
2
Show HN: Benchmark-ips-Python – benchmarking tool for Python
(github.com)
by Igor_Wiwi |
view
|
0 comments
▲
2
Benchmarking Checksum Tools
(heitorpb.github.io)
by furkansahin |
view
|
0 comments
▲
2
Dell Pro Max with GB10 Arrives for Linux Performance Benchmarking Review
(phoronix.com)
by rbanffy |
view
|
0 comments
▲
2
Benchmarking the Thomson Reuters legal agent
(thomsonreuters.com)
by gk1 |
view
|
0 comments
▲
2
Benchmarking the AMD EPYC 9V64H: Azure HBv5's Custom AMD CPU with HBM3
(phoronix.com)
by ashvardanian |
view
|
0 comments
▲
1
Same Query, Three Results: Benchmarking ParadeDB and Postgres FTS
(paradedb.com)
by jamesgresql |
view
|
1 comment
▲
1
DoomBench – benchmarking data stacks running Doom
(cedardb.com)
by krishadi |
view
|
0 comments
▲
1
Exercises in benchmarking, evals, and experimental design, part 6
(patreon.com)
by mfrw |
view
|
0 comments
▲
1
Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments
(arxiv.org)
by Anon84 |
view
|
0 comments
▲
1
Benchmarking The BORE Scheduler Performance With CachyOS Linux
(phoronix.com)
by Bender |
view
|
0 comments
▲
1
Can LLMs Reason Structurally? Benchmarking via the Lens of Data Structures
(arxiv.org)
by matt_d |
view
|
0 comments
▲
1
A weekend benchmarking Copilot CLI's /security-review across 5 LLMs
(dcairo.substack.com)
by avocount |
view
|
0 comments
▲
1
Benchmarking LLM-as-a-Judge for Long-Form Output Evaluation
(arxiv.org)
by berlianta |
view
|
0 comments
▲
1
Benchmarking the Different CachyOS Linux Kernel Flavors
(phoronix.com)
by Bender |
view
|
0 comments
▲
1
Benchmarking SlateDB vs. RocksDB
(nixiesearch.substack.com)
by shutty |
view
|
0 comments
▲
1
Benchmarking TurboQuant with MLX on Apple Silicon
(youtube.com)
by tcp_handshaker |
view
|
0 comments
▲
1
Benchmarking SurrealDB 3.x vs. Postgres, Mongo, Neo4j and Redis (With Fsync)
(surrealdb.com)
by itsezc |
view
|
0 comments
▲
1
Omissive Bias: Benchmarking LLM Answers to Ethical Decision-Making
(arxiv.org)
by pseudolus |
view
|
0 comments
▲
1
Benchmarking LLMs for Web Tasks
(100x.bot)
by shardullavekar |
view
|
0 comments
▲
1
Benchmarking Vortex File Format vs. Parquet, CSV vs. DuckDB, Polars, Datafusion
(dataengineeringcentral.substack.com)
by eigenBasis |
view
|
0 comments
▲
1
Measuring Security Without Fooling Ourselves: Why Benchmarking Agents Is Hard
(arxiv.org)
by Timofeibu |
view
|
0 comments
▲
1
Benchmarking Free-Threading Performance with Tachyon
(blog.changs.co.uk)
by rzk |
view
|
0 comments
▲
1
Show HN: Benchd: client-side WASM benchmarking in the browser
(github.com)
by userfrom1995 |
view
|
0 comments
▲
1
Dimster, a performance benchmarking tool for Apache Kafka
(jack-vanlightly.com)
by rmoff |
view
|
0 comments
▲
1
Benchmarking AI coding agents for distributed SQL: 350 runs, 17 models
(yugabyte.com)
by mityash |
view
|
0 comments
▲
1
Benchmarking AI agents across five TypeScript back end frameworks
(encore.dev)
by eandre |
view
|
0 comments
▲
1
Benchmarking Subquadratic's latest model and SSA Kernel
(appen.com)
by famouswaffles |
view
|
0 comments
▲
1
Benchmarking llama.cpp's new MTP support on Strix Halo
(calebcoffie.com)
by CCoffie |
view
|
0 comments
▲
1
Benchmarking Subquadratic's latest model and SSA Kernel
(appen.com)
by Galichev |
view
|
0 comments
▲
1
Benchmarking Quant Backtesting Engines
(medium.com)
by CrazyTomato |
view
|
0 comments
▲
1
Tau-knowledge: benchmarking agents on real-world knowledge
(sierra.ai)
by tedsanders |
view
|
0 comments
▲
1
What We Think About When We Think About Benchmarking
(paradedb.com)
by jamesgresql |
view
|
1 comment
▲
1
Benchmarking Claude Opus 4.6 Vulnerability Detection
(github.com)
by jviide |
view
|
0 comments
▲
1
Arrow Flight vs. JSON in Next.js: Benchmarking Python and Go
(kayhan.dev)
by keynha |
view
|
0 comments