Eval Board / MEAT-bench v3

The Leaderboard

Every instrument, human and silicon, on one eval board. Each domain is scored 0–100, and MEAT-Elo summarizes the six domains in one number. The point of the manifesto, rendered as a table: no instrument wins everywhere, and the cheapest model on your task is rarely the strongest one.

#	Instrument	Type	Params	MEAT-Elo ↓	Biology	Physical	Social	Math	Creative	Logistics	Tier	Status	Cost/tok	Value
1	The Retired Engineer	MEAT	120B	64	35	60	50	95	55	88	pro	ready	2.50	26
2	The Lawyer	MEAT	240B	60	30	25	96	60	75	72	pro	⏳ 429	8.00	8
3	The Olympic Athlete	MEAT	60B	60	45	99	60	25	50	80	pro	ready	4.00	15
4	ChatGPT 5.5	SILICON	~2.2T (rumored)	60	85	1	82	97	90	2	enterprise	⏳ 429	5.50	11
5	The Average Adult	MEAT	70B	59	45	65	68	50	60	65	pro	ready	1.00	59
6	Anthropic Fable	SILICON	~900B (rumored MoE)	59	82	1	90	92	89	2	enterprise	⏳ 429	6.00	10
7	Mythos	SILICON	~1.8T (rumored)	58	80	1	78	96	88	2	enterprise	⏳ 429	5.00	12
8	Gemini 3 Pro	SILICON	~1.5T (MoE, rumored)	58	83	1	80	94	87	2	enterprise	⏳ 429	4.50	13
9	Grok 5	SILICON	~1.4T (rumored)	55	76	1	74	90	86	2	enterprise	⏳ 429	3.80	14
10	The Surgeon	MEAT	405B	52	99	22	55	70	35	28	pro	ready	9.50	5
11	The Teenager	MEAT	7B	52	25	78	40	45	70	55	free	ready	0.30	173
12	Kimi 2.6	SILICON	~750B (MoE)	51	72	1	66	86	78	2	enterprise	⏳ 429	1.00	51
13	DeepSeek V4	SILICON	~720B (MoE)	50	70	1	60	93	72	2	enterprise	⏳ 429	0.90	56
14	Frontier-7B	SILICON	7B	31	40	1	42	55	48	1	enterprise	ready	0.40	78
15	The Toddler	MEAT	1B	30	5	30	35	8	92	12	free	ready	0.10	300

Methodology: MEAT-Elo is the unweighted mean of the six domain scores, rounded to the nearest integer. Value = MEAT-Elo ÷ cost-per-token. Silicon scores 1 or 2 on the embodied domains (Physical, Logistics) because no model can wash a car end-to-end. All figures are deadpan satire and reflect no real product's capabilities.
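The two formulas in the methodology can be sketched in a few lines of Python. The function names (`meat_elo`, `value`, `round_half_up`) are hypothetical, and two details are assumptions rather than anything the board states: halves round up, and Value is computed from the already-rounded MEAT-Elo. Under those assumptions the sketch reproduces the published figures for the three sample rows.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(x: Decimal) -> int:
    """Round to the nearest integer, with .5 going up (an assumed convention)."""
    return int(x.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

def meat_elo(scores) -> int:
    """Unweighted mean of the domain scores, rounded."""
    return round_half_up(Decimal(sum(scores)) / Decimal(len(scores)))

def value(elo: int, cost_per_token: str) -> int:
    """MEAT-Elo divided by cost-per-token, rounded.

    Cost is passed as a string so Decimal sees the exact figure from the table.
    """
    return round_half_up(Decimal(elo) / Decimal(cost_per_token))

# Domain scores in board order: Biology, Physical, Social, Math, Creative, Logistics.
rows = {
    "The Retired Engineer": ((35, 60, 50, 95, 55, 88), "2.50"),
    "ChatGPT 5.5": ((85, 1, 82, 97, 90, 2), "5.50"),
    "Frontier-7B": ((40, 1, 42, 55, 48, 1), "0.40"),
}

for name, (scores, cost) in rows.items():
    elo = meat_elo(scores)
    print(f"{name}: MEAT-Elo {elo}, Value {value(elo, cost)}")
# The Retired Engineer: MEAT-Elo 64, Value 26
# ChatGPT 5.5: MEAT-Elo 60, Value 11
# Frontier-7B: MEAT-Elo 31, Value 78
```

`Decimal` is used instead of floats because Python's built-in `round` rounds halves to the nearest even number, and several rows land exactly on a half (ChatGPT 5.5 averages 59.5).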

Reading the board

No universal winner

The Surgeon tops Biology and bottoms out on Physical. Mythos scores 96 on Math, one point off the top, and 1 or 2 on anything needing a body. Capability is a profile, not a number.
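One way to see the "profile, not a number" point is to look up the top scorer in each domain. The sketch below copies the domain scores from the board; the `domain_leaders` helper is a hypothetical name introduced for illustration, not part of MEAT-bench.

```python
DOMAINS = ("Biology", "Physical", "Social", "Math", "Creative", "Logistics")

# Domain scores copied from the board, in DOMAINS order.
BOARD = {
    "The Retired Engineer": (35, 60, 50, 95, 55, 88),
    "The Lawyer": (30, 25, 96, 60, 75, 72),
    "The Olympic Athlete": (45, 99, 60, 25, 50, 80),
    "ChatGPT 5.5": (85, 1, 82, 97, 90, 2),
    "The Average Adult": (45, 65, 68, 50, 60, 65),
    "Anthropic Fable": (82, 1, 90, 92, 89, 2),
    "Mythos": (80, 1, 78, 96, 88, 2),
    "Gemini 3 Pro": (83, 1, 80, 94, 87, 2),
    "Grok 5": (76, 1, 74, 90, 86, 2),
    "The Surgeon": (99, 22, 55, 70, 35, 28),
    "The Teenager": (25, 78, 40, 45, 70, 55),
    "Kimi 2.6": (72, 1, 66, 86, 78, 2),
    "DeepSeek V4": (70, 1, 60, 93, 72, 2),
    "Frontier-7B": (40, 1, 42, 55, 48, 1),
    "The Toddler": (5, 30, 35, 8, 92, 12),
}

def domain_leaders(board):
    """Map each domain to the instrument with the highest score in it."""
    leaders = {}
    for idx, domain in enumerate(DOMAINS):
        leaders[domain] = max(board, key=lambda name: board[name][idx])
    return leaders

leaders = domain_leaders(BOARD)
for domain, name in leaders.items():
    print(f"{domain:<10} {name}")
print(len(set(leaders.values())), "distinct leaders across", len(DOMAINS), "domains")
# Biology    The Surgeon
# Physical   The Olympic Athlete
# Social     The Lawyer
# Math       ChatGPT 5.5
# Creative   The Toddler
# Logistics  The Retired Engineer
# 6 distinct leaders across 6 domains
```

Six domains, six different leaders: no instrument holds more than one column.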

Cheap ≠ best

The Toddler, the Teenager, and Frontier-7B lead on Value (capability per dollar) while trailing on raw MEAT-Elo. Route by task, not by leaderboard rank.
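"Route by task" could look something like the sketch below: pick the cheapest instrument that clears a minimum score in the domain you care about. The `Instrument` type, the `route` function, and the `min_score` threshold are all hypothetical, and only five rows of the board are included to keep the example short. Status (ready vs. 429) is ignored here.

```python
from typing import NamedTuple, Optional

DOMAINS = ("Biology", "Physical", "Social", "Math", "Creative", "Logistics")

class Instrument(NamedTuple):
    name: str
    scores: dict   # domain name -> score from the board
    cost: float    # Cost/tok column

def make(name, scores, cost):
    """Build an Instrument from a tuple of scores in DOMAINS order."""
    return Instrument(name, dict(zip(DOMAINS, scores)), cost)

# A five-row subset of the board, chosen for illustration only.
BOARD = [
    make("The Retired Engineer", (35, 60, 50, 95, 55, 88), 2.50),
    make("ChatGPT 5.5", (85, 1, 82, 97, 90, 2), 5.50),
    make("The Surgeon", (99, 22, 55, 70, 35, 28), 9.50),
    make("The Teenager", (25, 78, 40, 45, 70, 55), 0.30),
    make("DeepSeek V4", (70, 1, 60, 93, 72, 2), 0.90),
]

def route(domain: str, min_score: int, board=BOARD) -> Optional[Instrument]:
    """Return the cheapest instrument scoring at least min_score in domain,
    or None if nothing on the board qualifies."""
    candidates = [i for i in board if i.scores[domain] >= min_score]
    return min(candidates, key=lambda i: i.cost, default=None)

print(route("Math", 90).name)      # DeepSeek V4
print(route("Physical", 70).name)  # The Teenager
print(route("Logistics", 95))      # None
```

In this subset the overall MEAT-Elo leader, The Retired Engineer, qualifies for the Math task but is not selected, because DeepSeek V4 clears the same bar at a lower cost.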

Silicon's hard wall

Every LLM scores 1 on Physical and no more than 2 on Logistics. You cannot prompt your way into moving an atom; those columns belong to meat.