AI Readiness

How efficiently can an LLM process this language?

Three metrics: total LLM tokens consumed per solution, token density per line, and how much static type information is available for the model to reason about. The token metrics are averaged across 7 benchmark problems; type coverage is a fixed property of each language.

Results

Language	LLM Tokens	Tok/Line	Type Coverage
Clojure	74	9.3	0
Ruby	105	7.5	0.25
C#	109	6.8	1
Python	118	8.1	0.5
Erlang	121	10.1	0
JavaScript	124	7.9	0
Objective-C	129	9.2	0.5
TypeScript	133	8.4	0.5
Elixir	142	8.3	0
Kotlin	144	7.5	1
Swift	166	7.6	1
Go	176	6.7	1
Rust	181	8.1	1
Haskell	202	9.3	0.75
Milo	213	8.1	1
Java	215	8	1
C++	220	8.1	1
C	366	9	1
Zig	391	9.5	1

What the metrics mean

LLM Tokens: tokens consumed when feeding code to a language model, counted with the cl100k_base tokenizer (the GPT-4 encoding; other model families such as Claude use their own tokenizers, so absolute counts will differ). Lower means cheaper API calls and more code fitting in the context window.
Tok/Line: token density per line of code. Higher means each line carries more information for the model to parse.
Type Coverage: how much type information is statically available (0–1). Higher means more for the AI to verify, infer from, and autocomplete against.

What drives the differences?

Python/Ruby are cheap to tokenize (low token counts) but have low type coverage — the model has to infer types from context.

Rust/Zig/Go cost more tokens but provide full static type information, giving the model more to work with for verification and completion.

Haskell is an outlier — high token count (202 avg) due to verbose type signatures and operator-heavy syntax, but strong type coverage (0.75) from its type system. The gap from 1.0 reflects that Haskell's type inference means annotations are often omitted.

JavaScript is a weak combination for AI: moderate token cost (124) with zero static type information. TypeScript raises type coverage to 0.5, but at a slightly higher token cost (133).
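One way to make the cost-versus-types trade-off concrete is to ask which languages are not beaten on both axes at once. The sketch below copies the LLM Tokens and Type Coverage columns from the Results table and computes a Pareto front (fewer tokens is better, higher coverage is better). The Pareto framing and the helper names `dominates` and `pareto_front` are our own illustration, not part of the benchmark's method.

```python
# (language, average LLM tokens, type coverage), copied from the Results table.
RESULTS = [
    ("Clojure", 74, 0.0), ("Ruby", 105, 0.25), ("C#", 109, 1.0),
    ("Python", 118, 0.5), ("Erlang", 121, 0.0), ("JavaScript", 124, 0.0),
    ("Objective-C", 129, 0.5), ("TypeScript", 133, 0.5), ("Elixir", 142, 0.0),
    ("Kotlin", 144, 1.0), ("Swift", 166, 1.0), ("Go", 176, 1.0),
    ("Rust", 181, 1.0), ("Haskell", 202, 0.75), ("Milo", 213, 1.0),
    ("Java", 215, 1.0), ("C++", 220, 1.0), ("C", 366, 1.0), ("Zig", 391, 1.0),
]


def dominates(a, b):
    """True if a is no worse than b on both metrics and better on at least one."""
    _, a_tok, a_cov = a
    _, b_tok, b_cov = b
    no_worse = a_tok <= b_tok and a_cov >= b_cov
    strictly_better = a_tok < b_tok or a_cov > b_cov
    return no_worse and strictly_better


def pareto_front(rows):
    """Names of the rows that no other row dominates, in input order."""
    return [r[0] for r in rows if not any(dominates(o, r) for o in rows)]


print(pareto_front(RESULTS))  # ['Clojure', 'Ruby', 'C#']
```

Under this framing, every language other than Clojure, Ruby, and C# is matched or beaten on both metrics by some other row of the table. This compares only these two columns and says nothing about Tok/Line or about qualities the benchmark does not measure.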

How we measure

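LLM Tokens: Tokenized with js-tiktoken using the cl100k_base encoding (the encoding used by GPT-4). Other model families, including Claude, use different tokenizers, so these counts are best read as relative comparisons.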
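As a rough sketch of how the two token metrics relate, the function below counts tokens and divides by the number of lines. The tokenizer is passed in as a parameter; the `whitespace_tokenizer` stand-in is an assumption made so the example runs without dependencies, and it is not cl100k_base. Excluding blank lines from the line count is also an assumption, since the page does not say how lines were counted.

```python
from typing import Callable, Sequence

# A tokenizer is any callable that maps source text to a sequence of tokens.
Tokenizer = Callable[[str], Sequence]


def whitespace_tokenizer(text: str) -> list:
    """Stand-in tokenizer (NOT cl100k_base): splits on whitespace."""
    return text.split()


def token_metrics(source: str, tokenize: Tokenizer) -> dict:
    """Return total tokens, non-blank line count, and tokens per line."""
    tokens = len(tokenize(source))
    # Assumption: blank lines do not count toward the line total.
    lines = sum(1 for line in source.splitlines() if line.strip())
    per_line = tokens / lines if lines else 0.0
    return {"tokens": tokens, "lines": lines, "tok_per_line": round(per_line, 1)}


snippet = "def add(a, b):\n    return a + b\n\nprint(add(1, 2))\n"
print(token_metrics(snippet, whitespace_tokenizer))
# {'tokens': 9, 'lines': 3, 'tok_per_line': 3.0}
```

To approximate the page's numbers from Python, the `encode` method of `tiktoken.get_encoding("cl100k_base")` could be passed as `tokenize`, assuming the tiktoken package is installed. The published figures came from js-tiktoken, so small differences in setup may still apply.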

Type Coverage: Static property of the language: 1.0 for fully statically typed (Rust, Go, Java), 0.5 for optionally typed (TypeScript, Python with hints), 0 for dynamically typed (JavaScript, Elixir). Two languages in the table sit between these tiers: Ruby at 0.25 and Haskell at 0.75.
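The scoring rule above amounts to a small lookup table. The sketch below encodes the three stated tiers plus the two in-between values that appear in the Results table. The discipline labels (`"static"`, `"optional"`, `"dynamic"`) and the `OVERRIDES` mechanism are hypothetical names chosen for illustration; the page does not describe how the intermediate scores were assigned.

```python
# Scores per typing discipline, as stated in the text above.
COVERAGE_BY_DISCIPLINE = {
    "static": 1.0,    # e.g. Rust, Go, Java
    "optional": 0.5,  # e.g. TypeScript, Python with hints
    "dynamic": 0.0,   # e.g. JavaScript, Elixir
}

# Values from the Results table that fall between the three tiers.
# Treating them as per-language overrides is an assumption of this sketch.
OVERRIDES = {"Ruby": 0.25, "Haskell": 0.75}


def type_coverage(language: str, discipline: str) -> float:
    """Look up a language's type coverage score (0 to 1)."""
    if language in OVERRIDES:
        return OVERRIDES[language]
    return COVERAGE_BY_DISCIPLINE[discipline]


print(type_coverage("Rust", "static"))       # 1.0
print(type_coverage("Python", "optional"))   # 0.5
print(type_coverage("Haskell", "static"))    # 0.75
```

Because the score is assigned per language rather than measured per solution, it does not vary across the benchmark problems.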
