Code Size

How much code does it take to express the same idea?

We count three things: lines of code (LOC), tokens (whitespace-separated words), and compression ratio (gzip size / raw size — lower means more repetitive/boilerplate-y code).

Results

Language	Avg Lines	Avg Tokens	Compression Ratio
Clojure	8	36	0.7
Erlang	12	48	0.6
Objective-C	14	59	0.5
Ruby	14.3	41.6	0.7
Python	14.6	43.6	0.6
JavaScript	15.7	54.3	0.6
TypeScript	15.9	58.3	0.6
C#	16	54	0.6
Elixir	17.4	50.1	0.6
Kotlin	19.7	57	0.6
Haskell	21.9	107.4	0.5
Swift	22.3	76.7	0.5
Rust	22.6	67.3	0.5
Go	26.6	77.1	0.6
Java	26.6	81.4	0.5
Milo	26.9	101.3	0.5
C++	27.9	88.4	0.5
C	39.7	146	0.4
Zig	39.9	158.7	0.4

What drives the differences?

Ruby and Python rank near the top because:

No boilerplate (no main function, no imports for builtins, no type annotations required)
Rich standard library (Counter, tally, ThreadPoolExecutor — one-liners for complex operations)
Minimal syntax (whitespace blocks, implicit returns)
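
As a rough illustration of the points above, here is a minimal Python sketch of a word-frequency count built on Counter. The function name word_frequencies and the sample sentence are hypothetical, chosen only for this example; they are not taken from the benchmark problems.

```python
from collections import Counter


def word_frequencies(text):
    # Counter does the tallying that would need a hand-rolled hash map in C.
    # Splitting on whitespace is an assumption made to keep the sketch short.
    return Counter(text.lower().split())


# No main function or type annotations are needed to run this.
top = word_frequencies("the cat and the hat and the bat").most_common(2)
print(top)  # [('the', 3), ('and', 2)]
```

The whole program is a single import, a one-line function body, and a call, which is the kind of brevity the list above describes.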

C/Zig lose because:

No built-in collections (no HashMap, no dynamic array without manual allocation)
Manual memory management
No string operations (character-by-character parsing)
Manual threading/concurrency infrastructure

The interesting middle: Kotlin (19.7 lines) stays within a few lines of JavaScript (15.7) despite being statically typed. Extension functions and expression-bodied syntax eliminate the ceremony you'd expect from a JVM language.

The gap widens with verbosity

On simple algorithmic problems, languages cluster within 2× of each other. On real-world problems with I/O, JSON, HTTP, and concurrency, the gap stretches to 3-5×. Real programs exercise the standard library, error model, and concurrency primitives — that's where languages diverge most.

How we count

LOC: Non-blank lines. Includes imports and main function boilerplate.

Tokens: Whitespace-separated words. For example, let x: i32 = 5; counts as 5 tokens (let, x:, i32, =, 5;).

Compression Ratio: len(gzip(source)) / len(source), i.e. compressed size divided by raw size. Lower means more repetitive code (boilerplate). Higher means more information-dense code.
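
The three definitions above can be sketched in a few lines of Python. This is a minimal sketch, not the tool used to produce the results: the function name code_size_metrics, the UTF-8 encoding, and the default gzip compression level are all assumptions.

```python
import gzip


def code_size_metrics(source):
    """Return LOC, token count, and gzip compression ratio for a source string."""
    raw = source.encode("utf-8")  # assumption: sizes are measured in UTF-8 bytes

    # LOC: every line that contains something other than whitespace.
    loc = sum(1 for line in source.splitlines() if line.strip())

    # Tokens: whitespace-separated words, so "x:" and "5;" each count as one.
    tokens = len(source.split())

    # Compression ratio: compressed size divided by raw size.
    ratio = len(gzip.compress(raw)) / len(raw) if raw else 0.0

    return {"loc": loc, "tokens": tokens, "compression_ratio": ratio}


metrics = code_size_metrics("let x: i32 = 5;\n\nlet y = 2;\n")
print(metrics["loc"], metrics["tokens"])  # 2 9
```

One caveat with this approach: gzip adds a fixed header and trailer, so very short inputs can produce a ratio above 1.0, and the ratio is only comparable between files of similar size.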
