How Token-Oriented Object Notation (TOON) can shrink LLM contexts versus JSON and YAML—plus nested data, cross-model robustness, parsing CPU, and when to stay on JSON. Illustrative numbers for comparison only; benchmark your own payloads.
Serialization formats keep oscillating between human readability and what models and runtimes consume cheaply. TOON (Token-Oriented Object Notation) is often pitched as a JSON alternative for LLM prompts. This article focuses on engineering reality: real token savings, stability across models and nesting, and parse/convert cost at the gateway.
Note: percentages, latencies, and token counts below are illustrative orders of magnitude from representative samples. Size them on your tokenizer, model, and production data.
TOON trims JSON punctuation and uses a header + row style layout for homogeneous collections. It shines as an LLM-facing projection of your data—not as a wholesale replacement for public REST APIs or canonical JSON storage.
Every JSON level pays for {}, quotes, and colons. For deeply nested extraction payloads or AST snippets, that syntax is structured noise: it consumes context and can increase bracket-alignment overhead for the model.
TOON amortizes field names across rows when structures repeat. When each level is heterogeneous, schema/header cost eats the wins.
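To make the amortization concrete, here is a minimal sketch of a header + rows encoder for a homogeneous list. `to_toon_rows` is a hypothetical helper, not the official TOON library, and it assumes values need no quoting or escaping:

```python
def to_toon_rows(key, records):
    """Minimal sketch of a TOON-style header + rows block for a homogeneous list.

    Assumes every record shares the same keys and no value needs quoting or
    escaping; a real encoder must handle delimiters and type edge cases.
    """
    fields = list(records[0].keys())
    # Field names are declared once in the header, then amortized across rows.
    header = f"{key}[{len(records)}]{{{', '.join(fields)}}}:"
    rows = ["  " + ", ".join(str(r[f]) for f in fields) for r in records]
    return "\n".join([header, *rows])

items = [
    {"sku": "A12-B", "price": 99.0, "qty": 1},
    {"sku": "C34-D", "price": 45.5, "qty": 2},
]
print(to_toon_rows("items", items))
```

The field names appear once in the header regardless of row count, which is exactly where the savings come from; with heterogeneous records, each shape would need its own header and the amortization disappears.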
The same bytes tokenize differently per model, so treat any cross-format comparison as approximate. At a glance:
| Dimension | JSON | YAML | TOON (typical role) |
|---|---|---|---|
| Token efficiency | Baseline | Medium | Often best on homogeneous tables |
| Nested / heterogeneous | Expressive | Indentation-sensitive | Homogeneous wins; heterogeneous narrows gains |
| Parse CPU | Mature, very fast | Medium | Often slower today; conversion in software |
| Validation / ecosystem | JSON Schema | Weaker | Prompt/contracts; tooling still evolving |
| Primary use | APIs, storage | Config | LLM prompts, RAG snippets, agent context |
---
Illustrative: savings vs JSON may drop as nesting becomes irregular because each level pays for structure declarations.
| Depth | JSON (illustrative tokens) | TOON (illustrative tokens) | Approx. savings |
|---|---|---|---|
| Flat list | 520 | 190 | ~63% |
| Two-level objects | 1240 | 710 | ~43% |
| Three-level heterogeneous | 2100 | 1720 | ~18% |
Takeaway: for AST-like or highly polymorphic lists, JSON may remain the pragmatic choice.
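One way to see why nesting erodes savings is to count structural punctuation, a crude proxy for the structural tokens a model must carry. `punct_overhead` is a hypothetical helper; use your model's real tokenizer for actual budgets:

```python
import json
import re

def punct_overhead(s):
    """Count braces, brackets, quotes, colons, commas: a rough proxy for
    the structural tokens paid at every JSON nesting level."""
    return len(re.findall(r'[{}\[\]":,]', s))

rows = [{"id": i, "name": f"u{i}"} for i in range(50)]
as_json = json.dumps(rows)
# TOON-style projection of the same homogeneous rows (illustrative)
as_toon = "users[50]{id, name}:\n" + "\n".join(f"  {r['id']}, {r['name']}" for r in rows)
print(punct_overhead(as_json), punct_overhead(as_toon))
```

On a flat homogeneous list the gap is large; add irregular nesting and TOON must re-declare structure per level, which is why the savings in the table above narrow with depth.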
Illustrative accuracy on the same extraction task (not an official benchmark):
| Model family | TOON | JSON |
|---|---|---|
| GPT-4o | 96% | 94% |
| Claude 3.5 Sonnet | 91% | 93% |
| Gemini 1.5 Pro | 88% | 90% |
| Llama 3 70B | 78% | 85% |
Ops guidance: weaker alignment on header/row TOON favors JSON for OSS/local stacks; consider TOON only after regression on your flagship model and cost envelope.
Illustrative encode/decode for ~10k homogeneous rows (Python-tier implementations):
| Operation | JSON (stdlib) | TOON (illustrative) |
|---|---|---|
| Encode | 8 ms | 22 ms |
| Decode | 6 ms | 18 ms |
JSON runtimes benefit from decades of SIMD, streaming, and zero-copy work. TOON stacks still skew interpreter-heavy—watch the gateway hot path; cache conversions or move them offline.
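To size the gap on your own hardware, benchmark stdlib `json` the same way you would benchmark a TOON codec. The sketch below measures only the JSON side; timings are machine-dependent, hence no expected numbers:

```python
import json
import timeit

rows = [{"sku": f"S{i}", "price": i * 0.5, "qty": i % 7} for i in range(10_000)]
payload = json.dumps(rows)

# Average milliseconds over 10 runs; swap in your TOON encode/decode
# under the same harness for a fair A/B comparison.
encode_ms = timeit.timeit(lambda: json.dumps(rows), number=10) / 10 * 1000
decode_ms = timeit.timeit(lambda: json.loads(payload), number=10) / 10 * 1000
print(f"json encode ~{encode_ms:.2f} ms, decode ~{decode_ms:.2f} ms")
```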
```json
{
  "order_id": "992831",
  "items": [
    {"sku": "A12-B", "price": 99.0, "qty": 1},
    {"sku": "C34-D", "price": 45.5, "qty": 2},
    {"sku": "E56-F", "price": 12.0, "qty": 5}
  ],
  "customer": "John Doe"
}
```

The same payload as TOON:

```
order_id: 992831
items[3]{sku, price, qty}:
  A12-B, 99.0, 1
  C34-D, 45.5, 2
  E56-F, 12.0, 5
customer: John Doe
```

The homogeneous `items` list is TOON’s comfort zone. Browsers do not parse TOON natively—place JSON ↔ TOON conversion at the gateway/BFF with caching and a JSON fallback.
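A gateway-side conversion with memoization might look like the following sketch. `encode_toon` is a stand-in for a real TOON encoder and handles only scalar fields plus one level of homogeneous lists:

```python
import json
from functools import lru_cache

def encode_toon(doc):
    """Stand-in TOON encoder: scalar fields plus homogeneous lists of dicts."""
    lines = []
    for key, val in doc.items():
        if isinstance(val, list) and val and all(isinstance(v, dict) for v in val):
            fields = list(val[0].keys())
            lines.append(f"{key}[{len(val)}]{{{', '.join(fields)}}}:")
            lines.extend("  " + ", ".join(str(v[f]) for f in fields) for v in val)
        else:
            lines.append(f"{key}: {val}")
    return "\n".join(lines)

@lru_cache(maxsize=4096)
def _cached_convert(doc_json):
    return encode_toon(json.loads(doc_json))

def json_to_toon(doc):
    """Gateway hot path: serialize once as the cache key, reuse prior
    conversions. Assumes callers build dicts in a stable key order."""
    return _cached_convert(json.dumps(doc))
```

On a cache hit the hot path pays one `json.dumps` instead of a full re-encode; for static corpora, precompute offline and skip runtime conversion entirely.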
Q1: Does TOON sacrifice field fidelity? With explicit headers, stable schemas, and aligned columns, field fidelity can match JSON. Minimal-header modes belong behind strong contracts.

Q2: Where does latency hide? In conversion and parser quality. Precompute TOON for static corpora; avoid per-request cold conversion of huge payloads without a cache.

Q3: What about polymorphic lists? Compression collapses; fall back to JSON or split into multiple uniform tables.
Use CI/CD or offline jobs to diff tokenizer counts on static datasets. Keep JSON as the source of truth; treat TOON as a model-facing projection.
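A CI token-diff can be as small as the sketch below. `token_count` here is a whitespace/punctuation proxy that only tracks relative drift; swap in your model's real tokenizer (e.g. tiktoken) before trusting absolute numbers:

```python
import re

def token_count(text):
    """Proxy tokenizer: words and punctuation marks. Tracks relative drift
    only; real budgets need the target model's tokenizer."""
    return len(re.findall(r"\w+|[^\w\s]", text))

def diff_report(json_text, toon_text):
    """Compare the same payload in both encodings; fail CI if savings regress."""
    j, t = token_count(json_text), token_count(toon_text)
    return {"json": j, "toon": t, "savings_pct": round(100 * (j - t) / j, 1)}

report = diff_report(
    '{"items": [{"sku": "A12-B", "qty": 1}, {"sku": "C34-D", "qty": 2}]}',
    "items[2]{sku, qty}:\n  A12-B, 1\n  C34-D, 2",
)
print(report)
```

Run this over a pinned static dataset on every change to the projection layer, and gate merges on `savings_pct` staying above your chosen floor.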