A text-first, LLM-friendly serialization format.
Up to 50% fewer tokens than JSON.
Zero dependencies. Built for the AI era.
Same data structure, dramatically different token costs
Type hints enabled (+20 tokens)
Type hints (u32, str, bool) add ~20 tokens but enable schema validation, TypeScript generation, and better LLM understanding. Still 32% smaller than JSON!
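The savings come from declaring keys once in a header instead of repeating them per record. A rough character-count comparison illustrates the idea (character counts stand in for tokens here, which is a simplification; the array shape below is made up for illustration):

```typescript
// Rough illustration of where the savings come from: JSON repeats every
// key name per record, while a TONL-style header declares keys once.
// Character counts stand in for tokens here (a simplification).
const users = Array.from({ length: 100 }, (_, i) => ({
  id: i,
  name: `user${i}`,
  role: "member",
}));

const asJson = JSON.stringify(users);
const asTonlStyle =
  "users[100]{id,name,role}:\n" +
  users.map((u) => `  ${u.id}, ${u.name}, ${u.role}`).join("\n");

console.log(`JSON: ${asJson.length} chars, TONL-style: ${asTonlStyle.length} chars`);
```

The gap widens with record count, since the per-record key overhead is eliminated entirely rather than merely compressed.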
Experience the future of command-line data analysis with real-time visualization, theme support, and interactive features that make data analysis enjoyable.
npm install -g tonl
tonl stats data.json --interactive
Navigate easily through intuitive menus, use keyboard shortcuts for quick actions, and control your analysis with real-time progress tracking.
Switch instantly between GPT-5, Claude Sonnet 4.5, Gemini 2.5 Pro, Gemini 3 Pro, and Llama-4 tokenizers. Compare token costs for each model in real time.
Choose from Neon, Matrix, Cyberpunk, and Default themes. Work with a visual experience that matches your personal preference.
View JSON and TONL files side by side, track detailed compression metrics, and monitor cost analysis in real time.
More than just serialization - a complete data platform
10 advanced optimization strategies providing up to 60% additional compression beyond standard TONL encoding
Dictionary, Delta, RLE, Bit Packing, Column Reorder, Quantizer, Schema Inheritance, Hierarchical Grouping, Tokenizer Aware, Adaptive
Additional compression beyond standard TONL with automatic strategy selection
One-line activation with automatic optimization
tonl encode data.json --optimize --verbose
32-50% fewer tokens than JSON. Reduces LLM API costs with smart delimiter optimization.
users[2]{id,name}
JSONPath-like queries with filters, wildcards, and recursive descent for powerful data extraction.
users[?(@.age > 18)]
Full CRUD operations with change tracking, atomic saves, and rollback capabilities.
doc.set('user.age', 31)
O(1) hash indexes and O(log n) BTree indexes. 10-1600x faster than sequential scans.
<0.1ms lookup times
Process multi-gigabyte files with constant memory usage for large-scale data pipelines.
streamQuery('10GB.tonl')
Define schemas, validate data types, set constraints, and auto-generate TypeScript definitions.
age: u32 min:18 max:120
Traverse document trees, iterate collections, and search hierarchies with intuitive APIs.
doc.walk(visitor)
Track modifications with detailed diffs, timestamps, and rollback support for audit trails.
doc.getChanges()
Complete command-line toolkit for encoding, decoding, querying, validation, and formatting.
tonl encode --smart
See how TONL solves common data challenges
Optimize your AI workflows with massive token savings and faster processing times.
Query and analyze large datasets with lightning-fast performance and SQL-like syntax.
Store and manage application configs with human-readable format and validation.
Stream 50GB+ log files with <100MB memory. Perfect for ETL pipelines.
Query test data easily. No database needed. Fast test execution.
Track changes with diff engine. Rollback capability for safe migrations.
Compact cache storage with O(1) endpoint lookups and TTL support.
Fast SKU lookups, price range queries with BTree indexing.
Store articles, metadata. Query by tags, date ranges efficiently.
Snapshots, complete history, detailed change tracking with timestamps.
Efficient data sync. Change notifications. Fast queries for dashboards.
Copy this prompt to enable any LLM to parse TONL data
Add this to your LLM system prompt when sending TONL data
The following data is in TONL format. Parse it as follows:
• Lines with [count]{fields}: are array headers; data rows follow
• Lines with {fields}: are object headers; field: value pairs follow
• Indentation (2 spaces) indicates nesting levels
• Default delimiter is comma unless a #delimiter header specifies otherwise
• Type hints may appear: field:type (e.g., id:u32, name:str, active:bool)
  → Ignore the :type part; just parse the values
• Value types: unquoted numbers/booleans, quoted strings, null
• v2.0 Optimization: may contain #optimize directives (ignore these; they're metadata)
Examples:
Without types (compact):
users[2]{id,name,role}:
1, Alice, admin
2, Bob, user
With types (validation):
users[2]{id:u32,name:str,role:str}:
1, Alice, admin
2, Bob, user
With optimization (v2.0):
#optimize dictionary delta bitpack
users[2]{id:u32,name:str,role:str}:
1, Alice, admin
2, Bob, user
All represent: {"users": [{"id":1,"name":"Alice","role":"admin"}, {"id":2,"name":"Bob","role":"user"}]}
TONL v2.0 provides up to 60% additional compression while maintaining full LLM compatibility.
💡 Pro tip: This prompt is ~180 tokens. Adding it to your system prompt is negligible compared to the 60-70% total savings with v2.0 optimization!
Detailed LLM integration guide with more examples and edge cases
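As a sanity check on the rules above, here is a minimal decoder sketch for the flat-array case. This is illustrative only, not the official parser: it handles a single array header with comma-delimited rows and ignores nesting, #delimiter, and #optimize directives.

```typescript
// Minimal sketch of a TONL decoder for flat arrays only (illustrative,
// not the official parser; names and logic here are assumptions).
function decodeFlatTonl(text: string): Record<string, unknown[]> {
  const lines = text.split("\n").map((l) => l.trim()).filter(Boolean);
  const result: Record<string, unknown[]> = {};
  let fields: string[] = [];
  let current: unknown[] | null = null;

  for (const line of lines) {
    // Array header: name[count]{f1,f2,...}:
    const header = line.match(/^(\w+)\[(\d+)\]\{([^}]*)\}:$/);
    if (header) {
      // Strip optional :type hints from each field name
      fields = header[3].split(",").map((f) => f.split(":")[0].trim());
      current = [];
      result[header[1]] = current;
      continue;
    }
    if (!current) continue;
    // Data row: comma-separated values matching the header fields
    const values = line.split(",").map((v) => v.trim());
    const row: Record<string, unknown> = {};
    fields.forEach((f, i) => {
      const v = values[i];
      // Unquoted numbers/booleans, quoted strings, null
      if (v === "null") row[f] = null;
      else if (v === "true" || v === "false") row[f] = v === "true";
      else if (/^-?\d+(\.\d+)?$/.test(v)) row[f] = Number(v);
      else row[f] = v.replace(/^"(.*)"$/, "$1");
    });
    current.push(row);
  }
  return result;
}

const doc = decodeFlatTonl(`users[2]{id:u32,name:str,role:str}:
  1, Alice, admin
  2, Bob, user`);
console.log(JSON.stringify(doc));
```

Note how the typed and untyped headers decode identically, since the :type suffix is stripped before field names are used.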
Real working code you can use today
// Filter users by condition
const admins = doc.query(
'users[?(@.role == "admin")]'
);
// Get nested values
const names = doc.query(
'users[*].name'
);
// Complex filters
const active = doc.query(
'users[?(@.age > 25 && @.active)]'
);
// Array slicing
const first = doc.query('users[0:5]');
// Update single value
doc.set('users[0].age', 32);
// Add new item
doc.push('users', {
id: 1005,
name: 'Eve Green'
});
// Delete item
doc.delete('users[3]');
// Get changes
const changes = doc.getChanges();
// Create hash index (O(1) lookup)
doc.createIndex('users', 'id', 'hash');
// Create BTree index (range queries)
doc.createIndex('users', 'age', 'btree');
// Ultra-fast indexed lookup
const user = doc.queryIndexed(
'users',
'id',
1001
); // ~0.05ms
// Stream process huge files
await streamQuery(
'large.tonl',
'users[?(@.active)]',
(chunk) => {
// Process each chunk
console.log(chunk);
}
);
// Constant memory usage
// Works with 10GB+ files
Real numbers from production workloads
| Operation | Speed | Complexity | Details |
|---|---|---|---|
| Parse/Decode | ~1.2ms | O(n) | 1MB file, 10K records |
| Encode/Serialize | ~0.8ms | O(n) | 1MB data, smart optimization |
| Hash Index Lookup | ~0.05ms | O(1) | 10M records indexed |
| BTree Range Query | ~0.3ms | O(log n) | 100K results from 10M |
| Sequential Scan | ~45ms | O(n) | 10M records, no index |
| Stream Processing | ~50MB/s | O(1) memory | Constant 10MB RAM usage |
| Modification | ~0.1ms | O(1) | Single value update |
| File Save (Atomic) | ~5ms | O(n) | 1MB file with fsync |
Benchmark Environment: Node.js 20.x on Apple M1 Pro, 16GB RAM. Times shown are median of 1000 iterations. Hash index provides 10-1600x speedup over sequential scans for large datasets.
All commands you need for data operations
tonl encode data.json
Convert JSON to TONL format with smart optimization
tonl decode data.tonl
Convert TONL back to JSON format
tonl query file.tonl
Execute JSONPath queries on TONL files
tonl validate --schema
Validate files against schema definitions
tonl format file.tonl
Reformat files with specific delimiters
tonl stats --tokenizer
Compare token costs across LLM models
tonl generate-types
Auto-generate TypeScript type definitions
tonl --help
View detailed help and usage examples
# Convert JSON to TONL with optimization
tonl encode data.json --smart --stats
# Query for specific data
tonl query data.tonl 'users[?(@.age > 25)]' --output filtered.tonl
# Validate against schema
tonl validate filtered.tonl --schema user-schema.tonl
# Convert back to JSON
tonl decode filtered.tonl --out result.json
TONL vs other popular data formats
| Feature | TONL | JSON | CSV | YAML |
|---|---|---|---|---|
| Token Efficiency | 32-50% smaller | Baseline | N/A | Similar to JSON |
| Nested Structures | ✅ | ✅ | ❌ | ✅ |
| Human Readable | ✅ | ✅ | ✅ | ✅ |
| Schema Validation | ✅ Built-in | Requires JSON Schema | ❌ | Limited |
| Query API | ✅ JSONPath | Library needed | ❌ | Library needed |
| Streaming Support | ✅ O(1) memory | Limited | ✅ | ❌ |
| Type System | 8 types | Dynamic | Strings only | Dynamic |
| LLM Optimized | ✅ Primary goal | ❌ | ❌ | ❌ |
| Change Tracking | ✅ Built-in diff | Library needed | ❌ | ❌ |
| Indexing (O(1) lookups) | ✅ Hash + BTree | ❌ | ❌ | ❌ |
Works everywhere - Node.js, Browser, Deno, Bun
<script type="module">
import { encodeTONL, decodeTONL }
from 'https://cdn.jsdelivr.net/npm/tonl@2.4.1/+esm';
const data = { users: [{ id: 1, name: 'Alice' }] };
const tonl = encodeTONL(data);
console.log(tonl);
</script>
<script src="https://unpkg.com/tonl@2.4.1/dist/browser/tonl.umd.js"></script>
<script>
const tonl = TONL.encodeTONL({
hello: "world"
});
console.log(tonl);
</script>
import { TONLDocument, encodeTONL, EncodeOptions } from 'tonl';
// Full IntelliSense support
const options: EncodeOptions = {
includeTypes: true,
delimiter: ',',
indent: 2
};
const doc = TONLDocument.fromJSON<UserData>(data);
// Type-safe queries and modifications
const admins = doc.query('users[?(@.role == "admin")]');
Tested across 9 different data types
All benchmarks run on Node.js 20.x, Apple M1 Pro, 16GB RAM. Results are median of 1000 iterations.
Complete implementation guides and specifications available
Built for production, designed for efficiency
Save 32-50% on LLM API costs instantly. For apps processing millions of tokens daily, this translates to thousands of dollars saved monthly.
Pure TypeScript implementation with no runtime dependencies. No supply chain risks, no version conflicts, no security vulnerabilities from third-party code.
Hash indexes provide O(1) lookups - 1600x faster than sequential scans. BTree indexes enable sub-millisecond range queries on millions of records.
791+ tests passing with 100% coverage. Advanced optimization system with 10 strategies. TypeScript strict mode. Semantic versioning. Comprehensive documentation. Used in production by multiple teams.
Join teams building efficient AI applications
"TONL has dramatically reduced our LLM API costs. We're processing the same amount of data but paying 32-50% less depending on the data structure. The query API is incredibly powerful, and the zero-dependency design means no supply chain risks."
Everything you need to know about TONL
TONL eliminates JSON's redundant syntax like repeated key names, excessive quotes, and bracket nesting. It uses a table-like format where column names are declared once in a header, then data rows follow with just values. The smart encoder also chooses optimal delimiters (, | ; or tab) based on your data to minimize escaping.
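The delimiter-selection idea described above can be sketched as a simple heuristic: pick the candidate that would force the least escaping in the actual values. This is an illustrative guess at the approach, not the encoder's real logic:

```typescript
// Sketch of smart delimiter selection: choose the candidate delimiter
// that appears in the fewest values, minimizing the need for quoting.
// (Illustrative heuristic; the real encoder's logic is not shown here.)
const CANDIDATES = [",", "|", ";", "\t"];

function pickDelimiter(values: string[]): string {
  let best = CANDIDATES[0];
  let bestHits = Infinity;
  for (const d of CANDIDATES) {
    // Count how many values would need escaping under this delimiter
    const hits = values.filter((v) => v.includes(d)).length;
    if (hits < bestHits) {
      bestHits = hits;
      best = d;
    }
  }
  return best;
}

console.log(pickDelimiter(["Smith, John", "Doe, Jane", "active"])); // commas in data push selection away from ","
```

With name fields like "Smith, John", the comma would require quoting every row, so a pipe or tab wins instead.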
Yes! TONL provides perfect round-trip conversion with JSON. You can encode JSON to TONL for LLM prompts (saving tokens), then decode back to JSON without any data loss. The CLI and API make integration seamless: just add tonl encode/decode steps to your pipeline.
Unlike CSV, TONL supports nested objects, arrays, multiple data types, and maintains type information. Unlike JSON, it's optimized for token efficiency. Unlike YAML, it has a formal specification and fast parsers.
TONL also includes powerful features CSV lacks: JSONPath queries, schema validation, indexing for performance, streaming support for huge files, and built-in modification APIs.
Absolutely! TONL v1.0+ is production-ready with 589 passing tests, zero known bugs, and stable APIs. It has zero runtime dependencies, comprehensive documentation, and follows semantic versioning. The TypeScript implementation is fully typed and includes extensive error handling. Many projects are already using it successfully.
Getting started is simple:
npm install tonl
Then import and use in your code, or use the CLI for quick conversions. Check out our documentation and interactive playground to learn more.
Yes! TONL is a text format that works with any LLM that accepts text input - OpenAI GPT, Anthropic Claude, Google Gemini, Meta Llama, etc. The format is designed to be LLM-friendly and human-readable. Token savings are consistent across all major tokenizers (tested with GPT-5, Claude Sonnet 4.5, Gemini 2.5 Pro, Gemini 3 Pro, and Llama 4).
Join developers who are building smarter, more efficient AI applications with TONL