Getting Started

Install and start using TONL in 30 seconds

Installation

npm install tonl

Quick Example

import { TONLDocument } from 'tonl';

// Create from JSON
const doc = TONLDocument.fromJSON({
  users: [{ name: 'Alice', age: 30 }]
});

// Query
const result = doc.query('users[*].name');

// Modify
doc.set('users[0].age', 31);

// Save
await doc.save('data.tonl');

Encoding Options

Compact Mode (Default)

encodeTONL(data)

Maximum token savings (38-50%), no type hints

With Type Hints

encodeTONL(data, { includeTypes: true })

Schema validation enabled (~32% savings)

Custom Delimiter

encodeTONL(data, { delimiter: '|' })

Use pipe, tab, or semicolon delimiters

Smart Encoding

encodeSmart(data)

Auto-selects best delimiter and options
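
Putting the options together. A minimal sketch, assuming encodeTONL and encodeSmart are exported from the package root alongside TONLDocument:

import { encodeTONL, encodeSmart } from 'tonl'; // assumed export path

const data = { users: [{ id: 1, name: 'Alice', role: 'admin' }] };

// Compact mode (default): smallest output, no type hints
const compact = encodeTONL(data);

// Type hints enable schema validation at a small size cost
const typed = encodeTONL(data, { includeTypes: true });

// Pipe delimiter instead of the default comma
const piped = encodeTONL(data, { delimiter: '|' });

// Smart encoding picks the delimiter and options automatically
const smart = encodeSmart(data);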

Need more?

Check out the full getting started guide on GitHub.

API Reference

Complete TONLDocument API and core functions

📦 Creation Methods

TONLDocument.fromJSON(data)

Create TONL document from JavaScript object

const doc = TONLDocument.fromJSON({ users: [...] })

TONLDocument.fromTONL(text)

Parse TONL text into document

const doc = TONLDocument.fromTONL(tonlText)

TONLDocument.load(path)

Load TONL file from disk

const doc = await TONLDocument.load('data.tonl')

🔍 Query Methods

query(path)

JSONPath-like queries with filters

doc.query('users[?(@.role == "admin")]')

get(path)

Get single value at path

doc.get('users[0].name') // "Alice"

has(path)

Check if path exists

doc.has('users[0].email') // true/false

✏️ Modification Methods

set(path, value)

Update value at path

doc.set('users[0].age', 31)

delete(path)

Remove field or array element

doc.delete('user.tempField')

push(path, value)

Append to array

doc.push('users', newUser)

merge(path, object)

Deep merge objects

doc.merge('config', updates)

💾 Save & Export

toTONL()

Export as TONL string

const tonlText = doc.toTONL()

toJSON()

Export as JavaScript object

const obj = doc.toJSON()

save(path)

Atomic file save with backup

await doc.save('output.tonl')

View complete API reference (50+ methods)

Query Syntax

JSONPath-like query expressions

user.name
Property access
users[0]
Array indexing
users[*].name
Wildcard (all names)
$..email
Recursive descent (all emails at any depth)
users[?(@.age > 18)]
Filter expression

Filter Operators

Comparison
==, !=, >, <, >=, <=
Logical
&&, ||, !
String
contains, startsWith, endsWith

Advanced Examples

users[?(@.age > 25 && @.active)]
Multiple conditions with AND
users[0:3]
Array slicing (first 3 items)
products[?(@.price < 100)]
Filter by numeric value
users[?(@.name contains "Smith")]
String contains filter
$.store.products[*].price
All product prices

View complete query documentation
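
The same expressions can be run through TONLDocument.query(); the sample data below is hypothetical and only illustrates the syntax above:

import { TONLDocument } from 'tonl';

// Hypothetical sample data
const doc = TONLDocument.fromJSON({
  users: [
    { name: 'Alice Smith', age: 30, active: true },
    { name: 'Bob Jones', age: 17, active: false }
  ]
});

doc.query('users[*].name');                     // all names
doc.query('users[?(@.age > 18)]');              // filter expression
doc.query('users[?(@.age > 25 && @.active)]');  // multiple conditions
doc.query('users[0:3]');                        // array slicing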

CLI Tools

Command-line interface for TONL

🎮 v2.5.2 Interactive CLI Dashboard

INTERACTIVE STATS
tonl stats data.json --interactive
THEME CUSTOMIZATION
tonl stats data.json -i --theme neon
FILE COMPARISON
tonl stats data.json --compare
MULTI-TOKENIZER
tonl stats data.json --tokenizer claude-sonnet-4.5

Experience the future of CLI with real-time analysis and beautiful themes

🌟 Interactive Features

Menu-driven interface with real-time feedback
4 beautiful color themes (neon, matrix, cyberpunk, default)
Side-by-side JSON/TONL file comparison
Live tokenizer switching (GPT-5, Claude Sonnet 4.5, Gemini 2.5 Pro, Gemini-2.0)
Progress visualization and animations
Deep file structure analysis and exploration

ENCODE
tonl encode data.json --smart
DECODE
tonl decode data.tonl
QUERY
tonl query users.tonl 'users[*]'
GET
tonl get data.tonl "user.name"
VALIDATE
tonl validate --schema schema.tonl
STATS
tonl stats data.json --tokenizer claude-sonnet-4.5

🚀 Interactive CLI Usage Guide

Getting Started with Interactive Mode

tonl stats your-data.json --interactive

Launch the interactive dashboard with real-time file analysis and beautiful visual feedback.

Beautiful Themes

tonl stats data.json -i --theme neon # Bright colors
tonl stats data.json -i --theme matrix # Green terminal
tonl stats data.json -i --theme cyberpunk # Futuristic style
tonl stats data.json -i --theme default # Clean terminal

Choose from 4 visual themes for a personalized experience.

Interactive Menu Options

1. Analyze another file - Deep dive into different datasets
2. Compare two files - Side-by-side JSON/TONL analysis
3. Change theme - Switch visual styles instantly
4. Change tokenizer - Switch between LLM models
5. Detailed statistics - Comprehensive compression metrics
6. Exit - Clean exit with resource cleanup

Multi-Tokenizer Support

tonl stats data.json --tokenizer gpt-5 # GPT-5
tonl stats data.json --tokenizer claude-sonnet-4.5 # Claude Sonnet 4.5
tonl stats data.json --tokenizer gemini-2.5-pro # Gemini 2.5 Pro
tonl stats data.json --tokenizer gemini-3-pro # Gemini 3 Pro
tonl stats data.json --tokenizer llama-4 # Llama 4
tonl stats data.json --tokenizer o200k # OpenAI o200k
tonl stats data.json --tokenizer cl100k # OpenAI cl100k

Compare token costs across different LLM models in real-time.

View full CLI documentation

LLM Integration Prompt

System prompt for teaching LLMs to read TONL data

📋 Ready-to-Use System Prompt

Copy this into your LLM system prompt when sending TONL formatted data:

The following data is in TONL format. Parse it as follows:

• Lines with [count]{fields}: are array headers, data rows follow
• Lines with {fields}: are object headers, field: value pairs follow
• Indentation (2 spaces) indicates nesting levels
• Default delimiter is comma unless #delimiter header specifies otherwise
• Type hints may appear: field:type (e.g., id:u32, name:str)
  → Ignore the :type part, just parse the values
• Value types: unquoted numbers/booleans, quoted strings, null
• v2.0 Optimization: May contain #optimize directives (ignore these, they're metadata)

Examples:
Without types: users[2]{id,name,role}:
With types: users[2]{id:u32,name:str,role:str}:
With optimization: #optimize dictionary delta bitpack
All parse the same - just read the data values.

This represents: {"users": [{"id":1,"name":"Alice","role":"admin"}, {"id":2,"name":"Bob","role":"user"}]}

TONL v2.0 provides up to 60% additional compression while maintaining full LLM compatibility.
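
In practice, prepend the prompt above to the encoded data before sending it to the model. A minimal sketch; the TONL_SYSTEM_PROMPT constant and the message layout are illustrative, not part of the library API:

import { TONLDocument } from 'tonl';

// Paste the system prompt shown above into this constant
const TONL_SYSTEM_PROMPT = 'The following data is in TONL format. Parse it as follows: ...';

const doc = TONLDocument.fromJSON({
  users: [{ id: 1, name: 'Alice', role: 'admin' }]
});

// Generic chat-completion message shape; adapt to your LLM client
const messages = [
  { role: 'system', content: TONL_SYSTEM_PROMPT },
  { role: 'user', content: 'Answer using this data:\n' + doc.toTONL() }
];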

💡 Why This Works

  • Prompt is only ~180 tokens
  • Negligible cost vs 60-70% total data savings
  • Works with GPT, Claude, Gemini, Llama, etc.
  • LLMs naturally understand structured text
  • v2.0 optimization directives are transparent to LLMs

View full LLM integration guide

Schema Validation

Define and validate data structures with TONL schemas

✅ What is Schema Validation?

Schemas let you define data structure, types, and constraints. TONL validates data against schemas and can auto-generate TypeScript types.

Basic Schema Example

@schema v1
@strict true

User: obj
  id: u32 required
  username: str required min:3 max:20
  email: str required pattern:email
  age: u32? min:13 max:150
  roles: list<str> required

users: list<User> required min:1

Validation

import { parseSchema, validateTONL } from 'tonl/schema';

// Load and parse schema
const schema = parseSchema(schemaContent);

// Validate data
const result = validateTONL(data, schema);

if (!result.valid) {
  result.errors.forEach(err => {
    console.error(`${err.field}: ${err.message}`);
  });
}

Type System

Primitive Types

str, u32, i32, u64, i64, f32, f64, bool

Complex Types

obj, list, list<T>, optional (field?)

String Constraints

min, max, pattern, email, url

Number Constraints

min, max, positive, negative, integer

View full schema specification

Streaming API

Process multi-GB files with constant memory usage

🌊 Why Streaming?

Stream processing allows you to work with files larger than available RAM. TONL's line-based format is perfect for streaming - process records one at a time with <100MB memory usage.

Stream Encoding

import { createEncodeStream } from 'tonl/stream';
import { createReadStream, createWriteStream } from 'fs';

// Stream encode large JSON files
createReadStream('huge.json')
  .pipe(createEncodeStream({ smart: true }))
  .pipe(createWriteStream('huge.tonl'));

Stream Query

import { streamQuery } from 'tonl/stream';

// Query huge files efficiently
await streamQuery(
  'large-dataset.tonl',
  'users[?(@.active)]',
  (chunk) => {
    // Process each matching chunk
    console.log(chunk);
  }
);

// Memory stays constant ~10MB

Performance Metrics

Processing speed: ~50 MB/s
Memory usage: <100 MB
File size support: 10+ GB
Memory complexity: O(1)

v2.0 Advanced Optimization

10 optimization strategies for up to 60% additional compression

🚀 Complete Optimization Suite

TONL v2.0 introduces 10 advanced optimization strategies that provide up to 60% additional compression beyond standard TONL encoding.

🎯 10 Strategies

Dictionary, Delta, RLE, Bit Packing, Column Reorder, Quantizer, Schema Inheritance, Hierarchical Grouping, Tokenizer Aware, Adaptive

📈 60% Savings

Additional compression beyond standard TONL with automatic strategy selection

🚀 Zero Effort

One-line activation with automatic optimization

Quick Start

tonl encode data.json --optimize --verbose

Enable all optimization strategies with a single command

Optimization Strategies

📚 Dictionary Encoding

Compress repetitive values using lookup dictionaries

⏭️ Delta Encoding

Compress sequential numeric data (timestamps, IDs, counters)

🔄 Run-Length Encoding

Compress repeated consecutive values

💾 Bit Packing

Optimized binary encoding for booleans and small integers

🔄 Column Reordering

Optimize field order for better compression

🔢 Numeric Quantization

Reduce decimal precision safely
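
To make the idea concrete, this is what delta encoding does to sequential values. It is an illustrative sketch of the general technique, not TONL's internal implementation:

// Illustrative only: delta encoding keeps the first value and stores differences,
// turning large sequential numbers (timestamps, IDs) into small, compressible deltas.
function deltaEncode(values: number[]): number[] {
  if (values.length === 0) return [];
  const encoded = [values[0]];
  for (let i = 1; i < values.length; i++) {
    encoded.push(values[i] - values[i - 1]);
  }
  return encoded;
}

deltaEncode([1700000000, 1700000060, 1700000120]);
// [1700000000, 60, 60] -- repeated deltas then compress further with RLE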

API Usage

import { AdaptiveOptimizer } from 'tonl';

// Create optimizer
const optimizer = new AdaptiveOptimizer();

// Analyze dataset
const analysis = optimizer.analyzeDataset(data);
console.log(`Estimated savings: ${analysis.estimatedSavings}%`);

// Apply optimization
const result = optimizer.optimize(data);
console.log(`Directives: ${result.directives.length}`);

Performance Impact

Additional savings: 60%
Total vs JSON: 70%
Time complexity: O(n)
Backward compatible: 100%

View complete optimization documentation

Implementation Guide

Build TONL libraries in any language

Complete Guides Available

95KB of implementation documentation with algorithms, pseudo-code, and test requirements.

Supported Languages

🐍 Python
🔵 Go
🦀 Rust
Java

📊 Aggregation Functions

Powerful data aggregation with fluent API - count, sum, avg, min, max, groupBy and more

🆕 New in v2.5.2

Full-featured aggregation system with 15+ functions, fluent chaining, and deep integration with TONLDocument queries.

Quick Start

import { TONLDocument } from 'tonl';

const doc = TONLDocument.fromJSON({ users: [...], orders: [...] });

// Count users
doc.count('users[*]');                    // 42

// Sum order totals
doc.sum('orders[*]', 'total');           // 15420.50

// Average age
doc.avg('users[*]', 'age');             // 29.5

// Group by country
doc.groupBy('users[*]', 'country');    // { TR: [...], US: [...] }

// Chained operations
doc.aggregate('users[*]')
  .filter(u => u.active)
  .orderBy('age', 'desc')
  .take(10)
  .toArray();

Available Functions

Basic

count(), sum(field), avg(field), min(field), max(field)

Grouping

groupBy(field), distinct(field), frequency(field)

Statistical

stats(), median(), percentile(n), variance, stdDev

Transform

filter(), map(), reduce(), flatten()

Selection

first(), last(), at(n), take(n), skip(n)

Sorting

orderBy(field, 'asc'|'desc'), partition(fn)

Full Statistics Example

const stats = doc.aggregate('products[*]').stats('price');
// Returns:
// {
//   count: 150,
//   sum: 7499.50,
//   avg: 49.99,
//   min: 9.99,
//   max: 999.99,
//   variance: 12500.25,
//   stdDev: 111.80
// }

🔍 Fuzzy Matching

Levenshtein, Jaro-Winkler, Soundex algorithms for approximate string matching

🆕 New in v2.5.2

Find similar strings even with typos, variations, or phonetic similarities. Perfect for search, name matching, and data cleanup.

Query Operators

users[?(@.name ~= 'john')]
Fuzzy equality match (finds John, Jon, Joan...)
users[?(@.name ~contains 'smith')]
Fuzzy contains (finds Smith, Smyth, Smithe...)
users[?(@.name soundsLike 'Robert')]
Phonetic match using Soundex (finds Robert, Rupert...)

Direct API Usage

import {
  levenshteinDistance,
  fuzzyMatch,
  soundsLike,
  fuzzySearch
} from 'tonl/query/fuzzy-matcher';

// Levenshtein distance
levenshteinDistance('kitten', 'sitting');  // 3

// Fuzzy match with threshold
fuzzyMatch('John', 'Jon', { threshold: 0.8 });  // true

// Phonetic matching
soundsLike('Smith', 'Smyth');  // true

// Search with ranking
fuzzySearch('JavaScrpt', ['JavaScript', 'TypeScript', 'Python']);
// [{ value: 'JavaScript', similarity: 0.9, index: 0 }]

Algorithms

Levenshtein

Edit distance - insertions, deletions, substitutions

Jaro-Winkler

Optimized for short strings with prefix bonus

Dice Coefficient

Bigram overlap for longer text comparison

Soundex/Metaphone

Phonetic encoding for name matching

⏰ Temporal Queries

Filter and compare dates with natural syntax - @now-7d, @today, before, after, sameWeek

🆕 New in v2.5.2

Query data by date ranges, relative times, and calendar periods. Perfect for logs, events, and time-series data.

Temporal Literals

@now
Current timestamp
@today, @yesterday, @tomorrow
Named dates (start of day)
@now-7d, @now+1w, @now-3M
Relative time (d=days, w=weeks, M=months, y=years)
@2025-01-15, @2025-01-15T10:30:00Z
ISO 8601 dates

Query Examples

// Events in the last 7 days
doc.query('events[?(@.date > @now-7d)]');

// Orders from today
doc.query('orders[?(@.createdAt sameDay @today)]');

// Logs before a specific date
doc.query('logs[?(@.timestamp before @2025-01-01)]');

// Tasks due this week
doc.query('tasks[?(@.dueDate sameWeek @now)]');

// Records older than 3 months
doc.query('records[?(@.date daysAgo 90)]');

Temporal Operators

Comparison

before, after, between

Relative

daysAgo, weeksAgo, monthsAgo, yearsAgo

Calendar

sameDay, sameWeek, sameMonth, sameYear

Units

s(sec), m(min), h(hour), d(day), w(week), M(month), y(year)

Direct API Usage

import {
  parseTemporalLiteral,
  isBefore,
  isAfter,
  isSameDay
} from 'tonl/query/temporal-evaluator';

// Parse temporal literal
const lastWeek = parseTemporalLiteral('@now-7d');
console.log(lastWeek.timestamp);  // Unix timestamp

// Compare dates
isBefore(new Date('2025-01-01'), new Date('2025-12-31'));  // true
isAfter(new Date('2025-12-31'), new Date('2025-01-01'));   // true
isSameDay(new Date(), new Date());  // true