Documentation
Learn how L0ss compresses different file types and what each compression level does.
L0ss Client - Open Source Edition
Want to compress files without uploading them? Try our 100% client-side version that runs entirely in your browser. All compression happens on your device: no uploads, it works offline, and it is fully open source under the MIT license.
Quick Reference
Overview of typical compression ranges for each file type.
Compression Levels
L0ss offers three compression levels, each with different tradeoffs between file size reduction and data preservation.
Minimal - Safe Optimizations
Recommended for: Production files where accuracy is critical
Typical reduction: 10-30%
Data loss: Low - Removes only redundant data (whitespace, comments, metadata)
Reversibility: Often 100% recoverable to original structure
- Removes whitespace and formatting
- Strips comments and documentation
- Removes unnecessary metadata
- Preserves all functional data
Moderate - Balanced Approach
Recommended for: Most use cases - best balance of size and quality
Typical reduction: 30-60%
Data loss: Medium - Removes non-critical data and applies smart optimizations
Reversibility: Partially recoverable (core data intact)
- All minimal optimizations
- Removes low-priority fields
- Rounds numbers to reasonable precision
- Deduplicates similar values
- Simplifies nested structures
Aggressive - Maximum Compression
Recommended for: Testing, prototypes, or when size is paramount
Typical reduction: 60-90%
Data loss: High - Keeps only essential data
Reversibility: Limited (skeleton structure only)
- All moderate optimizations
- Aggressive field removal
- Heavy numeric rounding
- Significant data sampling
- Schema simplification
File Type Compression Methods
JSON Files
JSON compression focuses on removing verbosity while maintaining structure.
Minimal Level:
- Remove whitespace and indentation
- Strip null values (configurable)
- Remove empty arrays/objects
Moderate Level:
- All minimal optimizations
- Advanced key compression - Frequency-based key shortening (e.g., "firstName" → "0")
- Remove optional fields (description, metadata, etc.)
- Round numbers to 2 decimal places
- Deduplicate repeated objects
- Shorten long string values
Aggressive Level:
- All moderate optimizations
- Keep only critical fields (id, name, core data)
- Round to integers where possible
- Sample large arrays (keep every nth item)
- Replace verbose values with abbreviations
Example:
// Original (428 bytes)
{
  "users": [
    {
      "id": 1,
      "name": "John Doe",
      "email": "john@example.com",
      "bio": "Software engineer with 10 years experience...",
      "score": 87.6543,
      "metadata": {
        "created": "2023-01-15",
        "updated": "2024-01-15"
      }
    }
  ]
}
// Moderate (156 bytes, 64% reduction)
{"users":[{"id":1,"name":"John Doe","email":"john@example.com","score":87.65}]}
CSV Files
CSV compression removes unnecessary columns and rows while preserving data structure. Includes advanced techniques for time-series and sequential data.
Minimal Level:
- Remove empty rows and columns - Eliminates rows/columns with no data
- Trim whitespace - Removes leading/trailing spaces from cells
- Remove duplicate rows - Deduplicates identical rows
Moderate Level:
- All minimal optimizations
- Remove low-variance columns - Removes columns where all values are identical
- Round numeric values - Reduces precision to specified decimal places
- Truncate long text fields - Shortens text fields over 50 characters
- Delta encoding - Stores differences between consecutive values instead of absolute values (ideal for time-series data)
- Dictionary encoding - Replaces repeated string values with integer codes (ideal for categorical data)
Delta Encoding: Inspired by the Gorilla time-series database (Facebook) and Daniel Lemire's frame-of-reference encoding. Stores the first value as a base/reference, then stores deltas (differences) for subsequent values. Example: [100, 105, 110] → [100, +5, +5]. Works best with monotonically increasing values like timestamps, sensor readings, or sequential IDs. Fully reversible via cumulative sum. A short sketch follows the references below.
References:
β’ "Effective compression using frame-of-reference and delta coding" (Lemire, 2012)
β’ "The Design of Fast Delta Encoding for Delta Compression Based Storage Systems" (ACM TOS, 2024)
β’ Wikipedia: Delta Encoding
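For illustration, a minimal delta encoder and decoder in TypeScript. The function names are our own and this is not the L0ss source; it only demonstrates the technique described above.
// Illustrative sketch of delta encoding for a numeric CSV column (not the L0ss source).
function deltaEncode(values: number[]): number[] {
  if (values.length === 0) return [];
  const out = [values[0]];                 // first value stored as the base/reference
  for (let i = 1; i < values.length; i++) {
    out.push(values[i] - values[i - 1]);   // store only the differences
  }
  return out;
}

function deltaDecode(deltas: number[]): number[] {
  const out: number[] = [];
  let acc = 0;
  for (const d of deltas) {
    acc += d;                              // cumulative sum restores the original values
    out.push(acc);
  }
  return out;
}

deltaEncode([100, 105, 110]);  // [100, 5, 5]
deltaDecode([100, 5, 5]);      // [100, 105, 110] - fully reversible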
Dictionary Encoding: Inspired by BtrBlocks (SIGMOD 2023) and SAP HANA dictionary compression. Builds a dictionary of unique values and replaces all occurrences with integer codes. Example: ["USA", "Canada", "USA", "Mexico"] → [0, 1, 0, 2] with dictionary {0: "USA", 1: "Canada", 2: "Mexico"}. Works best with categorical data (countries, statuses, types) where there are far fewer unique values than total values. Achieves 10-100x compression on real-world datasets with high repetition. Fully reversible via dictionary lookup. A short sketch follows the references below.
References:
• BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023)
• BtrBlocks on GitHub
• Wikipedia: Dictionary Coder
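And an equally small sketch of dictionary encoding, again in TypeScript with hypothetical names, not the L0ss source.
// Illustrative sketch of dictionary encoding for a categorical CSV column (not the L0ss source).
function dictionaryEncode(values: string[]): { codes: number[]; dictionary: string[] } {
  const index = new Map<string, number>();
  const dictionary: string[] = [];
  const codes = values.map((v) => {
    if (!index.has(v)) {
      index.set(v, dictionary.length);   // assign the next integer code to a new value
      dictionary.push(v);
    }
    return index.get(v)!;
  });
  return { codes, dictionary };
}

function dictionaryDecode(codes: number[], dictionary: string[]): string[] {
  return codes.map((c) => dictionary[c]); // reversible via dictionary lookup
}

dictionaryEncode(["USA", "Canada", "USA", "Mexico"]);
// { codes: [0, 1, 0, 2], dictionary: ["USA", "Canada", "Mexico"] }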
Aggressive Level:
- All moderate optimizations
- Keep only first N columns - Retains first 5 columns, discards rest
- Sample rows - Keeps every nth row (default: every 5th)
- Remove non-essential columns - Keeps only specified column indices
- Remove outliers - Statistical outlier detection using 2 standard deviations
- Statistical sampling - Systematic sampling keeping 30% of rows
Note: Aggressive CSV compression can result in significant data loss. These techniques are inspired by database compression algorithms like BtrBlocks (SIGMOD 2023) which achieves 60-80% reduction on real-world datasets. Always test with sample data before using in production.
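To make the tradeoff concrete, here is a TypeScript sketch of two of these reductions: systematic row sampling and 2-standard-deviation outlier filtering on a numeric column. The helpers are hypothetical and only illustrate the techniques listed above.
// Illustrative sketch only - not the L0ss source.
function sampleRows<T>(rows: T[], everyNth = 5): T[] {
  return rows.filter((_, i) => i % everyNth === 0);   // keep every nth row
}

function removeOutliers(values: number[], stdDevs = 2): number[] {
  if (values.length === 0) return values;
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance = values.reduce((a, b) => a + (b - mean) ** 2, 0) / values.length;
  const limit = stdDevs * Math.sqrt(variance);
  return values.filter((v) => Math.abs(v - mean) <= limit); // drop values beyond 2 std devs
}

sampleRows([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);               // [1, 6]
removeOutliers([10, 11, 9, 10, 10, 12, 8, 10, 10, 1000]);  // the 1000 outlier is dropped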
JavaScript Files
JavaScript compression uses advanced minification techniques inspired by tdewolff/minify, which "employs all the rules that JSMin does too, but has additional improvements."
Minimal Level:
- Remove comments - Single-line (//) and multi-line (/* */) comments
- Remove whitespace - Collapse multiple spaces, remove unnecessary newlines
- Preserve all functional code - No behavioral changes
Moderate Level:
- All minimal optimizations
- Remove console statements - console.log, console.debug, etc.
- Remove debugger statements - debugger; calls
- Enhanced whitespace removal - More aggressive space optimization around operators
- Optimize semicolons - Remove unnecessary semicolons before closing braces
- Shorten boolean literals - true → !0, false → !1
- Optimize number literals - 1.0 → 1, 0.5 → .5, 1000 → 1e3
- Shorten common patterns - undefined → void 0, Infinity → 1/0
Aggressive Level:
- All moderate optimizations
- Mangle variable names - Shorten identifiers to single letters (optional)
- Remove dead code - Eliminate unreachable code blocks (optional)
Note: Aggressive optimizations like name mangling can break code that relies on variable names (e.g., for reflection or debugging). Test thoroughly before using in production.
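For illustration, a few of the moderate-level rewrites expressed as naive TypeScript string replacements. A real minifier works on parsed tokens; regexes like these are only a sketch (they would also rewrite matches inside string literals) and are not the L0ss implementation.
// Illustrative sketch only - not the L0ss source. Do not use naive regexes in production.
function miniOptimize(src: string): string {
  return src
    .replace(/\bdebugger;?/g, "")          // remove debugger statements
    .replace(/\btrue\b/g, "!0")            // shorten boolean literals
    .replace(/\bfalse\b/g, "!1")
    .replace(/\bundefined\b/g, "void 0")   // shorten common patterns
    .replace(/\b(\d+)\.0\b/g, "$1")        // 1.0 -> 1
    .replace(/\b0\.(\d+)\b/g, ".$1");      // 0.5 -> .5
}

miniOptimize("let a = true; let b = 1.0 + 0.5;");
// 'let a = !0; let b = 1 + .5;'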
HTML Files
HTML compression uses HTML5-aware minification techniques inspired by tdewolff/minify.
Minimal Level:
- Remove HTML comments
- Remove whitespace between tags
- Collapse multiple spaces to single space
Moderate Level:
- All minimal optimizations
- Normalize DOCTYPE - Replace verbose doctypes with the short HTML5 version: <!DOCTYPE html>
- Remove default type attributes - Not needed in HTML5: <script type="text/javascript"> → <script>
- Collapse boolean attributes - disabled="disabled" → disabled
- Remove default attribute values - <input type="text"> → <input>
- Remove optional closing tags - </li>, </p>, </td>, </option>, etc.
- Remove optional quotes from attributes
Aggressive Level:
- All moderate optimizations
- Remove meta tags (configurable)
- Remove analytics/tracking scripts (configurable)
- Inline small CSS files (configurable)
Example:
// Before (1032 bytes)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<form>
  <input type="text" required="required">
  <button type="submit">Submit</button>
</form>
<ul>
  <li>Item 1</li>
  <li>Item 2</li>
</ul>
// After moderate (562 bytes, 45.5% reduction)
<!DOCTYPE html><form><input required><button>Submit</button></form><ul><li>Item 1<li>Item 2</ul>
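A sketch of two of the moderate-level rules above (collapsing boolean attributes and dropping the default script type), written as TypeScript regex passes. A production minifier works on a parsed DOM; this is illustrative only and not the L0ss implementation.
// Illustrative sketch only - not the L0ss source.
function minifyHtml(html: string): string {
  return html
    .replace(/<!--[\s\S]*?-->/g, "")                        // remove HTML comments
    .replace(/>\s+</g, "><")                                // remove whitespace between tags
    .replace(/\s(required|disabled|checked)="\1"/g, " $1")  // collapse boolean attributes
    .replace(/\stype="text\/javascript"/g, "");             // default type not needed in HTML5
}

minifyHtml('<input type="text" required="required">\n<script type="text/javascript"></script>');
// '<input type="text" required><script></script>'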
CSS Files
CSS compression uses structural optimization techniques inspired by CSSO (CSS Optimizer).
Minimal Level:
- Remove CSS comments
- Remove whitespace and line breaks
- Minify spacing around braces, colons, semicolons
Moderate Level:
- All minimal optimizations
- Shorten color codes - #ffffff → #fff
- Remove zero units - 0px → 0
- Merge duplicate properties - Combine margin + margin-left into a single declaration
- Shorten repeated values - margin: 10px 10px 10px 10px → margin: 10px
- Merge identical selectors - Combine multiple h1 blocks into one
- Remove duplicate properties - Keep only the last occurrence of duplicate declarations
Aggressive Level:
- All moderate optimizations
- Remove unused CSS rules (configurable)
- Merge duplicate selectors across file (configurable)
Example:
// Before (465 bytes)
body {
  margin: 20px 30px;
  margin-left: 0px;
}
h1 { font: 200 36px/1.5 sans-serif; }
h1 { color: #ff6600; }
.button {
  padding: 10px 10px 10px 10px;
  background-color: #ffffff;
}
// After moderate (238 bytes, 48.8% reduction)
body{margin:20px 30px 20px 0}h1{font:200 36px/1.5 sans-serif;color:#f60}.button{padding:10px;background-color:#fff}
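For illustration, a few of these rules as a TypeScript sketch (color shortening, zero-unit removal, and basic spacing cleanup). The function is hypothetical, not the L0ss or CSSO source.
// Illustrative sketch only - not the L0ss source.
function minifyCss(css: string): string {
  return css
    .replace(/\/\*[\s\S]*?\*\//g, "")                               // remove comments
    .replace(/\s*([{}:;,])\s*/g, "$1")                              // minify spacing
    .replace(/;}/g, "}")                                            // drop trailing semicolons
    .replace(/#([0-9a-f])\1([0-9a-f])\2([0-9a-f])\3/gi, "#$1$2$3")  // #ffffff -> #fff
    .replace(/([\s:,]|^)0(?:px|em|rem|%)/gi, (_m, before) => before + "0"); // 0px -> 0
}

minifyCss(".button { padding: 0px; background-color: #ffffff; }");
// '.button{padding:0;background-color:#fff}'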
SVG Files
SVG compression uses optimization techniques inspired by SVGO (SVG Optimizer) and SVGOMG (Jake Archibald's web GUI for SVGO).
Minimal Level:
- Remove XML declaration - The <?xml ... ?> header
- Remove comments - <!-- ... --> blocks
- Remove metadata - Editor-specific data (Inkscape, Sodipodi, etc.)
- Remove DOCTYPE - <!DOCTYPE ...> declarations
- Remove whitespace - Collapse spaces between tags
Moderate Level:
- All minimal optimizations
- Remove empty groups - Empty <g> and similar containers (SVGO removeEmptyContainers)
- Optimize viewBox - Remove redundant width/height when viewBox exists
- Remove hidden elements - display:none, opacity:0, width/height=0 (SVGO removeHiddenElems)
- Cleanup attributes - Remove newlines, trim spaces (SVGO cleanupAttrs)
- Remove empty attributes - Empty class, id, style, etc. (SVGO removeEmptyAttrs)
- Collapse groups - Move single-child group attrs to child (SVGO collapseGroups)
- Convert ellipse to circle - When rx === ry, use shorter circle tag (SVGO convertEllipseToCircle)
- Remove useless stroke/fill - Remove invalid stroke/fill on non-shape elements (SVGO removeUselessStrokeAndFill)
- Optimize path data - Round coordinates, remove spaces, shorten numbers (SVGO convertPathData)
- Round coordinates - Reduce precision to 2 decimal places
- Shorten colors - #ffffff → #fff, named colors → hex
- Remove default attributes - fill="black", opacity="1", stroke-width="1"
- Remove unused namespaces - xmlns:* declarations not used
Aggressive Level:
- All moderate optimizations
- Simplify paths - Reduce precision to 1 decimal place (may affect quality)
- Remove descriptions - Remove <title> and <desc> elements
- Remove unused IDs - IDs that aren't referenced by url(#...) or href
- Simplify transforms - Remove identity transforms, simplify matrix()
Note: Aggressive path simplification can degrade visual quality. Always compare before/after visuals to ensure acceptable results.
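Coordinate rounding is the easiest of these to show in code. Below is a TypeScript sketch of rounding path data to a given precision; the function name is hypothetical and this is not the L0ss or SVGO source.
// Illustrative sketch only - not the L0ss source.
// Precision 2 corresponds to the moderate level, precision 1 to the aggressive level.
function roundPathData(d: string, precision = 2): string {
  return d.replace(/-?\d+\.\d+/g, (n) => {
    const rounded = Number(n).toFixed(precision);
    return String(Number(rounded));   // Number() drops trailing zeros: "40.50" -> "40.5"
  });
}

roundPathData("M 10.4567 20.1234 L 30.9876 40.5001", 2);
// 'M 10.46 20.12 L 30.99 40.5'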
SQL Files
SQL compression uses query optimization techniques inspired by tdewolff/minify to reduce verbosity while maintaining query functionality.
Minimal Level:
- Remove single-line comments - -- comment text
- Remove multi-line comments - /* comment blocks */
- Collapse whitespace - Multiple spaces/tabs to single space
- Remove line breaks - Normalize to single-line format
- Trim statement boundaries - Remove leading/trailing whitespace
Moderate Level:
- All minimal optimizations
- Shorten table/column aliases - Long names (user_table → a, product_catalog → b)
- Remove optional keywords - OUTER from joins, PUBLIC schema qualifiers, CASCADE keywords
- Combine INSERT statements - Merge consecutive INSERTs into multi-value format: INSERT INTO t VALUES(1),(2),(3)
- Normalize spacing around operators - = 1 → =1, + 5 → +5
Aggressive Level:
- All moderate optimizations
- Remove schema changes - Strip CREATE, DROP, ALTER statements
- Remove transactions - Remove BEGIN, COMMIT, ROLLBACK
- Remove constraints - Strip PRIMARY KEY, FOREIGN KEY, CHECK
- Sample INSERT rows - Keep only representative subset (every 5th row)
- Keep data operations only - Optionally filter to INSERT/UPDATE/SELECT/DELETE
Expected Reduction: 35% (minimal), 49.8% (moderate), 70.4% (aggressive) depending on query complexity and data volume.
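For illustration, the minimal-level SQL pass is roughly this TypeScript sketch. It does not protect string literals (a real implementation must skip quoted text) and is not the L0ss source.
// Illustrative sketch only - not the L0ss source.
function minifySql(sql: string): string {
  return sql
    .replace(/--[^\n]*/g, "")          // remove single-line comments
    .replace(/\/\*[\s\S]*?\*\//g, "")  // remove multi-line comments
    .replace(/\s+/g, " ")              // collapse whitespace and line breaks
    .trim();                           // trim statement boundaries
}

minifySql("SELECT *\n  FROM users -- all users\n  WHERE id = 1;");
// 'SELECT * FROM users WHERE id = 1;'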
XML/YAML Files
Structured data compression inspired by tdewolff/minify removes verbosity from markup and configuration files.
XML - Minimal Level:
- Remove comments - <!-- ... --> XML comments
- Collapse whitespace - Multiple spaces/tabs to single space
- Remove indentation - Normalize to single-line format
- Trim tag whitespace - Remove spaces between tags
XML - Moderate Level:
- All minimal optimizations
- Remove DOCTYPE - Strip <!DOCTYPE ...> declarations
- Remove CDATA sections - Convert to plain text
- Remove empty elements - <element></element> removal
- Collapse boolean attributes - attr="attr" → attr
- Remove attribute quotes - id="value" → id=value (when safe)
- Round numeric values - Reduce precision to 2 decimal places
YAML - Minimal Level:
- Remove comments - # YAML comments
- Normalize whitespace - Consistent indentation
- Trim trailing spaces - Remove line-end whitespace
YAML - Moderate Level:
- All minimal optimizations
- Remove document markers - Strip --- and ... markers
- Inline short strings - Convert block scalars (| and >) to inline
- Remove unnecessary quotes - Unquote simple strings
- Shorten null values - Convert null to ~ (shorter YAML null)
- Convert to flow style - Arrays/objects to compact format
Expected Reduction: 10-35% depending on markup verbosity and structure complexity.
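As an illustration of the XML minimal pass, a small TypeScript sketch. A production implementation parses the document rather than using regexes; this is not the L0ss source.
// Illustrative sketch only - not the L0ss source.
function minifyXml(xml: string): string {
  return xml
    .replace(/<!--[\s\S]*?-->/g, "")   // remove comments
    .replace(/>\s+</g, "><")           // remove indentation between tags
    .replace(/\s+/g, " ")              // collapse remaining whitespace
    .trim();
}

minifyXml('<root>\n  <item id="1">A</item>\n  <item id="2">B</item>\n</root>');
// '<root><item id="1">A</item><item id="2">B</item></root>'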
Markdown Files
Markdown compression removes redundant formatting and simplifies structure while maintaining readability and rendering compatibility.
Minimal Level:
- Remove extra blank lines - Collapse multiple blank lines to single
- Normalize whitespace - Consistent spaces, no tabs
- Trim trailing spaces - Remove line-end whitespace
- Remove trailing newlines - Clean EOF whitespace
Moderate Level:
- All minimal optimizations
- Simplify reference-style links - Convert [text][ref] to [text](url) inline format
- Remove HTML comments - Strip <!-- ... --> comments from Markdown
- Simplify headers - Remove trailing # from ATX-style headers (## Title ## → ## Title)
- Clean code blocks - Remove emphasis markers (*_) that don't render in code
- Remove link reference definitions - Delete unused [ref]: url lines at end of document
Aggressive Level:
- All moderate optimizations
- Shorten image alt text - Truncate to 30 chars + "..." for compression
- Remove redundant emphasis - **bold** inside headers (already bold)
- Compact list formatting - Remove extra spacing between list items
- Remove empty link titles - [text](url "title") → [text](url)
Expected Reduction: 0.4% (minimal), 12.6% (moderate), 20-25% (aggressive) depending on document structure and formatting verbosity.
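A TypeScript sketch of three of the rules above (trailing-space trimming, blank-line collapsing, and trailing-# removal from ATX headers). Illustrative only, not the L0ss source.
// Illustrative sketch only - not the L0ss source.
function compressMarkdown(md: string): string {
  return md
    .replace(/[ \t]+$/gm, "")                  // trim trailing spaces
    .replace(/\n{3,}/g, "\n\n")                // collapse multiple blank lines
    .replace(/^(#{1,6} .*?) #+\s*$/gm, "$1");  // '## Title ##' -> '## Title'
}

compressMarkdown("## Title ##\n\n\n\nSome text   \n");
// "## Title\n\nSome text\n"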
Plain Text Files
Text file compression focuses on whitespace normalization and duplicate content removal while preserving readability.
Minimal Level:
- Remove trailing whitespace - Strip spaces/tabs at end of lines
- Normalize line endings - Convert CRLF (Windows) → LF (Unix)
- Remove empty lines at EOF - Clean trailing newlines
- Normalize final newline - Ensure single newline at end
Moderate Level:
- All minimal optimizations
- Collapse multiple blank lines - Multiple blank lines → single blank line
- Normalize spaces - Replace tabs with spaces (configurable width)
- Remove duplicate consecutive lines - Identical adjacent lines kept once
- Normalize multiple spaces - Multiple spaces → single space (optional)
Aggressive Level:
- All moderate optimizations
- Remove all blank lines - Strip all empty lines completely
- Trim leading spaces - Remove indentation from all lines
- Remove non-consecutive duplicates - Deduplicate entire file (keep first occurrence)
- Collapse multiple spaces - Multiple spaces → single space throughout
Expected Reduction: 2.5% (minimal), 7.6% (moderate), 11.8% (aggressive) for typical text files with moderate formatting and whitespace.
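The core of the moderate-level text pass fits in a few lines; here is a TypeScript sketch (hypothetical function, not the L0ss source).
// Illustrative sketch only - not the L0ss source.
function compressText(text: string): string {
  const lines = text
    .replace(/\r\n/g, "\n")                         // CRLF -> LF
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, ""));    // trim trailing whitespace

  const out: string[] = [];
  for (const line of lines) {
    const prev = out[out.length - 1];
    if (line === "" && prev === "") continue;       // collapse multiple blank lines
    if (line !== "" && line === prev) continue;     // drop duplicate consecutive lines
    out.push(line);
  }
  return out.join("\n");
}

compressText("line\r\nline\r\n\r\n\r\nother   \r\n");
// "line\n\nother\n"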
Research & Techniques
L0ss compression algorithms are inspired by proven open-source projects and academic research in data compression. We stand on the shoulders of giants and give credit where it's due.
JSON Key Compression
Our advanced JSON key compression algorithm uses frequency analysis to assign the shortest codes to the most common keys. This technique can achieve 40-60% additional compression on JSON files with repeated key structures.
Inspired by these projects:
- jsonschema-key-compression by pubkey - Schema-based key shortening that compresses long property names into minimal codes (e.g., "firstName" → "|e")
- compress-json by beenotung - Space-efficient JSON storage through structural compression
- json-deduper by mattkrick - Compresses JSON trees by deduplicating nested objects, strings, and numbers
How it works:
- Frequency Analysis: Scan entire JSON tree to count how often each key appears
- Priority Sorting: Sort keys by frequency (most common first) and length (longest first)
- Smart Encoding: Assign single-character codes (0-9, a-z, A-Z) to top 62 keys, then |a, |b, |c... for additional keys
- Recursive Application: Apply key mappings throughout the entire object tree
Real-world example:
// Before (1012 bytes)
{
  "users": [
    {
      "firstName": "John",
      "lastName": "Doe",
      "emailAddress": "john@example.com",
      "phoneNumber": "555-1234",
      "streetAddress": "123 Main St",
      "city": "Springfield",
      "state": "IL",
      "postalCode": "62701"
    }
    // ... 2 more users with same structure
  ]
}
// After moderate compression (465 bytes, 54% reduction)
{"e":[{"0":"123 Main St","1":"john@example.com","2":"555-1234","3":"62701","4":"John","5":"Doe","6":"USA","7":"IL","8":"Springfield"},...]}
// Keys compressed: "firstName" → "4", "emailAddress" → "1", "users" → "e"
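Below is a TypeScript sketch of the frequency-analysis and recursive-renaming steps described above. It illustrates the algorithm only; the names are hypothetical, this is not the L0ss source, and the real encoder also records the key map in the recovery manifest so the step can be reversed.
// Illustrative sketch only - not the L0ss source.
const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

function buildKeyMap(root: unknown): Map<string, string> {
  const counts = new Map<string, number>();
  (function walk(value: unknown) {
    if (Array.isArray(value)) value.forEach(walk);
    else if (value !== null && typeof value === "object") {
      for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
        counts.set(k, (counts.get(k) ?? 0) + 1);      // 1. frequency analysis
        walk(v);
      }
    }
  })(root);

  const ranked = [...counts.entries()].sort(
    (a, b) => b[1] - a[1] || b[0].length - a[0].length // 2. most frequent, then longest, first
  );

  const map = new Map<string, string>();
  ranked.forEach(([key], i) => {
    // 3. single-character codes for the top 62 keys, two-character codes after that
    map.set(key, i < ALPHABET.length ? ALPHABET[i] : "|" + ALPHABET[(i - ALPHABET.length) % ALPHABET.length]);
  });
  return map;
}

function renameKeys(value: unknown, map: Map<string, string>): unknown {
  if (Array.isArray(value)) return value.map((v) => renameKeys(v, map));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[map.get(k) ?? k] = renameKeys(v, map);      // 4. apply the mapping recursively
    }
    return out;
  }
  return value;
}

const data = JSON.parse('{"users":[{"firstName":"John","lastName":"Doe"}]}');
console.log(JSON.stringify(renameKeys(data, buildKeyMap(data))));
// {"2":[{"0":"John","1":"Doe"}]}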
Multi-Format Minification
Our HTML, CSS, JavaScript, and SVG minification techniques draw from best practices in web optimization.
Inspired by these projects:
- tdewolff/minify - High-performance Go minifiers for HTML5, CSS3, JS, JSON, SVG, and XML
- CSSO - CSS optimizer with structural optimizations, merging declarations and rules
- html-minifier - Configurable JavaScript-based HTML minifier with SVG support
- SVGOMG - Visual SVG compression tool for web optimization
Deduplication Techniques
Finding and removing duplicate data blocks is essential for efficient compression.
Inspired by these projects:
Future Research
We're actively exploring additional compression techniques:
- Lossy text compression: Using thesaurus-based word replacement (inspired by lossytextcompressor)
- Schema-based template compression: Extracting schemas for homogeneous data arrays (HPack algorithm)
- Object structure deduplication: Storing unique nested objects once and using references
- Machine learning compression: Adaptive compression based on file patterns
Recovery Manifests
Every compression generates a recovery manifest that documents what changes were made. This manifest allows you to understand exactly what was removed or modified.
Manifest Contents:
- Operations Applied: List of all transformations
- Original Hash: Verify file integrity
- Compression Stats: Before/after sizes, reduction percentage
- Reversibility Score: How much can be recovered
- Field Mapping: Which fields were removed or modified
Using Manifests:
// Download your manifest
GET /api/manifest/:manifestId
// Response includes recovery information
{
  "operations": [
    {"type": "remove_whitespace", "reversible": true},
    {"type": "remove_field", "field": "metadata", "reversible": false},
    {"type": "round_numbers", "precision": 2, "reversible": false}
  ],
  "reversibility": "Partially recoverable"
}
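A minimal usage sketch in TypeScript, assuming the endpoint shown above returns the JSON shape in the example; adjust the base URL for your deployment. This is an illustration, not part of the official client.
// Usage sketch - assumes the GET /api/manifest/:manifestId endpoint shown above.
async function fetchManifest(manifestId: string) {
  const res = await fetch(`/api/manifest/${manifestId}`);
  if (!res.ok) throw new Error(`Manifest request failed: ${res.status}`);
  const manifest = await res.json();
  // List the operations that cannot be undone before deciding to discard the original file.
  const irreversible = manifest.operations.filter((op: { reversible: boolean }) => !op.reversible);
  console.log(manifest.reversibility, "- irreversible operations:", irreversible.length);
  return manifest;
}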
Best Practices
Choosing a Compression Level:
| Use Case | Recommended Level | Reason |
|---|---|---|
| Production backups | Minimal | Preserve all functional data |
| Development datasets | Moderate | Good balance for testing |
| Prototypes/demos | Aggressive | Size matters more than accuracy |
| Transfer large logs | Moderate | Keep essential info, reduce size |
| Archive old data | Minimal | Future recovery may be needed |
Tips for Maximum Compression:
- Remove unnecessary fields first: Clean your data before uploading
- Use appropriate file types: JSON compresses better than XML
- Deduplicate before compression: Remove duplicate entries first
- Download the manifest: Always save recovery information
- Test with samples: Try different levels on a subset first
API Reference
For programmatic access, see the API Endpoints section on the homepage.