「‍」 Lingenic

Format Comparison

HSV vs JSON, CSV, XML, and others

Overview

Format           | Hierarchy | Escaping | Binary       | Human Readable | Streaming   | Parsing
HSV              | Unlimited | Never*   | Native (DLE) | Data only      | Native      | Parallel
JSON             | Unlimited | Required | Base64       | Yes            | Awkward     | Sequential
CSV              | 1 level   | Quoting  | Base64       | Yes            | Line-based  | Sequential
XML              | Unlimited | Entities | Base64/CDATA | Verbose        | SAX parsers | Sequential
MessagePack      | Unlimited | Never    | Native       | Binary         | Native      | Sequential
Protocol Buffers | Schema    | Never    | Native       | Binary         | Native      | Sequential

* Text mode never needs escaping. Binary mode uses DLE transparency (~0.4% overhead).

vs JSON

Simple Object

JSON

{"name": "Alice", "age": 30, "city": "NYC"}

43 bytes

HSV

name␟Alice␞age␟30␞city␟NYC

26 bytes (about 40% smaller)
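The size comparison can be reproduced with a short script. This is a minimal sketch, assuming that ␟ and ␞ stand for the single-byte ASCII control codes US (0x1F) and RS (0x1E); `encode_flat` is a hypothetical helper written for this illustration, not part of any HSV library.

```python
import json

US = "\x1f"  # unit separator, shown as ␟: splits a key from its value
RS = "\x1e"  # record separator, shown as ␞: splits one property from the next


def encode_flat(obj: dict) -> str:
    """Encode a flat mapping as key␟value pairs joined by ␞ (hypothetical helper)."""
    return RS.join(f"{key}{US}{value}" for key, value in obj.items())


record = {"name": "Alice", "age": 30, "city": "NYC"}
hsv_bytes = encode_flat(record).encode("utf-8")
json_bytes = json.dumps(record).encode("utf-8")  # default ", " and ": " spacing

print(len(hsv_bytes), "bytes as HSV")
print(len(json_bytes), "bytes as JSON")
```

Note that the integer 30 is written out as the text `30`; this sketch makes no attempt to preserve value types.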

Special Characters

JSON (must escape)

{"msg": "He said \"hello\""}

HSV (no escaping)

msg␟He said "hello"

JSON (backslashes)

{"path": "C:\\Users\\data"}

HSV (literal)

path␟C:\Users\data

Multiline Content

JSON (escaped newlines)

{"text": "line 1\nline 2\nline 3"}

HSV (literal newlines)

text␟line 1
line 2
line 3
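The three cases above (quotes, backslashes, literal newlines) can be checked with a small decoder. This is a sketch under the assumption that a flat text-mode record is just properties separated by RS (0x1E) with US (0x1F) between key and value; `decode_flat` is a hypothetical name, and the sketch does not handle nesting or binary sections.

```python
US = "\x1f"  # ␟ key/value separator
RS = "\x1e"  # ␞ property separator


def decode_flat(text: str) -> dict:
    """Split a flat HSV record into a dict; values are taken verbatim."""
    pairs = (prop.split(US, 1) for prop in text.split(RS) if prop)
    return {key: value for key, value in pairs}


raw = (
    "msg" + US + 'He said "hello"' + RS
    + "path" + US + "C:\\Users\\data" + RS
    + "text" + US + "line 1\nline 2\nline 3"
)
fields = decode_flat(raw)

print(fields["msg"])               # quotes arrive untouched
print(fields["path"])              # single backslashes, no doubling
print(fields["text"].split("\n"))  # the newlines were part of the value
```

One Python-specific caution: `str.splitlines()` treats FS, GS, and RS as line boundaries, so it should not be used on raw HSV text; `split("\n")` is the safer choice.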

Streaming

JSON (NDJSON workaround)

{"name": "Alice"}
{"name": "Bob"}
{"name": "Carol"}

Breaks on raw newlines: newlines inside values must be escaped as \n, and pretty-printed JSON cannot be used

HSV (native framing)

␂name␟Alice␜name␟Bob␜name␟Carol␃

Newlines in data are fine. ␂=start ␜=separator ␃=end
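A consumer of such a stream can emit each record as soon as its closing ␜ or ␃ arrives, even when records are split across network reads. The following is a minimal sketch, assuming already-decoded text chunks, flat records, and the control codes STX (0x02), ETX (0x03), FS (0x1C), RS (0x1E), and US (0x1F); `iter_records` is a hypothetical name.

```python
from typing import Iterable, Iterator

STX, ETX, FS = "\x02", "\x03", "\x1c"
RS, US = "\x1e", "\x1f"


def iter_records(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per record as soon as its terminating ␜ or ␃ is seen."""
    buffer = ""
    in_block = False
    for chunk in chunks:
        for ch in chunk:
            if not in_block:
                in_block = ch == STX  # ignore anything outside a ␂...␃ block
                continue
            if ch in (FS, ETX):
                if buffer:
                    yield dict(p.split(US, 1) for p in buffer.split(RS))
                buffer = ""
                in_block = ch != ETX  # ␃ closes the block, ␜ keeps it open
            else:
                buffer += ch


# Records split at arbitrary points, one of them containing a newline.
stream = ["\x02name\x1fAli", "ce\x1cname\x1fBob\x1cnote\x1fline 1\nline", " 2\x03"]
for record in iter_records(stream):
    print(record)
```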

Nested Objects

JSON

{"user": {"name": "Alice", "email": "a@b.com"}}

HSV

user␟␎name␟Alice␞email␟a@b.com␏

Legend: ␟ = key:value · ␞ = properties · ␝ = array · ␎/␏ = nested (SO/SI) · ␜ = records · ␂/␃ = start/end

Both JSON and HSV support unlimited nesting. HSV uses SO/SI (Shift Out/Shift In) characters for nesting depth.
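Nesting with SO/SI can be handled by a small recursive encoder and decoder. This is a sketch under stated assumptions: SO is 0x0E, SI is 0x0F, only dictionaries and scalar values are supported (no ␝ arrays), and all scalars decode as strings. The function names are hypothetical.

```python
SO, SI, RS, US = "\x0e", "\x0f", "\x1e", "\x1f"


def encode(obj: dict) -> str:
    """Encode nested dicts; a dict value is wrapped in SO ... SI."""
    parts = []
    for key, value in obj.items():
        if isinstance(value, dict):
            value = SO + encode(value) + SI
        parts.append(f"{key}{US}{value}")
    return RS.join(parts)


def _decode_props(text: str, pos: int):
    """Parse properties from pos until a closing SI or the end of text."""
    result = {}
    while pos < len(text) and text[pos] != SI:
        sep = text.index(US, pos)
        key, pos = text[pos:sep], sep + 1
        if pos < len(text) and text[pos] == SO:
            value, pos = _decode_props(text, pos + 1)
            pos += 1  # step over the closing SI
        else:
            end = pos
            while end < len(text) and text[end] not in (RS, SI):
                end += 1
            value, pos = text[pos:end], end
        result[key] = value
        if pos < len(text) and text[pos] == RS:
            pos += 1
    return result, pos


def decode(text: str) -> dict:
    return _decode_props(text, 0)[0]


doc = {"user": {"name": "Alice", "email": "a@b.com"}}
encoded = encode(doc)
print(repr(encoded))
print(decode(encoded))
```

The decoder tracks depth through recursion rather than by escaping, so values may still contain quotes, braces, or backslashes unchanged.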

vs CSV

Aspect           | CSV                               | HSV
Hierarchy        | 1 level (rows and columns)        | Unlimited (SO/SI nesting)
Delimiter        | Comma (printable, common in data) | Control codes (never in data)
Quoting          | Required for special chars        | Never
Newlines in data | Requires quoting                  | Just works
Named fields     | Header row convention             | Built-in key-value
Nested data      | Not supported                     | Supported

The Quoting Problem

CSV

name,bio
Alice,"Software engineer, loves coding"
Bob,"Said ""hello"" yesterday"

HSV

␂name␟Alice␞bio␟Software engineer, loves coding␜name␟Bob␞bio␟Said "hello" yesterday␃

CSV requires quoting when data contains commas or quotes. HSV never needs quoting.
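The difference shows up when the same rows are written both ways. The sketch below uses Python's standard `csv` module for the CSV side; the HSV side is a hand-rolled join that assumes the control codes STX (0x02), ETX (0x03), FS (0x1C), RS (0x1E), and US (0x1F), and is not taken from any HSV library.

```python
import csv
import io

STX, ETX, FS, RS, US = "\x02", "\x03", "\x1c", "\x1e", "\x1f"

rows = [
    {"name": "Alice", "bio": "Software engineer, loves coding"},
    {"name": "Bob", "bio": 'Said "hello" yesterday'},
]

# CSV: the writer wraps fields in quotes and doubles embedded quotes.
csv_buffer = io.StringIO()
writer = csv.DictWriter(csv_buffer, fieldnames=["name", "bio"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = csv_buffer.getvalue()

# HSV: values are joined verbatim between control-code separators.
hsv_text = STX + FS.join(
    RS.join(f"{key}{US}{value}" for key, value in row.items()) for row in rows
) + ETX

print(csv_text)
print(repr(hsv_text))
```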

vs XML

XML

<user>
  <name>Alice</name>
  <age>30</age>
</user>

56 bytes

HSV

name␟Alice␞age␟30

17 bytes (70% smaller)

Entity Escaping

XML

<msg>x &lt; y &amp; a &gt; b</msg>

HSV

msg␟x < y & a > b
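The same contrast can be reproduced with Python's standard `xml.sax.saxutils` helpers. This is a small sketch; the HSV side assumes ␟ is the US control code (0x1F) and simply concatenates the key, the separator, and the untouched value.

```python
from xml.sax.saxutils import escape, unescape

US = "\x1f"  # ␟ key/value separator
message = "x < y & a > b"

# XML: &, < and > must be rewritten as entities before embedding.
xml_fragment = f"<msg>{escape(message)}</msg>"

# HSV: the value is embedded as-is.
hsv_fragment = f"msg{US}{message}"

print(xml_fragment)
print(repr(hsv_fragment))
print(unescape(escape(message)) == message)
```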

Document Structures (HTML-like data)

HSV can represent document trees. Angle brackets are just visible nesting delimiters—SO/SI are invisible ones:

HTML

<div class="box">
  <p>Hello <b>world</b></p>
</div>

HSV

tag␟div␞class␟box␞children␟␎
  tag␟p␞children␟␎
    text␟Hello
    ␝tag␟b␞text␟world
  ␏
␏

Where HSV wins: No entity escaping. Literal <, &, " in text content.

Where HTML wins: Human authoring, view source, 30 years of browser tooling.

AST Serialization

Abstract syntax trees are nested structures with node types and properties—a natural fit for HSV:

JSON AST

{"type": "BinaryExpr",
 "op": "+",
 "left": {"type": "Num", "value": 1},
 "right": {"type": "Num", "value": 2}}

HSV AST

type␟BinaryExpr␞op␟+␞left␟␎type␟Num␞value␟1␏␞right␟␎type␟Num␞value␟2␏

Compiler toolchains, linters, and code formatters can exchange ASTs without escaping operators like \, ", or &&.

Templating Engines

Templates mix literal text with structure. HSV keeps them separate without escape sequences:

Traditional template

Hello {{name}},
Your balance is ${{amount}}.
Click <a href="{{url}}">here</a>.

Escaping needed for quotes, angles, curlies

HSV template

text␟Hello ␝var␟name␝text␟,
Your balance is $␝var␟amount␝text␟.
Click ␝tag␟a␞href␟␎var␟url␏␞text␟here␝text␟.

Structure is separate from content

Template compilation becomes tree manipulation, not string surgery.
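As an illustration of rendering without string surgery, the sketch below walks the segments of a template and substitutes variables. It is deliberately limited: it assumes segments are separated by GS (0x1D), handles only `text` and `var` segments, and leaves out the nested `tag` segment from the example above. The `render` function and the context values are hypothetical.

```python
GS, US = "\x1d", "\x1f"  # ␝ separates segments, ␟ separates kind from payload


def render(template: str, context: dict) -> str:
    """Render a flat list of text/var segments against a context mapping."""
    out = []
    for segment in template.split(GS):
        kind, _, payload = segment.partition(US)
        if kind == "text":
            out.append(payload)               # literal text, copied verbatim
        elif kind == "var":
            out.append(str(context[payload]))  # variable lookup by name
        else:
            raise ValueError(f"unsupported segment kind: {kind!r}")
    return "".join(out)


template = (
    "text" + US + "Hello " + GS + "var" + US + "name" + GS
    + "text" + US + ",\nYour balance is $" + GS + "var" + US + "amount" + GS
    + "text" + US + "."
)
print(render(template, {"name": "Alice", "amount": "12.50"}))
```

Because literal text never passes through a template tokenizer, a `$`, `{{`, or quote inside a `text` segment needs no special treatment.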

vs Binary Formats

Aspect             | MessagePack / Protobuf | HSV
Size               | Smallest               | Slightly larger
Human readable     | No (binary)            | Yes (data visible)
Debuggable         | Needs tools            | Any text viewer
Escaping           | Never                  | Never
Schema             | Optional/required      | Not needed
Text tools         | grep, sed, awk fail    | Work fine
Embed binary blobs | Native                 | DLE transparency

HSV fills the gap: More efficient than JSON/XML, more debuggable than binary, and can still embed raw bytes when needed.

Binary Data

HSV uses DLE (Data Link Escape) for binary transparency—the same technique BISYNC used in 1967:

JSON (must encode)

{"image": "iVBORw0KGgo..."}

Base64: 33% size overhead

HSV (raw bytes)

image␟␐␂<raw PNG bytes>␐␃

DLE: ~0.4% overhead (escape only 0x10)

How it works:

Inside binary mode, only DLE needs escaping. All 17 other control codes become literal data.
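The mechanism can be sketched as classic DLE byte stuffing. The framing (DLE STX to open, DLE ETX to close) follows the example above; doubling a literal DLE as DLE DLE is an assumption borrowed from BISYNC transparent mode, since the page does not spell out the escape rule. `wrap_binary` and `unwrap_binary` are hypothetical names.

```python
DLE, STX, ETX = 0x10, 0x02, 0x03


def wrap_binary(payload: bytes) -> bytes:
    """Frame payload as DLE STX ... DLE ETX, doubling any DLE inside it."""
    stuffed = payload.replace(bytes([DLE]), bytes([DLE, DLE]))
    return bytes([DLE, STX]) + stuffed + bytes([DLE, ETX])


def unwrap_binary(frame: bytes) -> bytes:
    """Reverse wrap_binary; raises ValueError on a malformed frame."""
    if frame[:2] != bytes([DLE, STX]):
        raise ValueError("frame must start with DLE STX")
    out = bytearray()
    i = 2
    while i < len(frame):
        byte = frame[i]
        if byte != DLE:
            out.append(byte)
            i += 1
            continue
        if i + 1 >= len(frame):
            raise ValueError("dangling DLE")
        follower = frame[i + 1]
        if follower == DLE:    # DLE DLE is one literal 0x10 byte
            out.append(DLE)
            i += 2
        elif follower == ETX:  # DLE ETX closes the frame
            return bytes(out)
        else:
            raise ValueError("unexpected byte after DLE")
    raise ValueError("missing DLE ETX")


# Every byte value appears equally often, so 1 byte in 256 is a DLE.
payload = bytes(range(256)) * 4
frame = wrap_binary(payload)
stuffing = len(frame) - len(payload) - 4  # exclude the 4 framing bytes
print(f"{stuffing} stuffed bytes on {len(payload)} payload bytes")
print(unwrap_binary(frame) == payload)
```

For uniformly distributed bytes, one byte in 256 is a DLE and gets doubled, which is where an overhead of roughly 0.4% comes from; payloads that happen to contain many 0x10 bytes will cost more.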

Streaming Comparison

Method                  | Delimiter    | Newlines in data | Framing
NDJSON                  | Newline      | Must escape      | Implicit
JSON with length prefix | Byte count   | OK               | Binary header
Server-Sent Events      | data: prefix | Must escape      | Text protocol
HSV                     | ␂/␃ + ␜      | OK               | Native

Parallel Parsing

HSV is the only format here that supports parallel parsing at every data level.

Why other formats require sequential parsing:

Format      | Why Sequential
JSON        | Escape state (\") spans tokens; can't split safely
CSV         | Quote state spans cells; a newline might be inside quotes
XML         | Entity state (&amp;) and CDATA sections span boundaries
MessagePack | Length-prefixed; must decode header to find next item
Protobuf    | Varint lengths; must decode sequentially to find boundaries
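The CSV case is easy to demonstrate: cutting the input at every newline, as a naive parallel splitter would, produces a different number of pieces than a reader that tracks quote state. The sample data below is made up for illustration.

```python
import csv
import io

data = 'id,note\n1,"first line\nsecond line"\n2,plain\n'

# Naive approach: cut at every newline, ignoring quote state.
naive_chunks = data.rstrip("\n").split("\n")

# Correct approach: a sequential reader that tracks whether it is inside quotes.
real_rows = list(csv.reader(io.StringIO(data, newline="")))

print(len(naive_chunks), "chunks from the naive split")
print(len(real_rows), "rows from csv.reader")
print(real_rows[1])
```

The naive split cuts the quoted field in two, so neither half can be parsed on its own.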

In text mode, HSV has no escape state. Every separator byte (FS, GS, RS, US) is always a separator. Split on any of them and parse the chunks independently.

Parallel at Every Data Level

File level:     Scan for STX/ETX → split into blocks → parse blocks in parallel
Record level:   Scan for FS (␜) → split into records → parse records in parallel
Property level: Scan for RS (␞) → split into pairs → parse pairs in parallel
Array level:    Scan for GS (␝) → split into items → parse items in parallel

Every split is just "find byte, cut." No state machine. No lookahead. No backtracking.
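The record-level step can be sketched with a thread pool. The assumptions here are significant: text mode only, flat records with no SO/SI nesting or DLE binary sections, and the control codes STX (0x02), ETX (0x03), FS (0x1C), RS (0x1E), and US (0x1F). The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

STX, ETX, FS, RS, US = "\x02", "\x03", "\x1c", "\x1e", "\x1f"


def parse_record(chunk: str) -> dict:
    """Parse one record; needs no context from neighbouring chunks."""
    return dict(prop.split(US, 1) for prop in chunk.split(RS))


def parse_block_parallel(block: str, workers: int = 4) -> list:
    """Split a ␂...␃ block on ␜ and parse the records concurrently."""
    if block.startswith(STX) and block.endswith(ETX):
        block = block[1:-1]
    chunks = [chunk for chunk in block.split(FS) if chunk]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_record, chunks))  # map preserves input order


block = (
    STX + "name" + US + "Alice" + RS + "age" + US + "30"
    + FS + "name" + US + "Bob" + RS + "age" + US + "25" + ETX
)
print(parse_block_parallel(block))
```

In CPython, threads will not speed up this pure-Python parsing because of the global interpreter lock; the sketch is meant to show that chunks are independent, and a process pool or a native implementation would be needed to see real gains.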

When to Use What

Use Case                  | Recommended           | Why
Config files              | JSON, TOML, YAML      | Human editing matters
API responses (simple)    | HSV or JSON           | HSV smaller, JSON more tooling
Streaming data            | HSV                   | Native framing, no escaping
Log files                 | HSV                   | Multiline content, greppable
Data export               | HSV or CSV            | HSV for nested/complex data
High-performance          | Protobuf, FlatBuffers | Binary is fastest
Deep nesting (10+ levels) | HSV, JSON, or XML     | All support unlimited nesting

Summary

HSV is ideal when:

Stick with JSON when: