The forgotten hierarchy in every computer
ASCII, first published by the American Standards Association as ASA X3.4-1963 and revised in 1967, includes four separator characters designed specifically for hierarchical data:
| Code | Hex | Symbol | Name | Intended Use |
|---|---|---|---|---|
| 28 | 0x1C | [FS] | File Separator | Separate files or major sections |
| 29 | 0x1D | [GS] | Group Separator | Separate groups within a file |
| 30 | 0x1E | [RS] | Record Separator | Separate records within a group |
| 31 | 0x1F | [US] | Unit Separator | Separate fields within a record |
The codepoints were deliberately ordered: as the number decreases, the scope increases.
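As a rough illustration of that ordering, the sketch below splits a string from the widest scope (FS) down to the narrowest (US). The function name `parse` and the sample data are made up for this example, and the file/group/record/unit reading follows the "Intended Use" column above rather than any particular format.

```python
# The four ASCII separators, widest scope first.
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"

def parse(data: str) -> list:
    """Split text into files -> groups -> records -> units (hypothetical helper)."""
    return [
        [
            [record.split(US) for record in group.split(RS)]
            for group in file.split(GS)
        ]
        for file in data.split(FS)
    ]

# Two files: the first holds two groups, the second a single unit.
sample = "a" + US + "b" + RS + "c" + US + "d" + GS + "e" + FS + "f"
tree = parse(sample)
print(tree)
```

Each level only ever splits on its own separator, so no level needs to know how the levels below it are delimited.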
The ASCII committee envisioned hierarchical data storage decades before JSON, XML, or even CSV became widespread:
"These four can be used to subdivide data into structured groupings... The specific meaning of each separator is left to the application."
— Paraphrased from ANSI X3.4-1963; see RFC 20 (1969)
ASCII also defined transmission control characters for framing data:
| Code | Hex | Name | Purpose |
|---|---|---|---|
| 1 | 0x01 | SOH | Start of Header (control/metadata) |
| 2 | 0x02 | STX | Start of Text (content/data) |
| 3 | 0x03 | ETX | End of Text |
| 4 | 0x04 | EOT | End of Transmission |
| 16 | 0x10 | DLE | Data Link Escape |
The distinction is key: header = control, text = content. Headers carry routing, metadata, and protocol information. Text carries the actual payload. This is the same pattern used in email headers, HTTP headers, and packet headers today.
These were used in serial communication to frame messages. A typical transmission:
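A minimal sketch of such a frame, assuming the simple layout SOH, header, STX, text, ETX, with EOT closing the transmission. The helper names `frame` and `unframe` and the header contents are invented for illustration; real link protocols added checksums and timing rules that are omitted here.

```python
from typing import Tuple

SOH, STX, ETX, EOT = b"\x01", b"\x02", b"\x03", b"\x04"

def frame(header: bytes, text: bytes) -> bytes:
    """Wrap a header and a text payload in SOH ... STX ... ETX."""
    return SOH + header + STX + text + ETX

def unframe(message: bytes) -> Tuple[bytes, bytes]:
    """Split one framed message back into (header, text)."""
    if not (message.startswith(SOH) and message.endswith(ETX)):
        raise ValueError("not a framed message")
    header, sep, text = message[1:-1].partition(STX)
    if not sep:
        raise ValueError("missing STX")
    return header, text

# One message followed by end-of-transmission.
wire = frame(b"to=printer", b"Hello") + EOT
header, text = unframe(wire[:-1])  # strip the trailing EOT first
print(header, text)
```

This sketch assumes the payload contains none of the framing bytes; that limitation is exactly the problem described next.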
IBM's Binary Synchronous Communications protocol faced a problem: what if the data contains control characters?
Their solution: DLE (Data Link Escape). When you need to send arbitrary binary:
Inside a DLE-transparent section, only DLE itself needs escaping. All other bytes—including STX, ETX, and the separators—are literal data.
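The rule above can be sketched as byte stuffing. This example assumes a BISYNC-style convention in which DLE STX opens the transparent section, DLE ETX closes it, and a literal DLE in the data is sent as DLE DLE. The function names are hypothetical, and the exact byte sequences HSV uses are not given in this section.

```python
DLE, STX, ETX = 0x10, 0x02, 0x03

def encode_transparent(payload: bytes) -> bytes:
    """Wrap arbitrary bytes in DLE STX ... DLE ETX, doubling any DLE."""
    out = bytearray([DLE, STX])
    for b in payload:
        if b == DLE:
            out.append(DLE)  # escape DLE by sending it twice
        out.append(b)
    out += bytes([DLE, ETX])
    return bytes(out)

def decode_transparent(block: bytes) -> bytes:
    """Recover the payload from one transparent block."""
    if block[:2] != bytes([DLE, STX]):
        raise ValueError("block does not start with DLE STX")
    out = bytearray()
    i = 2
    while i < len(block):
        b = block[i]
        if b != DLE:
            out.append(b)  # every non-DLE byte is literal data
            i += 1
            continue
        if i + 1 >= len(block):
            raise ValueError("dangling DLE")
        nxt = block[i + 1]
        if nxt == DLE:
            out.append(DLE)  # DLE DLE -> one literal DLE
            i += 2
        elif nxt == ETX:
            if i + 2 != len(block):
                raise ValueError("bytes after DLE ETX")
            return bytes(out)
        else:
            raise ValueError("unexpected byte after DLE")
    raise ValueError("missing DLE ETX")

payload = bytes([0x02, 0x10, 0x03, 0x1F, 0x10, 0x10])
encoded = encode_transparent(payload)
assert decode_transparent(encoded) == payload
```

The decoder only treats a byte specially when it follows a DLE, so STX, ETX, and the separators pass through untouched.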
This elegant solution has been available since 1967. HSV uses it unchanged.
These characters were designed for hierarchical data, but the computing industry forgot them.
Instead, one format after another reinvented hierarchy using printable characters, which requires escaping, quoting, and complex parsers.
Using printable characters as delimiters means they can appear in data:
The ASCII separators (0x1C–0x1F) are non-printable and essentially absent from ordinary text, so text fields delimited by them need no quoting or escaping. They were designed for exactly this purpose.
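A small comparison makes the point concrete. The row contents are invented for this example; the `csv` module is Python's standard library, and the separator-based version assumes the field values themselves contain no 0x1F byte.

```python
import csv
import io

US = "\x1f"

# Field values that contain a comma, double quotes, and a newline.
row = ["Smith, John", 'said "hi"', "line1\nline2"]

# Naive comma joining: the comma inside the first field splits it in two.
naive = ",".join(row).split(",")
print(len(naive))  # 4 fields come back instead of 3

# CSV copes, but only by quoting fields and doubling embedded quotes.
buf = io.StringIO()
csv.writer(buf).writerow(row)
quoted = buf.getvalue()
print(repr(quoted))

# Unit Separator: a plain join and split round-trips the same row.
packed = US.join(row)
assert packed.split(US) == row
```

The CSV reader has to track quote state across characters and even across lines, while the separator version is a single `split` call.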
HSV uses ASCII control characters as they were intended: 22 C0 controls from ASCII, plus two C1 controls (SSA and ESA) for nesting:
Note: SOH (header) and STX (text) reflect the original ASCII distinction between control and content. Headers carry control information such as routing, metadata, and protocol operations. Text carries the actual data. A message can be header-only (pure control), text-only (pure content), or both.
| ASCII | HSV Purpose |
|---|---|
| Framing | |
| SOH (0x01) | Start header (control/metadata) |
| STX (0x02) | Start text (content/data) |
| ETX (0x03) | End block |
| EOT (0x04) | End stream |
| Protocol | |
| ENQ (0x05) | Enquiry (request acknowledgment) |
| ACK (0x06) | Acknowledge (success) |
| NAK (0x15) | Negative acknowledge (error) |
| CAN (0x18) | Cancel operation |
| Flow Control | |
| DC1 / XON (0x11) | Resume transmission |
| DC3 / XOFF (0x13) | Pause transmission |
| SYN (0x16) | Sync/keepalive |
| Device Control | |
| DC2 (0x12) | Connect to device/service |
| DC4 (0x14) | Disconnect (preferred stop) |
| Chunked Transfer | |
| ETB (0x17) | End block (more coming) |
| EM (0x19) | End of medium (data exhausted) |
| Nesting | |
| SSA (0x86) | Start nested structure (C1: Start of Selected Area) |
| ESA (0x87) | End nested structure (C1: End of Selected Area) |
| Binary | |
| DLE (0x10) | Binary transparency (escape to raw bytes) |
| SO (0x0E) | Shift Out (binary: enter shifted mode) |
| SI (0x0F) | Shift In (binary: exit shifted mode) |
| Structure | |
| FS (0x1C) | Separate records |
| GS (0x1D) | Separate array elements |
| RS (0x1E) | Separate properties |
| US (0x1F) | Separate key from value |
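To make the Structure rows concrete, here is a minimal sketch that encodes a list of records using the roles from the table: FS between records, RS between properties, US between key and value, GS between array elements, all wrapped in STX ... ETX. The function names and the exact grammar are assumptions for illustration, not HSV's reference implementation. The sketch treats every value as a string, so types are lost, and a one-element array decodes as a plain string.

```python
STX, ETX = "\x02", "\x03"
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"

def encode_value(value) -> str:
    """Join array elements with GS; stringify anything else."""
    if isinstance(value, (list, tuple)):
        return GS.join(str(v) for v in value)
    return str(value)

def encode_records(records) -> str:
    """Encode a list of dicts as one STX ... ETX text block (assumed layout)."""
    body = FS.join(
        RS.join(f"{key}{US}{encode_value(val)}" for key, val in rec.items())
        for rec in records
    )
    return STX + body + ETX

def decode_records(block: str) -> list:
    """Inverse of encode_records under the same assumptions."""
    if not (block.startswith(STX) and block.endswith(ETX)):
        raise ValueError("expected an STX ... ETX block")
    records = []
    for rec in block[1:-1].split(FS):
        props = {}
        for prop in rec.split(RS):
            key, _, raw = prop.partition(US)
            # A value containing GS is treated as an array of strings.
            props[key] = raw.split(GS) if GS in raw else raw
        records.append(props)
    return records

records = [
    {"name": "Alice", "tags": ["admin", "dev"]},
    {"name": "Bob", "tags": ["ops", "qa"]},
]
block = encode_records(records)
assert decode_records(block) == records
```

Neither function contains any quoting or escaping logic; the structure lives entirely in bytes that the sample values never use.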
This isn't a new invention. It's a return to what the ASCII committee designed 60 years ago—including BISYNC's DLE transparency for binary data, and full protocol support for bidirectional communication.
Non-printable: Unlike tab, line feed, and carriage return, these control codes essentially never appear in normal text or user input, so text needs no escaping.
Universal: Every computer system supports ASCII. These bytes work everywhere.
Invisible: The structure doesn't pollute the data. You see content, not syntax.
Hierarchical: Unlimited nesting via SSA/ESA, plus four structure levels and framing. Covers all data structures.
Unicode 1.0 (1991) added Control Pictures (U+2400–U+243F)—visible glyphs for control characters like [FS] [GS] [RS] [US]. These make control characters displayable in documentation and editors, yet they too remain largely forgotten.
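Because the Control Pictures block mirrors the C0 layout (U+2400 plus the control code), a display helper is short to write. The sketch below is one possible approach; the function name `visualize` is made up for this example.

```python
def visualize(text: str) -> str:
    """Replace C0 controls and DEL with their Unicode Control Pictures."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20:
            out.append(chr(0x2400 + code))  # e.g. 0x1F -> U+241F
        elif code == 0x7F:
            out.append("\u2421")  # SYMBOL FOR DELETE
        else:
            out.append(ch)
    return "".join(out)

# US separates key from value, RS separates the two properties.
print(visualize("name\x1fAlice\x1eage\x1f30"))
```

A helper like this is for display only: the stored data keeps the real control bytes, and the glyphs appear only when rendering for a human.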
Every computer already has these characters. Every programming language can use them. They have been part of ASCII since the 1960s.
We just forgot to use them.