dms-cr
Crystal parser for DMS, a data syntax with strong typing, ordered maps, multi-line heredocs, and front-matter metadata.
This shard is a line-for-line port of the Python reference (dms-py), which itself follows the Rust reference (dms-rs). All seven ports check against the same fixture corpus.
What DMS looks like
A medium-size tier-0 document, exercising every feature you'd touch in a real config — front matter, comments (line + trailing), nested tables, list-of-tables with the + marker, flow forms, distinct types, and a heredoc with a trim modifier:
```
+++
title: "DMS feature tour"
version: "1.0.0"
updated: 2026-04-24T09:30:00-04:00
+++
# Hash and // line comments both work.
// Bare keys allow full Unicode; quoted keys take any string.
database:
  host: "db.internal"
  port: 5432 # bumped after the LB change
  pool: { size: 10, idle_timeout_s: 30 } # flow table
servers:
  + name: "web1"
    disks:
      + mount: "/"
        size_gb: 100
      + mount: "/var"
        size_gb: 500
  + name: "web2"
regions: ["us-east-1", "eu-west-1", "ap-south-1"]
sql: """SQL _trim("\n", ">")
SELECT id, email
FROM users
WHERE active = true
SQL
```
Tier 1 layers structured decorators on top of the value tree. Sigils bind to families published by a dialect; here is dms+html carrying an HTML fragment as a DMS document:
```
+++
_dms_tier: 1
_dms_imports:
  + dialect: "html"
    version: "1.0.0"
+++
+ |html(lang: "en")
  + |head
    + |title "DMS feature tour"
    + |meta(charset: "UTF-8")
  + |body(class: "main")
    + |h1 "Welcome to DMS"
    + |p(class: "lede")
      + "Click "
      + |a(href: "/spec.html") "here"
      + " to read the spec."
```
Full feature tour, format comparison, and dialect index on the DMS website.
Requirements
- Crystal 1.10 or newer
Install
In your shard.yml:
```yaml
dependencies:
  dms:
    gitlab: flo-labs/pub/dms-cr
    version: ~> 0.5.2
```
Then:
```sh
shards install
```
Usage
```crystal
require "dms"

src = File.read("config.dms")

# Body-only (drops front matter and comments after decode).
body = Dms.decode(src)

# Full document (preserves comments + literal forms for `encode` round-trip).
doc = Dms.decode_document(src)
doc.meta           # Dms::Table | Nil; nil when there is no `+++` block
doc.body           # the decoded root value (Dms::Value)
doc.comments       # Array(Dms::AttachedComment)
doc.original_forms # round-trip side-channel records

# Re-emit DMS source. Raises Dms::EncodeError if doc carries an
# UnorderedTable in full mode (use encode_lite for that case).
out = Dms.encode(doc)
```
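Reading a nested value out of the decoded body takes a type check at each level, because every node is a Dms::Value union. The following is a minimal sketch, assuming tables behave as Hash(String, Dms::Value) as listed under Value shape. The `dig` helper and the inline sample document are illustrative and not part of the shard.

```crystal
require "dms"

# Hypothetical helper (not part of dms-cr): follow a chain of table keys,
# returning nil as soon as a key is missing or a non-table node is reached.
def dig(value : Dms::Value, *keys : String) : Dms::Value?
  current : Dms::Value? = value
  keys.each do |key|
    return nil unless current.is_a?(Dms::Table)
    current = current[key]?
  end
  current
end

body = Dms.decode("db:\n port: 8080\n")
port = dig(body, "db", "port")
puts "port: #{port}" if port.is_a?(Int64)
```

Returning nil for both a missing key and a non-table parent keeps call sites to a single `is_a?` check on the final value.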
Front-matter-only decode
For callers that need only the document's metadata (config loaders checking _dms_tier, indexers harvesting user keys, dispatchers choosing a downstream decoder), Dms.decode_front_matter parses the +++ ... +++ block and stops, leaving the body bytes untokenized. SPEC tier 0 requires this entry point. Validation inside the FM block is identical to a full decode (open/close markers on their own lines, _-prefix namespace enforced, unterminated FM is an error). Because the body is never tokenized, errors in the body go unreported.
```crystal
case meta = Dms.decode_front_matter(src)
when Nil
  # document has no `+++` front-matter block
when Dms::Table
  # empty Hash => present-but-empty FM (`+++\n+++`),
  # distinguishable from `nil` above.
  if (title = meta["title"]?).is_a?(String)
    puts "title: #{title}"
  end
end
```
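The dispatcher use case mentioned above could look roughly like this. It is a sketch rather than shard API: `decode_by_tier` is a hypothetical helper, and treating a missing front matter or a missing `_dms_tier` key as tier 0 is an assumption made for illustration.

```crystal
require "dms"

# Hypothetical dispatcher (not part of dms-cr): read only the front matter,
# then hand the full source to the decoder matching its declared tier.
def decode_by_tier(src : String)
  meta = Dms.decode_front_matter(src)
  tier = meta["_dms_tier"]? if meta

  # Assumption: no front matter, or no `_dms_tier` key, means tier 0.
  if tier.is_a?(Int64) && tier >= 1
    Dms::Tier1.decode_t1(src) # => Dms::DocumentT1
  else
    Dms.decode_document(src) # => Dms::Document
  end
end
```

The result is a union of Dms::DocumentT1 and Dms::Document, so callers still need an `is_a?` or `case` check before using tier-specific accessors.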
The pre-v0.14 names (Dms.parse, Dms.parse_document, Dms.to_dms, Dms.to_dms_lite, …) remain as @[Deprecated] wrappers and continue to work; new code should use decode / encode.
Public API
Top-level entry points on Dms:
| Method | Purpose |
|---|---|
| Dms.decode(src) | Body-only decode → Dms::Value |
| Dms.decode_document(src) | Full decode (body + meta + comments + forms) |
| Dms.decode_lite(src) | Body-only, no comment/form sidecar |
| Dms.decode_lite_document(src) | Lite full decode |
| Dms.decode_document_unordered(src) | Full decode with UnorderedTable (HashMap-style) |
| Dms.decode_lite_document_unordered | Lite + unordered |
| Dms.decode_front_matter(src) | FM-only, body untokenized → Dms::Table? |
| Dms.encode(doc) | Re-emit DMS source (full round-trip) |
| Dms.encode_lite(doc) | Re-emit canonical form (lossy: drops comments/forms) |
| Dms::Tier1.decode_t1(src) | Tier-1 decode → Dms::DocumentT1 |
| Dms::ConformanceEncoder.encode(doc) | DMS → tagged-JSON for the conformance runner |
Capability flags: Dms::SUPPORTS_LITE_MODE, Dms::SUPPORTS_IGNORE_ORDER.
Value shape
| DMS type | Crystal type |
|---|---|
| bool | Bool |
| integer | Int64 |
| float | Float64 |
| string | String |
| local-date | Dms::LocalDate |
| local-time | Dms::LocalTime |
| local-datetime | Dms::LocalDateTime |
| offset-datetime | Dms::OffsetDateTime |
| table | Dms::Table (= Hash(String, Dms::Value)) |
| list | Dms::List (= Array(Dms::Value)) |
| unordered table | Dms::UnorderedTable (subclass of Table) |
The Dms::Value alias unions every variant above. Datetime structs (Dms::LocalDate and friends) wrap the source lexeme as a String that the parser has already validated against the SPEC, so you never re-parse to inspect them. Tables use Crystal's insertion-ordered Hash. UnorderedTable is a marker subclass: in a case … when, list it before Hash (or Dms::Table), because the first matching branch wins and the more specific subtype has to be tested first.
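The branch-ordering rule is easiest to see in a small classifier. This is an illustrative sketch: `kind_of` is a hypothetical helper, and the type names are the ones listed in the table above.

```crystal
require "dms"

# Illustrative helper (not part of dms-cr): name the variant of a decoded value.
def kind_of(value : Dms::Value) : String
  case value
  when Dms::UnorderedTable then "unordered table" # must precede Dms::Table
  when Dms::Table          then "table"
  when Dms::List           then "list"
  when String              then "string"
  when Int64               then "integer"
  when Float64             then "float"
  when Bool                then "bool"
  else
    # Remaining variants are the date/time structs.
    "date/time (#{value.class})"
  end
end
```

Swapping the first two branches would make every UnorderedTable report as "table", since it also satisfies `is_a?(Dms::Table)`.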
Working with comments and original forms
DMS preserves comments through decode → mutate → re-emit (SPEC §Comments). The Document carries them on a side-channel keyed by breadcrumb path; the same shape lets you attach a comment to a value after decoding and have it round-trip through encode:
```crystal
require "dms"

doc = Dms.decode_document("db:\n port: 8080\n")

# Mutate a value in place.
if (t = doc.body).is_a?(Dms::Table)
  if (db = t["db"]?).is_a?(Dms::Table)
    db["port"] = 5432_i64
  end
end

# Attach a leading line comment to db.port.
doc.comments << Dms::AttachedComment.new(
  Dms::Comment.new("# bumped after LB change", Dms::CommentKind::Line),
  Dms::CommentPosition::Leading,
  ["db".as(Dms::PathSeg), "port".as(Dms::PathSeg)],
)

puts Dms.encode(doc)
```
Forcing a heredoc on emit
Strings parse and re-emit in their source form. To switch a basic-quoted string to a heredoc (or to construct one from scratch), push an OriginalLiteral.string(...) record onto doc.original_forms keyed by the value's path:
```crystal
form = Dms::StringForm.heredoc(
  Dms::HeredocFlavor::BasicTriple, # or LiteralTriple for '''
  nil,                             # label, e.g. "EOF"
  [] of Dms::HeredocModifierCall,  # _trim(...), _fold_paragraphs(), …
)
doc.original_forms << {
  ["db".as(Dms::PathSeg), "greeting".as(Dms::PathSeg)],
  Dms::OriginalLiteral.string(form),
}
```
Round-trip rules (SPEC §Round-trip semantics): comments stick to still-present nodes; deleting a node drops its comments; newly inserted nodes start with no comments. The first original_forms entry per path wins, so to override a parser-recorded form, replace the existing entry for that path rather than appending a second one.
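One way to follow the "replace, don't append" rule is to drop any existing entry for the path before pushing the override. This is a sketch under an assumption: it treats doc.original_forms as an Array of {path, literal} tuples, which is what the `<<` example above suggests but which the shard may represent differently.

```crystal
require "dms"

doc = Dms.decode_document("db:\n greeting: \"hello\"\n")
path = ["db".as(Dms::PathSeg), "greeting".as(Dms::PathSeg)]
form = Dms::StringForm.heredoc(
  Dms::HeredocFlavor::BasicTriple,
  nil,
  [] of Dms::HeredocModifierCall,
)

# Assumption: original_forms is an Array of {path, literal} tuples.
# Remove any parser-recorded entry for this path, then add the override,
# so the new form is the first (and only) entry for the path.
doc.original_forms.reject! { |(entry_path, _literal)| entry_path == path }
doc.original_forms << {path, Dms::OriginalLiteral.string(form)}

puts Dms.encode(doc)
```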
Tier 1: decorators and dialects
Tier-1 source carries dialect imports + decorator calls (|tag(...), @expr(...), etc.). See TIER1.md for the full spec. dms-cr currently ships a tier-1 batch decoder; the encoder side is tracked for a future release.
```crystal
src = File.read("page.dms.html")
doc = Dms::Tier1.decode_t1(src)
doc.t0         # Dms::Document: the underlying tier-0 tree
doc.imports    # Array(Dms::ImportSpec)
doc.decorators # Array(Dms::DecoratorEntry): sidecar keyed by path
```
Errors
Decode-side failures raise Dms::DecodeError, which carries one-based line and column getters and formats its message as line:col: message:
```crystal
begin
  doc = Dms.decode_document(src)
rescue e : Dms::DecodeError
  STDERR.puts "parse failed at #{e.line}:#{e.column}: #{e.message}"
end
```
Encode-side failures raise Dms::EncodeError — currently raised only by full-mode encode when the input Document carries an UnorderedTable (those have arbitrary iteration order, so a stable round-trip cannot be promised). Use Dms.encode_lite for canonical emit on unordered Documents.
The pre-v0.3.0 name Dms::ParseError survives as a deprecated alias of Dms::DecodeError.
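When a caller does not know whether a Document carries an UnorderedTable, the two encoders can be chained. A minimal sketch, where `emit` is a hypothetical wrapper and not part of the shard:

```crystal
require "dms"

# Illustrative wrapper (not part of dms-cr): prefer a full round-trip emit,
# and fall back to the lossy canonical form when full-mode encode refuses.
def emit(doc)
  Dms.encode(doc)
rescue Dms::EncodeError
  # Comments and original literal forms are dropped on this path.
  Dms.encode_lite(doc)
end

doc = Dms.decode_document_unordered("db:\n port: 8080\n")
puts emit(doc)
```

The fallback silently trades fidelity for output, so a caller that depends on preserved comments may prefer to let the EncodeError propagate instead.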
When to use which decoder
| Goal | Entry point |
|---|---|
| Read config, no re-emit | Dms.decode |
| Read + re-emit, preserving comments / heredoc form | Dms.decode_document + Dms.encode |
| Read only the FM block (dispatch, schema check, index) | Dms.decode_front_matter |
| Tier-1 source (decorators, dialect imports) | Dms::Tier1.decode_t1 |
| Speed over round-trip fidelity | Dms.decode_lite / decode_lite_document |
| Don't care about table order (HashMap-ish) | Dms.decode_document_unordered |
Build & test
```sh
shards install   # pulls the toml dep used by the bench harness
shards build     # produces bin/dms-encoder + bench targets
crystal spec     # runs the spec suite
```
Build targets declared in shard.yml:
| Target | Source | Purpose |
|---|---|---|
| dms-encoder | src/dms-encoder.cr | DMS → tagged-JSON conformance encoder |
| bench-parse-dms | bench/parse_dms.cr | Parse-only DMS micro-benchmark |
| bench-parse-json | bench/parse_json.cr | Parse-only JSON benchmark (baseline) |
| bench-parse-yaml | bench/parse_yaml.cr | Parse-only YAML benchmark (baseline) |
| bench-parse-toml | bench/parse_toml.cr | Parse-only TOML benchmark (baseline) |
| bench-formats-cr | bench/bench_formats.cr | Cross-format wall-clock comparison |
Conformance
The fixture corpus lives in dms-tests (4500+ pairs). Clone it once as a sibling:
```sh
cd ..
git clone https://gitlab.com/flo-labs/pub/dms-tests.git
```
Then build the encoder and run the sweep:
```sh
shards build --release dms-encoder
python3 ../dms-tests/run_conformance.py bin/dms-encoder
```
The dms-encoder binary reads DMS from stdin and writes tagged JSON to stdout, matching the format the conformance runner consumes. dms-tests can also drive every implementation in one shot — see its README for the cross-language workflow.
Companion projects
| Repo / shard | Purpose |
|---|---|
| dms | Spec, fixtures index, and the dialect registry |
| dms-tests | Cross-language conformance corpus + runner |
| dms-rs | Rust reference implementation |
| dms-py | Python reference (this port follows it line-for-line) |
SPEC compliance
Every tier-0 feature in SPEC.md is implemented and exercised by the dms-tests corpus. Behavioural drift between ports is caught at the conformance gate, not at runtime. Tier-1 (decorators / dialects) is partially implemented — batch decode is shipped; encode is in progress.
License
Dual-licensed at your option:
Apache License 2.0
