= crystal-combine-pdf
:toc: macro
:toclevels: 2
ISO-compatible Crystal port of the Ruby gem https://github.com/boazsegev/combine_pdf[`combine_pdf`] v1.0.31. Drop-in syntax for Ruby users: `CombinePDF.new`, `CombinePDF.load`, `CombinePDF.parse` plus the `<<` / `>>` / `insert` / `remove` / `save` / `to_pdf` / `number_pages` instance API are wired up and behave the same way.
Beyond the gem's API, it ships two Crystal-specific bonuses:

- A4-aware page numbering: fixes the upstream US Letter hardcoding bug of `number_pages`; coordinates are derived from each page's actual `MediaBox`.
- Partition-aware intra-numbering: lets a music-score booklet show "1/4", "2/4" on a 4-page partition while the global "3/12" stays in the bottom-right corner.
🇫🇷 Read this document in French: link:README.fr.adoc[README.fr.adoc]
toc::[]
== Quickstart (declarative mode, recommended for booklets)

Three commands to assemble a folder of PDFs into a booklet:

[source,shell]
----
cd /chemin/vers/mes-partitions
crystal-combine-pdf init           # generates .crystal-combine-pdf.yml
$EDITOR .crystal-combine-pdf.yml   # reorder, title, exclude
crystal-combine-pdf                # builds mes-partitions.pdf
----

See the <<Mode déclaratif,declarative mode>> section for the details of the `.crystal-combine-pdf.yml` file.
== Quickstart (ISO API, mirrors the Ruby gem)
[source,crystal]
----
require "crystal-combine-pdf"

# Build a fresh PDF
pdf = CombinePDF.new

# Load + append two existing PDFs
pdf << CombinePDF.load("partition1.pdf")
pdf << CombinePDF.load("partition2.pdf")

# Optional: prepend a cover page
pdf >> CombinePDF.load("cover.pdf")

# Optional: metadata
pdf.title = "Recueil de partitions"
pdf.author = "Philippe Nénert"

# Optional: number the pages (Crystal-specific bonus)
pdf.number_pages

# Save
pdf.save("livret.pdf")
----
== Why
The Ruby gem `combine_pdf` is the de facto tool for assembling PDF booklets. Its `number_pages` helper draws the page number at hardcoded coordinates derived from US Letter (612 × 792 pt). On A4 (595 × 842 pt), Asia A4 or A3 the number lands several centimetres off.

Beyond the format mismatch, an assembled music-score booklet, where each "partition" (sheet) may span 1, 2 or N pages, needs a second layer of numbering: a small "1/4", "2/4", … in the corner of each page of the partition, so the performer knows when to turn the page and how many are left in the current piece.

`crystal-combine-pdf` does both, in pure Crystal, with the page geometry read from the actual `MediaBox` of every page.
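
To make the geometry point concrete, here is a minimal sketch (not the shard's actual implementation) that derives the two numbering anchors from a page's `MediaBox`. The `MediaBox` record and the `bottom_right` / `top_left` helpers are hypothetical names introduced for illustration; the 24 pt margin is the documented default, and the page sizes are the rounded values quoted above.

[source,crystal]
----
# Hypothetical types and helpers, not part of the shard's documented API.
# A MediaBox is [llx, lly, urx, ury], in points (1 pt = 1/72 inch).
record MediaBox, llx : Float64, lly : Float64, urx : Float64, ury : Float64

LETTER = MediaBox.new(0.0, 0.0, 612.0, 792.0)
A4     = MediaBox.new(0.0, 0.0, 595.0, 842.0)

# Anchor of the global number: bottom-right corner, inset by `margin`.
def bottom_right(box : MediaBox, margin : Float64 = 24.0) : {Float64, Float64}
  {box.urx - margin, box.lly + margin}
end

# Anchor of the intra-partition number: top-left corner, inset by `margin`.
def top_left(box : MediaBox, margin : Float64 = 24.0) : {Float64, Float64}
  {box.llx + margin, box.ury - margin}
end

puts bottom_right(LETTER) # {588.0, 24.0}
puts bottom_right(A4)     # {571.0, 24.0}
puts top_left(LETTER)     # {24.0, 768.0}
puts top_left(A4)         # {24.0, 818.0}
----

In this sketch, a top anchor hardcoded for Letter (y = 768) would sit 50 pt, about 1.8 cm, too low on an A4 page; computing it from the box removes the need for a per-format flag.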
== Features (v0.1)
- Global page numbering in the bottom-right corner (format customisable, default `"N/T"`).
- Intra-partition numbering in the top-left corner (format customisable, default `"n/t"`), shown only when the partition has more than one page (configurable).
- A4-aware, Letter-aware, A3-aware: coordinates computed from each page's `MediaBox`, no per-format flag needed.
- Skip pages (e.g. covers): `--skip 1,2`.
- Pure Crystal, single binary, no `pdftk` / `qpdf` / `pdftotext` on `$PATH`.
== Roadmap
=== Shipped
- v1.0.31.1: ✅ ISO-compatible module + `PDF` instance API (`<<`, `>>`, `insert`, `remove`, `pages`, `page_count`, `new_page`, `title=`, `author=`, `number_pages`, `save`, `to_pdf`).
- v1.0.31.2: ✅ Declarative `.crystal-combine-pdf.yml` mode (`--init`, `--refresh`, `--recursive`, default-build). Numbering rewrite: 15 cardinal positions + 4 duplex-aware (`+outer-*+` / `+inner-*+`), 5 styles (`plain` / `badge` / `circle` / `square` / `oval`), cover-aware numbering, watermarks via `watermark`.
- v1.0.31.3: ✅ Extended PDF compatibility (xref streams 1.5+, object streams, CCITT scans, JPEGs, LilyPond, …); every PDF in the wild now readable without qpdf preprocessing.
- v1.0.31.4: ✅ Title + clickable TOC page inserted at the front of the booklet. Each TOC line is a `/Subtype /Link` annotation that jumps to the target partition.
=== Upcoming
- v1.0.31.5: PDF bookmarks (`/Outlines`) + per-file title header.
- v1.0.31.6: encryption (40/128/256 bits), `secured?`.
- v1.0.31.7: PAdES-style signatures.
- (Ongoing): resource deduplication when merging.
=== Improvement ideas
- Web app: an online interface to assemble booklets without installing Crystal (drag-and-drop PDFs, reorder by mouse, preview, one "Build" button that returns the finished PDF). Likely Kemal or Lucky on the server side with a simple HTML+HTMX front. ALOLI hosting.
- Standalone desktop app: a cross-platform GUI for the same workflow as the CLI, no terminal needed. To investigate: a native webview (`crystal-webview` lib or GTK WebKit), or an ncurses/termbox TUI to keep the single-binary spirit. Starting point worth studying: https://github.com/serge-hulne/Crystal-App-template-for-Windows[serge-hulne/Crystal-App-template-for-Windows], a Crystal application template targeting Windows, useful for packaging the binary into a distributable desktop app.
== Installation
Add to your `shard.yml`:

[source,yaml]
----
dependencies:
  crystal-combine-pdf:
    github: aloli-crystal/crystal-combine-pdf
    version: "~> 1.0.31"
----

Then run `shards install`.
For the CLI:
[source,shell]
----
git clone https://github.com/aloli-crystal/crystal-combine-pdf
cd crystal-combine-pdf
shards build --release
cp bin/crystal-combine-pdf ~/bin/   # or use bin-installer
----
== Mode déclaratif
The recommended way to assemble a booklet of partitions, scores or any list of PDFs.
=== Commands
[source,shell]
----
crystal-combine-pdf init [-r] [--profile NAME] [--user-config PATH]
    Initialise .crystal-combine-pdf.yml in the current folder.
    -r             scans subfolders too (parents first, then alpha).
    --profile      booklet (default) | book | report | slides | minimal
    --user-config  custom path (defaults to ~/.crystal-combine-pdf.yml)

crystal-combine-pdf refresh [-r]
    Refresh the files: list of an existing YAML: add new PDFs at the end,
    comment out entries whose file has disappeared.
    Preserves comments, current order and inline titles.

crystal-combine-pdf build
crystal-combine-pdf
    No subcommand: reads .crystal-combine-pdf.yml from the current folder,
    assembles the booklet, applies numbering and watermark, writes the output.

crystal-combine-pdf compress FILE.pdf [-o OUTPUT.pdf | -i] [--deep]
    Reduce the size of a single PDF (Flate recompression + GC).
    --deep     delegates to ghostscript for image downsampling.
    -i         in-place; --backup keeps the original as .bak.

crystal-combine-pdf -d, --dir DIR
    Common option: target a different folder than the current one.
----
=== .crystal-combine-pdf.yml example
[source,yaml]
----
output: mes-partitions.pdf
title: "Mes Partitions"
author: "Philippe Nénert"

duplex: true                  # double-sided (recto-verso)

cover:
  mode: recto-verso           # none | recto | recto-verso
  include_in_numbering: false

numbering:
  enabled: true
  global:
    enabled: true
    format: "%page%/%total%"
    style: badge              # plain | badge | circle | square | oval
    position: outer-bottom
    font_size: 10
    color: "#333333"
    margin: 24
  partition:
    enabled: true
    format: "%page%/%total%"
    style: plain
    position: outer-top
    font_size: 9
    color: "#666666"
    hide_when_single: true
  skip_pages: []

# watermark:
#   text: "Confidentiel"
#   style: diagonal           # diagonal | tiled | header | footer | center
#   font_size: 48
#   color: "#cccccc"
#   opacity: 0.15
#   rotation: 45

# Encrypt the OUTPUT booklet (password-protected).
# encrypt:
#   enabled: true
#   level: aes_256            # rc4_128 | aes_128 | aes_256
#   user_password: ""         # blank = no password to open
#   owner_password: ""        # blank or nil = identical to user
#   permissions: [print, copy, modify, annotate]

# Password tried on every encrypted source PDF (global default).
# Override per-entry via the inline mapping syntax in files: below.
# input_password: ""

files:
  - couverture.pdf
  - partition1.pdf: "Allegro — C. Debussy"
  - partition2.pdf: "Adagio — F. Chopin"
  - brouillon-2025.pdf
  # Encrypted source PDFs with their own password (inline mapping):
  - {path: confidentiel.pdf, password: secret}
  - {name: backup.pdf, pass: other-pwd, title: "Secured annex"}
----
[NOTE]
====
Security: leave passwords empty in the YAML when it's versioned in Git. Pass them on the CLI:

[source,shell]
----
crystal-combine-pdf -u 'output-pwd' -w 'owner-pwd'   # output encryption
crystal-combine-pdf -I 'global-source-pwd'           # source decryption
----

CLI flags always override the YAML. For booklets where each source has its own password, use the inline mapping syntax above.
====
=== Position vocabulary
[cols="1,4"]
|===
| Position type | Values

| Static (fixed regardless of page parity)
| `top-left`, `top-center`, `top-right`, `bottom-left`, `bottom-center`, `bottom-right`

| Duplex-aware (alternates per page parity)
| `outer-top`, `inner-top`, `outer-bottom`, `inner-bottom`
|===

When `duplex: true`, `+outer-*+` resolves to the page side opposite the spine (right for odd/recto pages, left for even/verso pages). `+inner-*+` is the spine side. When `duplex: false`, `+outer-*+` ≡ right, `+inner-*+` ≡ left.
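
The parity rule above can be sketched as follows. This is an illustration of the rule as described here, not the shard's code: the `Side` enum and the `resolve_side` helper are hypothetical names.

[source,crystal]
----
# Hypothetical helper illustrating the duplex-aware position rule.
enum Side
  Left
  Right
end

# `page_number` is 1-based; odd pages are recto (spine on the left edge).
def resolve_side(position : String, page_number : Int32, duplex : Bool) : Side
  outer = position.starts_with?("outer-")
  inner = position.starts_with?("inner-")
  raise ArgumentError.new("not a duplex-aware position: #{position}") unless outer || inner

  # Without duplex, outer is always the right side and inner the left side.
  return outer ? Side::Right : Side::Left unless duplex

  recto = page_number.odd?
  if outer
    recto ? Side::Right : Side::Left # away from the spine
  else
    recto ? Side::Left : Side::Right # spine side
  end
end

puts resolve_side("outer-bottom", 1, true)  # Right
puts resolve_side("outer-bottom", 2, true)  # Left
puts resolve_side("inner-top", 2, true)     # Right
puts resolve_side("outer-bottom", 2, false) # Right
----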
=== Cover semantics
- `cover.mode: none`: no cover, all pages numbered (default).
- `cover.mode: recto`: 1 cover page at the front and 1 at the back.
- `cover.mode: recto-verso`: 2 cover pages at the front and 2 at the back.
- For asymmetric covers, use `cover.front:` and `cover.back:` instead of `cover.mode:`.

When `include_in_numbering: false` (default), neither the front nor the back cover pages display a number, and the booklet's first content page is numbered "1". `%total%` reflects the content total (not including covers).
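
As a rough illustration of that rule, the sketch below computes the label shown on each physical page when covers are excluded from numbering. `page_labels` is a hypothetical helper written for this example, not a function of the shard.

[source,crystal]
----
# Hypothetical helper: one label per physical page, nil for un-numbered covers.
def page_labels(total_pages : Int32, front : Int32, back : Int32,
                format : String = "%page%/%total%") : Array(String?)
  content_total = total_pages - front - back
  raise ArgumentError.new("covers exceed page count") if content_total < 0

  (1..total_pages).map do |physical|
    content_page = physical - front
    if content_page < 1 || content_page > content_total
      nil # front or back cover: no number drawn
    else
      format.gsub("%page%", content_page.to_s).gsub("%total%", content_total.to_s)
    end
  end
end

# cover.mode: recto-verso on an 8-page booklet: 2 covers in front, 2 at the back.
p page_labels(total_pages: 8, front: 2, back: 2)
# [nil, nil, "1/4", "2/4", "3/4", "4/4", nil, nil]
----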
=== Encryption
The `aloli-crystal/pdf` shard (≥ 0.5.2) implements the PDF spec's "Standard Security Handler" (ISO 32000-1 § 7.6.3 and ISO 32000-2 § 7.6.4) for both reading and writing, with no external dependency. This shard exposes those capabilities at three levels: standalone sub-commands, a YAML section, and CLI flags.
==== Standalone sub-commands
[source,shell]
----
crystal-combine-pdf encrypt rapport.pdf -u "secret" -l aes_256
crystal-combine-pdf decrypt protege.pdf -u "secret" -i --backup
----

- `encrypt`: encrypt an arbitrary PDF (RC4-128, AES-128 or AES-256).
- `decrypt`: symmetric, produces a copy without `/Encrypt`.

Common options (inherited from `compress` / `gs`):

- `-o FILE` or `-i` (in-place) or auto-named output (`+*-encrypted.pdf+`, `+*-decrypted.pdf+`) if neither is set
- `--backup` (with `-i`): keeps the original as `.bak`
- `-l rc4_128 | aes_128 | aes_256` (encrypt only, default `aes_256`)
- `-u USER_PWD` user password, `-w OWNER_PWD` owner password
- `-p LIST` permissions (csv: `print,copy,modify,annotate`; shortcuts `none` / `all`. Default: `all`)
==== Encrypting the OUTPUT of a build
The `encrypt:` section in `.crystal-combine-pdf.yml`:

[source,yaml]
----
encrypt:
  enabled: true                 # false disables the section
  level: aes_256                # rc4_128 | aes_128 | aes_256
  user_password: "open-me"      # empty = no password to open
  owner_password: "permissions" # empty or nil = identical to user
  permissions: [print]          # absent = everything granted
  encrypt_metadata: true
----
Available levels:
[cols="1,1,3"]
|===
| Level | V/R | Compatibility

| `rc4_128` | 2/3 | Acrobat ≥ 5 (2001), legacy only (RC4 is broken)
| `aes_128` | 4/4 | Acrobat ≥ 7 (2005), CryptFilter AESV2
| `aes_256` | 5/6 | Acrobat ≥ X / PDF 2.0 (2012), default
|===
CLI overrides (take precedence over YAML):
[source,shell]
----
crystal-combine-pdf -u "secret"       # output user_password
crystal-combine-pdf -w "owner-pwd"    # owner_password
crystal-combine-pdf -l aes_128        # level
crystal-combine-pdf --encrypt         # turn on without YAML section
crystal-combine-pdf --no-encrypt      # disable even if YAML enables it
----
==== Reading encrypted source PDFs
Global case (one password tried on every encrypted source):
[source,yaml]
----
input_password: "source-secret"
files:
  - confidential-1.pdf
  - confidential-2.pdf
----
CLI equivalent (avoid storing the password in versioned YAML):
[source,shell]
----
crystal-combine-pdf -I "source-secret"
# or: --input-password=source-secret
----
Per-file case (each source with its own password) — inline mapping:
[source,yaml]
----
files:
  - public.pdf                                        # simple entry
  - report.pdf: "Annual report"                       # with title
  - {path: confidential.pdf, password: "secret-A"}    # with password
  - {name: backup.pdf, pass: "secret-B", title: "Annex"}
----

Aliases accepted:

- `path` ≡ `name` ≡ `file`
- `title` ≡ `label`
- `password` ≡ `pass` ≡ `pwd`

Precedence: `entry.password` (per-file YAML) > `input_password` (global YAML) > `--input-password` CLI > empty.
A source PDF's owner password works just as well as the user password (the pdf shard tries both automatically via Algorithm 7 for V<5 and Algorithm 2.A for V=5).
[NOTE]
Security recommendation: if the YAML is versioned (Git), leave the `user_password`, `owner_password`, `input_password` fields and per-file `password:` entries empty in the file, and pass the secrets via CLI only. For CI/CD workflows, use environment variables and a shell wrapper.
==== Preventing modification (without preventing reading)
That's exactly what permissions do on the encryption side. The PDF is readable by everyone (empty user password), but Acrobat / Foxit / Preview / pdftk refuse any modification until the user enters the owner password.
Recipe:
[source,yaml]
----
encrypt:
  enabled: true
  level: aes_256
  user_password: ""         # ← empty: free read access
  owner_password: "secret"  # ← required to modify
  permissions: [print]      # ← only printing allowed
----
Or via CLI:
[source,shell]
crystal-combine-pdf encrypt rapport.pdf -w "secret" -p print
Available permissions:
[cols="1,3"]
|===
| Permission | Effect when listed

| `print` | Print the document (low + high res)
| `copy` | Select / copy text or images
| `modify` | Edit page content
| `annotate` | Add or modify annotations (comments, highlights)
|===

CLI shortcuts: `none` (no permissions), `all` (every permission, equivalent to omitting the flag).
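
For readers curious how such a list ends up in the file, here is a hedged sketch of the `/P` integer defined by the PDF specification's "User access permissions" table. The bit positions come from the specification; which bits each of the four names above sets inside this shard is an assumption made for the example, and `permission_flags` is a hypothetical helper.

[source,crystal]
----
# Bit positions are 1-based, as in the PDF spec's "User access permissions" table.
# ASSUMPTION: the mapping from the four names to bits is illustrative only.
PERMISSION_BITS = {
  "print"    => [3, 12], # low-resolution + high-quality printing
  "modify"   => [4],
  "copy"     => [5],
  "annotate" => [6],
}

# Bits 7-8 and 13-32 are reserved and must be 1; bits 1-2 must be 0.
RESERVED = 0xFFFFF0C0_u32

def permission_flags(names : Array(String)) : Int32
  flags = RESERVED
  names.each do |name|
    bits = PERMISSION_BITS[name]? || raise ArgumentError.new("unknown permission: #{name}")
    bits.each { |bit| flags |= 1_u32 << (bit - 1) }
  end
  flags.to_i32! # /P is stored as a signed 32-bit integer
end

puts permission_flags(["print"])    # -1852
puts permission_flags([] of String) # -3904
----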
[IMPORTANT]
====
Trust model. PDF permissions are honor-system: the spec asks viewers and editors to respect them, and Acrobat/Foxit/Preview/pdftk do. But a determined attacker can strip `/Encrypt` from a PDF; permissions are not cryptographically enforced.

For cryptographic integrity protection, you need a digital signature (see below).
====
==== Digital signatures (PAdES) — roadmap
Signatures detect any modification of the PDF using asymmetric cryptography:
. The signer computes a SHA-256 / SHA-384 hash over the PDF bytes (excluding the signature itself).
. They encrypt that hash with their private key (PKCS#7 / CMS / CAdES) and embed it in a `/Sig` dictionary inside the PDF.
. Any verifier can recompute the hash and compare it with the hash decrypted using the certificate's public key.
. Any single-byte change invalidates the signature; Acrobat shows it in red.
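
The first and last steps can be illustrated with the standard library alone. The sketch below hashes every byte except a reserved signature placeholder, which is the role the `/ByteRange` entry plays in a signed PDF. It does not sign anything (no private key, no PKCS#7); `signed_digest`, `gap_start` and `gap_size` are hypothetical names, and the byte string is a toy stand-in for a real file.

[source,crystal]
----
require "digest/sha256"

# Hash the document while skipping the signature placeholder, which starts
# at `gap_start` and spans `gap_size` bytes (hypothetical parameters).
def signed_digest(pdf : Bytes, gap_start : Int32, gap_size : Int32) : String
  tail_start = gap_start + gap_size
  digest = Digest::SHA256.new
  digest.update(pdf[0, gap_start])                       # bytes before the signature
  digest.update(pdf[tail_start, pdf.size - tail_start])  # bytes after the signature
  digest.hexfinal
end

empty    = "HEADER<0000>TRAILER".to_slice # placeholder still blank
filled   = "HEADER<ABCD>TRAILER".to_slice # signature written into the gap
tampered = "HEADEr<ABCD>TRAILER".to_slice # one byte changed outside the gap

reference = signed_digest(empty, 6, 6)
puts signed_digest(filled, 6, 6) == reference   # true: the gap is not hashed
puts signed_digest(tampered, 6, 6) == reference # false: any other change is detected
----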
Difference with encryption:
[cols="1,1,1"]
|===
| | Encryption (`/Encrypt`) | Signature (`/Sig` PAdES)

| Prevents reading | ✅ | ❌
| Detects modification | ⚠️ via permissions | ✅ (cryptographic)
| Proves authenticity | ❌ | ✅
| Long-term archival | ❌ | ✅ (PAdES B-LTA)
|===
PAdES levels (ETSI EN 319 142):
- B-B — basic signature (detached PKCS#7)
- B-T — B-B + TSA timestamp
- B-LT — B-T + long-term validation material (CRL/OCSP)
- B-LTA — B-LT + archive timestamp (indefinitely renewable)
Current state of `aloli-crystal/pdf`: reading detects `/Sig` but doesn't validate the signature; writing doesn't produce one. Support is on the roadmap (see `prod-crystal/crystal-combine-pdf/ROADMAP.adoc` § "PAdES signatures"). Estimated effort: M for B-B (PKCS#7 via OpenSSL), L for B-T (TSA HTTP), XL for B-LT/LTA (CRL/OCSP collection, archival).

In the meantime: for a PDF that needs to prove its integrity in a legal context, sign it after the build with a third-party tool (`gpg --detach-sign`, Adobe Acrobat, eSign Suite, etc.).
== Legacy sub-commands

The original `number`, `merge`, `assemble` sub-commands remain available for scripted workflows.
=== assemble — the booklet workflow (most users want this)
[source,shell]
----
# Take three partition PDFs, merge them, number the result with
# auto-detected partition sizes (so each partition gets its
# intra-partition "1/4", "2/4" marks where relevant).
crystal-combine-pdf assemble \
    partition1.pdf partition2.pdf partition3.pdf \
    -o livret.pdf
----
=== merge — pure concatenation, no numbering
[source,shell]
crystal-combine-pdf merge p1.pdf p2.pdf p3.pdf -o output.pdf
=== number — number an already-assembled PDF
[source,shell]
----
# Default: format "N/T" in the bottom-right corner.
crystal-combine-pdf number booklet.pdf

# Same, with intra-partition marks. Partition sizes are given in
# 1-based page order; the sum must equal the total page count.
# Here: pages 1-4 = partition 1 (so each gets "1/4" through "4/4"),
# pages 5-6 = partition 2, page 7 = partition 3 (single-page, so no
# intra-partition mark by default).
crystal-combine-pdf number booklet.pdf --partitions 4,2,1

# Skip the cover (page 1) so it stays untouched.
crystal-combine-pdf number booklet.pdf --skip 1

# Custom format and red colour.
crystal-combine-pdf number booklet.pdf \
    --global-format "Page %page% of %total%" \
    --partition-format "(%page% / %total%)" \
    --color "0.7,0.0,0.0" \
    --font-size 12
----
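
To see what `--partitions 4,2,1` produces page by page, here is a small sketch of the two label layers. It mirrors the behaviour described in the comments above under simplifying assumptions (one shared format string, no skipped pages); `fill` and `number_labels` are hypothetical helpers, not the shard's API.

[source,crystal]
----
# Substitute the documented %page% / %total% placeholders.
def fill(format : String, page : Int32, total : Int32) : String
  format.gsub("%page%", page.to_s).gsub("%total%", total.to_s)
end

# Returns {global label, intra-partition label or nil} for every page.
# The total page count is taken as the sum of the partition sizes.
def number_labels(partitions : Array(Int32), show_single : Bool = false,
                  format : String = "%page%/%total%") : Array({String, String?})
  total = partitions.sum
  labels = [] of {String, String?}
  page = 0
  partitions.each do |size|
    (1..size).each do |n|
      page += 1
      # Single-page partitions get no intra-partition mark unless asked for.
      intra = (size > 1 || show_single) ? fill(format, n, size) : nil
      labels << {fill(format, page, total), intra}
    end
  end
  labels
end

number_labels([4, 2, 1]).each do |global, intra|
  puts "#{global}  #{intra || "-"}"
end
----

With sizes `4,2,1` the last line printed is `7/7  -`: the single-page third partition carries only the global number, which is what `--show-single-partitions` would change.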
=== Options
[cols="1,3"]
|===
| Option | Description

| `-o FILE`, `--output FILE`
| Output path. Defaults to `<input>-numbered.pdf` next to the input.

| `--partitions N,N,…`
| Comma-separated list of partition sizes, in 1-based page order. Sum must equal the PDF page count.

| `--skip PAGES`
| Comma-separated list of 1-based page indices to leave un-numbered (typical: `--skip 1` to spare the cover).

| `--font-size SIZE`
| Point size of the rendered numbers. Default `10`.

| `--margin PT`
| Inset from the page edge, in points. Default `24`.

| `--color R,G,B`
| RGB triplet, components 0.0–1.0. Default `0.2,0.2,0.2`.

| `--global-format FMT`
| Format string for the global page number. Placeholders: `%page%`, `%total%`. Default `"%page%/%total%"`.

| `--partition-format FMT`
| Format string for the intra-partition number. Same placeholders. Default `"%page%/%total%"`.

| `--show-single-partitions`
| Render the intra-partition number even when the partition has a single page. Off by default.

| `-h`, `--help`, `-v`, `--version`
| Standard.
|===
== API
=== ISO-compatible API (mirrors the Ruby gem)
[source,crystal]
----
require "crystal-combine-pdf"

# Module entry points
CombinePDF.new                        # => CombinePDF::PDF (empty)
CombinePDF.load(path : String)        # => CombinePDF::PDF (from file)
CombinePDF.parse(data : Bytes)        # => CombinePDF::PDF (from bytes)

# Instance API
pdf << other                          # append (path or PDF), chainable
pdf >> other                          # prepend (path or PDF), chainable
pdf.insert(location, other)           # location: Int32 (-1 = append, 0 = prepend)
pdf.remove(page_index)                # negative indices supported
pdf.pages                             # => Array(::PDF::Objects::Reference)
pdf.page_count                        # => Int32
pdf.new_page(mediabox = [0, 0, 612, 792], location = -1)
pdf.title = "…"
pdf.author = "…"
pdf.number_pages(partitions, options) # in-place
pdf.save("out.pdf")
pdf.to_pdf                            # => Bytes
----
=== Crystal-specific bonus helpers
[source,crystal]
----
# One-shot multi-PDF concatenation
CombinePDF.merge(["a.pdf", "b.pdf"], "out.pdf")

# Number an already-assembled PDF on disk
CombinePDF.number(
  input: "booklet.pdf",
  output: "booklet-numbered.pdf",
  partitions: [4, 2, 1],
  options: CombinePDF::Options.new(
    font_size: 11.0,
    color: {0.2, 0.2, 0.2},
    margin: 24.0,
    skip_pages: [1],
  ),
)

# End-to-end booklet assembly (merge + number with auto-detected
# partition sizes: each input file = one partition)
CombinePDF.assemble(
  inputs: ["partition1.pdf", "partition2.pdf", "partition3.pdf"],
  output: "livret.pdf",
)
----
== Limitations (v1.0.31.1)
- No deduplication of common resources when merging — a font or image embedded in N source PDFs is copied N times to the output. Result is correct but slightly larger than an optimising merger would produce. Targeted for v0.3.
- Numbering is rendered with the standard Type1 `Helvetica` (no embedded font, no Unicode beyond ASCII digits + `/`).
- The output of `number` uses an incremental update (PDF spec § 7.5.6); `merge` and `assemble` write a fresh PDF from scratch. Both are handled correctly by `pdf::Reader` v0.3.4+ and every real-world PDF reader.
== Development
[source,shell]
----
shards install
crystal spec
bin/ameba
crystal tool format src/ spec/
----
== License
MIT — see link:LICENSE[LICENSE].