trash-panda-db v0.4.0

Pure Crystal embedded SQL database with Raft replication and crystal-db compatibility. No C bindings.

TrashPandaDB

A pure Crystal embedded SQL database with Raft replication and crystal-db compatibility.

  • No C bindings
  • No system library dependencies (beyond libpcre2 for Crystal itself)
  • Embedded use via crystal-db driver, or standalone replicated cluster via TCP

Table of Contents


Embedded Use

Add to shard.yml:

dependencies:
  trash-panda-db:
    github: your-org/trash-panda-db
    version: "~> 0.4"
require "trash_panda_db"

DB.open("trashpanda:/path/to/data.tpdb") do |db|
  db.exec "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)"
  db.exec "INSERT INTO kv VALUES (?, ?)", "hello", "world"
  db.query_one "SELECT v FROM kv WHERE k = ?", "hello", as: String  # => "world"
end

# In-memory (no persistence)
DB.open("trashpanda::memory:") { |db| ... }

All crystal-db patterns work: connection pools, transactions, prepared statements, query_one, query_all, etc.


SQL Support

Feature Supported
CREATE TABLE, DROP TABLE Yes
INSERT, UPDATE, DELETE Yes
SELECT with WHERE, ORDER BY, LIMIT, OFFSET Yes
JOIN (INNER, LEFT, CROSS) Yes
GROUP BY, HAVING Yes
Aggregate functions (COUNT, SUM, AVG, MIN, MAX) Yes
Subqueries Yes
BEGIN / COMMIT / ROLLBACK Yes
SAVEPOINT / RELEASE / ROLLBACK TO SAVEPOINT Yes
PRIMARY KEY, NOT NULL, UNIQUE, DEFAULT Yes
CAST, LIKE, IN, IS NULL, BETWEEN Yes
REGEXP Yes
Parameter binding (?) Yes

Value types: NULL, INTEGER (Int64), REAL (Float64), TEXT, BLOB.


Storage

  • Page-based: 4 KB pages. Database header occupies page 0.
  • WAL: Writes go to a -wal file first. Checkpointed to the main file when ≥ 64 pages accumulate.
  • JSON serialization: The entire database is serialized as a single JSON blob spread across pages. Simple; not space-optimised.
  • Crash recovery: WAL is replayed on open. A clean checkpoint removes the WAL file.

Record size

There is no hard per-record size limit — the JSON blob spans as many pages as needed. However, there are real practical constraints:

  • The whole database lives in memory. Every flush serializes the entire dataset; every open deserializes it. Large records mean large allocations on every write.
  • BLOBs are JSON-encoded as integer arrays, so a 1 MB blob becomes ~4–5 MB of JSON text.
  • In replicated mode, each write's SQL is stored verbatim in the Raft log. A large INSERT with a big value produces a proportionally large log entry.

Keep individual values well under 1 MB. TrashPandaDB is designed for structured relational data — not a blob store. For large binary objects, store a path or reference in TrashPandaDB and keep the data on disk or object storage.


Raft Replication

TrashPandaDB ships a standalone server binary that runs a Raft node, exposes a JSON-over-TCP client API, and transparently forwards writes from followers to the leader.

Standalone Server

Build the binary on the host (requires Crystal ≥ 1.20):

crystal build src/trashpandadb.cr -o bin/trashpandadb

Build the container image (binary must exist first):

podman build -t trash-panda-raft -f Containerfile .
# or: docker build -t trash-panda-raft -f Containerfile .

Peer Discovery

Explicit peers — specify each peer's Raft and client address manually:

podman run trash-panda-raft \
  --node-id n1 \
  --raft   0.0.0.0:9001 \
  --client 0.0.0.0:9002 \
  --peer n2=db2.internal:9001  --peer n3=db3.internal:9001 \
  --client-peer n2=db2.internal:9002  --client-peer n3=db3.internal:9002

DNS Peer Discovery

Point --dns-peers at a DNS name whose A record lists every node's IP. Each container resolves the record at startup, excludes its own IP, and wires up the remaining IPs as peers. The node ID is set automatically to the container's own IP.

podman run trash-panda-raft \
  --raft   0.0.0.0:9001 \
  --client 0.0.0.0:9002 \
  --dns-peers db-cluster.example.com

Cluster-size guard: by default the server refuses to start unless the A record resolves to at least 3 IPs. More nodes are fine. Change the minimum with --dns-minimum-cluster-size N:

ERROR: --dns-minimum-cluster-size is 3 but 'db-cluster.example.com' resolved to only 2 addresses (10.0.0.1, 10.0.0.2).
Update the DNS record or lower --dns-minimum-cluster-size.

This means scaling the cluster up is as simple as adding IPs to the DNS record and restarting nodes with an updated --dns-minimum-cluster-size.

DNS options:

Flag Default Description
--dns-peers HOSTNAME DNS A-record hostname for peer discovery
--dns-raft-port PORT 9001 Raft RPC port for discovered peers
--dns-client-port PORT 9002 Client API port for discovered peers
--dns-minimum-cluster-size N 3 Minimum node count required; startup fails if the A record resolves to fewer IPs

Example with Podman (--add-host simulates a multi-A DNS record):

podman network create --subnet 10.91.0.0/24 raft-demo

HOSTS="--add-host raft-cluster:10.91.0.11 --add-host raft-cluster:10.91.0.12 --add-host raft-cluster:10.91.0.13"

for i in 1 2 3; do
  ip="10.91.0.1${i}"
  podman run -d --name raft-n${i} --network raft-demo --ip $ip \
    $HOSTS -p 1900${i}:9002 trash-panda-raft \
    --raft 0.0.0.0:9001 --client 0.0.0.0:9002 \
    --dns-peers raft-cluster
done

Expanding the Cluster

TrashPandaDB supports transparent single-server membership changes — add one node at a time, safely, without downtime.

To add a fourth node to a running 3-node cluster:

trashpandadb \
  --node-id n4 \
  --raft   0.0.0.0:9001 \
  --client 0.0.0.0:9002 \
  --join   n1.internal:9002 \
  --data-dir /var/lib/trashpandadb

--join points at the client port of any existing cluster node. The new node:

  1. Starts with elections suppressed (it has no peers yet)
  2. Sends a join request — forwarded to the current leader automatically
  3. The leader commits an add log entry (replicated to a quorum of existing members)
  4. Once committed, the new node enables elections and begins participating normally

Going from 3 → 5 nodes: start n4 with --join, wait for it to join, then start n5 with --join. Each change is committed one at a time. Only one membership change may be in flight at once; a second concurrent --join retries automatically until the first commits.

Safety: quorum overlap is guaranteed because only one node is added at a time, so the old and new majorities always share at least one member. A 3→5 expansion goes 3→4→5 internally, never risking a split-brain.

Client API

Each node listens on a TCP client port (default 9002). Send one JSON line per connection; the response is one JSON line.

status

{"action":"status"}
//  {"ok":true,"role":"Leader","node_id":"n1","leader_id":"n1","term":2,
//    "members":{"n1":{"raft":"10.0.0.1:9001","client":"10.0.0.1:9002"},...}}

join — add a new node to the cluster (forwarded to the leader automatically)

{"action":"join","node_id":"n4","raft_addr":"10.0.0.4:9001","client_addr":"10.0.0.4:9002"}
//  {"ok":true}

propose — write (any node; followers forward to leader transparently)

{"action":"propose","sql":"INSERT INTO kv VALUES ('k','v')"}
//  {"ok":true,"rows_affected":1,"last_id":1}

query — linearisable read (leader only)

{"action":"query","sql":"SELECT v FROM kv WHERE k = 'k'"}
//  {"ok":true,"cols":["v"],"rows":[["v"]]}

local_query — read from local state machine (any node; may be slightly behind)

{"action":"local_query","sql":"SELECT * FROM kv"}
//  {"ok":true,"cols":["k","v"],"rows":[["k","v"]]}

Quick test with netcat:

echo '{"action":"propose","sql":"CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)"}' | nc -q1 127.0.0.1 19001
echo '{"action":"propose","sql":"INSERT INTO t VALUES (1, '\''hello'\'')"}' | nc -q1 127.0.0.1 19002
echo '{"action":"local_query","sql":"SELECT * FROM t"}' | nc -q1 127.0.0.1 19003

Installing

Standalone binary (any Linux, x86_64 or aarch64)

Download the static binary for your architecture from the latest release. It has no runtime dependencies — copy it anywhere and run it:

# x86_64
curl -Lo trashpandadb https://github.com/dirless/trash-panda-db/releases/latest/download/trashpandadb-x86_64
chmod +x trashpandadb
./trashpandadb --raft 0.0.0.0:9001 --client 0.0.0.0:9002 --data-dir ./data

RPM (Fedora, RHEL, AlmaLinux, Rocky)

Download the RPM for your architecture from the latest release and install it:

# x86_64
sudo rpm -i trash-panda-db-0.4.0-1.x86_64.rpm

# aarch64
sudo rpm -i trash-panda-db-0.4.0-1.aarch64.rpm

This creates a trashpandadb system user, installs the binary to /usr/bin/trashpandadb, and drops a systemd unit and config file:

# Optional: edit ports or set DNS peers
sudo vi /etc/trashpandadb/env

# Start and enable on boot
sudo systemctl enable --now trashpandadb

# Check logs
journalctl -u trashpandadb -f

The default config listens on 0.0.0.0:9001 (Raft) and 0.0.0.0:9002 (client). For a replicated cluster, uncomment DNS_PEERS in /etc/trashpandadb/env.


Building

Requires Crystal ≥ 1.20. With just installed:

just build-dev    # fast debug build
just build        # optimised release build
just install      # release build + install to /usr/local/bin (requires sudo)
just build-image  # build the container image (no host Crystal required)

Or directly:

shards install
crystal build src/trashpandadb.cr -o bin/trashpandadb --release

Testing

crystal spec --no-color          # full suite (418 examples, ~37s)
crystal spec spec/sql_spec.cr    # SQL engine only
crystal spec spec/persistence_spec.cr
crystal spec spec/replication/raft_node_spec.cr

The Podman integration test (spec/replication/podman_spec.cr) is skipped automatically when podman is not in PATH.

Repository

trash-panda-db

Owner
Statistic
  • 0
  • 0
  • 0
  • 0
  • 1
  • about 3 hours ago
  • May 19, 2026
License

Links
Synced at

Tue, 19 May 2026 02:40:58 GMT

Languages