unclone

An unofficial PyClone and PhyClone Clone for Clonal Analysis

unclone is an unofficial reimplementation of PyClone-VI and PhyClone. It consists of a CLI written in Crystal and a compute kernel written in Rust.

Scope

  • vi: PyClone-VI style variational inference
  • phy run: PhyClone-style tree trace generation (JSONL)
  • phy map: MAP-like summary from phy trace
  • phy consensus: topology + clade consensus summary
  • phy topology-report: topology support summary from phy trace
  • TSV input with PyClone-VI-compatible core fields
  • deterministic runs with fixed seeds
  • Rust-side parallelism with --kernel-threads

Mode maturity

  • PyClone-VI mode is near-parity with the original PyClone-VI implementation.
  • phy commands are experimental.
  • phy commands are not intended to reproduce upstream PhyClone results exactly.

Release Binaries

GitHub release builds may include separate x86_64 baseline and v3 archives.

  • baseline: built with x86-64-v2; use this for wider compatibility on older x86_64 CPUs.
  • v3: built with x86-64-v3; use this on newer x86_64 CPUs with AVX2/FMA support for faster VI hot paths.

If unsure, start with baseline. On Apple Silicon, the aarch64 build already uses the platform baseline SIMD/FMA support and does not need a separate v3 variant.
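
On Linux, the choice between the two archives can be sketched as a check of the CPU flags. The snippet below is a minimal illustration, not part of unclone: the function names are hypothetical, and the flag list is an assumption based on the x86-64-v3 level of the x86-64 psABI as the features are named in /proc/cpuinfo.

```python
# Feature names as they appear on the Linux /proc/cpuinfo "flags" line.
# Assumption: this set mirrors the x86-64-v3 psABI level; it is not taken
# from unclone's own build scripts.
V3_FLAGS = {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}


def suggest_archive(cpu_flags: str) -> str:
    """Return "v3" if every v3 flag is present, otherwise "baseline"."""
    present = set(cpu_flags.split())
    return "v3" if V3_FLAGS <= present else "baseline"


def read_linux_cpu_flags(path: str = "/proc/cpuinfo") -> str:
    """Return the first "flags" entry from /proc/cpuinfo, or "" if unavailable."""
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                if line.startswith("flags"):
                    return line.split(":", 1)[1]
    except OSError:
        pass
    return ""


if __name__ == "__main__":
    # Falls back to "baseline" when the flags cannot be read (e.g. non-Linux).
    print(suggest_archive(read_linux_cpu_flags()))
```

When the flags cannot be read, the sketch falls back to baseline, which matches the advice above.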

Build

Requirements:

  • Crystal
  • Rust / Cargo
  • make

Main workflows are exposed through the Makefile.

make build

For an optimized Crystal CLI build, use:

make build release=1

The resulting binary is bin/unclone.

By default, make build avoids build-machine-specific CPU tuning. This keeps the resulting binary suitable for ordinary local use, CI, and redistribution to similar systems.

For a local machine only, you can ask Rust to tune the kernel for the current CPU:

make build release=1 cpu=native

For distribution builds, prefer an explicit CPU level instead of native:

make build release=1 cpu=x86-64-v2
make build release=1 cpu=x86-64-v3

Additional Rust codegen flags can still be passed through RUSTFLAGS:

RUSTFLAGS="-C opt-level=3" make build release=1 cpu=native

Test

make test

Lint

make lint

Run

Variational inference:

./bin/unclone vi -i ../pyclone-vi/examples/synthetic.tsv -o out.tsv

Deterministic VI run:

./bin/unclone vi -i ../pyclone-vi/examples/synthetic.tsv -o out.tsv -c 4 -d beta-binomial -g 21 -r 2 --max-iters=200 --precision=1000 --seed=7 --kernel-threads=1 --restart-parallelism=1 --print-freq=0
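
One way to check that a fixed-seed run is reproducible is to run the command above twice with different output paths and compare the results. The sketch below assumes that a deterministic run yields byte-identical TSV files, which this README does not state explicitly; the helper names and the out_a.tsv / out_b.tsv paths are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def outputs_match(first: str, second: str) -> bool:
    """True when both files have identical bytes (compared by digest)."""
    return sha256_of(first) == sha256_of(second)
```

For example, after two runs written to out_a.tsv and out_b.tsv, outputs_match("out_a.tsv", "out_b.tsv") would be expected to return True under that assumption.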

Phy workflow:

./bin/unclone phy run -i input.tsv -o trace.jsonl --num-iters=50 --num-chains=2 --num-particles=16 --burnin=1000 --seed=7
./bin/unclone phy map -i trace.jsonl -o map.json
./bin/unclone phy consensus -i trace.jsonl -o consensus.json --consensus-threshold=0.5
./bin/unclone phy topology-report -i trace.jsonl -o topology_report.json
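
The four phy steps share one trace file, so they can be chained from a script. The following is a minimal sketch using only the subcommands and flags shown above; the function names are hypothetical, the output file names are simply the ones from the example, and sampler options such as --num-iters are left at whatever defaults the CLI applies.

```python
import subprocess


def phy_pipeline(binary: str, input_tsv: str, seed: int) -> list[list[str]]:
    """Build argv lists for phy run followed by the three post-processing steps."""
    trace = "trace.jsonl"
    return [
        [binary, "phy", "run", "-i", input_tsv, "-o", trace, f"--seed={seed}"],
        [binary, "phy", "map", "-i", trace, "-o", "map.json"],
        [binary, "phy", "consensus", "-i", trace, "-o", "consensus.json",
         "--consensus-threshold=0.5"],
        [binary, "phy", "topology-report", "-i", trace, "-o", "topology_report.json"],
    ]


def run_pipeline(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first non-zero exit status."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_pipeline(phy_pipeline("./bin/unclone", "input.tsv", 7)) would execute the steps in order; building the argv lists separately makes it easy to print them before running anything.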

Expected input columns are:

  • mutation_id
  • sample_id
  • ref_counts
  • alt_counts
  • major_cn
  • minor_cn
  • normal_cn

Optional columns:

  • tumour_content (default: 1.0)
  • error_rate (default: 0.001)
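
To illustrate the layout, the sketch below writes a small input TSV with the columns listed above. It assumes a tab-delimited file with a header row, as in PyClone-VI inputs; the helper name and the column order are illustrative choices, not requirements confirmed by unclone.

```python
import csv

REQUIRED = [
    "mutation_id", "sample_id", "ref_counts", "alt_counts",
    "major_cn", "minor_cn", "normal_cn",
]
# Defaults documented above for the optional columns.
OPTIONAL_DEFAULTS = {"tumour_content": 1.0, "error_rate": 0.001}


def write_input_tsv(path: str, rows: list[dict]) -> None:
    """Write rows as a headered TSV, filling optional columns with defaults."""
    fieldnames = REQUIRED + list(OPTIONAL_DEFAULTS)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        for row in rows:
            missing = [name for name in REQUIRED if name not in row]
            if missing:
                raise ValueError(f"missing required columns: {missing}")
            # Values given in the row override the defaults.
            writer.writerow({**OPTIONAL_DEFAULTS, **row})
```

Each row describes one mutation in one sample, so a mutation measured in several samples appears once per sample_id.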

Larger VI run:

./bin/unclone vi -i ../pyclone-vi/examples/tracerx.tsv -o out.tsv -c 40 -d beta-binomial -r 2 --precision=200 --seed=7 --print-freq=0

Common Options

  • -i, --in-file: input TSV
  • -o, --out-file: output TSV
  • -c, --num-clusters: cluster cap
  • -d, --density: binomial or beta-binomial
  • --seed: fixed seed for reproducibility
  • --print-freq: progress output frequency

VI-only:

  • -g, --num-grid-points: CCF grid size
  • -r, --num-restarts: number of restarts
  • --max-iters: maximum VI iterations
  • --mix-weight-prior: Dirichlet prior weight
  • --precision: beta-binomial precision
  • --kernel-threads: Rust kernel parallelism
  • --restart-parallelism: outer restart parallelism
  • --debug-init-file: debug-only JSON file with pi, theta, z arrays for same-initial-state validation

Phy run only:

  • -b, --burnin: burn-in iterations
  • --num-iters: main-chain MCMC iterations
  • --num-chains: number of chains
  • --num-particles: particle count
  • --thin: trace thinning interval
  • --resample-threshold: SMC ESS resampling threshold
  • -p, --proposal: bootstrap, fully-adapted, or semi-adapted
  • -s, --subtree-update-prob: subtree PG probability
  • --num-samples-data-point: data-point Gibbs passes per iteration
  • --num-samples-prune-regraph: prune-regraph passes per iteration
  • --concentration-update, --no-concentration-update: concentration update toggle
  • --concentration-value: initial concentration value
  • --grid-size: CCF grid size for the exact outlier model
  • -l, --outlier-prob: fallback outlier prior
  • -t, --max-time: maximum runtime in seconds
  • -c, --cluster-file: optional cluster assignment TSV

Python helper selection:

  • UNCLONE_PYTHON: default Python executable for helper scripts

Notes:

  • vi --python-compatible still expects a Python 3 executable with NumPy support for default_rng

Diagnostics

Restart diagnostics for VI:

PCV_DEBUG_RESTART_METRICS_FILE=restart_metrics.csv \
./bin/unclone vi -i ../pyclone-vi/examples/tracerx.tsv -o out.tsv -c 40 -d beta-binomial -r 2 --precision=200 --seed=7 --print-freq=1

This writes one row per restart with:

  • restart
  • seed
  • final_elbo
  • used_clusters
  • is_best
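
The metrics file can be inspected with a few lines of Python. This sketch assumes the file is comma-delimited with a header row naming the columns above, which the README does not confirm; it picks the restart with the highest final_elbo instead of relying on how is_best is encoded. The function name is hypothetical.

```python
import csv


def best_restart(path: str) -> dict:
    """Return the row with the highest final_elbo from a restart metrics CSV."""
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError("no restart rows found")
    return max(rows, key=lambda row: float(row["final_elbo"]))
```

Comparing the returned row against the one flagged by is_best gives a quick sanity check on restart selection.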

Optional kernel profiling:

PCV_PROFILE=1 \
./bin/unclone vi -i ../pyclone-vi/examples/tracerx.tsv -o out.tsv -c 4 -d beta-binomial -r 1 --precision=200 --seed=7 --print-freq=0

This prints aggregated timings to stderr for:

  • initial ELBO
  • update_z
  • update_pi
  • update_theta
  • iterative ELBO recomputation

Debug-only initial value injection:

./bin/unclone vi -i ../pyclone-vi/examples/synthetic.tsv -o out.tsv -c 4 -g 21 -r 1 --debug-init-file=init.json --print-freq=0

The JSON file must contain flat pi, theta, and z arrays matching:

  • pi: num_clusters
  • theta: num_clusters * num_samples * num_grid_points
  • z: num_mutations * num_clusters

This hook is intended for implementation comparison and fairness checks, not normal runs.
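
A file with the right array lengths can be generated as follows. This is a sketch under stated assumptions: the top-level keys are taken to be pi, theta, and z, and the uniform, normalized values are purely illustrative. The element ordering inside the flat arrays is not documented here, which does not matter for uniform values but would for any other initial state. The function name is hypothetical.

```python
import json


def uniform_init(num_mutations: int, num_clusters: int,
                 num_samples: int, num_grid_points: int) -> dict:
    """Build flat pi, theta, z arrays with the documented lengths."""
    return {
        # One mixing weight per cluster.
        "pi": [1.0 / num_clusters] * num_clusters,
        # One value per (cluster, sample, grid point); uniform over the grid.
        "theta": [1.0 / num_grid_points]
        * (num_clusters * num_samples * num_grid_points),
        # One value per (mutation, cluster); uniform over clusters.
        "z": [1.0 / num_clusters] * (num_mutations * num_clusters),
    }


if __name__ == "__main__":
    # Matches -c 4 -g 21 from the command above; the mutation and sample
    # counts (5 and 2) are placeholders that must match the actual input.
    with open("init.json", "w", encoding="utf-8") as handle:
        json.dump(uniform_init(5, 4, 2, 21), handle)
```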

Current status

  • Crystal CLI and Rust kernel are wired end to end
  • VI entry point is available and near-parity with the original PyClone-VI implementation
  • phy run/map/consensus/topology-report workflow is available with JSONL/JSON outputs
  • VI restart selection uses best ELBO
  • Rust hot paths use Rayon when enabled
  • output rows are inference-derived and cluster IDs are compactly renumbered
  • tests cover Rust units, Crystal specs, and a deterministic golden output check

PhyClone Notes

  • The phy implementation is separate from upstream PhyClone.
  • Exact matching of PhyClone posterior probabilities, sampler behavior, traces, and output files is not planned.
  • In phy run, num_iters (--num-iters) is the total iteration count, and the recorded trace length follows from the post-burn-in iterations (--burnin) and the thinning interval (--thin).
  • consensus output includes clade support and a consensus_tree reconstruction, while representative topology fields remain for compatibility.
  • loss prior supports cellular_prevalence-informed assignment via --cluster-file metadata.
  • trace and post-process outputs use unclone's JSONL/JSON formats.

Attribution And License

unclone is an unofficial reimplementation of the methods below. Cite the original papers, not unclone.

Upstream is GPL v3 or later; unclone is GPL v3 or later as well.
