fastasc v0.1.0
Removes completely or partially invariant sites from a nucleotide aligned FASTA file for Lewis's ascertainment bias correction (+ASC).

Brief description
fastasc is a simple Crystal program, written with Crystal 1.17.1.
It takes an aligned FASTA file as input and removes invariant sites, so that the result can be used for phylogenetic inference with Lewis's ASC method.
IUPAC nucleotide codes in upper or lower case and '?' are accepted.
Gaps ('.' and '-'), 'N', 'n', and '?' are treated as unknown (all 4 nucleotides are equally likely).
- Acceptable characters
AaTtGgCcRrYySsWwKkMmBbDdHhVvNn-.?
- Example input
cat ./test_data/test.fasta | awk 'NR%2==0{print $0}'
##ATGCatgccA
##ATGCatgccA
##ATGCatgcc.
##AAAYwnacy?
##ATR-wygyyT
- Example output
fastasc -f ./test_data/test.fasta | awk 'NR%2==0{print $0}'
##TGgA
##TGgA
##TGg.
##AAa?
##TRgT
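The filtering rule implied by the example above can be sketched in Python. This is an illustration only, not the Crystal implementation: it assumes a site counts as completely or partially invariant when at least one nucleotide is compatible with every sequence at that site, which is the rule that reproduces the example output. The names `IUPAC`, `is_invariant`, and `remove_invariant_sites` are hypothetical.

```python
# Sketch only: not the author's Crystal code.
# Each IUPAC code maps to the nucleotides it may represent.
# Gaps, N, and ? are treated as unknown (any of the 4 nucleotides).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT", "-": "ACGT", ".": "ACGT", "?": "ACGT",
}


def is_invariant(column):
    """True if at least one nucleotide is compatible with every character."""
    shared = set("ACGT")
    for ch in column:
        shared &= set(IUPAC[ch.upper()])
        if not shared:
            return False
    return True


def remove_invariant_sites(seqs):
    """Drop every alignment column for which is_invariant() holds."""
    keep = [i for i, col in enumerate(zip(*seqs)) if not is_invariant(col)]
    return ["".join(s[i] for i in keep) for s in seqs]


# The five sequences from the example input (without the leading "##").
seqs = ["ATGCatgccA", "ATGCatgccA", "ATGCatgcc.", "AAAYwnacy?", "ATR-wygyyT"]
result = remove_invariant_sites(seqs)
print("\n".join(result))
```

With the example input, this sketch keeps the same four columns as the example output (TGgA, TGgA, TGg., AAa?, TRgT). The original case of each character is preserved; only the comparison is case-insensitive.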
Processing whole-genome SNP data can consume a lot of RAM.
On my PC, processing an aligned FASTA file of 28 sequences, each 37 Mb long, took about 5.7 GB of RAM and 50 seconds of wall-clock time.
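As a rough guide, the figures above can be turned into a back-of-the-envelope estimate. The sketch below assumes that peak RAM scales linearly with the number of sequences times the alignment length, which is an extrapolation from a single measurement and not something the program guarantees. The function `estimate_peak_ram_gb` is hypothetical.

```python
# Figures reported above.
n_sequences = 28
alignment_length = 37_000_000   # 37 Mb, one byte per site in FASTA
peak_ram_bytes = 5.7e9          # about 5.7 GB

raw_bytes = n_sequences * alignment_length
ratio = peak_ram_bytes / raw_bytes  # peak RAM per byte of raw sequence


def estimate_peak_ram_gb(n_seqs, length, bytes_per_site=ratio):
    """Linear extrapolation from one data point (an assumption)."""
    return n_seqs * length * bytes_per_site / 1e9


print(f"raw sequence data: {raw_bytes / 1e9:.2f} GB")
print(f"peak RAM per site character: {ratio:.1f} bytes")
print(f"100 sequences x 37 Mb: about {estimate_peak_ram_gb(100, 37_000_000):.0f} GB")
```

In this measurement the peak RAM was roughly 5.5 times the raw sequence size, so it is worth checking available memory before running on larger alignments.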
Installation
Go to the fastasc directory and run:
mkdir -p ./bin && crystal build --release -o ./bin/fastasc ./src/fastasc.cr
This will compile the source code to an executable binary file (./bin/fastasc).
Crystal itself can be installed by following the Crystal Docs.
Usage
Usage: fastasc -f INPUT.fa > OUTPUT.fa
or
cat INPUT.fa | fastasc -f - > OUTPUT.fa
Remove completely/partially invariant sites from nucleotide aligned fasta for +ASC subst model
-v, --version Show version
-h, --help Show this help message
-f fasta, --fasta=fasta Path to input aligned fasta. Set '-' for STDIN.
Input format
fastasc takes an aligned FASTA file as input.
Output from mafft, vcf2phylip.py, and vcf2alnfasta is accepted.
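Before running fastasc on a new file, it can help to confirm that the input really is an alignment. The Python sketch below is a hypothetical pre-check, not part of fastasc: it verifies that all sequences have the same length and contain only the acceptable characters listed earlier. The helpers `read_fasta` and `check_alignment` are illustrative names, and joining multi-line sequences is an assumption of this sketch, not a statement about what fastasc itself supports.

```python
import io

# Acceptable characters, as listed in this README.
ALLOWED = set("AaTtGgCcRrYySsWwKkMmBbDdHhVvNn-.?")


def read_fasta(handle):
    """Return (header, sequence) pairs; multi-line sequences are joined."""
    records, header, chunks = [], None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_alignment(records):
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        problems.append(f"sequences differ in length: {sorted(lengths)}")
    for name, seq in records:
        bad = sorted(set(seq) - ALLOWED)
        if bad:
            problems.append(f"{name}: unexpected characters {bad}")
    return problems


# Small in-memory example; s3 contains a character outside the accepted set.
example = io.StringIO(">s1\nATGC\n>s2\nAT-N\n>s3\nATGX\n")
print(check_alignment(read_fasta(example)))
```

To check a real file, pass an open file handle instead of the `io.StringIO` object, for example `check_alignment(read_fasta(open("INPUT.fa")))`.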
- November 30, 2025
MIT License