crystal-chars-vs-codepoints-bench

Simple benchmark between using `chars` and `codepoints` in Crystal.

chars vs codepoints benchmark

I was curious if (like in some other languages) evaluating numeric representations of strings was faster than their character representations.

The short answer is: in Crystal codepoints is negligibly faster.

The following is example output performed on a ThinkPad E480.

Benchmark #1: ../bin/hamming --chars
  Time (mean ± σ):      3.263 s ±  0.029 s    [User: 2.276 s, System: 0.975 s]
  Range (min … max):    3.247 s …  3.375 s    50 runs
 
Benchmark #2: ../bin/hamming --codepoints
  Time (mean ± σ):      3.251 s ±  0.040 s    [User: 2.265 s, System: 0.974 s]
  Range (min … max):    3.228 s …  3.395 s    50 runs
 
Summary
  '../bin/hamming --codepoints' ran
    1.00 ± 0.02 times faster than '../bin/hamming --chars'

The results are fairly stable, codepoints always beat chars, but the difference is tiny. I wouldn't look here for an optimization, especially if you need to actually do anything with the characters (like display).

What it's doing

The code under benchmark (src/hamming.cr) is calculating the Hamming distance between two files of 100,000,000 characters each. The test (test/bench) is a bash script that generates test data (using src/gen_input.cr) and runs hyperfine for a performance comparision.

The idea for the using Hamming distance, and this benchmark in general, came from this exercism.io exercise where my solution got some reasonable questions from the Crystal mentor reviewing it.

Running the tests

  1. Install hyperfine
  2. Clone this repo
  3. Build the crystal bits with shards build --release
  4. Run test/bench
  5. ...
  6. Profit!
Repository

crystal-chars-vs-codepoints-bench

Owner
Statistic
  • 1
  • 0
  • 0
  • 0
  • 0
  • almost 4 years ago
  • February 18, 2021
License

MIT License

Links
Synced at

Wed, 22 Jan 2025 06:06:36 GMT

Languages