SIMDish: Learning SIMD with Crystal

This project was generated by AI for use as an educational resource for me.

SIMDish is an educational library that demonstrates how to use SIMD (Single Instruction, Multiple Data) instructions from the Crystal language through its C FFI (Foreign Function Interface). It serves as a learning tool for understanding how SIMD acceleration works and how to integrate C code with Crystal.

What is SIMD?

SIMD (Single Instruction, Multiple Data) is a computing technique that allows a single instruction to process multiple data elements in parallel. Modern x86 CPUs include SIMD instruction sets such as:

SSE (Streaming SIMD Extensions)
AVX (Advanced Vector Extensions)
AVX2
AVX-512

This library uses 256-bit vector operations to process 8 single-precision floating-point numbers (Float32) at a time.
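The figure of 8 follows from register width divided by element width. As a small worked example (the helper name `lanes_per_register` is made up for illustration and is not part of SIMDish), assuming a 32-bit float:

```c
#include <stddef.h>

/* Number of elements of a given size that fit in one SIMD register.
   Hypothetical helper for illustration only. */
size_t lanes_per_register(size_t register_bits, size_t element_bytes) {
    return register_bits / (element_bytes * 8);
}
```

With 4-byte floats this gives 4 lanes for a 128-bit SSE register, 8 lanes for a 256-bit AVX/AVX2 register, and 16 lanes for a 512-bit AVX-512 register.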

Cross-Platform Support with SIMDe

SIMDish uses SIMDe (SIMD Everywhere) to provide cross-platform SIMD support. SIMDe is a header-only library that provides portable implementations of SIMD intrinsics on hardware that doesn't natively support them.

Benefits of using SIMDe:

Works on multiple platforms (x86, ARM, PowerPC, etc.)
Automatically uses native SIMD instructions when available
Falls back to portable implementations when needed
No need to write different code for different platforms
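To get a feel for the "native when available, portable otherwise" idea, the sketch below checks the macros that GCC and Clang predefine for the instruction sets they are allowed to target. This is only an illustration: SIMDe's own detection logic is more elaborate, and the function name `native_simd_target` is hypothetical.

```c
/* Report which native instruction set the compiler may target.
   Hypothetical helper; SIMDe performs its own, more detailed detection. */
const char *native_simd_target(void) {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#elif defined(__ARM_NEON)
    return "NEON";
#else
    return "none (portable fallback)";
#endif
}
```

Which branch is taken depends on the target and on build flags such as -mavx2 or -march=native, so the same source can report different results on different machines.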

Project Structure

simdish/
├── c/                      # C implementation of SIMD functions
│   ├── simd_add.c          # Vector addition using SIMD
│   ├── simd_sub.c          # Vector subtraction using SIMD
│   ├── simd_mul.c          # Vector multiplication using SIMD
│   ├── simd_div.c          # Vector division using SIMD
│   ├── simdish.h           # Header file with function declarations
│   ├── Makefile            # For compiling C code into a shared library
│   └── simde/              # SIMDe library (submodule)
├── src/
│   └── simdish.cr          # Crystal FFI bindings and wrapper methods
├── spec/
│   └── simdish_spec.cr     # Tests for the library
└── examples/
    └── basic.cr            # Basic usage example

How It Works

C Implementation: The core SIMD functions are implemented in C using SIMDe.
Compilation: The C code is compiled into a shared library (.so file).
FFI Binding: Crystal's FFI is used to bind to the compiled C functions.
Crystal Wrapper: User-friendly Crystal methods wrap the FFI calls.

Installation

Add the dependency to your shard.yml:

dependencies:
  simdish:
    github: kojix2/simdish

Run shards install
Initialize and update the SIMDe submodule:

cd lib/simdish
git submodule update --init

Compile the C code:

cd lib/simdish/c
make

Usage

Before running the examples or tests, make sure the dynamic loader can find the compiled library. From the repository root, set DYLD_LIBRARY_PATH (macOS) or LD_LIBRARY_PATH (Linux):

# For macOS
DYLD_LIBRARY_PATH=$(pwd)/c crystal examples/basic.cr

# For Linux
LD_LIBRARY_PATH=$(pwd)/c crystal examples/basic.cr

require "simdish"

# Create Float32 slices with 8 elements (the size must be a multiple of 8)
a = Slice[1.0_f32, 2.0_f32, 3.0_f32, 4.0_f32, 5.0_f32, 6.0_f32, 7.0_f32, 8.0_f32]
b = Slice[8.0_f32, 7.0_f32, 6.0_f32, 5.0_f32, 4.0_f32, 3.0_f32, 2.0_f32, 1.0_f32]

# SIMD addition
result = SIMDish.add(a, b)
puts result # => Slice[9.0, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0]

# SIMD subtraction
result = SIMDish.sub(a, b)
puts result

# SIMD multiplication
result = SIMDish.mul(a, b)
puts result

# SIMD division
result = SIMDish.div(a, b)
puts result

Understanding the Code

C Implementation with SIMDe

Let's look at how vector addition is implemented using SIMDe:

// From c/simd_add.c
#define SIMDE_ENABLE_NATIVE_ALIASES
#include "simde/x86/avx2.h"

void simd_add(const float *a, const float *b, float *result) {
    // Load 8 float values into SIMD registers
    simde__m256 va = simde_mm256_loadu_ps(a);
    simde__m256 vb = simde_mm256_loadu_ps(b);
    
    // Add 8 pairs of floats in parallel with a single instruction
    simde__m256 vresult = simde_mm256_add_ps(va, vb);
    
    // Store the result back to memory
    simde_mm256_storeu_ps(result, vresult);
}

Key SIMDe functions used:

simde_mm256_loadu_ps: Load 8 float values into a 256-bit register
simde_mm256_add_ps: Add two 256-bit registers (8 floats) in parallel
simde_mm256_storeu_ps: Store 8 float values from a register to memory
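The "u" in loadu and storeu stands for unaligned: the pointer does not need to sit on a 32-byte boundary. A scalar model of the three operations may make their semantics easier to see. The type `vec8f` and the `vec8f_*` and `model_add` names below are invented for this sketch and are not SIMDe or SIMDish APIs; real SIMD hardware performs the lane-wise add as one instruction rather than a loop.

```c
#include <string.h>

/* Hypothetical stand-in for a 256-bit register: 8 lanes of float. */
typedef struct { float lane[8]; } vec8f;

/* Models an unaligned load: copy 8 floats from any address. */
vec8f vec8f_loadu(const float *p) {
    vec8f v;
    memcpy(v.lane, p, sizeof v.lane);
    return v;
}

/* Models the lane-wise add; hardware does all 8 lanes at once. */
vec8f vec8f_add(vec8f a, vec8f b) {
    vec8f r;
    for (int i = 0; i < 8; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Models an unaligned store: copy 8 lanes back to memory. */
void vec8f_storeu(float *p, vec8f v) {
    memcpy(p, v.lane, sizeof v.lane);
}

/* Same shape as simd_add, built from the scalar model. */
void model_add(const float *a, const float *b, float *result) {
    vec8f_storeu(result, vec8f_add(vec8f_loadu(a), vec8f_loadu(b)));
}
```

Because the model copies through memcpy, it works for pointers at any offset into an array, which mirrors why the Crystal wrapper can pass `a.to_unsafe + i` directly.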

Crystal FFI Binding

# From src/simdish.cr
@[Link(ldflags: "#{__DIR__}/../c/libsimdish.so")]
lib LibSIMDish
  # SIMD vector addition for Float32 arrays
  fun simd_add(a : Float32*, b : Float32*, result : Float32*) : Void
  
  # Other functions...
end

Crystal Wrapper

# From src/simdish.cr
def self.add(a : Slice(Float32), b : Slice(Float32)) : Slice(Float32)
  raise ArgumentError.new("Arrays must have the same size") if a.size != b.size
  raise ArgumentError.new("Array size must be a multiple of 8") if a.size % 8 != 0

  result = Slice(Float32).new(a.size)
  
  # Process 8 elements at a time
  (0...a.size).step(8) do |i|
    LibSIMDish.simd_add(a.to_unsafe + i, b.to_unsafe + i, result.to_unsafe + i)
  end
  
  result
end
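The wrapper above crosses the FFI boundary once per group of 8 elements. One possible variation, shown here only as a sketch, is to move the loop into C so that a whole array is handled in one call and leftover elements are processed by a scalar tail. The function `add_n` is hypothetical and not part of SIMDish; its inner chunk loop is plain C standing in for the load/add/store sequence shown in simd_add.

```c
#include <stddef.h>

/* Hypothetical helper, not part of SIMDish: adds n floats in one call. */
void add_n(const float *a, const float *b, float *result, size_t n) {
    size_t i = 0;

    /* Full chunks of 8. A SIMD version would replace this inner loop
       with one load/add/store sequence per chunk. */
    for (; i + 8 <= n; i += 8) {
        for (size_t j = 0; j < 8; j++) {
            result[i + j] = a[i + j] + b[i + j];
        }
    }

    /* Scalar tail: the remaining 0 to 7 elements. */
    for (; i < n; i++) {
        result[i] = a[i] + b[i];
    }
}
```

With a tail loop like this, the size would no longer need to be a multiple of 8. Whether the single call is measurably faster than the per-chunk calls is something to benchmark rather than assume.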

Performance Considerations

SIMD operations can significantly improve performance for numerical computations:

Processing 8 elements at once can theoretically provide up to 8x speedup
Actual performance gains depend on various factors including memory access patterns
SIMD is most effective for compute-bound operations on large arrays
SIMDe maps its functions to native instructions when the target supports them, so it adds little overhead there; the portable fallback may be slower
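Since real gains depend on the workload, it helps to measure. The following is a minimal timing sketch using the standard clock() function; `scalar_add`, `add_fn`, and `time_add` are names made up for this example. It times a scalar baseline, and any function with the same signature could be passed in for comparison.

```c
#include <stddef.h>
#include <time.h>

/* Scalar baseline to compare a SIMD version against. */
void scalar_add(const float *a, const float *b, float *result, size_t n) {
    for (size_t i = 0; i < n; i++) result[i] = a[i] + b[i];
}

/* Any add routine with this signature can be timed. */
typedef void (*add_fn)(const float *, const float *, float *, size_t);

/* Returns the CPU seconds spent running fn `reps` times over n elements. */
double time_add(add_fn fn, const float *a, const float *b,
                float *result, size_t n, int reps) {
    clock_t start = clock();
    for (int r = 0; r < reps; r++) fn(a, b, result, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Two caveats apply. Compilers can auto-vectorize the scalar loop at higher optimization levels, which narrows the gap. Element-wise addition also does very little arithmetic per byte loaded, so on large arrays memory bandwidth, not the ALU, is often the limit.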

Platform Support

Thanks to SIMDe, this library works on multiple platforms:

x86/x86_64 (Intel, AMD)
ARM/ARM64 (Apple M1/M2, Raspberry Pi, etc.)
PowerPC
WASM (WebAssembly)
And more!

Requirements

Array sizes must be multiples of 8 (because we process 8 Float32 values at once)
C compiler (gcc, clang, etc.)
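When the input length is not a multiple of 8, one common workaround is to pad it up to the next multiple. The helpers below are a hypothetical sketch (SIMDish itself raises an error instead). The fill value is a parameter because the right padding depends on the operation: 0.0 is harmless for addition, while 1.0 in the divisor avoids dividing by zero in the padded lanes.

```c
#include <stddef.h>
#include <string.h>

/* Smallest multiple of 8 that is >= n. Hypothetical helper. */
size_t round_up_to_8(size_t n) {
    return (n + 7) / 8 * 8;
}

/* Copy n floats into dst, then fill up to round_up_to_8(n) with `fill`.
   dst must have room for round_up_to_8(n) floats. */
void copy_padded(float *dst, const float *src, size_t n, float fill) {
    size_t padded = round_up_to_8(n);
    memcpy(dst, src, n * sizeof(float));
    for (size_t i = n; i < padded; i++) dst[i] = fill;
}
```

After the SIMD call, only the first n elements of the result are meaningful; the padded lanes can be ignored.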

Development

Clone the repository
Initialize the SIMDe submodule: git submodule update --init
Modify C code in the c directory
Run make in the c directory to compile
Run crystal spec to test

Contributing

Fork it (https://github.com/kojix2/simdish/fork)
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)
Create a new Pull Request

Author

kojix2 - creator and maintainer

License

MIT License
