alea

Repeatable pseudo-random sampling, CDF over most known probability distributions.

Alea is a collection of utilities to work with most known probability distributions, written in pure Crystal.

Features:

Note: This project is at an early stage of development. Many distributions and cumulative distribution functions are still missing, so expect frequent breaking changes.

Why Crystal?

Crystal compiles to fast native code without sacrificing the conveniences of a modern programming language, and it offers a clean interface.

Installation

  1. Add the dependency to your shard.yml:
  dependencies:
    alea:
      github: nin93/alea
  2. Run shards install

Usage

require "alea"

PRNGs

The algorithms used to generate 64-bit unsigned integers and floats come from the xoshiro (XOR/shift/rotate) family, designed by Sebastiano Vigna and David Blackman: very fast generators that also promise excellent statistical properties.

Implemented engines:

  • XSR128 backed by:
    • xoroshiro128++ as #next_u
    • xoroshiro128+ as #next_f
  • XSR256 backed by:
    • xoshiro256++ as #next_u
    • xoshiro256+ as #next_f

The digits in each name give the size of the generator's state in bits. The periods are thus 2^128 - 1 for XSR128 and 2^256 - 1 for XSR256.

The + variants are slightly faster, but their lowest (right-most) bits are statistically weaker, so they are used only for generating random floats, which discard those bits when the value is shifted to obtain the mantissa.
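
To make the paragraph above concrete, here is a minimal, self-contained sketch of a xoroshiro128+ step followed by a common 64-bit-to-float conversion. The class name Xoroshiro128Plus and the method names next_u64 and next_f64 are hypothetical, and the rotation and shift constants (24, 16, 37) are taken from the public reference implementation by Blackman and Vigna. Alea's own code may differ in its details; in particular, real use needs proper seeding and a state that is not all zeros.

```crystal
# Illustrative sketch of xoroshiro128+ (hypothetical class, not Alea's API).
class Xoroshiro128Plus
  def initialize(@s0 : UInt64, @s1 : UInt64)
  end

  # Rotates x left by k bits (0 < k < 64).
  private def rotl(x : UInt64, k : Int32) : UInt64
    (x << k) | (x >> (64 - k))
  end

  # Returns the next 64-bit output and advances the 128-bit state.
  def next_u64 : UInt64
    s0 = @s0
    s1 = @s1
    result = s0 &+ s1 # wrapping addition: the "+" in the name
    s1 ^= s0
    @s0 = rotl(s0, 24) ^ s1 ^ (s1 << 16)
    @s1 = rotl(s1, 37)
    result
  end

  # Keeps the 53 highest bits and scales them into [0, 1).
  # The 11 lowest (weaker) bits are shifted out and never used.
  def next_f64 : Float64
    (next_u64 >> 11).to_f64 / 9007199254740992.0 # divide by 2^53
  end
end

rng = Xoroshiro128Plus.new(1_u64, 2_u64) # toy state, for illustration only
first = rng.next_u64 # 1 + 2 = 3 for this toy state
x = rng.next_f64     # a Float64 in [0, 1)
```

A Float64 carries 53 bits of precision, which is why exactly 11 of the 64 generated bits are dropped in next_f64.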

More information is available at http://prng.di.unimi.it/.

See the benchmarks for a comparison between these engines.

Sampling

Random is the interface provided to perform sampling:

random = Alea::Random.new
random.normal # => -0.36790519967553736

It also accepts an initial seed to reproduce the same seemingly random events across runs:

seed = 9377u64
random = Alea::Random.new(seed)
random.exp # => 2.8445710982736148

By default, the PRNG in use by Random is XSR128. You can, though, pass the desired engine as an argument to the constructor. Here is an example using XSR256:

random = Alea::Random.new(Alea::XSR256)
random.float # => 0.6533582874035311
random.prng  # => Alea::XSR256

# or seeded as well
random = Alea::Random.new(193, Alea::XSR256)
random.float # => 0.80750616724688

All PRNGs in this library inherit from the abstract class PRNG, so you can build your own custom PRNG by inheriting from that class and passing it to Random, just as in the previous example:

class MyGenerator < Alea::PRNG
  def next_u : UInt64
    # must be implemented
  end

  def next_f : Float64
    # must be implemented
  end

  def jump : self
    # must be implemented
  end

  ...
end

random = Alea::Random.new(MyGenerator)

Unsafe methods

Plain sampling methods (such as #normal, #gamma) perform checks on the arguments passed, to prevent bad data generation or internal exceptions. To skip these checks (they might be slow), use the unsafe version of a method, obtained by prepending next_ to its name:

random = Alea::Random.new
random.normal(loc: 0, sigma: 0)      # raises Alea::UndefinedError: sigma is 0 or negative.
random.next_normal(loc: 0, sigma: 0) # these might raise internal exceptions.

Timings are definitely comparable, though. See the benchmarks for direct comparisons between those methods.
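
The split between checked and unchecked methods can be pictured with the following self-contained sketch. It is not Alea's source: the SketchSampler class, its use of the standard library Random, the inverse-transform formula and the ArgumentError are all assumptions chosen for illustration (Alea raises its own Alea::UndefinedError, as shown above).

```crystal
# Hypothetical illustration of the checked/unchecked pattern; not Alea's code.
class SketchSampler
  def initialize(@rng : Random = Random.new)
  end

  # Unchecked version: assumes scale > 0 and performs no validation.
  # Draws from an exponential distribution by inverse transform sampling.
  def next_exp(scale : Float64) : Float64
    -scale * Math.log(1.0 - @rng.next_float)
  end

  # Checked version: validates the argument first, then delegates.
  def exp(scale : Float64 = 1.0) : Float64
    raise ArgumentError.new("scale is 0 or negative.") unless scale > 0.0
    next_exp(scale)
  end
end

sampler = SketchSampler.new
sample = sampler.exp(2.0)      # validated draw, always >= 0
unchecked = sampler.next_exp(0.0) # no error here, but the result is degenerate
```

In this sketch the unchecked call with a zero scale silently returns a meaningless value instead of raising, which is the trade-off the unsafe methods accept in exchange for skipping the check.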

Supported Distributions

Sampling methods are currently implemented for the following distributions:

  • Beta
  • Chi-Square
  • Exponential
  • Gamma
  • Laplace
  • Log-Normal
  • Normal
  • Poisson
  • Uniform

Cumulative Distribution Functions

CDF is the interface used to calculate Cumulative Distribution Functions. Given X ~ D and a fixed value x, the CDF is the function that maps x to the probability that the real-valued random variable X from the distribution D takes a value less than or equal to x.

Arguments passed to CDF methods to shape the distributions are analogous to those used for sampling:

Alea::CDF.normal(0.0)                       # => 0.5
Alea::CDF.normal(2.0, loc: 1.0, sigma: 0.5) # => 0.9772498680518208
Alea::CDF.chisq(5.279, df: 5.0)             # => 0.6172121213841358
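
As a cross-check on the definition, the normal CDF has a closed form in terms of the error function. The sketch below relies only on Math.erf and Math.sqrt from Crystal's standard library; the normal_cdf helper is hypothetical and is not claimed to be how Alea computes its results.

```crystal
# Hypothetical helper, not part of Alea: P(X <= x) for X ~ Normal(loc, sigma),
# computed as 0.5 * (1 + erf((x - loc) / (sigma * sqrt(2)))).
def normal_cdf(x : Float64, loc : Float64 = 0.0, sigma : Float64 = 1.0) : Float64
  0.5 * (1.0 + Math.erf((x - loc) / (sigma * Math.sqrt(2.0))))
end

p0 = normal_cdf(0.0)                       # 0.5, since erf(0) = 0
p2 = normal_cdf(2.0, loc: 1.0, sigma: 0.5) # about 0.97725: x lies two sigmas above loc
```

Both values agree with the Alea::CDF.normal results listed above, up to floating-point rounding.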

Supported Distributions

CDF estimations are currently implemented for the following distributions:

  • Chi-Square
  • Exponential
  • Gamma
  • Laplace
  • Log-Normal
  • Normal
  • Poisson
  • Uniform

References

Fully listed in LICENSE.md:

  • NumPy random module for pseudo-random sampling methods
  • JuliaLang random module for ziggurat methods
  • IncGammaBeta.jl for incomplete gamma functions

Contributing

  1. Fork it (https://github.com/nin93/alea/fork)
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Contributors
