suffix_tree

A Crystal implementation of a Suffix Tree

Suffix Tree - (pure crystal-lang)

UNDER CONSTRUCTION

GitHub version CI

Because everyone sucks at describing a suffix tree, and its implementation. This will be a suffix tree by example, heavily commented so that hopefully it doesnt suck!

In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations. The construction of such a tree for the string takes time and space linear in the length of. Once constructed, several operations can be performed quickly, for instance locating a substring in, locating a substring if a certain number of mistakes are allowed, locating matches for a regular expression pattern etc. Suffix trees also provide one of the first linear-time solutions for the longest common substring problem. These speedups come at a cost: storing a string's suffix tree typically requires significantly more space than storing the string itself. (Wikipedia Allisons)

           tree                       substrings

tree-->|---mississippi                m .. mississippi
       |
       |---i-->|---ssi-->|---ssippi   i .. ississippi
       |       |         |
       |       |         |---ppi      issip,issipp,issippi
       |       |
       |       |---ppi                ip, ipp, ippi
       |
       |---s-->|---si-->|---ssippi    s .. ssissippi
       |       |        |
       |       |        |---ppi       ssip, ssipp, ssippi
       |       |
       |       |---i-->|---ssippi     si .. sissippi
       |               |
       |               |---ppi        sip, sipp, sippi
       |
       |---p-->|---pi                 p, pp, ppi
               |
               |---i                  p, pi
--- Suffix Tree for "mississippi" ---
example from => http://www.allisons.org/ll/AlgDS/Tree/Suffix/
If the non-empty suffixes are sorted:

T11 = i
T8  = ippi
T5  = issippi
T2  = ississippi
T1  = mississippi
T10 = pi
T9  = ppi
T7  = sippi
T4  = sissippi
T6  = ssippi
T3  = ssissippi

Installation

Add this to your application's shard.yml:

dependencies:
  suffix_tree:
    github: johnjansen/suffix_tree

Usage

require "suffix_tree"

Contributing

  1. Fork it ( https://github.com/johnjansen/suffix_tree/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Contributors

Repository

suffix_tree

Owner
Statistic
  • 0
  • 0
  • 0
  • 0
  • 0
  • about 7 years ago
  • July 4, 2017
License

MIT License

Links
Synced at

Sun, 15 Sep 2024 15:01:41 GMT

Languages