grok.cr
Regexes for mere mortals!
This is heavily influenced by the grok processor in Elasticsearch and the grok filter in Logstash.
Grok was written by Jordan Sissel (the creator of Logstash). It serves as a library of regular expressions, so you do not have to remember awkward regex syntax; you only need to remember the names that map to patterns.
This is a port of that functionality to Crystal. It ships the same grok patterns as Logstash and the Elasticsearch grok processor.
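To illustrate the idea of names mapping to patterns, here is a minimal sketch that uses only Crystal's standard library. It is not this shard's implementation: the `PATTERNS` table, the `expand` helper, and the choice to anchor the expression with `^` and `$` are all assumptions made for the example.

```crystal
# Hypothetical pattern table for illustration; a real grok library
# ships a much larger set of named patterns.
PATTERNS = {
  "INT"  => "[+-]?[0-9]+",
  "WORD" => "\\b\\w+\\b",
  "DATA" => ".*?",
}

# Replace every %{NAME:field} reference with a named capture group
# built from the pattern registered under NAME.
# Raises KeyError if NAME is not in the table.
def expand(expression : String) : Regex
  source = expression.gsub(/%\{(\w+):(\w+)\}/) do |_, match|
    "(?<#{match[2]}>#{PATTERNS[match[1]]})"
  end
  # Anchoring is an assumption of this sketch, so that the lazy DATA
  # pattern has to consume the rest of the line.
  Regex.new("^#{source}$")
end

regex = expand("This is a %{DATA:my_field}")
if match = regex.match("This is a test")
  puts match["my_field"] # prints: test
end
```

The caller only writes `%{DATA:my_field}`; the regex behind `DATA` stays inside the pattern table.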
Installation
- Add the dependency to your shard.yml:

    dependencies:
      grok:
        github: spinscale/grok.cr
        version: 0.0.1

- Run shards install
Usage
The simplest usage looks like this:
require "grok"
grok = Grok.new [ "This is a %{DATA:my_field}" ]
map = grok.parse "This is a test"
map["my_field"] == "test"
That is a simple example. Now take a line from an HTTP access log such as
1.1.1.1 - auth_user [12/Dec/2019:12:45:45 -0700] "GET / HTTP/1.1" 200 633 "http://referer.org" "Secret Browser"
This line can be parsed into a map with
grok = Grok.new ["%{COMBINEDAPACHELOG}"]
result = grok.parse %q(1.1.1.1 - auth_user [12/Dec/2019:12:45:45 -0700] "GET / HTTP/1.1" 200 633 "http://referer.org" "Secret Browser")
# result will be a map of
# {"clientip" => "1.1.1.1", "ident" => "-", "auth" => "auth_user",
# "timestamp" => "12/Dec/2019:12:45:45 -0700", "verb" => "GET",
# "request" => "/", "httpversion" => "1.1", "rawrequest" => nil,
# "response" => "200", "bytes" => "633",
# "referrer" => "\"http://referer.org\"",
# "agent" => "\"Secret Browser\""
# }
You can also write patterns for custom log lines from your own applications:
grok = Grok.new [ "%{SYSLOGBASE2} %{WORD:action} on %{WORD:interface} to %{IP:ip} port %{INT:port} interval %{INT:interval} %{GREEDYDATA:message}" ]
result = grok.parse "Dec 29 22:41:02 mako dhclient[11675]: DHCPDISCOVER on enp59s0f1 to 255.255.255.255 port 67 interval 3 (xid=0x4d444363)"
# {
# "timestamp" => "Dec 29 22:41:02", "timestamp8601" => nil,
# "facility" => nil, "priority" => nil, "logsource" => "mako",
# "program" => "dhclient", "pid" => "11675", "action" => "DHCPDISCOVER",
# "interface" => "enp59s0f1", "ip" => "255.255.255.255", "port" => "67",
# "interval" => "3", "message" => "(xid=0x4d444363)"
# }
You can also convert captured values directly to types:
grok = Grok.new [ "This is a %{INT:my_field:integer}" ]
map = grok.parse "This is a 123"
map["my_field"] == 123_i32
Supported types are integer, long, double, float and boolean, to support the existing types of other implementations.
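As a rough illustration of what such a conversion step might look like, the sketch below maps each type name to a Crystal type using only the standard library. It is not taken from this shard: the `convert` helper and the `Value` alias are hypothetical, and only the `integer` to `Int32` mapping is confirmed by the example above. The other mappings, and the rule that only the string "true" is truthy, are assumptions.

```crystal
# Hypothetical union of the value types a converted field could hold.
alias Value = String | Int32 | Int64 | Float32 | Float64 | Bool

# Convert a captured string according to the type suffix of a
# %{PATTERN:field:type} reference. Unknown types keep the raw string.
def convert(raw : String, type : String) : Value
  case type
  when "integer" then raw.to_i32
  when "long"    then raw.to_i64
  when "float"   then raw.to_f32
  when "double"  then raw.to_f64
  when "boolean" then raw == "true" # assumption: only "true" is truthy
  else                raw
  end
end

puts convert("123", "integer") == 123_i32 # prints: true
```

Note that `to_i32` and its siblings raise an `ArgumentError` on malformed input, so a production implementation would need to decide how to handle captures that do not parse.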
If you want to know what other patterns are available, check out the patterns.cr file or the grok-patterns directory, which list all the patterns and their accompanying regexes.
Bugs and issues
If you find a bug, please provide a reproducible example.
Contributing
- Fork it (https://github.com/spinscale/grok.cr/fork)
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Ensure that ./bin/ameba passes
- Create a new Pull Request
Contributors
- Alexander Reelsen - creator and maintainer
MIT License