metascraper
Metascraper is a small library for web scraping.
Give it a URL, and it lets you easily get the page's title, images, description, and videos.
Installation
Add this to your application's shard.yml:
dependencies:
  metascraper:
    github: malina/metascraper
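Once the dependency is listed, it still has to be fetched. A minimal sketch, assuming the `shards` dependency manager that ships with Crystal is on your PATH and the command is run from the project root (the directory containing shard.yml):

```shell
# Resolve and download the dependencies declared in shard.yml into ./lib
shards install
```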
Usage
require "metascraper"
Initialize a Metascraper instance for a URL, like this:
page = Metascraper.new("https://github.com/malina/metascraper")
puts page.title
Accessing scraped data
page.url # URL of the page
page.images # enumerable collection, with every img found on the page
page.title # title of the page from the head section, as a string
page.description # returns the meta description, or the first long paragraph if no meta description is found
page.content # primary readability page content
You can also access most of the scraped data as a hash:
page.to_hash
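The accessors above can be combined into a short script. This is only a sketch: it assumes the shard is installed, the page is reachable over the network, and that each element of `images` can be printed directly (the element type is not documented here).

```crystal
require "metascraper"

# The URL is the one used earlier in this README; any reachable page should work.
page = Metascraper.new("https://github.com/malina/metascraper")

puts "URL:         #{page.url}"
puts "Title:       #{page.title}"
puts "Description: #{page.description}"

# `images` is described above as an enumerable collection,
# so iterating with `each` is assumed to work.
page.images.each do |image|
  puts "Image:       #{image}"
end
```

The output depends entirely on the page being scraped, so none is shown here.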
Contributing
- Fork it ( https://github.com/malina/metascraper/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
Contributors
- malina Alexandr Shumov - creator, maintainer
License
MIT License