Pluggable file text and metadata extraction service.
Add this line to your application's Gemfile:
gem 'ddr-extraction'
And then execute:
$ bundle
Or install it yourself as:
$ gem install ddr-extraction
The gem has no external dependencies of its own. Consult the documentation for each extraction tool used by your configuration.
require "ddr-extraction"
Ddr::Extraction.load_defaults!
There are rake tasks for downloading Tika and FITS to expected locations.
rake tika:download
rake fits:download
Ddr::Extraction.configure do |config|
  config.adapters.default = :tika # Use Tika as the default adapter
  config.adapters.tika.path = "/path/to/tika-app.jar"
  config.adapters.fits.path = "/path/to/fits.sh"
end
>> extractor = Ddr::Extraction.build_extractor
>> text = extractor.extract(:text, "spec/fixtures/sample.docx")
>> puts text.read
This is a sample document.
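For readers unfamiliar with the adapter idea behind the configuration and usage above, here is a minimal, self-contained sketch of how a pluggable extractor might dispatch a request to a named adapter. It does not use or reproduce the gem's internals: SketchExtraction, EchoAdapter, NullAdapter, and the adapter: keyword are all hypothetical names chosen for illustration, and the "extraction" is faked so the example runs without Tika or FITS.

```ruby
require "stringio"

# Illustration only: none of these names come from ddr-extraction.
module SketchExtraction
  # Hypothetical adapter that fakes text extraction from the file name.
  class EchoAdapter
    def extract_text(path)
      StringIO.new("text of #{File.basename(path)}")
    end
  end

  # Hypothetical adapter that always returns empty text.
  class NullAdapter
    def extract_text(_path)
      StringIO.new("")
    end
  end

  # Looks up an adapter by name and forwards extract(:text, path)
  # to that adapter's extract_text method.
  class Extractor
    def initialize(adapters, default)
      @adapters = adapters
      @default = default
    end

    def extract(type, path, adapter: @default)
      impl = @adapters.fetch(adapter) do
        raise ArgumentError, "unknown adapter: #{adapter}"
      end
      impl.public_send("extract_#{type}", path)
    end
  end
end

extractor = SketchExtraction::Extractor.new(
  { echo: SketchExtraction::EchoAdapter.new, null: SketchExtraction::NullAdapter.new },
  :echo # default adapter, analogous in spirit to config.adapters.default
)
puts extractor.extract(:text, "spec/fixtures/sample.docx").read
```

In this sketch the caller never touches an adapter directly, so swapping the default changes behavior without changing calling code. That is the property a pluggable design is meant to provide; how the gem itself achieves it may differ.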
- Fork it ( https://github.com/[my-github-username]/ddr_extractor/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request