What is hydra-file_characterization?

hydra-file_characterization provides a wrapper around file characterization tools such as FITS.

Product Owner & Maintenance

hydra-file_characterization is a Core Component of the Samvera community. The documentation for what this means can be found here.

Getting Started

If you are using Rails, add the following to an initializer (./config/initializers/hydra-file_characterization_config.rb):

Hydra::FileCharacterization.configure do |config|
  config.tool_path(:fits, '/path/to/fits')
end

To characterize a file:

Hydra::FileCharacterization.characterize(File.read(filename), File.basename(filename), :fits)
  • Why File.read? To highlight that the first argument is the file's content as a string. In the case of ActiveFedora, we have a StringIO instead of a file.
  • Why File.basename? In the case of FITS, the characterization takes cues from the file extension.
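
The two points above can be illustrated with a minimal sketch. It assumes that a wrapper receiving content as a string has to write that content to a temporary file before an external tool can read it, and that keeping the original extension lets the tool take its cues from it. The helper name `with_extension_preserving_tempfile` is hypothetical and is not part of the gem's public API.

```ruby
require 'tempfile'

# Hypothetical helper, not part of the gem's public API: writes string
# content to a temporary file that keeps the original file's extension,
# then yields the temporary path to the block.
def with_extension_preserving_tempfile(content, filename)
  ext  = File.extname(filename)        # e.g. ".rb"
  base = File.basename(filename, ext)  # e.g. "file"
  Tempfile.create([base, ext]) do |file|
    file.binmode
    file.write(content)
    file.flush
    yield file.path
  end
end

# Usage: the block sees a real path whose extension is ".rb".
with_extension_preserving_tempfile("puts 'hello'", 'file.rb') do |path|
  puts File.extname(path)
end
```

Passing only the basename is enough here, because the directory of the original file plays no part in the temporary path.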

You can call a single characterizer...

xml_string = Hydra::FileCharacterization.characterize(File.read("/path/to/my/file.rb"), 'file.rb', :fits)

...or, for a particular call, specify a custom FITS path...

xml_string = Hydra::FileCharacterization.characterize(contents_of_a_file, 'file.rb', :fits) do |config|
  config[:fits] = './really/custom/path/to/fits'
end

...or even make the path callable...

xml_string = Hydra::FileCharacterization.characterize(contents_of_a_file, 'file.rb', :fits) do |config|
  config[:fits] = lambda {|filename|  }
end

...or even define a custom characterizer inline...

xml_string = Hydra::FileCharacterization.characterize(contents_of_a_file, 'file.rb', :my_characterizer) do |config|
  config[:my_characterizer] = lambda {|filename|  }
end
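
As a rough illustration of what such a callable might look like, here is a sketch under two assumptions: that the callable is handed the path of the file to characterize, and that its return value is used as the characterization output. The lambda name `size_characterizer` and its XML shape are hypothetical. The sketch exercises the lambda directly, so it runs without the gem.

```ruby
require 'tempfile'

# Hypothetical callable: builds a tiny XML summary instead of shelling out
# to an external tool.
# Assumption: it receives a file path and returns the output as a string.
size_characterizer = lambda do |filename|
  "<file name=\"#{File.basename(filename)}\" bytes=\"#{File.size(filename)}\"/>"
end

# Exercise the callable on its own against a temporary file.
Tempfile.create(['sample', '.txt']) do |file|
  file.write('hello')
  file.flush
  puts size_characterizer.call(file.path)
end
```

If those assumptions hold, a lambda like this could be assigned to `config[:my_characterizer]` in the block form shown above.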

You can also call multiple characterizers at the same time.

fits_xml, ffprobe_xml = Hydra::FileCharacterization.characterize(contents_of_a_file, 'file.rb', :fits, :ffprobe)

Registering New Characterizers

To register a new characterizer, add a characterizer class to the Hydra::FileCharacterization::Characterizers namespace.
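
One way namespace-based registration can work is sketched below. This is a self-contained stand-in, not the gem's code: the `Demo` module, the `Base` class, the `WordCount` characterizer, and `characterizer_for` are all hypothetical names, and the gem's real base class and lookup may differ. It only shows the general idea of resolving a symbol such as `:word_count` to a class that lives in a known namespace.

```ruby
require 'tempfile'

module Demo
  module Characterizers
    # Stand-in base class; the gem's real base class may differ.
    class Base
      def initialize(filename)
        @filename = filename
      end

      def call
        raise NotImplementedError, 'subclasses must implement #call'
      end
    end

    # Hypothetical characterizer: reports the number of words in the file.
    class WordCount < Base
      def call
        File.read(@filename).split.size.to_s
      end
    end
  end

  # Maps :word_count to "WordCount" and looks it up in the namespace.
  # Raises NameError when no matching class has been defined.
  def self.characterizer_for(tool_name)
    class_name = tool_name.to_s.split('_').map(&:capitalize).join
    Characterizers.const_get(class_name)
  end
end

# Usage: defining the class in the namespace is all the registration needed.
Tempfile.create(['words', '.txt']) do |file|
  file.write('one two three')
  file.flush
  puts Demo.characterizer_for(:word_count).new(file.path).call
end
```

The appeal of this design is that no explicit registry has to be maintained: the class name itself is the registration.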


Releasing

  1. bundle install
  2. Increase the version number in lib/hydra/file_characterization/version.rb
  3. Increase the same version number in .github_changelog_generator
  4. Update CHANGELOG.md by running this command:
     github_changelog_generator --user samvera --project hydra-file_characterization --token YOUR_GITHUB_TOKEN_HERE
  5. Commit these changes to the master branch
  6. Run rake release


