hydra-file_characterization
Hydra::FileCharacterization, as extracted from Sufia and Hydra::Derivatives
Purpose
To provide a wrapper for file characterization
How To Use
If you are using Rails add the following to an initializer (./config/initializers/hydra-file_characterization_config.rb):
Hydra::FileCharacterization.configure do |config|
  config.tool_path(:fits, '/path/to/fits')
end
To use the characterizer:
characterization_xml = Hydra.characterize(file.read, file.basename, :fits)
# This does not work at this point
fits_xml, ffprobe_xml = Hydra.characterize(file.read, file.basename, :fits, :ffprobe)
- Why `file.read`? To highlight that we want a string. In the case of ActiveFedora, we have a StringIO instead of a file.
- Why `file.basename`? In the case of Fits, the characterization takes cues from the extension name.
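Because the content arrives as a string while Fits takes cues from the file extension, the content has to be written to a file that keeps the original extension before a tool can run against it. The following is a minimal sketch of that step using only Ruby's standard `Tempfile`. The helper name `with_named_tempfile` is hypothetical and is not taken from this gem; the gem's own implementation may differ.

```ruby
require 'tempfile'

# Hypothetical helper (an assumption for illustration, not the gem's API):
# writes string content to a temp file whose extension matches the original
# filename, yields the temp file's path, then deletes the temp file.
def with_named_tempfile(content, filename)
  extension = File.extname(filename)             # e.g. ".pdf"
  stem      = File.basename(filename, extension) # e.g. "report"

  # Passing [stem, extension] makes Tempfile keep the extension as a suffix.
  tempfile = Tempfile.new([stem, extension])
  begin
    tempfile.binmode          # the content may be binary
    tempfile.write(content)
    tempfile.flush            # make sure the bytes are on disk before use
    yield tempfile.path
  ensure
    tempfile.close
    tempfile.unlink           # remove the temp file once the block is done
  end
end
```

A caller would pass `file.read` and `file.basename` and run the characterization tool against the yielded path inside the block.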
Registering New Characterizers
This is possible by adding a characterizer to the `Hydra::FileCharacterization::Characterizers` namespace.
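As a rough illustration of how a namespace can act as a registry, the sketch below resolves a tool name such as `:byte_count` to a class defined inside a module. Everything in it is an assumption made for illustration: `SketchCharacterizers`, `ByteCount`, `lookup`, and the `call` interface are stand-ins, not the gem's actual namespace, base class, or lookup mechanism.

```ruby
# Stand-in namespace for illustration only. The real namespace is
# Hydra::FileCharacterization::Characterizers, and its expected class
# interface may differ from the one assumed here.
module SketchCharacterizers
  # Hypothetical characterizer: reports the file's byte size instead of
  # shelling out to an external tool.
  class ByteCount
    def initialize(filename)
      @filename = filename
    end

    # Returns a small XML-like string describing the file.
    def call
      "<bytes>#{File.size(@filename)}</bytes>"
    end
  end

  # Resolves a tool name (e.g. :byte_count) to a class in this namespace.
  # Raises NameError when no matching class has been defined.
  def self.lookup(tool_name)
    class_name = tool_name.to_s.split('_').map(&:capitalize).join
    const_get(class_name, false) # false: do not search outside this module
  end
end
```

Under these assumptions, adding a new characterizer only requires defining another class in the module; no separate registration call is needed.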
To Consider
How others are using the extract_metadata method
Todo Steps
- ~~Given a filename, characterize the file and return a raw XML stream~~
- ~~Provide a method for converting a StringIO and original file name to a temp file with a comparable extension, then running the characterizer against the temp file~~
- ~~Provide a configuration option for the fits path; this would be the default for the characterizer~~
- Update existing Sufia implementation
- Deprecate Hydra::Derivatives direct method call
  - Instead call the characterizer with the content
- Allow characterization services to be chained together
  - ~~This would involve renaming the Characterizer to something else (i.e. Characterizers::Fits)~~
- Provide an ActiveFedora Datastream that maps the raw XML stream to a data structure