NewspaperWorks is a gem (Rails "engine") for Hyrax-based digital repository applications to support ingest, management, and display of digitized newspaper content.
NewspaperWorks is not a stand-alone application. It is designed to be integrated into a new or existing Hyrax (2.5.x) application, providing content models, ingest workflows, and feature-rich UX for newspaper repository use-cases.
- models for Title, Issue, Page, and Article
- batch ingest via command line
- OCR and ALTO creation
- newspaper-specific metadata fields
- full-text search
- calendar-based issue browsing
- advanced search
- OCR keyword match highlighting
- viewer with page navigation and deep zooming
A complete list of features can be found here.
Documentation to help you learn more about NewspaperWorks and deploy it can be found on the Project Wiki, including a PCDM model diagram, metadata schema, batch ingest instructions, and more details on installing, developing, and testing the code.
- Ruby >=2.4
- Rails ~>5.1
- Hyrax ~>2.5
- ...and various Samvera dependencies that entails.
- A Hyrax-based Rails application
- ImageMagick policy XML may need to be more permissive in both resources and source media types allowed. See template policy.xml.
See the wiki for more details on how to install and configure dependencies.
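The version constraints listed above use RubyGems notation, where `~>` is the "pessimistic" operator: `~> 2.5` accepts 2.5 and later releases in the 2.x series but not 3.0. The short sketch below illustrates this with the `Gem::Requirement` and `Gem::Version` classes that ship with RubyGems. The helper name `satisfies?` is hypothetical and used only for this example.

```ruby
# Sketch: how RubyGems interprets the version constraints listed above.
# `satisfies?` is a hypothetical helper, not part of NewspaperWorks or Hyrax.
require 'rubygems'

def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# Is the Ruby running this script new enough for NewspaperWorks?
puts "Ruby #{RUBY_VERSION} meets '>= 2.4': #{satisfies?('>= 2.4', RUBY_VERSION)}"

# '~> 2.5' allows later 2.x releases but excludes the next major version.
puts satisfies?('~> 2.5', '2.9.0') # true
puts satisfies?('~> 2.5', '3.0.0') # false

# '~> 5.1' likewise allows 5.2.x but not 6.0.
puts satisfies?('~> 5.1', '5.2.4') # true
puts satisfies?('~> 5.1', '6.0.0') # false
```

In practice Bundler performs these checks when it resolves your Gemfile, so a script like this is only useful for understanding why a given version is accepted or rejected.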
NewspaperWorks easily integrates with your Hyrax 2.5.x applications.
- Add gem 'newspaper_works' to your Gemfile.
- Run rails generate newspaper_works:generate
- Set config options as indicated below...
Application/Site Specific Configuration
Config changes made by the installer:
- config.search_builder_class is set to a new CustomSearchBuilder to support newspaper search features.
- Additional facet fields for newspaper metadata are added to
- Newspaper resource types added to
(It may be helpful to run git diff after installation to see all the changes made by the installer.)
Configuration changes you should make after running the installer:
- Enables geolocation tagging of content
- how to create a Geonames username
config.work_requires_files = false
config.iiif_image_server = true
config.fits_path = "/location/of/fits.sh"
config.public_file_server.enabled = true
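As a rough sketch, the settings above might be placed as follows. The file locations (config/initializers/hyrax.rb and config/environments/production.rb) and the `geonames_username` setting name are assumptions based on standard Hyrax and Rails conventions, not something stated in this document. The username and FITS path are placeholders to replace with your own values.

```ruby
# config/initializers/hyrax.rb (assumed location, per Hyrax convention)
Hyrax.config do |config|
  # Assumed setting name for the Geonames account that enables
  # geolocation tagging of content.
  config.geonames_username = 'your_geonames_username' # placeholder

  # Allow works to be created without attached files.
  config.work_requires_files = false

  # Serve images through a IIIF image server.
  config.iiif_image_server = true

  # Path to the FITS executable; replace with the real location.
  config.fits_path = '/location/of/fits.sh'
end

# config/environments/production.rb (assumed location, per Rails convention)
Rails.application.configure do
  # Let Rails serve files from the public/ directory itself.
  config.public_file_server.enabled = true
end
```

These are two separate files shown in one block for brevity. Check each against what the installer has already generated in your application before editing.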
NewspaperWorks supports a range of different ingest workflows:
- single-item ingest via the UI
- batch ingest of NDNP materials (page-level digitization) via command line
- batch ingest of PDF issues via command line
- batch ingest of TIFF or JP2 master files via command line
The ingest process creates a full complement of derivatives for each Page object, including:
- OCR text
- word-coordinate JSON
For more information on derivatives, see the wiki.
Developing, Testing, and Contributing
Detailed information on setting up and configuring development and testing environments can be found here.
A Vagrant VM is available for users and developers to quickly and easily deploy the latest NewspaperWorks codebase using Vagrant and VirtualBox. See samvera-newspapers-vagrant for more.
Additionally, the NewspaperWorks Demo Site is available for those interested in testing out NewspaperWorks as deployed in a vanilla Hyrax application. (NOTE: The demo site may not be running the latest release of NewspaperWorks.)
We encourage anyone who is interested in newspapers and Samvera to contribute to this project. How can I contribute?
This gem is part of a project developed in collaboration between The University of Utah, J. Willard Marriott Library and Boston Public Library, as part of a "Newspapers in Samvera" project grant funded by the Institute of Museum and Library Services.
The development team is grateful for input, collaboration, and support we receive from the Samvera Community, related working groups, and our project's advisory board.
- Samvera Newspapers Group - The Samvera Newspapers Interest Group meets on the first Thursday of every month to discuss the Samvera newspapers project and general newspaper topics.
- Newspapers in Samvera IMLS Grant (formerly Hydra) - The official grant award for the project.
- National Digital Newspaper Program (NDNP)
Contact any of the contributors above by email, or ping us on the Samvera Community Slack channel(s).
This software has been developed by and is brought to you by the Samvera community. Learn more at the Samvera website.