NewspaperWorks

Code: Build Status Coverage Status

Docs: Apache 2.0 License Contribution Guidelines

Jump in: Slack Status

Overview

NewspaperWorks is a gem (Rails "engine") for Hyrax-based digital repository applications to support ingest, management, and display of digitized newspaper content.

NewspaperWorks is not a stand-alone application. It is designed to be integrated into a new or existing Hyrax (2.5.x) application, providing content models, ingest workflows, and feature-rich UX for newspaper repository use-cases.

NewspaperWorks supports:

  • models for Title, Issue, Page, and Article
  • batch ingest via command line
  • OCR and ALTO creation
  • newspaper-specific metadata fields
  • full-text search
  • calendar-based issue browsing
  • advanced search
  • OCR keyword match highlighting
  • viewer with page navigation and deep zooming

A complete list of features can be found here.

Documentation

Documentation to help you learn more about and deploy NewspaperWorks can be found on the Project Wiki, including a PCDM model diagram, metadata schema, batch ingest instructions, and more details on installing, developing, and testing the code.

Requirements

Dependencies

See the wiki for more details on how to install and configure dependencies.

Installation

NewspaperWorks easily integrates with your Hyrax 2.5.x applications.

  • Add gem 'newspaper_works' to your Gemfile.
  • Run bundle install
  • Run rails generate newspaper_works:generate
  • Set config options as indicated below...
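As a minimal sketch of the first step, the Gemfile entry might look like the following. No version constraint is shown because this README does not specify one; the comments simply restate the remaining steps from the list above.

```ruby
# Gemfile (excerpt) of the host Hyrax 2.5.x application
gem 'newspaper_works'

# After saving the Gemfile, continue with the steps listed above:
#   bundle install
#   rails generate newspaper_works:generate
```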

Application/Site Specific Configuration

Config changes made by the installer:

  • In app/controllers/catalog_controller.rb, the config.search_builder_class is set to a new CustomSearchBuilder to support newspaper search features.
  • Additional facet fields for newspaper metadata are added to app/controllers/catalog_controller.rb.
  • Newspaper resource types are added to config/authorities/resource_types.yml.

(It may be helpful to run git diff after installation to see all the changes made by the installer.)
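For orientation, the search builder change described above might look roughly like this excerpt. This is a sketch rather than the installer's actual output: it assumes the class is named CustomSearchBuilder and that the setting lives inside the usual configure_blacklight block of a Hyrax-generated catalog controller. The facet fields the installer adds are not listed in this README, so they are left as a comment.

```ruby
# app/controllers/catalog_controller.rb (excerpt, illustrative only)
class CatalogController < ApplicationController
  configure_blacklight do |config|
    # Set by the installer to support newspaper search features
    config.search_builder_class = CustomSearchBuilder

    # The installer also adds facet fields for newspaper metadata here;
    # run `git diff` after installation to see the exact fields.
  end
end
```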

Configuration changes you should make after running the installer:

in config/initializers/hyrax.rb:

  • set config.geonames_username
  • set config.work_requires_files = false
  • set config.iiif_image_server = true
  • set config.fits_path = /location/of/fits.sh
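Taken together, the settings above might look like this in the initializer. This is a sketch under assumptions: the GeoNames username and the FITS path are placeholders to replace with values for your own environment, and any settings already present in your generated initializer should be kept.

```ruby
# config/initializers/hyrax.rb (excerpt)
Hyrax.config do |config|
  # Placeholder: replace with your own GeoNames account name
  config.geonames_username = 'your_geonames_username'

  # Allow works to be created without attached files
  config.work_requires_files = false

  # Enable the IIIF image server
  config.iiif_image_server = true

  # Placeholder path: point this at the fits.sh script on your system
  config.fits_path = '/location/of/fits.sh'
end
```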

in config/environments/production.rb:

  • set config.public_file_server.enabled = true
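In a standard Rails production environment file, that setting might appear as in the sketch below. Generated Rails applications often derive this value from an environment variable, so the existing line may need to be replaced rather than a new one added.

```ruby
# config/environments/production.rb (excerpt)
Rails.application.configure do
  # Have Rails serve files from the public/ directory
  config.public_file_server.enabled = true
end
```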

Ingesting Content

NewspaperWorks supports a range of ingest workflows.

The ingest process creates a full complement of derivatives for each Page object, including:

  • TIFF
  • JP2
  • PDF
  • OCR text
  • word-coordinate JSON

For more information on derivatives, see the wiki.

Developing, Testing, and Contributing

Detailed information on setting up and configuring development and testing environments can be found here.

A Vagrant VM is available for users and developers to quickly and easily deploy the latest NewspaperWorks codebase using Vagrant and VirtualBox. See samvera-newspapers-vagrant for more.

Additionally, the NewspaperWorks Demo Site is available for those interested in testing out NewspaperWorks as deployed in a vanilla Hyrax application. (NOTE: The demo site may not be running the latest release of NewspaperWorks.)

Contributing

We encourage anyone who is interested in newspapers and Samvera to contribute to this project. How can I contribute?

Acknowledgements

Sponsoring Organizations

This gem is part of a project developed in collaboration between The University of Utah, J. Willard Marriott Library and Boston Public Library, as part of a "Newspapers in Samvera" project grant funded by the Institute of Museum and Library Services.

The development team is grateful for input, collaboration, and support we receive from the Samvera Community, related working groups, and our project's advisory board.

More Information

Contact

Contact any contributors above by email, or ping us on the Samvera Community Slack channel(s).

This software has been developed by and is brought to you by the Samvera community. Learn more at the Samvera website.
