License Scout

License Scout is a utility that discovers and aggregates the licenses for your software project's transitive dependencies.

Currently supported Dependency Types and Dependency Managers are:

| Dependency Type | Supported Dependency Managers |
| --- | --- |
| chef_cookbook | berkshelf |
| erlang | rebar |
| elixir | mix |
| golang | dep, godep, glide |
| habitat | habitat |
| nodejs | npm |
| perl | cpan |
| ruby | bundler |
| rust | cargo |

Installation

Ruby Gem

gem install license_scout

Dependencies

  • If you wish to scan for berkshelf dependencies, you'll need to manually install the Berkshelf gem in the same Ruby as License Scout.
  • If you wish to scan for mix or rebar dependencies, you'll need to install Erlang OTP 18.3 or greater.
  • If you wish to scan for cargo dependencies, you'll need to manually install cargo.

Habitat

hab pkg install chef/license_scout
hab pkg binlink chef/license_scout license_scout

Usage

License Scout's default behavior is to scan the current directory and return a breakdown of all the licenses it can find.

my_project $ license_scout

+------+------------+------------+---------+
| Type | Dependency | License(s) | Results |
+------+------------+------------+---------+
...

License Scout will exit 0 if it was able to find licenses for all your dependencies. Otherwise, it will exit 1.
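Because the result is carried by the exit status, a build script can gate on it. The following is a minimal sketch of doing so from Ruby. The helper name `licenses_ok?` is hypothetical, and the sketch assumes `license_scout` is available on your PATH.

```ruby
# Hypothetical helper: run a command and report whether it exited 0.
# Kernel#system returns true on exit status 0, false on a non-zero
# status, and nil if the command could not be started at all.
def licenses_ok?(command = ["license_scout"])
  system(*command) == true
end

# In a build script, assuming license_scout is installed:
#   abort("License check failed") unless licenses_ok?
```

Comparing against `true` treats "could not run the command" the same as a failed scan, which is usually the safer choice in CI.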

Under the covers, License Scout leverages Licensee (the same Ruby Gem GitHub uses to detect OSS licenses). In addition to using Licensee to scan your source code for licenses, License Scout goes a step further and attempts to determine whether the metadata provided by the Dependency Manager specifies which license each dependency uses. At the end of the process, License Scout provides a Dependency Manifest with the following information:

  1. The name of the license(s) (the SPDX ID if it is a recognized open source license).
  2. The name of the file where License Scout found the license.
  3. The contents of the license file (if available).

In addition to the printout provided to STDOUT, License Scout will also save a JSON manifest of all your dependencies to disk.

{
  "license_manifest_version": 2,
  "generated_on": "<DATE>",
  "name": "<YOUR_PROJECT>",
  "dependencies": [...]
}

For more information about the structure of the JSON manifest, please check out the full JSON Schema.
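As a rough illustration of consuming the manifest, the sketch below reads only the four top-level keys shown above. The function name `summarize_manifest` and the file name in the usage comment are hypothetical, and the layout of each entry in `dependencies` is deliberately not assumed here; consult the JSON Schema for that.

```ruby
require "json"

# Summarize the top-level fields of a manifest. Entries inside
# "dependencies" are only counted, never inspected.
def summarize_manifest(json_text)
  manifest = JSON.parse(json_text)
  {
    version: manifest.fetch("license_manifest_version"),
    name: manifest.fetch("name"),
    generated_on: manifest.fetch("generated_on"),
    dependency_count: manifest.fetch("dependencies").length,
  }
end

# Usage, with a hypothetical file name:
#   puts summarize_manifest(File.read("my_project-manifest.json"))
```

Using `fetch` rather than `[]` makes the script fail loudly if a future manifest version renames one of these keys.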

Result Types

License Scout will provide a summary of the licenses it finds to STDOUT. These results are intended to provide direction as to which actions may or may not be necessary to generate a Dependency Manifest that meets all of your compliance requirements. To do this it categorizes its findings into the following results.

| Result | Description |
| --- | --- |
| Flagged | License Scout was able to determine the license for this software dependency, and it is one of the licenses you have explicitly flagged. You should either remove the dependency or add an Exception. |
| Missing | License Scout could not find any license files or license metadata associated with this dependency. You should contact the maintainer and/or specify a Fallback License. |
| Not Allowed | License Scout was able to determine the license for this software dependency, but it is not one of the licenses you have explicitly allowed. You should either remove the dependency or add an Exception. |
| OK | There were no issues. |
| Undetermined | License Scout found a license file but was unable to determine (with sufficient confidence) what license that file represents. License Scout was also unable to determine the license using Dependency Manager metadata. You should contact the maintainer and/or specify a Fallback License. |
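One way to read the result types as a decision procedure is sketched below. This is an interpretation of the descriptions above, not License Scout's actual code: the function name `categorize` and its parameters are hypothetical, Exceptions and Fallback Licenses are ignored, and treating a dependency with several licenses as acceptable when any one of them is allowed is an assumption.

```ruby
# license_ids:         license IDs resolved for one dependency (may be empty)
# license_files_found: whether any candidate license file was located
def categorize(license_ids, license_files_found:, allowed: [], flagged: [])
  if license_ids.empty?
    # Nothing could be identified: distinguish "file seen but not
    # recognized" from "nothing found at all".
    return license_files_found ? :undetermined : :missing
  end
  return :flagged if license_ids.any? { |id| flagged.include?(id) }
  if !allowed.empty? && license_ids.none? { |id| allowed.include?(id) }
    return :not_allowed
  end
  :ok
end
```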

Advanced Usage

Configuration File(s)

You can control License Scout's behavior by providing one or more YAML configuration files, available either locally or via HTTP, to the --config-files option of the CLI.

$ license_scout --config-files http://example.com/license_scout/common.yml,./.license_scout.yml

License Scout evaluates these files in the order they are provided, allowing you to hydrate configuration by composing multiple files together. For example, you can have a single organization-wide configuration file that specifies what licenses are allowed and a project-specific configuration file that specifies exceptions and which directories to scan.

How multiple configuration files are handled

License Scout uses mixlib-config to handle its configuration. When loading multiple configuration files, mixlib-config (and thus License Scout) does not perform deep merges of Arrays. That means License Scout will not merge (for example) allowed_licenses (or flagged_licenses) from two different configuration files; it only takes the allowed_licenses value from the configuration that is loaded last. This logic does not apply to fallbacks or exceptions themselves, because those are defined as config_contexts. It does, however, apply to the individual types specified within fallbacks or exceptions.
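The effect can be illustrated with plain Ruby hashes. The sketch below only mimics the behavior described above and does not call mixlib-config; the `compose` helper and the sample values are hypothetical.

```ruby
# Combine configurations in load order. Hashes (standing in for
# config_contexts) are merged key by key; any other value, including
# an Array, is replaced by the later file's value.
def compose(*configs)
  configs.reduce({}) do |merged, config|
    merged.merge(config) do |_key, earlier, later|
      earlier.is_a?(Hash) && later.is_a?(Hash) ? earlier.merge(later) : later
    end
  end
end

organization = {
  "allowed_licenses" => ["Apache-2.0", "MIT"],
  "exceptions" => {
    "ruby" => [{ "name" => "bundler" }],
    "chef_cookbook" => [{ "name" => "apache2 (5.*)" }],
  },
}
project = {
  "allowed_licenses" => ["BSD-3-Clause"],
  "exceptions" => { "ruby" => [{ "name" => "json (1.8.3)" }] },
}

result = compose(organization, project)
p result["allowed_licenses"]            # ["BSD-3-Clause"]
p result["exceptions"].keys             # ["ruby", "chef_cookbook"]
p result["exceptions"]["ruby"].length   # 1
```

In this sketch the project file replaces the organization's `allowed_licenses` and its `ruby` exceptions outright, while the `chef_cookbook` exceptions survive because the project file never mentions that type. If you want the organization-wide values to persist, repeat them in the later file.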

Allowed and Flagged Licenses

License Scout lets you provide either a list of licenses that are explicitly allowed or a list of licenses that should be flagged for further scrutiny.

  • When you specify a list of allowed_licenses, License Scout will exit 1 if it detects a dependency with a license other than one on the list.
  • When you specify a list of flagged_licenses, License Scout will exit 1 if it finds a dependency with that license.

To add a license to the list of allowed or flagged licenses, provide an array of license strings in your configuration file. A configuration may have a list of allowed licenses or a list of flagged licenses, but not both. License Scout does not support regular expressions or glob patterns for allowed_licenses or flagged_licenses.

allowed_licenses:
  - Apache-2.0

# OR

flagged_licenses:
  - Apache-2.0

License Scout will compare these string values to the licenses it finds within the dependencies. License Scout does its best to resolve everything down to valid SPDX IDs, so you should specify licenses using their SPDX ID.

Warning: Because we cannot control how maintainers specify licenses in their metadata, there may be a situation where License Scout cannot correctly detect the intended SPDX ID. In this case, you may need to provide a temporary Fallback License in your configuration. If you encounter this situation, we encourage you to open an Issue with us.
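If you maintain several configuration files, a small pre-flight check can catch the two mistakes called out in this section: setting both lists at once, and writing a pattern where an exact license ID is expected. The sketch below is a hypothetical helper you might run yourself before invoking License Scout; it is not part of the tool, and the name `check_license_lists` is an assumption.

```ruby
require "yaml"

# Return a list of human-readable problems found in one configuration
# file's allowed_licenses / flagged_licenses settings.
def check_license_lists(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  allowed = Array(config["allowed_licenses"])
  flagged = Array(config["flagged_licenses"])

  errors = []
  if allowed.any? && flagged.any?
    errors << "set allowed_licenses or flagged_licenses, not both"
  end
  (allowed + flagged).each do |id|
    # Glob or regex metacharacters suggest a pattern, which is unsupported.
    if id.to_s.match?(/[*?\[\]]/)
      errors << "#{id.inspect} looks like a pattern; list exact SPDX IDs"
    end
  end
  errors
end
```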

Dependency Exceptions

If you specify a list of allowed or flagged licenses, there may be a dependency that does not adhere to the specified license(s) for which you wish to make an exception. License Scout allows you to specify Exceptions to these lists as part of your Configuration File.

---
allowed_licenses:
  - Apache-2.0

exceptions:
  ruby:
    - name: bundler
      reason: Used only during .gem creation
    - name: json (1.8.3)

Exceptions are organized by type (e.g. ruby - see Table above). There are two elements to each exception: a name and a reason.

| Property | Description |
| --- | --- |
| name | Can be specified by dep-name or dep-name (dep-version) where dep-name is the name of the dependency as it exists in the Dependency Manifest and dep-version can be a traditional version, git reference, or type-specific version specification such as $pkg_version-$pkg_release for Habitat. |
| reason | An optional string that will be included in the Dependency Manifest for documentation purposes. |

Simple glob-style pattern matching is supported for Exceptions, so you can have an Exception for a large collection of dependencies without enumerating them all.

---
exceptions:
  chef_cookbook:
    - name: apache2 (5.*)
      reason: Allowed by TICKET-001
  habitat:
    - name: core/bundler (1.15.1-*)
      reason: Only used for .gem creation
  ruby:
    - name: aws-sdk-*
      reason: Exception granted by Bobo T. Clown on 2018/02/31
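To get a feel for what such patterns would match, the sketch below applies Ruby's `File.fnmatch` to the name and the optional version separately. This is an approximation for illustration only: License Scout's own matching rules may differ, and the helper name `exception_matches?` is hypothetical.

```ruby
# Split an exception name of the form "dep-name" or
# "dep-name (dep-version)" and glob-match each part.
def exception_matches?(pattern, name, version)
  name_glob, version_glob =
    pattern.match(/\A(.+?)(?:\s+\((.+)\))?\z/).captures
  return false unless File.fnmatch(name_glob, name)
  # With no version in the pattern, every version of the dependency matches.
  version_glob.nil? || File.fnmatch(version_glob, version)
end

p exception_matches?("aws-sdk-*", "aws-sdk-core", "3.0.0")   # true
p exception_matches?("apache2 (5.*)", "apache2", "5.1.0")    # true
p exception_matches?("apache2 (5.*)", "apache2", "6.0.0")    # false
p exception_matches?("json (1.8.3)", "json", "1.8.3")        # true
```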

Fallback Licenses

In situations where License Scout is unable to determine the license for a particular dependency, either because Licensee was not able to identify any of the license files or because the Dependency Manager did not provide any metadata that indicated how the dependency was licensed, you'll need to provide a Fallback License in your configuration. Like Exceptions, Fallback Licenses are grouped by type.

fallbacks:
  golang:
    - name: github.com/dchest/siphash
      license_id: CC0-1.0
      license_content: https://raw.githubusercontent.com/dchest/siphash/master/README.md

| Property | Description |
| --- | --- |
| name | The name of the dependency as it appears in the JSON manifest. |
| license_id | The ID of the license as it appears in the JSON manifest. |
| license_content | A URL to a file where the raw text of the license can be downloaded. |

In addition to including any files Licensee identified as potential license files (but couldn't identify), License Scout will also include the Fallback License you specified in the Dependency Manifest.

Searching Nested Subdirectories

License Scout's default behavior is to only look for dependency manager files in the root of the directories that you configure. This default behavior provides greater control over the dependencies that you want to appear in your report. For example, you may not want to enforce license acceptance on an internal-only tool that is included in a project.

License Scout can also scan subdirectories for all dependency manager files and generate a full report on all dependencies that the project uses. To do this, either specify the --include-sub-directories command line flag, or set include_subdirectories to true in your configuration file.

A common use case for this functionality is to run license_scout from the root of a project and get a full report for that project.

license_scout --include-sub-directories

Habitat Channel Configuration

By default, License Scout searches for Habitat packages in the stable channel. If your build process publishes packages to another channel by default, you can use the channel_for_origin habitat configuration option:

habitat:
  channel_for_origin:
    - origin: yourorigin
      channel: dev
    - origin: someotherorigin
      channel: prod
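The lookup this option implies can be sketched as follows: use the channel configured for an origin, and fall back to stable for any origin that is not listed. The helper name `channel_for` is hypothetical and the code only illustrates the documented behavior; it is not taken from License Scout.

```ruby
require "yaml"

# Return the configured channel for an origin, or "stable" when the
# origin has no entry under habitat.channel_for_origin.
def channel_for(origin, config)
  entries = config.dig("habitat", "channel_for_origin") || []
  match = entries.find { |entry| entry["origin"] == origin }
  match ? match["channel"] : "stable"
end

config = YAML.safe_load(<<~CONFIG)
  habitat:
    channel_for_origin:
      - origin: yourorigin
        channel: dev
CONFIG

p channel_for("yourorigin", config)   # "dev"
p channel_for("core", config)         # "stable"
```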

Exporting a Dependency Manifest to another format

By default, License Scout creates the Dependency Manifest as a JSON file. We do this because it provides a single document that can be easily processed into many different forms. License Scout has the ability to also export that JSON file into other formats.

Usage

license_scout export [PATH_TO_JSON_FILE] --format FORMAT

Supported Formats

| Format | Description |
| --- | --- |
| csv | Export the contents of the JSON file into a CSV. |

Configuration

| Value | Description | Default |
| --- | --- | --- |
| directories | The fully-qualified local paths to the directories you wish to scan | The current working directory. |
| include_subdirectories | Whether or not to include all nested sub-directories of directories in the search. | false |
| name | The name you want to give to the scan result. | The basename of the first directory to be scanned. |
| output_directory | The path to the directory where the output JSON file should be saved. | The current working directory. |
| log_level | What log information should be included in STDOUT | info |
| allowed_licenses | Only allow dependencies to have these licenses. | [] |
| flagged_licenses | An array of licenses that should be flagged for removal or exception. | [] |
| exceptions | An array of Exceptions. | [] |
| environment | A hash of additional Environment Variables to pass to mixlib-shellout | {} |
| escript_bin | The path to the escript binary you wish to use when shelling out to Erlang. | escript |
| ruby_bin | The path to the ruby binary you wish to use when shelling out to Ruby. | ruby |
| cpanm_root | The path to where the cpanminus install cache is located. | ~/.cpanm |

Contributing

This project follows the contribution guidelines defined for the chef project. You can find the guidelines here:

https://github.com/chef/chef/blob/master/CONTRIBUTING.md

Pull requests in this project are merged when they have two :+1:s from maintainers.

Maintainers