BjnInventory

This gem is designed to help materialize a standardized inventory of devices according to:

  • A model you specify
  • Multiple inventory sources you configure
  • A mapping between each source type and your standard model

The materialized inventory can easily be converted to different formats:

  • Raw JSON or YAML
  • Ansible dynamic inventory

Installation

Add this line to your application's Gemfile:

gem 'bjn_inventory'

And then execute:

$ bundle

Or install it yourself as:

$ gem install bjn_inventory

Usage

An inventory is a list of devices created from a specification (the manifest), which defines:

  • A device model
  • One or more sources, each of which has:
    • A source-specific filename, URL or other location information
    • A source-specific set of rules, which maps source data to the standard model
  • Context data
    • These are just JSON files that you can refer to in your map and model

Example usage (for an Ansible dynamic inventory script--but see ansible-from):

require 'json'
require 'bjn_inventory'

manifest = JSON.parse(File.read('inventory.json'))
inventory = BjnInventory::Inventory.new(manifest)
ansible = {
  group_by: [
    'environment',
    'region',
    'roles'
  ],
  groups: {
    webservers: ['www', 'stage-www']
  }
}
# Adds a method to Array to convert to Ansible dynamic inventory output
puts inventory.to_ansible(ansible)

inventory.json

{ "model": "/etc/inventory/device.json",
  "context": "/etc/inventory/data/context",
  "sources": [
    { "file": "/etc/inventory/data/device42.json",
      "rules": "/etc/inventory/maps/device42.rb" },
    { "file": "/etc/inventory/data/aws.json",
      "rules": "/etc/inventory/maps/aws.rb" }
  ]
}

Model

When bjn_inventory produces an inventory, each device conforms to the device model: it has only the fields specified in the model, and defaults are filled in according to the model.

Your inventory sources may not conform to the model. That is where the rules come in (see Rules): they map the source entries into proper devices, according to the model you provide.

The model generally takes the form of a JSON file, but can be embedded directly in the inventory specification.
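
As a rough illustration of what conforming to a model means, the sketch below keeps only the model's fields and fills in defaults. It assumes, purely for the example, that a model is a flat mapping of field names to default values; the MODEL constant and the conform helper are hypothetical and are not part of bjn_inventory, whose model file format may differ.

```ruby
# Hypothetical model: field names mapped to default values.
# The actual model format used by bjn_inventory may differ.
MODEL = {
  'name'   => nil,
  'region' => 'unknown',
  'roles'  => []
}.freeze

# Keep only the fields named in the model and fill in missing ones
# with the model's defaults. Fields not in the model are dropped.
def conform(entry, model = MODEL)
  model.each_with_object({}) do |(field, default), device|
    device[field] = entry.fetch(field, default)
  end
end

raw = { 'name' => 'web-01', 'roles' => ['web'], 'instance_id' => 'i-123' }
p conform(raw)
```

Here instance_id is dropped because the model does not name it, and region falls back to its default.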

Merge Rules

When two devices need to be merged (for example, you are invoking BjnInventory::Inventory#by(key) and they have the same value for the key field), a new Device object is created with field values taken from the second, merged with the first (similar to a Hash merge). This merge is done according to the device model and the following rules:

  • Non-nil values take precedence over nil
  • Hashes are merged shallowly according to a standard Ruby hash merge
  • Arrays are concatenated, except duplicate values from the second device are not added
  • The second device's other values take precedence

The resulting merged device's #origin method returns an array of the different origins used to merge it together.
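
The merge rules above can be sketched on plain Hashes. This is only an illustration under the assumption that a device can be treated as a Hash of field values; merge_value and merge_devices are hypothetical helpers, not the gem's Device implementation, and origin tracking is left out.

```ruby
# Merge one field value following the four rules listed above.
def merge_value(first, second)
  return first if second.nil?          # non-nil takes precedence over nil
  return second if first.nil?

  if first.is_a?(Hash) && second.is_a?(Hash)
    first.merge(second)                # shallow, standard Ruby Hash merge
  elsif first.is_a?(Array) && second.is_a?(Array)
    first + (second - first)           # concatenate, skipping values already in first
  else
    second                             # otherwise the second device wins
  end
end

# Merge two devices represented as plain Hashes (a stand-in for Device objects).
def merge_devices(first, second)
  (first.keys | second.keys).each_with_object({}) do |field, merged|
    merged[field] = merge_value(first[field], second[field])
  end
end

first  = { 'name' => 'web-01', 'ip_address' => nil,
           'tags' => { 'env' => 'prod' }, 'roles' => ['web'] }
second = { 'name' => 'web-01.example.com', 'ip_address' => '10.0.1.1',
           'tags' => { 'team' => 'ops' }, 'roles' => ['web', 'nginx'] }
p merge_devices(first, second)
```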

Sources

Each source you specify is read, in order. When you invoke BjnInventory::Inventory#by(key), all sources are used. If two entries have the same key, they are merged together using the merge rules (see Merge Rules). Order is strictly preserved here, so sources listed later in the list have data precedence over those listed earlier (like Ruby merge logic).
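
The effect of source ordering can be sketched as follows. This is a simplified stand-in, not the gem's code: index_by_key is a hypothetical helper, sources are plain arrays of Hashes, and Hash#merge is used in place of the full merge rules.

```ruby
# Index entries from several sources by a key field. Sources are visited in
# order, so values from later sources are merged over earlier ones.
# Hash#merge is a simplified stand-in for the merge rules.
def index_by_key(sources, key)
  sources.each_with_object({}) do |entries, index|
    entries.each do |entry|
      id = entry[key]
      index[id] = index.key?(id) ? index[id].merge(entry) : entry
    end
  end
end

first_source  = [{ 'name' => 'web-01', 'region' => 'us-west-2', 'owner' => 'ops' }]
second_source = [{ 'name' => 'web-01', 'region' => 'eu-west-1' }]
p index_by_key([first_source, second_source], 'name')
```

The region from the second source replaces the first, while owner survives because only the first source supplies it.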

Right now, only two kinds of sources are supported: inline, with the entries key, where the inventory entries are specified directly in the source; or with the file key, where the file must contain a JSON array of objects.

In other words, it's assumed that you're separately downloading your source of inventory into JSON files for bjn_inventory to operate on. In the future, a download command or plugin may be allowed here.

This package also provides a downloader command for AWS EC2. Use aws-ec2-source to download and minimally process an EC2 instance list to provide an inventory source.
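
The two source kinds described above (the file key and the entries key) might be combined in one manifest as sketched below. The paths, the inline entry, and the inline rules text are made up for illustration, and example_manifest is a hypothetical helper; only the key names (model, sources, file, entries, rules) come from this document.

```ruby
require 'json'

# Hypothetical manifest mixing both source kinds. Paths, entry fields and
# the inline rules text are illustrative only.
def example_manifest
  {
    'model' => '/etc/inventory/device.json',
    'sources' => [
      # File source: the file must contain a JSON array of objects.
      { 'file'  => '/etc/inventory/data/aws.json',
        'rules' => '/etc/inventory/maps/aws.rb' },
      # Inline source: entries are given directly in the manifest.
      { 'entries' => [
          { 'name' => 'printer-01', 'ip_address' => '192.0.2.10' }
        ],
        'rules' => "origin 'manual'" }
    ]
  }
end

puts JSON.pretty_generate(example_manifest)
```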

Rules

The mapping rules allow you to specify field values in the model and what calculation to perform from a source to derive them in the following DSL (domain-specific language). The rules consist of either text or a filename.

The origin command takes a string, which specifies (for your convenience) the origin type of the resulting devices. For example, your AWS mapping rules might use origin 'aws' to identify resulting devices as having come from the AWS source. You can reuse rules files for different sources, or not, so it's up to you whether this origin represents a particular source of data or a particular kind of device.

The map command takes a hash with a field name as the key and a mapping rule (of a type such as ruby or jsonpath) as the value. For example, the following rule specifies that the name field in your device model comes from either the fqdn field from the source entry, or the name field, if fqdn is nil:

map name: ruby { |data| data['fqdn'] || data['name'] }

As an alternative to this syntax, you can mention the field name directly and omit the ruby keyword:

name { |data| data['fqdn'] || data['name'] }

These are examples of ruby rules, which take the form of a Ruby block. This block accepts up to two arguments: the first is the data from the source entry (the "raw" data) in the form of a Hash, with the default values from the device model added; the second is the current BjnInventory::Device object. For example, if your model contains the name and domain fields, the following rule specifies a fqdn field that uses them:

fqdn { |_data, device| device.name + '.' + device.domain }
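
To see how such blocks behave outside the DSL, the two rules above can be written as ordinary Ruby procs and called by hand. This is only a sketch: ExampleDevice is a hypothetical stand-in for BjnInventory::Device, and the way the gem actually invokes rule blocks is not shown here.

```ruby
# Stand-in for a device with name and domain fields (not the gem's Device class).
ExampleDevice = Struct.new(:name, :domain)

# The same blocks as in the rules above, held as procs. A proc ignores
# extra arguments, so a one-argument rule can still be called with two.
NAME_RULE = proc { |data| data['fqdn'] || data['name'] }
FQDN_RULE = proc { |_data, device| device.name + '.' + device.domain }

entry  = { 'fqdn' => nil, 'name' => 'web-01' }
device = ExampleDevice.new(nil, 'example.com')

device.name = NAME_RULE.call(entry, device)   # fqdn is nil, so name is used
puts FQDN_RULE.call(entry, device)            # prints "web-01.example.com"
```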

There are other rule types available:

A jsonpath rule uses a JSONPath expression in a string to set the device field. For example, the following rule sets the name field using the value of the Name tag (assuming AWS tagging, e.g. when using aws-ec2-source):

name jsonpath '$.tags[?(@["key"]=="Name")].value'
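
For readers unfamiliar with JSONPath filters, the expression above selects the tag objects whose key is "Name" and takes their value. A plain-Ruby equivalent is sketched below; name_tag_values is a hypothetical helper, the entry shape (a tags array of key/value objects) is inferred from the expression, and whether the gem uses the first match or the whole list of matches is not specified here.

```ruby
# Plain-Ruby equivalent of '$.tags[?(@["key"]=="Name")].value':
# select tag objects whose key is "Name" and collect their values.
def name_tag_values(entry)
  (entry['tags'] || [])
    .select { |tag| tag['key'] == 'Name' }
    .map { |tag| tag['value'] }
end

entry = {
  'tags' => [
    { 'key' => 'Name', 'value' => 'web-01' },
    { 'key' => 'Environment', 'value' => 'prod' }
  ]
}
p name_tag_values(entry)
```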

A synonym rule simply makes the field a synonym for another device model field. For example, the following rule makes the field management_ip exactly synonymous with the ip_address field:

management_ip synonym :ip_address

An always rule simply uses a constant value for that field. For example, the following rule sets the value of system_type to ec2_instance unconditionally:

system_type always 'ec2_instance'

Commands

Formatters

The inventory can be output in different formats. Two formatters are provided which display the inventory as model-conformant devices: either as a JSON array or an object keyed by an identifying field (inventory-model); or in the Ansible Dynamic Inventory format (ansible-from).

The refresh_inventory_data formatter formats the inventory into groups and devices in a file tree. A devices.json index is produced, with all devices, and each device also gets a file in devices/<key>.json. In addition the groups are listed in a groups.json index, and each has a list of devices in groups/<group>.json. This facilitates sharing the inventory over the web, as well (though bjn_inventory is not itself a network service).

Downloaders

The overall design of the software encourages you to download entries from inventory sources in a "close to raw" manner, and let the source merging and mapping rules transform the entries into devices. Because the core inventory generation offers no filtering, the downloader is also the point at which to filter out entries that should not be devices.

For convenience, AWS downloaders are provided for instances (aws-ec2-source), classic ELB (aws-elb-source) and RDS (aws-rds-source) resources. Run each with valid AWS credentials in your ~/.aws/credentials file to see what the output looks like. The EC2 and ELB downloaders each accept a --filter argument: in the EC2 downloader's case, this is passed to the AWS API to filter instances according to the attributes and tags it offers; in the ELB downloader's case, the syntax is the same, but it is enforced by an internal rules-matching library. The RDS downloader filters only on tags, using the --tag-filters option.

Service Maps

A common use case for device inventories is to be able to present service endpoints calculated dynamically from inventory. These could be used as-is, imported into a service discovery system, etc. To facilitate this, a special formatter is provided which maps devices and groups into "service endpoints", which are defined by a service map.

A service map is a JSON object stored in a file (and provided to the service-map command via the --map option), where the keys are service prefixes and the values are objects. It can be arbitrarily deeply nested, with the deepest object in the tree being a service specifier.

A service specifier consists of the special field hosts, the value of which is a list of groups. Each device in all the groups is added to the service under the preceding prefix. The other fields in the service specifier are arbitrary; each is passed through unchanged, so that the "leaf" of the service map consists of an object where the service specifier's key is joined with each of the specifier fields by a dot (.); and the hosts field has each device's endpoint listed with the join_with character (by default, a comma; or as a JSON array).
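
The flattening of a service specifier into a "leaf" can be sketched like this. It is an illustration under assumptions, not the service-map implementation: flatten_specifier is a hypothetical helper, groups are a plain Hash of group name to device Hashes, the output is the joined-string form only, and special fields such as hosts_override are not handled.

```ruby
# Flatten one service specifier into dotted keys under its service name.
# groups maps a group name to a list of devices (plain Hashes here).
def flatten_specifier(service, specifier, groups, hosts_field:, join_with: ',')
  # Collect the endpoint of every device in every listed group.
  endpoints = specifier.fetch('hosts').flat_map do |group|
    groups.fetch(group, []).map { |device| device[hosts_field] }
  end

  # Join the service key with each specifier field by a dot; other
  # fields pass through unchanged.
  specifier.each_with_object({}) do |(field, value), leaf|
    leaf["#{service}.#{field}"] = field == 'hosts' ? endpoints.join(join_with) : value
  end
end

groups = {
  'web' => [{ 'name' => 'web-01' }],
  'db'  => [{ 'name' => 'db-01' }]
}
spec = { 'hosts' => ['web', 'db'], 'key' => 'monitoring-key.rsa' }
p flatten_specifier('nagios', spec, groups, hosts_field: 'name')
```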

The groups are determined by a group specification: similar to the ansible-from command, a JSON object (in the file given by the --groups option) with a "group_by" key, the value of which is a list of fields to create groups by. Note that if you use the ansible-from command, you can use the same groups file; but Ansible's "groups" key, which specifies groups of groups, is ignored.
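
Grouping by a list of fields can be sketched as follows. build_groups is a hypothetical helper, not the gem's code; it assumes that a group is named after the field value itself and that array-valued fields (such as roles) contribute one group per element.

```ruby
# Build groups from a group specification: one group per distinct value of
# each field listed under "group_by". Array-valued fields contribute one
# group per element. Groups hold device names in inventory order.
def build_groups(devices, group_spec)
  groups = Hash.new { |hash, name| hash[name] = [] }
  group_spec.fetch('group_by', []).each do |field|
    devices.each do |device|
      Array(device[field]).each { |value| groups[value] << device['name'] }
    end
  end
  groups
end

devices = [
  { 'name' => 'web-01', 'roles' => ['web'], 'region' => 'us-west-2' },
  { 'name' => 'db-01',  'roles' => ['db'],  'region' => 'us-west-2' },
  { 'name' => 'web-02', 'roles' => ['web'], 'region' => 'eu-west-1' }
]
p build_groups(devices, { 'group_by' => ['region', 'roles'] })
```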

In the service prefix (the nested keys and objects which precede the "leaf" of the service map), device fields can be given with a dollar sign ($) prepended; these will be interpolated with the actual devices' field values. This also means that trees can be copied into the map multiple times, once for every unique value of the field in the relevant groups.

The above description sounds complicated, but the result is a fairly simple way to map service endpoints into an arbitrarily complex tree, and a couple of examples are probably best to illustrate the point. Let's say you have three devices in your inventory: two in the us-west-2 region and one in the eu-west-1 region. In your us-west-2 region, one instance has the web role and one has the db role. In the eu-west-1 region, you just have a webserver. Your whole inventory looks like this (for example, if you run inventory-model --manifest manifest.json):

[
  { "name": "web-01",
    "roles": ["web"],
    "region": "us-west-2" },
  { "name": "db-01",
    "roles": ["db"],
    "region": "us-west-2" },
  { "name": "web-02",
    "roles": ["web"],
    "region": "eu-west-1" }
]

You create the following service map:

{
  "services": {
    "$region": {
      "www": {
        "hosts": ["web"],
        "port": 80
      },
      "$cluster": {
        "qwerty": {
          "hosts": ["qwerty"],
          "hosts_override": "$region.$cluster.com"
        }
      }
    }
  },
  "monitor": {
    "$region": {
      "nagios": {
        "hosts": ["web", "db"],
        "key": "monitoring-key.rsa"
      }
    }
  }
}

When you run service-map --map map.json --manifest manifest.json --hosts-field name, you get the following output:

{
  "services": {
    "us-west-2": {
      "www.hosts": "web-01",
      "www.port": 80,
      "qwerty_cluster": {
        "qwerty.hosts": "us-west-2.qwerty_cluster.com"
      }
    },
    "eu-west-1": {
      "www.hosts": "web-02",
      "www.port": 80,
      "qwerty_cluster": {
        "qwerty.hosts": "eu-west-1.qwerty_cluster.com"
      }
    }
  },
  "monitor": {
    "us-west-2": {
      "nagios.hosts": "web-01,db-01",
      "nagios.key": "monitoring-key.rsa"
    },
    "eu-west-1": {
      "nagios.hosts": "web-02",
      "nagios.key": "monitoring-key.rsa"
    }
  }
}

If you need to have a key in your service map that starts with a dollar sign, use two dollar signs instead.
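
The handling of a single key in the service prefix can be sketched as below. resolve_key is a hypothetical helper, not the gem's code, and it assumes that a key written with two leading dollar signs comes out as a literal key with one leading dollar sign.

```ruby
# Resolve one service-map key against a device (a plain Hash here).
def resolve_key(key, device)
  if key.start_with?('$$')
    key[1..-1]                        # "$$literal" becomes the literal key "$literal"
  elsif key.start_with?('$')
    device.fetch(key[1..-1]).to_s     # "$region" becomes the device's region value
  else
    key                               # ordinary keys are used as-is
  end
end

device = { 'name' => 'web-01', 'region' => 'us-west-2' }
puts resolve_key('$region', device)
puts resolve_key('$$price', device)
puts resolve_key('www', device)
```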

TODO

  • The refresh_inventory_data formatter needs to be changed to be more parallel with the other formatters (--ansible should be --groups, and it should probably be named inventory-files or something).
  • The aws-rds-source should be refactored to take the --filters argument and use BjnInventory::Util::Filter::JsonAws like aws-elb-source.

Design Decisions

  • No calculated fields in the model.
  • No filtering: you can either filter the sources, or you can filter the inventory after producing it.
  • No validation of model fields as such (though merge is sensitive to the model).
  • For now, no fancy file finding (filenames in manifests must be absolute, or correct with respect to wherever you're running the software). Possibly this might change with the addition of the ability to pass a filename to BjnInventory::Inventory.new().

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports can be sent to Ops Tools.