DataBuilder

The goal of DataBuilder is to provide an expressive means of constructing a data set based on information stored in YAML files.

Installation

To get the latest stable release, add this line to your application's Gemfile:

gem 'data_builder'

To get the latest code:

gem 'data_builder', git: 'https://github.com/jeffnyman/data_builder'

After doing one of the above, execute the following command:

$ bundle

You can also install DataBuilder just as you would any other gem:

$ gem install data_builder

Usage

DataBuilder uses my DataReader gem to provide base-level functionality. Unlike DataReader, DataBuilder assumes some defaults.

Loading with Default Path

Consider the following file and directory setup:

project_dir\
  config\
    config.yml

  data\
    stars.yml

  env\
    environments.yml

  example-data-builder.rb

All the code shown below would go in the example-data-builder.rb file.

With the above directory structure in place, you could do something as simple as this:

require "data_builder"

data = DataBuilder.load 'stars.yml'

puts data

Here I'm relying on the fact that DataBuilder applies a default directory named data. I then use the load method, which comes from DataReader, to load a file from that directory.

Loading with Specified Path

You can set a specific data path with DataBuilder like this:

require "data_builder"

DataBuilder.data_path = 'env'

Here you tell DataBuilder where to find the data files by setting data_path. As you've seen, if you don't specify a directory, DataBuilder defaults to a directory named data.

After setting the directory you must load a file. This can be accomplished by calling the load method.

data = DataBuilder.load 'environments.yml'

puts data

Here the data variable would contain the contents of the environments.yml file.
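To make the path-plus-filename idea concrete without requiring the gem, here is a minimal sketch using only Ruby's standard library. The load_from helper and the contents written to environments.yml are hypothetical, and the sketch assumes the loaded data is simply the parsed YAML; it is not DataBuilder's actual implementation.

```ruby
require "yaml"
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for a path-aware loader; not DataBuilder's real code.
def load_from(data_path, filename)
  YAML.safe_load(File.read(File.join(data_path, filename)))
end

# Build a throwaway "env" directory with made-up file contents.
root = Dir.mktmpdir
env_dir = File.join(root, "env")
FileUtils.mkdir_p(env_dir)
File.write(File.join(env_dir, "environments.yml"), "test:\n  url: http://localhost:3000\n")

# Join the data path and the filename, then parse the file.
data = load_from(env_dir, "environments.yml")
puts data["test"]["url"]

FileUtils.remove_entry(root)
```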

However, everything said so far is really just using DataBuilder as an overlay for DataReader.

Data About

Where DataBuilder steps in is when you want to use the data. DataBuilder provides a data_about method that will return the data for a specific key from any data files that have been loaded.

The most common way to use this is to include or extend the DataBuilder module. Let's say that you have a data directory and in that directory you have a file called default.yml. The YAML file has the following contents:

alpha centauri:
  warpFactor: 1
  velocity: 2
  distance: 4.3

epsilon eridani:
  warpFactor: 1
  velocity: 2
  distance: 10.5

Now let's use DataBuilder to get the information from it. You can extend or include DataBuilder as part of another class.

Extending DataBuilder

class Testing
  extend DataBuilder
end

data = Testing.data_about('alpha centauri')

Including DataBuilder

class Testing
  include DataBuilder
end

testing = Testing.new
data = testing.data_about('alpha centauri')

The Data Key

Whether extending or including, I'm using a variable to store the results of the call. Those results will be the data pulled from the default.yml file. Note, however, that only the data under the "alpha centauri" key will be pulled, because that is the key you specified in the call to data_about.
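As a rough illustration of what "only the data under that key" means, the following sketch parses the same default.yml contents with Ruby's standard YAML library and picks out a single key. It assumes the result is a plain Hash of the values under that key; DataBuilder itself may do more than this.

```ruby
require "yaml"

# Same contents as the default.yml example above.
DEFAULT_YML = <<~YAML
  alpha centauri:
    warpFactor: 1
    velocity: 2
    distance: 4.3

  epsilon eridani:
    warpFactor: 1
    velocity: 2
    distance: 10.5
YAML

all_data = YAML.safe_load(DEFAULT_YML)

# Only the entry under the requested key is kept.
data = all_data.fetch("alpha centauri")

puts data["warpFactor"]
puts data["distance"]
```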

Those examples pass data_about a string because the key "alpha centauri" contains a space. If the key had no space (say, "alpha_centauri"), you could use a symbol instead, like this:

data = testing.data_about(:alpha_centauri)
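One plausible way for a lookup to accept either a string or a symbol is to normalize the key with to_s before using it. The lookup helper below is hypothetical and is only a sketch of that idea, not a description of DataBuilder's internals.

```ruby
require "yaml"

source = YAML.safe_load("alpha_centauri:\n  warpFactor: 1\n")

# Hypothetical helper: normalize a String or Symbol key before the lookup.
def lookup(source, key)
  source.fetch(key.to_s)
end

from_symbol = lookup(source, :alpha_centauri)
from_string = lookup(source, "alpha_centauri")

puts from_symbol == from_string
```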

Default Files

You might wonder how DataBuilder knew to look for default.yml since I didn't use a load method in these examples. If you do not specify a filename, DataBuilder will attempt to use a file named default.yml, either in the data path you have set or in the default path of data.

Another option is that you can set an environment variable called DATA_BUILDER_SOURCE. When this variable exists and is set, the value it is set to will be used instead of the default.yml file. Keep in mind that the "data source" here refers to the file, not the keys within a file.
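The fallback described above can be sketched as a small helper that prefers the environment variable and otherwise uses default.yml. The source_file helper is hypothetical, and treating an empty value as "not set" is an assumption of this sketch.

```ruby
# Hypothetical helper: choose the data file name.
# Accepts any Hash-like object so it can be tried without touching the real ENV.
def source_file(env = ENV)
  value = env["DATA_BUILDER_SOURCE"]
  # Assumption: an empty value is treated the same as an unset one.
  value.nil? || value.empty? ? "default.yml" : value
end

puts source_file({})
puts source_file({ "DATA_BUILDER_SOURCE" => "stars.yml" })
```

The first call prints default.yml and the second prints stars.yml.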

Namespaced Data

To organize your data into a rough equivalent of namespaces, and to load that data accordingly, you can do something like this:

class Testing
  include DataBuilder
end

testing = Testing.new

data = testing.data_about('stars/epsilon eridani')

When DataBuilder sees this kind of construct, it will take the first part (before the /) as a filename and the second part as the key to look up in that file. So the above command would look for a file called stars.yml in the data path provided (in this case, the default of data) and then grab the data from the key entry labeled "epsilon eridani".
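The file/key split described above can be sketched in a few lines. The split_namespaced helper is hypothetical; it only illustrates how a 'file/key' string maps to a filename and a lookup key.

```ruby
# Hypothetical helper: split "file/key" into a YAML filename and a record key.
def split_namespaced(key)
  # Split on the first "/" only, so the key part is kept whole.
  file, record = key.split("/", 2)
  ["#{file}.yml", record]
end

file, record = split_namespaced("stars/epsilon eridani")
puts file
puts record
```

This prints stars.yml followed by epsilon eridani.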

Aliases

Given the examples, you can see that data_about was chosen as the method name to make what's occurring a bit more expressive. You can use the following aliases for data_about:

data_from
data_for
using_data_for
using_data_from

The reason for these aliases is, again, to make the logic expressive about its intent. This is particularly nice if you fit DataBuilder in with the context of a fluent API.
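To show how such aliases can read inside a fluent-style call, here is a self-contained sketch. SketchBuilder, Voyage, and plot_course are all hypothetical names, and the lookup body is a stub rather than DataBuilder's real logic.

```ruby
# Hypothetical module standing in for DataBuilder; the lookup is a stub.
module SketchBuilder
  DATA = { "alpha centauri" => { "warpFactor" => 1 } }.freeze

  def data_about(key)
    DATA.fetch(key.to_s)
  end

  # The same method under more expressive names.
  alias data_from data_about
  alias data_for data_about
  alias using_data_for data_about
  alias using_data_from data_about
end

class Voyage
  include SketchBuilder

  # Hypothetical consumer of the looked-up data.
  def plot_course(data)
    "warp #{data['warpFactor']}"
  end
end

voyage = Voyage.new
course = voyage.plot_course(voyage.using_data_for("alpha centauri"))
puts course
```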

Scenarios in Cucumber

If you are using Cucumber, there is another way to specify the data file to load. You can apply a tag to a given scenario. The tag should take the form of @databuilder_NAME where NAME is replaced with the name of the data file you want to be loaded for the scenario.

As an example, if you add the tag @databuilder_stars then the file stars.yml will be loaded. If you want to use the tags you have to add the following code in a hook:

Before do |scenario|
  DataBuilder.data_files_for(scenario)
end

You can use data_for_scenario in place of data_files_for if you feel that reads better.
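The tag naming convention can be illustrated without Cucumber. The data_file_for_tags helper below is hypothetical; it works on a plain array of tag names and only sketches how a @databuilder_NAME tag could map to NAME.yml.

```ruby
TAG_PREFIX = "@databuilder_"

# Hypothetical helper: map tag names to a data file name,
# or return nil when no matching tag is present.
def data_file_for_tags(tag_names)
  tag = tag_names.find { |name| name.start_with?(TAG_PREFIX) }
  tag && "#{tag.delete_prefix(TAG_PREFIX)}.yml"
end

tagged = data_file_for_tags(["@smoke", "@databuilder_stars"])
untagged = data_file_for_tags(["@smoke"])

puts tagged
p untagged
```

This prints stars.yml for the tagged scenario and nil for the one without a matching tag.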

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec:all to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

The default rake command will run all tests as well as a RuboCop analysis.

To install this gem onto your local machine, run bundle exec rake install.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/jeffnyman/data_builder. The testing ecosystem of Ruby is very large and this project is intended to be a welcoming arena for collaboration on yet another testing tool. As such, contributors are very much welcome but are expected to adhere to the Contributor Covenant code of conduct.

To contribute to DataBuilder:

Fork the project.
Create your feature branch. (git checkout -b my-new-feature)
Commit your changes. (git commit -am 'new feature')
Push the branch. (git push origin my-new-feature)
Create a new pull request.

Author

Jeff Nyman

Credits

This code is loosely based upon the DataMagic gem. I created a new version largely to avoid the name "magic", which I don't think any tool should be promoting. I'm also cleaning up the code and documentation.

License

DataBuilder is distributed under the MIT license. See the LICENSE file for details.