PROIEL treebank utility library




This is a utility library for reading and manipulating treebanks that use the PROIEL annotation scheme and the PROIEL XML-based interchange format.


This library requires Ruby 2.1 or newer. Install it with:

gem install proiel

Getting started

The recommended way to use this library in your application is with bundler. Create a Gemfile with the following content:

source ''
gem 'proiel', '~> 1.0'

and then execute

bundle install


To download a sample treebank, initialize a new git repository and add the PROIEL treebank as a submodule:

git init
mkdir vendor
git submodule add --depth 1 vendor/proiel-treebank

Here is a skeleton programme to get you started. Save this as myproject.rb:

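#!/usr/bin/env ruby
require 'proiel'

# Load every XML file from the treebank submodule into a single treebank object.
tb = PROIEL::Treebank.new
Dir[File.join('vendor', 'proiel-treebank', '*.xml')].each do |filename|
  puts "Reading #{filename}..."
  tb.load_from_xml(filename)
end

# Walk the hierarchy: sources contain divs, divs contain sentences,
# and sentences contain tokens.
tb.sources.each do |source|
  source.divs.each do |div|
    div.sentences.each do |sentence|
      sentence.tokens.each do |token|
        # Do something
      end
    end
  end
end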

You can now run this as:

bundle exec ruby myproject.rb
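
As a rough illustration of what the "Do something" step might become, the sketch below flattens the source, div, sentence, token hierarchy into a single list. It relies only on the sources, divs, sentences and tokens accessors used in the skeleton above. The Fake* structs and the all_tokens helper are hypothetical stand-ins so the example runs without the gem or a treebank; they are not part of proiel.

```ruby
# Hypothetical stand-ins for illustration only; NOT part of the proiel gem.
# They mimic the nesting used in the skeleton programme.
FakeSentence = Struct.new(:tokens)
FakeDiv      = Struct.new(:sentences)
FakeSource   = Struct.new(:divs)
FakeTreebank = Struct.new(:sources)

# Returns every token in the treebank as one flat array.
# Works with any object that responds to sources/divs/sentences/tokens.
def all_tokens(treebank)
  treebank.sources.flat_map do |source|
    source.divs.flat_map do |div|
      div.sentences.flat_map { |sentence| sentence.tokens.to_a }
    end
  end
end

# A tiny hand-built treebank: one source, one div, two sentences.
DEMO_TREEBANK = FakeTreebank.new([
  FakeSource.new([
    FakeDiv.new([
      FakeSentence.new(%w[arma virumque cano]),
      FakeSentence.new(%w[Gallia est])
    ])
  ])
])

puts "Token count: #{all_tokens(DEMO_TREEBANK).size}"
```

Collecting everything into an array keeps the example short but holds all tokens in memory at once; for a large treebank the nested each loops from the skeleton are likely the safer choice.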

See the wiki for more information.


proiel aims to adhere to Semantic Versioning 2.0.0. This means that a patch version or minor version should not break backward compatibility of a public API, and that breaking changes should only be introduced with new major versions. When specifying a dependency on this gem it is best practice to use a pessimistic version constraint with two digits of precision:

spec.add_dependency 'proiel', '~> 1.0'
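
To see which releases such a constraint accepts, the following sketch checks a few version strings against '~> 1.0' using the Gem::Requirement and Gem::Version classes that ship with RubyGems. The allowed? helper and the sample version numbers are made up for illustration and do not correspond to actual proiel releases.

```ruby
require 'rubygems'

# Hypothetical helper: true if `version` satisfies the constraint string.
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# Sample version numbers chosen for illustration only.
%w[0.9.9 1.0.0 1.0.5 1.4.2 2.0.0].each do |version|
  verdict = allowed?('~> 1.0', version) ? 'allowed' : 'rejected'
  puts "#{version}: #{verdict}"
end
```

With two digits, '~> 1.0' accepts any 1.x release but rejects 2.0.0. A three-digit constraint such as '~> 1.0.0' would be stricter and accept only 1.0.x patch releases.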


Check out the git repository from GitHub and run bin/setup to install all development dependencies. Then run rake to run the tests.

You can also run bin/console for an interactive prompt to experiment with.

To install a development version of this gem, run bundle exec rake install.

To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the gem to


Documentation can be generated using YARD:



Bug reports and pull requests are welcome on GitHub at