Gem Version Build Status Benchmark Status Pull Requests

Overview

Serialbench is a comprehensive benchmarking suite that evaluates the performance of popular Ruby serialization libraries across multiple formats. It provides detailed performance comparisons and analysis to help developers make informed decisions when choosing serialization libraries for their Ruby applications.

Supported Formats: XML, JSON, YAML, TOML, and more

Key Metrics: Parsing speed, generation speed, memory usage, streaming capabilities, and feature completeness

Multi-Environment Support: Docker and ASDF-based multi-Ruby version benchmarking with automated result aggregation and HTML site generation

Supported serialization libraries

Format Name Version Description

XML

Ox

v2.14.23

C extension XML parser

XML

LibXML

v4.1.2

Ruby bindings for libxml2

XML

Nokogiri

v1.18.8

XML/HTML parser with XPath and CSS selectors

XML

Oga

v3.4

Pure Ruby XML parser with XPath support

XML

REXML

v3.4.1

Ruby’s standard library XML parser

JSON

Oj

v3.16.11

JSON parser with multiple parsing modes

JSON

YAJL

v1.4.3

JSON library with streaming capabilities

JSON

JSON

v2.12.2

Ruby’s standard library JSON parser

YAML

Psych

v5.1.2

Ruby’s standard library YAML parser

YAML

Syck

v1.5.1.1

Legacy YAML parser

TOML

Tomlib

v0.7.3

TOML parser implemented in C

TOML

TOML-RB

v2.2.0

Pure Ruby TOML parser

TOML

tomlrb

v2.0.3

A Racc based TOML Ruby parser (Only supports parsing, no support for dumping/writing.)

Data formats and schema

Serialbench generates structured YAML output for benchmark results, with different formats for single-environment and multi-environment runs.

The data formats include:

  • Single benchmark results: Individual benchmark run output

  • Result set data structure: Multi-platform benchmark aggregation

  • JSON schema specification: Complete schema validation rules

  • Configuration file formats: Docker and ASDF configuration examples

Prerequisites

System requirements

  • Ruby: 3.0 or later (3.3+ recommended for best performance)

  • Operating system: Linux, macOS, or Windows

  • Architecture: x86_64 or ARM64

Library dependencies

System dependencies (required for some native extensions):

# macOS with Homebrew
$ brew install libxml2 libxslt

# Ubuntu/Debian
$ sudo apt-get install libxml2-dev libxslt1-dev build-essential

# CentOS/RHEL/Fedora
$ sudo yum install libxml2-devel libxslt-devel gcc gcc-c++

Serialbench fetches Ruby build definitions from the GitHub API. While this works without authentication, GitHub imposes strict rate limits on unauthenticated requests (60 requests per hour). For better reliability, especially in CI/CD environments or when running multiple benchmarks, set the GITHUB_TOKEN environment variable.

Rate limits:

  • Without token: 60 requests per hour

  • With token: 1,000 requests per hour

Setting the token:

# For local usage
export GITHUB_TOKEN=ghp_your_token_here

# For GitHub Actions (automatic)
# GitHub Actions automatically provides ${{ secrets.GITHUB_TOKEN }}
# No manual configuration needed in workflows

Creating a token:

  1. Go to https://github.com/settings/tokens

  2. Click "Generate new token (classic)"

  3. Select scope: public_repo (for accessing public repositories)

  4. Copy the generated token

  5. Store it securely in your environment

Note
The token is optional. Serialbench will work without it but may hit rate limits when updating Ruby build definitions frequently.

Installation

Add this line to your application’s Gemfile:

gem 'serialbench'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install serialbench

Command line interface

Serialbench provides a comprehensive Thor-based CLI with four main subcommands for managing environments, benchmarks, result sets, and Ruby builds.

Main Commands Overview

$ serialbench
Serialbench - Benchmarking Framework for Ruby Serialization Libraries

USAGE:
  serialbench COMMAND [SUBCOMMAND] [OPTIONS]

COMMANDS:
  environment   Manage benchmark environments (Docker, ASDF, Local)
  benchmark     Manage individual benchmark runs
  resultset     Manage benchmark resultsets (collections of runs)
  ruby-build    Manage Ruby-Build definitions for validation
  version       Show version information
  help          Show this help message

EXAMPLES:
  # Create a Docker environment
  serialbench environment new docker-test docker

  # Run multi-environment benchmarks
  serialbench environment multi-execute asdf --config=serialbench-asdf.yml
  serialbench environment multi-execute docker --config=serialbench-docker.yml

  # Create and execute a benchmark
  serialbench benchmark create my-benchmark
  serialbench benchmark execute my-benchmark.yml

  # Create a result set for comparison
  serialbench resultset create comparison-set
  serialbench resultset add-result comparison-set results/my-benchmark

  # Generate static sites
  serialbench benchmark build-site results/my-benchmark
  serialbench resultset build-site resultsets/comparison-set

Environment management

The environment subcommand manages environment configurations and executes benchmarks across different Ruby environments.

$ serialbench environment help
Commands:
  serialbench environment execute ENVIRONMENT_CONFIG BENCHMARK_CONFIG RESULT_PATH  # Execute benchmark in environment
  serialbench environment help [COMMAND]                                           # Describe subcommands or one specific subcommand
  serialbench environment new NAME KIND RUBY_BUILD_TAG                             # Create a new environment configuration
  serialbench environment prepare ENVIRONMENT_CONFIG                               # Prepare environment for benchmarking

Benchmark management

The benchmark subcommand handles individual benchmark runs and site generation.

$ serialbench benchmark help
Commands:
  serialbench benchmark _docker_execute ENVIRONMENT_CONFIG_PATH BENCHMARK_CONFIG_PATH  # (Private) Execute a benchmark run
  serialbench benchmark build-site RUN_PATH [OUTPUT_DIR]                               # Generate HTML site for a run
  serialbench benchmark create [NAME]                                                  # Generate a run configuration file
  serialbench benchmark execute ENVIRONMENT_CONFIG_PATH BENCHMARK_CONFIG_PATH          # Execute a benchmark run
  serialbench benchmark help [COMMAND]                                                 # Describe subcommands or one specific subcommand
  serialbench benchmark list                                                           # List all available runs

The _docker_execute command is a private command used internally by the execute command to run benchmarks in Docker environments.

Result set management

The resultset subcommand manages collections of benchmark runs for comparison analysis.

$ serialbench resultset help
Commands:
  serialbench resultset add-result RESULT_PATH RESULTSET_PATH     # Add a run to a resultset
  serialbench resultset build-site RESULTSET_PATH [OUTPUT_DIR]    # Generate HTML site for a resultset
  serialbench resultset create NAME PATH                          # Create a new resultset
  serialbench resultset help [COMMAND]                            # Describe subcommands or one specific subcommand
  serialbench resultset list                                      # List all available resultsets
  serialbench resultset remove-result RESULTSET_PATH RESULT_PATH  # Remove a run from a resultset

ruby-build management

The ruby-build subcommand manages Ruby build definitions and version information.

Serialbench uses ruby-build definitions of Ruby interpreter types and versions for identification.

$ serialbench ruby-build help
Commands:
  serialbench ruby_build cache-info      # Show information about the Ruby-Build definitions cache
  serialbench ruby_build help [COMMAND]  # Describe subcommands or one specific subcommand
  serialbench ruby_build list [FILTER]   # List available Ruby-Build definitions
  serialbench ruby_build show TAG        # Show details for a specific Ruby-Build definition
  serialbench ruby_build suggest         # Suggest Ruby-Build tag for current Ruby version
  serialbench ruby_build update          # Update Ruby-Build definitions from GitHub
  serialbench ruby_build validate TAG    # Validate a Ruby-Build tag

Workflow examples

Docker-based testing

Note
This works.
# 1. Prepare Docker environment
$ bundle exec serialbench environment prepare config/environments/docker-ruby-3.1.yml

# 2. Run benchmark
$ bundle exec serialbench environment execute config/environments/docker-ruby-3.1.yml config/benchmarks/short.yml results/runs/docker-ruby-3.1-results

# 3. Create a resultset
$ bundle exec serialbench resultset create docker-comparison results/sets/docker-comparison

# 3a. (Optional) Build the site from the result if you want to visualize results
$ bundle exec serialbench benchmark build-site results/runs/docker-ruby-3.1-results/ --output_dir=_site_result

# 4. Add the result to the resultset
$ bundle exec serialbench resultset add-result results/sets/docker-comparison/ results/runs/docker-ruby-3.1-results/

# 5. Build the site from the resultset
$ bundle exec serialbench resultset build-site results/sets/docker-comparison/

# 6. Open the generated site
$ open _site/index.html

ASDF-based testing

Warning
THIS IS NOT YET WORKING.
# 1. Validate configuration
$ bundle exec serialbench benchmark validate serialbench-asdf.yml

# 2. Prepare Ruby environments
$ bundle exec serialbench benchmark prepare asdf --config=serialbench-asdf.yml

# 3. Run benchmarks across all Ruby versions
$ bundle exec serialbench benchmark execute asdf --config=serialbench-asdf.yml

# 4. Results are automatically merged and dashboard generated
$ open asdf-results/_site/index.html

Configuration Files

Environment configuration

Environment configuration files define how benchmarks are executed in different runtime environments.

Environment configuration for Docker (config/environments/docker-ruby-3.4.yml)
---
name: docker-ruby-3.4
kind: docker
created_at: '2025-06-13T15:18:43+08:00'
ruby_build_tag: "3.4.1"
description: Docker environment for Ruby 3.4 benchmarks
docker:
  image: 'ruby:3.4-slim'
  dockerfile: '../../docker/Dockerfile.ubuntu'
Environment configuration for ASDF (config/environments/asdf-ruby-3.3.yml)
---
name: ruby-332-asdf
kind: asdf
created_at: '2025-06-12T22:53:24+08:00'
ruby_build_tag: 3.3.2
description: ASDF environment
asdf:
  auto_install: true

Benchmark configuration

Benchmark configuration files control what tests to run and how to run them.

Short configuration (CI-friendly) (config/benchmarks/short.yml)
name: short-benchmark

data_sizes:
- small

formats:
- xml
- json
- yaml
- toml

iterations:
  small: 5
  medium: 2
  large: 1

operations:
- parse
- generate
- streaming

warmup: 2
Full configuration (Comprehensive) (config/benchmarks/full.yml)
name: full-benchmark

data_sizes:
- small
- medium
- large

formats:
- xml
- json
- yaml
- toml

iterations:
  small: 20
  medium: 5
  large: 2

operations:
- parse
- generate
- streaming
- memory

warmup: 3

Results structure

Individual run results

Results are stored in a structured directory format, with each run containing raw benchmark data and execution logs.

The directory is located at results/runs/{name}/, where {name} is the name of the environment used for the benchmark.

results/runs/docker-ruby-33-results/
├── results.yaml                    # Raw benchmark data
└── benchmark.log                   # Execution log

ResultSet structure

ResultSets aggregate multiple benchmark runs for comparison. They are stored in a structured directory format at results/sets/{name}/, where {name} is the name of the result set.

results/sets/ruby-version-comparison/
└── resultset.yml                  # Result set configuration

Benchmark categories

Parsing performance

Measures the time required to parse serialized data into Ruby objects.

  • Small files: ~1KB configuration-style documents

  • Medium files: ~1MB API responses with 1,000 records

  • Large files: ~10MB data exports with 10,000 records

Generation performance

Tests how quickly libraries can convert Ruby objects into serialized strings.

Streaming performance

Evaluates streaming event-based parsing performance for libraries that support it, which processes data sequentially and is memory-efficient for large files.

Memory usage analysis

Profiles memory allocation and retention during serialization operations using the memory_profiler gem.

Interactive Dashboard Features

The generated HTML sites provide comprehensive interactive dashboards with:

Navigation and Filtering

  • Format tabs: Dedicated views for XML, JSON, YAML, and TOML

  • Operation sections: Parsing, generation, streaming, and memory usage

  • Dynamic filtering: Platform, Ruby version, and environment selection

  • Real-time updates: Charts update instantly based on filter selections

Visualization Capabilities

  • Chart.js integration: Interactive performance charts with hover details

  • Multi-scale handling: Automatic Y-axis scaling for different performance ranges

  • Color-coded data: Consistent color schemes across serializers and environments

  • Responsive design: Optimized for desktop and mobile viewing

User Experience

  • Theme toggle: Light and dark mode with persistent preferences

  • Keyboard navigation: Full accessibility support

  • Fast loading: Optimized JavaScript for quick dashboard initialization

  • Export capabilities: JSON data export for further analysis

Development

Running Tests

$ bundle exec rake
$ bundle exec rspec

Adding a new serializers

To add support for additional serialization libraries:

  1. Create a new serializer class in lib/serialbench/serializers/{format}/

  2. Inherit from the appropriate base class (BaseXmlSerializer, BaseJsonSerializer, etc.)

  3. Implement the required methods: parse, generate, name, version

  4. Add the serializer to the registry in lib/serialbench/serializers.rb

  5. Update documentation and tests

Contributing

  1. Fork the repository

  2. Create your feature branch (git checkout -b feature/my-new-feature)

  3. Commit your changes (git commit -am 'Add some feature')

  4. Push to the branch (git push origin feature/my-new-feature)

  5. Create a new Pull Request

Known issues

Syck YAML serializer segmentation faults

The Syck YAML serializer at version 1.5+ is known to cause segmentation faults on Ruby 3.1 and later versions. Serialbench automatically detects this problematic configuration and:

  • Displays a warning message when Syck is detected on Ruby 3.1+

  • Skips Syck benchmarks to prevent crashes

  • Continues with other YAML serializers (Psych)

Syck overrides YAML constant

On occasion after Syck is loaded, the constant YAML may be redefined to Syck, which can cause issues in other parts of the codebase. This can cause YAML output to fail when using libraries that expect YAML to have the Psych API.

In benchmark_cli.rb there is therefore such code to ensure that YAML is defined as Psych when writing to file is needed:

# Restore YAML to use Psych for output, otherwise lutaml-model's to_yaml
# will have no output
Object.const_set(:YAML, Psych)

Copyright Ribose.

This gem is developed, maintained and funded by Ribose

The gem is available as open source under the terms of the 2-Clause BSD License.