Shikibu

Shikibu (紫式部) - Named after Lady Murasaki Shikibu, author of The Tale of Genji

Lightweight durable execution framework for Ruby - no separate server required


Overview

Shikibu is a lightweight durable execution framework for Ruby that runs as a library in your application - no separate workflow server required. It provides automatic crash recovery through deterministic replay, allowing long-running workflows to survive process restarts and failures without losing progress.

Perfect for: Order processing, distributed transactions (Saga pattern), and any workflow that must survive crashes.

Shikibu is a Ruby port of Edda (Python), providing the same core concepts and patterns in idiomatic Ruby.

Key Features

  • Lightweight Library: Runs in your application process - no separate server infrastructure
  • 🔄 Durable Execution: Deterministic replay with workflow history for automatic crash recovery
  • 🎯 Workflow & Activity: Clear separation between orchestration logic and business logic
  • 🔁 Saga Pattern: Automatic compensation on failure with on_failure blocks
  • 🌐 Multi-worker Execution: Run workflows safely across multiple servers or containers
  • 📦 Transactional Outbox: Reliable event publishing with guaranteed delivery
  • ☁️ CloudEvents Support: Native support for CloudEvents protocol via Rack middleware
  • ⏱️ Event & Timer Waiting: Free up worker resources while waiting for events or timers
  • 📬 Channel-based Messaging: Actor-model style communication with competing and broadcast modes
  • 📡 PostgreSQL LISTEN/NOTIFY: Real-time event delivery without polling
  • 🌍 Rack Integration: Works with Rails, Sinatra, Hanami, and any Rack-compatible framework
  • 🔧 Sidekiq/ActiveJob: Background worker integration for Rails applications

Architecture

Shikibu runs as a lightweight library in your applications, with all workflow state stored in a shared database:

┌─────────────────────────────────────────────────────────────────────┐
│                       Your Ruby Applications                         │
├──────────────────────┬──────────────────────┬──────────────────────┤
│   order-service-1    │   order-service-2    │   order-service-3    │
│   ┌──────────────┐   │   ┌──────────────┐   │   ┌──────────────┐   │
│   │   Shikibu    │   │   │   Shikibu    │   │   │   Shikibu    │   │
│   │   Workflow   │   │   │   Workflow   │   │   │   Workflow   │   │
│   └──────────────┘   │   └──────────────┘   │   └──────────────┘   │
└──────────┬───────────┴──────────┬───────────┴──────────┬───────────┘
           │                      │                      │
           └──────────────────────┼──────────────────────┘
                                  │
                         ┌────────▼────────┐
                         │ Shared Database │
                         │ (SQLite/PG/MySQL)│
                         └─────────────────┘

Key Points:

  • Multiple workers can run simultaneously across different pods/servers
  • Each workflow instance runs on only one worker at a time (automatic coordination)
  • wait_event and sleep free up worker resources while waiting
  • Automatic crash recovery with stale lock cleanup and workflow auto-resume

Quick Start

require 'shikibu'

# Register compensation functions (global registry)
Shikibu.register_compensation(:refund_payment) do |ctx, order_id:|
  PaymentService.refund(order_id)
end

class OrderSaga < Shikibu::Workflow
  workflow_name 'order_saga'

  def execute(order_id:, amount:)
    # Activity results are recorded in history
    result = activity :process_payment do
      PaymentService.charge(order_id, amount)
    end

    # Compensation on failure (Saga pattern)
    on_failure :refund_payment, order_id: order_id

    { status: 'completed', order_id: order_id, payment: result }
  end
end

# Configure Shikibu
Shikibu.configure do |config|
  config.database_url = 'sqlite://workflow.db'
  config.service_name = 'order-service'
end

# Start workflow
result = Shikibu.run(OrderSaga, order_id: 'ORD-123', amount: 99.99)

What happens on crash?

  1. Activities already executed return cached results from history
  2. Workflow resumes from the last checkpoint
  3. No manual intervention required

Installation

Add to your Gemfile:

gem 'shikibu'

Then run:

bundle install

Or install directly:

gem install shikibu

Database Support

Database   | Use Case                             | Multi-Pod Support | Production Ready
-----------+--------------------------------------+-------------------+-----------------
SQLite     | Development, testing, single-process | ⚠️ Limited        | ⚠️ Limited
PostgreSQL | Production, multi-process/multi-pod  | ✅ Yes            | ✅ Recommended
MySQL      | Production, multi-process/multi-pod  | ✅ Yes            | ✅ Yes (8.0+)

Important: For multi-process or multi-pod deployments (K8s, Docker Compose with multiple replicas), use PostgreSQL or MySQL.

Database Drivers

# Gemfile

# SQLite (included by default)
gem 'sqlite3', '~> 2.0'

# PostgreSQL
gem 'pg', '~> 1.5'

# MySQL
gem 'mysql2', '~> 0.5'

Database Migrations

Shikibu automatically applies database migrations on startup:

# Default: auto-migration enabled
storage = Shikibu::Storage::SequelStorage.new(database_url, auto_migrate: true)

# Or via App configuration
app = Shikibu::App.new(database_url: 'postgres://...', auto_migrate: true)

Features:

  • Automatic: Migrations run during initialization
  • dbmate-compatible: Uses the same schema_migrations table as dbmate CLI
  • Multi-worker safe: Safe for concurrent startup across multiple pods/processes

The database schema is managed in the durax-io/schema repository, shared between Shikibu (Ruby), Edda (Python), and Romancy (Go).

Core Concepts

Workflows and Activities

Activity: A unit of work that performs business logic. Activity results are recorded in history.

Workflow: Orchestration logic that coordinates activities. Workflows can be replayed from history after crashes.

class UserOnboarding < Shikibu::Workflow
  workflow_name 'user_onboarding'

  def execute(email:)
    # Activity - results are recorded
    user = activity :create_user do
      UserService.create(email: email)
    end

    # Another activity
    activity :send_welcome_email do
      EmailService.send_welcome(user[:id])
    end

    { status: 'completed', user_id: user[:id] }
  end
end

Activity IDs: Activities are automatically identified with IDs like "create_user:1" for deterministic replay.
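
One way such IDs could be derived is a per-name call counter, so that the same workflow code yields the same IDs on every replay. The sketch below is only an illustration of that idea: ActivityIdGenerator is a hypothetical helper, not part of Shikibu's API, and the real numbering scheme may differ.

```ruby
# Hypothetical helper, not part of Shikibu's API: derives IDs such as
# "create_user:1" by counting how often each activity name has been seen.
class ActivityIdGenerator
  def initialize
    @counters = Hash.new(0)
  end

  # Returns the next ID for the given activity name.
  def next_id(name)
    @counters[name] += 1
    "#{name}:#{@counters[name]}"
  end
end

gen = ActivityIdGenerator.new
first_id  = gen.next_id(:create_user)        # "create_user:1"
second_id = gen.next_id(:send_welcome_email) # "send_welcome_email:1"
third_id  = gen.next_id(:create_user)        # "create_user:2"
```

Because the counter depends only on the order of calls, a workflow that issues its activities in the same order produces the same IDs after a restart.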

Durable Execution

Shikibu ensures workflow progress is never lost through deterministic replay:

  1. Activity results are recorded in a history table
  2. On crash recovery, workflows resume from the last checkpoint
  3. Already-executed activities return cached results from history
  4. New activities continue from where the workflow left off

Key guarantees:

  • Activities execute exactly once (results cached in history)
  • Workflows can survive arbitrary crashes
  • No manual checkpoint management required
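
The replay steps above can be modelled in a few lines of plain Ruby. This is a toy sketch under stated assumptions, not Shikibu's implementation: ToyWorkflowRun and its in-memory history hash are hypothetical, whereas Shikibu records history in the database.

```ruby
# Toy model of deterministic replay. ToyWorkflowRun and the in-memory
# history hash are illustrative only; Shikibu stores history in a database.
class ToyWorkflowRun
  attr_reader :history

  def initialize(history = {})
    @history = history
    @counters = Hash.new(0)
  end

  # Runs the block only if no result is recorded for this activity ID.
  def activity(name)
    @counters[name] += 1
    id = "#{name}:#{@counters[name]}"
    return @history[id] if @history.key?(id)

    @history[id] = yield
  end
end

calls = []
workflow = lambda do |run, crash_after_charge:|
  charge = run.activity(:charge) { calls << :charge; 'charged' }
  raise 'simulated crash' if crash_after_charge

  run.activity(:receipt) { calls << :receipt; "receipt for #{charge}" }
end

history = {}
begin
  workflow.call(ToyWorkflowRun.new(history), crash_after_charge: true)
rescue RuntimeError
  # The process "crashed"; the recorded history survives.
end

# Replay with the same history: :charge is served from history, not re-run.
result = workflow.call(ToyWorkflowRun.new(history), crash_after_charge: false)
```

After the replay, `calls` contains each activity once, because the second run reads the `:charge` result from history instead of executing the block again.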

Compensation (Saga Pattern)

When a workflow fails, Shikibu automatically executes compensation functions for already-executed activities in reverse order:

# Register compensation functions (supports crash recovery)
Shikibu.register_compensation(:cancel_reservation) do |ctx, order_id:|
  InventoryService.cancel_reservation(order_id)
end

Shikibu.register_compensation(:refund_payment) do |ctx, order_id:|
  PaymentService.refund(order_id)
end

class OrderSaga < Shikibu::Workflow
  workflow_name 'order_saga'

  def execute(order_id:, amount:)
    # Step 1: Reserve inventory
    activity :reserve_inventory do
      InventoryService.reserve(order_id)
    end
    on_failure :cancel_reservation, order_id: order_id

    # Step 2: Process payment
    activity :process_payment do
      PaymentService.charge(order_id, amount)
    end
    on_failure :refund_payment, order_id: order_id

    # Step 3: If this fails, compensations run in reverse order:
    # → refund payment → cancel reservation
    activity :confirm_order do
      OrderService.confirm(order_id)
    end

    { status: 'completed' }
  end
end
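
The reverse-order behaviour can be illustrated with a small stack of compensation blocks. ToySaga below is a hypothetical stand-in written for this example; it does not use Shikibu and ignores persistence and crash recovery.

```ruby
# Toy saga runner. ToySaga is a hypothetical stand-in for the behaviour
# described above, not Shikibu's implementation.
class ToySaga
  def initialize
    @compensations = []
  end

  # Runs a step; a failure triggers all registered compensations.
  def step
    yield
  rescue StandardError
    compensate
    raise
  end

  # Registers a compensation for work that has already succeeded.
  def on_failure(&block)
    @compensations.push(block)
  end

  private

  # Runs compensations last-registered-first.
  def compensate
    @compensations.reverse_each(&:call)
  end
end

log = []
saga = ToySaga.new
begin
  saga.step { log << 'reserve inventory' }
  saga.on_failure { log << 'cancel reservation' }
  saga.step { log << 'charge payment' }
  saga.on_failure { log << 'refund payment' }
  saga.step { raise 'confirm_order failed' }
rescue RuntimeError => e
  log << "workflow failed: #{e.message}"
end
```

In this run the log shows the refund before the cancelled reservation, mirroring the order described in the comment on step 3.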

Event & Timer Waiting

Workflows can wait for external events or timers without consuming worker resources:

class PaymentWorkflow < Shikibu::Workflow
  workflow_name 'payment_workflow'

  def execute(order_id:)
    # Wait for payment completion event
    event = wait_event('payment.completed', timeout: 3600)

    { order_id: order_id, payment: event[:data] }
  end
end

Timer waiting with sleep:

class ReminderWorkflow < Shikibu::Workflow
  workflow_name 'reminder_workflow'

  def execute(user_id:)
    # Wait 3 days
    sleep(3 * 24 * 60 * 60)

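    # Check onboarding inside an activity so the result is recorded
    # and the reminder is not sent again on replay
    activity :send_reminder_if_needed do
      EmailService.send_reminder(user_id) unless UserService.completed_onboarding?(user_id)
    end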
  end
end

Key behavior:

  • wait_event and sleep release the workflow lock
  • Workflow resumes on any available worker when event arrives or timer expires
  • No worker is blocked while waiting

Channel-based Messaging

Shikibu provides channel-based messaging for workflow-to-workflow communication:

class JobWorker < Shikibu::Workflow
  workflow_name 'job_worker'

  def execute(worker_id:)
    # Subscribe with competing mode - each job goes to ONE worker only
    subscribe('jobs', mode: :competing)

    loop do
      job = receive('jobs')
      process_job(job.data)
      recur(worker_id: worker_id)  # Continue processing
    end
  end
end

class NotificationHandler < Shikibu::Workflow
  workflow_name 'notification_handler'

  def execute(handler_id:)
    # Subscribe with broadcast mode - ALL handlers receive each message
    subscribe('notifications', mode: :broadcast)

    loop do
      msg = receive('notifications')
      send_notification(msg.data)
      recur(handler_id: handler_id)
    end
  end
end

Delivery modes:

  • competing: Each message goes to exactly ONE subscriber (job queue/task distribution)
  • broadcast: Each message goes to ALL subscribers (notifications/fan-out)
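
The difference between the two modes can be shown with a toy in-memory channel. ToyChannel is hypothetical and exists only for this illustration; in particular, the round-robin choice of the competing subscriber is an assumption, not a statement about how Shikibu picks one.

```ruby
# Toy in-memory channel; ToyChannel is hypothetical and only mimics the
# two delivery modes. Competing delivery is round-robin here by assumption.
class ToyChannel
  def initialize(mode)
    @mode = mode
    @inboxes = {}
    @next = 0
  end

  def subscribe(subscriber_id)
    @inboxes[subscriber_id] = []
  end

  def publish(message)
    return if @inboxes.empty?

    case @mode
    when :broadcast
      # Every subscriber receives a copy.
      @inboxes.each_value { |inbox| inbox << message }
    when :competing
      # Exactly one subscriber receives the message.
      ids = @inboxes.keys
      @inboxes[ids[@next % ids.size]] << message
      @next += 1
    end
  end

  def inbox(subscriber_id)
    @inboxes.fetch(subscriber_id)
  end
end

jobs = ToyChannel.new(:competing)
%i[worker_1 worker_2].each { |id| jobs.subscribe(id) }
%w[job_a job_b job_c].each { |job| jobs.publish(job) }

notifications = ToyChannel.new(:broadcast)
%i[handler_1 handler_2].each { |id| notifications.subscribe(id) }
notifications.publish('deploy finished')
```

Each job lands in exactly one worker inbox, while the notification appears in both handler inboxes.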

Publishing messages:

# Publish to a channel (delivered to all broadcast subscribers, or to one competing subscriber)
publish('jobs', { task: 'send_report', user_id: 123 })

# Direct message to specific workflow instance
send_to(target_instance_id, 'approval', { approved: true })

PostgreSQL LISTEN/NOTIFY

When using PostgreSQL, Shikibu can use LISTEN/NOTIFY for real-time event delivery instead of polling:

# Automatically enabled for PostgreSQL URLs
app = Shikibu::App.new(database_url: 'postgres://localhost/workflows')

# Explicitly enable/disable
app = Shikibu::App.new(
  database_url: 'postgres://localhost/workflows',
  use_listen_notify: true  # or false to disable
)

Benefits:

  • Near-instant workflow resumption (vs polling intervals)
  • Reduced database load
  • Works transparently with existing code

For SQLite/MySQL, Shikibu falls back to polling-based updates.
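
The selection rule described here (LISTEN/NOTIFY for PostgreSQL URLs, polling otherwise) could look roughly like the following. The method name delivery_strategy and the returned symbols are hypothetical and chosen for this sketch; Shikibu's internal logic may differ.

```ruby
require 'uri'

# Hypothetical helper, not Shikibu's API: picks a delivery strategy
# from the scheme of the database URL.
def delivery_strategy(database_url)
  scheme = URI.parse(database_url).scheme
  %w[postgres postgresql].include?(scheme) ? :listen_notify : :polling
end

pg_strategy     = delivery_strategy('postgres://localhost/workflows')
pg_alt_strategy = delivery_strategy('postgresql://localhost/workflows')
sqlite_strategy = delivery_strategy('sqlite://workflow.db')
```

Both `postgres://` and `postgresql://` appear in this README, so the sketch treats them the same way.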

Rack Integration

Shikibu provides Rack middleware for CloudEvents endpoints:

# config.ru
require 'shikibu'

Shikibu.configure do |config|
  config.database_url = ENV['DATABASE_URL']
  config.service_name = 'order-service'
end

# Mount Shikibu middleware
use Shikibu::Middleware::RackApp

run MyApp

Rails Integration

# config/initializers/shikibu.rb
Shikibu.configure do |config|
  config.database_url = ENV['DATABASE_URL']
  config.service_name = Rails.application.class.module_parent_name.underscore
end

# config/routes.rb
Rails.application.routes.draw do
  mount Shikibu::Middleware::RackApp.new => '/workflows'
end

Sidekiq Integration

# app/jobs/workflow_job.rb
class WorkflowJob
  include Sidekiq::Job

  def perform(workflow_class, input)
    klass = workflow_class.constantize
    Shikibu.run(klass, **input.symbolize_keys)
  end
end

# Usage (Sidekiq arguments must be JSON-native, so use string keys)
WorkflowJob.perform_async('OrderSaga', { 'order_id' => 'ORD-123', 'amount' => 99.99 })

Multi-worker Execution

Multiple workers can safely process workflows using database-based exclusive control:

app = Shikibu::App.new(
  database_url: 'postgresql://localhost/workflows',
  service_name: 'order-service',
  worker_id: "worker-#{Process.pid}"
)

Features:

  • Each workflow instance runs on only one worker at a time
  • Automatic stale lock cleanup (5-minute timeout)
  • Crashed workflows automatically resume on any available worker
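
The coordination rules above can be pictured as a lock table with a staleness check. ToyLockTable is a hypothetical in-memory model for illustration only; Shikibu keeps its locks in the shared database, where acquisition has to be atomic.

```ruby
# Toy lock table; ToyLockTable is hypothetical. Shikibu keeps locks in the
# shared database, while this sketch uses a Hash and an explicit clock.
class ToyLockTable
  STALE_AFTER = 5 * 60 # seconds, matching the 5-minute timeout above

  def initialize
    @locks = {} # instance_id => { worker:, at: }
  end

  # Returns true if worker_id now holds the lock for instance_id.
  def try_acquire(instance_id, worker_id, now:)
    lock = @locks[instance_id]
    held_by_other = lock && lock[:worker] != worker_id
    return false if held_by_other && now - lock[:at] < STALE_AFTER

    @locks[instance_id] = { worker: worker_id, at: now }
    true
  end

  # Releases the lock only if worker_id is the current holder.
  def release(instance_id, worker_id)
    @locks.delete(instance_id) if @locks.dig(instance_id, :worker) == worker_id
  end
end

locks = ToyLockTable.new
t0 = Time.at(0)
a_first = locks.try_acquire('wf-1', 'worker-a', now: t0)       # lock is free
b_early = locks.try_acquire('wf-1', 'worker-b', now: t0 + 60)  # still held
b_late  = locks.try_acquire('wf-1', 'worker-b', now: t0 + 301) # lock is stale
```

Passing the clock in as `now:` keeps the sketch deterministic; a second worker can take over only once the first worker's lock is older than the timeout.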

Cross-Framework Compatibility

Shikibu shares the same database schema with:

  • Edda (Python)
  • Romancy (Go)

This means you can:

  • Use multiple languages in the same system
  • Migrate workflows between frameworks
  • Share workflow state across services

Each framework identifies its workflows via the framework column (ruby, python, go).

Development

This project uses just as a command runner.

just              # Show available commands
just install      # Install dependencies
just test         # Run unit tests
just test-file FILE  # Run specific test file
just lint         # Run RuboCop
just fix          # Auto-fix lint issues
just check        # Run lint + tests

# Integration tests (requires Docker)
just test-integration  # Run all integration tests
just test-pg           # PostgreSQL only
just test-mysql        # MySQL only

Requirements

  • Ruby 3.3+
  • SQLite3, PostgreSQL, or MySQL

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support