ruby-claw
AI Agent framework for Ruby. Built on ruby-mana.
What is Claw?
Claw turns ruby-mana's embedded LLM engine into a full agent with persistent memory, interactive chat, and session recovery. Think of it as the agent layer on top of mana's execution engine.
gem install ruby-claw
Features
Interactive TUI
Running claw launches a full-screen terminal UI (built on Charm Ruby's bubbletea) with 4 zones: top status bar, left chat panel, right status panel, and bottom command bar.
Claw.chat still works for the legacy REPL mode:
require "claw"
Claw.chat
- Auto-detects Ruby code vs natural language
- Streaming output with markdown rendering
- ! prefix forces Ruby eval
- Session persists across restarts
Persistent Memory
Claw stores memories as human-readable Markdown in .ruby-claw/:
.ruby-claw/
  MEMORY.md            # Long-term facts (editable!)
  session.md           # Conversation summary
  system_prompt.md     # Custom agent personality
  values.json          # Variable snapshots
  definitions.rb       # Method definitions
  log/
    2026-03-29.md      # Daily interaction log
  traces/
    20260405_103000.md # Execution traces
  evolution/
    20260405_accept.md # Evolution logs
  gems/                # Editable gem source (after claw init)
The LLM can remember facts that persist across sessions:
claw> remember that the API uses OAuth2
claw> # ... next session ...
claw> what auth does our API use?
# => "OAuth2 - I remembered this from a previous session"
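To make the idea of Markdown-backed memory concrete, here is a minimal sketch of storing facts as bullets in a MEMORY.md file and reading them back in a later session. The `MarkdownMemory` class and its bullet format are assumptions made for illustration; they are not Claw's actual implementation or file layout.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not Claw's API): keeps each fact as one Markdown bullet.
class MarkdownMemory
  def initialize(dir)
    FileUtils.mkdir_p(dir)
    @path = File.join(dir, "MEMORY.md")
  end

  # Append one fact as a bullet so the file stays easy to edit by hand.
  def remember(fact)
    File.open(@path, "a") { |f| f.puts("- #{fact}") }
  end

  # Read every bullet back, ignoring headings and blank lines.
  def facts
    return [] unless File.exist?(@path)

    File.readlines(@path, chomp: true)
        .select { |line| line.start_with?("- ") }
        .map { |line| line.delete_prefix("- ") }
  end
end

Dir.mktmpdir do |root|
  dir = File.join(root, ".ruby-claw")
  MarkdownMemory.new(dir).remember("the API uses OAuth2")
  # A fresh object stands in for the next session.
  puts MarkdownMemory.new(dir).facts.inspect
end
```

Because the store is plain text, editing the file by hand changes what the next session sees.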
Runtime Persistence
Variables and method definitions survive across sessions:
claw> a = 42
claw> def greet(name) = "Hello #{name}"
claw> exit
$ claw # restart
claw> a # => 42
claw> greet("world") # => "Hello world"
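One way variable persistence could work is to dump a binding's local variables to JSON and load them into a new binding on restart. The sketch below assumes JSON-serializable values only; `save_values`, `load_values`, and `fresh_binding` are hypothetical names, and method definitions (which Claw keeps in definitions.rb) are not covered.

```ruby
require "json"
require "tmpdir"

# Hypothetical helper: write every local variable of a binding to a JSON file.
def save_values(bind, path)
  values = bind.local_variables.to_h { |name| [name, bind.local_variable_get(name)] }
  File.write(path, JSON.pretty_generate(values))
end

# Hypothetical helper: define each saved variable in the given binding.
def load_values(bind, path)
  JSON.parse(File.read(path)).each do |name, value|
    bind.local_variable_set(name.to_sym, value)
  end
end

# A binding with no local variables, standing in for a new session.
def fresh_binding
  binding
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "values.json")

  first = fresh_binding
  first.eval("a = 42")
  save_values(first, path)

  second = fresh_binding # stands in for the restarted session
  load_values(second, path)
  puts second.eval("a")
end
```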
Memory Compaction
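When the conversation grows large, older messages are summarized automatically in the background. With the settings shown under Configuration, compaction starts once token usage exceeds 70% of the context window (memory_pressure) and keeps the last 4 conversation rounds (memory_keep_recent).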
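A threshold-based compaction step might look like the following sketch. The message shape (`role`, `text`, `tokens`), the `compact` function, and the placeholder summary are assumptions for illustration; a real agent would ask an LLM for the summary, and this sketch counts individual messages instead of conversation rounds.

```ruby
# Hypothetical message shape: { role:, text:, tokens: }.
def compact(messages, context_window:, pressure: 0.7, keep_recent: 4)
  total = messages.sum { |m| m[:tokens] }
  # Below the pressure threshold nothing needs to change.
  return messages if total <= context_window * pressure
  return messages if messages.size <= keep_recent

  old = messages.first(messages.size - keep_recent)
  recent = messages.last(keep_recent)

  # Stand-in for an LLM-generated summary of the old messages.
  text = "Summary of #{old.size} earlier messages"
  summary = { role: "system", text: text, tokens: text.split.size }
  [summary, *recent]
end

history = Array.new(10) { |i| { role: "user", text: "message #{i}", tokens: 100 } }
compacted = compact(history, context_window: 1000)
puts compacted.map { |m| m[:text] }
```

Keeping the most recent messages verbatim preserves short-range context, while the summary stands in for everything older.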
Incognito Mode
Temporarily disable memory loading and saving:
Claw.incognito do
  ~"translate <text> to French, store in <french>"
  # No memories loaded, nothing remembered
end
Claw::Memory.incognito? # => true inside the block
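A block-scoped mode like this is commonly built from a per-thread flag that is restored in an `ensure` clause. The sketch below shows that pattern under the assumption that writes are simply skipped while the flag is set; `SketchMemory` is a hypothetical module, not Claw's implementation.

```ruby
# Hypothetical module illustrating a block-scoped incognito flag.
module SketchMemory
  KEY = :sketch_memory_incognito

  # Set the flag for the duration of the block, restoring the old value
  # even if the block raises.
  def self.incognito
    previous = Thread.current[KEY]
    Thread.current[KEY] = true
    yield
  ensure
    Thread.current[KEY] = previous
  end

  def self.incognito?
    Thread.current[KEY] == true
  end

  def self.store
    @store ||= []
  end

  # Returns false without storing anything while incognito.
  def self.remember(fact)
    return false if incognito?

    store << fact
    true
  end
end

SketchMemory.incognito do
  puts SketchMemory.incognito?          # flag is set inside the block
  puts SketchMemory.remember("secret")  # the write is skipped
end
puts SketchMemory.incognito?            # flag is cleared again
```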
Keyword Memory Search
With many memories (>20), only the most relevant are injected into prompts, up to memory_top_k (10 in the configuration example).
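As an illustration of keyword-based selection, the sketch below scores each memory by how many words it shares with the query and keeps the best matches. The scoring rule, the `tokenize` and `relevant_memories` names, and the parameter names are assumptions; the source does not describe Claw's actual ranking.

```ruby
require "set"

# Lowercase a string and split it into a set of alphanumeric words.
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/).to_set
end

# Hypothetical selector: below the threshold every memory is used;
# above it, only the top_k memories sharing words with the query.
def relevant_memories(memories, query, top_k: 10, threshold: 20)
  return memories if memories.size <= threshold

  query_words = tokenize(query)
  memories
    .map { |memory| [memory, (tokenize(memory) & query_words).size] }
    .select { |_, score| score.positive? }
    .max_by(top_k) { |_, score| score }
    .map(&:first)
end

memories = Array.new(24) { |i| "filler entry #{i}" }
memories << "the API uses OAuth2 for auth"
puts relevant_memories(memories, "what auth does our API use?")
```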
Reversible Runtime
Snapshot and roll back the entire agent state (context, memory, variables, filesystem):
claw> /snapshot before-refactor
✓ snapshot #2 created (before-refactor)
claw> # ... make changes ...
claw> /rollback 2
✓ rolled back to snapshot #2
REPL commands:
| Command | Description |
|---------|-------------|
| /snapshot [label] | Snapshot all resources |
| /rollback <id> | Rollback to a snapshot |
| /diff [id_a id_b] | Show diff between snapshots |
| /history | List all snapshots |
| /status | Show current resource state |
| /evolve | Run a self-evolution cycle |
| /role <name> | Switch agent role/identity |
| /forge <method> | Promote a method to a formal tool |
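The snapshot and rollback commands can be pictured as a store of deep copies keyed by id. The following is a minimal in-memory sketch, assuming the state is a Marshal-serializable Ruby object; `SnapshotStore` is a hypothetical class, and it does not cover filesystem snapshots.

```ruby
# Hypothetical in-memory snapshot store, not Claw's implementation.
class SnapshotStore
  Snapshot = Struct.new(:id, :label, :state)

  def initialize
    @snapshots = []
  end

  # Deep-copy the state so later mutations do not leak into the snapshot.
  def snapshot(state, label = nil)
    copy = Marshal.load(Marshal.dump(state))
    snap = Snapshot.new(@snapshots.size + 1, label, copy)
    @snapshots << snap
    snap.id
  end

  # Return a fresh copy of the stored state for the given id.
  def rollback(id)
    snap = @snapshots.find { |s| s.id == id }
    raise ArgumentError, "unknown snapshot id: #{id}" unless snap

    Marshal.load(Marshal.dump(snap.state))
  end

  # List [id, label] pairs, oldest first.
  def history
    @snapshots.map { |s| [s.id, s.label] }
  end
end

store = SnapshotStore.new
state = { "vars" => { "a" => 42 } }
id = store.snapshot(state, "before-refactor")
state["vars"]["a"] = 0           # make changes
state = store.rollback(id)       # restore the earlier state
puts state["vars"]["a"]
```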
Plan Mode
/plan toggles plan mode. When active, the LLM generates a step-by-step plan without executing any tools. The user reviews the proposed steps, then confirms execution -- which runs in a safe fork so the original state is preserved if anything goes wrong.
Roles
Role files are Markdown documents stored in .ruby-claw/roles/. Each role defines an agent identity (system prompt, constraints, tool permissions).
- /role <name> switches the active agent identity at runtime
- claw init creates a default role
Benchmark
claw benchmark run executes the benchmark suite -- 9 built-in tasks spanning the mana, claw, runtime, and evolution layers. Each task runs 3 times, and scoring covers:
- Correctness -- did the agent produce the right result?
- Rounds efficiency -- how many LLM round-trips were needed?
- Token efficiency -- total token usage
- Tool path accuracy -- did the agent call the expected tools in the expected order?
claw benchmark diff <a> <b> compares two benchmark reports side by side. An evolution cycle is triggered automatically on a score regression or after 3 consecutive failures.
Multi-Agent
runtime.fork_async(prompt:, vars:, role:) spawns a child agent that runs in an isolated thread with deep-copied variables and an optional git worktree for filesystem isolation.
Child lifecycle methods:
- child.join -- block until the child finishes
- child.cancel! -- abort the child
- child.diff -- inspect changes made by the child
- child.merge! -- merge the child's results back into the parent
All operations are thread-safe with Mutex protection.
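The child lifecycle can be approximated in plain Ruby with a thread, a deep copy of the parent's variables, and a Mutex around the merge. This is a sketch under those assumptions: `ChildAgent` is a hypothetical class, the work is an ordinary block where Claw would run an LLM prompt, and git worktree isolation is left out.

```ruby
# Hypothetical child agent: runs a block against deep-copied variables.
class ChildAgent
  def initialize(parent_vars, lock, &work)
    @parent_vars = parent_vars
    @lock = lock
    @base = Marshal.load(Marshal.dump(parent_vars)) # state at fork time
    @vars = Marshal.load(Marshal.dump(parent_vars)) # child's private copy
    @thread = Thread.new { work.call(@vars) }
  end

  # Block until the child finishes.
  def join
    @thread.join
    self
  end

  # Abort the child.
  def cancel!
    @thread.kill
    self
  end

  # Keys the child added or changed (call after join; deletions are ignored).
  def diff
    @vars.reject { |key, value| @base.key?(key) && @base[key] == value }
  end

  # Wait for the child, then copy its changes into the parent under the lock.
  def merge!
    join
    @lock.synchronize { @parent_vars.merge!(diff) }
  end
end

parent = { "numbers" => [1, 2, 3] }
lock = Mutex.new
child = ChildAgent.new(parent, lock) do |vars|
  vars["avg"] = vars["numbers"].sum / vars["numbers"].size.to_f
end
child.merge!
puts parent["avg"]
```

Deep-copying at fork time means the parent stays untouched until `merge!` is called explicitly.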
Execution Traces
Every LLM interaction is logged as a Markdown file in .ruby-claw/traces/:
# Task: compute average of numbers
- Model: claude-sonnet-4-20250514
- Steps: 2
- Total tokens: 1100 in / 350 out
- Total latency: 1400ms
## Step 1
- Latency: 800ms
- Tokens: 500 in / 200 out
### Tool calls
- **read_var**(name: "numbers") -> [1, 2, 3]
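A trace file in the layout shown above could be produced by a small renderer that sums per-step figures for the header. The sketch below assumes a hypothetical `TraceStep` struct and `render_trace` function; the second step's numbers are invented so that the totals match the example.

```ruby
# Hypothetical per-step record; field names are assumptions.
TraceStep = Struct.new(:latency_ms, :tokens_in, :tokens_out, :tool_calls, keyword_init: true)

# Render a trace as Markdown, deriving the totals from the steps.
def render_trace(task:, model:, steps:)
  lines = [
    "# Task: #{task}",
    "- Model: #{model}",
    "- Steps: #{steps.size}",
    "- Total tokens: #{steps.sum(&:tokens_in)} in / #{steps.sum(&:tokens_out)} out",
    "- Total latency: #{steps.sum(&:latency_ms)}ms"
  ]
  steps.each_with_index do |step, index|
    lines << "## Step #{index + 1}"
    lines << "- Latency: #{step.latency_ms}ms"
    lines << "- Tokens: #{step.tokens_in} in / #{step.tokens_out} out"
    next if step.tool_calls.empty?

    lines << "### Tool calls"
    step.tool_calls.each do |call|
      lines << "- **#{call[:name]}**(#{call[:args]}) -> #{call[:result]}"
    end
  end
  lines.join("\n")
end

steps = [
  TraceStep.new(latency_ms: 800, tokens_in: 500, tokens_out: 200,
                tool_calls: [{ name: "read_var", args: 'name: "numbers"', result: "[1, 2, 3]" }]),
  # Invented second step, chosen so the totals match the example trace.
  TraceStep.new(latency_ms: 600, tokens_in: 600, tokens_out: 150, tool_calls: [])
]
puts render_trace(task: "compute average of numbers", model: "claude-sonnet-4-20250514", steps: steps)
```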
Tool System
Claw has a three-layer tool architecture:
- Core tools (always loaded): read_var, write_var, call_func, eval, remember, search_tools, load_tool
- Project tools (on-demand): .ruby-claw/tools/*.rb, indexed at startup, loaded via load_tool
- Hub tools (remote): community tools from a ruby-claw-toolhub, downloaded on demand
Create a project tool:
# .ruby-claw/tools/format_report.rb
class FormatReport
  include Claw::Tool
  tool_name "format_report"
  description "Format raw data into a readable report"
  parameter :data, type: "Hash", required: true, desc: "Raw data"
  parameter :style, type: "String", required: false, desc: "brief or detailed"

  def call(data:, style: "brief")
    # ...
  end
end
The agent discovers tools via search_tools and loads them via load_tool. Use /forge <method_name> to promote an eval-defined method into a formal tool class.
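For readers curious how a class-level DSL such as tool_name, description, and parameter can work, here is a self-contained sketch of the general Ruby pattern. `SketchTool` and `SketchFormatReport` are hypothetical stand-ins written for this example; they say nothing about how Claw::Tool is actually implemented, and the body of `call` is invented.

```ruby
# Hypothetical mixin showing the class-macro pattern behind a tool DSL.
module SketchTool
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Setter when given an argument, getter otherwise.
    def tool_name(name = nil)
      @tool_name = name if name
      @tool_name
    end

    def description(text = nil)
      @description = text if text
      @description
    end

    def parameter(name, type:, required: false, desc: "")
      parameters[name] = { type: type, required: required, desc: desc }
    end

    def parameters
      @parameters ||= {}
    end

    # Metadata an agent could use to discover and call the tool.
    def schema
      { name: tool_name, description: description, parameters: parameters,
        required: parameters.select { |_, spec| spec[:required] }.keys }
    end
  end
end

class SketchFormatReport
  include SketchTool
  tool_name "format_report"
  description "Format raw data into a readable report"
  parameter :data, type: "Hash", required: true, desc: "Raw data"
  parameter :style, type: "String", required: false, desc: "brief or detailed"

  # Invented body: one "key: value" entry per pair.
  def call(data:, style: "brief")
    rows = data.map { |key, value| "#{key}: #{value}" }
    style == "brief" ? rows.join(", ") : rows.join("\n")
  end
end

puts SketchFormatReport.schema[:name]
puts SketchFormatReport.new.call(data: { users: 3, errors: 0 })
```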
Web Console
claw console launches a local web UI at http://127.0.0.1:4567 for observability and operations:
- Dashboard -- version, tool/memory/snapshot counts
- Prompt Inspector -- view and edit the assembled system prompt
- LLM Monitor -- real-time event stream via Server-Sent Events
- Trace Explorer -- browse execution traces
- Memory Manager -- add/remove long-term memories
- Tool Manager -- view core tools, load/unload project tools
- Snapshot Manager -- create snapshots, rollback state
All data is served via a REST API (/api/status, /api/traces, /api/memory, etc.).
Project Scaffolding
Initialize a project with editable gem source for self-evolution:
claw init
Creates:
.ruby-claw/
  gems/
    ruby-claw/         # Editable source
    ruby-mana/
  tools/               # Project tool classes
  roles/               # Agent role definitions
  benchmarks/          # Benchmark reports
  system_prompt.md     # Customizable agent personality
  MEMORY.md
  .git/                # Filesystem snapshots
Self-Evolution
The agent can improve its own code:
claw> /evolve
⚡ running evolution cycle...
✓ accepted: Improve error message specificity
Flow: read traces → LLM diagnoses improvement → fork runtime → apply change → run tests → keep or roll back.
Evolution logs are written to .ruby-claw/evolution/.
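The fork, apply, test, keep-or-roll-back part of that flow can be sketched as a function that works on a deep copy and only returns the copy when every check passes. Everything here is an assumption made for illustration: `evolve` is a hypothetical name, the change and the tests are plain lambdas, and the trace-reading and LLM diagnosis steps are omitted.

```ruby
# Hypothetical accept-or-reject step: the original state is never mutated.
def evolve(state, change:, tests:)
  candidate = Marshal.load(Marshal.dump(state)) # fork: work on a deep copy
  change.call(candidate)                        # apply the proposed change
  if tests.all? { |test| test.call(candidate) } # run the checks
    [:accepted, candidate]
  else
    [:rejected, state]
  end
rescue StandardError
  # A change or test that raises counts as a rejection.
  [:rejected, state]
end

state = { "error_message" => "failed" }
change = ->(s) { s["error_message"] = "failed: file not found" }
tests = [->(s) { s["error_message"].include?(":") }]

verdict, new_state = evolve(state, change: change, tests: tests)
puts verdict
puts new_state["error_message"]
```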
CLI Subcommands
| Command | Description |
|---|---|
| claw | Launch the TUI (default) |
| claw init | Scaffold a new project |
| claw status | Show current resource state |
| claw history | List all snapshots |
| claw rollback <id> | Rollback to a snapshot |
| claw trace [id] | View execution traces |
| claw evolve | Run a self-evolution cycle |
| claw benchmark run | Run the benchmark suite |
| claw benchmark diff <a> <b> | Compare two benchmark reports |
| claw console | Launch the web console UI |
| claw version | Print version |
| claw help | Show help |
Configuration
Claw.configure do |c|
  c.memory_pressure = 0.7     # Compact when tokens > 70% of context window
  c.memory_keep_recent = 4    # Keep last 4 conversation rounds during compaction
  c.compact_model = nil       # nil = use main model for summarization
  c.persist_session = true    # Save/restore session across restarts
  c.memory_top_k = 10         # Max memories to inject when searching
  c.on_compact = ->(summary) { puts summary }
  c.tools_dir = nil           # Custom tools directory (default: .ruby-claw/tools)
  c.hub_url = nil             # Remote tool hub URL
  c.console_port = 4567       # Web console port
end

# Mana config (inherited)
Mana.configure do |c|
  c.model = "claude-sonnet-4-6"
  c.api_key = "sk-..."
end
Architecture
Claw extends mana via its tool registration interface -- no monkey-patching:
# Claw registers the "remember" tool into mana's engine
Mana.register_tool(remember_tool_definition) { |input| ... }
# Claw injects long-term memories into mana's system prompt
Mana.register_prompt_section { |context| memory_text }
- ruby-mana = Embedded LLM engine (~"..." syntax, binding manipulation, tool calling)
- ruby-claw = Agent framework (chat REPL, memory, persistence, knowledge)
Claw depends on mana. You can use mana standalone for embedding LLM in Ruby code, or add claw for interactive agent features.
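The extension style described here, registering tools and prompt sections instead of patching the engine, can be illustrated with a tiny registry. `SketchEngine` is a hypothetical stand-in for the engine side; its method names mirror the two calls shown above, but the tool definition shape, `call_tool`, and `system_prompt` are assumptions.

```ruby
# Hypothetical engine exposing registration hooks to higher layers.
module SketchEngine
  @tools = {}
  @prompt_sections = []

  class << self
    # Store a tool definition together with the block that handles calls.
    def register_tool(definition, &handler)
      @tools[definition.fetch(:name)] = { definition: definition, handler: handler }
    end

    # Store a block that contributes text to the system prompt.
    def register_prompt_section(&block)
      @prompt_sections << block
    end

    def call_tool(name, input)
      @tools.fetch(name)[:handler].call(input)
    end

    # Assemble the prompt from the base text plus every non-empty section.
    def system_prompt(base, context = {})
      sections = @prompt_sections.map { |section| section.call(context) }
      [base, *sections.compact.reject(&:empty?)].join("\n\n")
    end
  end
end

# The agent layer plugs in through the two hooks only.
memories = []
SketchEngine.register_tool({ name: "remember", description: "Store a fact" }) do |input|
  memories << input.fetch("content")
  "ok"
end
SketchEngine.register_prompt_section { |_context| memories.map { |m| "- #{m}" }.join("\n") }

SketchEngine.call_tool("remember", { "content" => "the API uses OAuth2" })
puts SketchEngine.system_prompt("You are an agent.")
```

Since the engine only knows about handlers and prompt blocks, the memory feature can be removed by not registering it, without touching engine code.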
License
MIT