Module: TreeHaver

Defined in:
lib/tree_haver.rb,
lib/tree_haver/node.rb,
lib/tree_haver/tree.rb,
lib/tree_haver/point.rb,
lib/tree_haver/parser.rb,
lib/tree_haver/version.rb,
lib/tree_haver/language.rb,
lib/tree_haver/backends/ffi.rb,
lib/tree_haver/backends/mri.rb,
lib/tree_haver/backends/java.rb,
lib/tree_haver/backends/rust.rb,
lib/tree_haver/backends/prism.rb,
lib/tree_haver/backends/psych.rb,
lib/tree_haver/grammar_finder.rb,
lib/tree_haver/path_validator.rb,
lib/tree_haver/backends/citrus.rb,
lib/tree_haver/backends/markly.rb,
lib/tree_haver/language_registry.rb,
lib/tree_haver/library_path_utils.rb,
lib/tree_haver/backends/commonmarker.rb,
lib/tree_haver/citrus_grammar_finder.rb,
lib/tree_haver/rspec/dependency_tags.rb

Overview

TreeHaver is a cross-Ruby adapter for code parsing with 10 backends.

Provides a unified API for parsing source code across MRI Ruby, JRuby, and TruffleRuby using tree-sitter grammars or language-specific native parsers.

Backends

The backends fall into three groups:

  • Tree-sitter: MRI (C), Rust, FFI, Java

  • Native parsers: Prism (Ruby), Psych (YAML), Commonmarker (Markdown), Markly (GFM)

  • Pure Ruby: Citrus (portable fallback)

Platform Compatibility

Not all backends work on all Ruby platforms:

| Backend      | MRI | JRuby | TruffleRuby |
|--------------|-----|-------|-------------|
| MRI (C ext)  | ✓   | ✗     | ✗           |
| Rust         | ✓   | ✗     | ✗           |
| FFI          | ✓   | ✓     | ✗           |
| Java         | ✗   | ✓     | ✗           |
| Prism        | ✓   | ✓     | ✓           |
| Psych        | ✓   | ✓     | ✓           |
| Citrus       | ✓   | ✓     | ✓           |
| Commonmarker | ✓   | ✗     | ?           |
| Markly       | ✓   | ✗     | ?           |

  • JRuby: Cannot load native C/Rust extensions; use FFI, Java, or pure Ruby backends

  • TruffleRuby: FFI does not support STRUCT_BY_VALUE, and magnus/rb-sys (used by the Rust backend) require MRI's C API; use Prism, Psych, Citrus, or potentially Commonmarker/Markly
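
The compatibility table can also be encoded as data, which may help when a script needs to list candidate backends for the running interpreter. The sketch below is a stand-alone illustration: `COMPATIBILITY` and `candidate_backends` are hypothetical names, not part of TreeHaver, and `nil` stands in for the table's `?` entries.

```ruby
# Hypothetical encoding of the platform table above (not part of TreeHaver).
# true = works, false = does not work, nil = unknown ("?" in the table).
COMPATIBILITY = {
  mri:          { "ruby" => true,  "jruby" => false, "truffleruby" => false },
  rust:         { "ruby" => true,  "jruby" => false, "truffleruby" => false },
  ffi:          { "ruby" => true,  "jruby" => true,  "truffleruby" => false },
  java:         { "ruby" => false, "jruby" => true,  "truffleruby" => false },
  prism:        { "ruby" => true,  "jruby" => true,  "truffleruby" => true  },
  psych:        { "ruby" => true,  "jruby" => true,  "truffleruby" => true  },
  citrus:       { "ruby" => true,  "jruby" => true,  "truffleruby" => true  },
  commonmarker: { "ruby" => true,  "jruby" => false, "truffleruby" => nil   },
  markly:       { "ruby" => true,  "jruby" => false, "truffleruby" => nil   },
}.freeze

# Returns the backends the table marks as working on the given engine.
# Unknown ("?") entries are deliberately left out.
def candidate_backends(engine = RUBY_ENGINE)
  COMPATIBILITY.select { |_name, support| support[engine] == true }.keys
end

p candidate_backends("jruby")
p candidate_backends("truffleruby")
```

Treating `?` as "not a candidate" is a conservative choice made for this sketch; a caller who wants to experiment with Commonmarker or Markly on TruffleRuby would select those backends explicitly.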

Examples:

Basic usage with tree-sitter

# Load a language grammar
language = TreeHaver::Language.from_library(
  "/usr/local/lib/libtree-sitter-toml.so",
  symbol: "tree_sitter_toml"
)

# Create and configure a parser
parser = TreeHaver::Parser.new
parser.language = language

# Parse source code
tree = parser.parse("[package]\nname = \"my-app\"")
root = tree.root_node

# Use unified Position API (works across all backends)
puts root.start_line      # => 1 (1-based)
puts root.source_position # => {start_line:, end_line:, start_column:, end_column:}

Using language-specific backends

# Parse Ruby with Prism
TreeHaver.backend = :prism
parser = TreeHaver::Parser.new
parser.language = TreeHaver::Backends::Prism::Language.ruby
tree = parser.parse("class Example; end")

# Parse YAML with Psych
TreeHaver.backend = :psych
parser = TreeHaver::Parser.new
parser.language = TreeHaver::Backends::Psych::Language.yaml
tree = parser.parse("key: value")

# Parse Markdown with Commonmarker
TreeHaver.backend = :commonmarker
parser = TreeHaver::Parser.new
parser.language = TreeHaver::Backends::Commonmarker::Language.markdown
tree = parser.parse("# Heading\nParagraph")

Using language registration

TreeHaver.register_language(:toml, path: "/usr/local/lib/libtree-sitter-toml.so")
language = TreeHaver::Language.toml

Using GrammarFinder for automatic discovery

# GrammarFinder automatically locates grammar libraries on the system
finder = TreeHaver::GrammarFinder.new(:toml)
finder.register! if finder.available?
language = TreeHaver::Language.toml

Selecting a backend

TreeHaver.backend = :mri          # Force MRI (ruby_tree_sitter)
TreeHaver.backend = :rust         # Force Rust (tree_stump)
TreeHaver.backend = :ffi          # Force FFI
TreeHaver.backend = :java         # Force Java (JRuby)
TreeHaver.backend = :prism        # Force Prism (Ruby)
TreeHaver.backend = :psych        # Force Psych (YAML)
TreeHaver.backend = :commonmarker # Force Commonmarker (Markdown)
TreeHaver.backend = :markly       # Force Markly (GFM)
TreeHaver.backend = :citrus       # Force Citrus (pure Ruby)
TreeHaver.backend = :auto         # Auto-select (default)

Defined Under Namespace

Modules: Backends, LanguageRegistry, LibraryPathUtils, PathValidator, RSpec, Version

Classes: BackendConflict, CitrusGrammarFinder, Error, GrammarFinder, Language, Node, NotAvailable, Parser, Point, Tree

Constant Summary

CITRUS_DEFAULTS =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Default Citrus configurations for known languages

These are used by parser_for when no explicit citrus_config is provided and tree-sitter backends are not available (e.g., on TruffleRuby).

{
  toml: {
    gem_name: "toml-rb",
    grammar_const: "TomlRB::Document",
    require_path: "toml-rb",
  },
}.freeze

NATIVE_BACKENDS =

Native tree-sitter backends that support loading shared libraries (.so files). These backends wrap the tree-sitter C library via various bindings. The non-tree-sitter backends (Citrus, Prism, Psych, Commonmarker, Markly) are excluded.

%i[mri rust ffi java].freeze

VERSION =

Traditional location for VERSION constant

Returns:

  • (String)

    the version string

Version::VERSION

Class Method Details

.backend ⇒ Object

Examples:

TreeHaver.backend  # => :auto


# File 'lib/tree_haver.rb', line 347

def backend
  @backend ||= case (ENV["TREE_HAVER_BACKEND"] || :auto).to_s # rubocop:disable ThreadSafety/ClassInstanceVariable
  when "mri" then :mri
  when "rust" then :rust
  when "ffi" then :ffi
  when "java" then :java
  when "citrus" then :citrus
  when "prism" then :prism
  when "psych" then :psych
  when "commonmarker" then :commonmarker
  when "markly" then :markly
  else :auto
  end
end
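
The environment lookup above can be tried in isolation. The following is a stand-alone sketch of the same mapping, assuming only what the listing shows; `KNOWN_BACKENDS` and `backend_from_env` are illustrative names, not TreeHaver API.

```ruby
# Stand-alone sketch of the TREE_HAVER_BACKEND lookup shown above.
# KNOWN_BACKENDS and backend_from_env are illustrative names only.
KNOWN_BACKENDS = %w[mri rust ffi java citrus prism psych commonmarker markly].freeze

# Maps the environment value to a backend symbol; anything else becomes :auto.
def backend_from_env(env = ENV)
  value = env["TREE_HAVER_BACKEND"].to_s
  KNOWN_BACKENDS.include?(value) ? value.to_sym : :auto
end

p backend_from_env({ "TREE_HAVER_BACKEND" => "ffi" })   # known name
p backend_from_env({ "TREE_HAVER_BACKEND" => "bogus" }) # unrecognized value
p backend_from_env({})                                   # variable unset
```

Because the real method memoizes with `||=`, the environment variable is consulted only while `@backend` is unset; a value assigned through `backend=` or `reset_backend!` takes precedence afterwards.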

.backend=(name) ⇒ Symbol?

Set the backend to use

Examples:

Force FFI backend

TreeHaver.backend = :ffi

Force Rust backend

TreeHaver.backend = :rust

Parameters:

  • name (Symbol, String, nil)

    backend name (:auto, :mri, :rust, :ffi, :java, :citrus, :prism, :psych, :commonmarker, :markly)

Returns:

  • (Symbol, nil)

    the backend that was set



# File 'lib/tree_haver.rb', line 370

def backend=(name)
  @backend = name&.to_sym # rubocop:disable ThreadSafety/ClassInstanceVariable
end

.backend_module ⇒ Module?

Determine the concrete backend module to use

This method performs backend auto-selection when backend is :auto. On JRuby, prefers Java backend if available, then FFI, then Citrus. On MRI, prefers MRI backend if available, then Rust, then FFI, then Citrus. Citrus is the final fallback as it’s pure Ruby and works everywhere.

Examples:

mod = TreeHaver.backend_module
if mod
  puts "Using #{mod.capabilities[:backend]} backend"
end

Returns:

  • (Module, nil)

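    the module for the effective backend (Backends::MRI, Backends::Rust, Backends::FFI, Backends::Java, Backends::Citrus, Backends::Prism, Backends::Psych, Backends::Commonmarker, or Backends::Markly), or nil if auto-selection finds no available backend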



# File 'lib/tree_haver.rb', line 666

def backend_module
  case effective_backend  # Changed from: backend
  when :mri
    Backends::MRI
  when :rust
    Backends::Rust
  when :ffi
    Backends::FFI
  when :java
    Backends::Java
  when :citrus
    Backends::Citrus
  when :prism
    Backends::Prism
  when :psych
    Backends::Psych
  when :commonmarker
    Backends::Commonmarker
  when :markly
    Backends::Markly
  else
    # auto-select: prefer native/fast backends, fall back to pure Ruby (Citrus)
    if defined?(RUBY_ENGINE) && RUBY_ENGINE == "jruby" && Backends::Java.available?
      Backends::Java
    elsif defined?(RUBY_ENGINE) && RUBY_ENGINE == "ruby" && Backends::MRI.available?
      Backends::MRI
    elsif defined?(RUBY_ENGINE) && RUBY_ENGINE == "ruby" && Backends::Rust.available?
      Backends::Rust
    elsif Backends::FFI.available?
      Backends::FFI
    elsif Backends::Citrus.available?
      Backends::Citrus  # Pure Ruby fallback
    else
      # No backend available
      nil
    end
  end
end
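
The `:auto` branch is an ordered search over candidates that depends on the Ruby engine. A minimal stand-alone sketch of that ordering follows; `auto_select` and its `available` list are illustrative stand-ins for the `Backends::*.available?` checks, not TreeHaver API.

```ruby
# Stand-alone sketch of the :auto branch above. `available` stands in for the
# Backends::*.available? checks; auto_select is an illustrative name.
def auto_select(engine:, available:)
  order = []
  order << :java if engine == "jruby"          # Java is only tried on JRuby
  order.push(:mri, :rust) if engine == "ruby"  # C and Rust extensions only on MRI
  order.push(:ffi, :citrus)                    # tried on every engine
  order.find { |name| available.include?(name) }
end

p auto_select(engine: "jruby", available: %i[ffi java])
p auto_select(engine: "ruby", available: %i[rust ffi citrus])
p auto_select(engine: "truffleruby", available: %i[citrus])
p auto_select(engine: "ruby", available: [])
```

As the listing shows, Prism, Psych, Commonmarker, and Markly are never chosen by auto-selection; they are used only when requested by name.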

.backend_protect ⇒ Object

Alias for backend_protect?



# File 'lib/tree_haver.rb', line 299

def backend_protect
  backend_protect?
end

.backend_protect=(value) ⇒ Boolean

Whether backend conflict protection is enabled

When true (default), TreeHaver will raise BackendConflict if you try to use a backend that is known to conflict with a previously used backend. For example, FFI will not work after MRI has been used.

Set to false to disable protection (useful for testing compatibility).

Examples:

Disable protection for testing

TreeHaver.backend_protect = false

Returns:

  • (Boolean)


# File 'lib/tree_haver.rb', line 285

def backend_protect=(value)
  @backend_protect_mutex ||= Mutex.new
  @backend_protect_mutex.synchronize { @backend_protect = value }
end

.backend_protect? ⇒ Boolean

Check if backend conflict protection is enabled

Returns:

  • (Boolean)

    true if protection is enabled (default)



# File 'lib/tree_haver.rb', line 293

def backend_protect?
  return @backend_protect if defined?(@backend_protect) # rubocop:disable ThreadSafety/ClassInstanceVariable
  true  # Default is protected
end

.backends_used ⇒ Set<Symbol>

Track which backends have been used in this process

Returns:

  • (Set<Symbol>)

    set of backend symbols that have been used



# File 'lib/tree_haver.rb', line 306

def backends_used
  @backends_used ||= Set.new # rubocop:disable ThreadSafety/ClassInstanceVariable
end

.capabilities ⇒ Hash{Symbol => Object}

Get capabilities of the current backend

Returns a hash describing what features the selected backend supports. Common keys include:

  • :backend - Symbol identifying the backend (e.g., :mri, :rust, :ffi, :java)

  • :parse - Whether parsing is implemented

  • :query - Whether the Query API is available

  • :bytes_field - Whether byte position fields are available

  • :incremental - Whether incremental parsing is supported

Examples:

TreeHaver.capabilities
# => { backend: :mri, query: true, bytes_field: true }

Returns:

  • (Hash{Symbol => Object})

    capability map, or empty hash if no backend available



# File 'lib/tree_haver.rb', line 719

def capabilities
  mod = backend_module
  return {} unless mod
  mod.capabilities
end
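
A capability map like this is typically used to guard optional features. The sketch below shows one way to do that, assuming only the keys listed above; `describe_features` is a hypothetical helper and the sample hashes are made up rather than taken from a real backend.

```ruby
# `caps` stands in for the hash returned by TreeHaver.capabilities; the sample
# values passed below are invented for illustration.
def describe_features(caps)
  return "no backend available" if caps.empty?

  # Keep only the documented feature keys whose value is truthy.
  features = %i[parse query bytes_field incremental].select { |key| caps[key] }
  "#{caps[:backend]}: #{features.join(", ")}"
end

puts describe_features({ backend: :mri, query: true, bytes_field: true })
puts describe_features({})
```

Missing keys are treated as "not supported" here, which matches the fact that backends need not report every key.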

.check_backend_conflict!(backend) ⇒ void

This method returns an undefined value.

Check if using a backend would cause a conflict

Parameters:

  • backend (Symbol)

    the backend to check

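Raises:

  • (BackendConflict)

    if protection is enabled and the backend is blocked by a previously used backend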



# File 'lib/tree_haver.rb', line 333

def check_backend_conflict!(backend)
  return unless backend_protect?

  conflicts = conflicting_backends_for(backend)
  return if conflicts.empty?

  raise BackendConflict,
    "Cannot use #{backend} backend: it is blocked by previously used backend(s): #{conflicts.join(", ")}. " \
      "The #{backend} backend will segfault when #{conflicts.first} has already loaded. " \
      "To disable this protection (at risk of segfaults), set TreeHaver.backend_protect = false"
end

.conflicting_backends_for(backend) ⇒ Array<Symbol>

Check if a backend would conflict with previously used backends

Parameters:

  • backend (Symbol)

    the backend to check

Returns:

  • (Array<Symbol>)

    list of previously used backends that block this one



# File 'lib/tree_haver.rb', line 323

def conflicting_backends_for(backend)
  blockers = Backends::BLOCKED_BY[backend] || []
  blockers & backends_used.to_a
end
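
The conflict check is an intersection between a backend's blockers and the set of backends already used. The following stand-alone sketch illustrates this; the contents of the real `Backends::BLOCKED_BY` are not shown on this page, so the single entry below is an assumption based on the example in the `backend_protect=` description (FFI does not work after MRI has been used).

```ruby
require "set"

# Assumed, illustrative table; the real one lives in Backends::BLOCKED_BY.
BLOCKED_BY = { ffi: [:mri] }.freeze

# Returns the already-used backends that block `backend`.
def conflicts_for(backend, used)
  (BLOCKED_BY[backend] || []) & used.to_a
end

used = Set.new
used << :mri

p conflicts_for(:ffi, used)   # FFI is blocked once MRI has been used
p conflicts_for(:rust, used)  # no entry for :rust in this sketch
```

An empty result means no conflict is known; `check_backend_conflict!` raises only when the result is non-empty and protection is enabled.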

.current_backend_context ⇒ Hash{Symbol => Object}

Thread-local backend context storage

Returns a hash containing the thread-local backend context with keys:

  • :backend - The backend name (Symbol) or nil if using global default

  • :depth - The nesting depth (Integer) for proper cleanup

Examples:

ctx = TreeHaver.current_backend_context
ctx[:backend]  # => nil or :ffi, :mri, etc.
ctx[:depth]    # => 0, 1, 2, etc.

Returns:

  • (Hash{Symbol => Object})

    context hash with :backend and :depth keys



# File 'lib/tree_haver.rb', line 399

def current_backend_context
  Thread.current[:tree_haver_backend_context] ||= {
    backend: nil,  # nil means "use global default"
    depth: 0,       # Track nesting depth for proper cleanup
  }
end

.effective_backend ⇒ Symbol

Get the effective backend for current context

Priority: thread-local context → global @backend → :auto

Examples:

TreeHaver.effective_backend  # => :auto (default)

With thread-local context

TreeHaver.with_backend(:ffi) do
  TreeHaver.effective_backend  # => :ffi
end

Returns:

  • (Symbol)

    the backend to use



# File 'lib/tree_haver.rb', line 417

def effective_backend
  ctx = current_backend_context
  ctx[:backend] || backend || :auto
end

.parser_for(language_name, library_path: nil, symbol: nil, citrus_config: nil) ⇒ TreeHaver::Parser

Create a parser configured for a specific language

This is the recommended high-level API for creating a parser. It handles:

  1. Checking if the language is already registered

  2. Auto-discovering tree-sitter grammar via GrammarFinder

  3. Falling back to Citrus grammar if tree-sitter is unavailable

  4. Creating and configuring the parser

Examples:

Basic usage (auto-discovers grammar)

parser = TreeHaver.parser_for(:toml)
tree = parser.parse("[package]\nname = \"my-app\"")

With explicit library path

parser = TreeHaver.parser_for(:toml, library_path: "/custom/path/libtree-sitter-toml.so")

With Citrus fallback configuration

parser = TreeHaver.parser_for(:toml,
  citrus_config: { gem_name: "toml-rb", grammar_const: "TomlRB::Document" }
)

Parameters:

  • language_name (Symbol, String)

    the language to parse (e.g., :toml, :json, :bash)

  • library_path (String, nil) (defaults to: nil)

    optional explicit path to tree-sitter grammar library

  • symbol (String, nil) (defaults to: nil)

    optional tree-sitter symbol name (defaults to “tree_sitter_<name>”)

  • citrus_config (Hash, nil) (defaults to: nil)

    optional Citrus fallback configuration

Options Hash (citrus_config:):

  • :gem_name (String)

    gem name for the Citrus grammar

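  • :grammar_const (String)

    fully qualified constant name for grammar module

  • :require_path (String)

    optional path to require when loading the grammar gem (see CITRUS_DEFAULTS)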

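Returns:

  • (TreeHaver::Parser)

    a parser configured for the requested language

Raises:

  • (NotAvailable)

    if an explicit library_path does not exist or fails to load, or if neither a tree-sitter nor a Citrus grammar can be found for the language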



# File 'lib/tree_haver.rb', line 865

def parser_for(language_name, library_path: nil, symbol: nil, citrus_config: nil)
  name = language_name.to_sym
  symbol ||= "tree_sitter_#{name}"

  # Step 1: Try to get the language (may already be registered)
  language = begin
    # Check if already registered and loadable
    if registered_language(name)
      Language.public_send(name, path: library_path, symbol: symbol)
    end
  rescue NotAvailable, ArgumentError, LoadError
    nil
  end

  # Step 2: If not registered, try GrammarFinder for tree-sitter
  unless language
    # Principle of Least Surprise: If user provides an explicit path,
    # it MUST exist. Don't silently fall back to auto-discovery.
    if library_path && !library_path.empty?
      unless File.exist?(library_path)
        raise NotAvailable,
          "Specified parser path does not exist: #{library_path}"
      end
      begin
        register_language(name, path: library_path, symbol: symbol)
        language = Language.public_send(name)
      rescue NotAvailable, ArgumentError, LoadError => e
        # Re-raise with more context since user explicitly provided this path
        raise NotAvailable,
          "Failed to load parser from specified path #{library_path}: #{e.message}"
      end
    else
      # Auto-discover via GrammarFinder (no explicit path provided)
      begin
        finder = GrammarFinder.new(name)
        if finder.available?
          finder.register!
          language = Language.public_send(name)
        end
      rescue NotAvailable, ArgumentError, LoadError
        language = nil
      end
    end
  end

  # Step 3: Try Citrus fallback if tree-sitter failed
  unless language
    # Use explicit config, or fall back to built-in defaults for known languages
    citrus_config ||= CITRUS_DEFAULTS[name] || {}

    # Only attempt if we have the required configuration
    if citrus_config[:gem_name] && citrus_config[:grammar_const]
      begin
        citrus_finder = CitrusGrammarFinder.new(
          language: name,
          gem_name: citrus_config[:gem_name],
          grammar_const: citrus_config[:grammar_const],
          require_path: citrus_config[:require_path],
        )
        if citrus_finder.available?
          citrus_finder.register!
          language = Language.public_send(name)
        end
      rescue NotAvailable, ArgumentError, LoadError, NameError, TypeError
        language = nil
      end
    end
  end

  # Step 4: Raise if nothing worked
  unless language
    raise NotAvailable,
      "No parser available for #{name}. " \
        "Install tree-sitter-#{name} or the appropriate Ruby gem. " \
        "Set TREE_SITTER_#{name.to_s.upcase}_PATH for custom grammar location."
  end

  # Step 5: Create and configure parser
  parser = Parser.new
  parser.language = language
  parser
end
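
Stripped of detail, `parser_for` is a "first strategy that yields a result wins" chain. The stand-alone sketch below shows that control flow only; `first_available`, `NotAvailableSketch`, and the lambdas are hypothetical stand-ins for the registered-language, GrammarFinder, and Citrus steps.

```ruby
# Illustrative only: each lambda stands in for one lookup step of parser_for.
class NotAvailableSketch < StandardError; end

# Tries each strategy in order and returns [label, result] for the first hit.
def first_available(name, strategies)
  strategies.each do |label, strategy|
    result = begin
      strategy.call(name)
    rescue LoadError, ArgumentError
      nil # a failing step falls through to the next one
    end
    return [label, result] if result
  end
  raise NotAvailableSketch, "No parser available for #{name}"
end

strategies = {
  registered: ->(_name) { nil },                                  # nothing registered yet
  tree_sitter: ->(_name) { raise LoadError, "grammar not found" }, # discovery fails
  citrus: ->(name) { "citrus grammar for #{name}" },               # fallback succeeds
}

p first_available(:toml, strategies)
```

One difference from the sketch is worth noting: in the real method an explicit `library_path` does not fall through. If the path is missing or fails to load, `NotAvailable` is raised immediately rather than trying the next step.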

.record_backend_usage(backend) ⇒ void

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

This method returns an undefined value.

Record that a backend has been used

Parameters:

  • backend (Symbol)

    the backend that was used



# File 'lib/tree_haver.rb', line 315

def record_backend_usage(backend)
  backends_used << backend
end

.register_language(name, path: nil, symbol: nil, grammar_module: nil, gem_name: nil) ⇒ void

This method returns an undefined value.

Register a language helper by name (backend-agnostic)

After registration, you can use dynamic helpers like TreeHaver::Language.toml to load the registered language. TreeHaver will automatically use the appropriate grammar based on the active backend.

The name parameter is an arbitrary identifier you choose - it doesn’t need to match the actual language name. This is useful for:

  • Testing: Use unique names like :toml_test to avoid collisions

  • Aliasing: Register the same grammar under multiple names

  • Versioning: Register different grammar versions as :ruby_2 and :ruby_3

The actual grammar identity comes from path/symbol (tree-sitter) or grammar_module (Citrus), not from the name.

IMPORTANT: This method INTENTIONALLY allows registering BOTH a tree-sitter library AND a Citrus grammar for the same language IN A SINGLE CALL. This is achieved by using separate if statements (not elsif) and no early returns. This design is deliberate and provides significant benefits:

Why register both backends for one language?

  • Backend flexibility: Code works regardless of which backend is active

  • Performance testing: Compare tree-sitter vs Citrus performance

  • Gradual migration: Transition between backends without breaking code

  • Fallback scenarios: Use Citrus when tree-sitter library unavailable

  • Platform portability: tree-sitter on Linux/Mac, Citrus on JRuby/Windows

The active backend determines which registration is used automatically. No code changes needed to switch backends - just change TreeHaver.backend.

Examples:

Register tree-sitter grammar only

TreeHaver.register_language(
  :toml,
  path: "/usr/local/lib/libtree-sitter-toml.so",
  symbol: "tree_sitter_toml"
)

Register Citrus grammar only

TreeHaver.register_language(
  :toml,
  grammar_module: TomlRB::Document,
  gem_name: "toml-rb"
)

Register BOTH backends in separate calls

TreeHaver.register_language(
  :toml,
  path: "/usr/local/lib/libtree-sitter-toml.so",
  symbol: "tree_sitter_toml"
)
TreeHaver.register_language(
  :toml,
  grammar_module: TomlRB::Document,
  gem_name: "toml-rb"
)

Register BOTH backends in ONE call (recommended for maximum flexibility)

TreeHaver.register_language(
  :toml,
  path: "/usr/local/lib/libtree-sitter-toml.so",
  symbol: "tree_sitter_toml",
  grammar_module: TomlRB::Document,
  gem_name: "toml-rb"
)
# Now TreeHaver::Language.toml works with ANY backend!

Parameters:

  • name (Symbol, String)

    identifier for this registration (can be any name you choose)

  • path (String, nil) (defaults to: nil)

    absolute path to the language shared library (for tree-sitter)

  • symbol (String, nil) (defaults to: nil)

    optional exported factory symbol (e.g., “tree_sitter_toml”)

  • grammar_module (Module, nil) (defaults to: nil)

    Citrus grammar module that responds to .parse(source)

  • gem_name (String, nil) (defaults to: nil)

    optional gem name for error messages



# File 'lib/tree_haver.rb', line 798

def register_language(name, path: nil, symbol: nil, grammar_module: nil, gem_name: nil)
  # Register tree-sitter backend if path provided
  # Note: Uses `if` not `elsif` so both backends can be registered in one call
  if path
    LanguageRegistry.register(name, :tree_sitter, path: path, symbol: symbol)
  end

  # Register Citrus backend if grammar_module provided
  # Note: Uses `if` not `elsif` so both backends can be registered in one call
  # This allows maximum flexibility - register once, use with any backend
  if grammar_module
    unless grammar_module.respond_to?(:parse)
      raise ArgumentError, "Grammar module must respond to :parse"
    end

    LanguageRegistry.register(name, :citrus, grammar_module: grammar_module, gem_name: gem_name)
  end

  # Require at least one backend to be registered
  if path.nil? && grammar_module.nil?
    raise ArgumentError, "Must provide at least one of: path (tree-sitter) or grammar_module (Citrus)"
  end

  # Note: No early return! This method intentionally processes both `if` blocks
  # above to allow registering multiple backends for the same language.
  # Both tree-sitter and Citrus can be registered simultaneously for maximum
  # flexibility. See method documentation for rationale.
  nil
end
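
The "both backends under one name" behaviour follows from storing registrations per backend type. The sketch below is a simplified, stand-alone model of that idea; `SketchRegistry` is hypothetical and is not TreeHaver's actual `LanguageRegistry` implementation.

```ruby
# Illustrative registry: each name maps to a hash keyed by backend type, so a
# tree-sitter entry and a Citrus entry can coexist under the same name.
class SketchRegistry
  def initialize
    @entries = Hash.new { |hash, key| hash[key] = {} }
  end

  # Two independent `if` modifiers, so one call can register both kinds.
  def register(name, path: nil, grammar_module: nil)
    raise ArgumentError, "need path or grammar_module" if path.nil? && grammar_module.nil?

    @entries[name.to_sym][:tree_sitter] = { path: path } if path
    @entries[name.to_sym][:citrus] = { grammar_module: grammar_module } if grammar_module
    nil
  end

  # Returns the per-backend entries for a name, or nil if unregistered.
  def registered(name)
    @entries.fetch(name.to_sym, nil)
  end
end

registry = SketchRegistry.new
registry.register(:toml, path: "/usr/local/lib/libtree-sitter-toml.so", grammar_module: Object)
p registry.registered(:toml).keys
p registry.registered(:json)
```

`Object` is passed as the grammar module only to keep the sketch free of external gems; the real method additionally requires the module to respond to `parse`.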

.registered_language(name) ⇒ Hash?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Fetch a registered language entry

Parameters:

  • name (Symbol, String)

    language identifier

Returns:

  • (Hash, nil)

    registration hash with keys :path and :symbol, or nil if not registered



# File 'lib/tree_haver.rb', line 833

def registered_language(name)
  LanguageRegistry.registered(name)
end

.reset_backend!(to: :auto) ⇒ void

This method returns an undefined value.

Reset backend selection memoization

Primarily useful in tests to switch backends without cross-example leakage.

Examples:

Reset to auto-selection

TreeHaver.reset_backend!

Reset to specific backend

TreeHaver.reset_backend!(to: :ffi)

Parameters:

  • to (Symbol, String, nil) (defaults to: :auto)

    backend name or nil to clear (defaults to :auto)



# File 'lib/tree_haver.rb', line 384

def reset_backend!(to: :auto)
  @backend = to&.to_sym # rubocop:disable ThreadSafety/ClassInstanceVariable
end

.resolve_backend_module(explicit_backend = nil) ⇒ Module?

Get backend module for a specific backend (with explicit override)

Examples:

mod = TreeHaver.resolve_backend_module(:ffi)
mod.capabilities[:backend]  # => :ffi

Parameters:

  • explicit_backend (Symbol, String, nil) (defaults to: nil)

    explicitly requested backend

Returns:

  • (Module, nil)

    the backend module or nil if not available

Raises:

  • (BackendConflict)

    if the backend conflicts with previously used backends



# File 'lib/tree_haver.rb', line 539

def resolve_backend_module(explicit_backend = nil)
  # Temporarily override effective backend
  requested = resolve_effective_backend(explicit_backend)

  mod = case requested
  when :mri
    Backends::MRI
  when :rust
    Backends::Rust
  when :ffi
    Backends::FFI
  when :java
    Backends::Java
  when :citrus
    Backends::Citrus
  when :prism
    Backends::Prism
  when :psych
    Backends::Psych
  when :commonmarker
    Backends::Commonmarker
  when :markly
    Backends::Markly
  when :auto
    backend_module  # Fall back to normal resolution for :auto
  else
    # Unknown backend name - return nil to trigger error in caller
    nil
  end

  # Return nil if the module doesn't exist
  return unless mod

  # Check for backend conflicts FIRST, before checking availability
  # This is critical because the conflict causes the backend to report unavailable
  # We want to raise a clear error explaining WHY it's unavailable
  # Use the requested backend name directly (not capabilities) because
  # capabilities may be empty when the backend is blocked/unavailable
  check_backend_conflict!(requested) if requested && requested != :auto

  # Now check if the backend is available
  # Why assume modules without available? are available?
  # - Some backends might be mocked in tests without an available? method
  # - This makes the code more defensive and test-friendly
  # - It allows graceful degradation if a backend module is incomplete
  # - Backward compatibility: if a module doesn't declare availability, assume it works
  return if mod.respond_to?(:available?) && !mod.available?

  # Record that this backend is being used
  record_backend_usage(requested) if requested && requested != :auto

  mod
end

.resolve_effective_backend(explicit_backend = nil) ⇒ Symbol

Resolve the effective backend considering explicit override

Priority: explicit > thread context > global > :auto

Examples:

TreeHaver.resolve_effective_backend(:ffi)  # => :ffi

With thread-local context

TreeHaver.with_backend(:mri) do
  TreeHaver.resolve_effective_backend(nil)  # => :mri
  TreeHaver.resolve_effective_backend(:ffi)  # => :ffi (explicit wins)
end

Parameters:

  • explicit_backend (Symbol, String, nil) (defaults to: nil)

    explicitly requested backend

Returns:

  • (Symbol)

    the backend to use



# File 'lib/tree_haver.rb', line 526

def resolve_effective_backend(explicit_backend = nil)
  return explicit_backend.to_sym if explicit_backend
  effective_backend
end
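
The priority chain (explicit > thread context > global > :auto) reduces to a chain of `||` fallbacks. A stand-alone sketch follows; `pick_backend` and its keyword arguments are illustrative stand-ins for the explicit argument, the thread-local context, and the global setting.

```ruby
# Stand-alone sketch of the priority chain described above.
def pick_backend(explicit: nil, thread_local: nil, global: nil)
  (explicit || thread_local || global || :auto).to_sym
end

p pick_backend                                        # nothing set
p pick_backend(global: :rust)                         # global setting only
p pick_backend(thread_local: :mri, global: :rust)     # thread context wins over global
p pick_backend(explicit: "ffi", thread_local: :mri)   # explicit wins, strings are symbolized
```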

.resolve_native_backend_module(explicit_backend = nil) ⇒ Module?

Resolve a native tree-sitter backend module (for from_library)

This method is similar to resolve_backend_module but ONLY considers backends that support loading shared libraries (.so files):

  • MRI (ruby_tree_sitter C extension)

  • Rust (tree_stump)

  • FFI (ffi gem with libtree-sitter)

  • Java (jtreesitter on JRuby)

The non-tree-sitter backends (Citrus, Prism, Psych, Commonmarker, Markly) are NOT considered because they don’t support from_library.

Parameters:

  • explicit_backend (Symbol, String, nil) (defaults to: nil)

    explicitly requested backend

Returns:

  • (Module, nil)

    the backend module or nil if none available

Raises:

  • (BackendConflict)

    if the backend conflicts with previously used backends



# File 'lib/tree_haver.rb', line 613

def resolve_native_backend_module(explicit_backend = nil)
  # Short-circuit on TruffleRuby: no native backends work
  # - MRI: C extension, MRI only
  # - Rust: magnus requires MRI's C API
  # - FFI: STRUCT_BY_VALUE not supported
  # - Java: requires JRuby's Java interop
  if defined?(RUBY_ENGINE) && RUBY_ENGINE == "truffleruby"
    return unless explicit_backend # Auto-select: no backends available
    # If explicit backend requested, let it fail with proper error below
  end

  # Get the effective backend (considers thread-local and global settings)
  requested = resolve_effective_backend(explicit_backend)

  # If the effective backend is a native backend, use it
  if NATIVE_BACKENDS.include?(requested)
    return resolve_backend_module(requested)
  end

  # If a specific non-native backend was explicitly requested, return nil
  # (from_library only works with native backends that load .so files)
  return if explicit_backend

  # If effective backend is :auto, auto-select from native backends in priority order
  # Note: non-native backends set via with_backend are NOT used here because
  # from_library only works with native backends
  native_priority = if defined?(RUBY_ENGINE) && RUBY_ENGINE == "jruby"
    %i[java ffi] # JRuby: Java first, then FFI
  else
    %i[mri rust ffi] # MRI: MRI first, then Rust, then FFI
  end

  native_priority.each do |backend|
    mod = resolve_backend_module(backend)
    return mod if mod
  end

  nil # No native backend available
end
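
The branching in this method can be summarized as "which native backends would be tried, and in what order". The stand-alone sketch below models only that decision, assuming the behaviour shown in the listing; `native_candidates` and `NATIVE` are illustrative names, and availability and conflict checks are left out.

```ruby
NATIVE = %i[mri rust ffi java].freeze # mirrors NATIVE_BACKENDS

# Returns the ordered list of native backends that would be tried.
# `effective` is the resolved backend; `explicit` says whether the caller
# passed a backend argument.
def native_candidates(engine:, effective:, explicit: false)
  return [] if engine == "truffleruby" && !explicit # short-circuit: nothing to try
  return [effective] if NATIVE.include?(effective)  # a native backend was selected
  return [] if explicit                             # explicit non-native request

  engine == "jruby" ? %i[java ffi] : %i[mri rust ffi]
end

p native_candidates(engine: "ruby", effective: :auto)
p native_candidates(engine: "jruby", effective: :auto)
p native_candidates(engine: "ruby", effective: :prism, explicit: true)
p native_candidates(engine: "ruby", effective: :citrus)
```

The last call shows a subtle point from the listing: a non-native backend that is merely in effect (for example through `with_backend`) does not stop the native search, whereas an explicitly passed non-native backend does.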

.with_backend(name) { ... } ⇒ Object

Execute a block with a specific backend in thread-local context

This method provides temporary, thread-safe backend switching for a block of code. The backend setting is automatically restored when the block exits, even if an exception is raised. Supports nesting—inner blocks override outer blocks, and each level is properly unwound.

Thread Safety: Each thread maintains its own backend context, so concurrent threads can safely use different backends without interfering with each other.

Use Cases:

  • Testing: Test the same code path with different backends

  • Performance comparison: Benchmark parsing with different backends

  • Fallback scenarios: Try one backend, fall back to another on failure

  • Thread isolation: Different threads can use different backends safely

Examples:

Basic usage

TreeHaver.with_backend(:mri) do
  parser = TreeHaver::Parser.new
  tree = parser.parse(source)
end
# Backend is automatically restored here

Nested blocks (inner overrides outer)

TreeHaver.with_backend(:rust) do
  parser1 = TreeHaver::Parser.new  # Uses :rust
  TreeHaver.with_backend(:citrus) do
    parser2 = TreeHaver::Parser.new  # Uses :citrus
  end
  parser3 = TreeHaver::Parser.new  # Back to :rust
end

Testing multiple backends

[:mri, :rust, :citrus].each do |backend_name|
  TreeHaver.with_backend(backend_name) do
    parser = TreeHaver::Parser.new
    result = parser.parse(source)
    puts "#{backend_name}: #{result.root_node.type}"
  end
end

Exception safety (backend restored even on error)

TreeHaver.with_backend(:mri) do
  raise "Something went wrong"
rescue
  # Handle error
end
# Backend is still restored to its previous value

Thread isolation

threads = [:mri, :rust].map do |backend_name|
  Thread.new do
    TreeHaver.with_backend(backend_name) do
      # Each thread uses its own backend independently
      TreeHaver::Parser.new
    end
  end
end
threads.each(&:join)

Parameters:

  • name (Symbol, String)

    backend name (:mri, :rust, :ffi, :java, :citrus, :prism, :psych, :commonmarker, :markly, :auto)

Yields:

  • block to execute with the specified backend

Returns:

  • (Object)

    the return value of the block

Raises:

  • (ArgumentError)

    if backend name is nil

  • (BackendConflict)

    if the requested backend conflicts with a previously used backend

See Also:

  • .effective_backend
  • .current_backend_context


# File 'lib/tree_haver.rb', line 490

def with_backend(name)
  raise ArgumentError, "Backend name required" if name.nil?

  # Get context FIRST to ensure it exists
  ctx = current_backend_context
  old_backend = ctx[:backend]
  old_depth = ctx[:depth]

  begin
    # Set new backend and increment depth
    ctx[:backend] = name.to_sym
    ctx[:depth] += 1

    # Execute block
    yield
  ensure
    # Restore previous backend and depth
    # This ensures proper unwinding even with exceptions
    ctx[:backend] = old_backend
    ctx[:depth] = old_depth
  end
end
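
The save-and-restore pattern above can be exercised without TreeHaver installed. The following is a minimal stand-alone re-creation for experimentation; `SketchContext` and its storage key are illustrative names, and the module is not TreeHaver's own code.

```ruby
# Minimal re-creation of the save/restore pattern; names are illustrative.
module SketchContext
  KEY = :sketch_backend_context

  # Lazily creates the per-thread context hash.
  def self.context
    Thread.current[KEY] ||= { backend: nil, depth: 0 }
  end

  def self.with_backend(name)
    raise ArgumentError, "Backend name required" if name.nil?

    ctx = context
    old_backend = ctx[:backend]
    old_depth = ctx[:depth]
    begin
      ctx[:backend] = name.to_sym
      ctx[:depth] += 1
      yield
    ensure
      # Runs on normal exit and on exceptions, so nesting unwinds correctly.
      ctx[:backend] = old_backend
      ctx[:depth] = old_depth
    end
  end
end

seen = []
SketchContext.with_backend(:rust) do
  seen << SketchContext.context[:backend]
  SketchContext.with_backend(:citrus) { seen << SketchContext.context[:backend] }
  seen << SketchContext.context[:backend]
end
seen << SketchContext.context[:backend]
p seen # => [:rust, :citrus, :rust, nil]
```

Note that `Thread.current[]` storage in Ruby is fiber-local, so code running in a separate Fiber would also start from a fresh context in this sketch.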