Class: TreeHaver::Parser

Inherits:
Object
Defined in:
lib/tree_haver.rb

Overview

Represents a tree-sitter parser instance

A Parser is used to parse source code into a syntax tree. You must set a language before parsing.

Wrapping/Unwrapping Responsibility

TreeHaver::Parser is responsible for ALL object wrapping and unwrapping:

**Language objects:**

  • Unwraps Language wrappers before passing to backend.language=

  • MRI backend receives ::TreeSitter::Language

  • Rust backend receives String (language name)

  • FFI backend receives wrapped Language (needs to_ptr)

**Tree objects:**

  • parse() receives raw source, backend returns raw tree, Parser wraps it

  • parse_string() unwraps old_tree before passing to backend, wraps returned tree

  • Backends always work with raw backend trees, never TreeHaver::Tree

**Node objects:**

  • Backends return raw nodes, TreeHaver::Tree and TreeHaver::Node wrap them

This design ensures:

  • Principle of Least Surprise: wrapping happens at boundaries, consistently

  • Backends are simple: they don’t need to know about TreeHaver wrappers

  • Single Responsibility: wrapping logic is only in TreeHaver::Parser
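
The boundary rule described above can be sketched with stand-in classes. Every name below (RawBackend, RawTree, WrappedLanguage, WrappedTree, FacadeParser) is hypothetical and exists only to illustrate the pattern; none of it is TreeHaver's own code.

```ruby
# Hypothetical stand-ins illustrating "wrap and unwrap only at the facade".
RawTree = Struct.new(:source)

class RawBackend
  # The backend only ever stores and returns raw objects.
  attr_accessor :language

  def parse(source)
    RawTree.new(source)
  end
end

WrappedLanguage = Struct.new(:inner)
WrappedTree = Struct.new(:inner, :source)

class FacadeParser
  def initialize(backend)
    @backend = backend
  end

  # Unwrap on the way in: the backend never sees WrappedLanguage.
  def language=(lang)
    @backend.language = lang.respond_to?(:inner) ? lang.inner : lang
  end

  # Wrap on the way out: callers never see RawTree directly.
  def parse(source)
    WrappedTree.new(@backend.parse(source), source)
  end
end

backend = RawBackend.new
parser = FacadeParser.new(backend)
parser.language = WrappedLanguage.new(:toml_grammar)
tree = parser.parse("a = 1")
```

Because both conversions live in the facade, a new backend in this sketch would only need to implement language= and parse on raw objects.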

Examples:

Basic parsing

parser = TreeHaver::Parser.new
parser.language = TreeHaver::Language.toml
tree = parser.parse("[package]\nname = \"foo\"")


Constructor Details

#initialize(backend: nil) ⇒ Parser

Create a new parser instance

Examples:

Default (uses context/global)

parser = TreeHaver::Parser.new

Explicit backend

parser = TreeHaver::Parser.new(backend: :ffi)

Parameters:

  • backend (Symbol, String, nil) (defaults to: nil)

    optional backend to use (overrides context/global)

Raises:

  • (NotAvailable)

    if no backend is available or requested backend is unavailable



# File 'lib/tree_haver.rb', line 982

def initialize(backend: nil)
  # Convert string backend names to symbols for consistency
  backend = backend.to_sym if backend.is_a?(String)

  mod = TreeHaver.resolve_backend_module(backend)

  if mod.nil?
    if backend
      raise NotAvailable, "Requested backend #{backend.inspect} is not available"
    else
      raise NotAvailable, "No TreeHaver backend is available"
    end
  end

  # Try to create the parser, with fallback to Citrus if tree-sitter fails
  # This enables auto-fallback when tree-sitter runtime isn't available
  begin
    @impl = mod::Parser.new
    @explicit_backend = backend  # Remember for introspection (always a Symbol or nil)
  rescue NoMethodError, FFI::NotFoundError, LoadError => e
    # Tree-sitter backend failed (likely missing runtime library)
    # Try Citrus as fallback if we weren't explicitly asked for a specific backend
    if backend.nil? || backend == :auto
      if Backends::Citrus.available?
        @impl = Backends::Citrus::Parser.new
        @explicit_backend = :citrus
      else
        # No fallback available: raise NotAvailable, including the original error message
        raise NotAvailable, "Tree-sitter backend failed: #{e.message}. " \
          "Citrus fallback not available. Install tree-sitter runtime or citrus gem."
      end
    else
      # Explicit backend was requested, don't fallback
      raise
    end
  end
end
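
The fallback rule in the constructor (fall back only when no specific backend was requested, otherwise let the error propagate) can be isolated in a small sketch. PrimaryParser, FallbackParser, and build_parser are hypothetical stand-ins; only the control flow mirrors the listing above.

```ruby
# Hypothetical stand-ins; only the control flow mirrors the constructor above.
class PrimaryParser
  def initialize
    # Simulate a missing native runtime.
    raise LoadError, "runtime library not found"
  end
end

class FallbackParser; end

def build_parser(requested)
  PrimaryParser.new
rescue LoadError
  # An explicit request is honored strictly: the original error propagates.
  raise unless requested.nil? || requested == :auto

  FallbackParser.new
end

auto = build_parser(nil)

explicit_error =
  begin
    build_parser(:primary)
  rescue LoadError => e
    e
  end
```

In this sketch the automatic case quietly yields a FallbackParser, while the explicit case surfaces the LoadError to the caller.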

Instance Method Details

#backend ⇒ Symbol

Get the backend this parser is using (for introspection)

Returns the actual backend in use, resolving :auto to the concrete backend.

Returns:

  • (Symbol)

    the backend name (:mri, :rust, :ffi, :java, or :citrus)



# File 'lib/tree_haver.rb', line 1025

def backend
  if @explicit_backend && @explicit_backend != :auto
    @explicit_backend
  else
    # Determine actual backend from the implementation class
    case @impl.class.name
    when /MRI/
      :mri
    when /Rust/
      :rust
    when /FFI/
      :ffi
    when /Java/
      :java
    when /Citrus/
      :citrus
    else
      # Fallback to effective_backend if we can't determine from class name
      TreeHaver.effective_backend
    end
  end
end
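
The class-name lookup above can be tried in isolation. The helper below is a hypothetical free-standing version, not part of TreeHaver. The name TreeHaver::Backends::Citrus::Parser follows from the listings on this page; the default: keyword is an illustrative stand-in for the TreeHaver.effective_backend fallback.

```ruby
# Hypothetical free-standing version of the class-name lookup shown above.
def backend_from_class_name(class_name, default: nil)
  case class_name
  when /MRI/ then :mri
  when /Rust/ then :rust
  when /FFI/ then :ffi
  when /Java/ then :java
  when /Citrus/ then :citrus
  else default
  end
end

citrus = backend_from_class_name("TreeHaver::Backends::Citrus::Parser")
unknown = backend_from_class_name("Some::Other::Parser", default: :fallback)
# Anonymous classes have a nil name; Regexp#=== is false for nil, so the
# default applies here as well.
anonymous = backend_from_class_name(Class.new.name, default: :fallback)
```

Because the patterns match anywhere in the fully qualified name, the branch order decides the result if a name happens to contain more than one of these substrings.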

#language=(lang) ⇒ Language

Set the language grammar for this parser

Examples:

parser.language = TreeHaver::Language.from_library("/path/to/grammar.so")

Parameters:

  • lang (Language)

    the language to use for parsing

Returns:

  • (Language)

    the language that was set



# File 'lib/tree_haver.rb', line 1054

def language=(lang)
  # Check if this is a Citrus language - if so, we need a Citrus parser
  # This enables automatic backend switching when tree-sitter fails and
  # falls back to Citrus
  if lang.is_a?(Backends::Citrus::Language)
    unless @impl.is_a?(Backends::Citrus::Parser)
      # Switch to Citrus parser to match the Citrus language
      @impl = Backends::Citrus::Parser.new
      @explicit_backend = :citrus
    end
  end

  # Unwrap the language before passing to backend
  # Backends receive raw language objects, never TreeHaver wrappers
  inner_lang = unwrap_language(lang)
  @impl.language = inner_lang
  # Return the original (possibly wrapped) language for consistency
  lang # rubocop:disable Lint/Void (intentional return value)
end
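
A note on the explicit return value in the listing above: in Ruby, an assignment expression such as parser.language = lang always evaluates to the right-hand side, whatever the setter returns. The setter's own return value is only visible when the method is invoked through send or public_send. The Demo class below is hypothetical and exists only to show that behavior.

```ruby
# Hypothetical setter that deliberately returns something other than its argument.
class Demo
  def language=(lang)
    @language = lang
    :setter_return_value
  end
end

demo = Demo.new

# Value of the assignment expression: always the right-hand side.
via_assignment = (demo.language = :toml)

# Value returned by the method body itself.
via_send = demo.public_send(:language=, :toml)
```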

#parse(source) ⇒ Tree

Parse source code into a syntax tree

Examples:

tree = parser.parse("x = 1")
puts tree.root_node.type

Parameters:

  • source (String)

    the source code to parse (should be UTF-8)

Returns:

  • (Tree)

    the parsed syntax tree



# File 'lib/tree_haver.rb', line 1210

def parse(source)
  tree_impl = @impl.parse(source)
  # Wrap backend tree with source so Node#text works
  Tree.new(tree_impl, source: source)
end
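
The comment in the listing says the source is passed along so that Node#text works. One plausible way a wrapper can derive a node's text, assuming nodes expose start and end byte offsets as tree-sitter nodes do, is to slice the stored source by bytes. The text_for helper below is hypothetical and is not TreeHaver's implementation.

```ruby
# Hypothetical helper: extract a node's text from the original source using
# byte offsets. Slicing by bytes matters when the source contains multi-byte
# UTF-8 characters, where byte and character indices differ.
def text_for(source, start_byte, end_byte)
  source.byteslice(start_byte, end_byte - start_byte)
end

source = "\u00E9 = 1" # the accented letter occupies two bytes in UTF-8

identifier = text_for(source, 0, 2)
literal = text_for(source, 5, 6)
```

This is also why the documentation asks for UTF-8 input: byte offsets reported by the parser only line up with the string if both sides agree on the encoding.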

#parse_string(old_tree, source) ⇒ Tree

Parse source code into a syntax tree (with optional incremental parsing)

This method provides API compatibility with ruby_tree_sitter, which uses `parse_string(old_tree, source)`.

Incremental Parsing

tree-sitter supports **incremental parsing** where you can pass a previously parsed tree along with edit information to efficiently re-parse only the changed portions of source code. This is a major performance optimization for editors and IDEs that need to re-parse on every keystroke.

The workflow for incremental parsing is:

  1. Parse the initial source: `tree = parser.parse_string(nil, source)`

  2. User edits the source (e.g., inserts a character)

  3. Call `tree.edit(…)` to update the tree’s position data

  4. Re-parse with the old tree: `new_tree = parser.parse_string(tree, new_source)`

  5. tree-sitter reuses unchanged nodes, only re-parsing affected regions

TreeHaver passes the call through to the underlying backend when that backend supports incremental parsing (the MRI and Rust backends do); otherwise it falls back to a full parse. Check `TreeHaver.capabilities` to see whether the current backend supports it.
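
Step 3 of the workflow needs the byte offsets of the edit. As a rough sketch, the hypothetical helper below (not part of TreeHaver) derives them for a single contiguous edit by comparing the old and new source byte by byte. It covers only the three byte offsets; any further arguments that tree.edit expects are not addressed here.

```ruby
# Hypothetical helper (not part of TreeHaver): derive the byte offsets of a
# single contiguous edit by comparing the old and new source strings.
def edit_byte_range(old_source, new_source)
  old_bytes = old_source.bytes
  new_bytes = new_source.bytes
  limit = [old_bytes.length, new_bytes.length].min

  # Length of the common prefix, in bytes.
  prefix = 0
  prefix += 1 while prefix < limit && old_bytes[prefix] == new_bytes[prefix]

  # Length of the common suffix, never overlapping the prefix.
  suffix = 0
  max_suffix = limit - prefix
  suffix += 1 while suffix < max_suffix &&
    old_bytes[-1 - suffix] == new_bytes[-1 - suffix]

  {
    start_byte: prefix,
    old_end_byte: old_bytes.length - suffix,
    new_end_byte: new_bytes.length - suffix,
  }
end

edit = edit_byte_range("x = 1", "x = 42")
```

For the edit from "x = 1" to "x = 42" this yields start_byte 4, old_end_byte 5, and new_end_byte 6. Because the comparison is byte-wise, an offset can land inside a multi-byte character; an editor that already knows where the change happened would normally supply the offsets directly.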

Examples:

First parse (no old tree)

tree = parser.parse_string(nil, "x = 1")

Incremental parse

tree.edit(start_byte: 4, old_end_byte: 5, new_end_byte: 6, ...)
new_tree = parser.parse_string(tree, "x = 42")

Parameters:

  • old_tree (Tree, nil)

    previously parsed tree for incremental parsing, or nil for fresh parse

  • source (String)

    the source code to parse (should be UTF-8)

Returns:

  • (Tree)

    the parsed syntax tree




# File 'lib/tree_haver.rb', line 1249

def parse_string(old_tree, source)
  # Pass through to backend if it supports incremental parsing
  if old_tree && @impl.respond_to?(:parse_string)
    # Extract the underlying implementation from our Tree wrapper
    old_impl = if old_tree.respond_to?(:inner_tree)
      old_tree.inner_tree
    elsif old_tree.respond_to?(:instance_variable_get)
      # Fallback for compatibility
      old_tree.instance_variable_get(:@inner_tree) || old_tree.instance_variable_get(:@impl) || old_tree
    else
      old_tree
    end
    tree_impl = @impl.parse_string(old_impl, source)
    # Wrap backend tree with source so Node#text works
    Tree.new(tree_impl, source: source)
  elsif @impl.respond_to?(:parse_string)
    tree_impl = @impl.parse_string(nil, source)
    # Wrap backend tree with source so Node#text works
    Tree.new(tree_impl, source: source)
  else
    # Fallback for backends that don't support parse_string
    parse(source)
  end
end
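
The capability check in the listing above reduces to one question: does the backend respond to parse_string? The sketch below shows that dispatch with hypothetical stand-in backends (FullOnlyBackend, IncrementalBackend) and a hypothetical parse_with helper. Unwrapping and wrapping of trees are left out to keep the dispatch visible.

```ruby
# Hypothetical stand-ins showing the capability check used above.
class FullOnlyBackend
  def parse(source)
    [:full, source]
  end
end

class IncrementalBackend < FullOnlyBackend
  def parse_string(old_tree, source)
    [old_tree ? :incremental : :fresh, source]
  end
end

def parse_with(backend, old_tree, source)
  if backend.respond_to?(:parse_string)
    backend.parse_string(old_tree, source)
  else
    # The old tree is dropped: this backend can only do a full parse.
    backend.parse(source)
  end
end

full = parse_with(FullOnlyBackend.new, :old_tree, "x = 42")
incremental = parse_with(IncrementalBackend.new, :old_tree, "x = 42")
fresh = parse_with(IncrementalBackend.new, nil, "x = 42")
```

In the original method the first two branches differ only in whether an old tree has to be unwrapped first, which is why the sketch can merge them.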