Class: TreeHaver::Parser

Inherits:
Object
  • Object
Defined in:
lib/tree_haver/parser.rb

Overview

Represents a tree-sitter parser instance

A Parser is used to parse source code into a syntax tree. You must set a language before parsing.

Wrapping/Unwrapping Responsibility

TreeHaver::Parser is responsible for ALL object wrapping and unwrapping:

**Language objects:**

  • Unwraps Language wrappers before passing to backend.language=

  • MRI backend receives ::TreeSitter::Language

  • Rust backend receives String (language name)

  • FFI backend receives wrapped Language (needs to_ptr)

**Tree objects:**

  • parse() receives raw source, backend returns raw tree, Parser wraps it

  • parse_string() unwraps old_tree before passing to backend, wraps returned tree

  • Backends always work with raw backend trees, never TreeHaver::Tree

**Node objects:**

  • Backends return raw nodes, TreeHaver::Tree and TreeHaver::Node wrap them

This design ensures:

  • Principle of Least Surprise: wrapping happens at boundaries, consistently

  • Backends are simple: they don’t need to know about TreeHaver wrappers

  • Single Responsibility: wrapping logic is only in TreeHaver::Parser
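The boundary described above can be illustrated with a minimal sketch. Every class in it (`FakeBackendParser`, `SketchLanguage`, `SketchTree`, `SketchParser`) is a hypothetical stand-in rather than a TreeHaver class; the sketch only shows wrappers being removed on the way into a backend and added on the way out.

```ruby
# Hypothetical stand-ins; none of these are TreeHaver classes.

# A "backend" that only understands raw values.
class FakeBackendParser
  attr_reader :language

  def language=(raw)
    raise TypeError, "backend expects a raw String" unless raw.is_a?(String)
    @language = raw
  end

  # Returns a raw "tree" (here, just a Hash).
  def parse(source)
    { language: @language, bytes: source.bytesize }
  end
end

# Wrapper around a raw language value.
SketchLanguage = Struct.new(:inner)

# Wrapper around a raw backend tree plus the source it was parsed from.
SketchTree = Struct.new(:inner_tree, :source)

# The only place that wraps or unwraps anything.
class SketchParser
  def initialize(backend = FakeBackendParser.new)
    @impl = backend
  end

  # Unwrap on the way in: the backend never sees SketchLanguage.
  def language=(lang)
    @impl.language = lang.respond_to?(:inner) ? lang.inner : lang
  end

  # Wrap on the way out: callers never see the raw Hash directly.
  def parse(source)
    SketchTree.new(@impl.parse(source), source)
  end
end

parser = SketchParser.new
parser.language = SketchLanguage.new("toml")
tree = parser.parse("a = 1")
```

Because the fake backend raises on anything other than a raw String, it would fail loudly if a wrapper ever leaked through, which is the property the design aims for.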

Examples:

Basic parsing

parser = TreeHaver::Parser.new
parser.language = TreeHaver::Language.toml
tree = parser.parse("[package]\nname = \"foo\"")

Constructor Details

#initialize(backend: nil) ⇒ Parser

Create a new parser instance

Examples:

Default (uses context/global)

parser = TreeHaver::Parser.new

Explicit backend

parser = TreeHaver::Parser.new(backend: :ffi)

Parameters:

  • backend (Symbol, String, nil) (defaults to: nil)

    optional backend to use (overrides context/global)

Raises:

  • (NotAvailable)

    if no backend is available or requested backend is unavailable



# File 'lib/tree_haver/parser.rb', line 45

def initialize(backend: nil)
  # Convert string backend names to symbols for consistency
  backend = backend.to_sym if backend.is_a?(String)

  mod = TreeHaver.resolve_backend_module(backend)

  if mod.nil?
    if backend
      raise NotAvailable, "Requested backend #{backend.inspect} is not available"
    else
      raise NotAvailable, "No TreeHaver backend is available"
    end
  end

  # Try to create the parser, with fallback to Citrus if tree-sitter fails
  # This enables auto-fallback when tree-sitter runtime isn't available
  begin
    @impl = mod::Parser.new
    @explicit_backend = backend  # Remember for introspection (always a Symbol or nil)
  rescue NoMethodError, LoadError => e
    handle_parser_creation_failure(e, backend)
  rescue => e
    # Also catch FFI::NotFoundError if FFI is loaded (can't reference directly as FFI may not exist)
    if defined?(::FFI::NotFoundError) && e.is_a?(::FFI::NotFoundError)
      handle_parser_creation_failure(e, backend)
    else
      raise
    end
  end
end
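The final `rescue` branch relies on a general Ruby idiom: `defined?` returns `nil` instead of raising `NameError` when a constant is missing, so an error class from an optional dependency can be tested without loading that dependency. A minimal sketch of the idiom follows, using a hypothetical `OptionalLib::NotFoundError` in place of `FFI::NotFoundError` so that it runs whether or not the ffi gem is installed.

```ruby
# OptionalLib is a hypothetical optional dependency used only for this sketch.
def missing_symbol_error?(error)
  # defined? yields nil (no NameError) while the constant is absent,
  # so the is_a? check is only reached once the library is loaded.
  !!(defined?(::OptionalLib::NotFoundError) && error.is_a?(::OptionalLib::NotFoundError))
end

# Before the library is "loaded", nothing can match.
before = missing_symbol_error?(RuntimeError.new("boom"))

# Simulate the optional library being loaded later in the process.
module OptionalLib
  class NotFoundError < StandardError; end
end

after_match = missing_symbol_error?(OptionalLib::NotFoundError.new("symbol not found"))
after_other = missing_symbol_error?(RuntimeError.new("boom"))
```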

Instance Method Details

#backend ⇒ Symbol

Get the backend this parser is using (for introspection)

Returns the actual backend in use, resolving :auto to the concrete backend.

Returns:

  • (Symbol)

    the backend name (:mri, :rust, :ffi, :java, or :citrus)



# File 'lib/tree_haver/parser.rb', line 105

def backend
  if @explicit_backend && @explicit_backend != :auto
    @explicit_backend
  else
    # Determine actual backend from the implementation class
    case @impl.class.name
    when /MRI/
      :mri
    when /Rust/
      :rust
    when /FFI/
      :ffi
    when /Java/
      :java
    when /Citrus/
      :citrus
    else
      # Fallback to effective_backend if we can't determine from class name
      TreeHaver.effective_backend
    end
  end
end
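The class-name matching above can be tried in isolation. The sketch below assumes a hypothetical `Sketch::Backends` namespace that mimics the shape of the backend modules, and a hypothetical `backend_for` helper; it uses a lookup table instead of `case`, which is a stylistic substitution rather than the method's actual implementation.

```ruby
# Hypothetical namespaces shaped like "<Root>::Backends::<Name>::Parser".
module Sketch
  module Backends
    module MRI;    class Parser; end; end
    module FFI;    class Parser; end; end
    module Citrus; class Parser; end; end
  end
end

# Same patterns and symbols as the case statement, in the same order.
BACKEND_PATTERNS = {
  /MRI/ => :mri,
  /Rust/ => :rust,
  /FFI/ => :ffi,
  /Java/ => :java,
  /Citrus/ => :citrus
}.freeze

# Hypothetical helper: map an implementation object to a backend symbol.
def backend_for(impl, default: :unknown)
  # Anonymous classes have a nil name, so coerce to a String first.
  name = impl.class.name.to_s
  match = BACKEND_PATTERNS.find { |pattern, _symbol| pattern.match?(name) }
  match ? match.last : default
end

mri_backend    = backend_for(Sketch::Backends::MRI::Parser.new)
citrus_backend = backend_for(Sketch::Backends::Citrus::Parser.new)
anonymous      = backend_for(Class.new.new)
```

Matching on class names is order-sensitive: a class whose fully qualified name happened to contain two of these substrings would resolve to whichever pattern is checked first.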

#handle_parser_creation_failure(error, backend) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Handle parser creation failure with optional Citrus fallback

Parameters:

  • error (Exception)

    the error that caused parser creation to fail

  • backend (Symbol, nil)

    the requested backend

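Raises:

  • (NotAvailable)

    if no specific backend was requested and the Citrus fallback is not available

  • (Exception)

    the original error, re-raised when a specific backend was requested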



# File 'lib/tree_haver/parser.rb', line 82

def handle_parser_creation_failure(error, backend)
  # Tree-sitter backend failed (likely missing runtime library)
  # Try Citrus as fallback if we weren't explicitly asked for a specific backend
  if backend.nil? || backend == :auto
    if Backends::Citrus.available?
      @impl = Backends::Citrus::Parser.new
      @explicit_backend = :citrus
    else
      # No fallback available: raise NotAvailable, including the original error's message
      raise NotAvailable, "Tree-sitter backend failed: #{error.message}. " \
        "Citrus fallback not available. Install tree-sitter runtime or citrus gem."
    end
  else
    # Explicit backend was requested, don't fallback
    raise error
  end
end
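The fallback rule reduces to a small decision: fall back only when the caller did not ask for a specific backend. The sketch below restates that rule as a pure function. `SketchNotAvailable`, `resolve_after_failure`, and the `fallback_available:` keyword are hypothetical names introduced for illustration; the keyword stands in for `Backends::Citrus.available?`.

```ruby
# Hypothetical error class standing in for NotAvailable.
class SketchNotAvailable < StandardError; end

# Decide what to do after the primary backend failed to build a parser.
# Returns :citrus when falling back, otherwise raises.
def resolve_after_failure(error, backend, fallback_available:)
  explicit = !(backend.nil? || backend == :auto)

  # An explicit request is honored strictly: surface the original error.
  raise error if explicit

  # Implicit or :auto selection may fall back quietly.
  return :citrus if fallback_available

  raise SketchNotAvailable, "backend failed: #{error.message}; no fallback available"
end

failure = LoadError.new("libtree-sitter not found")
chosen  = resolve_after_failure(failure, nil, fallback_available: true)
```

Keeping explicit requests strict means a caller who asks for, say, `:ffi` never silently receives a different parser.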

#language=(lang) ⇒ Language

Set the language grammar for this parser

Examples:

parser.language = TreeHaver::Language.from_library("/path/to/grammar.so")

Parameters:

  • lang (Language)

    the language to use for parsing

Returns:

  • (Language)

    the language that was set



# File 'lib/tree_haver/parser.rb', line 134

def language=(lang)
  # Check if this is a Citrus language - if so, we need a Citrus parser
  # This enables automatic backend switching when tree-sitter fails and
  # falls back to Citrus
  if lang.is_a?(Backends::Citrus::Language)
    unless @impl.is_a?(Backends::Citrus::Parser)
      # Switch to Citrus parser to match the Citrus language
      @impl = Backends::Citrus::Parser.new
      @explicit_backend = :citrus
    end
  end

  # Unwrap the language before passing to backend
  # Backends receive raw language objects, never TreeHaver wrappers
  inner_lang = unwrap_language(lang)
  @impl.language = inner_lang
  # Return the original (possibly wrapped) language for consistency
  lang # rubocop:disable Lint/Void (intentional return value)
end
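A general Ruby note on the trailing `lang`, not specific to TreeHaver: with assignment syntax (`parser.language = x`) the expression always evaluates to the right-hand side, whatever the setter returns. The method's own return value is only visible to callers that invoke it explicitly, for example through `public_send`. The sketch below uses a hypothetical `SetterSketch` class to show the difference.

```ruby
# Hypothetical class used only to demonstrate setter return semantics.
class SetterSketch
  def language=(lang)
    @language = lang
    :return_value_of_method
  end
end

obj = SetterSketch.new

# Assignment syntax evaluates to the right-hand side.
via_assignment = (obj.language = "toml")

# An explicit call exposes what the method body actually returned.
via_send = obj.public_send(:language=, "toml")
```

Returning `lang` explicitly therefore keeps both call styles consistent.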

#parse(source) ⇒ Tree

Parse source code into a syntax tree

Examples:

tree = parser.parse("x = 1")
puts tree.root_node.type

Parameters:

  • source (String)

    the source code to parse (should be UTF-8)

Returns:

  • (Tree)

    the parsed syntax tree



# File 'lib/tree_haver/parser.rb', line 161

def parse(source)
  tree_impl = @impl.parse(source)
  # Wrap backend tree with source so Node#text works
  Tree.new(tree_impl, source: source)
end
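To see why the tree needs the source, consider how a node's text can be recovered from byte offsets. The sketch below is an assumption about the general approach, not TreeHaver's actual `Node#text`; `SketchNode` is a hypothetical struct. It also shows why byte-based slicing matters for UTF-8 input.

```ruby
# Hypothetical node that knows only its byte range and the original source.
SketchNode = Struct.new(:start_byte, :end_byte, :source) do
  # Slice by bytes, not characters: tree-sitter positions are byte offsets.
  def text
    source.byteslice(start_byte, end_byte - start_byte)
  end
end

source = "é = 1"                      # "é" occupies two bytes in UTF-8
node   = SketchNode.new(5, 6, source) # byte range of the literal 1

by_bytes = node.text
by_chars = source[5...6]              # character indexing lands past the end
```

Here `by_bytes` is the digit, while character indexing with the same offsets yields an empty string, because the string has six bytes but only five characters.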

#parse_string(old_tree, source) ⇒ Tree

Parse source code into a syntax tree (with optional incremental parsing)

This method provides API compatibility with ruby_tree_sitter, which uses `parse_string(old_tree, source)`.

Incremental Parsing

tree-sitter supports **incremental parsing** where you can pass a previously parsed tree along with edit information to efficiently re-parse only the changed portions of source code. This is a major performance optimization for editors and IDEs that need to re-parse on every keystroke.

The workflow for incremental parsing is:

  1. Parse the initial source: `tree = parser.parse_string(nil, source)`

  2. User edits the source (e.g., inserts a character)

  3. Call `tree.edit(…)` to update the tree’s position data

  4. Re-parse with the old tree: `new_tree = parser.parse_string(tree, new_source)`

  5. tree-sitter reuses unchanged nodes, only re-parsing affected regions

TreeHaver passes through to the underlying backend if it supports incremental parsing (MRI and Rust backends do). Check TreeHaver.capabilities[:incremental] to see if the current backend supports it.
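Step 3 of the workflow needs the byte offsets of the edit. One way to derive them for a single contiguous change is to trim the common prefix and suffix of the old and new source. The helper below, `edit_byte_range`, is hypothetical and not part of TreeHaver; it computes only the three byte offsets and leaves out any other arguments an edit call may require.

```ruby
# Hypothetical helper (not part of TreeHaver): derive byte offsets for a
# single contiguous edit by trimming the common prefix and suffix.
def edit_byte_range(old_source, new_source)
  old_bytes = old_source.bytes
  new_bytes = new_source.bytes
  max_common = [old_bytes.length, new_bytes.length].min

  # Length of the shared leading bytes.
  prefix = 0
  prefix += 1 while prefix < max_common && old_bytes[prefix] == new_bytes[prefix]

  # Length of the shared trailing bytes, never overlapping the prefix.
  suffix = 0
  suffix += 1 while suffix < max_common - prefix &&
                    old_bytes[-1 - suffix] == new_bytes[-1 - suffix]

  {
    start_byte: prefix,
    old_end_byte: old_bytes.length - suffix,
    new_end_byte: new_bytes.length - suffix
  }
end

offsets = edit_byte_range("x = 1", "x = 42")
```

For the change from `"x = 1"` to `"x = 42"` this yields start byte 4, old end byte 5, and new end byte 6. Since the comparison is byte-wise, a boundary can fall inside a multi-byte character when the two sources share only part of its encoding; an editor that already tracks its edits would normally supply the offsets directly.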

Examples:

First parse (no old tree)

tree = parser.parse_string(nil, "x = 1")

Incremental parse

tree.edit(start_byte: 4, old_end_byte: 5, new_end_byte: 6, ...)
new_tree = parser.parse_string(tree, "x = 42")

Parameters:

  • old_tree (Tree, nil)

    previously parsed tree for incremental parsing, or nil for fresh parse

  • source (String)

    the source code to parse (should be UTF-8)

Returns:

  • (Tree)

    the parsed syntax tree



# File 'lib/tree_haver/parser.rb', line 200

def parse_string(old_tree, source)
  # Pass through to backend if it supports incremental parsing
  if old_tree && @impl.respond_to?(:parse_string)
    # Extract the underlying implementation from our Tree wrapper
    old_impl = if old_tree.respond_to?(:inner_tree)
      old_tree.inner_tree
    elsif old_tree.respond_to?(:instance_variable_get)
      # Fallback for compatibility
      old_tree.instance_variable_get(:@inner_tree) || old_tree.instance_variable_get(:@impl) || old_tree
    else
      old_tree
    end
    tree_impl = @impl.parse_string(old_impl, source)
    # Wrap backend tree with source so Node#text works
    Tree.new(tree_impl, source: source)
  elsif @impl.respond_to?(:parse_string)
    tree_impl = @impl.parse_string(nil, source)
    # Wrap backend tree with source so Node#text works
    Tree.new(tree_impl, source: source)
  else
    # Fallback for backends that don't support parse_string
    parse(source)
  end
end
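The dispatch above can be exercised without a real backend. The sketch below uses two hypothetical backends, one with `parse_string` and one without, and a simplified `parse_with` helper. It folds the two `parse_string` branches into one and omits both the `instance_variable_get` compatibility path and the wrapping of the result in `Tree`, so it illustrates the capability check rather than reproducing the method.

```ruby
# Hypothetical backend that supports incremental parsing.
class IncrementalBackend
  def parse(source); [:full, source]; end
  def parse_string(old, source); [old ? :incremental : :fresh, source]; end
end

# Hypothetical backend that only supports full parses.
class SimpleBackend
  def parse(source); [:full, source]; end
end

# Hypothetical wrapper exposing the raw tree through #inner_tree.
SketchOldTree = Struct.new(:inner_tree)

def parse_with(impl, old_tree, source)
  # Backends without parse_string get a plain full parse.
  return impl.parse(source) unless impl.respond_to?(:parse_string)

  # Unwrap if possible; nil and raw trees pass through unchanged.
  old_impl = old_tree.respond_to?(:inner_tree) ? old_tree.inner_tree : old_tree
  impl.parse_string(old_impl, source)
end

fresh       = parse_with(IncrementalBackend.new, nil, "x = 1")
incremental = parse_with(IncrementalBackend.new, SketchOldTree.new(:raw_tree), "x = 42")
fallback    = parse_with(SimpleBackend.new, SketchOldTree.new(:raw_tree), "x = 42")
```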