Class: TreeHaver::Language
- Inherits: Object
- Hierarchy: Object > TreeHaver::Language
- Defined in: lib/tree_haver/language.rb
Overview
Represents a language grammar for parsing source code
Language is the entry point for loading and using grammars. It provides a unified interface that works across all backends (MRI, Rust, FFI, Java, Citrus).
For tree-sitter backends, languages are loaded from shared library files (.so/.dylib/.dll). For pure-Ruby backends (Citrus, Prism, Psych), languages are built-in or provided by gems.
Loading Languages
The primary way to load a language is via registration:
TreeHaver.register_language(:toml, path: "/path/to/libtree-sitter-toml.so")
language = TreeHaver::Language.toml
For explicit loading without registration:
language = TreeHaver::Language.from_library(
  "/path/to/libtree-sitter-toml.so",
  symbol: "tree_sitter_toml"
)
For ruby_tree_sitter compatibility:
language = TreeHaver::Language.load("toml", "/path/to/libtree-sitter-toml.so")
Class Method Summary
- .from_library(path, symbol: nil, name: nil, validate: true, backend: nil) ⇒ Language (also: from_path)
  Load a language grammar from a shared library.
- .load(name, path, validate: true) ⇒ Language
  Load a language grammar from a shared library (ruby_tree_sitter compatibility).
- .method_missing(method_name, *args, **kwargs, &block) ⇒ Language
  Dynamic helper to load a registered language by name.
- .respond_to_missing?(method_name, include_private = false) ⇒ Boolean (private)
Class Method Details
.from_library(path, symbol: nil, name: nil, validate: true, backend: nil) ⇒ Language Also known as: from_path
Load a language grammar from a shared library
The library must export a function that returns a pointer to a TSLanguage struct. By default, TreeHaver looks for a symbol named “tree_sitter_<name>”.
Security
By default, paths are validated using PathValidator to prevent path traversal and other attacks. Set `validate: false` to skip validation (not recommended unless you’ve already validated the path).
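The source for this method rejects symbol names that are not valid C identifiers. A minimal sketch of what such a check could look like follows; the helper name `c_identifier?` and the regular expression are illustrative assumptions, not PathValidator's actual implementation.

```ruby
# Hypothetical helper, NOT part of TreeHaver. It only illustrates what
# "valid C identifier" means for an exported symbol name: a letter or
# underscore, followed by letters, digits, or underscores.
C_IDENTIFIER = /\A[A-Za-z_][A-Za-z0-9_]*\z/

def c_identifier?(name)
  name.is_a?(String) && C_IDENTIFIER.match?(name)
end

c_identifier?("tree_sitter_toml")  # a typical grammar symbol passes
c_identifier?("tree-sitter-toml")  # hyphens are not allowed in C identifiers
c_identifier?("toml; rm -rf /")    # spaces and shell metacharacters are rejected
```

Anchoring the pattern with `\A` and `\z` rather than `^` and `$` matters here, since the latter match at line boundaries and would let a multi-line string through.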
# File 'lib/tree_haver/language.rb', line 83

def from_library(path, symbol: nil, name: nil, validate: true, backend: nil)
  if validate
    unless PathValidator.safe_library_path?(path)
      errors = PathValidator.validation_errors(path)
      raise ArgumentError, "Unsafe library path: #{path.inspect}. Errors: #{errors.join("; ")}"
    end

    if symbol && !PathValidator.safe_symbol_name?(symbol)
      raise ArgumentError, "Unsafe symbol name: #{symbol.inspect}. " \
        "Symbol names must be valid C identifiers."
    end
  end

  # from_library only works with tree-sitter backends that support .so files
  # Pure Ruby backends (Citrus, Prism, Psych, Commonmarker, Markly) don't support from_library
  mod = TreeHaver.resolve_native_backend_module(backend)

  if mod.nil?
    if backend
      raise NotAvailable, "Requested backend #{backend.inspect} is not available or does not support shared libraries"
    else
      raise NotAvailable, "No native tree-sitter backend is available for loading shared libraries. " \
        "Available native backends (MRI, Rust, FFI, Java) require platform-specific setup. " \
        "For pure-Ruby parsing, use backend-specific Language classes directly (e.g., Prism, Psych, Citrus)."
    end
  end

  # Backend must implement .from_library; fallback to .from_path for older impls
  # Include effective backend AND ENV vars in cache key since they affect loading
  effective_b = TreeHaver.resolve_effective_backend(backend)
  key = [effective_b, path, symbol, name, ENV["TREE_SITTER_LANG_SYMBOL"]]
  LanguageRegistry.fetch(key) do
    if mod::Language.respond_to?(:from_library)
      mod::Language.from_library(path, symbol: symbol, name: name)
    else
      mod::Language.from_path(path)
    end
  end
end
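The cache key in the listing above is an array that includes the effective backend, so the same library path loaded under two backends yields two separate cache entries. The sketch below illustrates that behavior with a hypothetical `TinyRegistry` class; only the fetch-with-block shape is taken from the source, and the backend symbols `:ffi` and `:mri` are assumed values. LanguageRegistry's real implementation may differ.

```ruby
# Hypothetical stand-in for a language cache; NOT TreeHaver's LanguageRegistry.
class TinyRegistry
  def initialize
    @store = {}
    @lock = Mutex.new
  end

  # Return the cached value for key, computing it with the block on a miss.
  def fetch(key)
    @lock.synchronize do
      @store.fetch(key) { @store[key] = yield }
    end
  end
end

registry = TinyRegistry.new
loads = 0

# Arrays compare by content, so equal arrays address the same cache entry.
key_a = [:ffi, "/path/to/libtree-sitter-toml.so", "tree_sitter_toml", "toml", nil]
key_b = [:mri, "/path/to/libtree-sitter-toml.so", "tree_sitter_toml", "toml", nil]

first  = registry.fetch(key_a) { loads += 1; Object.new }
second = registry.fetch(key_a) { loads += 1; Object.new } # cache hit, block not run
third  = registry.fetch(key_b) { loads += 1; Object.new } # different backend, new entry
```

Here `first` and `second` are the same object, while `third` is a distinct one, because the keys differ only in their first element.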
.load(name, path, validate: true) ⇒ Language
Load a language grammar from a shared library (ruby_tree_sitter compatibility)
This method provides API compatibility with ruby_tree_sitter, which uses `Language.load(name, path)`.
# File 'lib/tree_haver/language.rb', line 48

def load(name, path, validate: true)
  from_library(path, symbol: "tree_sitter_#{name}", name: name, validate: validate)
end
.method_missing(method_name, *args, **kwargs, &block) ⇒ Language
Dynamic helper to load a registered language by name
After registering a language with TreeHaver.register_language, you can load it using a method call. The appropriate backend will be used based on registration and current backend.
# File 'lib/tree_haver/language.rb', line 149

def method_missing(method_name, *args, **kwargs, &block)
  # Resolve only if the language name was registered
  all_backends = TreeHaver.registered_language(method_name)
  return super unless all_backends

  # Check current backend
  current_backend = TreeHaver.backend_module

  # Determine which backend type to use
  backend_type = if current_backend == Backends::Citrus
    :citrus
  else
    :tree_sitter  # MRI, Rust, FFI, Java all use tree-sitter
  end

  # Get backend-specific registration
  reg = all_backends[backend_type]

  # If Citrus backend is active
  if backend_type == :citrus
    if reg && reg[:grammar_module]
      return Backends::Citrus::Language.new(reg[:grammar_module])
    end

    # Fall back to error if no Citrus grammar registered
    raise NotAvailable, "Citrus backend is active but no Citrus grammar registered for :#{method_name}. " \
      "Either register a Citrus grammar or use a tree-sitter backend. " \
      "Registered backends: #{all_backends.keys.inspect}"
  end

  # For tree-sitter backends, try to load from path
  # If that fails, fall back to Citrus if available
  if reg && reg[:path]
    path = kwargs[:path] || args.first || reg[:path]
    # Symbol priority: kwargs override > registration > derive from method_name
    symbol = if kwargs.key?(:symbol)
      kwargs[:symbol]
    elsif reg[:symbol]
      reg[:symbol]
    else
      "tree_sitter_#{method_name}"
    end
    # Name priority: kwargs override > derive from symbol (strip tree_sitter_ prefix)
    # Using symbol-derived name ensures ruby_tree_sitter gets the correct language name
    # e.g., "toml" not "toml_both" when symbol is "tree_sitter_toml"
    name = kwargs[:name] || symbol&.sub(/\Atree_sitter_/, "")

    begin
      return from_library(path, symbol: symbol, name: name)
    rescue NotAvailable, ArgumentError, LoadError => e
      # Tree-sitter failed to load - check for Citrus fallback
      handle_tree_sitter_load_failure(e, all_backends)
    rescue => e
      # Also catch FFI::NotFoundError if FFI is loaded (can't reference directly as FFI may not exist)
      if defined?(::FFI::NotFoundError) && e.is_a?(::FFI::NotFoundError)
        handle_tree_sitter_load_failure(e, all_backends)
      else
        raise
      end
    end
  end

  # No tree-sitter path registered - check for Citrus fallback
  # This enables auto-fallback when tree-sitter grammar is not installed
  # but a Citrus grammar (pure Ruby) is available
  citrus_reg = all_backends[:citrus]
  if citrus_reg && citrus_reg[:grammar_module]
    return Backends::Citrus::Language.new(citrus_reg[:grammar_module])
  end

  # No appropriate registration found
  raise ArgumentError, "No grammar registered for :#{method_name} compatible with #{backend_type} backend. " \
    "Registered backends: #{all_backends.keys.inspect}"
end
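The symbol and name priority rules in the listing above can be isolated into a small function for illustration. This is a sketch only: `resolve_symbol_and_name` is a hypothetical helper that does not exist in TreeHaver, `reg` stands in for the registration hash, and `kwargs` for the keyword arguments of the dynamic call.

```ruby
# Hypothetical helper mirroring the priority rules in the listing above.
def resolve_symbol_and_name(method_name, reg, kwargs = {})
  symbol =
    if kwargs.key?(:symbol)
      kwargs[:symbol]                # explicit override wins, even if nil
    elsif reg[:symbol]
      reg[:symbol]                   # value stored at registration time
    else
      "tree_sitter_#{method_name}"   # conventional default
    end

  # Name: explicit override, else the symbol without its tree_sitter_ prefix.
  name = kwargs[:name] || symbol&.sub(/\Atree_sitter_/, "")
  [symbol, name]
end

resolve_symbol_and_name(:toml, {})
# => ["tree_sitter_toml", "toml"]
resolve_symbol_and_name(:toml_both, { symbol: "tree_sitter_toml" })
# => ["tree_sitter_toml", "toml"]
resolve_symbol_and_name(:toml, {}, { symbol: "my_toml", name: "toml" })
# => ["my_toml", "toml"]
```

The second call shows the case the source comments mention: a language registered as `:toml_both` still resolves to the name "toml" because the name is derived from the symbol, not from the method name.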
.respond_to_missing?(method_name, include_private = false) ⇒ Boolean
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
# File 'lib/tree_haver/language.rb', line 227

def respond_to_missing?(method_name, include_private = false)
  !!TreeHaver.registered_language(method_name) || super
end
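In Ruby, `respond_to_missing?` is the usual companion to `method_missing`: it lets `respond_to?` report true for names that are only handled dynamically. The miniature below sketches that pairing with a registry lookup. `MiniLanguage`, its `REGISTRY` constant, and the returned string are all hypothetical and much simpler than the TreeHaver code.

```ruby
# Hypothetical miniature of the pattern; NOT TreeHaver's implementation.
class MiniLanguage
  REGISTRY = {}

  class << self
    def register(name, path)
      REGISTRY[name.to_sym] = path
    end

    # Handle calls such as MiniLanguage.toml for registered names only.
    def method_missing(method_name, *args, **kwargs, &block)
      path = REGISTRY[method_name]
      return super unless path

      "loaded #{method_name} from #{path}"
    end

    private

    # Keep respond_to? consistent with what method_missing accepts.
    def respond_to_missing?(method_name, include_private = false)
      REGISTRY.key?(method_name) || super
    end
  end
end

MiniLanguage.register(:toml, "/path/to/libtree-sitter-toml.so")
MiniLanguage.respond_to?(:toml)  # true, because :toml is registered
MiniLanguage.respond_to?(:yaml)  # false, nothing registered under :yaml
MiniLanguage.toml                # dispatches through method_missing
```

Calling an unregistered name such as `MiniLanguage.yaml` falls through to `super` and raises NoMethodError, which is the behavior callers expect from an undefined method.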