Module: TreeHaver::LibraryPathUtils

Defined in:
lib/tree_haver/library_path_utils.rb

Overview

Utility methods for deriving tree-sitter symbol and language names from library paths

This module provides consistent path parsing for every backend that loads tree-sitter grammars from shared library files (.so, .dylib, .dll).

Examples:

TreeHaver::LibraryPathUtils.derive_symbol_from_path("/usr/lib/libtree-sitter-toml.so")
# => "tree_sitter_toml"

TreeHaver::LibraryPathUtils.derive_language_name_from_path("/usr/lib/libtree-sitter-toml.so")
# => "toml"

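Class Method Summary

  • .derive_language_name_from_path(path) ⇒ String?
    Derive the language name from a library path

  • .derive_language_name_from_symbol(symbol) ⇒ String?
    Derive language name from a symbol

  • .derive_symbol_from_path(path) ⇒ String?
    Derive the tree-sitter symbol name from a library path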

Class Method Details

.derive_language_name_from_path(path) ⇒ String?

Derive the language name from a library path

Language names are the short identifiers (e.g., “toml”, “json”, “ruby”) used by some backends (like tree_stump/Rust) to register grammars.

Parameters:

  • path like “/usr/lib/libtree-sitter-toml.so”

Returns:

  • language name like “toml”, or nil if path is nil



# File 'lib/tree_haver/library_path_utils.rb', line 62

def derive_language_name_from_path(path)
  symbol = derive_symbol_from_path(path)
  return unless symbol

  # Strip the "tree_sitter_" prefix to get the language name
  symbol.sub(/\Atree_sitter_/, "")
end
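As a rough illustration of how a backend might use these language names, the sketch below builds a lookup table from language name to library path. `LibraryPathUtilsSketch` and `GRAMMAR_PATHS` are hypothetical names; the stand-in module condenses the source shown on this page so the example runs without the gem, and the example paths are invented.

```ruby
# Hypothetical stand-in that condenses the documented source so the
# sketch runs without loading TreeHaver itself.
module LibraryPathUtilsSketch
  module_function

  def derive_symbol_from_path(path)
    return unless path

    # Drop the last extension, then any remaining ".so" or ".so.N" suffix.
    filename = File.basename(path, ".*").sub(/\.so(\.\d+)*\z/, "")

    case filename
    when /\A(?:lib[-_]?)?tree[-_]sitter[-_](.+)\z/
      "tree_sitter_#{Regexp.last_match(1).tr("-", "_")}"
    else
      "tree_sitter_#{filename.sub(/\Alib/, "").tr("-", "_")}"
    end
  end

  def derive_language_name_from_path(path)
    symbol = derive_symbol_from_path(path)
    return unless symbol

    symbol.sub(/\Atree_sitter_/, "")
  end
end

# Example paths are made up for illustration.
paths = [
  "/usr/lib/libtree-sitter-toml.so",
  "/opt/grammars/tree_sitter_json.dylib",
  "/usr/local/lib/libtree-sitter-c-sharp.so.0.24",
  "ruby.dll",
]

# Map each derived language name to the library that provides it.
GRAMMAR_PATHS = paths.to_h do |path|
  [LibraryPathUtilsSketch.derive_language_name_from_path(path), path]
end

GRAMMAR_PATHS.keys
# => ["toml", "json", "c_sharp", "ruby"]
```

Keying the table by language name rather than by path lets callers ask for "toml" without knowing which naming convention the installed library file follows.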

.derive_language_name_from_symbol(symbol) ⇒ String?

Derive language name from a symbol

Parameters:

  • symbol like “tree_sitter_toml”

Returns:

  • language name like “toml”, or nil if symbol is nil



# File 'lib/tree_haver/library_path_utils.rb', line 74

def derive_language_name_from_symbol(symbol)
  return unless symbol

  symbol.sub(/\Atree_sitter_/, "")
end
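The prefix is stripped with a regular expression anchored at the start of the string, so a few edge cases are worth spelling out. The sketch below uses a hypothetical top-level helper, `language_name_for`, that mirrors the method body shown above; it is not part of the module.

```ruby
# Hypothetical helper mirroring the documented method body.
def language_name_for(symbol)
  return unless symbol

  # \A anchors the match to the start, so only a leading prefix is removed.
  symbol.sub(/\Atree_sitter_/, "")
end

language_name_for("tree_sitter_toml")     # => "toml"
language_name_for("tree_sitter_c_sharp")  # => "c_sharp"
language_name_for("toml")                 # => "toml" (no prefix, returned unchanged)
language_name_for("my_tree_sitter_toml")  # => "my_tree_sitter_toml" (prefix not at the start)
language_name_for(nil)                    # => nil
```

A symbol that lacks the prefix comes back unchanged, so passing an already-short language name is harmless.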

.derive_symbol_from_path(path) ⇒ String?

Derive the tree-sitter symbol name from a library path

Symbol names are the names of the exported C functions (e.g., “tree_sitter_toml”), each of which returns a pointer to the TSLanguage struct.

Handles various naming conventions:

  • libtree-sitter-toml.so → tree_sitter_toml

  • libtree_sitter_toml.so → tree_sitter_toml

  • tree-sitter-toml.so → tree_sitter_toml

  • tree_sitter_toml.so → tree_sitter_toml

  • toml.so → tree_sitter_toml (assumes the file name is the bare language name)

Parameters:

  • path like “/usr/lib/libtree-sitter-toml.so”

Returns:

  • symbol like “tree_sitter_toml”, or nil if path is nil



# File 'lib/tree_haver/library_path_utils.rb', line 32

def derive_symbol_from_path(path)
  return unless path

  # Extract filename without extension: "libtree-sitter-toml" or "toml"
  filename = File.basename(path, ".*")

  # Handle multi-part extensions like .so.0.24
  filename = filename.sub(/\.so(\.\d+)*\z/, "")

  # Match patterns and normalize to tree_sitter_<lang>
  case filename
  when /\Alib[-_]?tree[-_]sitter[-_](.+)\z/
    "tree_sitter_#{Regexp.last_match(1).tr("-", "_")}"
  when /\Atree[-_]sitter[-_](.+)\z/
    "tree_sitter_#{Regexp.last_match(1).tr("-", "_")}"
  else
    # Assume filename is just the language name (e.g., "toml.so" -> "tree_sitter_toml")
    # Also strip "lib" prefix if present (e.g., "libtoml.so" -> "tree_sitter_toml")
    lang = filename.sub(/\Alib/, "").tr("-", "_")
    "tree_sitter_#{lang}"
  end
end
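The extension handling at the top of the method works in two steps, which is easy to miss on a first read. The sketch below isolates those two calls in a hypothetical helper, `strip_library_extension`, which is not part of the module; the example paths are invented.

```ruby
# Hypothetical helper isolating the two extension-stripping steps.
def strip_library_extension(path)
  # Step 1: File.basename with ".*" removes the directory and only the
  # last extension, so "x.so.0.24" becomes "x.so.0".
  filename = File.basename(path, ".*")

  # Step 2: remove a trailing ".so" plus any numeric version parts that
  # survived step 1.
  filename.sub(/\.so(\.\d+)*\z/, "")
end

strip_library_extension("/usr/lib/libtree-sitter-toml.so")       # => "libtree-sitter-toml"
strip_library_extension("/usr/lib/libtree-sitter-toml.so.0")     # => "libtree-sitter-toml"
strip_library_extension("/usr/lib/libtree-sitter-toml.so.0.24")  # => "libtree-sitter-toml"
strip_library_extension("tree_sitter_toml.dylib")                # => "tree_sitter_toml"
strip_library_extension("toml.dll")                              # => "toml"
```

Unversioned files are fully handled by the first step; the second step only matters for versioned shared objects, where the first step leaves a ".so" or ".so.N" suffix behind.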