Module: TreeHaver::PathValidator
Defined in: lib/tree_haver/path_validator.rb
Overview
Note: These validations provide defense-in-depth but cannot guarantee safety. Loading shared libraries from untrusted sources is always risky.
Security utilities for validating paths and inputs before loading shared libraries.
Loading shared libraries (.so/.dylib/.dll) is inherently dangerous as it executes arbitrary native code. This module provides defense-in-depth validations to reduce the attack surface when paths come from potentially untrusted sources like environment variables or user input.
Constant Summary
- ALLOWED_EXTENSIONS =
Allowed shared library extensions by platform.
%w[.so .dylib .dll].freeze
- DEFAULT_TRUSTED_DIRECTORIES =
Default directories that are generally trusted for system libraries. These are searched by the dynamic linker anyway.
["/usr/lib", "/usr/lib64", "/usr/lib/x86_64-linux-gnu", "/usr/lib/aarch64-linux-gnu", "/usr/local/lib", "/opt/homebrew/lib", "/opt/local/lib"].freeze
- TRUSTED_DIRS_ENV_VAR =
Environment variable for adding trusted directories (comma-separated).
"TREE_HAVER_TRUSTED_DIRS"
- MAX_PATH_LENGTH =
Maximum reasonable path length (prevents DoS via extremely long paths).
4096
- VALID_FILENAME_PATTERN =
Pattern for valid library filenames (alphanumeric, hyphens, underscores, dots). This prevents shell metacharacters and other injection attempts.
/\A[a-zA-Z0-9][a-zA-Z0-9._-]*\z/
- VALID_LANGUAGE_PATTERN =
Pattern for valid language names (lowercase alphanumeric and underscores).
/\A[a-z][a-z0-9_]*\z/
- VALID_SYMBOL_PATTERN =
Pattern for valid symbol names (C identifier format).
/\A[a-zA-Z_][a-zA-Z0-9_]*\z/
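The three patterns above are easiest to read against concrete inputs. The following is a minimal sketch that copies the regular expressions into local constants so it runs without the gem; the `classify` helper and the sample strings are illustrative assumptions, not part of TreeHaver.

```ruby
# Local copies of the patterns listed above (not loaded from the gem).
VALID_FILENAME_PATTERN = /\A[a-zA-Z0-9][a-zA-Z0-9._-]*\z/
VALID_LANGUAGE_PATTERN = /\A[a-z][a-z0-9_]*\z/
VALID_SYMBOL_PATTERN   = /\A[a-zA-Z_][a-zA-Z0-9_]*\z/

# Hypothetical helper: split candidates into [accepted, rejected].
def classify(pattern, candidates)
  candidates.partition { |candidate| candidate.match?(pattern) }
end

# Filenames must start with an alphanumeric character and contain no
# shell metacharacters.
file_ok, file_bad = classify(
  VALID_FILENAME_PATTERN,
  ["libtree-sitter-ruby.so", ".hidden.so", "lib;rm.so"]
)

# Language names are lowercase and must start with a letter.
lang_ok, lang_bad = classify(
  VALID_LANGUAGE_PATTERN,
  ["toml", "c_sharp", "Ruby", "9lives"]
)

# Symbol names follow C identifier rules, so hyphens are rejected.
sym_ok, sym_bad = classify(
  VALID_SYMBOL_PATTERN,
  ["tree_sitter_ruby", "_private", "tree-sitter"]
)
```

All three patterns anchor with `\A` and `\z` rather than `^` and `$`, so a trailing newline or an embedded second line cannot slip past the match.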
Class Method Summary
- .add_trusted_directory(directory) ⇒ void
Register a custom trusted directory.
- .clear_custom_trusted_directories! ⇒ void
Clear all custom trusted directories.
- .custom_trusted_directories ⇒ Array<String>
Get the list of custom trusted directories (for debugging).
- .has_valid_extension?(path) ⇒ Boolean private
Check if path has a valid library extension. Allows .so, .dylib, .dll, and versioned .so files like .so.0 and .so.14.
- .in_trusted_directory?(path) ⇒ Boolean
Check if a path is within a trusted directory.
- .remove_trusted_directory(directory) ⇒ void
Remove a custom trusted directory.
- .resolve_check_path(path) ⇒ String? private
Resolve a path to its real path for trust checking.
- .safe_backend_name?(backend) ⇒ Boolean
Validate a backend name.
- .safe_language_name?(name) ⇒ Boolean
Validate a language name is safe.
- .safe_library_path?(path, require_trusted_dir: false) ⇒ Boolean
Validate a path is safe for loading as a shared library.
- .safe_symbol_name?(symbol) ⇒ Boolean
Validate a symbol name is safe for dlsym lookup.
- .sanitize_language_name(name) ⇒ Symbol?
Sanitize a language name for safe use.
- .trusted_directories ⇒ Array<String>
Get all trusted directories (default + custom + from ENV).
- .validation_errors(path) ⇒ Array<String>
Get validation errors for a path (for debugging/error messages).
- .windows_absolute_path?(path) ⇒ Boolean private
Class Method Details
.add_trusted_directory(directory) ⇒ void
This method returns an undefined value.
Register a custom trusted directory
Use this to add directories where you install tree-sitter grammars, such as Homebrew locations, luarocks paths, or other package managers.
    # File 'lib/tree_haver/path_validator.rb', line 106

    def add_trusted_directory(directory)
      expanded = File.expand_path(directory)
      # :nocov:
      # File.expand_path always returns absolute paths on Unix/macOS.
      # This guard exists for defensive programming on exotic platforms
      # where expand_path might behave differently, but cannot be tested
      # in standard CI environments.
      unless expanded.start_with?("/")
        raise ArgumentError, "Trusted directory must be an absolute path: #{directory.inspect}"
      end
      # :nocov:

      @mutex.synchronize do
        @custom_trusted_directories << expanded unless @custom_trusted_directories.include?(expanded)
      end
      nil
    end
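The registration flow (expand the path, then add it once under a mutex) can be sketched in isolation. `TrustedDirRegistry` below is a hypothetical stand-in written for illustration; it is not TreeHaver's API and only mirrors the behaviour described for the add, remove, and list methods. It assumes a Unix-like platform where `File.expand_path` yields paths starting with `/`.

```ruby
# Hypothetical stand-in for the module's internal registry (not TreeHaver's API).
class TrustedDirRegistry
  def initialize
    @mutex = Mutex.new
    @dirs = []
  end

  # Expand to an absolute, normalized path and store it at most once.
  def add(directory)
    expanded = File.expand_path(directory)
    @mutex.synchronize { @dirs << expanded unless @dirs.include?(expanded) }
    nil
  end

  # Remove by the same expanded form that add stores.
  def remove(directory)
    expanded = File.expand_path(directory)
    @mutex.synchronize { @dirs.delete(expanded) }
    nil
  end

  # Return a copy so callers cannot mutate the shared list.
  def list
    @mutex.synchronize { @dirs.dup }
  end
end

registry = TrustedDirRegistry.new
registry.add("/opt/grammars/lib")
registry.add("/opt/grammars/../grammars/lib") # normalizes to the same entry
after_add = registry.list

registry.remove("/opt/grammars/lib")
after_remove = registry.list
```

Because both `add` and `remove` expand the argument first, callers can pass an unnormalized spelling of a directory and still hit the stored entry.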
.clear_custom_trusted_directories! ⇒ void
This method returns an undefined value.
Clear all custom trusted directories
Does not affect DEFAULT_TRUSTED_DIRECTORIES or ENV-based directories. Primarily useful for testing.
    # File 'lib/tree_haver/path_validator.rb', line 141

    def clear_custom_trusted_directories!
      @mutex.synchronize { @custom_trusted_directories.clear }
      nil
    end
.custom_trusted_directories ⇒ Array<String>
Get the list of custom trusted directories (for debugging)
    # File 'lib/tree_haver/path_validator.rb', line 149

    def custom_trusted_directories
      @mutex.synchronize { @custom_trusted_directories.dup }
    end
.has_valid_extension?(path) ⇒ Boolean
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Check if path has a valid library extension. Allows .so, .dylib, .dll, and versioned .so files like .so.0 and .so.14.

    # File 'lib/tree_haver/path_validator.rb', line 342

    def has_valid_extension?(path)
      # Check for exact matches first (.so, .dylib, .dll)
      return true if ALLOWED_EXTENSIONS.any? { |ext| path.end_with?(ext) }

      # Check for versioned .so files (Linux convention)
      # e.g., libtree-sitter.so.0, libtree-sitter.so.14
      return true if path.match?(/\.so\.\d+\z/)

      false
    end
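To see which suffixes the two checks accept, here is a minimal standalone sketch. It copies the logic shown above into a local `valid_extension?` method (a name chosen for this example) so that it runs without the gem; the sample paths are made up.

```ruby
ALLOWED_EXTENSIONS = %w[.so .dylib .dll].freeze

# Local copy of the extension logic shown above.
def valid_extension?(path)
  return true if ALLOWED_EXTENSIONS.any? { |ext| path.end_with?(ext) }

  # Versioned form: ".so." followed by a single run of digits.
  path.match?(/\.so\.\d+\z/)
end

paths = [
  "/usr/lib/libtree-sitter.so",
  "/usr/lib/libtree-sitter.so.14",
  "/usr/lib/libtree-sitter.so.0.25",
  "/opt/homebrew/lib/libtree-sitter.dylib",
  "/tmp/notes.txt"
]

results = paths.to_h { |path| [path, valid_extension?(path)] }
```

In this sketch the versioned pattern matches a single numeric component only, so a name such as `libtree-sitter.so.0.25` is rejected while `libtree-sitter.so.14` is accepted.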
.in_trusted_directory?(path) ⇒ Boolean
Check if a path is within a trusted directory
Checks against DEFAULT_TRUSTED_DIRECTORIES, custom registered directories, and directories from TREE_HAVER_TRUSTED_DIRS environment variable.
    # File 'lib/tree_haver/path_validator.rb', line 208

    def in_trusted_directory?(path)
      return false if path.nil?

      # Resolve the real path to handle symlinks
      check_path = resolve_check_path(path)
      return false if check_path.nil?

      trusted_directories.any? { |trusted| check_path.start_with?(trusted) }
    end
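The final comparison shown above is a plain string prefix test. The sketch below, which uses only string operations and no filesystem access, contrasts that prefix test with a separator-aware variant. `within_dir?` is a hypothetical helper introduced for this example and is not part of TreeHaver; the paths are invented.

```ruby
# Hypothetical helper (not part of TreeHaver): require a "/" boundary
# after the trusted directory, or an exact match.
def within_dir?(path, dir)
  base = dir.end_with?("/") ? dir : "#{dir}/"
  path == dir || path.start_with?(base)
end

trusted = "/usr/lib"
sibling = "/usr/lib-evil/libfoo.so" # shares the prefix, different directory
inside  = "/usr/lib/libfoo.so"

# Plain prefix comparison, as in the method body shown above.
prefix_results = [sibling, inside].map { |path| path.start_with?(trusted) }

# Separator-aware comparison.
strict_results = [sibling, inside].map { |path| within_dir?(path, trusted) }
```

With a bare prefix test, a sibling directory whose name merely begins with a trusted directory's name also passes, which is worth keeping in mind when registering custom directories.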
.remove_trusted_directory(directory) ⇒ void
This method returns an undefined value.
Remove a custom trusted directory
    # File 'lib/tree_haver/path_validator.rb', line 129

    def remove_trusted_directory(directory)
      expanded = File.expand_path(directory)
      @mutex.synchronize { @custom_trusted_directories.delete(expanded) }
      nil
    end
.resolve_check_path(path) ⇒ String?
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Resolve a path to its real path for trust checking
    # File 'lib/tree_haver/path_validator.rb', line 223

    def resolve_check_path(path)
      File.realpath(path)
    rescue Errno::ENOENT
      # File doesn't exist yet, check the directory
      dir = File.dirname(path)
      begin
        File.realpath(dir)
      rescue Errno::ENOENT
        nil
      end
    end
.safe_backend_name?(backend) ⇒ Boolean
Validate a backend name
    # File 'lib/tree_haver/path_validator.rb', line 279

    def safe_backend_name?(backend)
      return true if backend.nil? # nil means :auto

      %i[auto mri rust ffi java].include?(backend.to_s.to_sym)
    end
.safe_language_name?(name) ⇒ Boolean
Validate a language name is safe
Language names are used to construct:
- Environment variable names (TREE_SITTER_<LANG>_PATH)
- Library filenames (libtree-sitter-<lang>.so)
- Symbol names (tree_sitter_<lang>)

    # File 'lib/tree_haver/path_validator.rb', line 249

    def safe_language_name?(name)
      return false if name.nil?

      name_str = name.to_s
      return false if name_str.empty?
      return false if name_str.length > 64 # Reasonable limit

      name_str.match?(VALID_LANGUAGE_PATTERN)
    end
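The reason the language name is validated so strictly is that it is interpolated into three different identifiers. The sketch below shows that derivation for the three formats listed above. `derived_names` is a hypothetical helper written for this example; how TreeHaver builds these strings internally is not shown on this page.

```ruby
# Local copy of the pattern from the constant summary.
VALID_LANGUAGE_PATTERN = /\A[a-z][a-z0-9_]*\z/

# Hypothetical helper: derive the three identifiers from a validated name.
def derived_names(lang)
  name = lang.to_s
  unless name.length <= 64 && name.match?(VALID_LANGUAGE_PATTERN)
    raise ArgumentError, "unsafe language name: #{lang.inspect}"
  end

  {
    env_var: "TREE_SITTER_#{name.upcase}_PATH",
    library: "libtree-sitter-#{name}.so",
    symbol: "tree_sitter_#{name}"
  }
end

names = derived_names(:toml)

# A name containing path characters never reaches interpolation.
rejected =
  begin
    derived_names("../etc")
    false
  rescue ArgumentError
    true
  end
```

Validating before interpolating means a value such as `../etc` is refused before it can become part of a filename or an environment variable lookup.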
.safe_library_path?(path, require_trusted_dir: false) ⇒ Boolean
Validate a path is safe for loading as a shared library
Checks performed:
- Path is not nil or empty
- Path length is reasonable
- Path is absolute (no relative path traversal)
- Path has an allowed extension
- Path does not contain null bytes
- Filename portion matches safe pattern

    # File 'lib/tree_haver/path_validator.rb', line 173

    def safe_library_path?(path, require_trusted_dir: false)
      return false if path.nil? || path.empty?
      return false if path.length > MAX_PATH_LENGTH
      return false if path.include?("\0") # Null byte injection

      # Must be absolute path (prevents relative path traversal)
      return false unless path.start_with?("/") || windows_absolute_path?(path)

      # Check for path traversal attempts
      return false if path.include?("/../") || path.end_with?("/..")
      return false if path.include?("/./") || path.end_with?("/.")

      # Validate extension
      # Allow versioned .so files like .so.0, .so.14, etc. (common on Linux)
      return false unless has_valid_extension?(path)

      # Validate filename portion
      filename = File.basename(path)
      return false unless filename.match?(VALID_FILENAME_PATTERN)

      # Optionally require the path to be in a trusted directory
      if require_trusted_dir
        return false unless in_trusted_directory?(path)
      end

      true
    end
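Two of the checks above, the absolute-path test and the traversal test, are simple substring comparisons and can be exercised on their own. The sketch below copies just those two conditions into local methods named `absolute?` and `traversal?` (names chosen for this example); it does not reproduce the extension, filename, or trusted-directory checks.

```ruby
# Local copy of the absolute-path condition (Unix "/" or Windows drive prefix).
def absolute?(path)
  path.start_with?("/") || path.match?(/\A[A-Za-z]:[\\\/]/)
end

# Local copy of the traversal conditions (forward-slash sequences only).
def traversal?(path)
  path.include?("/../") || path.end_with?("/..") ||
    path.include?("/./") || path.end_with?("/.")
end

samples = [
  "/usr/lib/libtree-sitter.so",
  "/usr/lib/../tmp/evil.so",
  "lib/parser.so",
  "C:\\grammars\\..\\evil.dll"
]

report = samples.to_h do |path|
  [path, { absolute: absolute?(path), traversal: traversal?(path) }]
end
```

In this sketch the last sample counts as absolute and is not flagged by the traversal test, because that test looks only for forward-slash sequences. Whether such a path passes the full method depends on the later extension and filename checks, which this example leaves out.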
.safe_symbol_name?(symbol) ⇒ Boolean
Validate a symbol name is safe for dlsym lookup
    # File 'lib/tree_haver/path_validator.rb', line 267

    def safe_symbol_name?(symbol)
      return false if symbol.nil?
      return false if symbol.empty?
      return false if symbol.length > 256 # Reasonable limit

      symbol.match?(VALID_SYMBOL_PATTERN)
    end
.sanitize_language_name(name) ⇒ Symbol?
Sanitize a language name for safe use
    # File 'lib/tree_haver/path_validator.rb', line 293

    def sanitize_language_name(name)
      return if name.nil?

      sanitized = name.to_s.downcase.gsub(/[^a-z0-9_]/, "")
      return if sanitized.empty?
      return unless sanitized.match?(/\A[a-z]/)

      sanitized.to_sym
    end
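A few inputs make the sanitizing steps concrete. The sketch below copies the method body shown above into a local `sanitize` method (a name chosen for this example) so that it runs without the gem; the inputs are made up.

```ruby
# Local copy of the sanitizing steps shown above.
def sanitize(name)
  return if name.nil?

  # Lowercase, then drop everything outside a-z, 0-9 and underscore.
  sanitized = name.to_s.downcase.gsub(/[^a-z0-9_]/, "")
  return if sanitized.empty?
  return unless sanitized.match?(/\A[a-z]/)

  sanitized.to_sym
end

outputs = {
  "Ruby"    => sanitize("Ruby"),
  "C-Sharp" => sanitize("C-Sharp"),
  " toml "  => sanitize(" toml "),
  "9lives"  => sanitize("9lives"),
  "!!!"     => sanitize("!!!"),
  nil       => sanitize(nil)
}
```

Disallowed characters are removed rather than replaced, so `"C-Sharp"` becomes `:csharp` in this sketch, not `:c_sharp`. A name that still starts with a digit after cleaning yields `nil`.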
.trusted_directories ⇒ Array<String>
Get all trusted directories (default + custom + from ENV)
    # File 'lib/tree_haver/path_validator.rb', line 71

    def trusted_directories
      dirs = DEFAULT_TRUSTED_DIRECTORIES.dup

      # Add custom registered directories
      @mutex.synchronize { dirs.concat(@custom_trusted_directories) }

      # Add directories from environment variable
      ENV[TRUSTED_DIRS_ENV_VAR]&.split(",")&.each do |dir|
        expanded = File.expand_path(dir.strip)
        # :nocov:
        # File.expand_path always returns absolute paths on Unix/macOS.
        # This guard exists for defensive programming on exotic platforms
        # where expand_path might behave differently, but cannot be tested
        # in standard CI environments.
        dirs << expanded if expanded.start_with?("/")
        # :nocov:
      end

      dirs.uniq
    end
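The environment-variable handling above amounts to split, strip, expand, and filter. The sketch below isolates that parsing step in a hypothetical `parse_trusted_dirs` helper that takes the raw string as an argument instead of reading `ENV`, so it has no side effects. It assumes a Unix-like platform where expanded paths start with `/`.

```ruby
# Hypothetical helper: parse a comma-separated TREE_HAVER_TRUSTED_DIRS value.
def parse_trusted_dirs(raw)
  return [] if raw.nil?

  raw.split(",")
     .map { |dir| File.expand_path(dir.strip) } # normalize each entry
     .select { |dir| dir.start_with?("/") }     # keep absolute paths only
     .uniq
end

unset  = parse_trusted_dirs(nil)
normal = parse_trusted_dirs("/opt/grammars, /srv/ts/../ts/lib")

# An empty entry between two commas expands to the current working
# directory, because File.expand_path("") returns Dir.pwd.
with_gap = parse_trusted_dirs("/opt/a,,/opt/b")
```

A typical shell setup would be `export TREE_HAVER_TRUSTED_DIRS="/opt/grammars,/srv/ts/lib"`. Given the empty-entry behaviour in this sketch, it seems prudent to avoid stray double commas in the value.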
.validation_errors(path) ⇒ Array<String>
Get validation errors for a path (for debugging/error messages)
    # File 'lib/tree_haver/path_validator.rb', line 307

    def validation_errors(path)
      errors = []

      if path.nil? || path.empty?
        errors << "Path is nil or empty"
        return errors
      end

      errors << "Path exceeds maximum length (#{MAX_PATH_LENGTH})" if path.length > MAX_PATH_LENGTH
      errors << "Path contains null byte" if path.include?("\0")
      errors << "Path is not absolute" unless path.start_with?("/") || windows_absolute_path?(path)
      errors << "Path contains traversal sequence (/../)" if path.include?("/../") || path.end_with?("/..")
      errors << "Path contains traversal sequence (/./)" if path.include?("/./") || path.end_with?("/.")

      unless has_valid_extension?(path)
        errors << "Path does not have allowed extension (.so, .so.X, .dylib, .dll)"
      end

      filename = File.basename(path)
      unless filename.match?(VALID_FILENAME_PATTERN)
        errors << "Filename contains invalid characters"
      end

      errors
    end
.windows_absolute_path?(path) ⇒ Boolean
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
    # File 'lib/tree_haver/path_validator.rb', line 334

    def windows_absolute_path?(path)
      # Match Windows absolute paths like C:\path or D:/path
      path.match?(/\A[A-Za-z]:[\\\/]/)
    end