Class: Ast::Merge::MatchRefinerBase

Inherits:
Object
Defined in:
lib/ast/merge/match_refiner_base.rb

Overview

Base class for match refiners that pair unmatched nodes after signature matching.

Match refiners run after the initial signature-based matching pass to find additional pairings between nodes that didn’t match by signature. This is useful when exact signatures are too strict, for example when matching tables with similar (but not identical) headers, or when picking the closest match among several candidates using multi-factor scoring.

By default, most node types use content-based signatures (including tables, which match on row count + header content). Refiners let you override this to implement fuzzy matching, positional matching, or any custom logic.

Refiners use a callable interface (`#call`), so a simple lambda or proc can be used where a full class isn’t needed.

Examples:

Markdown: Table matching with multi-factor scoring

# Tables may have similar but not identical headers
# See Commonmarker::Merge::TableMatchRefiner
class TableMatchRefiner < Ast::Merge::MatchRefinerBase
  def initialize(algorithm: nil, **options)
    super(**options)
    @algorithm = algorithm || TableMatchAlgorithm.new
  end

  def call(template_nodes, dest_nodes, context = {})
    template_tables = filter_by_type(template_nodes, :table)
    dest_tables = filter_by_type(dest_nodes, :table)

    greedy_match(template_tables, dest_tables) do |t_node, d_node|
      @algorithm.call(t_node, d_node)
    end
  end
end
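
The example above relies on `greedy_match`, whose implementation is not shown on this page. As a rough mental model only, greedy pairing can be sketched as follows. The names `SketchMatch` and `sketch_greedy_match` are hypothetical, and the library's actual behavior (tie-breaking, result type, threshold handling) may differ.

```ruby
# Hypothetical stand-in for MatchRefinerBase::MatchResult; the real class may differ.
SketchMatch = Struct.new(:template_node, :dest_node, :score)

# Score every template/destination pair, then accept pairs from the highest
# score down, using each node at most once and dropping scores below threshold.
def sketch_greedy_match(template_nodes, dest_nodes, threshold: 0.5)
  candidates = template_nodes.product(dest_nodes).map do |t_node, d_node|
    SketchMatch.new(t_node, d_node, yield(t_node, d_node))
  end

  used_template = []
  used_dest = []
  candidates
    .select { |m| m.score >= threshold }
    .sort_by { |m| -m.score }
    .each_with_object([]) do |match, accepted|
      # Compare by identity so equal-looking nodes are still tracked separately.
      next if used_template.any? { |n| n.equal?(match.template_node) }
      next if used_dest.any? { |n| n.equal?(match.dest_node) }

      used_template << match.template_node
      used_dest << match.dest_node
      accepted << match
    end
end

# Toy usage: strings stand in for nodes, scored by shared-character ratio.
score = ->(a, b) { (a.chars & b.chars).size.to_f / (a.chars | b.chars).size }
matches = sketch_greedy_match(%w[name email], %w[e-mail names zip]) do |t, d|
  score.call(t, d)
end
matches.each { |m| puts "#{m.template_node} -> #{m.dest_node} (#{m.score.round(2)})" }
```

Greedy selection is simple and fast, but it is not guaranteed to maximize the total score across all pairs; an early high-scoring pair can block a better overall assignment.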

Ruby: Method matching with fuzzy name/signature scoring

# Methods may have similar names (process_user vs process_users)
# or same name with different parameters
# See Prism::Merge::MethodMatchRefiner
class MethodMatchRefiner < Ast::Merge::MatchRefinerBase
  def call(template_nodes, dest_nodes, context = {})
    template_methods = template_nodes.select { |n| n.is_a?(Prism::DefNode) }
    dest_methods = dest_nodes.select { |n| n.is_a?(Prism::DefNode) }

    greedy_match(template_methods, dest_methods) do |t_node, d_node|
      compute_method_similarity(t_node, d_node)
    end
  end

  private

  def compute_method_similarity(t_method, d_method)
    name_score = string_similarity(t_method.name.to_s, d_method.name.to_s)
    param_score = param_similarity(t_method, d_method)
    name_score * 0.7 + param_score * 0.3
  end
end
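
The `string_similarity` and `param_similarity` helpers called above are not defined in this example. One possible `string_similarity`, assuming a normalized Levenshtein distance (an assumption; the actual Prism::Merge refiner may use a different metric), might look like this:

```ruby
# Hypothetical helper: normalized Levenshtein similarity in the range 0.0..1.0.
def string_similarity(a, b)
  return 1.0 if a == b
  return 0.0 if a.empty? || b.empty?

  # Classic dynamic-programming edit distance, keeping only the previous row.
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution (or match)
      ].min
    end
    previous = current
  end

  # Divide by the longer length so the result is normalized to 0.0..1.0.
  1.0 - previous.last.to_f / [a.length, b.length].max
end

name_score = string_similarity("process_user", "process_users")
puts name_score.round(3)
```

With this metric, `process_user` and `process_users` differ by one edit out of 13 characters, so the name score stays high even though the signatures are not identical.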

YAML: Mapping key matching with fuzzy scoring

# YAML keys may be renamed or have typos
# See Psych::Merge::MappingMatchRefiner
class MappingMatchRefiner < Ast::Merge::MatchRefinerBase
  def call(template_nodes, dest_nodes, context = {})
    template_mappings = template_nodes.select { |n| n.respond_to?(:key) }
    dest_mappings = dest_nodes.select { |n| n.respond_to?(:key) }

    greedy_match(template_mappings, dest_mappings) do |t_node, d_node|
      key_similarity(t_node.key, d_node.key)
    end
  end
end

JSON: Object property matching for arrays of objects

# JSON arrays may contain objects that should match by content
# See Json::Merge::ObjectMatchRefiner
class ObjectMatchRefiner < Ast::Merge::MatchRefinerBase
  def call(template_nodes, dest_nodes, context = {})
    template_objects = template_nodes.select { |n| n.type == :object }
    dest_objects = dest_nodes.select { |n| n.type == :object }

    greedy_match(template_objects, dest_objects) do |t_node, d_node|
      compute_object_similarity(t_node, d_node)
    end
  end
end

Using find_best_match with manual tracking (alternative approach)

require "set" # needed for Set on Ruby versions that do not load it automatically

class TableMatchRefiner < Ast::Merge::MatchRefinerBase
  def call(template_nodes, dest_nodes, context = {})
    matches = []
    used_dest_nodes = Set.new
    template_tables = filter_by_type(template_nodes, :table)
    dest_tables = filter_by_type(dest_nodes, :table)

    template_tables.each do |t_node|
      best = find_best_match(t_node, dest_tables, used_dest_nodes: used_dest_nodes) do |t, d|
        compute_table_score(t, d)
      end
      if best
        matches << best
        used_dest_nodes << best.dest_node
      end
    end

    matches
  end
end

Using a simple lambda refiner

simple_refiner = ->(template, dest, ctx) do
  # Return array of MatchResult objects
  []
end

Using refiners with a merger

merger = SmartMerger.new(
  template,
  destination,
  match_refiners: [
    TableMatchRefiner.new(threshold: 0.6),
    CustomRefiner.new
  ]
)

Defined Under Namespace

Classes: MatchResult

Constant Summary

DEFAULT_THRESHOLD = 0.5

Default minimum score threshold for accepting a match.


Constructor Details

#initialize(threshold: DEFAULT_THRESHOLD, node_types: []) ⇒ MatchRefinerBase

Initialize a new match refiner.

Parameters:

  • threshold (Float) (defaults to: DEFAULT_THRESHOLD)

    Minimum score to accept a match (0.0-1.0)

  • node_types (Array<Symbol>) (defaults to: [])

    Node types to process (empty = all)



# File 'lib/ast/merge/match_refiner_base.rb', line 171

def initialize(threshold: DEFAULT_THRESHOLD, node_types: [])
  @threshold = [[threshold.to_f, 0.0].max, 1.0].min
  @node_types = Array(node_types)
end
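
The constructor clamps `threshold` into the 0.0..1.0 range and wraps `node_types` with `Array()`. The sketch below copies the constructor shown above into a stand-in class (`ClampDemo` is a hypothetical name) so that it runs without the library loaded.

```ruby
# Stand-in that copies the constructor shown above; not the real base class.
class ClampDemo
  DEFAULT_THRESHOLD = 0.5

  attr_reader :threshold, :node_types

  def initialize(threshold: DEFAULT_THRESHOLD, node_types: [])
    @threshold = [[threshold.to_f, 0.0].max, 1.0].min
    @node_types = Array(node_types)
  end
end

too_high = ClampDemo.new(threshold: 1.5)       # clamped down to the upper bound
negative = ClampDemo.new(threshold: -0.2)      # clamped up to the lower bound
from_str = ClampDemo.new(threshold: "0.6")     # converted with #to_f
single   = ClampDemo.new(node_types: :table)   # a bare symbol is wrapped in an array

puts too_high.threshold
puts negative.threshold
puts from_str.threshold
puts single.node_types.inspect
```

Because out-of-range values are clamped rather than rejected, a mistyped threshold such as `6` instead of `0.6` silently becomes `1.0`, which would accept only perfect scores.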

Instance Attribute Details

#node_types ⇒ Array<Symbol> (readonly)

Returns the node types this refiner handles (empty = all types).

Returns:

  • (Array<Symbol>)

    Node types this refiner handles (empty = all types)



# File 'lib/ast/merge/match_refiner_base.rb', line 165

def node_types
  @node_types
end

#threshold ⇒ Float (readonly)

Returns the minimum score to accept a match.

Returns:

  • (Float)

    Minimum score to accept a match



# File 'lib/ast/merge/match_refiner_base.rb', line 162

def threshold
  @threshold
end

Instance Method Details

#call(template_nodes, dest_nodes, context = {}) ⇒ Array<MatchResult>

Refine matches between unmatched template and destination nodes.

This is the main entry point. Override in subclasses to implement custom matching logic.

Parameters:

  • template_nodes (Array)

    Unmatched nodes from template

  • dest_nodes (Array)

    Unmatched nodes from destination

  • context (Hash) (defaults to: {})

    Additional context (e.g., file analyses)

Returns:

  • (Array<MatchResult>)

    Matches found among the previously unmatched nodes

Raises:

  • (NotImplementedError)

    If not overridden in subclass



# File 'lib/ast/merge/match_refiner_base.rb', line 186

def call(template_nodes, dest_nodes, context = {})
  raise NotImplementedError, "#{self.class}#call must be implemented"
end
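
One consequence of raising `NotImplementedError` is worth noting: in Ruby it descends from `ScriptError`, not `StandardError`, so a bare `rescue` does not catch it. The sketch below uses stand-in classes (`BaseSketch` and `IncompleteRefiner` are hypothetical names) that copy the method body shown above.

```ruby
# Stand-in base class that copies the #call body shown above.
class BaseSketch
  def call(template_nodes, dest_nodes, context = {})
    raise NotImplementedError, "#{self.class}#call must be implemented"
  end
end

# A subclass that forgets to override #call.
class IncompleteRefiner < BaseSketch; end

# Rescuing the error class explicitly works.
message =
  begin
    IncompleteRefiner.new.call([], [])
  rescue NotImplementedError => e
    e.message
  end

# A bare rescue only covers StandardError, so the error passes through it
# and is handled by the outer, explicit rescue instead.
caught_by_bare_rescue =
  begin
    begin
      IncompleteRefiner.new.call([], [])
    rescue
      true
    end
  rescue NotImplementedError
    false
  end

puts message
puts caught_by_bare_rescue
```

Code that runs a list of refiners and wants to tolerate an incomplete one therefore needs to rescue `NotImplementedError` by name.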

#handles_type?(node_type) ⇒ Boolean

Check if this refiner handles a given node type.

Parameters:

  • node_type (Symbol)

    The node type to check

Returns:

  • (Boolean)

    True if this refiner handles the type



# File 'lib/ast/merge/match_refiner_base.rb', line 194

def handles_type?(node_type)
  node_types.empty? || node_types.include?(node_type)
end
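
A short sketch of how the empty-list default behaves, again using a stand-in class (`TypeFilterDemo` is a hypothetical name) that copies the method shown above:

```ruby
# Stand-in that copies #handles_type? and the node_types handling shown above.
class TypeFilterDemo
  attr_reader :node_types

  def initialize(node_types: [])
    @node_types = Array(node_types)
  end

  def handles_type?(node_type)
    node_types.empty? || node_types.include?(node_type)
  end
end

catch_all   = TypeFilterDemo.new
tables_only = TypeFilterDemo.new(node_types: [:table])

puts catch_all.handles_type?(:paragraph)    # empty list accepts every type
puts tables_only.handles_type?(:table)
puts tables_only.handles_type?(:paragraph)
puts tables_only.handles_type?("table")     # strings are not equal to symbols
```

The comparison uses `Array#include?`, so the type must be passed as a Symbol when `node_types` holds Symbols; the String `"table"` does not match `:table`.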