Class: Ast::Merge::Text::SectionSplitter Abstract

Inherits:
Object
Defined in:
lib/ast/merge/text/section_splitter.rb

Overview

This class is abstract.

Subclass and implement #split and #join

Abstract base class for text-based section splitters.

A SectionSplitter takes text content (typically from a leaf node in an AST) and divides it into logical sections that can be matched, compared, and merged independently. This is useful for:

  • Markdown documents split by headings

  • Plain text files with comment-delimited sections

  • Configuration files with section markers

  • Any text where structure is defined by patterns, not AST

Important: This is for TEXT-BASED splitting of content that doesn’t have a structured AST. For AST-level node classification (like identifying `appraise` blocks in Ruby), use `Ast::Merge::SectionTyping` instead.

## How Section Splitting Works

  1. Split: Parse text content into sections with unique names

  2. Match: Compare sections between template and destination by name

  3. Merge: Apply merge rules per-section (template wins, dest wins, merge)

  4. Join: Reconstruct the text from merged sections

## Implementing a SectionSplitter

Subclasses must implement:

  • `split(content)` - Parse content into an array of Section objects

  • `join(sections)` - Reconstruct content from sections

Subclasses may override:

  • `section_signature(section)` - Custom matching logic beyond name

  • `merge_sections(template_section, dest_section, preference)` - Custom section merge

  • `normalize_name(name)` - Custom name normalization for matching

Examples:

Implementing a Markdown heading splitter

class HeadingSplitter < SectionSplitter
  def initialize(split_level: 2)
    @split_level = split_level
  end

  def split(content)
    # Parse and split on headings at @split_level
  end

  def join(sections)
    sections.map(&:full_text).join
  end
end
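
The `split` body above is left as a comment. As a minimal sketch of what it might contain, the standalone class below splits on headings at a single level. `SketchSection` is a stand-in Struct with assumed `name`, `header`, and `body` fields plus a `full_text` helper. The library's real `Section` class may differ, and `SketchHeadingSplitter` deliberately does not inherit from `SectionSplitter` so that it runs without the gem.

```ruby
# Stand-in for the library's Section class (assumed fields, not the real API).
SketchSection = Struct.new(:name, :header, :body, keyword_init: true) do
  # Header line (if any) followed by the body text.
  def full_text
    "#{header}#{body}"
  end
end

# Hypothetical heading splitter; standalone so it runs without the gem.
class SketchHeadingSplitter
  def initialize(split_level: 2)
    # Matches e.g. "## Title" for split_level 2, but not "### Title".
    @heading = /\A#{"#" * split_level} +(.+?)\s*\z/
  end

  def split(content)
    sections = []
    # Text before the first heading is collected under an implicit :preamble.
    current = SketchSection.new(name: :preamble, header: nil, body: +"")
    content.each_line do |line|
      if (match = @heading.match(line.chomp))
        sections << current unless blank?(current)
        current = SketchSection.new(name: match[1], header: line, body: +"")
      else
        current.body << line
      end
    end
    sections << current unless blank?(current)
    sections
  end

  def join(sections)
    sections.map(&:full_text).join
  end

  private

  # True for the implicit preamble when nothing precedes the first heading.
  def blank?(section)
    section.header.nil? && section.body.empty?
  end
end

text = "Intro\n\n## Install\nrun it\n\n## Usage\ncall it\n"
splitter = SketchHeadingSplitter.new(split_level: 2)
sections = splitter.split(text)
puts sections.map(&:name).inspect
puts splitter.join(sections) == text
```

Keeping each heading line verbatim in `header` and every other line in `body` is what lets `join` reproduce the input unchanged, which is a useful property to check in any splitter.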

Using a splitter for section-based merging

splitter = HeadingSplitter.new(split_level: 2)
template_sections = splitter.split(template_content)
dest_sections = splitter.split(dest_content)

merged = splitter.merge_section_lists(
  template_sections,
  dest_sections,
  preference: {
    default: :destination,
    "Installation" => :template
  }
)

result = splitter.join(merged)

Direct Known Subclasses

LineSectionSplitter

Constant Summary

DEFAULT_PREFERENCE = :destination

Default preference when none specified

Constructor Details

#initialize(**options) ⇒ SectionSplitter

Initialize the splitter with options.

Parameters:

  • options (Hash)

    Splitter-specific options



# File 'lib/ast/merge/text/section_splitter.rb', line 82

def initialize(**options)
  @options = options
end

Instance Attribute Details

#options ⇒ Hash (readonly)

Returns the options passed to the splitter.

Returns:

  • (Hash)

    Options passed to the splitter



# File 'lib/ast/merge/text/section_splitter.rb', line 77

def options
  @options
end

Class Method Details

.validate!(config) ⇒ void

This method returns an undefined value.

Validate splitter configuration.

Parameters:

  • config (Hash, nil)

    Configuration to validate

Raises:

  • (ArgumentError)

    If configuration is invalid



# File 'lib/ast/merge/text/section_splitter.rb', line 273

def self.validate!(config)
  return if config.nil?

  unless config.is_a?(Hash)
    raise ArgumentError, "splitter config must be a Hash, got #{config.class}"
  end
end

Instance Method Details

#join(sections) ⇒ String

This method is abstract.

Subclasses must implement this method

Reconstruct text content from sections.

Parameters:

  • sections (Array<Section>)

    Sections to join

Returns:

  • (String)

    Reconstructed text content

Raises:

  • (NotImplementedError)


# File 'lib/ast/merge/text/section_splitter.rb', line 100

def join(sections)
  raise NotImplementedError, "#{self.class}#join must be implemented"
end

#merge(template_content, dest_content, preference: DEFAULT_PREFERENCE, add_template_only: false) ⇒ String

Merge two text documents using section-based semantics.

This is the main entry point for section-based merging. It:

  1. Splits both documents into sections

  2. Matches sections by name

  3. Merges each section according to preferences

  4. Joins the result back into text

Parameters:

  • template_content (String)

    Template text content

  • dest_content (String)

    Destination text content

  • preference (Symbol, Hash) (defaults to: DEFAULT_PREFERENCE)

    Merge preference

    • `:template` - Template wins for all sections

    • `:destination` - Destination wins for all sections

    • Hash - Per-section preferences: `{ default: :destination, "Section Name" => :template }`

  • add_template_only (Boolean) (defaults to: false)

    Whether to add sections that exist only in the template

Returns:

  • (String)

    Merged text content



# File 'lib/ast/merge/text/section_splitter.rb', line 120

def merge(template_content, dest_content, preference: DEFAULT_PREFERENCE, add_template_only: false)
  template_sections = split(template_content)
  dest_sections = split(dest_content)

  merged_sections = merge_section_lists(
    template_sections,
    dest_sections,
    preference: preference,
    add_template_only: add_template_only,
  )

  join(merged_sections)
end
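
The four steps can be traced end to end with a toy that runs without the gem. This is only a sketch under stated assumptions: `ToySection`, `toy_split`, and `toy_merge` are hypothetical names, sections are delimited by `[name]` lines, and only Symbol preferences are handled (no Hash and no `:merge`). The ordering follows the `merge_section_lists` listing on this page: template order first, destination-only sections appended afterwards.

```ruby
# Stand-in section type: a name plus the section's full text (assumed shape).
ToySection = Struct.new(:name, :text)

# Toy splitter: a new section starts at every line beginning with "[".
def toy_split(content)
  content.each_line.slice_before { |line| line.start_with?("[") }.map do |lines|
    name = lines.first[/\A\[(.+?)\]/, 1] || :preamble
    ToySection.new(name, lines.join)
  end
end

# Simplified merge: Symbol preferences only, names matched case-insensitively.
def toy_merge(template, dest, preference: :destination, add_template_only: false)
  key = ->(section) { section.name.to_s.strip.downcase }
  dest_by_name = toy_split(dest).to_h { |section| [key.call(section), section] }
  seen = []

  # Template order first: matched sections, plus template-only ones if requested.
  merged = toy_split(template).filter_map do |template_section|
    seen << key.call(template_section)
    dest_section = dest_by_name[seen.last]
    if dest_section
      preference == :template ? template_section : dest_section
    elsif add_template_only
      template_section
    end
  end

  # Destination-only sections are appended in destination order.
  merged.concat(dest_by_name.reject { |name, _| seen.include?(name) }.values)
  merged.map(&:text).join
end

template = "[a]\nT-a\n[b]\nT-b\n"
dest = "[b]\nD-b\n[c]\nD-c\n"
puts toy_merge(template, dest)
puts toy_merge(template, dest, preference: :template, add_template_only: true)
```

With the defaults, the template-only section `a` is dropped and the destination-only section `c` is kept, so the destination never loses content unless a preference says otherwise.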

#merge_section_content(template_section, dest_section) ⇒ Section

Merge content within a section (for :merge preference).

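The default implementation keeps the name, body, and line range of the destination section, uses the template header when present (otherwise the destination header), and combines the metadata of both sections. Subclasses should override for format-specific content merging.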

Parameters:

  • template_section (Section)

    Section from template

  • dest_section (Section)

    Section from destination

Returns:

  • (Section)

    Section with merged content



# File 'lib/ast/merge/text/section_splitter.rb', line 211

def merge_section_content(template_section, dest_section)
  # Default: use template header, dest body
  Section.new(
    name: dest_section.name,
    header: template_section.header || dest_section.header,
    body: dest_section.body,
    start_line: dest_section.start_line,
    end_line: dest_section.end_line,
    metadata: dest_section.metadata&.merge(template_section.metadata || {}),
  )
end
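
As an illustration of the kind of override the description invites, the sketch below merges bodies line by line: every destination line is kept, and template lines missing from the destination are appended. It is a hypothetical strategy, not something the library ships. `DemoSection` is a stand-in Struct with assumed fields, and the method is written at top level so the example runs standalone; in a real subclass it would be an instance method returning the library's `Section`.

```ruby
# Stand-in for the library's Section class (assumed fields, not the real API).
DemoSection = Struct.new(:name, :header, :body, keyword_init: true)

# Hypothetical override: keep every destination line, then append template
# lines that the destination does not already contain.
def merge_section_content(template_section, dest_section)
  dest_lines = dest_section.body.lines
  known = dest_lines.map(&:chomp)
  extra = template_section.body.lines.reject { |line| known.include?(line.chomp) }

  DemoSection.new(
    name: dest_section.name,
    # Same header rule as the default: template header, else destination header.
    header: template_section.header || dest_section.header,
    body: (dest_lines + extra).join,
  )
end

template = DemoSection.new(name: "Deps", header: "## Deps\n", body: "gem a\ngem b\n")
dest = DemoSection.new(name: "Deps", header: "## Dependencies\n", body: "gem b\ngem c\n")
merged = merge_section_content(template, dest)
puts merged.header
puts merged.body
```

The sketch assumes newline-terminated bodies; a body without a trailing newline would need extra handling before appending lines.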

#merge_section_lists(template_sections, dest_sections, preference: DEFAULT_PREFERENCE, add_template_only: false) ⇒ Array<Section>

Merge two lists of sections.

Parameters:

  • template_sections (Array<Section>)

    Sections from template

  • dest_sections (Array<Section>)

    Sections from destination

  • preference (Symbol, Hash) (defaults to: DEFAULT_PREFERENCE)

    Merge preference

  • add_template_only (Boolean) (defaults to: false)

    Whether to add template-only sections

Returns:

  • (Array<Section>)

    Merged sections



# File 'lib/ast/merge/text/section_splitter.rb', line 141

def merge_section_lists(template_sections, dest_sections, preference: DEFAULT_PREFERENCE, add_template_only: false)
  # Build lookup by normalized name
  dest_by_name = dest_sections.each_with_object({}) do |section, hash|
    key = normalize_name(section.name)
    hash[key] = section
  end

  merged = []
  seen_names = Set.new

  # Process template sections in order
  template_sections.each do |template_section|
    key = normalize_name(template_section.name)
    seen_names << key

    dest_section = dest_by_name[key]

    if dest_section
      # Section exists in both - merge according to preference
      section_pref = preference_for_section(template_section.name, preference)
      merged << merge_sections(template_section, dest_section, section_pref)
    elsif add_template_only
      # Template-only section - add if configured
      merged << template_section
    end
    # Otherwise skip template-only sections
  end

  # Append destination-only sections (preserve destination content)
  dest_sections.each do |dest_section|
    key = normalize_name(dest_section.name)
    next if seen_names.include?(key)
    merged << dest_section
  end

  merged
end

#merge_sections(template_section, dest_section, preference) ⇒ Section

Merge a single pair of matching sections.

The default implementation simply chooses one section based on preference. Subclasses can override for more sophisticated merging (e.g., line-level merging within sections).

Parameters:

  • template_section (Section)

    Section from template

  • dest_section (Section)

    Section from destination

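  • preference (Symbol)

    :template, :destination, or :merge; any other value selects the destination section

Returns:

  • (Section)

    The section chosen or merged according to the preference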

# File 'lib/ast/merge/text/section_splitter.rb', line 189

def merge_sections(template_section, dest_section, preference)
  case preference
  when :template
    template_section
  when :destination
    dest_section
  when :merge
    # Subclasses can implement actual content merging
    merge_section_content(template_section, dest_section)
  else
    dest_section
  end
end

#normalize_name(name) ⇒ String

Normalize a section name for matching.

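Default implementation strips leading and trailing whitespace, downcases, and collapses runs of whitespace into a single space. A nil name becomes an empty string, and a Symbol is converted to a String without further normalization. Subclasses can override for format-specific normalization.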

Parameters:

  • name (String, Symbol, nil)

    The section name

Returns:

  • (String)

    Normalized name



# File 'lib/ast/merge/text/section_splitter.rb', line 251

def normalize_name(name)
  return "" if name.nil?
  return name.to_s if name.is_a?(Symbol)
  name.to_s.strip.downcase.gsub(/\s+/, " ")
end
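
A few sample inputs make the three branches concrete. The method body is copied from the listing above so that the example runs without the gem; the inputs are illustrative only.

```ruby
# Copy of the default normalization shown above, so the example runs standalone.
def normalize_name(name)
  return "" if name.nil?
  return name.to_s if name.is_a?(Symbol)
  name.to_s.strip.downcase.gsub(/\s+/, " ")
end

p normalize_name("  Getting \t Started\n") # String: stripped, downcased, collapsed
p normalize_name("INSTALLATION")           # String: downcased
p normalize_name(:Preamble)                # Symbol: converted, case preserved
p normalize_name(nil)                      # nil: empty string
```

Because Symbols skip the downcasing step, a Symbol name such as `:Preamble` would not match the String `"preamble"` under the default implementation.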

#preference_for_section(section_name, preference) ⇒ Symbol

Get the preference for a specific section.

Parameters:

  • section_name (String, Symbol)

    The section name

  • preference (Symbol, Hash)

    Overall preference configuration

Returns:

  • (Symbol)

    The preference resolved for the section, such as :template, :destination, or :merge



# File 'lib/ast/merge/text/section_splitter.rb', line 228

def preference_for_section(section_name, preference)
  return preference unless preference.is_a?(Hash)

  # Try exact match first
  return preference[section_name] if preference.key?(section_name)

  # Try normalized name
  normalized = normalize_name(section_name)
  preference.each do |key, value|
    return value if normalize_name(key) == normalized
  end

  # Fall back to default
  preference.fetch(:default, DEFAULT_PREFERENCE)
end
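
The lookup order (exact key, then normalized key, then `:default`, then `DEFAULT_PREFERENCE`) is easiest to see with sample calls. The sketch below copies both methods from the listings on this page into a hypothetical `PreferenceSketch` module so that it runs without the gem; the preference Hash is made up for illustration.

```ruby
# Standalone copy of the lookup logic shown above (module name is hypothetical).
module PreferenceSketch
  DEFAULT_PREFERENCE = :destination

  module_function

  def normalize_name(name)
    return "" if name.nil?
    return name.to_s if name.is_a?(Symbol)
    name.to_s.strip.downcase.gsub(/\s+/, " ")
  end

  def preference_for_section(section_name, preference)
    return preference unless preference.is_a?(Hash)
    return preference[section_name] if preference.key?(section_name)

    normalized = normalize_name(section_name)
    preference.each do |key, value|
      return value if normalize_name(key) == normalized
    end

    preference.fetch(:default, DEFAULT_PREFERENCE)
  end
end

prefs = { default: :template, "Installation" => :merge }
p PreferenceSketch.preference_for_section("Installation", prefs)              # exact key
p PreferenceSketch.preference_for_section("  installation ", prefs)           # normalized key
p PreferenceSketch.preference_for_section("Usage", prefs)                     # :default entry
p PreferenceSketch.preference_for_section("Usage", { "Other" => :template })  # DEFAULT_PREFERENCE
p PreferenceSketch.preference_for_section("Usage", :template)                 # Symbol passes through
```

Note that the normalized pass also visits the `:default` key, so a section literally named "default" would resolve to the default entry.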

#section_signature(section) ⇒ Array, String

Generate a signature for section matching.

Default uses normalized name. Subclasses can override for more sophisticated matching (e.g., including metadata).

Parameters:

  • section (Section)

    The section

Returns:

  • (Array, String)

    Signature for matching



# File 'lib/ast/merge/text/section_splitter.rb', line 264

def section_signature(section)
  normalize_name(section.name)
end

#split(content) ⇒ Array<Section>

This method is abstract.

Subclasses must implement this method

Split text content into sections.

Parameters:

  • content (String)

    The text content to split

Returns:

  • (Array<Section>)

    Array of sections in document order

Raises:

  • (NotImplementedError)


# File 'lib/ast/merge/text/section_splitter.rb', line 91

def split(content)
  raise NotImplementedError, "#{self.class}#split must be implemented"
end