Class: Ast::Merge::Text::SectionSplitter Abstract
- Inherits: Object
  - Object
  - Ast::Merge::Text::SectionSplitter
- Defined in: lib/ast/merge/text/section_splitter.rb
Overview
Abstract base class for text-based section splitters.
A SectionSplitter takes text content (typically from a leaf node in an AST) and divides it into logical sections that can be matched, compared, and merged independently. This is useful for:
- Markdown documents split by headings
- Plain text files with comment-delimited sections
- Configuration files with section markers
- Any text where structure is defined by patterns, not AST
Important: This is for TEXT-BASED splitting of content that doesn’t have a structured AST. For AST-level node classification (like identifying `appraise` blocks in Ruby), use `Ast::Merge::SectionTyping` instead.
## How Section Splitting Works
1. Split: Parse text content into sections with unique names
2. Match: Compare sections between template and destination by name
3. Merge: Apply merge rules per-section (template wins, dest wins, merge)
4. Join: Reconstruct the text from merged sections
## Implementing a SectionSplitter
Subclasses must implement:
- `split(content)` - Parse content into an array of Section objects
- `join(sections)` - Reconstruct content from sections
Subclasses may override:
- `section_signature(section)` - Custom matching logic beyond name
- `merge_sections(template_section, dest_section)` - Custom section merge
- `normalize_name(name)` - Custom name normalization for matching
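As an illustration of the two required methods, here is a minimal sketch of a heading-based splitter. It is written to run on its own, so `Section` is a stand-in Struct and `HeadingSplitter` is a hypothetical class that does not inherit from the real base class; in actual use it would subclass `Ast::Merge::Text::SectionSplitter` and build the library's own Section objects.

```ruby
# Stand-in for the library's Section class (assumption: only these fields are needed here).
Section = Struct.new(:name, :header, :body, keyword_init: true)

# Hypothetical splitter. In real use this would inherit from
# Ast::Merge::Text::SectionSplitter rather than stand alone.
class HeadingSplitter
  HEADING = /\A##\s+(.+?)\s*\z/

  # Split Markdown-like text into sections at "## " headings.
  # Text before the first heading becomes a :preamble section.
  def split(content)
    sections = []
    current = Section.new(name: :preamble, header: nil, body: +"")
    content.each_line do |line|
      if (match = HEADING.match(line.chomp))
        sections << current unless current.header.nil? && current.body.empty?
        current = Section.new(name: match[1], header: line, body: +"")
      else
        current.body << line
      end
    end
    sections << current unless current.header.nil? && current.body.empty?
    sections
  end

  # Reassemble the text by concatenating each header and body in order.
  def join(sections)
    sections.map { |section| "#{section.header}#{section.body}" }.join
  end
end

text = "intro\n## Install\ngem install x\n## Usage\nrun it\n"
splitter = HeadingSplitter.new
sections = splitter.split(text)
sections.map(&:name)             # => [:preamble, "Install", "Usage"]
splitter.join(sections) == text  # => true
```

Keeping `join(split(text))` equal to the input, as this sketch does, is a useful property to aim for: unmatched sections then pass through a merge unchanged.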
Direct Known Subclasses
Constant Summary
- DEFAULT_PREFERENCE = :destination
  Default preference when none specified.
Instance Attribute Summary
- #options ⇒ Hash (readonly)
  Options passed to the splitter.
Class Method Summary
- .validate!(config) ⇒ void
  Validate splitter configuration.
Instance Method Summary
- #initialize(**options) ⇒ SectionSplitter (constructor)
  Initialize the splitter with options.
- #join(sections) ⇒ String (abstract)
  Reconstruct text content from sections.
- #merge(template_content, dest_content, preference: DEFAULT_PREFERENCE, add_template_only: false) ⇒ String
  Merge two text documents using section-based semantics.
- #merge_section_content(template_section, dest_section) ⇒ Section
  Merge content within a section (for :merge preference).
- #merge_section_lists(template_sections, dest_sections, preference: DEFAULT_PREFERENCE, add_template_only: false) ⇒ Array<Section>
  Merge two lists of sections.
- #merge_sections(template_section, dest_section, preference) ⇒ Section
  Merge a single pair of matching sections.
- #normalize_name(name) ⇒ String
  Normalize a section name for matching.
- #preference_for_section(section_name, preference) ⇒ Symbol
  Get the preference for a specific section.
- #section_signature(section) ⇒ Array, String
  Generate a signature for section matching.
- #split(content) ⇒ Array<Section> (abstract)
  Split text content into sections.
Constructor Details
#initialize(**options) ⇒ SectionSplitter
Initialize the splitter with options.
    # File 'lib/ast/merge/text/section_splitter.rb', line 82
    def initialize(**options)
      @options = options
    end
Instance Attribute Details
#options ⇒ Hash (readonly)
Returns Options passed to the splitter.
    # File 'lib/ast/merge/text/section_splitter.rb', line 77
    def options
      @options
    end
Class Method Details
.validate!(config) ⇒ void
This method returns an undefined value.
Validate splitter configuration.
    # File 'lib/ast/merge/text/section_splitter.rb', line 273
    def self.validate!(config)
      return if config.nil?

      unless config.is_a?(Hash)
        raise ArgumentError, "splitter config must be a Hash, got #{config.class}"
      end
    end
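The check only looks at the type of the config, not its contents. The sketch below copies the method body into a top-level `validate!` so it runs without the gem; the `pattern:` key is a hypothetical option used only for illustration.

```ruby
# Standalone copy of the class method shown above, for illustration only.
def validate!(config)
  return if config.nil?

  unless config.is_a?(Hash)
    raise ArgumentError, "splitter config must be a Hash, got #{config.class}"
  end
end

validate!(nil)                   # accepted: no config given
validate!({ pattern: /^## / })   # accepted: any Hash passes (hypothetical key)

begin
  validate!("markdown")          # rejected: not a Hash
rescue ArgumentError => e
  message = e.message            # "splitter config must be a Hash, got String"
end
```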
Instance Method Details
#join(sections) ⇒ String
Subclasses must implement this method
Reconstruct text content from sections.
    # File 'lib/ast/merge/text/section_splitter.rb', line 100
    def join(sections)
      raise NotImplementedError, "#{self.class}#join must be implemented"
    end
#merge(template_content, dest_content, preference: DEFAULT_PREFERENCE, add_template_only: false) ⇒ String
Merge two text documents using section-based semantics.
This is the main entry point for section-based merging. It:
1. Splits both documents into sections
2. Matches sections by name
3. Merges each section according to preferences
4. Joins the result back into text
    # File 'lib/ast/merge/text/section_splitter.rb', line 120
    def merge(template_content, dest_content, preference: DEFAULT_PREFERENCE, add_template_only: false)
      template_sections = split(template_content)
      dest_sections = split(dest_content)

      merged_sections = merge_section_lists(
        template_sections,
        dest_sections,
        preference: preference,
        add_template_only: add_template_only,
      )

      join(merged_sections)
    end
#merge_section_content(template_section, dest_section) ⇒ Section
Merge content within a section (for :merge preference).
The default implementation prefers the destination: it keeps the destination's name, body, and line range, and takes the header from the template when the template has one. Subclasses should override it for format-specific content merging.

    # File 'lib/ast/merge/text/section_splitter.rb', line 211
    def merge_section_content(template_section, dest_section)
      # Default: use template header, dest body
      Section.new(
        name: dest_section.name,
        header: template_section.header || dest_section.header,
        body: dest_section.body,
        start_line: dest_section.start_line,
        end_line: dest_section.end_line,
        metadata: dest_section.metadata&.merge(template_section.metadata || {}),
      )
    end
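A subclass override might look roughly like the following sketch, which keeps every destination line and appends template lines the destination lacks. This line-union strategy is an assumption chosen for illustration, not something the library prescribes, and `Section` is again a stand-in Struct with only the fields used here.

```ruby
# Stand-in for the library's Section class (assumption: these three fields only).
Section = Struct.new(:name, :header, :body, keyword_init: true)

# Hypothetical override body: keep destination lines, then append template
# lines that the destination does not already contain.
def merge_section_content(template_section, dest_section)
  dest_lines = dest_section.body.lines
  missing = template_section.body.lines.reject { |line| dest_lines.include?(line) }

  Section.new(
    name: dest_section.name,
    header: template_section.header || dest_section.header,
    body: (dest_lines + missing).join,
  )
end

template = Section.new(name: "Ignore", header: "# Ignore\n", body: "tmp/\nlog/\n")
dest     = Section.new(name: "Ignore", header: "# ignore\n", body: "log/\ncoverage/\n")

merged = merge_section_content(template, dest)
merged.header  # => "# Ignore\n"
merged.body    # => "log/\ncoverage/\ntmp/\n"
```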
#merge_section_lists(template_sections, dest_sections, preference: DEFAULT_PREFERENCE, add_template_only: false) ⇒ Array<Section>
Merge two lists of sections.
    # File 'lib/ast/merge/text/section_splitter.rb', line 141
    def merge_section_lists(template_sections, dest_sections, preference: DEFAULT_PREFERENCE, add_template_only: false)
      # Build lookup by normalized name
      dest_by_name = dest_sections.each_with_object({}) do |section, hash|
        key = normalize_name(section.name)
        hash[key] = section
      end

      merged = []
      seen_names = Set.new

      # Process template sections in order
      template_sections.each do |template_section|
        key = normalize_name(template_section.name)
        seen_names << key

        dest_section = dest_by_name[key]

        if dest_section
          # Section exists in both - merge according to preference
          section_pref = preference_for_section(template_section.name, preference)
          merged << merge_sections(template_section, dest_section, section_pref)
        elsif add_template_only
          # Template-only section - add if configured
          merged << template_section
        end
        # Otherwise skip template-only sections
      end

      # Append destination-only sections (preserve destination content)
      dest_sections.each do |dest_section|
        key = normalize_name(dest_section.name)
        next if seen_names.include?(key)

        merged << dest_section
      end

      merged
    end
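The ordering this produces is easy to miss: matched sections follow template order, and destination-only sections are appended afterwards. The sketch below is a simplified standalone version (a hypothetical `merge_lists` function, a stand-in `Section` Struct, a Symbol-only preference, and a reduced name normalization) meant only to show that ordering.

```ruby
require "set"

# Stand-in for the library's Section class (assumption: name and body suffice here).
Section = Struct.new(:name, :body, keyword_init: true)

# Simplified standalone version of the list merge documented above.
# Matching uses a stripped, downcased name; preference is a single Symbol.
def merge_lists(template_sections, dest_sections, preference: :destination, add_template_only: false)
  key = ->(section) { section.name.to_s.strip.downcase }
  dest_by_name = dest_sections.to_h { |section| [key.call(section), section] }
  seen = Set.new
  merged = []

  # Template order drives the order of matched sections.
  template_sections.each do |template_section|
    seen << key.call(template_section)
    dest_section = dest_by_name[key.call(template_section)]
    if dest_section
      merged << (preference == :template ? template_section : dest_section)
    elsif add_template_only
      merged << template_section
    end
  end

  # Destination-only sections are appended at the end.
  dest_sections.each { |section| merged << section unless seen.include?(key.call(section)) }
  merged
end

template = [Section.new(name: "Install", body: "T1"), Section.new(name: "License", body: "T2")]
dest     = [Section.new(name: "Notes", body: "D0"), Section.new(name: "install", body: "D1")]

default_names      = merge_lists(template, dest).map(&:name)
with_template_only = merge_lists(template, dest, add_template_only: true).map(&:name)
# default_names      => ["install", "Notes"]
# with_template_only => ["install", "License", "Notes"]
```

Note that "Notes" comes first in the destination but last in the result, because destination-only sections are appended after the template-ordered ones.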
#merge_sections(template_section, dest_section, preference) ⇒ Section
Merge a single pair of matching sections.
The default implementation simply chooses one section based on preference. Subclasses can override for more sophisticated merging (e.g., line-level merging within sections).
    # File 'lib/ast/merge/text/section_splitter.rb', line 189
    def merge_sections(template_section, dest_section, preference)
      case preference
      when :template
        template_section
      when :destination
        dest_section
      when :merge
        # Subclasses can implement actual content merging
        merge_section_content(template_section, dest_section)
      else
        dest_section
      end
    end
#normalize_name(name) ⇒ String
Normalize a section name for matching.
The default implementation returns an empty string for nil, converts a Symbol to a String unchanged, and otherwise strips leading and trailing whitespace, downcases, and collapses runs of whitespace into a single space. Subclasses can override it for format-specific normalization.

    # File 'lib/ast/merge/text/section_splitter.rb', line 251
    def normalize_name(name)
      return "" if name.nil?
      return name.to_s if name.is_a?(Symbol)
      name.to_s.strip.downcase.gsub(/\s+/, " ")
    end
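A few sample inputs make the three branches concrete. The method is copied to the top level here so the example runs without the gem.

```ruby
# Standalone copy of the default implementation shown above.
def normalize_name(name)
  return "" if name.nil?
  return name.to_s if name.is_a?(Symbol)
  name.to_s.strip.downcase.gsub(/\s+/, " ")
end

normalize_name("  Getting   Started ")  # => "getting started"
normalize_name("INSTALL\tGuide")        # => "install guide"
normalize_name(:Preamble)               # => "Preamble" (Symbols are not downcased)
normalize_name(nil)                     # => ""
```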
#preference_for_section(section_name, preference) ⇒ Symbol
Get the preference for a specific section. A non-Hash preference is returned as-is. A Hash preference is looked up by exact section name first, then by normalized name, and finally under the `:default` key, falling back to DEFAULT_PREFERENCE.

    # File 'lib/ast/merge/text/section_splitter.rb', line 228
    def preference_for_section(section_name, preference)
      return preference unless preference.is_a?(Hash)

      # Try exact match first
      return preference[section_name] if preference.key?(section_name)

      # Try normalized name
      normalized = normalize_name(section_name)
      preference.each do |key, value|
        return value if normalize_name(key) == normalized
      end

      # Fall back to default
      preference.fetch(:default, DEFAULT_PREFERENCE)
    end
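The following sketch exercises each lookup step with a per-section Hash. Both methods and the constant are copied to the top level so the example is self-contained; the section names are made up for illustration.

```ruby
DEFAULT_PREFERENCE = :destination

# Standalone copies of the two methods shown above.
def normalize_name(name)
  return "" if name.nil?
  return name.to_s if name.is_a?(Symbol)
  name.to_s.strip.downcase.gsub(/\s+/, " ")
end

def preference_for_section(section_name, preference)
  return preference unless preference.is_a?(Hash)
  return preference[section_name] if preference.key?(section_name)

  normalized = normalize_name(section_name)
  preference.each do |key, value|
    return value if normalize_name(key) == normalized
  end

  preference.fetch(:default, DEFAULT_PREFERENCE)
end

prefs = { "Installation" => :template, :default => :merge }

preference_for_section("Installation", prefs)                     # => :template (exact key)
preference_for_section("  installation ", prefs)                  # => :template (normalized match)
preference_for_section("Usage", prefs)                            # => :merge (:default key)
preference_for_section("Usage", { "Installation" => :template })  # => :destination (DEFAULT_PREFERENCE)
preference_for_section("Usage", :template)                        # => :template (Symbol passes through)
```

One consequence of the normalized lookup is that a section actually named "Default" would match the `:default` key, since both normalize to "default".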
#section_signature(section) ⇒ Array, String
Generate a signature for section matching.
Default uses normalized name. Subclasses can override for more sophisticated matching (e.g., including metadata).
    # File 'lib/ast/merge/text/section_splitter.rb', line 264
    def section_signature(section)
      normalize_name(section.name)
    end
#split(content) ⇒ Array<Section>
Subclasses must implement this method
Split text content into sections.
    # File 'lib/ast/merge/text/section_splitter.rb', line 91
    def split(content)
      raise NotImplementedError, "#{self.class}#split must be implemented"
    end