Class: Ast::Merge::Text::LineSectionSplitter
- Inherits: SectionSplitter
- Ancestors: Ast::Merge::Text::LineSectionSplitter < SectionSplitter < Object
- Defined in: lib/ast/merge/text/section_splitter.rb
Overview
Line-pattern section splitter for text content.
Splits text content into sections based on a line pattern (regex). Useful for documents with consistent structural markers like headings.
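As a rough illustration of the idea (not this class's implementation), plain Ruby can group lines under the nearest preceding header with `Enumerable#slice_before`. The Markdown-style heading pattern and the sample text are assumptions chosen for the example.

```ruby
# Hypothetical heading pattern: a line starting with "## " followed by a title.
heading = /^## (.+)$/

text = <<~DOC
  Intro line
  ## Install
  gem install example
  ## Usage
  Run it.
DOC

# Start a new chunk at every line that matches the pattern. Lines before the
# first match form their own leading chunk (the "preamble").
chunks = text.lines.slice_before { |line| line.match?(heading) }.to_a

# Print the first line of each chunk to show where the boundaries fall.
chunks.each { |chunk| puts chunk.first.strip }
```

Because every line lands in exactly one chunk, flattening and joining the chunks reproduces the original text.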
Constant Summary
Constants inherited from SectionSplitter
SectionSplitter::DEFAULT_PREFERENCE
Instance Attribute Summary
- #name_capture ⇒ Integer (readonly)
  Capture group index for section name (1-based).
- #pattern ⇒ Regexp (readonly)
  Pattern to match section headers.
Attributes inherited from SectionSplitter
Instance Method Summary
- #initialize(pattern:, name_capture: 1, **options) ⇒ LineSectionSplitter (constructor)
  Initialize a line-based splitter.
- #join(sections) ⇒ String
  Join sections back into text content.
- #split(content) ⇒ Array<Section>
  Split content on lines matching the pattern.
Methods inherited from SectionSplitter
#merge, #merge_section_content, #merge_section_lists, #merge_sections, #normalize_name, #preference_for_section, #section_signature, validate!
Constructor Details
#initialize(pattern:, name_capture: 1, **options) ⇒ LineSectionSplitter
Initialize a line-based splitter.
# File 'lib/ast/merge/text/section_splitter.rb', line 307

def initialize(pattern:, name_capture: 1, **)
  super(**)
  @pattern = pattern
  @name_capture = name_capture
end
Instance Attribute Details
#name_capture ⇒ Integer (readonly)
Returns the capture group index for the section name (1-based).
# File 'lib/ast/merge/text/section_splitter.rb', line 300

def name_capture
  @name_capture
end
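The 1-based index lines up with Ruby's `MatchData#[]`, where index 0 is the whole match and index 1 is the first capture group. The sketch below uses two hypothetical patterns to show the indexing, plus the `match[name_capture] || match[0]` fallback that `#split` applies when the selected group is nil.

```ruby
# Hypothetical header pattern: group 1 is the marker, group 2 is the title.
pattern = /^(#+)\s*(.*)$/
match = "## Install\n".match(pattern)

whole  = match[0] # index 0: the entire matched text
marker = match[1] # index 1: the first capture group
title  = match[2] # index 2: the second capture group

# When the selected group did not participate in the match, MatchData#[]
# returns nil, so the expression falls back to the whole match.
optional = /^==(?: (\w+))?/
bare = "==\n".match(optional)
fallback_name = (bare[1] || bare[0]).strip

puts [whole, marker, title, fallback_name].inspect
```

With a pattern like the first one, `name_capture: 2` would select the title rather than the marker.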
#pattern ⇒ Regexp (readonly)
Returns the pattern used to match section headers.
# File 'lib/ast/merge/text/section_splitter.rb', line 297

def pattern
  @pattern
end
Instance Method Details
#join(sections) ⇒ String
Join sections back into text content.
# File 'lib/ast/merge/text/section_splitter.rb', line 378

def join(sections)
  sections.map(&:full_text).join
end
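`Section#full_text` is not shown on this page. The sketch below assumes it returns the header line followed by the body, and uses a stand-in Struct (not the library's `Section` class) to show how mapping and joining reassembles the text.

```ruby
# Stand-in for the library's Section. ASSUMPTION: full_text is header + body.
section = Struct.new(:header, :body) do
  def full_text
    "#{header}#{body}" # a nil header (preamble) interpolates as an empty string
  end
end

sections = [
  section.new(nil, "Intro\n"),       # preamble: no header line
  section.new("## A\n", "alpha\n"),
  section.new("## B\n", "beta\n"),
]

# Same expression as the method body above.
joined = sections.map(&:full_text).join
puts joined
```

No separator is passed to `join`, so the result is only faithful when each header and body keeps its own trailing newline.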
#split(content) ⇒ Array<Section>
Split content on lines matching the pattern.
# File 'lib/ast/merge/text/section_splitter.rb', line 317

def split(content)
  lines = content.lines
  sections = []
  current_section = nil
  preamble_lines = []

  lines.each_with_index do |line, index|
    line_num = index + 1

    if (match = line.match(pattern))
      # Start new section
      if current_section
        sections << finalize_section(current_section)
      elsif preamble_lines.any?
        sections << Section.new(
          name: :preamble,
          header: nil,
          body: preamble_lines.join,
          start_line: 1,
          end_line: line_num - 1,
          metadata: {type: :preamble},
        )
      end

      section_name = match[name_capture] || match[0]
      current_section = {
        name: section_name.strip,
        header: line,
        body_lines: [],
        start_line: line_num,
      }
    elsif current_section
      current_section[:body_lines] << line
    else
      preamble_lines << line
    end
  end

  # Finalize last section
  if current_section
    current_section[:end_line] = lines.length
    sections << finalize_section(current_section)
  elsif preamble_lines.any? && sections.empty?
    # Entire document is preamble (no sections found)
    sections << Section.new(
      name: :preamble,
      header: nil,
      body: preamble_lines.join,
      start_line: 1,
      end_line: lines.length,
      metadata: {type: :preamble},
    )
  end

  sections
end
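To make the preamble handling and line numbering concrete, here is a simplified, standalone sketch of the same walk over the lines. It is not the library code: `sketch_split` is a hypothetical helper, it returns plain hashes instead of `Section` objects, and since `finalize_section` is not shown on this page, the `end_line` of each section is assumed to be the last line before the next header.

```ruby
# Hypothetical helper mirroring the flow of #split with plain hashes.
def sketch_split(content, pattern, name_capture = 1)
  sections = []
  current = nil

  content.lines.each_with_index do |line, index|
    line_num = index + 1 # line numbers are 1-based

    if (match = line.match(pattern))
      # A header line opens a new section, named from the capture group,
      # falling back to the whole match when that group is nil.
      name = (match[name_capture] || match[0]).strip
      current = {name: name, header: line, body: +"",
                 start_line: line_num, end_line: line_num}
      sections << current
    else
      # Lines before the first header are collected into a :preamble section.
      if current.nil?
        current = {name: :preamble, header: nil, body: +"",
                   start_line: 1, end_line: line_num}
        sections << current
      end
      current[:body] << line
      current[:end_line] = line_num # ASSUMPTION: a section ends at its last line
    end
  end

  sections
end

doc = "Intro\n## A\nalpha\nmore\n## B\nbeta\n"
result = sketch_split(doc, /^## (.+)$/)
summary = result.map { |s| [s[:name], s[:start_line], s[:end_line]] }
puts summary.inspect
```

A document with no matching lines yields a single `:preamble` entry spanning every line, and empty content yields an empty array, matching the branches in the listing above.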