Class: Ast::Merge::Text::LineSectionSplitter

Inherits:
SectionSplitter
Defined in:
lib/ast/merge/text/section_splitter.rb

Overview

Line-pattern section splitter for text content.

Splits text content into sections based on a line pattern (regex). Useful for documents with consistent structural markers like headings.

Examples:

Split Markdown on level-2 headings

splitter = LineSectionSplitter.new(pattern: /^## (.+)$/)
sections = splitter.split(markdown_content)

Split on comment markers

splitter = LineSectionSplitter.new(pattern: /^# === (.+) ===\s*$/)
sections = splitter.split(config_file)
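To check which lines a pattern would treat as section headers before handing it to the splitter, plain Ruby is enough. The following is a minimal sketch that does not use the gem; the `markdown` sample text and the `headers` variable are made up for illustration.

```ruby
# Plain-Ruby sketch: find the lines a level-2 heading pattern would match.
# Nothing here calls the gem; the sample text is illustrative only.
pattern = /^## (.+)$/

markdown = <<~MD
  # Title

  ## Install
  gem install example

  ### Details
  ## Usage
  Run it.
MD

# Collect the captured name of every line that matches the pattern.
headers = markdown.lines.filter_map do |line|
  match = line.match(pattern)
  match && match[1].strip
end

puts headers.inspect
```

Only the two `## ` lines match: `# Title` and `### Details` do not start with exactly two hashes followed by a space, so they would end up in the preamble or in a section body.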

Constant Summary

Constants inherited from SectionSplitter

SectionSplitter::DEFAULT_PREFERENCE

Instance Attribute Summary

Attributes inherited from SectionSplitter

#options

Instance Method Summary

Methods inherited from SectionSplitter

#merge, #merge_section_content, #merge_section_lists, #merge_sections, #normalize_name, #preference_for_section, #section_signature, validate!

Constructor Details

#initialize(pattern:, name_capture: 1, **options) ⇒ LineSectionSplitter

Initialize a line-based splitter.

Parameters:

  • pattern (Regexp)

    Pattern to match section header lines

  • name_capture (Integer) (defaults to: 1)

    Index of the capture group holding the section name; when that group is nil, the whole match is used

  • options (Hash)

    Additional options, passed through to the SectionSplitter constructor



# File 'lib/ast/merge/text/section_splitter.rb', line 307

def initialize(pattern:, name_capture: 1, **options)
  super(**options)
  @pattern = pattern
  @name_capture = name_capture
end
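The effect of name_capture can be illustrated without the gem. This sketch mirrors the `match[name_capture] || match[0]` expression used in #split; the helper `section_name_for` is hypothetical and exists only for this example.

```ruby
# Hypothetical helper mirroring how #split derives a section name:
# use the requested capture group, or the whole match if that group is nil.
def section_name_for(line, pattern, name_capture)
  match = line.match(pattern)
  return nil unless match

  (match[name_capture] || match[0]).strip
end

two_groups = /^(#+) (.+)$/ # group 1: the hashes, group 2: the heading text
no_groups  = /^---+$/      # no capture groups at all

name_from_group_2 = section_name_for("## Usage\n", two_groups, 2)
name_from_group_1 = section_name_for("## Usage\n", two_groups, 1)
name_from_match   = section_name_for("-----\n", no_groups, 1)

puts [name_from_group_2, name_from_group_1, name_from_match].inspect
```

With a pattern that has several groups, name_capture selects which one names the section. With a pattern that has no groups, the whole matched line (minus surrounding whitespace) becomes the name.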

Instance Attribute Details

#name_capture ⇒ Integer (readonly)

Returns the capture group index for the section name (1-based).

Returns:

  • (Integer)

    Capture group index for section name (1-based)



# File 'lib/ast/merge/text/section_splitter.rb', line 300

def name_capture
  @name_capture
end

#pattern ⇒ Regexp (readonly)

Returns the pattern used to match section headers.

Returns:

  • (Regexp)

    Pattern to match section headers



# File 'lib/ast/merge/text/section_splitter.rb', line 297

def pattern
  @pattern
end

Instance Method Details

#join(sections) ⇒ String

Join sections back into text content.

Parameters:

  • sections (Array<Section>)

    Sections to join

Returns:

  • (String)

    Reconstructed content



# File 'lib/ast/merge/text/section_splitter.rb', line 378

def join(sections)
  sections.map(&:full_text).join
end
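Because #join simply concatenates each section's full_text, the original text is recovered as long as full_text yields the header line followed by the body. The sketch below assumes exactly that; `DemoSection` is a hypothetical stand-in, not the gem's Section class, whose full_text is defined elsewhere.

```ruby
# Hypothetical stand-in for Section, used only to illustrate #join.
DemoSection = Struct.new(:name, :header, :body, keyword_init: true) do
  # Assumption: full_text is the header line (if any) followed by the body.
  def full_text
    "#{header}#{body}"
  end
end

sections = [
  DemoSection.new(name: :preamble, header: nil, body: "# Title\n\n"),
  DemoSection.new(name: "Install", header: "## Install\n", body: "gem install example\n"),
]

# Same expression as the body of #join.
rebuilt = sections.map(&:full_text).join

puts rebuilt
```

No separator is inserted between sections, so line endings have to be carried by the header and body strings themselves.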

#split(content) ⇒ Array<Section>

Split content on lines matching the pattern.

Parameters:

  • content (String)

    Text content

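Returns:

  • (Array<Section>)

    Sections in document order; text before the first matching line becomes a :preamble section
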

# File 'lib/ast/merge/text/section_splitter.rb', line 317

def split(content)
  lines = content.lines
  sections = []
  current_section = nil
  preamble_lines = []

  lines.each_with_index do |line, index|
    line_num = index + 1

    if (match = line.match(pattern))
      # Start new section
      if current_section
        sections << finalize_section(current_section)
      elsif preamble_lines.any?
        sections << Section.new(
          name: :preamble,
          header: nil,
          body: preamble_lines.join,
          start_line: 1,
          end_line: line_num - 1,
          metadata: {type: :preamble},
        )
      end

      section_name = match[name_capture] || match[0]
      current_section = {
        name: section_name.strip,
        header: line,
        body_lines: [],
        start_line: line_num,
      }
    elsif current_section
      current_section[:body_lines] << line
    else
      preamble_lines << line
    end
  end

  # Finalize last section
  if current_section
    current_section[:end_line] = lines.length
    sections << finalize_section(current_section)
  elsif preamble_lines.any? && sections.empty?
    # Entire document is preamble (no sections found)
    sections << Section.new(
      name: :preamble,
      header: nil,
      body: preamble_lines.join,
      start_line: 1,
      end_line: lines.length,
      metadata: {type: :preamble},
    )
  end

  sections
end
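The behaviour of #split is easier to see on a small input. What follows is a simplified, self-contained re-implementation of the same idea, not the gem's code: `SketchSection` and `sketch_split` are hypothetical names, the struct carries only the fields that appear in the Section.new calls above, and the private finalize_section helper (not shown on this page) is replaced by updating end_line as each line is read.

```ruby
# Hypothetical stand-in for the gem's Section class.
SketchSection = Struct.new(:name, :header, :body, :start_line, :end_line, keyword_init: true)

# Simplified sketch of line-pattern splitting (an assumption-laden
# illustration, not the library implementation).
def sketch_split(content, pattern, name_capture: 1)
  sections = []

  content.lines.each_with_index do |line, index|
    line_num = index + 1

    if (match = line.match(pattern))
      # A header line opens a new section named after the capture group.
      name = (match[name_capture] || match[0]).strip
      sections << SketchSection.new(
        name: name, header: line, body: +"",
        start_line: line_num, end_line: line_num
      )
    else
      # Text before the first header is collected into a :preamble section.
      if sections.empty?
        sections << SketchSection.new(
          name: :preamble, header: nil, body: +"",
          start_line: 1, end_line: line_num
        )
      end
      sections.last.body << line
      sections.last.end_line = line_num
    end
  end

  sections
end

content = "intro\n## A\na1\na2\n## B\nb1\n"
result = sketch_split(content, /^## (.+)$/)
summary = result.map { |s| [s.name, s.start_line, s.end_line] }

puts summary.inspect
```

Line numbers are 1-based and inclusive, the preamble ends on the line before the first header, and a document with no matching lines yields a single :preamble section covering the whole text.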