Class: Ast::Merge::Text::LineSectionSplitter

Inherits:
SectionSplitter
Defined in:
lib/ast/merge/text/section_splitter.rb

Overview

Line-pattern section splitter for text content.

Splits text content into sections based on a line pattern (regex). Useful for documents with consistent structural markers like headings.

Examples:

Split Markdown on level-2 headings

splitter = LineSectionSplitter.new(pattern: /^## (.+)$/)
sections = splitter.split(markdown_content)

Split on comment markers

splitter = LineSectionSplitter.new(pattern: /^# === (.+) ===\s*$/)
sections = splitter.split(config_file)

Constant Summary

Constants inherited from SectionSplitter

SectionSplitter::DEFAULT_PREFERENCE

Instance Attribute Summary

Attributes inherited from SectionSplitter

#options

Instance Method Summary

Methods inherited from SectionSplitter

#merge, #merge_section_content, #merge_section_lists, #merge_sections, #normalize_name, #preference_for_section, #section_signature, validate!

Constructor Details

#initialize(pattern:, name_capture: 1, **options) ⇒ LineSectionSplitter

Initialize a line-based splitter.

Parameters:

  • pattern (Regexp)

    Pattern to match section header lines

  • name_capture (Integer) (defaults to: 1)

    Index of the capture group that holds the section name

  • options (Hash)

    Additional options



# File 'lib/ast/merge/text/section_splitter.rb', line 309

def initialize(pattern:, name_capture: 1, **options)
  super(**options)
  @pattern = pattern
  @name_capture = name_capture
end
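
The interaction between pattern and name_capture can be illustrated with a small stand-alone sketch. It mirrors the `match[name_capture] || match[0]` lookup that appears in #split further down this page; the helper name section_name_for is hypothetical and is not part of the gem.

```ruby
# Stand-alone sketch: mirrors the `match[name_capture] || match[0]` lookup
# shown in #split. `section_name_for` is a hypothetical helper, not gem API.
def section_name_for(line, pattern, name_capture = 1)
  match = line.match(pattern)
  return nil unless match # line is not a section header

  # Fall back to the whole match when the requested group is absent.
  (match[name_capture] || match[0]).strip
end

heading = /^(#+) (.+)$/ # group 1 = hash marks, group 2 = title text

section_name_for("## Installation\n", heading, 2) # => "Installation"
section_name_for("## Installation\n", heading, 1) # => "##"
section_name_for("---\n", /^---$/)                # => "---" (no groups)
section_name_for("plain text\n", heading)         # => nil
```

With the default name_capture of 1, a pattern that has no capture groups still yields a name, because the full matched text is used instead.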

Instance Attribute Details

#name_capture ⇒ Integer (readonly)

Returns the capture group index for the section name (1-based).

Returns:

  • (Integer)

    Capture group index for section name (1-based)



# File 'lib/ast/merge/text/section_splitter.rb', line 302

def name_capture
  @name_capture
end

#pattern ⇒ Regexp (readonly)

Returns the pattern used to match section headers.

Returns:

  • (Regexp)

    Pattern to match section headers



# File 'lib/ast/merge/text/section_splitter.rb', line 299

def pattern
  @pattern
end

Instance Method Details

#join(sections) ⇒ String

Join sections back into text content.

Parameters:

  • sections (Array<Section>)

    Sections to join

Returns:

  • (String)

    Reconstructed content



# File 'lib/ast/merge/text/section_splitter.rb', line 380

def join(sections)
  sections.map(&:full_text).join
end
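
Because #join only concatenates each section's full_text, joining the output of a split should reproduce the original text as long as full_text carries the header line followed by the body. The sketch below demonstrates that idea with a hypothetical DemoSection struct; it stands in for the gem's Section class, whose real full_text definition is not shown on this page.

```ruby
# Hypothetical stand-in for the gem's Section; only what #join relies on.
DemoSection = Struct.new(:header, :body, keyword_init: true) do
  # Assumption: full_text is the header line followed by the body.
  # A nil header (preamble) interpolates to an empty string.
  def full_text
    "#{header}#{body}"
  end
end

sections = [
  DemoSection.new(header: nil, body: "intro\n"),
  DemoSection.new(header: "## A\n", body: "alpha\n"),
  DemoSection.new(header: "## B\n", body: "beta\n"),
]

# Same expression as the body of #join.
rejoined = sections.map(&:full_text).join
# => "intro\n## A\nalpha\n## B\nbeta\n"
```

No separator is passed to the final join, so line endings must already be present in each header and body.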

#split(content) ⇒ Array<Section>

Split content on lines matching the pattern.

Parameters:

  • content (String)

    Text content

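Returns:

  • (Array<Section>)

    Sections in document order; any content before the first matching line is returned as a :preamble section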



# File 'lib/ast/merge/text/section_splitter.rb', line 319

def split(content)
  lines = content.lines
  sections = []
  current_section = nil
  preamble_lines = []

  lines.each_with_index do |line, index|
    line_num = index + 1

    if (match = line.match(pattern))
      # Start new section
      if current_section
        sections << finalize_section(current_section)
      elsif preamble_lines.any?
        sections << Section.new(
          name: :preamble,
          header: nil,
          body: preamble_lines.join,
          start_line: 1,
          end_line: line_num - 1,
          metadata: {type: :preamble},
        )
      end

      section_name = match[name_capture] || match[0]
      current_section = {
        name: section_name.strip,
        header: line,
        body_lines: [],
        start_line: line_num,
      }
    elsif current_section
      current_section[:body_lines] << line
    else
      preamble_lines << line
    end
  end

  # Finalize last section
  if current_section
    current_section[:end_line] = lines.length
    sections << finalize_section(current_section)
  elsif preamble_lines.any? && sections.empty?
    # Entire document is preamble (no sections found)
    sections << Section.new(
      name: :preamble,
      header: nil,
      body: preamble_lines.join,
      start_line: 1,
      end_line: lines.length,
      metadata: {type: :preamble},
    )
  end

  sections
end
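
The behavior described above can be tried without the gem using a condensed, stand-alone sketch. It is not the gem's implementation: split_on_pattern is a hypothetical function, it returns plain hashes rather than Section objects, and it assumes each section ends on the line before the next header.

```ruby
# Stand-alone sketch of line-pattern splitting; not the gem's implementation.
# Returns plain hashes instead of Section objects.
def split_on_pattern(content, pattern, name_capture = 1)
  lines = content.lines
  sections = []
  # Start in a preamble that is only kept if it collects any lines.
  current = { name: :preamble, header: nil, body: +"", start_line: 1 }

  lines.each_with_index do |line, index|
    line_num = index + 1
    match = line.match(pattern)

    if match
      # Close the section in progress; skip an empty preamble.
      unless current[:header].nil? && current[:body].empty?
        sections << current.merge(end_line: line_num - 1)
      end
      name = (match[name_capture] || match[0]).strip
      current = { name: name, header: line, body: +"", start_line: line_num }
    else
      current[:body] << line
    end
  end

  # Close the final section (or the preamble when no header matched).
  unless current[:header].nil? && current[:body].empty?
    sections << current.merge(end_line: lines.length)
  end
  sections
end

doc = "intro\n## Setup\nstep 1\nstep 2\n## Usage\nrun it\n"
result = split_on_pattern(doc, /^## (.+)$/)
result.map { |s| [s[:name], s[:start_line], s[:end_line]] }
# => [[:preamble, 1, 1], ["Setup", 2, 4], ["Usage", 5, 6]]
```

Line numbers are 1-based, the header line counts as the first line of its section, and a document with no matching lines comes back as a single :preamble entry.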