Class: Markdown::Merge::FileAnalysisBase Abstract

Inherits:
Object
Includes:
Ast::Merge::FileAnalyzable
Defined in:
lib/markdown/merge/file_analysis_base.rb

Overview

This class is abstract.

Subclass it and implement the parser-specific methods.

Base class for file analysis for Markdown files.

Parses Markdown source code and extracts:

  • Top-level block elements (headings, paragraphs, lists, code blocks, etc.)

  • Freeze blocks marked with HTML comments

  • Structural signatures for matching elements between files

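Subclasses supply the parser-specific methods below. #parse_document and #next_sibling raise NotImplementedError in this base class, while #compute_parser_signature has a default implementation that subclasses may override: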

  • #parse_document(source) - Parse source and return document node

  • #next_sibling(node) - Get next sibling of a node

  • #compute_parser_signature(node) - Compute signature for parser-specific nodes

  • #node_type_name(type) - Map canonical type names if needed

Freeze blocks are marked with HTML comments:

<!-- markdown-merge:freeze -->
... content to preserve ...
<!-- markdown-merge:unfreeze -->
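
As a rough illustration of how such markers delimit a region, the sketch below scans a source string for the opening and closing comments and reports 1-indexed line ranges. find_freeze_ranges is a hypothetical helper, not part of markdown-merge, and the matching rules used here (whitespace tolerance, no nesting, no extra text inside the comment) are assumptions rather than the library's documented behavior.

```ruby
# Hypothetical helper, not part of markdown-merge: locate freeze blocks by line number.
def find_freeze_ranges(source, token = "markdown-merge")
  # Assumed marker shape: <!-- TOKEN:freeze --> and <!-- TOKEN:unfreeze -->
  marker = /<!--\s*#{Regexp.escape(token)}:(freeze|unfreeze)\s*-->/
  ranges = []
  open_line = nil

  source.each_line.with_index(1) do |line, number|
    case line[marker, 1]
    when "freeze"
      open_line ||= number # nested openers are ignored in this sketch
    when "unfreeze"
      next unless open_line # a closer without an opener is skipped

      ranges << [open_line, number]
      open_line = nil
    end
  end

  ranges
end

source = <<~MARKDOWN
  # Title
  <!-- markdown-merge:freeze -->
  keep me
  <!-- markdown-merge:unfreeze -->
  tail
MARKDOWN

find_freeze_ranges(source) # => [[2, 4]]
```

An opener that is never closed produces no range in this sketch; how the library treats that case is not stated on this page.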

Examples:

Basic usage (subclass)

class FileAnalysis < Markdown::Merge::FileAnalysisBase
  def parse_document(source)
    Markly.parse(source, flags: @flags)
  end

  def next_sibling(node)
    node.next
  end
end

Direct Known Subclasses

FileAnalysis

Constant Summary

DEFAULT_FREEZE_TOKEN = "markdown-merge"

Default freeze token for identifying freeze blocks

Returns:

  • (String)


Constructor Details

#initialize(source, freeze_token: DEFAULT_FREEZE_TOKEN, signature_generator: nil, **parser_options) ⇒ FileAnalysisBase

Initialize file analysis

Parameters:

  • source (String)

    Markdown source code to analyze

  • freeze_token (String) (defaults to: DEFAULT_FREEZE_TOKEN)

    Token for freeze block markers

  • signature_generator (Proc, nil) (defaults to: nil)

    Custom signature generator
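
This page does not show how the custom generator is invoked, so the following is only a sketch under two assumptions: the Proc receives one node at a time, and returning the node itself (rather than an Array) asks for the default signature, which is what the #fallthrough_node? check further down suggests. SketchNode is a hypothetical stand-in for a parser node.

```ruby
# Hypothetical stand-in for a parser node; real nodes come from the Markdown parser.
SketchNode = Struct.new(:type, :header_level, :text)

# Assumed contract: return an Array to set the signature,
# or return the node unchanged to fall through to the default.
heading_by_level = lambda do |node|
  if node.type == :heading
    [:heading, node.header_level] # match headings by level only, ignoring their text
  else
    node
  end
end

heading_by_level.call(SketchNode.new(:heading, 2, "Intro")) # => [:heading, 2]
```

Under the same assumptions, the lambda would be passed as FileAnalysis.new(source, signature_generator: heading_by_level).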



# File 'lib/markdown/merge/file_analysis_base.rb', line 58

def initialize(source, freeze_token: DEFAULT_FREEZE_TOKEN, signature_generator: nil, **parser_options)
  @source = source
  # Split by newlines, keeping trailing empty strings (-1)
  # But remove the final empty string if source ends with newline
  # (that empty string represents the "line after the last newline" which doesn't exist)
  @lines = source.split("\n", -1)
  @lines.pop if @lines.last == "" && source.end_with?("\n")

  @freeze_token = freeze_token
  @signature_generator = signature_generator
  @parser_options = parser_options
  @errors = []

  # Parse the Markdown source - subclasses implement this
  @document = DebugLogger.time("FileAnalysisBase#parse") do
    parse_document(source)
  end

  # Extract and integrate all nodes including freeze blocks
  @statements = extract_and_integrate_all_nodes

  DebugLogger.debug("FileAnalysisBase initialized", {
    signature_generator: signature_generator ? "custom" : "default",
    document_children: count_children(@document),
    statements_count: @statements.size,
    freeze_blocks: freeze_blocks.size,
  })
end
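
The line-splitting step in the constructor is easy to misread, so here is a standalone sketch of just those two statements. split_lines is a hypothetical name used only for this illustration.

```ruby
# Mirrors the two splitting statements from the constructor above.
def split_lines(source)
  # A limit of -1 keeps trailing empty strings, so blank lines at the end survive.
  lines = source.split("\n", -1)
  # Drop the phantom entry that follows a final newline.
  lines.pop if lines.last == "" && source.end_with?("\n")
  lines
end

split_lines("a\nb\n") # => ["a", "b"]
split_lines("a\n\nb") # => ["a", "", "b"]
split_lines("a\n\n")  # => ["a", ""]
split_lines("a")      # => ["a"]
```

Without the -1 limit, String#split would discard every trailing empty string, and a blank last line would be lost from the line count.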

Instance Attribute Details

#document ⇒ Object (readonly)

Returns the root document node.

Returns:

  • (Object)

    The root document node



# File 'lib/markdown/merge/file_analysis_base.rb', line 46

def document
  @document
end

#errors ⇒ Array (readonly)

Returns parse errors, if any.

Returns:

  • (Array)

    Parse errors if any



# File 'lib/markdown/merge/file_analysis_base.rb', line 49

def errors
  @errors
end

#statements ⇒ Array<Object, FreezeNode> (readonly)

Get all statements (block nodes outside freeze blocks + FreezeNode instances)

Returns:

  • (Array<Object, FreezeNode>)

# File 'lib/markdown/merge/file_analysis_base.rb', line 115

def statements
  @statements
end

Instance Method Details

#compute_node_signature(node) ⇒ Array?

Compute default signature for a node

Parameters:

  • node (Object)

    The parser node or FreezeNode

Returns:

  • (Array, nil)

    Signature array



# File 'lib/markdown/merge/file_analysis_base.rb', line 120

def compute_node_signature(node)
  case node
  when Ast::Merge::FreezeNodeBase
    node.signature
  when LinkDefinitionNode
    node.signature
  when GapLineNode
    node.signature
  else
    compute_parser_signature(node)
  end
end

#compute_parser_signature(node) ⇒ Array?

This method is abstract.

Subclasses should override this method

Compute signature for a parser-specific node.

Parameters:

  • node (Object)

    The parser node

Returns:

  • (Array, nil)

    Signature array



# File 'lib/markdown/merge/file_analysis_base.rb', line 158

def compute_parser_signature(node)
  type = node.type
  case type
  when :heading, :header
    # Content-based: Match headings by level and text content
    [:heading, node.header_level, extract_text_content(node)]
  when :paragraph
    # Content-based: Match paragraphs by content hash (first 32 chars of digest)
    text = extract_text_content(node)
    [:paragraph, Digest::SHA256.hexdigest(text)[0, 32]]
  when :code_block
    # Content-based: Match code blocks by fence info and content hash
    content = safe_string_content(node)
    fence_info = node.respond_to?(:fence_info) ? node.fence_info : nil
    [:code_block, fence_info, Digest::SHA256.hexdigest(content)[0, 16]]
  when :list
    # Structure-based: Match lists by type and item count (content may differ)
    list_type = node.respond_to?(:list_type) ? node.list_type : nil
    [:list, list_type, count_children(node)]
  when :block_quote, :blockquote
    # Content-based: Match block quotes by content hash
    text = extract_text_content(node)
    [:blockquote, Digest::SHA256.hexdigest(text)[0, 16]]
  when :thematic_break, :hrule
    # Structure-based: All thematic breaks are equivalent
    [:hrule]
  when :html_block, :html
    # Content-based: Match HTML blocks by content hash
    content = safe_string_content(node)
    [:html, Digest::SHA256.hexdigest(content)[0, 16]]
  when :table
    # Content-based: Match tables by structure and header content
    header_content = extract_table_header_content(node)
    [:table, count_children(node), Digest::SHA256.hexdigest(header_content)[0, 16]]
  when :footnote_definition
    # Name/label-based: Match footnotes by name or label
    label = node.respond_to?(:name) ? node.name : safe_string_content(node)
    [:footnote_definition, label]
  when :custom_block
    # Content-based: Match custom blocks by content hash
    text = extract_text_content(node)
    [:custom_block, Digest::SHA256.hexdigest(text)[0, 16]]
  else
    # Unknown type - use type and position
    pos = node.source_position
    [:unknown, type, pos&.dig(:start_line)]
  end
end
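
To make the difference between content-based and structure-based signatures concrete, the following standalone sketch rebuilds two of the signature shapes with plain values instead of parser nodes. paragraph_signature and list_signature are hypothetical helpers; only the array layouts and the digest truncation are taken from the listing above.

```ruby
require "digest"

# Content-based: any change to the text changes the signature.
def paragraph_signature(text)
  [:paragraph, Digest::SHA256.hexdigest(text)[0, 32]]
end

# Structure-based: only the list type and the item count matter.
def list_signature(list_type, items)
  [:list, list_type, items.size]
end

paragraph_signature("Hello") == paragraph_signature("Hello")  # => true
paragraph_signature("Hello") == paragraph_signature("Hello!") # => false

list_signature(:bullet_list, %w[a b]) == list_signature(:bullet_list, %w[x y]) # => true
list_signature(:bullet_list, %w[a b]) == list_signature(:bullet_list, %w[a])   # => false
```

The practical consequence is that an edited paragraph no longer matches its earlier version, while a list whose items were reworded still matches as long as the item count is unchanged.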

#extract_text_content(node) ⇒ String

Extract all text content from a node and its children

Parameters:

  • node (Object)

    The node

Returns:

  • (String)

    Concatenated text content



# File 'lib/markdown/merge/file_analysis_base.rb', line 220

def extract_text_content(node)
  text_parts = []
  node.walk do |child|
    if child.type == :text
      text_parts << child.string_content.to_s
    elsif child.type == :code
      text_parts << child.string_content.to_s
    end
  end
  text_parts.join
end
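
The traversal can be tried without a Markdown parser by using a small stand-in node. FakeNode below is hypothetical, and its walk method (yield the node itself, then each descendant depth-first) is an assumption about how the parser's walk behaves.

```ruby
# Hypothetical stand-in for a parser node; real nodes come from the Markdown parser.
FakeNode = Struct.new(:type, :string_content, :children) do
  # Yield this node, then every descendant, depth-first.
  def walk(&block)
    yield self
    (children || []).each { |child| child.walk(&block) }
  end
end

# Same logic as the listing above, with the two branches merged.
def extract_text_content(node)
  text_parts = []
  node.walk do |child|
    text_parts << child.string_content.to_s if %i[text code].include?(child.type)
  end
  text_parts.join
end

paragraph = FakeNode.new(:paragraph, nil, [
  FakeNode.new(:text, "Run "),
  FakeNode.new(:code, "bundle install"),
  FakeNode.new(:emph, nil, [FakeNode.new(:text, " now")]),
])

extract_text_content(paragraph) # => "Run bundle install now"
```

Only :text and :code nodes contribute, so inline markup such as emphasis has no effect on the extracted string.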

#fallthrough_node?(value) ⇒ Boolean

Override to detect parser nodes for signature generator fallthrough

Parameters:

  • value (Object)

    The value to check

Returns:

  • (Boolean)

    true if this is a fallthrough node



# File 'lib/markdown/merge/file_analysis_base.rb', line 136

def fallthrough_node?(value)
  value.is_a?(Ast::Merge::FreezeNodeBase) ||
    value.is_a?(LinkDefinitionNode) ||
    value.is_a?(GapLineNode) ||
    parser_node?(value) ||
    super
end

#next_sibling(node) ⇒ Object?

This method is abstract.

Subclasses must implement this method

Get the next sibling of a node.

Different parsers use different methods (next vs next_sibling).

Parameters:

  • node (Object)

    Current node

Returns:

  • (Object, nil)

    Next sibling or nil

Raises:

  • (NotImplementedError)


# File 'lib/markdown/merge/file_analysis_base.rb', line 103

def next_sibling(node)
  raise NotImplementedError, "#{self.class} must implement #next_sibling"
end
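
How the base class uses this hook is not shown on this page. A plausible use, sketched here as an assumption, is to walk the top-level nodes of a document one sibling at a time. LinkedNode, SketchAnalysis, and top_level_nodes are hypothetical names for illustration only.

```ruby
# Hypothetical stand-in for a parser node that exposes its sibling as #next.
LinkedNode = Struct.new(:type, :next)

class SketchAnalysis
  # Parser-specific hook: this stand-in parser calls the accessor `next`.
  def next_sibling(node)
    node.next
  end

  # Collect a node and all following siblings by repeatedly asking the hook.
  def top_level_nodes(first)
    nodes = []
    current = first
    while current
      nodes << current
      current = next_sibling(current)
    end
    nodes
  end
end

third = LinkedNode.new(:list, nil)
second = LinkedNode.new(:paragraph, third)
first = LinkedNode.new(:heading, second)

SketchAnalysis.new.top_level_nodes(first).map(&:type) # => [:heading, :paragraph, :list]
```

Routing every sibling lookup through one method keeps the traversal loop independent of whether a given parser names the accessor next or next_sibling.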

#parse_document(source) ⇒ Object

This method is abstract.

Subclasses must implement this method

Parse the source document.

Parameters:

  • source (String)

    Markdown source to parse

Returns:

  • (Object)

    Root document node

Raises:

  • (NotImplementedError)


# File 'lib/markdown/merge/file_analysis_base.rb', line 92

def parse_document(source)
  raise NotImplementedError, "#{self.class} must implement #parse_document"
end

#parser_node?(value) ⇒ Boolean

Check if value is a parser-specific node.

Parameters:

  • value (Object)

    Value to check

Returns:

  • (Boolean)

    true if this is a parser node



# File 'lib/markdown/merge/file_analysis_base.rb', line 148

def parser_node?(value)
  # Default: check if it responds to :type (common for AST nodes)
  value.respond_to?(:type)
end

#safe_string_content(node) ⇒ String

Safely get string content from a node

Parameters:

  • node (Object)

    The node

Returns:

  • (String)

    String content or empty string



# File 'lib/markdown/merge/file_analysis_base.rb', line 210

def safe_string_content(node)
  node.string_content.to_s
rescue TypeError
  # Some node types don't support string_content
  extract_text_content(node)
end
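
The fallback path can be exercised with stand-in nodes. Leaf and Container are hypothetical; Container simulates a node type whose string_content raises TypeError, which is the situation the rescue clause above is written for.

```ruby
# Hypothetical stand-ins for parser nodes; real nodes come from the Markdown parser.
Leaf = Struct.new(:type, :string_content) do
  def walk
    yield self
  end
end

class Container
  attr_reader :type

  def initialize(type, children)
    @type = type
    @children = children
  end

  # Simulates a node type that rejects string_content.
  def string_content
    raise TypeError, "string_content is not supported for #{type}"
  end

  def walk(&block)
    yield self
    @children.each { |child| child.walk(&block) }
  end
end

def extract_text_content(node)
  parts = []
  node.walk { |child| parts << child.string_content.to_s if %i[text code].include?(child.type) }
  parts.join
end

def safe_string_content(node)
  node.string_content.to_s
rescue TypeError
  # Fall back to collecting text from the node's descendants.
  extract_text_content(node)
end

safe_string_content(Leaf.new(:code_block, "puts 1\n"))                      # => "puts 1\n"
safe_string_content(Container.new(:html_block, [Leaf.new(:text, "<hr>")])) # => "<hr>"
```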

#source_range(start_line, end_line) ⇒ String

Get the source text for a range of lines

Lines are joined with newlines. The result ends with a trailing newline unless end_line reaches the last line of the file and the original source does not end with a newline.

Parameters:

  • start_line (Integer)

    Start line (1-indexed)

  • end_line (Integer)

    End line (1-indexed)

Returns:

  • (String)

    Source text



# File 'lib/markdown/merge/file_analysis_base.rb', line 240

def source_range(start_line, end_line)
  return "" if start_line < 1 || end_line < start_line

  extracted_lines = @lines[(start_line - 1)..(end_line - 1)]
  return "" if extracted_lines.empty?

  # Add newlines between and after lines, but not after the last line of the file
  # unless it originally had one
  result = extracted_lines.join("\n")

  # Add trailing newline if this isn't the last line of the file
  # (the last line may or may not have a trailing newline in the original source)
  if end_line < @lines.length
    result += "\n"
  elsif @source&.end_with?("\n")
    # Last line of file, but original source ends with newline
    result += "\n"
  end

  result
end
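
The trailing-newline rule is easiest to see with concrete inputs. The sketch below reproduces the logic in a standalone class so it can be run without the library; LineSource is a hypothetical name, and the `|| []` guard for ranges that start beyond the end of the file is an addition made for this sketch, not something taken from the listing above.

```ruby
class LineSource
  def initialize(source)
    @source = source
    @lines = source.split("\n", -1)
    @lines.pop if @lines.last == "" && source.end_with?("\n")
  end

  def source_range(start_line, end_line)
    return "" if start_line < 1 || end_line < start_line

    # Array#[] returns nil when the range starts past the end of the array;
    # this sketch treats that case as an empty selection.
    extracted = @lines[(start_line - 1)..(end_line - 1)] || []
    return "" if extracted.empty?

    result = extracted.join("\n")
    # Interior ranges always end with a newline; a range that reaches the
    # last line does so only when the source itself ends with one.
    result += "\n" if end_line < @lines.length || @source.end_with?("\n")
    result
  end
end

LineSource.new("a\nb\nc").source_range(1, 2)   # => "a\nb\n"
LineSource.new("a\nb\nc").source_range(3, 3)   # => "c"
LineSource.new("a\nb\nc\n").source_range(3, 3) # => "c\n"
```

Because of this rule, concatenating consecutive ranges that cover the whole file reproduces the original source byte for byte.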

#valid? ⇒ Boolean

Check whether parsing succeeded: no errors were recorded and a document node is present.

Returns:

  • (Boolean)


# File 'lib/markdown/merge/file_analysis_base.rb', line 109

def valid?
  @errors.empty? && !@document.nil?
end