Class: Markdown::Merge::LinkParser
- Inherits: Object (Object → Markdown::Merge::LinkParser)
- Defined in: lib/markdown/merge/link_parser.rb
Overview
Parslet-based parser for markdown link structures.
This parser extracts:
- Link reference definitions: `[label]: url` or `[label]: url "title"`
- Inline links: `[text](url)` or `[text](url "title")`
- Inline images: `![alt](url)` or `![alt](url "title")`
- Linked images: `[![alt](img-url)](link-url)` (nested structures)
Handles complex cases like:
- Emoji in labels (e.g., `[🖼️galtzo-discord]`)
- Nested brackets (for linked images like `[![alt]](url)`)
- Multi-byte UTF-8 characters
Defined Under Namespace
Classes: DefinitionGrammar, InlineImageGrammar, InlineLinkGrammar
Instance Method Summary
- #build_link_tree(links, images) ⇒ Array<Hash>
  Build a tree structure from links and images, detecting nesting.
- #build_url_to_label_map(definitions) ⇒ Hash<String, String>
  Build URL to label mapping from definitions.
- #find_all_link_constructs(content) ⇒ Array<Hash>
  Find all link constructs (links and images) with proper nesting structure.
- #find_inline_images(content) ⇒ Array<Hash>
  Find all inline images in content with positions.
- #find_inline_links(content) ⇒ Array<Hash>
  Find all inline links in content with positions.
- #flatten_leaf_first(items) ⇒ Array<Hash>
  Flatten a tree of link constructs to leaf-first order for processing.
- #initialize ⇒ LinkParser (constructor)
  A new instance of LinkParser.
- #parse_definition_line(line) ⇒ Hash?
  Parse a single line as a link reference definition.
- #parse_definitions(content) ⇒ Array<Hash>
  Parse link reference definitions from content.
Constructor Details
#initialize ⇒ LinkParser
Returns a new instance of LinkParser.
    # File 'lib/markdown/merge/link_parser.rb', line 143
    def initialize
      @definition_grammar = DefinitionGrammar.new
      @link_grammar = InlineLinkGrammar.new
      @image_grammar = InlineImageGrammar.new
    end
Instance Method Details
#build_link_tree(links, images) ⇒ Array<Hash>
Build a tree structure from links and images, detecting nesting.
    # File 'lib/markdown/merge/link_parser.rb', line 239
    def build_link_tree(links, images)
      # Combine all items
      all_items = links.map { |l| l.merge(type: :link) } +
        images.map { |i| i.merge(type: :image) }

      # Sort by start position
      sorted = all_items.sort_by { |item| item[:start_pos] }

      result = []
      skip_until = -1

      sorted.each do |item|
        # Skip items that are children of a previous item
        next if item[:start_pos] < skip_until

        # Find any items nested inside this one
        children = sorted.select do |other|
          other[:start_pos] > item[:start_pos] &&
            other[:end_pos] <= item[:end_pos] &&
            other != item
        end

        if children.any?
          item = item.merge(children: children)
          # Mark children to be skipped
          skip_until = item[:end_pos]
        end

        result << item
      end

      result
    end
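To illustrate the nesting detection, here is a minimal standalone sketch that copies the logic above into a top-level method and feeds it hand-written position hashes. The keys `:start_pos` and `:end_pos` come from the source; the `:url` key, the sample string, and the exclusive-end convention for `:end_pos` are assumptions made for this example, since the real hashes are produced by the private `find_constructs` helper, which is not shown on this page.

```ruby
# Standalone copy of the tree-building logic, for illustration only.
def build_link_tree(links, images)
  all_items = links.map { |l| l.merge(type: :link) } +
    images.map { |i| i.merge(type: :image) }
  sorted = all_items.sort_by { |item| item[:start_pos] }

  result = []
  skip_until = -1

  sorted.each do |item|
    # Items starting before skip_until were already attached as children.
    next if item[:start_pos] < skip_until

    children = sorted.select do |other|
      other[:start_pos] > item[:start_pos] &&
        other[:end_pos] <= item[:end_pos] &&
        other != item
    end

    if children.any?
      item = item.merge(children: children)
      skip_until = item[:end_pos]
    end

    result << item
  end

  result
end

# Hypothetical positions (assumed exclusive end) for the string:
#   [![alt](img.png)](https://example.com) and [docs](d.md)
links = [
  { start_pos: 0, end_pos: 38, url: "https://example.com" },
  { start_pos: 43, end_pos: 55, url: "d.md" },
]
images = [
  { start_pos: 1, end_pos: 16, url: "img.png" },
]

tree = build_link_tree(links, images)
puts tree.length                 # 2 top-level items
puts tree[0][:children].length   # 1 nested image inside the first link
puts tree[1].key?(:children)     # false
```

Note that `children` is a flat selection of everything contained in the parent's span, so with deeper nesting all descendants would appear in the same `:children` array rather than in a multi-level tree.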
#build_url_to_label_map(definitions) ⇒ Hash<String, String>
Build URL to label mapping from definitions.
    # File 'lib/markdown/merge/link_parser.rb', line 205
    def build_url_to_label_map(definitions)
      url_to_labels = Hash.new { |h, k| h[k] = [] }

      definitions.each do |defn|
        url_to_labels[defn[:url]] << defn[:label]
      end

      url_to_labels.transform_values do |labels|
        labels.min_by { |l| [l.length, l] }
      end
    end
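When several labels point at the same URL, the `min_by` key `[l.length, l]` picks the shortest label and breaks ties alphabetically. The sketch below is a standalone copy of the method with made-up definitions (the labels and URLs are hypothetical) to show that selection rule.

```ruby
# Standalone copy of the mapping logic, for illustration only.
def build_url_to_label_map(definitions)
  url_to_labels = Hash.new { |h, k| h[k] = [] }

  definitions.each do |defn|
    url_to_labels[defn[:url]] << defn[:label]
  end

  # Shortest label wins; equal lengths fall back to string comparison.
  url_to_labels.transform_values do |labels|
    labels.min_by { |l| [l.length, l] }
  end
end

definitions = [
  { label: "homepage", url: "https://example.com" },
  { label: "site", url: "https://example.com" },
  { label: "home", url: "https://example.com" },
  { label: "docs", url: "https://example.com/docs" },
]

map = build_url_to_label_map(definitions)
puts map["https://example.com"]       # home ("home" and "site" tie on length; "home" sorts first)
puts map["https://example.com/docs"]  # docs
```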
#find_all_link_constructs(content) ⇒ Array<Hash>
Find all link constructs (links and images) with proper nesting structure.
This method returns a flat list of items where linked images are represented as a single item with :children containing the nested image. This allows for proper replacement from leaves to root.
    # File 'lib/markdown/merge/link_parser.rb', line 225
    def find_all_link_constructs(content)
      # Find all images and links
      images = find_inline_images(content)
      links = find_inline_links(content)

      # Build a tree structure where images inside links are children
      build_link_tree(links, images)
    end
#find_inline_images(content) ⇒ Array<Hash>
Find all inline images in content with positions.
    # File 'lib/markdown/merge/link_parser.rb', line 197
    def find_inline_images(content)
      find_constructs(content, :image)
    end
#find_inline_links(content) ⇒ Array<Hash>
Find all inline links in content with positions.
    # File 'lib/markdown/merge/link_parser.rb', line 189
    def find_inline_links(content)
      find_constructs(content, :link)
    end
#flatten_leaf_first(items) ⇒ Array<Hash>
Flatten a tree of link constructs to leaf-first order for processing.
This is useful for replacement operations where we want to process innermost items first (depth-first, post-order traversal).
    # File 'lib/markdown/merge/link_parser.rb', line 280
    def flatten_leaf_first(items)
      result = []

      items.each do |item|
        if item[:children]
          # First add children (recursively), then the parent
          result.concat(flatten_leaf_first(item[:children]))
        end
        # Add the item without children key for cleaner processing
        result << item.except(:children)
      end

      result
    end
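The post-order traversal can be seen on a small hand-built tree. This is a standalone sketch, not the library method itself: it swaps `Hash#except` (built into Ruby 3.0 and later) for an equivalent `reject`, so the example also runs on older Rubies, and the sample items are hypothetical.

```ruby
# Standalone sketch of the leaf-first (post-order) flattening.
def flatten_leaf_first(items)
  result = []

  items.each do |item|
    # Children are emitted before their parent.
    result.concat(flatten_leaf_first(item[:children])) if item[:children]
    # Equivalent to item.except(:children) on Ruby 3.0+.
    result << item.reject { |key, _| key == :children }
  end

  result
end

tree = [
  { type: :link, start_pos: 0, end_pos: 38,
    children: [{ type: :image, start_pos: 1, end_pos: 16 }] },
  { type: :link, start_pos: 43, end_pos: 55 },
]

flatten_leaf_first(tree).each do |item|
  puts "#{item[:type]} at #{item[:start_pos]}"
end
# Prints:
#   image at 1
#   link at 0
#   link at 43
```

Processing the nested image before its enclosing link means the inner construct is handled first and the outer link afterwards.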
#parse_definition_line(line) ⇒ Hash?
Parse a single line as a link reference definition.
    # File 'lib/markdown/merge/link_parser.rb', line 168
    def parse_definition_line(line)
      result = @definition_grammar.parse(line)

      url = result[:url].to_s
      # Strip angle brackets if present
      url = url[1..-2] if url.start_with?("<") && url.end_with?(">")

      definition = {
        label: result[:label].to_s,
        url: url,
      }
      definition[:title] = result[:title].to_s if result[:title]
      definition
    rescue Parslet::ParseFailed
      nil
    end
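The shape of the returned hash (a `:label`, a `:url` with surrounding angle brackets stripped, an optional `:title`, or `nil` when the line is not a definition) can be sketched without Parslet. The example below is an approximation only: `DEFINITION_SKETCH` and `parse_definition_line_sketch` are hypothetical names, and the regular expression is a simplified stand-in for `DefinitionGrammar`, so the real grammar may accept or reject different inputs.

```ruby
# Simplified regex stand-in for DefinitionGrammar (an assumption, not the real grammar).
DEFINITION_SKETCH = /
  \A\s{0,3}
  \[(?<label>[^\]]+)\]:          # [label]:
  \s*
  (?<url><[^>]*>|\S+)            # <url> or bare url
  (?:\s+["'(](?<title>.*)["')])? # optional "title"
  \s*\z
/x

def parse_definition_line_sketch(line)
  match = DEFINITION_SKETCH.match(line)
  return nil unless match # mirrors the rescue of Parslet::ParseFailed

  url = match[:url]
  # Strip angle brackets if present, as the documented method does
  url = url[1..-2] if url.start_with?("<") && url.end_with?(">")

  definition = { label: match[:label], url: url }
  definition[:title] = match[:title] if match[:title]
  definition
end

with_title = parse_definition_line_sketch('[ref]: <https://example.com> "Title"')
puts with_title[:url]    # https://example.com
puts with_title[:title]  # Title

emoji = parse_definition_line_sketch("[🖼️galtzo-discord]: https://example.com/chat")
puts emoji[:label]          # 🖼️galtzo-discord
puts emoji.key?(:title)     # false

puts parse_definition_line_sketch("just some prose").inspect # nil
```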
#parse_definitions(content) ⇒ Array<Hash>
Parse link reference definitions from content.
    # File 'lib/markdown/merge/link_parser.rb', line 153
    def parse_definitions(content)
      definitions = []

      content.each_line do |line|
        result = parse_definition_line(line.chomp)
        definitions << result if result
      end

      definitions
    end