Class: Burly::Parsers::HtmlParser Private
- Inherits: Burly::Parser
- Inheritance chain: Object → Burly::Parser → Burly::Parsers::HtmlParser
- Defined in: lib/burly/parsers/html_parser.rb
This class is part of a private API. You should avoid using this class if possible, as it may be removed or be changed in the future.
Constant Summary
- SRCSET_ATTRIBUTES_MAP =
This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
A map of HTML srcset attributes and their associated element names.
{
  "imagesrcset" => ["link"],
  "srcset" => ["img", "source"],
}.freeze
- URL_ATTRIBUTES_MAP =
This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
A map of HTML URL attributes and their associated element names.
{
  "action" => ["form"],
  "cite" => ["blockquote", "del", "ins", "q"],
  "data" => ["object"],
  "formaction" => ["button", "input"],
  "href" => ["a", "area", "base", "link"],
  "ping" => ["a", "area"],
  "poster" => ["video"],
  "src" => ["audio", "embed", "iframe", "img", "input", "script", "source", "track", "video"],
}.freeze
- ATTRIBUTES_XPATHS =
This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
URL_ATTRIBUTES_MAP.merge(SRCSET_ATTRIBUTES_MAP).flat_map do |attribute, names|
  names.map { |name| ".//#{name} / @#{attribute}" }
end
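To illustrate what the expression above produces, here is a minimal sketch that runs the same merge and flat_map over trimmed copies of the two maps. The constant names URL_ATTRIBUTES, SRCSET_ATTRIBUTES, and XPATHS are hypothetical stand-ins chosen for this example, and the maps are deliberately abbreviated; only the shape of the resulting XPath strings is meant to carry over.

```ruby
# Abbreviated, hypothetical copies of the two maps (not the full constants).
URL_ATTRIBUTES = {
  "href" => ["a", "link"],
  "src" => ["img"],
}.freeze

SRCSET_ATTRIBUTES = {
  "srcset" => ["img", "source"],
}.freeze

# Same construction as ATTRIBUTES_XPATHS: one XPath expression per
# (element name, attribute) pair, selecting the attribute node itself.
XPATHS = URL_ATTRIBUTES.merge(SRCSET_ATTRIBUTES).flat_map do |attribute, names|
  names.map { |name| ".//#{name} / @#{attribute}" }
end

puts XPATHS
# .//a / @href
# .//link / @href
# .//img / @src
# .//img / @srcset
# .//source / @srcset
```

Because Ruby hashes preserve insertion order, the URL attributes come first and the srcset attributes follow. Each expression selects attribute nodes rather than elements, which is why #parse can read both a name and a value from every result.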
Constants inherited from Burly::Parser
Burly::Parser::URI_PARSER, Burly::Parser::URI_REGEXP
Instance Method Summary
- #initialize(document, context: nil) ⇒ HtmlParser (constructor, private)
  A new instance of HtmlParser.
- #parse ⇒ Array<String> (private)
  Parse an HTML document for absolute or relative URLs.
Constructor Details
#initialize(document, context: nil) ⇒ HtmlParser
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Returns a new instance of HtmlParser.
# File 'lib/burly/parsers/html_parser.rb', line 41
def initialize(document, context: nil)
  @context = context
  super
end
Instance Method Details
#parse ⇒ Array<String>
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Parse an HTML document for absolute or relative URLs.
# File 'lib/burly/parsers/html_parser.rb', line 50
def parse
  attr_nodes.flat_map do |attr_node|
    if SRCSET_ATTRIBUTES_MAP.key?(attr_node.name)
      urls_from_candidate_strings(attr_node.value.split(/\s*,\s*/))
    else
      attr_node.value.strip
    end
  end
end
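The branch on srcset attributes may be easier to follow with a standalone sketch. The version below works on plain name and value strings instead of attribute nodes. Its urls_from_candidate_strings is a hypothetical stand-in written for this example: the real private helper is not shown in this documentation and may behave differently. The sketch assumes each image candidate is a URL optionally followed by a width or density descriptor, and it keeps only the URL.

```ruby
# Hypothetical stand-in for the private helper called by #parse.
# Assumption: each candidate string looks like "<url> [descriptor]".
def urls_from_candidate_strings(candidate_strings)
  candidate_strings.filter_map do |candidate|
    url = candidate.strip.split(/\s+/).first
    url unless url.nil? || url.empty?
  end
end

# Mirrors the branch in #parse, using plain strings instead of attribute
# nodes. Always returns an array so callers can flat_map over the result.
def urls_from_attribute(name, value)
  if %w[imagesrcset srcset].include?(name)
    urls_from_candidate_strings(value.split(/\s*,\s*/))
  else
    [value.strip]
  end
end

p urls_from_attribute("srcset", "small.jpg 480w, large.jpg 2x")
# => ["small.jpg", "large.jpg"]
p urls_from_attribute("href", "  /about ")
# => ["/about"]
```

Splitting on commas is a simplification: a candidate URL that itself contains a comma (a data: URI, for instance) would be cut into pieces by this pattern.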