Class: ContentUrls::JavaScriptParser

Inherits:
Object
Defined in:
lib/content_urls/parsers/java_script_parser.rb

Overview

JavaScriptParser finds and rewrites URLs in JavaScript content.

Implementation note:

The methods in this class identify URLs by locating strings that match URI's regexp.
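
As a rough illustration of this matching step, the sketch below uses only Ruby's standard `uri` library. The helper name `find_uri_matches` and the restriction to the `http` and `https` schemes are assumptions made for the example; they are not part of ContentUrls, which first parses the JavaScript and only inspects string literals.

```ruby
require 'uri'

# Hypothetical helper (not part of ContentUrls): returns every substring
# of +text+ that matches the RFC 2396 absolute-URI pattern for the given
# schemes.
def find_uri_matches(text, schemes = %w[http https])
  pattern = URI::RFC2396_Parser.new.make_regexp(schemes)
  matches = []
  # The pattern contains capture groups, so String#scan would yield the
  # groups; Regexp.last_match(0) gives the whole matched URL instead.
  text.scan(pattern) { matches << Regexp.last_match(0) }
  matches
end

javascript = 'var a="http://example.com/"; var b="https://example.org/x";'
puts find_uri_matches(javascript).inspect
# => ["http://example.com/", "https://example.org/x"]
```

Note that RFC 2396 allows the apostrophe as a URI character, so a match inside a single-quoted literal can run past the closing quote. That is presumably why the listing for `rewrite_each_url` strips single quotes before matching.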

Class Method Summary

Class Method Details

.rewrite_each_url(content, &block) ⇒ Object

Rewrites each URL in the JavaScript content: the supplied block is called with each URL found, and its return value replaces that URL. Returns the rewritten content.

Examples:

Rewrite URLs in JavaScript code

javascript = 'var link="http://example.com/"'
javascript = ContentUrls::JavaScriptParser.rewrite_each_url(javascript) {|url| url.upcase}
puts "Rewritten: #{javascript}"
# prints: Rewritten: var link="HTTP://EXAMPLE.COM/"

Parameters:

  • content (String)

    the JavaScript content.

# File 'lib/content_urls/parsers/java_script_parser.rb', line 41

def self.rewrite_each_url(content, &block)
  rewritten_content = content.dup
  rewrite_urls = {}
  parser = RKelly::Parser.new
  ast = parser.parse(content)
  return content if ast.nil?
  ast.each do |node|
    if node.kind_of? RKelly::Nodes::StringNode
      value = node.value
      if match = /^'(.*)'$/.match(value)
        value = match[1]  # remove single quotes
      end
      if match = URI.regexp.match(value)
        url = match.to_s
        rewritten_url = yield url
        rewrite_urls[url] = rewritten_url if url != rewritten_url
      end
    end
  end
  if rewrite_urls.count > 0
    rewrite_urls.each do |url, rewritten_url|
      rewritten_content[url] = rewritten_url
    end
  end
  rewritten_content
end
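
One detail of the listing above may matter in practice: `rewritten_content[url] = rewritten_url` is `String#[]=` with a String index, which replaces only the first occurrence of that substring. The sketch below shows the difference from `String#gsub` using plain strings; the variable names are illustrative, and it does not claim that the library should behave differently.

```ruby
content = 'a("http://example.com/"); b("http://example.com/");'

# String#[]= with a String index replaces the first occurrence only.
first_only = content.dup
first_only['http://example.com/'] = 'https://example.com/'

# String#gsub replaces every occurrence. The block form avoids any
# special handling of backslashes in the replacement text.
every = content.gsub('http://example.com/') { 'https://example.com/' }

puts first_only
# => a("https://example.com/"); b("http://example.com/");
puts every
# => a("https://example.com/"); b("https://example.com/");
```

So if the same URL appears more than once in the content, only its first appearance is rewritten by an assignment of this form.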

.urls(content) ⇒ Array

Returns the URLs found in the JavaScript content.

Examples:

Parse JavaScript code for URLs

javascript = 'var link="http://example.com/"'
ContentUrls::JavaScriptParser.urls(javascript).each do |url|
  puts "Found URL: #{url}"
end
# => "Found URL: http://example.com/"

Parameters:

  • content (String)

    the JavaScript content.

Returns:

  • (Array)

    the unique URLs found in the content.


# File 'lib/content_urls/parsers/java_script_parser.rb', line 23

def self.urls(content)
  urls = []
  return urls if content.nil? || content.length == 0
  rewrite_each_url(content) { |url| urls << url; url }
  urls.uniq!
  urls
end
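
The listing above relies on two small Ruby idioms: a block that returns its argument unchanged, so that the rewrite pass only observes each URL, and a separate final line returning the array, because `Array#uniq!` returns `nil` when it removes nothing. The sketch below reproduces both with a hypothetical stand-in, `rewrite_each_token`, which is not part of ContentUrls and simply yields whitespace-separated tokens.

```ruby
# Hypothetical stand-in for rewrite_each_url: yields every run of
# non-space characters and substitutes the block's return value.
def rewrite_each_token(text)
  text.gsub(/\S+/) { |token| yield token }
end

def unique_tokens(text)
  tokens = []
  # Returning the token unchanged makes the rewrite a no-op, so the
  # block is used purely to collect each token.
  rewrite_each_token(text) { |token| tokens << token; token }
  tokens.uniq! # returns nil when there were no duplicates
  tokens       # so return the array itself, not uniq!'s result
end

p unique_tokens('a b a') # => ["a", "b"]
p unique_tokens('a b')   # => ["a", "b"]
p %w[a b].uniq!          # => nil
```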