Class: Gembuild::GemScraper
- Inherits: Object
- Inheritance: Object → Gembuild::GemScraper
- Defined in: lib/gembuild/gem_scraper.rb
Overview
This class queries rubygems.org for information about a gem: its latest version, description, checksum, licenses, dependencies, and homepage URL.
Instance Attribute Summary
- #agent ⇒ Mechanize (readonly)
  The Mechanize agent.
- #deps ⇒ String (readonly)
  The rubygems URL for getting dependency information.
- #gem ⇒ String (readonly)
  The rubygems URL for the frontend.
- #gemname ⇒ String (readonly)
  The rubygem about which to query.
- #url ⇒ String (readonly)
  The rubygems URL to get version information.
Instance Method Summary
- #format_description_from_response(response) ⇒ String
  Gets a well-formed gem description from the parsed response.
- #get_checksum_from_response(response) ⇒ String
  Gets the sha256 checksum returned from the rubygems.org API.
- #get_dependencies_for_version(version) ⇒ Array
  Get all other gem dependencies for the given version.
- #get_licenses_from_response(response) ⇒ Array
  Get the array of licenses under which the gem is licensed.
- #get_version_from_response(response) ⇒ Gem::Version
  Gets the version number from the parsed response.
- #initialize(gemname) ⇒ Gembuild::GemScraper (constructor)
  Creates a new GemScraper instance.
- #query_latest_version ⇒ Hash
  Query the rubygems version API for the latest version.
- #scrape! ⇒ Hash
  Quick method to get all important information in a single hash for later processing.
- #scrape_frontend_for_homepage_url ⇒ String
  Scrape the rubygems.org frontend for the gem’s homepage URL.
Constructor Details
#initialize(gemname) ⇒ Gembuild::GemScraper
Creates a new GemScraper instance.

    # File 'lib/gembuild/gem_scraper.rb', line 65

    def initialize(gemname)
      fail Gembuild::UndefinedGemNameError if gemname.nil? || gemname.empty?

      @gemname = gemname
      @agent = Mechanize.new
      @url = "https://rubygems.org/api/v1/versions/#{gemname}.json"
      @deps = "https://rubygems.org/api/v1/dependencies?gems=#{gemname}"
      @gem = "https://rubygems.org/gems/#{gemname}"
    end
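The constructor only validates the name and builds three URL strings; nothing is fetched until one of the query methods runs. The following is a minimal standalone sketch of that step, not Gembuild's own code: `build_rubygems_urls` is a hypothetical helper, and `ArgumentError` stands in for `Gembuild::UndefinedGemNameError` so the snippet runs without Gembuild or Mechanize installed.

```ruby
# Hypothetical helper (not part of Gembuild) mirroring the URL construction
# in GemScraper#initialize. ArgumentError stands in for
# Gembuild::UndefinedGemNameError.
def build_rubygems_urls(gemname)
  raise ArgumentError, 'gem name is nil or empty' if gemname.nil? || gemname.empty?

  {
    url: "https://rubygems.org/api/v1/versions/#{gemname}.json",
    deps: "https://rubygems.org/api/v1/dependencies?gems=#{gemname}",
    gem: "https://rubygems.org/gems/#{gemname}"
  }
end

urls = build_rubygems_urls('mina')
puts urls[:url]
puts urls[:deps]
puts urls[:gem]
```

Because the name is interpolated directly into each URL, a caller would presumably want to pass an exact gem name as listed on rubygems.org.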
Instance Attribute Details
#agent ⇒ Mechanize (readonly)
Returns the Mechanize agent.
    # File 'lib/gembuild/gem_scraper.rb', line 35

    class GemScraper
      attr_reader :agent, :deps, :gem, :gemname, :url

      # Creates a new GemScraper instance
      #
      # @raise [Gembuild::UndefinedGemNameError] if the gemname is nil or empty
      #
      # @example Create a new GemScraper object
      #   Gembuild::GemScraper.new('mina')
      #   # => #<Gembuild::GemScraper:0x00000002f8a500
      #   #  @agent=
      #   #   #<Mechanize
      #   #    #<Mechanize::CookieJar:0x00000002f8a410
      #   #     @store=
      #   #      #<HTTP::CookieJar::HashStore:0x00000002f8a370
      #   #       @gc_index=0,
      #   #       @gc_threshold=150,
      #   #       @jar={},
      #   #       @logger=nil,
      #   #       @mon_count=0,
      #   #       @mon_mutex=#<Mutex:0x00000002f8a320>,
      #   #       @mon_owner=nil>>
      #   #    nil>,
      #   #  @deps="https://rubygems.org/api/v1/dependencies?gems=mina",
      #   #  @gem="https://rubygems.org/gems/mina",
      #   #  @gemname="mina",
      #   #  @url="https://rubygems.org/api/v1/versions/mina.json">
      #
      # @param gemname [String] The gem about which to query.
      # @return [Gembuild::GemScraper] a new GemScraper instance
      def initialize(gemname)
        fail Gembuild::UndefinedGemNameError if gemname.nil? || gemname.empty?

        @gemname = gemname
        @agent = Mechanize.new
        @url = "https://rubygems.org/api/v1/versions/#{gemname}.json"
        @deps = "https://rubygems.org/api/v1/dependencies?gems=#{gemname}"
        @gem = "https://rubygems.org/gems/#{gemname}"
      end

      # Query the rubygems version API for the latest version.
      #
      # @raise [Gembuild::GemNotFoundError] if the page returns a 404 (not
      #   found) error.
      #
      # @example Query rubygems.org for version information
      #   s = Gembuild::GemScraper.new('mina')
      #   s.query_latest_version
      #   # => {:authors=>"Rico Sta. Cruz, Michael Galero",
      #   #  :built_at=>"2015-07-08T00:00:00.000Z",
      #   #  :created_at=>"2015-07-08T13:13:33.292Z",
      #   #  :description=>"Really fast deployer and server automation tool.",
      #   #  :downloads_count=>18709,
      #   #  :metadata=>{},
      #   #  :number=>"0.3.7",
      #   #  :summary=>"Really fast deployer and server automation tool.",
      #   #  :platform=>"ruby",
      #   #  :ruby_version=>">= 0",
      #   #  :prerelease=>false,
      #   #  :licenses=>[],
      #   #  :requirements=>[],
      #   #  :sha=>
      #   #   "bd1fa2b56ed1aded882a12f6365a04496f5cf8a14c07f8c4f1f3cfc944ef34f6"
      #   # }
      #
      # @return [Hash] the information about the latest version of the gem
      def query_latest_version
        response = JSON.parse(agent.get(url).body, symbolize_names: true)

        # Skip any release marked as a "prerelease"
        response.shift while response.first[:prerelease]

        response.first
      rescue Mechanize::ResponseCodeError, Net::HTTPNotFound
        raise Gembuild::GemNotFoundError
      end

      # Gets the version number from the parsed response.
      #
      # @param response [Hash] The JSON parsed results from rubygems.org.
      # @return [Gem::Version] the current version of the gem
      def get_version_from_response(response)
        Gem::Version.new(response.fetch(:number))
      end

      # Gets a well-formed gem description from the parsed response.
      #
      # @param response [Hash] The JSON parsed results from rubygems.org.
      # @return [String] the gem description or summary ending in a full-stop
      def format_description_from_response(response)
        description = response.fetch(:description)
        description = response.fetch(:summary) if description.empty?

        # Replace any newlines or tabs (which would mess up a PKGBUILD) with
        # spaces. Then, make sure there is no leading or trailing whitespace.
        description = description.gsub(/[[:space:]]+/, ' ').strip

        # Ensure that the description ends in a full-stop.
        description += '.' unless description[-1, 1] == '.'

        description
      end

      # Gets the sha256 checksum returned from the rubygems.org API.
      #
      # @param response [Hash] The JSON parsed results from rubygems.org.
      # @return [String] the sha256 sum of the gem file
      def get_checksum_from_response(response)
        response.fetch(:sha)
      end

      # Get the array of licenses under which the gem is licensed.
      #
      # @param response [Hash] The JSON parsed results from rubygems.org.
      # @return [Array] the licenses for the gem
      def get_licenses_from_response(response)
        response.fetch(:licenses)
      end

      # Get all other gem dependencies for the given version.
      #
      # @param version [String|Gem::Version] The version for which to get the
      #   dependencies.
      # @return [Array] list of other gems upon which the gem depends
      def get_dependencies_for_version(version)
        version = Gem::Version.new(version) if version.is_a?(String)

        payload = Marshal.load(agent.get(deps).body)

        dependencies = payload.find do |v|
          Gem::Version.new(v[:number]) == version
        end

        dependencies[:dependencies].map(&:first)
      end

      # Scrape the rubygems.org frontend for the gem's homepage URL.
      #
      # @return [String] the homepage URL of the gem
      def scrape_frontend_for_homepage_url
        html = agent.get(gem).body
        links = Nokogiri::HTML(html).css('a')

        homepage_link = links.find do |a|
          a.text.strip == 'Homepage'
        end

        homepage_link[:href]
      end

      # Quick method to get all important information in a single hash for
      # later processing.
      #
      # @return [Hash] hash containing all the information available from the
      #   rubygems.org APIs and website
      def scrape!
        response = query_latest_version
        version = get_version_from_response(response)

        {
          version: version,
          description: format_description_from_response(response),
          checksum: get_checksum_from_response(response),
          license: get_licenses_from_response(response),
          dependencies: get_dependencies_for_version(version),
          homepage: scrape_frontend_for_homepage_url
        }
      end
    end
#deps ⇒ String (readonly)
Returns the rubygems URL for getting dependency information.
#gem ⇒ String (readonly)
Returns the rubygems URL for the frontend.
#gemname ⇒ String (readonly)
Returns the rubygem about which to query.
#url ⇒ String (readonly)
Returns the rubygems URL to get version information.
Instance Method Details
#format_description_from_response(response) ⇒ String
Gets a well-formed gem description from the parsed response.
    # File 'lib/gembuild/gem_scraper.rb', line 125

    def format_description_from_response(response)
      description = response.fetch(:description)
      description = response.fetch(:summary) if description.empty?

      # Replace any newlines or tabs (which would mess up a PKGBUILD) with
      # spaces. Then, make sure there is no leading or trailing whitespace.
      description = description.gsub(/[[:space:]]+/, ' ').strip

      # Ensure that the description ends in a full-stop.
      description += '.' unless description[-1, 1] == '.'

      description
    end
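To see the normalization on concrete input, here is a small standalone sketch. `format_description` is a hypothetical free function that copies the same steps (fall back to the summary, collapse whitespace, strip, append a full-stop), and the sample hashes are made up for illustration rather than taken from a real API response.

```ruby
# Standalone copy of the normalization steps, for illustration only.
def format_description(response)
  description = response.fetch(:description)
  # Fall back to the summary when the description is empty.
  description = response.fetch(:summary) if description.empty?
  # Collapse every run of whitespace (newlines, tabs, spaces) to one space.
  description = description.gsub(/[[:space:]]+/, ' ').strip
  # Append a full-stop only when one is missing.
  description += '.' unless description.end_with?('.')
  description
end

messy = { description: "Really fast\n\tdeployer  ", summary: 'Unused summary' }
puts format_description(messy)   # Really fast deployer.

blank = { description: '', summary: 'A short summary' }
puts format_description(blank)   # A short summary.
```

Note that a `nil` description (as opposed to an empty string) would raise `NoMethodError` at the `empty?` call, in this sketch as in the method above.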
#get_checksum_from_response(response) ⇒ String
Gets the sha256 checksum returned from the rubygems.org API.
    # File 'lib/gembuild/gem_scraper.rb', line 143

    def get_checksum_from_response(response)
      response.fetch(:sha)
    end
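Since the `:sha` value is documented as the sha256 sum of the gem file, one plausible use is to compare it with a digest computed locally. The sketch below assumes the value is a lowercase hex string, as in the sample response shown for `#query_latest_version`; `checksum_matches?` is a hypothetical helper, and an in-memory string stands in for the bytes of a downloaded `.gem` file.

```ruby
require 'digest'

# Hypothetical helper: compare the API's :sha value against a locally
# computed SHA-256 hex digest of the gem file's bytes.
def checksum_matches?(response, gem_bytes)
  response.fetch(:sha) == Digest::SHA256.hexdigest(gem_bytes)
end

# Stand-in data: a real caller would read the downloaded .gem file instead.
bytes = 'pretend these are the bytes of a .gem file'
response = { sha: Digest::SHA256.hexdigest(bytes) }

puts checksum_matches?(response, bytes)        # true
puts checksum_matches?(response, bytes + 'x')  # false
```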
#get_dependencies_for_version(version) ⇒ Array
Get all other gem dependencies for the given version.
    # File 'lib/gembuild/gem_scraper.rb', line 160

    def get_dependencies_for_version(version)
      version = Gem::Version.new(version) if version.is_a?(String)

      payload = Marshal.load(agent.get(deps).body)

      dependencies = payload.find do |v|
        Gem::Version.new(v[:number]) == version
      end

      dependencies[:dependencies].map(&:first)
    end
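The lookup can be tried offline by marshalling a hand-built payload. The sketch below assumes the payload is an array of hashes with `:number` and `:dependencies` keys, which is inferred only from the keys the method reads; the entries are illustrative sample data, and `dependency_names` is a hypothetical helper. Unlike the method above, it returns an empty array when the version is absent, where the original would raise `NoMethodError` on `nil`.

```ruby
require 'rubygems'

# Hypothetical helper running the same lookup on an already-fetched body.
def dependency_names(body, version)
  version = Gem::Version.new(version) if version.is_a?(String)
  entry = Marshal.load(body).find { |v| Gem::Version.new(v[:number]) == version }
  # Assumption for this sketch: treat a missing version as "no dependencies".
  entry ? entry[:dependencies].map(&:first) : []
end

# Illustrative payload; each dependency is a [name, requirement] pair.
sample = [
  { number: '0.3.7', dependencies: [['rake', '>= 0'], ['open4', '~> 1.3.4']] },
  { number: '0.3.6', dependencies: [['rake', '>= 0']] }
]
body = Marshal.dump(sample)

p dependency_names(body, '0.3.7')                    # ["rake", "open4"]
p dependency_names(body, Gem::Version.new('0.3.6'))  # ["rake"]
p dependency_names(body, '9.9.9')                    # []
```

`Marshal.load` can instantiate arbitrary Ruby objects, so it is generally only appropriate for data from a source the caller trusts.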
#get_licenses_from_response(response) ⇒ Array
Get the array of licenses under which the gem is licensed.
    # File 'lib/gembuild/gem_scraper.rb', line 151

    def get_licenses_from_response(response)
      response.fetch(:licenses)
    end
#get_version_from_response(response) ⇒ Gem::Version
Gets the version number from the parsed response.
    # File 'lib/gembuild/gem_scraper.rb', line 117

    def get_version_from_response(response)
      Gem::Version.new(response.fetch(:number))
    end
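Wrapping the `:number` string in `Gem::Version` matters because versions then compare segment by segment rather than character by character. A brief sketch, using a hand-written sample hash and a hypothetical `version_from` helper that performs the same conversion:

```ruby
require 'rubygems'

# Same conversion as get_version_from_response, as a free function.
def version_from(response)
  Gem::Version.new(response.fetch(:number))
end

version = version_from({ number: '0.3.7' })

# Gem::Version compares numeric segments: 7 < 10.
puts version < Gem::Version.new('0.3.10')  # true
# Plain strings compare character by character: '7' > '1'.
puts '0.3.7' < '0.3.10'                    # false
```

Because the method uses `fetch`, a response without a `:number` key raises `KeyError` instead of silently producing a version from `nil`.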
#query_latest_version ⇒ Hash
Query the rubygems version API for the latest version.

    # File 'lib/gembuild/gem_scraper.rb', line 102

    def query_latest_version
      response = JSON.parse(agent.get(url).body, symbolize_names: true)

      # Skip any release marked as a "prerelease"
      response.shift while response.first[:prerelease]

      response.first
    rescue Mechanize::ResponseCodeError, Net::HTTPNotFound
      raise Gembuild::GemNotFoundError
    end
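The prerelease filter can be exercised without a network call by parsing a literal JSON string. This sketch assumes, as the method itself does, that the first element of the array is the newest release; the JSON body is made-up sample data and `latest_stable` is a hypothetical helper. It uses `find` so that a list containing only prereleases yields `nil`, whereas the `shift while` loop above would call `[]` on `nil` once the array is empty and raise `NoMethodError`, which the `rescue` clause does not cover.

```ruby
require 'json'

# Hypothetical helper: return the first entry not flagged as a prerelease,
# or nil when every entry is a prerelease.
def latest_stable(versions)
  versions.find { |v| !v[:prerelease] }
end

# Made-up sample body, newest entry first.
body = '[{"number":"1.0.0.rc1","prerelease":true},' \
       '{"number":"0.3.7","prerelease":false}]'
versions = JSON.parse(body, symbolize_names: true)

puts latest_stable(versions)[:number]  # 0.3.7
```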
#scrape! ⇒ Hash
Quick method to get all important information in a single hash for later processing.
    # File 'lib/gembuild/gem_scraper.rb', line 191

    def scrape!
      response = query_latest_version
      version = get_version_from_response(response)

      {
        version: version,
        description: format_description_from_response(response),
        checksum: get_checksum_from_response(response),
        license: get_licenses_from_response(response),
        dependencies: get_dependencies_for_version(version),
        homepage: scrape_frontend_for_homepage_url
      }
    end
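For callers, the main thing to know is the shape of the returned hash. The sketch below builds a hash with the same six keys from sample values (the dependency list and homepage are placeholders, not real scraped results) and passes it to a hypothetical `summarize` consumer. Note that the key is `:license` (singular) even though its value is the array returned by `#get_licenses_from_response`.

```ruby
require 'rubygems'

# Hypothetical consumer of a scrape!-shaped hash: render a one-line summary.
def summarize(result)
  license = result[:license].empty? ? 'unknown license' : result[:license].join(', ')
  "#{result[:version]} (#{license}): #{result[:description]}"
end

# A hash shaped like the return value of scrape! (sample values only).
result = {
  version: Gem::Version.new('0.3.7'),
  description: 'Really fast deployer and server automation tool.',
  checksum: 'bd1fa2b56ed1aded882a12f6365a04496f5cf8a14c07f8c4f1f3cfc944ef34f6',
  license: [],
  dependencies: ['rake'],
  homepage: 'https://example.org'
}

puts summarize(result)
# 0.3.7 (unknown license): Really fast deployer and server automation tool.
```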
#scrape_frontend_for_homepage_url ⇒ String
Scrape the rubygems.org frontend for the gem’s homepage URL.
    # File 'lib/gembuild/gem_scraper.rb', line 175

    def scrape_frontend_for_homepage_url
      html = agent.get(gem).body
      links = Nokogiri::HTML(html).css('a')

      homepage_link = links.find do |a|
        a.text.strip == 'Homepage'
      end

      homepage_link[:href]
    end