Module: Pages
- Included in:
- ClassMethods
- Defined in:
- lib/botinsta/pages.rb
Overview
Contains methods for getting pages, the query_id, and the JS link. To like media from a tag, we first need that tag's query_id (a.k.a. query_hash).
Instance Method Summary
- #get_first_page_data(tag) ⇒ Object
  Gets the first page's JSON string for the tag, from which data (i.e. media IDs and owner IDs) is extracted, and creates a PageData instance.
- #get_js_link(tag) ⇒ String
  Returns the .js link of the TagPageContainer from which we will extract the query_id.
- #get_next_page_data(tag) ⇒ Object
  Gets the next page's JSON string once we have liked all the media on the first page, and creates a PageData instance.
- #get_user_page_data(user_id) ⇒ Object
  Gets the user page's JSON string and parses it to create a UserData instance.
- #set_query_id(tag) ⇒ Object
  Sets the query id for the current tag.
Instance Method Details
#get_first_page_data(tag) ⇒ Object
Gets the first page's JSON string for the tag, from which data (i.e. media IDs and owner IDs) is extracted, and creates a PageData instance.
# File 'lib/botinsta/pages.rb', line 36
def get_first_page_data(tag)
  print_time_stamp
  puts 'Getting the first page for the tag '.colorize(:blue) + "##{tag}"
  response = @agent.get "https://www.instagram.com/explore/tags/#{tag}/?__a=1"
  data = JSON.parse(response.body.sub(/graphql/, 'data'))
  data.extend Hashie::Extensions::DeepFind
  @page = PageData.new(data)
end
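The `sub(/graphql/, 'data')` call in the method above is easy to misread. The sketch below shows its effect on a hypothetical, trimmed-down response body (the real payload is much larger, and its shape is assumed here). `normalize_first_page` is an illustrative helper, not part of the module. The rename presumably gives the first page the same top-level key as the later GraphQL query responses, though the source does not say so.

```ruby
require 'json'

# Illustrative helper (not part of Pages): renames the top-level key.
# String#sub replaces only the FIRST match, so later occurrences of the
# word "graphql" inside the body are left untouched.
def normalize_first_page(body)
  JSON.parse(body.sub(/graphql/, 'data'))
end

# Hypothetical sample body; the real response shape is an assumption.
sample_body = '{"graphql":{"hashtag":{"name":"ruby","note":"graphql"}}}'

data = normalize_first_page(sample_body)
puts data.keys.inspect                # ["data"]
puts data['data']['hashtag']['note']  # graphql
```

Because the substitution runs on the raw string before parsing, it relies on the key being the first place the word appears in the body.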
#get_js_link(tag) ⇒ String
Returns the .js link of the TagPageContainer from which we will extract the query_id.
# File 'lib/botinsta/pages.rb', line 21
def get_js_link(tag)
  response = @agent.get "https://instagram.com/explore/tags/#{tag}"
  # Parsing the returned page to select the script which has 'TagPageContainer.js' in its src
  parsed_page = Nokogiri::HTML(response.body)
  script_array = parsed_page.css('script').select { |script| script.to_s.include?('TagPageContainer.js') }
  script = script_array.first
  'https://instagram.com' + script['src']
end
#get_next_page_data(tag) ⇒ Object
Gets the next page's JSON string once we have liked all the media on the first page, and creates a PageData instance. This call requires the query_id and the end_cursor string of the current page.
# File 'lib/botinsta/pages.rb', line 51
def get_next_page_data(tag)
  print_time_stamp
  puts 'Getting the next page for the tag '.colorize(:blue) + "#{tag}"
  next_page_link = "https://www.instagram.com/graphql/query/?query_hash=#{@query_id}&"\
                   "variables={\"tag_name\":\"#{tag}\"," \
                   "\"first\":10,\"after\":\"#{@page.end_cursor}\"}"
  response = @agent.get next_page_link
  data = JSON.parse(response.body)
  data.extend Hashie::Extensions::DeepFind
  @page = PageData.new(data)
end
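The method above assembles the `variables` JSON by hand with escaped quotes. As an alternative sketch (not the module's actual implementation), the same URL can be built with the standard library so that the JSON and the query string are escaped automatically. The helper name `next_page_link` and the sample query_id and cursor values are hypothetical.

```ruby
require 'json'
require 'uri'

# Illustrative helper: builds the GraphQL pagination URL from its parts.
# The endpoint and parameter names are taken from get_next_page_data;
# the argument values used below are placeholders.
def next_page_link(query_id, tag, end_cursor, first = 10)
  variables = JSON.generate({ tag_name: tag, first: first, after: end_cursor })
  query = URI.encode_www_form({ query_hash: query_id, variables: variables })
  "https://www.instagram.com/graphql/query/?#{query}"
end

puts next_page_link('0123abcd', 'ruby', 'PLACEHOLDER_CURSOR==')
```

Unlike the hand-built string, this version percent-encodes the braces, quotes, and any `=` or `&` characters in the cursor, so the result is a valid query string whatever the cursor contains.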
#get_user_page_data(user_id) ⇒ Object
Gets user page JSON string and parses it to create a UserData instance.
# File 'lib/botinsta/pages.rb', line 67
def get_user_page_data(user_id)
  url_user_detail = "https://i.instagram.com/api/v1/users/#{user_id}/info/"
  begin
    response = @agent.get url_user_detail
  rescue Mechanize::ResponseCodeError
    return false
  end
  data = JSON.parse(response.body)
  data.extend Hashie::Extensions::DeepFind
  @user = UserData.new(data)
  true
end
#set_query_id(tag) ⇒ Object
Sets the query id for the current tag.
# File 'lib/botinsta/pages.rb', line 9
def set_query_id(tag)
  response = @agent.get get_js_link tag
  # RegExp for getting the right query id. Because there are a few of them.
  match_data = /byTagName\.get\(t\)\.pagination},queryId:"(?<queryId>[0-9a-z]+)/.match(response.body)
  @query_id = match_data[:queryId]
end
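To see what the regular expression in `set_query_id` captures, the sketch below runs the same pattern against a hypothetical fragment of minified JavaScript. The fragment and the `extract_query_id` helper are assumptions for illustration only. Unlike the original method, the helper returns `nil` instead of raising when the pattern is not found, which may be worth considering if the bundle's layout changes.

```ruby
# Illustrative helper (not part of Pages): applies the pattern from
# set_query_id and returns the captured id, or nil when nothing matches.
def extract_query_id(js_source)
  pattern = /byTagName\.get\(t\)\.pagination},queryId:"(?<queryId>[0-9a-z]+)/
  match_data = pattern.match(js_source)
  match_data && match_data[:queryId]
end

# Hypothetical bundle fragment; the real file's contents are assumed.
sample_js = 'x=e.byTagName.get(t).pagination},queryId:"0123abcd",y=1'

puts extract_query_id(sample_js).inspect       # "0123abcd"
puts extract_query_id('no match here').inspect # nil
```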