Module: Pages

Included in:
ClassMethods
Defined in:
lib/botinsta/pages.rb

Overview

Contains methods for getting pages, the query_id, and the JS link. Before we can like media for a tag, we first need its query_id (a.k.a. query_hash).

Instance Method Summary

Instance Method Details

#get_first_page_data(tag) ⇒ Object

Gets the first page's JSON string for the tag, from which data (i.e., media IDs and owner IDs) is extracted, and creates a PageData instance.

Parameters:

  • tag (String)

    Current tag.



# File 'lib/botinsta/pages.rb', line 36

def get_first_page_data(tag)
  print_time_stamp
  puts 'Getting the first page for the tag '.colorize(:blue) + "##{tag}"
  response = @agent.get "https://www.instagram.com/explore/tags/#{tag}/?__a=1"
  data = JSON.parse(response.body.sub(/graphql/, 'data'))
  data.extend Hashie::Extensions::DeepFind
  @page = PageData.new(data)
end
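
The key renaming done by `sub(/graphql/, 'data')` can be tried in isolation. The following is a minimal sketch only: `sample_body` is a hypothetical, heavily trimmed payload whose shape is an assumption, not a captured Instagram response. Presumably the rename lets the first-page JSON share a top-level `data` key with the next-page response, but the source does not state this.

```ruby
require 'json'

# Hypothetical, trimmed-down response body; the real payload is much larger
# and its exact shape is an assumption here.
sample_body = '{"graphql":{"hashtag":{"name":"ruby","edge_hashtag_to_media":' \
              '{"page_info":{"has_next_page":true,"end_cursor":"QVFC"}}}}}'

# String#sub replaces only the first occurrence of 'graphql', which in this
# sample is the top-level key.
data = JSON.parse(sample_body.sub(/graphql/, 'data'))

# Hash#dig walks the nested hashes and returns nil if any key is missing.
page_info = data.dig('data', 'hashtag', 'edge_hashtag_to_media', 'page_info')
puts page_info['end_cursor']
```

Because `sub` (unlike `gsub`) stops after the first match, any later occurrence of the string `graphql` inside the body is left untouched.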

#get_js_link(tag) ⇒ String

Returns the link to the TagPageContainer .js file, from which the query_id is extracted.

Parameters:

  • tag (String)

    Current tag.

Returns:

  • (String)

    Full link of the TagPageContainer.js



# File 'lib/botinsta/pages.rb', line 21

def get_js_link(tag)
  response = @agent.get "https://instagram.com/explore/tags/#{tag}"
  # Parsing the returned page to select the script which has 'TagPageContainer.js' in its src
  parsed_page = Nokogiri::HTML(response.body)
  script_array = parsed_page.css('script').select {|script| script.to_s.include?('TagPageContainer.js')}
  script = script_array.first

  'https://instagram.com' + script['src']
end

#get_next_page_data(tag) ⇒ Object

Gets the next page's JSON string once all the media on the current page has been liked, and creates a PageData instance. This request needs the query_id and the end_cursor string of the current page.

Parameters:

  • tag (String)

    Current tag.



# File 'lib/botinsta/pages.rb', line 51

def get_next_page_data(tag)
  print_time_stamp
  puts 'Getting the next page for the tag '.colorize(:blue) + "##{tag}"
  next_page_link =  "https://www.instagram.com/graphql/query/?query_hash=#{@query_id}&"\
                    "variables={\"tag_name\":\"#{tag}\"," \
                    "\"first\":10,\"after\":\"#{@page.end_cursor}\"}"
  response = @agent.get next_page_link
  data = JSON.parse(response.body)
  data.extend Hashie::Extensions::DeepFind
  @page = PageData.new(data)
end
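
The next-page link embeds a JSON object directly in the query string, so quotes and braces go out unescaped unless the HTTP client encodes them. As a hedged alternative, the link could be assembled with the standard library so that the `variables` value is always percent-encoded. `build_next_page_link` is a hypothetical helper for illustration and is not part of Botinsta.

```ruby
require 'json'
require 'uri'

# Hypothetical helper (not part of Botinsta): builds the next-page link with
# a URL-encoded `variables` parameter instead of hand-escaped JSON.
def build_next_page_link(query_id, tag, end_cursor, first: 10)
  variables = { tag_name: tag, first: first, after: end_cursor }
  query = URI.encode_www_form(
    query_hash: query_id,
    variables: JSON.generate(variables)
  )
  "https://www.instagram.com/graphql/query/?#{query}"
end

# Placeholder values standing in for @query_id and @page.end_cursor.
link = build_next_page_link('abc123', 'ruby', 'QVFC')
puts link
```

Generating the JSON with `JSON.generate` also keeps the payload valid if the tag or cursor ever contains a quote character.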

#get_user_page_data(user_id) ⇒ Object

Gets the user page's JSON string and parses it to create a UserData instance. Returns true on success, or false if the request raises a Mechanize::ResponseCodeError.

Parameters:

  • user_id (String)

    User id of the media owner.



# File 'lib/botinsta/pages.rb', line 67

def get_user_page_data(user_id)
  url_user_detail = "https://i.instagram.com/api/v1/users/#{user_id}/info/"
  begin
    response = @agent.get url_user_detail
  rescue Mechanize::ResponseCodeError
    return false
  end
  data = JSON.parse(response.body)
  data.extend Hashie::Extensions::DeepFind
  @user = UserData.new(data)
  true
end

#set_query_id(tag) ⇒ Object

Sets the query id for the current tag.

Parameters:

  • tag (String)

    Current tag.



# File 'lib/botinsta/pages.rb', line 9

def set_query_id(tag)
  response = @agent.get get_js_link tag
  # RegExp for getting the right query id, because there are a few of them.
  match_data = /byTagName\.get\(t\)\.pagination},queryId:"(?<queryId>[0-9a-z]+)/.match(response.body)
  @query_id = match_data[:queryId]
end
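
The regular expression above can be exercised on its own. This is a minimal sketch: `extract_query_id` is a hypothetical helper and `sample_js` is an invented fragment, not real TagPageContainer.js content. Unlike `set_query_id`, the helper returns nil when the pattern does not match, rather than raising a NoMethodError on a nil match.

```ruby
# Same pattern as in set_query_id, stored in a constant for reuse.
QUERY_ID_PATTERN = /byTagName\.get\(t\)\.pagination},queryId:"(?<queryId>[0-9a-z]+)/

# Hypothetical helper (not part of Botinsta): returns the query id as a
# String, or nil when the pattern is not found in the script body.
def extract_query_id(js_body)
  match_data = QUERY_ID_PATTERN.match(js_body)
  match_data && match_data[:queryId]
end

# Invented fragment that only mimics the shape the pattern expects.
sample_js = 'x=byTagName.get(t).pagination},queryId:"abc123def456",other:1'

puts extract_query_id(sample_js)
puts extract_query_id('no query id here').inspect
```

A nil result is a useful signal that the script's layout has changed and the pattern needs updating.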