Class: Biffbot::Bulk

Inherits:
Base
  • Object
Includes:
Hashie::Extensions::Coercion
Defined in:
lib/biffbot/bulk.rb

Instance Method Summary

Methods inherited from Base

#generate_url, #parse, #parse_options

Constructor Details

#initialize(token = Biffbot.token) ⇒ Bulk

Returns a new instance of Biffbot::Bulk.

Parameters:

  • token (String) (defaults to: Biffbot.token)

    Token to use for this instance instead of Biffbot.token



# File 'lib/biffbot/bulk.rb', line 11

def initialize token = Biffbot.token
  @token = token
end

Instance Method Details

#create_job(name, api_type, urls = [], options = {}) ⇒ Hash

Creates a bulk job.

Parameters:

  • name (String)

    Desired name for bulk job

  • api_type (String)

    Name of the Diffbot API to apply to the urls

  • urls (Array) (defaults to: [])

    An array of input URLs to pass to the bulk job

  • options (Hash) (defaults to: {})

    A hash of options; :version ('v2' or 'v3') selects the API version and defaults to v2

Returns:

  • (Hash)


# File 'lib/biffbot/bulk.rb', line 22

def create_job name, api_type, urls = [], options = {}
  api_url = "http://api.diffbot.com/v2/#{api_type}"
  api_url = "http://api.diffbot.com/#{options[:version]}/#{api_type}" if options[:version] == 'v2' || options[:version] == 'v3'
  api_url = parse_options(options, api_url)
  endpoint = 'http://api.diffbot.com/v3/bulk'
  post_body = generate_post_body(name, api_url, urls, options)
  JSON.parse(HTTParty.post(endpoint, body: post_body.to_json, headers: {'Content-Type' => 'application/json'}).body).each_pair do |k, v|
    self[k] = v
  end
end
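The version handling in create_job can be sketched in isolation. This is a minimal illustration, not the gem's code: resolve_api_url is a hypothetical helper name, and the extra query parameters that parse_options (inherited from Base) may append are left out because that method is not shown here.

```ruby
# Hypothetical helper mirroring the version-selection logic in create_job.
# Only 'v2' and 'v3' are honoured; any other value falls back to v2.
def resolve_api_url(api_type, options = {})
  version = %w(v2 v3).include?(options[:version]) ? options[:version] : 'v2'
  "http://api.diffbot.com/#{version}/#{api_type}"
end

puts resolve_api_url('article')                 # no :version given
puts resolve_api_url('article', version: 'v3')  # explicit v3
puts resolve_api_url('article', version: 'v9')  # unsupported value
```

Note that the bulk endpoint itself is always http://api.diffbot.com/v3/bulk; the :version option only affects the apiUrl that the bulk job calls for each URL.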

#generate_post_body(name, api_url, urls = [], options = {}) ⇒ Hash

Generates the POST body required for bulk job creation.

Parameters:

  • name (String)

    Desired name for bulk job

  • api_url (String)

    Full API URL that the bulk job applies to each input URL

  • urls (Array) (defaults to: [])

    An array of input URLs to pass to the bulk job

  • options (Hash) (defaults to: {})

    A hash of options; only notifyEmail, maxRounds, notifyWebHook and pageProcessPattern are copied into the body

Returns:

  • (Hash)


# File 'lib/biffbot/bulk.rb', line 40

def generate_post_body name, api_url, urls = [], options = {}
  post_body = {token: @token, name: name, apiUrl: api_url, urls: urls}
  options.each do |key, value|
    next unless %w(notifyEmail maxRounds notifyWebHook pageProcessPattern).include?(key.to_s)
    post_body[key] = value
  end
  post_body
end
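The option filtering above can be tried without the gem. The following is a standalone sketch under stated assumptions: build_post_body and ALLOWED_BULK_OPTIONS are hypothetical names, the token is passed as an argument instead of being read from @token, and the token, job name and URLs are placeholder values.

```ruby
require 'json'

# Keys that generate_post_body copies from options into the POST body.
ALLOWED_BULK_OPTIONS = %w(notifyEmail maxRounds notifyWebHook pageProcessPattern).freeze

# Hypothetical standalone version of generate_post_body.
def build_post_body(token, name, api_url, urls = [], options = {})
  body = { token: token, name: name, apiUrl: api_url, urls: urls }
  options.each do |key, value|
    # Skip anything that is not one of the allowed bulk options.
    next unless ALLOWED_BULK_OPTIONS.include?(key.to_s)
    body[key] = value
  end
  body
end

body = build_post_body('TOKEN', 'my_job', 'http://api.diffbot.com/v2/article',
                       ['http://example.com/a'], maxRounds: 2, version: 'v3')
puts JSON.generate(body)
```

In this sketch maxRounds ends up in the body while version is dropped, because version is not in the allowed list.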

#retrieve_data(jobName, _options = {}) ⇒ Hash

Retrieves the data for the given jobName.

Parameters:

  • jobName (String)

    Name of bulk job

  • _options (Hash) (defaults to: {})

    A hash of options (currently unused)

Returns:

  • (Hash)


# File 'lib/biffbot/bulk.rb', line 71

def retrieve_data jobName, _options = {}
  # TODO: add support for csv
  endpoint = "http://api.diffbot.com/v3/bulk/download/#{@token}-#{jobName}_data.json"
  JSON.parse(HTTParty.get(endpoint).body).each_pair do |key, value|
    self[key] = value
  end
end
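The two steps in retrieve_data, building the download URL and copying the parsed keys into the receiver, can be sketched without a network call. The helper names bulk_data_endpoint and merge_json_object are hypothetical, and the token, job name and JSON text are placeholder values rather than real API output.

```ruby
require 'json'

# Hypothetical helper: builds the same download URL that retrieve_data uses.
def bulk_data_endpoint(token, job_name)
  "http://api.diffbot.com/v3/bulk/download/#{token}-#{job_name}_data.json"
end

# Hypothetical helper: copies each top-level key of a JSON object into target,
# as the each_pair loop in retrieve_data does with self.
def merge_json_object(target, json_text)
  JSON.parse(json_text).each_pair { |key, value| target[key] = value }
  target
end

puts bulk_data_endpoint('TOKEN', 'my_job')
result = merge_json_object({}, '{"status": "ok", "count": 2}')
puts result.inspect
```

The each_pair call only works when the parsed body is a JSON object. If the body parses to an Array, Ruby raises NoMethodError, so a caller may want to check the parsed type first.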