Class: Gini::Api::Document

Inherits:
Object
Defined in:
lib/gini-api/document.rb

Overview

Contains document-related data from an uploaded or fetched document

Defined Under Namespace

Classes: Extractions, Layout

Constructor Details

#initialize(api, location, from_data = nil) ⇒ Document

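Instantiate a new Gini::Api::Document object from a document URL. If from_data is given, it is used to populate the object instead of fetching the resource from the URL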

Parameters:

  • api (Gini::Api::Client)

    Gini::Api::Client object

  • location (String)

    Document URL

  • from_data (Hash) (defaults to: nil)

    Hash with document data (for example, from a search result)



# File 'lib/gini-api/document.rb', line 16

def initialize(api, location, from_data = nil)
  @api      = api
  @location = location

  update(from_data)
end

Instance Attribute Details

#duration ⇒ Object

Returns the value of attribute duration.



# File 'lib/gini-api/document.rb', line 8

def duration
  @duration
end

Instance Method Details

#completed? ⇒ Boolean

Indicate whether processing of the document has finished, regardless of outcome

Returns:

  • (Boolean)

    true if progress is anything other than PENDING (that is, COMPLETED or ERROR)



# File 'lib/gini-api/document.rb', line 71

def completed?
  @progress != 'PENDING'
end

#extractions(options = {}) ⇒ Gini::Api::Document::Extractions

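Initialize extractions from @_links and return a Gini::Api::Document::Extractions object. The object is cached; it is rebuilt only when no cached object exists or :refresh is true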

Parameters:

  • options (Hash) (defaults to: {})

    Options

Options Hash (options):

  • :refresh (Boolean)

    Invalidate extractions cache

  • :incubator (Boolean)

    Return experimental extractions

Returns:

  • (Gini::Api::Document::Extractions)



# File 'lib/gini-api/document.rb', line 110

def extractions(options = {})
  opts = { refresh: false, incubator: false }.merge(options)
  if opts[:refresh] || @extractions.nil?
    @extractions = Gini::Api::Document::Extractions.new(@api, @_links[:extractions], opts[:incubator])
  else
    @extractions
  end
end
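
The caching rule in this method has a consequence that is easy to miss: once an object is cached, a later call with incubator: true but without refresh: true returns the cached object unchanged. The sketch below reproduces only that branching logic. CachingSketch and its builds counter are hypothetical names for illustration, a plain Hash stands in for Gini::Api::Document::Extractions, and no API request is made.

```ruby
# Hypothetical stand-in: mirrors only the cache/refresh branching of
# Document#extractions. A Hash replaces the real Extractions object.
class CachingSketch
  attr_reader :builds

  def initialize
    @builds = 0
    @extractions = nil
  end

  def extractions(options = {})
    opts = { refresh: false, incubator: false }.merge(options)
    if opts[:refresh] || @extractions.nil?
      @builds += 1 # count how often a fresh object is built
      @extractions = { incubator: opts[:incubator] }
    else
      @extractions
    end
  end
end

doc    = CachingSketch.new
first  = doc.extractions                                  # cache empty, builds
cached = doc.extractions(incubator: true)                 # cache hit, flag ignored
fresh  = doc.extractions(refresh: true, incubator: true)  # forces a rebuild

puts first.equal?(cached) # identity check: was the cached object returned?
puts cached[:incubator]   # flag the cached object was built with
puts fresh[:incubator]    # flag the rebuilt object was built with
puts doc.builds           # number of objects built so far
```

Under these assumptions, switching between regular and incubator extractions on the same document needs refresh: true on the call that changes the flag.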

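#layout ⇒ Gini::Api::Document::Layout

Initialize layout from @_links and return a Gini::Api::Document::Layout object. The object is created on the first call and cached afterwards

Returns:

  • (Gini::Api::Document::Layout)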



# File 'lib/gini-api/document.rb', line 123

def layout
  @layout ||= Gini::Api::Document::Layout.new(@api, @_links[:layout])
end

#pages ⇒ Object

Override the reader for the @pages instance variable. Returns an array holding only the value of each page's :images key, dropping :pageNumber. The array is zero-indexed, so page 1 is at index 0



# File 'lib/gini-api/document.rb', line 130

def pages
  @pages.map { |page| page[:images] }
end
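
As a minimal sketch of this mapping on hand-written data: the :pageNumber and :images keys follow the description above, while the image-size key and the URLs are invented for the example.

```ruby
# Hypothetical raw page data; only :pageNumber and :images come from the
# description above, the size key and URLs are made up.
RAW_PAGES = [
  { pageNumber: 1, images: { '750x900' => 'https://example.invalid/doc/pages/1/750x900' } },
  { pageNumber: 2, images: { '750x900' => 'https://example.invalid/doc/pages/2/750x900' } }
].freeze

# Same transformation as Document#pages: keep only the :images value
pages = RAW_PAGES.map { |page| page[:images] }

puts pages[0]['750x900'] # images of page 1 live at index 0
puts pages.length
```

Because :pageNumber is discarded, index 0 corresponds to page 1 only if the pages arrive in order, which this sketch assumes.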

#poll(interval, &block) ⇒ Object

Poll document progress and return once the state is COMPLETED or ERROR. Known states are PENDING, COMPLETED and ERROR. If a block is given, it is called with the document after every update

Parameters:

  • interval (Float)

    API polling interval in seconds



# File 'lib/gini-api/document.rb', line 58

def poll(interval, &block)
  until @progress =~ /(COMPLETED|ERROR)/ do
    update
    yield self if block_given?
    sleep(interval)
  end
  nil
end
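
The loop can be exercised without network access by replaying canned states. In this sketch PollingSketch is a hypothetical stand-in whose update shifts values off an array instead of issuing a GET request; the poll method is copied from the listing above.

```ruby
# Hypothetical stand-in: replays canned progress values instead of
# fetching the document resource over HTTP.
class PollingSketch
  attr_reader :progress

  def initialize(states)
    @states   = states.dup
    @progress = 'PENDING'
  end

  # Stand-in for Document#update (the real method performs a GET request)
  def update
    @progress = @states.shift unless @states.empty?
  end

  # Same loop as Document#poll
  def poll(interval)
    until @progress =~ /(COMPLETED|ERROR)/
      update
      yield self if block_given?
      sleep(interval)
    end
    nil
  end
end

seen = []
doc  = PollingSketch.new(%w[PENDING PENDING COMPLETED])
doc.poll(0) { |d| seen << d.progress } # interval 0 keeps the sketch fast

puts seen.inspect # one entry per update, including the final one
puts doc.progress
```

Two details follow from the loop as written: poll returns nil for both terminal states, so the caller has to check successful? afterwards, and there is no timeout, so a document that never leaves PENDING keeps the caller polling. Wrapping the call in Ruby's Timeout.timeout is one possible way to bound it.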

#processed ⇒ data

Get the processed document. Raises Gini::Api::DocumentError if the API does not respond with status 200

Returns:

  • (data)

    The binary representation of the processed document (pdf, jpg, png, …)



# File 'lib/gini-api/document.rb', line 87

def processed
  response = @api.request(
    :get,
    @_links[:processed],
    headers: { accept: 'application/octet-stream' }
  )
  unless response.status == 200
    raise Gini::Api::DocumentError.new(
      "Failed to fetch processed document (code=#{response.status})",
      response
    )
  end
  response.body
end
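
The status check and error path can be sketched with stand-ins. FakeResponse, FakeApi, SketchDocumentError and fetch_processed are all hypothetical names and not the gem's classes; in particular, exposing the response through a reader on the error is an assumption, since the listing only shows that the response is passed to the constructor.

```ruby
# Hypothetical stand-ins for the API client, its response, and the error
# class; none of these are the gem's real implementations.
FakeResponse = Struct.new(:status, :body)

class SketchDocumentError < StandardError
  attr_reader :response # assumption: the response is kept for inspection

  def initialize(message, response = nil)
    super(message)
    @response = response
  end
end

class FakeApi
  def initialize(response)
    @response = response
  end

  # Ignores its arguments and returns the canned response
  def request(_verb, _url, _options = {})
    @response
  end
end

# Same status check as Document#processed
def fetch_processed(api, url)
  response = api.request(:get, url, headers: { accept: 'application/octet-stream' })
  unless response.status == 200
    raise SketchDocumentError.new(
      "Failed to fetch processed document (code=#{response.status})",
      response
    )
  end
  response.body
end

data = fetch_processed(FakeApi.new(FakeResponse.new(200, '%PDF-1.4'.b)), '/processed')
puts data.bytesize

begin
  fetch_processed(FakeApi.new(FakeResponse.new(404, '')), '/processed')
rescue SketchDocumentError => e
  puts e.message
  puts e.response.status
end
```

When saving the returned data to disk, File.binwrite avoids newline translation on platforms that distinguish text and binary mode.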

#report_error(summary = nil, description = nil) ⇒ String

Submit error report on document

Parameters:

  • summary (String) (defaults to: nil)

    Short summary on the error found

  • description (String) (defaults to: nil)

    More detailed description of the error found

Returns:

  • (String)

    Error ID returned by the API



# File 'lib/gini-api/document.rb', line 165

def report_error(summary = nil, description = nil)
  response = @api.request(
    :post,
    "#{@_links[:document]}/errorreport",
    params: { summary: summary, description: description }
  )
  unless response.status == 200
    raise Gini::Api::DocumentError.new(
      "Failed to submit error report for document #{@id} (code=#{response.status})",
      response
    )
  end
  response.parsed[:errorId]
end

#submit_feedback(label, value) ⇒ Object

Deprecated.

Use ‘doc.extractions.LABEL = VALUE’ instead. This method will be removed in the next version

Submit feedback on extraction label

Parameters:

  • label (String)

    Extraction label to submit feedback on

  • value (String)

    The new value for the given label



# File 'lib/gini-api/document.rb', line 140

def submit_feedback(label, value)
  unless extractions.send(label.to_sym)
    raise Gini::Api::DocumentError.new("Unknown label #{label}: Not found")
  end
  response = @api.request(
    :put,
    "#{@_links[:extractions]}/#{label}",
    headers: { 'content-type' => @api.version_header[:accept] },
    body: { value: value }.to_json
  )
  unless response.status == 204
    raise Gini::Api::DocumentError.new(
      "Failed to submit feedback for label #{label} (code=#{response.status})",
      response
    )
  end
end

#successful?Boolean

Was the document processed successfully?

Returns:

  • (Boolean)

    true if @progress is COMPLETED, false otherwise



# File 'lib/gini-api/document.rb', line 79

def successful?
  @progress == 'COMPLETED'
end
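
The difference between completed? and successful? shows up only in the ERROR state. The sketch below copies both comparisons into free-standing helper methods that take the progress value as an argument (an adaptation for illustration; the real methods read @progress) and prints the result for each known state.

```ruby
# Predicates copied from Document#completed? and Document#successful?,
# rewritten to take the progress value as an argument for this sketch.
def completed?(progress)
  progress != 'PENDING'
end

def successful?(progress)
  progress == 'COMPLETED'
end

%w[PENDING COMPLETED ERROR].each do |state|
  puts format('%-9s completed?=%-5s successful?=%s',
              state, completed?(state), successful?(state))
end
```

An ERROR document is completed but not successful, so code that goes on to read extractions would normally test successful? rather than completed?.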

#update(from_data = nil) ⇒ Object

Fetch the document resource (unless from_data is given) and populate instance variables from it. A reader method is defined for every key except :pages

Parameters:

  • from_data (Hash) (defaults to: nil)

    Ruby hash with doc data



# File 'lib/gini-api/document.rb', line 27

def update(from_data = nil)
  data = {}

  if from_data.nil?
    response = @api.request(:get, @location)
    unless response.status == 200
      raise Gini::Api::DocumentError.new(
        "Failed to fetch document data (code=#{response.status})",
        response
      )
    end
    data = response.parsed
  else
    data = from_data
  end

  data.each do |k, v|
    instance_variable_set("@#{k}", v)

    # Skip pages: its reader is overridden by the pages method
    next if k == :pages

    self.class.send(:attr_reader, k)
  end
end
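
The attribute-population loop can be tried in isolation. AttributeSketch is a hypothetical stand-in that always takes its data from a Hash and never issues a request; the keys and values passed to it are invented sample data.

```ruby
# Hypothetical stand-in: mirrors only the attribute-population loop of
# Document#update and always reads from a Hash (no HTTP request).
class AttributeSketch
  def initialize(from_data)
    update(from_data)
  end

  def update(data)
    data.each do |k, v|
      instance_variable_set("@#{k}", v)
      next if k == :pages # no generated reader for pages

      self.class.send(:attr_reader, k)
    end
  end
end

# Invented sample data
doc = AttributeSketch.new(id: 'abc-123', progress: 'COMPLETED', pages: [])

puts doc.id
puts doc.progress
puts doc.respond_to?(:pages) # the real class defines #pages by hand instead

# Readers are defined on the class, so every instance gains them,
# including one whose data never contained the key.
other = AttributeSketch.new(progress: 'PENDING')
puts other.id.inspect
```

Two properties of this technique are worth keeping in mind: the readers are shared by all instances of the class, and a key that is not a valid Ruby identifier (for example one containing a hyphen) makes instance_variable_set raise NameError.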