Class: Mindee::Client
- Inherits:
- Object
- Object
- Mindee::Client
- Defined in:
- lib/mindee/client.rb
Overview
Mindee API Client. See: https://developers.mindee.com/docs
Instance Method Summary
- #create_endpoint(endpoint_name: '', account_name: '', version: '') ⇒ Mindee::HTTP::Endpoint
  Creates a custom endpoint with the given values.
- #enqueue(input_source, product_class, endpoint: nil, options: {}) ⇒ Mindee::Parsing::Common::ApiResponse
  Enqueue a document for async parsing.
- #enqueue_and_parse(input_source, product_class, endpoint, options) ⇒ Mindee::Parsing::Common::ApiResponse
  Enqueue a document for async parsing and automatically try to retrieve it.
- #execute_workflow(input_source, workflow_id, options: {}) ⇒ Mindee::Parsing::Common::WorkflowResponse
  Sends a document to a workflow.
- #initialize(api_key: '') ⇒ Client (constructor)
  A new instance of Client.
- #load_prediction(product_class, local_response) ⇒ Mindee::Parsing::Common::ApiResponse
  Load a prediction.
- #parse(input_source, product_class, endpoint: nil, options: {}, enqueue: true) ⇒ Mindee::Parsing::Common::ApiResponse
  Enqueue a document for parsing and automatically try to retrieve it if needed.
- #parse_queued(job_id, product_class, endpoint: nil) ⇒ Mindee::Parsing::Common::ApiResponse
  Parses a queued document.
- #source_from_b64string(base64_string, filename, repair_pdf: false) ⇒ Mindee::Input::Source::Base64InputSource
  Load a document from a base64 encoded string.
- #source_from_bytes(input_bytes, filename, repair_pdf: false) ⇒ Mindee::Input::Source::BytesInputSource
  Load a document from raw bytes.
- #source_from_file(input_file, filename, repair_pdf: false) ⇒ Mindee::Input::Source::FileInputSource
  Load a document from a normal Ruby File.
- #source_from_path(input_path, repair_pdf: false) ⇒ Mindee::Input::Source::PathInputSource
  Load a document from an absolute path, as a string.
- #source_from_url(url) ⇒ Mindee::Input::Source::URLInputSource
  Load a document from a secure remote source (HTTPS).
Constructor Details
#initialize(api_key: '') ⇒ Client
Returns a new instance of Client.
# File 'lib/mindee/client.rb', line 93
def initialize(api_key: '')
  @api_key = api_key
end
Instance Method Details
#create_endpoint(endpoint_name: '', account_name: '', version: '') ⇒ Mindee::HTTP::Endpoint
Creates a custom endpoint with the given values. Do not set for standard (off the shelf) endpoints.
# File 'lib/mindee/client.rb', line 390
def create_endpoint(endpoint_name: '', account_name: '', version: '')
  initialize_endpoint(
    Mindee::Product::Universal::Universal,
    endpoint_name: endpoint_name,
    account_name: account_name,
    version: version
  )
end
#enqueue(input_source, product_class, endpoint: nil, options: {}) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for async parsing.
# File 'lib/mindee/client.rb', line 194
def enqueue(input_source, product_class, endpoint: nil, options: {})
  opts = ()
  endpoint ||= initialize_endpoint(product_class)
  logger.debug("Enqueueing document as '#{endpoint.url_root}'")

  prediction, raw_http = endpoint.predict_async(
    input_source,
    opts
  )
  Mindee::Parsing::Common::ApiResponse.new(product_class, prediction, raw_http.to_json)
end
#enqueue_and_parse(input_source, product_class, endpoint, options) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for async parsing and automatically try to retrieve it.
# File 'lib/mindee/client.rb', line 250
def enqueue_and_parse(input_source, product_class, endpoint, options)
  validate_async_params(options.initial_delay_sec, options.delay_sec, options.max_retries)
  enqueue_res = enqueue(input_source, product_class, endpoint: endpoint, options: options)
  job = enqueue_res.job or raise Errors::MindeeAPIError, 'Expected job to be present'
  job_id = job.id

  sleep(options.initial_delay_sec)
  polling_attempts = 1
  logger.debug("Successfully enqueued document with job id: '#{job_id}'")

  queue_res = parse_queued(job_id, product_class, endpoint: endpoint)
  queue_res_job = queue_res.job or raise Errors::MindeeAPIError, 'Expected job to be present'
  valid_statuses = [
    Mindee::Parsing::Common::JobStatus::WAITING,
    Mindee::Parsing::Common::JobStatus::PROCESSING,
  ]
  # @type var valid_statuses: Array[(:waiting | :processing | :completed | :failed)]
  while valid_statuses.include?(queue_res_job.status) && polling_attempts < options.max_retries
    logger.debug("Polling server for parsing result with job id: '#{job_id}'. Attempt #{polling_attempts}")
    sleep(options.delay_sec)
    queue_res = parse_queued(job_id, product_class, endpoint: endpoint)
    queue_res_job = queue_res.job or raise Errors::MindeeAPIError, 'Expected job to be present'
    polling_attempts += 1
  end

  if queue_res_job.status != Mindee::Parsing::Common::JobStatus::COMPLETED
    elapsed = options.initial_delay_sec + (polling_attempts * options.delay_sec.to_f)
    raise Errors::MindeeAPIError,
          "Asynchronous parsing request timed out after #{elapsed} seconds (#{polling_attempts} tries)"
  end

  queue_res
end
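The polling behaviour in the listing above can be tried out without network access. The following is a minimal sketch, not the gem's code: `poll_until_completed`, `PollingTimeout`, and the injectable `sleeper` are hypothetical names introduced here, and job statuses are modelled as plain symbols. As in the listing, any final status other than completed (including a failed job) ends in the same timeout error.

```ruby
# Hypothetical stand-in for the polling loop; not part of the Mindee gem.
class PollingTimeout < StandardError; end

# fetch_status: callable returning :waiting, :processing, :completed or :failed.
# sleeper: callable used in place of Kernel#sleep so the sketch can run instantly.
def poll_until_completed(fetch_status, initial_delay_sec:, delay_sec:, max_retries:,
                         sleeper: ->(seconds) { sleep(seconds) })
  sleeper.call(initial_delay_sec)
  attempts = 1
  status = fetch_status.call

  # Keep polling only while the job is still pending and attempts remain.
  while %i[waiting processing].include?(status) && attempts < max_retries
    sleeper.call(delay_sec)
    status = fetch_status.call
    attempts += 1
  end

  unless status == :completed
    elapsed = initial_delay_sec + (attempts * delay_sec.to_f)
    raise PollingTimeout, "timed out after #{elapsed} seconds (#{attempts} tries)"
  end

  [status, attempts]
end

# Usage: a fake job that reports completion on the third status check.
statuses = %i[waiting processing completed]
fake_fetch = -> { statuses.shift || :completed }
no_sleep = ->(_seconds) {}
result = poll_until_completed(fake_fetch, initial_delay_sec: 2, delay_sec: 1.5,
                                          max_retries: 10, sleeper: no_sleep)
```

Injecting the sleeper is a choice made for this sketch only; it keeps the example fast and deterministic.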
#execute_workflow(input_source, workflow_id, options: {}) ⇒ Mindee::Parsing::Common::WorkflowResponse
Sends a document to a workflow.
Accepts options either as a Hash or as a WorkflowOptions struct.
requiring authentication.
page_options [Hash, nil] Page cutting/merge options:
- :page_indexes Zero-based list of page indexes.
- :operation Operation to apply on the document, given the `page_indexes` specified:
  - :KEEP_ONLY - keep only the specified pages, and remove all others.
  - :REMOVE - remove the specified pages, and keep all others.
- :on_min_pages Apply the operation only if document has at least this many pages.
# File 'lib/mindee/client.rb', line 304
def execute_workflow(input_source, workflow_id, options: {})
  opts = options.is_a?(WorkflowOptions) ? options : WorkflowOptions.new(params: options)
  if opts.respond_to?(:page_options) && input_source.is_a?(Input::Source::LocalInputSource)
    process_pdf_if_required(input_source, opts)
  end

  workflow_endpoint = Mindee::HTTP::WorkflowEndpoint.new(workflow_id, api_key: @api_key.to_s)
  logger.debug("Sending document to workflow '#{workflow_id}'")

  prediction, raw_http = workflow_endpoint.execute_workflow(
    input_source,
    opts
  )

  Mindee::Parsing::Common::WorkflowResponse.new(Product::Universal::Universal, prediction, raw_http)
end
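The page_options semantics described above can be illustrated on plain page indexes. This is a sketch under assumptions: `select_page_indexes` is a hypothetical helper, not the gem's PDF processing, and it only computes which zero-based pages would remain after the operation.

```ruby
# Hypothetical helper; it mirrors the documented option names but is not Mindee code.
def select_page_indexes(page_count, page_indexes:, operation:, on_min_pages: 0)
  all_pages = (0...page_count).to_a

  # Documents with fewer pages than on_min_pages are left untouched.
  return all_pages if page_count < on_min_pages

  case operation
  when :KEEP_ONLY then all_pages & page_indexes # keep listed pages, drop the rest
  when :REMOVE then all_pages - page_indexes    # drop listed pages, keep the rest
  else raise ArgumentError, "unknown operation: #{operation.inspect}"
  end
end

kept = select_page_indexes(5, page_indexes: [0, 1], operation: :KEEP_ONLY)
removed = select_page_indexes(5, page_indexes: [0, 1], operation: :REMOVE)
untouched = select_page_indexes(5, page_indexes: [0], operation: :REMOVE, on_min_pages: 10)
```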
#load_prediction(product_class, local_response) ⇒ Mindee::Parsing::Common::ApiResponse
Load a prediction.
# File 'lib/mindee/client.rb', line 326
def load_prediction(product_class, local_response)
  raise Errors::MindeeAPIError, 'Expected LocalResponse to not be nil.' if local_response.nil?

  response_hash = local_response.as_hash || {}
  raise Errors::MindeeAPIError, 'Expected LocalResponse#as_hash to return a hash.' if response_hash.nil?

  Mindee::Parsing::Common::ApiResponse.new(product_class, response_hash, response_hash.to_json)
rescue KeyError, Errors::MindeeAPIError
  raise Errors::MindeeInputError, 'No prediction found in local response.'
end
#parse(input_source, product_class, endpoint: nil, options: {}, enqueue: true) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for parsing and automatically try to retrieve it if needed.
Accepts options either as a Hash or as a ParseOptions struct.
# File 'lib/mindee/client.rb', line 124
def parse(input_source, product_class, endpoint: nil, options: {}, enqueue: true)
  opts = ()
  process_pdf_if_required(input_source, opts) if input_source.is_a?(Input::Source::LocalInputSource)
  endpoint ||= initialize_endpoint(product_class)

  if enqueue && product_class.has_async
    enqueue_and_parse(input_source, product_class, endpoint, opts)
  else
    parse_sync(input_source, product_class, endpoint, opts)
  end
end
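Accepting options "either as a Hash or as a struct" usually comes down to one coercion step at the top of the method. The sketch below shows one way to do that. `SketchOptions`, `SKETCH_DEFAULTS`, and `coerce_options` are hypothetical names, the three fields are the ones read in the enqueue_and_parse listing, and the default values are illustrative only, not the gem's.

```ruby
# Hypothetical options struct; field names come from the enqueue_and_parse listing.
SketchOptions = Struct.new(:initial_delay_sec, :delay_sec, :max_retries, keyword_init: true)

# Illustrative defaults only; the gem's real defaults are not shown in this section.
SKETCH_DEFAULTS = { initial_delay_sec: 2, delay_sec: 1.5, max_retries: 80 }.freeze

# Returns a SketchOptions whether the caller passed a Hash or a struct.
def coerce_options(options)
  return options if options.is_a?(SketchOptions)

  # String keys are accepted too; unknown keys raise ArgumentError from Struct.
  SketchOptions.new(**SKETCH_DEFAULTS.merge(options.transform_keys(&:to_sym)))
end

from_hash = coerce_options('delay_sec' => 3)
from_struct = coerce_options(from_hash)
```

With this shape, the rest of the method can read `opts.delay_sec` without caring how the caller supplied it.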
#parse_queued(job_id, product_class, endpoint: nil) ⇒ Mindee::Parsing::Common::ApiResponse
Parses a queued document.
The endpoint doesn't need to be set in the case of OTS (off-the-shelf) APIs.
# File 'lib/mindee/client.rb', line 214
def parse_queued(job_id, product_class, endpoint: nil)
  endpoint = initialize_endpoint(product_class) if endpoint.nil?
  logger.debug("Fetching queued document as '#{endpoint.url_root}'")
  prediction, raw_http = endpoint.parse_async(job_id)
  Mindee::Parsing::Common::ApiResponse.new(product_class, prediction, raw_http.to_json)
end
#source_from_b64string(base64_string, filename, repair_pdf: false) ⇒ Mindee::Input::Source::Base64InputSource
Load a document from a base64 encoded string.
# File 'lib/mindee/client.rb', line 359
def source_from_b64string(base64_string, filename, repair_pdf: false)
  Input::Source::Base64InputSource.new(base64_string, filename, repair_pdf: repair_pdf)
end
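A base64 string suitable as the first argument of `source_from_b64string` can be produced with core Ruby alone. The sketch below uses `Array#pack` with the `m0` directive (strict base64, no line breaks); `to_base64` and `from_base64` are hypothetical helper names, and the sample bytes are made up.

```ruby
# Encode raw bytes as strict base64 (no embedded newlines).
def to_base64(bytes)
  [bytes].pack('m0')
end

# Decode strict base64 back to a binary string.
def from_base64(encoded)
  encoded.unpack1('m0')
end

raw = '%PDF-1.7 sample bytes'.b # stand-in for a real file's contents
encoded = to_base64(raw)
decoded = from_base64(encoded)
```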
#source_from_bytes(input_bytes, filename, repair_pdf: false) ⇒ Mindee::Input::Source::BytesInputSource
Load a document from raw bytes.
# File 'lib/mindee/client.rb', line 350
def source_from_bytes(input_bytes, filename, repair_pdf: false)
  Input::Source::BytesInputSource.new(input_bytes, filename, repair_pdf: repair_pdf)
end
#source_from_file(input_file, filename, repair_pdf: false) ⇒ Mindee::Input::Source::FileInputSource
Load a document from a normal Ruby File.
# File 'lib/mindee/client.rb', line 368
def source_from_file(input_file, filename, repair_pdf: false)
  Input::Source::FileInputSource.new(input_file, filename, repair_pdf: repair_pdf)
end
#source_from_path(input_path, repair_pdf: false) ⇒ Mindee::Input::Source::PathInputSource
Load a document from an absolute path, as a string.
# File 'lib/mindee/client.rb', line 341
def source_from_path(input_path, repair_pdf: false)
  Input::Source::PathInputSource.new(input_path, repair_pdf: repair_pdf)
end
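Since the description asks for an absolute path as a string, a caller holding a relative path may want to expand it first. A minimal sketch using `File.expand_path` from core Ruby follows; `absolute_path` is a hypothetical helper and the paths are examples only.

```ruby
# Expand a possibly relative path against a base directory (defaults to the
# current working directory). Already-absolute paths are returned unchanged.
def absolute_path(path, base_dir = Dir.pwd)
  File.expand_path(path, base_dir)
end

relative = absolute_path('invoice.pdf', '/tmp/docs')
already_absolute = absolute_path('/data/invoice.pdf', '/tmp/docs')
```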
#source_from_url(url) ⇒ Mindee::Input::Source::URLInputSource
Load a document from a secure remote source (HTTPS).
# File 'lib/mindee/client.rb', line 375
def source_from_url(url)
  Input::Source::URLInputSource.new(url)
end
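Because the remote source is expected to be HTTPS, it can be useful to check a URL before handing it over. The following sketch relies only on Ruby's standard `uri` library; `https_url?` is a hypothetical helper, and whether the gem performs an equivalent check internally is not shown in this section.

```ruby
require 'uri'

# True only for well-formed URLs that use the https scheme and name a host.
def https_url?(url)
  uri = URI.parse(url)
  uri.is_a?(URI::HTTPS) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

secure = https_url?('https://example.com/invoice.pdf')
insecure = https_url?('http://example.com/invoice.pdf')
malformed = https_url?('not a url')
```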