Class: Veritable::Analysis

Inherits:
Object
Includes:
VeritableResource
Defined in:
lib/veritable/api.rb

Overview

Represents the resources associated with a single analysis

Attributes

  • _id – the unique String id of the analysis

  • description – the String description of the analysis

  • created_at – a String timestamp recording the time the analysis was created

  • finished_at – a String timestamp recording the time the analysis completed

  • state – the state of the analysis, one of ["running", "succeeded", "failed"]

  • running? – true if state is "running"

  • succeeded? – true if state is "succeeded"

  • failed? – true if state is "failed"

  • error – a Hash containing details of the error that occurred, if state is "failed", otherwise nil

  • progress – a Hash containing details of the analysis progress, if state is "running", otherwise nil

  • schema – a Veritable::Schema describing the columns included in the analysis

Methods

  • update – refreshes the local representation of the API resource

  • delete – deletes the associated API resource

  • wait – blocks until the analysis succeeds or fails

  • predict – makes new predictions based on the analysis

  • related_to – calculates column relatedness based on the analysis

  • similar_to – calculates row relatedness based on the analysis

See also: dev.priorknowledge.com/docs/client/ruby
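The relationship between state, the three predicates, and the state-dependent error and progress attributes can be sketched with a small stand-in class. This is only an illustration under stated assumptions: SketchAnalysis and describe_analysis are hypothetical names, the object wraps a plain Hash rather than a document fetched from the API, and the 'message' and 'percent' keys are invented.

```ruby
# Hypothetical stand-in for Veritable::Analysis: it wraps a plain Hash
# instead of a document fetched from the API.
class SketchAnalysis
  def initialize(doc)
    @doc = doc
  end

  def state; @doc['state']; end
  def running?; state == 'running'; end
  def succeeded?; state == 'succeeded'; end
  def failed?; state == 'failed'; end

  # error and progress are only exposed in the matching state.
  def error; failed? ? @doc['error'] : nil; end
  def progress; running? ? @doc['progress'] : nil; end
end

# Summarizes an analysis-like object in one line.
def describe_analysis(analysis)
  if analysis.running?
    "running: #{analysis.progress.inspect}"
  elsif analysis.failed?
    "failed: #{analysis.error.inspect}"
  elsif analysis.succeeded?
    'succeeded'
  else
    "unknown state: #{analysis.state.inspect}"
  end
end

# The 'message' and 'percent' keys are invented for this example.
FAILED_SKETCH = SketchAnalysis.new(
  'state' => 'failed',
  'error' => {'message' => 'boom'},
  'progress' => {'percent' => 50}
)
puts describe_analysis(FAILED_SKETCH)
```

Note that progress is nil for the failed object even though the underlying Hash contains a 'progress' entry, which mirrors how the accessors documented below are written.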

Instance Method Summary

Methods included from Connection

#get, #initialize, #post, #put, #request

Methods included from VeritableObject

#initialize

Instance Method Details

#_id ⇒ Object

The unique String id of the analysis



# File 'lib/veritable/api.rb', line 643

def _id; @doc['_id']; end

#batch_predict(rows, count = 100) ⇒ Object

Makes predictions based on the analysis for multiple rows at a time

Arguments

  • rows – an Enumerator over prediction request Hashes, each of which represents a row whose missing values are to be predicted. Keys must be valid String ids of columns contained in the underlying table, and values must be either fixed (conditioning) values of an appropriate type for each column, or nil for values to be predicted. Each prediction request Hash must also have a ‘_request_id’ key with a unique string value.

  • count – optionally specify the number of samples from the predictive distribution to return. Defaults to 100.

Returns

An Enumerator over Veritable::Prediction objects

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 530

def batch_predict(rows, count=100)
  return raw_predict(rows.to_enum, count, api_limits['predictions_max_response_cells'], api_limits['predictions_max_cols'], true)
end
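Because every request Hash passed to batch_predict needs a unique '_request_id', it can help to tag rows in one place before making the call. The following is a minimal sketch, not part of the client: tag_requests is a hypothetical helper, and the column names and values are invented.

```ruby
# Hypothetical helper (not part of the client): returns copies of the
# given row Hashes, each tagged with a unique '_request_id'.
def tag_requests(rows, prefix = 'req')
  rows.each_with_index.map do |row, i|
    row.merge('_request_id' => "#{prefix}-#{i}")
  end
end

# Column names and values are invented for illustration.
# nil marks a value to be predicted; anything else is a fixed value.
EXAMPLE_ROWS = [
  {'age' => 34, 'income' => nil},
  {'age' => nil, 'income' => 52_000}
]
EXAMPLE_REQUESTS = tag_requests(EXAMPLE_ROWS)
```

With a succeeded analysis, EXAMPLE_REQUESTS would then be passed as the rows argument of batch_predict. The helper uses merge rather than mutating its input, so the original rows stay free of the extra key.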

#created_at ⇒ Object

String timestamp recording the time the analysis was created



# File 'lib/veritable/api.rb', line 646

def created_at; @doc['created_at']; end

#delete ⇒ Object

Deletes the associated analysis resource

Returns

nil on success. Succeeds silently if the analysis has already been deleted.

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 463

def delete; rest_delete(link('self')); end

#description ⇒ Object

The String description of the analysis



# File 'lib/veritable/api.rb', line 672

def description; @doc['description']; end

#error ⇒ Object

A Hash containing details of the error if state is "failed", otherwise nil



# File 'lib/veritable/api.rb', line 666

def error; state == 'failed' ? @doc['error'] : nil; end

#failed? ⇒ Boolean

true if state is "failed", otherwise false

Returns:

  • (Boolean)


# File 'lib/veritable/api.rb', line 663

def failed?; state == 'failed'; end

#finished_at ⇒ Object

String timestamp recording the time the analysis completed



# File 'lib/veritable/api.rb', line 649

def finished_at; @doc['finished_at']; end

#grouping(column_id) ⇒ Object

Get a grouping for a particular column. If no grouping is currently running, this will create it.

Arguments

  • column_id – the name of the column along which to group rows

Returns

A Grouping instance for the column id

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 607

def grouping(column_id)
  return (groupings([column_id]).to_a)[0]
end

#groupings(column_ids) ⇒ Object

Gets groupings for a list of columns. If corresponding groupings are not currently running, this will create them.

Arguments

  • column_ids – an array of column names to create groupings for

Returns

An Enumerator over Grouping instances

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 621

def groupings(column_ids)
  update if running?
  if succeeded?
    doc = post(link('group'), {:columns => column_ids}.update(@opts))
    return doc['groupings'].to_a.map {|g| Grouping.new(@opts, g)}
  elsif running?
    raise VeritableError.new("Grouping -- Analysis with id #{_id} is still running and not yet ready to calculate groupings.")
  elsif failed?
    raise VeritableError.new("Grouping -- Analysis with id #{_id} has failed and cannot calculate groupings.")
  else
    raise VeritableError.new("Grouping -- Shouldn't be here -- please let us know at [email protected].")
  end
end

#inspect ⇒ Object

Returns a string representation of the analysis resource



# File 'lib/veritable/api.rb', line 637

def inspect; to_s; end

#predict(row, count = 100) ⇒ Object

Makes predictions based on the analysis

Arguments

  • row – a Hash representing the row whose missing values are to be predicted. Keys must be valid String ids of columns contained in the underlying table, and values must be either fixed (conditioning) values of an appropriate type for each column, or nil for values to be predicted.

  • count – optionally specify the number of samples from the predictive distribution to return. Defaults to 100.

Returns

A Veritable::Prediction object

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 512

def predict(row, count=100)
  if not row.is_a? Hash
    raise VeritableError.new("Predict -- Must provide a row hash to make predictions.")
  end
  raw_predict([row].to_enum, count, api_limits['predictions_max_response_cells'], api_limits['predictions_max_cols'], false).next
end
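The row convention described above (fixed values condition the prediction, nil values are predicted) can be made explicit with a small pre-flight check. This is a sketch under stated assumptions: split_request is a hypothetical helper that is not part of the client, and the column names are invented.

```ruby
# Hypothetical helper: splits a prediction request into the columns that
# condition the prediction and the columns left as nil to be predicted.
def split_request(row)
  raise ArgumentError, 'row must be a Hash' unless row.is_a?(Hash)
  targets, fixed = row.partition { |_column, value| value.nil? }
  { :fixed => fixed.to_h, :targets => targets.map(&:first) }
end

# Invented column names: 'age' and 'smoker' are fixed, 'income' is predicted.
p split_request('age' => 34, 'smoker' => false, 'income' => nil)
```

Only nil marks a target, so a fixed value of false still counts as a conditioning value.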

#progress ⇒ Object

A Hash containing details of the analysis progress if state is "running", otherwise nil



# File 'lib/veritable/api.rb', line 669

def progress; state == 'running' ? @doc['progress'] : nil; end

#related_to(column_id, opts = {'start' => nil, 'limit' => nil}) ⇒ Object

Scores how related columns are to a column of interest

Arguments

  • column_id – the name of the column of interest

  • start – the column name from which to start the cursor. Columns whose relatedness scores are greater than or equal to the score of the start column will be returned by the cursor. Default is nil, in which case all columns in the table will be returned by the cursor.

  • limit – optionally limits the number of columns returned by the cursor. Default is nil, in which case the number of columns returned will not be limited.

Returns

A Veritable::Cursor. The cursor will return column ids, in order of their relatedness to the column of interest.

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 546

def related_to(column_id, opts={'start' => nil, 'limit' => nil})
  update if running?
  if succeeded?
    Cursor.new(
     {'collection' => "#{link('related')}/#{column_id}",
      'start' => opts['start'],
      'limit' => opts['limit']}.update(@opts))
  elsif running?
    raise VeritableError.new("Related -- Analysis with id #{_id} is still running and not yet ready to calculate related.")
  elsif failed?
    raise VeritableError.new("Related -- Analysis with id #{_id} has failed and cannot calculate related.")
  else
    raise VeritableError.new("Related -- Shouldn't be here -- please let us know at [email protected].")
  end
end
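The listing above reads opts['start'] and opts['limit'] with String keys, so an options Hash written with Symbol keys (for example :limit) would yield nil for both lookups. One way to guard against that is to normalize keys before calling related_to. The sketch below assumes nothing about the client beyond the listing: related_opts is a hypothetical helper and the column name is invented.

```ruby
# Hypothetical helper: converts option keys to Strings and fills in the
# nil defaults that related_to declares for 'start' and 'limit'.
def related_opts(opts = {})
  normalized = {}
  opts.each { |key, value| normalized[key.to_s] = value }
  {'start' => nil, 'limit' => nil}.merge(normalized)
end

# A Symbol key is not found by a String lookup on a plain Hash.
p({:limit => 5}['limit'])
# After normalization the String lookup succeeds.
p related_opts({:limit => 5})['limit']
```

With a succeeded analysis, the normalized Hash would be passed as the second argument, for example related_to('income', related_opts({:limit => 5})).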

#rest_delete ⇒ Object

Alias the connection’s delete method as rest_delete



# File 'lib/veritable/api.rb', line 455

alias :rest_delete :delete

#running? ⇒ Boolean

true if state is "running", otherwise false

Returns:

  • (Boolean)


# File 'lib/veritable/api.rb', line 657

def running?; state == 'running'; end

#schema ⇒ Object

The schema describing the analysis

Returns

A new Veritable::Schema object describing the columns contained in the analysis.

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 471

def schema
  if @old_schema.nil?
    @old_schema = Schema.new(get(link('schema')))
  end
  return @old_schema
end
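The listing above fetches the schema once and caches it in @old_schema, so repeated calls do not trigger further requests. A minimal sketch of that memoization pattern follows, using a block in place of the API call. SchemaHolder, fetch_count, and the schema contents are all hypothetical.

```ruby
# Hypothetical illustration of the caching in #schema: the block stands in
# for the API request and is called at most once.
class SchemaHolder
  attr_reader :fetch_count

  def initialize(&fetcher)
    @fetcher = fetcher
    @fetch_count = 0
    @old_schema = nil
  end

  def schema
    if @old_schema.nil?
      @fetch_count += 1
      @old_schema = @fetcher.call
    end
    @old_schema
  end
end

# The schema contents here are invented for the example.
holder = SchemaHolder.new { {'age' => {'type' => 'count'}} }
holder.schema
holder.schema
puts holder.fetch_count
```

A side effect of this design, visible in the listings on this page, is that update replaces only @doc and leaves the cached schema untouched.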

#similar_to(row, column_id, opts = {:max_rows => 10, :return_data => true}) ⇒ Object

Returns rows which are similar to a target row in the context of a particular column of interest.

Arguments

  • row – either a row ‘_id’ string or a row hash corresponding to the target row. If a row hash is provided, it must contain an ‘_id’ key whose value is the ‘_id’ of a row present in the table at the time of the analysis.

  • column_id – the name of the column of interest.

  • max_rows – the maximum number of similar rows to return. Default is 10. The actual number of similar rows returned will be less than or equal to max_rows.

  • return_data – if true, the full row content will be returned. If false, only the ‘_id’ field for each row will be returned. Default is true.

Returns

An Enumerator over row data Hashes ordered from most similar to least similar. Each row data Hash has a ‘_similarity’ key containing a similarity score between 0 and 1.

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 576

def similar_to(row, column_id, opts={:max_rows => 10, :return_data => true})
  if row.is_a? String
    row = {'_id' => row}
  end
  if not row.is_a? Hash
    raise VeritableError.new("Similar -- Must provide an existing row to get similar!")
  end
  update if running?
  if succeeded?
    doc = post(link('similar'), {:data => row, :column => column_id, 
                                 :max_rows => 10, :return_data => true}.update(opts))
    return doc['data'].to_enum
  elsif running?
    raise VeritableError.new("Similar -- Analysis with id #{_id} is still running and not yet ready to calculate similar.")
  elsif failed?
    raise VeritableError.new("Similar -- Analysis with id #{_id} has failed and cannot calculate similar.")
  else
    raise VeritableError.new("Similar -- Shouldn't be here -- please let us know at [email protected].")
  end
end
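Two details of the listing above are easy to miss: a String row is wrapped into {'_id' => row}, and the caller's opts are merged over Symbol-keyed defaults, so passing only :max_rows keeps :return_data at true. The sketch below reproduces just that request-building step without any network call. similar_payload is a hypothetical name, and the row id and column name are invented.

```ruby
# Hypothetical helper mirroring how similar_to assembles its request body.
def similar_payload(row, column_id, opts = {})
  row = {'_id' => row} if row.is_a?(String)
  unless row.is_a?(Hash)
    raise ArgumentError, 'row must be an _id String or a Hash'
  end
  # Defaults first, then the caller's options override them key by key.
  {:data => row, :column => column_id,
   :max_rows => 10, :return_data => true}.merge(opts)
end

# Invented row id and column name.
p similar_payload('row-1', 'income', {:max_rows => 3})
```

Because the defaults use Symbol keys, an option written with a String key such as 'max_rows' would be added alongside :max_rows in this Hash rather than replacing it.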

#state ⇒ Object

The state of the analysis

One of ["running", "succeeded", "failed"]



# File 'lib/veritable/api.rb', line 654

def state; @doc['state']; end

#succeeded? ⇒ Boolean

true if state is "succeeded", otherwise false

Returns:

  • (Boolean)


# File 'lib/veritable/api.rb', line 660

def succeeded?; state == 'succeeded'; end

#to_s ⇒ Object

Returns a string representation of the analysis resource



# File 'lib/veritable/api.rb', line 640

def to_s; "#<Veritable::Analysis _id='#{_id}'>"; end

#update ⇒ Object

Refreshes the local representation of the analysis

Returns

nil on success

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 452

def update; @doc = get(link('self')); nil; end

#wait(max_time = nil, poll = 2) ⇒ Object

Blocks until the analysis succeeds or fails

Arguments

  • max_time – the maximum time to wait, in seconds. Default is nil, in which case the method will wait indefinitely.

  • poll – the number of seconds to wait between polling the API server. Default is 2.

Returns

nil on success.

See also: dev.priorknowledge.com/docs/client/ruby



# File 'lib/veritable/api.rb', line 488

def wait(max_time=nil, poll=2)
  elapsed = 0
  while running?
    sleep poll
    if not max_time.nil?
      elapsed += poll
      if elapsed > max_time
        raise VeritableError.new("Wait for analysis -- Maximum time of #{max_time} second exceeded.")
      end
    end
    update
  end
end
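The polling loop above can be exercised without an API connection by driving it with a scripted object. The following is a sketch under stated assumptions: FakeAnalysis and wait_for are hypothetical stand-ins, the error class is a plain RuntimeError rather than VeritableError, and the poll intervals are shortened so the example finishes quickly.

```ruby
# Hypothetical stand-in that steps through a scripted list of states
# each time update is called.
class FakeAnalysis
  attr_reader :state, :updates

  def initialize(states)
    @states = states.dup
    @state = @states.shift
    @updates = 0
  end

  def running?; state == 'running'; end
  def succeeded?; state == 'succeeded'; end
  def failed?; state == 'failed'; end

  def update
    @updates += 1
    @state = @states.shift unless @states.empty?
    nil
  end
end

# Same control flow as the wait listing, written as a free function.
def wait_for(analysis, max_time = nil, poll = 2)
  elapsed = 0
  while analysis.running?
    sleep poll
    unless max_time.nil?
      elapsed += poll
      raise "gave up after #{max_time} seconds" if elapsed > max_time
    end
    analysis.update
  end
  nil
end

analysis = FakeAnalysis.new(%w[running running succeeded])
wait_for(analysis, nil, 0.001)
puts analysis.state
```

The loop exits as soon as the analysis stops running, whether it succeeded or failed, so callers would typically check succeeded? or failed? after wait returns.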