Class: OpenTox::Model::Validation

Inherits:
Object
Includes:
Mongoid::Document, Mongoid::Timestamps, OpenTox
Defined in:
lib/model.rb

Overview

Convenience class for generating and validating lazar models in a single step, and for predicting substances (compounds and nanoparticles), arrays of substances, and datasets.

Class Method Details

.from_csv_file(file) ⇒ OpenTox::Model::Validation

Create and validate a lazar model from a csv file with training data and a json file with metadata

Parameters:

  • file (File)

    CSV file with two columns. The first line should contain either SMILES or InChI (first column) and the endpoint (second column). In the remaining lines, the first column should contain the SMILES or InChI of each training compound and the second column its toxic activity (qualitative or quantitative). Use -log10 transformed values for regression datasets. Add metadata to a JSON file with the same basename containing the fields “species”, “endpoint”, “source” and “unit” (regression only). You can find example training data at github.com/opentox/lazar-public-data.

Returns:

  • (OpenTox::Model::Validation)

# File 'lib/model.rb', line 462

def self.from_csv_file file
  # the metadata JSON shares the CSV file's basename
  metadata_file = file.sub(/csv$/,"json")
  bad_request_error "No metadata file #{metadata_file}" unless File.exist? metadata_file
  model_validation = self.new JSON.parse(File.read(metadata_file))
  training_dataset = Dataset.from_csv_file file
  model = Lazar.create training_dataset: training_dataset
  model_validation[:model_id] = model.id
  model_validation[:repeated_crossvalidation_id] = OpenTox::Validation::RepeatedCrossValidation.create(model).id # full class name required
  model_validation.save
  model_validation
end
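
The input files described above can be prepared with the Ruby standard library alone. The following is a minimal sketch, not part of lazar: the helper names `neg_log10` and `write_training_files`, the example compounds, and all metadata values (including the unit string) are made up for illustration. It writes a two-column CSV with -log10 transformed activities and a JSON metadata file with the same basename, using the same `sub(/csv$/, "json")` substitution as the method source.

```ruby
require "csv"
require "json"
require "tmpdir"

# -log10 transformation recommended for regression endpoints
# (for example, 0.001 becomes 3).
def neg_log10(value)
  -Math.log10(value)
end

# Hypothetical helper: writes <basename>.csv and <basename>.json into dir.
# rows is an Array of [smiles, activity] pairs with untransformed activities.
def write_training_files(dir, basename, rows, metadata)
  csv_path = File.join(dir, "#{basename}.csv")
  CSV.open(csv_path, "w") do |csv|
    # header line: identifier type, then the endpoint name
    csv << ["SMILES", metadata.fetch("endpoint")]
    rows.each { |smiles, activity| csv << [smiles, neg_log10(activity)] }
  end
  # same substitution the method source uses to locate the metadata file
  json_path = csv_path.sub(/csv$/, "json")
  File.write(json_path, JSON.pretty_generate(metadata))
  [csv_path, json_path]
end

# Illustrative values only; "unit" is needed for regression data.
metadata = {
  "species"  => "Example species",
  "endpoint" => "Example endpoint",
  "source"   => "https://example.org/dataset",
  "unit"     => "-log10(mol/L)"
}
rows = [["CCO", 0.001], ["c1ccccc1", 0.01]]

Dir.mktmpdir do |dir|
  csv_path, json_path = write_training_files(dir, "example", rows, metadata)
  puts File.basename(csv_path), File.basename(json_path)
end
```

With the lazar library loaded and its database available, the resulting CSV path could then be passed to `OpenTox::Model::Validation.from_csv_file`.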

.from_enanomapper(training_dataset: nil, prediction_feature: nil, algorithms: nil) ⇒ OpenTox::Model::Validation

Create and validate a nano-lazar model, import data from eNanoMapper if necessary

nano-lazar methods are described in detail in https://github.com/enanomapper/nano-lazar-paper/blob/master/nano-lazar.pdf

Parameters:

  • training_dataset (OpenTox::Dataset, nil) (defaults to: nil)
  • prediction_feature (OpenTox::Feature, nil) (defaults to: nil)
  • algorithms (Hash, nil) (defaults to: nil)

Returns:

  • (OpenTox::Model::Validation)

# File 'lib/model.rb', line 480

def self.from_enanomapper training_dataset: nil, prediction_feature:nil, algorithms: nil
  
  # find/import training_dataset
  training_dataset ||= Dataset.where(:name => "Protein Corona Fingerprinting Predicts the Cellular Interaction of Gold and Silver Nanoparticles").first
  unless training_dataset # try to import 
    Import::Enanomapper.import
    training_dataset = Dataset.where(name: "Protein Corona Fingerprinting Predicts the Cellular Interaction of Gold and Silver Nanoparticles").first
    bad_request_error "Cannot import 'Protein Corona Fingerprinting Predicts the Cellular Interaction of Gold and Silver Nanoparticles' dataset" unless training_dataset
  end
  prediction_feature ||= Feature.where(name: "log2(Net cell association)", category: "TOX").first

  model_validation = self.new(
    :endpoint => prediction_feature.name,
    :source => prediction_feature.source,
    :species => "A549 human lung epithelial carcinoma cells",
    :unit => prediction_feature.unit
  )
  model = LazarRegression.create prediction_feature: prediction_feature, training_dataset: training_dataset, algorithms: algorithms
  model_validation[:model_id] = model.id
  repeated_cv = OpenTox::Validation::RepeatedCrossValidation.create model, 10, 5
  model_validation[:repeated_crossvalidation_id] = repeated_cv.id
  model_validation.save
  model_validation
end
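
The call `RepeatedCrossValidation.create model, 10, 5` in the source above presumably requests 10 folds repeated 5 times; that reading of the two positional arguments is an assumption, since this page does not document them. As a rough, self-contained illustration of what repeated k-fold splitting means in general, the sketch below partitions item indices into disjoint test sets and reshuffles before each repeat. The function `repeated_kfold` is hypothetical and is not lazar's implementation, which may split differently.

```ruby
# Hypothetical sketch of repeated k-fold splitting (not lazar's code).
# Returns one entry per repeat; each entry is an Array of
# { train: [...], test: [...] } hashes holding item indices.
def repeated_kfold(n_items, folds: 10, repeats: 5, seed: 1)
  rng = Random.new(seed)
  Array.new(repeats) do
    # reshuffle before every repeat so the folds differ between repeats
    shuffled = (0...n_items).to_a.shuffle(random: rng)
    # deal indices round-robin so fold sizes differ by at most one
    test_sets = Array.new(folds) do |f|
      shuffled.each_slice(folds).map { |slice| slice[f] }.compact
    end
    test_sets.map { |test| { train: shuffled - test, test: test } }
  end
end

splits = repeated_kfold(25)
puts "repeats: #{splits.size}, folds per repeat: #{splits.first.size}"
```

Within one repeat, every item appears in exactly one test set, so each item receives one out-of-sample prediction per repeat.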

Instance Method Details

#algorithms ⇒ Hash

Get algorithms

Returns:

  • (Hash)


# File 'lib/model.rb', line 425

def algorithms
  model.algorithms
end

#classification? ⇒ TrueClass, FalseClass

Is it a classification model?

Returns:

  • (TrueClass, FalseClass)


# File 'lib/model.rb', line 455

def classification?
  model.is_a? LazarClassification
end

#crossvalidations ⇒ Array<OpenTox::CrossValidation>

Get crossvalidations

Returns:

  • (Array<OpenTox::CrossValidation>)



# File 'lib/model.rb', line 443

def crossvalidations
  repeated_crossvalidation.crossvalidations
end

#model ⇒ OpenTox::Model::Lazar

Get lazar model



# File 'lib/model.rb', line 419

def model
  Lazar.find model_id
end

#predict(object) ⇒ Hash, ...

Predict a substance (compound or nanoparticle), an array of substances or a dataset



# File 'lib/model.rb', line 407

def predict object
  model.predict object
end

#prediction_feature ⇒ OpenTox::Feature

Get prediction feature

Returns:

  • (OpenTox::Feature)

# File 'lib/model.rb', line 431

def prediction_feature
  model.prediction_feature
end

#regression? ⇒ TrueClass, FalseClass

Is it a regression model?

Returns:

  • (TrueClass, FalseClass)


# File 'lib/model.rb', line 449

def regression?
  model.is_a? LazarRegression
end

#repeated_crossvalidation ⇒ OpenTox::Validation::RepeatedCrossValidation

Get repeated crossvalidation



# File 'lib/model.rb', line 437

def repeated_crossvalidation
  OpenTox::Validation::RepeatedCrossValidation.find repeated_crossvalidation_id # full class name required
end

#training_dataset ⇒ OpenTox::Dataset

Get training dataset

Returns:

  • (OpenTox::Dataset)

# File 'lib/model.rb', line 413

def training_dataset
  model.training_dataset
end