Class: ROC

Inherits:
Object
Defined in:
lib/roc.rb

Overview

Class for all types of classification analysis: receiver operating characteristic (ROC), precision-recall, etc. Some definitions from Davis & Goadrich (Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, 2006):

Recall              = TP/(TP+FN) [aka, Sensitivity]
Precision           = TP/(TP+FP) [aka, Positive Predictive Value]
True Positive Rate  = TP/(TP+FN)
False Positive Rate = FP/(FP+TN)
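
As a quick illustration of these definitions, the sketch below computes each quantity from a set of confusion-matrix counts. The counts and variable names are hypothetical, chosen only for this example, and are not part of the ROC class.

```ruby
# Hypothetical confusion-matrix counts (illustrative only).
tp = 8   # true positives
fp = 2   # false positives
fn = 4   # false negatives
tn = 6   # true negatives

recall    = tp.to_f / (tp + fn)   # identical to the true positive rate
precision = tp.to_f / (tp + fp)   # positive predictive value (ppv)
fpr       = fp.to_f / (fp + tn)   # false positive rate
om_ppv    = fp.to_f / (tp + fp)   # one minus ppv

puts format("recall=%.3f precision=%.3f fpr=%.3f om_ppv=%.3f",
            recall, precision, fpr, om_ppv)
```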

Keys to some abbreviations used in this class:

pred = number predicted to be correct
tps = number of true positives
ppv = positive predictive value
om_ppv = one minus positive predictive value = FP/(TP+FP)

NOTE: this class assumes that lower scores are better. Negate your scores if this is not the case.
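
If your scores are of the higher-is-better kind, one way to negate them before analysis might look like the following. The `hits` array is hypothetical sample data in the [value, boolean] doublet form that prep_list expects.

```ruby
# Hypothetical doublets where a HIGHER score means a better hit.
hits = [[0.9, true], [0.4, false], [0.7, true]]

# Negate each score so that lower becomes better; the flags are unchanged.
negated = hits.map { |score, is_true| [-score, is_true] }
```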

For estimation of false positive rates using a decoy database strategy, see the DecoyROC class.

Direct Known Subclasses

DecoyROC

Instance Method Details

#area_under_curve(x, y) ⇒ Object

Returns the area under the curve, computed with the trapezoid rule. x and y specify the coordinates to use; x should be monotonically increasing.



# File 'lib/roc.rb', line 31

def area_under_curve(x,y)
  area = 0.0
  (0...(x.size-1)).each do |i|
    # determine which is larger 
    if y[i+1] >= y[i]
      y1 = y[i+1]; y0 = y[i] 
    else
      y0 = y[i+1]; y1 = y[i] 
    end
    area += (x[i+1]-x[i]).to_f * ( y0.to_f + (y1-y0).to_f/2 ) 
  end
  area
end
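
Because y0 + (y1 - y0)/2 equals (y0 + y1)/2, each step adds the usual trapezoid area. The free-standing sketch below restates that calculation so it runs without lib/roc.rb; the name `trapezoid_area` and the sample coordinates are assumptions made for this example only.

```ruby
# Stand-alone restatement of the trapezoid rule (hypothetical helper name).
def trapezoid_area(x, y)
  area = 0.0
  (0...(x.size - 1)).each do |i|
    # width of the interval times the mean of the two heights
    area += (x[i + 1] - x[i]).to_f * (y[i] + y[i + 1]) / 2.0
  end
  area
end

x = [0, 1, 2]
y = [0.0, 1.0, 1.0]
area = trapezoid_area(x, y)   # 0.5 for the first interval + 1.0 for the second
```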

#prep_list(list) ⇒ Object

Given an array of doublets, where each doublet is a value and a boolean, divides the list into two arrays (tps, fps) of the values, each sorted in ascending order. The output can then be fed into many of the other routines.



# File 'lib/roc.rb', line 48

def prep_list(list)
  tp = []; fp = []
  list.each do |dbl|
    if dbl[1]
      tp << dbl
    else
      fp << dbl
    end
  end
  [tp,fp].collect do |arr|
    arr.collect! {|dbl| dbl[0] }
    arr.sort
  end
end
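
To show the shape of the input and output, here is a free-standing sketch of the same split-then-sort idea. It uses Array#partition rather than the loop above; `split_by_flag` and the sample list are hypothetical names and data for this example.

```ruby
# Stand-alone sketch of the split-and-sort step (hypothetical helper name).
def split_by_flag(list)
  # partition keeps doublets whose boolean is truthy in the first array
  trues, falses = list.partition { |_value, flag| flag }
  # keep only the values, sorted in ascending order
  [trues, falses].map { |arr| arr.map(&:first).sort }
end

list = [[0.3, true], [0.1, false], [0.2, true], [0.5, false]]
tps, fps = split_by_flag(list)
# tps holds the values flagged true, fps the values flagged false
```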

#tps_and_ppv(tp, fp) ⇒ Object

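Base function for tps calculations. Given sorted arrays of true positive and false positive scores, returns two parallel arrays: the cumulative number of true positives at each distinct true positive score, and the ppv at that point.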



# File 'lib/roc.rb', line 64

def tps_and_ppv(tp, fp)
  tp_i = 0
  fp_i = 0
  x = []
  y = []
  num_tps = 0

  while tp_i < tp.size
    while fp_i < fp.size && tp[tp_i] >= fp[fp_i]
      fp_i += 1
    end
    unless tp[tp_i] == tp[tp_i+1]
      # get the correct number of each
      num_tps = tp_i + 1 
      num_fps = fp_i 

      x << num_tps
      y << num_tps.to_f/(num_tps+num_fps)

    end
    tp_i += 1 
  end
  return x, y
end
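
One way to read the loop above: for each distinct true positive score, count every true positive and false positive at or below that score, then record the count of true positives and the resulting ppv. The sketch below expresses that reading directly. It is a quadratic-time cross-check under that assumption, not the class's implementation, and `ppv_curve` and the sample scores are hypothetical.

```ruby
# Brute-force restatement of the tps/ppv calculation (hypothetical helper).
def ppv_curve(tp, fp)
  points = tp.uniq.sort.map do |cutoff|
    num_tps = tp.count { |score| score <= cutoff }
    num_fps = fp.count { |score| score <= cutoff }  # ties count against ppv
    [num_tps, num_tps.to_f / (num_tps + num_fps)]
  end
  points.transpose   # => [x values, y values]
end

tp = [0.1, 0.2, 0.2, 0.4]   # sorted true positive scores (lower is better)
fp = [0.15, 0.3]            # sorted false positive scores
x, y = ppv_curve(tp, fp)
# x holds cumulative true positive counts; y holds the ppv at each count
```

Note that the duplicated score 0.2 yields a single point, matching the `unless tp[tp_i] == tp[tp_i+1]` guard in the method above.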