Class: Jcsv::Dimension
- Inherits: Object
- Hierarchy: Object, Jcsv::Dimension
- Defined in: lib/dimensions.rb
Overview
Class Dimension keeps track of all data dimensions in a CSV file. A data dimension is similar to a mathematical dimension such as x, y or z. In principle, every piece of data should be associated with only one set of data dimensions. For example, let’s say that our data has an employee ID; then the ID column defines a dimension on the data, since every employee has exactly one ID and every ID is associated with only one employee. As another example, let’s say that we have data about a medical experiment that was run for 4 weeks with a set of patients, who were given either a medicine or a placebo. The data could have columns labeled: “Patient Index”, “Week”, “Type of Medicine”, “Blood Sample”. Some entries would be:
“Patient Index”   “Week”   “Type of Medicine”   “Blood Sample”
1                 1        Placebo              xxxx
1                 2        Placebo              xxxx
2                 1        med1                 xxxx
2                 2        med1                 xxxx
“Patient Index”, “Week” and “Type of Medicine” are three dimensions of this data and, taken together, unequivocally identify each entry, i.e., those dimensions are similar to a DB key. Since this is a key, there should be no other line of data with the same values in the dimensions.
CSV files are not ideal for maintaining dimensions, so some rules are needed in order to read dimensions from a CSV file.
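The key property described above can be checked with a few lines of plain Ruby. This is only a sketch, not part of Jcsv: the `ROWS` data, the `DIMENSIONS` list and the `duplicate_keys` helper are hypothetical names introduced for illustration, and the rows simply mirror the example entries.

```ruby
require "set"

# Hypothetical rows mirroring the example entries; "xxxx" stands in for the
# measured value, as in the text.
ROWS = [
  { "Patient Index" => 1, "Week" => 1, "Type of Medicine" => "Placebo", "Blood Sample" => "xxxx" },
  { "Patient Index" => 1, "Week" => 2, "Type of Medicine" => "Placebo", "Blood Sample" => "xxxx" },
  { "Patient Index" => 2, "Week" => 1, "Type of Medicine" => "med1",    "Blood Sample" => "xxxx" },
  { "Patient Index" => 2, "Week" => 2, "Type of Medicine" => "med1",    "Blood Sample" => "xxxx" }
].freeze

# The columns that, taken together, act as the key.
DIMENSIONS = ["Patient Index", "Week", "Type of Medicine"].freeze

# Returns every row whose dimension values repeat those of an earlier row.
# Set#add? returns nil when the element is already present, so such rows
# are the ones kept by the reject block.
def duplicate_keys(rows, dimensions)
  seen = Set.new
  rows.reject { |row| seen.add?(row.values_at(*dimensions)) }
end

puts duplicate_keys(ROWS, DIMENSIONS).empty?  # prints "true"
```

If a fifth row repeated the dimension values of an existing one, `duplicate_keys` would return it, which is the situation the key rule forbids.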
Instance Attribute Summary
-
#current_value ⇒ Object
readonly
Returns the value of attribute current_value.
-
#frozen ⇒ Object
readonly
Returns the value of attribute frozen.
-
#index(label) ⇒ Object
-
#labels ⇒ Object
readonly
Returns the value of attribute labels.
-
#name ⇒ Object
readonly
Returns the value of attribute name.
-
#next_value ⇒ Object
readonly
Returns the value of attribute next_value.
Instance Method Summary
-
#[](label) ⇒ Object
-
#add_label(label) ⇒ Object
Adds a new label to this dimension and keeps track of its index.
-
#initialize(dim_name) ⇒ Dimension
constructor
dim_name is the dimension name.
-
#reset ⇒ Object
-
#size ⇒ Object
(also: #length)
Constructor Details
#initialize(dim_name) ⇒ Dimension
dim_name is the dimension name.
# File 'lib/dimensions.rb', line 67
def initialize(dim_name)
  @name = dim_name
  @frozen = false
  @next_value = 0
  @max_value = 0
  @labels = Hash.new
end
Instance Attribute Details
#current_value ⇒ Object (readonly)
Returns the value of attribute current_value.
# File 'lib/dimensions.rb', line 58
def current_value
  @current_value
end
#frozen ⇒ Object (readonly)
Returns the value of attribute frozen.
# File 'lib/dimensions.rb', line 57
def frozen
  @frozen
end
#index(label) ⇒ Object
# File 'lib/dimensions.rb', line 61
def index
  @index
end
#labels ⇒ Object (readonly)
Returns the value of attribute labels.
# File 'lib/dimensions.rb', line 60
def labels
  @labels
end
#name ⇒ Object (readonly)
Returns the value of attribute name.
# File 'lib/dimensions.rb', line 56
def name
  @name
end
#next_value ⇒ Object (readonly)
Returns the value of attribute next_value.
# File 'lib/dimensions.rb', line 59
def next_value
  @next_value
end
Instance Method Details
#[](label) ⇒ Object
# File 'lib/dimensions.rb', line 158
def [](label)
  index(label)
end
#add_label(label) ⇒ Object
Adds a new label to this dimension and keeps track of its index. Labels are indexed starting at 0 and always incrementing. All labels in the dimension are distinct. When called with a label, add_label will:
-
add the label and assign it the next index if it is a new label;
-
accept an already existing label if its index follows the expected order, i.e. it equals the current index, is the next index, or is back to 0. That is, if the last index was 5, then the next index is either 5 (same label), 6 (next label), or 0.
-
When the index goes back to 0, the dimension becomes frozen and no more labels can be added to it. After this point, add_label has to be called always in the same order that it was called previously. In the source listing, the method returns true in the branch where the labels go back to the first one and false otherwise; an out-of-order label raises an exception that the caller may rescue.
# File 'lib/dimensions.rb', line 98
def add_label(label)
  if (@labels.has_key?(label))
    # Just read one more line with the same label. No problem, keep reading
    if (@labels[label] == @current_value)
    elsif (@labels[label] == @next_value)
      # Reading next label
      @current_value = @next_value
      @next_value = (@next_value + 1) % (@max_value + 1)
    elsif (@labels[label] < @current_value && @labels[label] == 0)
      # reached the last label and going back to the first one
      reset
      return true
    else
      # Label read is out of order. Expected value is either 0 (starting over) or
      # the next value. Although we raise an exception, we allow the calling method
      # to catch the exception and let the program still run.
      expected_value = (@labels[label] < @current_value)? 0 : @next_value
      reset if @labels[label] < @current_value
      @current_value = @labels[label] + 1
      @next_value = @current_value + 1
      raise "Missing data: next expected label was '#{@labels.key(expected_value)}' but read '#{label}'."
    end
  else
    @current_value = @labels[label] = @next_value
    @next_value += 1
    # Trying to add a label when the dimension is frozen raises an exception
    raise "Dimension '#{@name}' is frozen when adding label '#{label}'." if frozen
  end
  false
end
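To make the ordering rule of add_label easier to follow, here is a minimal, self-contained sketch of the same idea. `LabelTracker` is a hypothetical stand-in, not the Jcsv class: it covers only the in-order path, leaves out the out-of-order recovery logic, and rejects a new label on a frozen tracker before storing it.

```ruby
# Simplified, hypothetical stand-in for Jcsv::Dimension (in-order path only).
class LabelTracker
  attr_reader :labels, :frozen

  def initialize
    @labels = {}      # label => index, in order of first appearance
    @frozen = false
    @current = nil    # index of the label read last
    @next = 0         # index expected for the following label
  end

  # Returns true when the labels wrap back to the first one, false otherwise.
  def add_label(label)
    index = @labels[label]
    if index.nil?
      raise "frozen: cannot add '#{label}'" if @frozen
      @labels[label] = @current = @next
      @next += 1
      false
    elsif index == @current
      false                               # same label read again
    elsif index == @next
      @current = index                    # moving on to the next known label
      @next = (index + 1) % @labels.size  # wraps to 0 after the last label
      false
    elsif index.zero?
      @frozen = true                      # back to the first label: freeze
      @current = 0
      @next = 1
      true
    else
      raise "out of order: read '#{label}'"
    end
  end
end

weeks = LabelTracker.new
p [1, 2, 1, 2].map { |week| weeks.add_label(week) }
# prints [false, false, true, false]
p weeks.labels  # prints {1=>0, 2=>1} (Ruby 3.4+ prints {1 => 0, 2 => 1})
```

After the third call the tracker is frozen, so a later `weeks.add_label(3)` would raise, mirroring the rule that no label can be added once the sequence has started over.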
#reset ⇒ Object
# File 'lib/dimensions.rb', line 137
def reset
  if !@frozen
    @frozen = true
    @max_value = @current_value
    @current_value = 0
    @next_value = 1
  end
end
#size ⇒ Object Also known as: length
# File 'lib/dimensions.rb', line 79
def size
  @labels.size
end