Class: CTioga2::Data::Dataset

Inherits: Object

Defined in: lib/ctioga2/data/dataset.rb

Overview

This is the central class for data manipulation in ctioga. It is a series of ‘Y’ DataColumn objects indexed on a single ‘X’ DataColumn. It can represent multiple XY data sets, but also XYZ and even more complex data. The actual meaning of the various ‘Y’ columns is left to the user.

Instance Attribute Summary

#name ⇒ Object

The name of the Dataset, such as one that could be used in a legend (like for the --auto-legend option of ctioga).
#x ⇒ Object

The X DataColumn.
#ys ⇒ Object

All Y DataColumn (an Array of DataColumn).

Class Method Summary

.dataset_from_spec(name, spec) ⇒ Object

Creates a new Dataset from a specification.

Instance Method Summary

#<<(dataset) ⇒ Object

Concatenates another Dataset to this one.
#column_names ⇒ Object

Returns an array with Column names.
#each_values ⇒ Object

Iterates over all the values of the Dataset.
#initialize(name, columns) ⇒ Dataset constructor

Creates a new Dataset object with the given data columns (Dvector or DataColumn).
#select!(&block) ⇒ Object

Modifies the dataset to only keep the data for which the block returns true.
#select_formula!(formula) ⇒ Object

Same as #select!, but you give it a text formula instead of a block.
#size ⇒ Object

The overall number of columns.
#sort! ⇒ Object

Sorts all columns according to X values.
#trim!(nb) ⇒ Object

Trims all data columns.
#y ⇒ Object

The main Y column (i.e., the first one).
#z ⇒ Object

The Z column, if applicable.

Constructor Details

#initialize(name, columns) ⇒ `Dataset`

Creates a new Dataset object with the given data columns (Dvector or DataColumn). The first column becomes #x; the remaining ones become #ys.

# File 'lib/ctioga2/data/dataset.rb', line 47

def initialize(name, columns)
  columns.each_index do |i|
    if columns[i].is_a? Dobjects::Dvector
      columns[i] = DataColumn.new(columns[i])
    end
  end
  @x = columns[0]
  @ys = columns[1..-1]
  @name = name
end
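
As a rough illustration of how the constructor splits its columns argument, the sketch below uses plain Ruby arrays as stand-ins for DataColumn objects so that it runs without ctioga2 installed. The helper split_columns is hypothetical and is not part of the library.

```ruby
# Hypothetical helper mirroring the slicing done in #initialize.
# Plain arrays stand in for DataColumn objects (assumption for illustration).
def split_columns(columns)
  x  = columns[0]       # the first column is X
  ys = columns[1..-1]   # every remaining column is a Y column
  [x, ys]
end

x, ys = split_columns([
  [0.0, 1.0, 2.0],      # becomes x
  [10.0, 11.0, 12.0],   # becomes ys[0], which #y returns
  [20.0, 21.0, 22.0]    # becomes ys[1], which #z returns
])

y = ys[0]
z = ys[1]               # nil when there is only one Y column
size = 1 + ys.size      # what #size computes: a column count, not a row count
puts "x=#{x.inspect} y=#{y.inspect} z=#{z.inspect} size=#{size}"
```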

Instance Attribute Details

#name ⇒ `Object`

The name of the Dataset, such as one that could be used in a legend (like for the --auto-legend option of ctioga).



# File 'lib/ctioga2/data/dataset.rb', line 43

def name
  @name
end

#x ⇒ `Object`

The X DataColumn.



# File 'lib/ctioga2/data/dataset.rb', line 36

def x
  @x
end

#ys ⇒ `Object`

All Y DataColumn (an Array of DataColumn).



# File 'lib/ctioga2/data/dataset.rb', line 39

def ys
  @ys
end

Class Method Details

.dataset_from_spec(name, spec) ⇒ `Object`

Creates a new Dataset from a specification. This function parses a specification in the form of:

a:b:c+
spec=a:spec2=b+

It yields each unprocessed text element to the block, not necessarily in the order in which they were read, and expects a Dvector as the block's return value.

It then builds a suitable Dataset object with these values, and returns it.

It is strongly recommended to use this function for reimplementations of Backends::Backend#query_dataset.

# File 'lib/ctioga2/data/dataset.rb', line 71

def self.dataset_from_spec(name, spec)
  specs = []
  i = 0
  for s in spec.split(/:/)
    if s =~ /^(x|y\d*|z)(#{DataColumn::ColumnSpecsRE})=(.*)/i
      which, mod, s = $1.downcase,($2 && $2.downcase) || "value",$3
      
      case which
      when /x/
        idx = 0
      when /y(\d+)?/
        if $1
          idx = $1.to_i
        else
          idx = 1
        end
      when /z/
        idx = 2
      end
      specs[idx] ||= {}
      specs[idx][mod] = yield s
    else
      specs[i] = {"value" =>  yield(s)}
    end
    i += 1
  end
  columns = []
  for s in specs
    columns << DataColumn.from_hash(s)
  end
  return Dataset.new(name, columns)
end
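
The index-assignment logic above can be sketched in isolation. The following is a simplified stand-in, not the library code: parse_spec is a hypothetical name, MODIFIERS is an assumed substitute for DataColumn::ColumnSpecsRE (whose exact pattern is not shown here), and the block may return any object rather than a Dvector.

```ruby
# Assumed modifier names; the real pattern is DataColumn::ColumnSpecsRE.
MODIFIERS = 'min|max|err'

# Hypothetical helper: returns an Array of Hashes, one per column, mapping
# a modifier name ("value", "min", ...) to the block's result for that element.
def parse_spec(spec)
  specs = []
  spec.split(':').each_with_index do |s, i|
    if s =~ /^(x|y\d*|z)(#{MODIFIERS})?=(.*)/i
      which = $1.downcase
      mod   = ($2 || 'value').downcase
      text  = $3
      idx = case which
            when 'x' then 0
            when 'y' then 1
            when 'z' then 2
            else which[1..-1].to_i   # "y3" maps to column 3
            end
      specs[idx] ||= {}
      specs[idx][mod] = yield(text)
    else
      # Unnamed elements are assigned by position.
      specs[i] = { 'value' => yield(s) }
    end
  end
  specs
end

positional = parse_spec('a:b:c') { |text| text.upcase }
named      = parse_spec('x=a:yerr=e:y=b') { |text| text }
puts positional.inspect
puts named.inspect
```

In the named form, yerr=e and y=b both land in column 1, which is how a value and its error can be supplied as separate elements of one specification.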

Instance Method Details

#<<(dataset) ⇒ `Object`

Concatenates another Dataset to this one.

# File 'lib/ctioga2/data/dataset.rb', line 154

def <<(dataset)
  if dataset.size != self.size
    raise "Can't concatenate datasets that don't have the same number of columns: #{self.size} vs #{dataset.size}"
  end
  @x << dataset.x
  @ys.size.times do |i|
    @ys[i] << dataset.ys[i]
  end
end
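
The column-count check and the column-wise append can be sketched with a small stand-in class. MiniDataset is hypothetical and stores plain arrays instead of DataColumn objects. Unlike the method above, this sketch raises ArgumentError and returns self so that calls can be chained; both are choices made for the illustration only.

```ruby
# Hypothetical stand-in for Dataset that stores plain arrays.
class MiniDataset
  attr_reader :x, :ys

  def initialize(x, ys)
    @x = x
    @ys = ys
  end

  # Number of columns (X plus every Y), as in Dataset#size.
  def size
    1 + @ys.size
  end

  # Appends the rows of another dataset, column by column.
  def <<(other)
    if other.size != size
      raise ArgumentError, "column count mismatch: #{size} vs #{other.size}"
    end
    @x.concat(other.x)
    @ys.each_with_index { |col, i| col.concat(other.ys[i]) }
    self
  end
end

a = MiniDataset.new([0, 1], [[10, 11]])
b = MiniDataset.new([2], [[12]])
a << b
puts a.x.inspect    # [0, 1, 2]
puts a.ys.inspect   # [[10, 11, 12]]
```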

#column_names ⇒ `Object`

Returns an array with Column names.

# File 'lib/ctioga2/data/dataset.rb', line 129

def column_names
  retval = @x.column_names("x")
  @ys.each_index do |i|
    retval += @ys[i].column_names("y#{i+1}")
  end
  return retval
end

#each_values ⇒ `Object`

Iterates over all the values of the Dataset.

# File 'lib/ctioga2/data/dataset.rb', line 138

def each_values
  @x.size.times do |i|
    v = @x.values_at(i)
    for y in @ys
      v += y.values_at(i)
    end
    yield i, *v
  end
end
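
The iteration pattern can be sketched with plain arrays, assuming that each column contributes exactly one value per row (that is, no error bars are included). The helper each_row is hypothetical.

```ruby
# Hypothetical stand-in for Dataset#each_values working on plain arrays.
# Yields the row index followed by the X value and one value per Y column.
def each_row(x, ys)
  x.size.times do |i|
    row = [x[i]] + ys.map { |col| col[i] }
    yield i, *row
  end
end

x  = [0.0, 1.0, 2.0]
ys = [[10.0, 11.0, 12.0], [20.0, 21.0, 22.0]]

each_row(x, ys) do |i, xv, y1, y2|
  puts "row #{i}: x=#{xv} y1=#{y1} y2=#{y2}"
end
```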

#select!(&block) ⇒ `Object`

Modifies the dataset to only keep the data for which the block returns true. The block should take the following arguments, in order:

x, xmin, xmax, y, ymin, ymax, y1, y1min, y1max, z, zmin, zmax, y2, y2min, y2max, y3, y3min, y3max

The arguments from z onwards are only passed when the corresponding Y columns exist.

# File 'lib/ctioga2/data/dataset.rb', line 180

def select!(&block)
  target = []
  @x.size.times do |i|
    args = @x.values_at(i, true)
    args.concat(@ys[0].values_at(i, true) * 2)
    if @ys[1]
      args.concat(@ys[1].values_at(i, true) * 2)
      for yvect in @ys[2..-1]
        args.concat(yvect.values_at(i, true))
      end
    end
    if block.call(*args)
      target << i
    end
  end
  for col in all_columns
    col.reindex(target)
  end
end
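
The argument order passed to the block can be sketched with plain arrays. In this illustration, triple is a hypothetical stand-in for DataColumn#values_at(i, true) that assumes no error bars (nil for the min and max slots), and select_rows is a hypothetical helper that returns the kept row indices instead of reindexing columns in place.

```ruby
# Stand-in for DataColumn#values_at(i, true): value, then nil min and max.
def triple(col, i)
  [col[i], nil, nil]
end

# Hypothetical helper: returns the indices of the rows the block accepts.
def select_rows(x, ys)
  kept = []
  x.size.times do |i|
    args = triple(x, i)                    # x, xmin, xmax
    args.concat(triple(ys[0], i) * 2)      # y..., then the same again as y1...
    if ys[1]
      args.concat(triple(ys[1], i) * 2)    # z..., then the same again as y2...
      ys[2..-1].each { |col| args.concat(triple(col, i)) }  # y3 and up
    end
    kept << i if yield(*args)
  end
  kept
end

x  = [1.0, 2.0, 3.0, 4.0]
ys = [[5.0, -1.0, 7.0, -3.0]]

# Keep only the rows whose first Y value is positive.
kept = select_rows(x, ys) { |_x, _xmin, _xmax, y, *_rest| y > 0 }
puts x.values_at(*kept).inspect       # [1.0, 3.0]
puts ys[0].values_at(*kept).inspect   # [5.0, 7.0]
```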

#select_formula!(formula) ⇒ `Object`

Same as #select!, but you give it a text formula instead of a block. It internally calls #select!, by the way ;-)

# File 'lib/ctioga2/data/dataset.rb', line 202

def select_formula!(formula)
  names = @x.column_names('x', true)
  names.concat(@x.column_names('y', true))
  names.concat(@x.column_names('y1', true))
  if @ys[1]
    names.concat(@x.column_names('z', true))
    names.concat(@x.column_names('y2', true))
    i = 3
    for yvect in @ys[2..-1]
      names.concat(@x.column_names("y#{i}", true))
      i += 1
    end
  end
  block = eval("proc do |#{names.join(',')}|\n#{formula}\nend")
  select!(&block)
end
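
The way a text formula becomes a block can be sketched on its own. The helper formula_to_proc is hypothetical; it reuses the same eval pattern as the method above with a fixed list of argument names for a dataset with a single Y column. Because the formula is evaluated as Ruby code, it should only come from a trusted source.

```ruby
# Hypothetical helper: wraps a formula string in a proc whose parameters
# are the given names, using the same eval pattern as #select_formula!.
def formula_to_proc(names, formula)
  eval("proc do |#{names.join(',')}|\n#{formula}\nend")
end

# Argument names for a dataset with one Y column.
names = %w[x xmin xmax y ymin ymax y1 y1min y1max]
keep  = formula_to_proc(names, 'x > 1 && y < 10')

# One call per row; nil stands for absent error bars.
puts keep.call(2.0, nil, nil, 5.0, nil, nil, 5.0, nil, nil)   # true
puts keep.call(0.5, nil, nil, 5.0, nil, nil, 5.0, nil, nil)   # false
```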

#size ⇒ `Object`

The overall number of columns.



# File 'lib/ctioga2/data/dataset.rb', line 149

def size
  return 1 + @ys.size
end

#sort! ⇒ `Object`

Sorts all columns according to X values.

# File 'lib/ctioga2/data/dataset.rb', line 115

def sort!
  idx_vector = Dobjects::Dvector.new(@x.values.size) do |i|
    i
  end
  f = Dobjects::Function.new(@x.values.dup, idx_vector)
  f.sort
  # Now, idx_vector contains the indices that make X values
  # sorted.
  for col in all_columns
    col.reindex(idx_vector)
  end
end
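
The idea of sorting an index vector by X and then reindexing every column with it can be sketched with plain arrays. This illustration swaps in Ruby's sort_by and Array#values_at for Dobjects::Function#sort and DataColumn#reindex, and the helper sort_by_x is hypothetical.

```ruby
# Hypothetical helper: computes the permutation that sorts X, then applies
# the same permutation to every column so that rows stay aligned.
def sort_by_x(x, ys)
  order = (0...x.size).sort_by { |i| x[i] }
  [x.values_at(*order), ys.map { |col| col.values_at(*order) }]
end

x  = [3.0, 1.0, 2.0]
ys = [[30.0, 10.0, 20.0]]

sorted_x, sorted_ys = sort_by_x(x, ys)
puts sorted_x.inspect    # [1.0, 2.0, 3.0]
puts sorted_ys.inspect   # [[10.0, 20.0, 30.0]]
```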

#trim!(nb) ⇒ `Object`

Trims all data columns. See DataColumn#trim!

# File 'lib/ctioga2/data/dataset.rb', line 166

def trim!(nb)
  for col in all_columns
    col.trim!(nb)
  end
end

#y ⇒ `Object`

The main Y column (i.e., the first one).



# File 'lib/ctioga2/data/dataset.rb', line 105

def y
  return @ys[0]
end

#z ⇒ `Object`

The Z column, if applicable.



# File 'lib/ctioga2/data/dataset.rb', line 110

def z
  return @ys[1]
end
