Class: RStore::CSV

Inherits:
Object
Defined in:
lib/rstore/csv.rb

Constructor Details

#initialize(&block) ⇒ CSV

This constructor takes a block that is evaluated in the context of the new instance, so the instance's methods can be called without an explicit receiver. Within the block, call #from at least once, then #to, then #run:

Examples:

RStore::CSV.new do
  from '../easter/children', :recursive => true                   # select a directory or
  from '../christmas/children/toys.csv'                           # file, or
  from 'www.example.com/sweets.csv', :selector => 'pre div.line'  # URL
  to   'company.products'                                         # provide database and table name
  run                                                             # run the program
end


# File 'lib/rstore/csv.rb', line 37

def initialize &block
  @data_hash  = {}
  @data_array = []
  @database   = nil
  @table      = nil

  # Tracking method calls to #from, #to, and #run.
  @from = false
  @to   = false
  @run  = false

  instance_eval(&block) if block_given?

end
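
The block-based constructor relies on instance_eval, which runs the block with self set to the new object. The following is a minimal sketch of that pattern only, not of RStore itself: MiniStore and its sources, target and ran? readers are hypothetical names, and the methods merely record their arguments without touching files or a database.

```ruby
# Hypothetical stand-in for RStore::CSV; it only records the calls it receives.
class MiniStore
  attr_reader :sources, :target

  def initialize(&block)
    @sources = []
    @target  = nil
    @ran     = false
    # Evaluate the block with self set to this instance, so that
    # from, to and run can be called without an explicit receiver.
    instance_eval(&block) if block_given?
  end

  # Remember a source together with its options.
  def from(source, options = {})
    @sources << [source, options]
  end

  # Remember the 'database.table' string.
  def to(db_table)
    @target = db_table
  end

  def run
    @ran = true
  end

  def ran?
    @ran
  end
end

store = MiniStore.new do
  from 'toys.csv'
  from 'children', :recursive => true
  to   'company.products'
  run
end

p store.sources
p store.target
```

Because the block is evaluated with instance_eval, instance variables and self from the surrounding scope are not visible inside it; local variables still are.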

Instance Attribute Details

#data_array ⇒ Array<Data> (readonly)



# File 'lib/rstore/csv.rb', line 20

def data_array
  @data_array
end

#database ⇒ BaseDB (readonly)



# File 'lib/rstore/csv.rb', line 16

def database
  @database
end

#table ⇒ BaseTable (readonly)



# File 'lib/rstore/csv.rb', line 18

def table
  @table
end

Class Method Details

.change_default_options(options)

This method returns an undefined value.

Change the default options recognized by #from. The new option values apply to all following instances of RStore::CSV. Options can be reset to their defaults by calling .reset_default_options. See #from for a list of all options and their default values.

Examples:

# Search directories recursively and handle the first row of a file as data by default
RStore::CSV.change_default_options(:recursive => true, :has_headers => false)


# File 'lib/rstore/csv.rb', line 275

def self.change_default_options options
  Configuration.change_default_options(options)
end

.query(db_table) {|table| ... }

This method returns an undefined value.

Easy querying by yielding a Sequel::Dataset instance of your table.

Examples:

RStore::CSV.query('company.products') do |table|    # table = Sequel::Dataset object
  table.all                                         # fetch everything
  table.all[3]                                      # fetch row number 4
  table.filter(:id => 2).update(:on_stock => true)  # update entry
  table.filter(:id => 3).delete                     # delete entry
end

Yield Parameters:

  • table (Sequel::Dataset)

    The dataset of your table



# File 'lib/rstore/csv.rb', line 245

def self.query db_table, &block
  database, table = database_table(db_table)
  database.connect do |db|
    block.call(db[table.name]) if block_given?  # Sequel::Dataset
  end
end

.reset_default_options

This method returns an undefined value.

Reset the options recognized by #from to their default values.

Examples:

RStore::CSV.reset_default_options


# File 'lib/rstore/csv.rb', line 285

def self.reset_default_options
  Configuration.reset_default_options
end
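
How changing and resetting class-wide defaults can work is sketched below. This is not RStore's Configuration class, whose source is not shown here: OptionDefaults, FACTORY, current, change and reset are hypothetical names, and only three of the documented options are included.

```ruby
# Hypothetical sketch of class-wide option defaults; not RStore's Configuration.
module OptionDefaults
  # A subset of the defaults documented for #from.
  FACTORY = { :has_headers => true, :recursive => false, :col_sep => ',' }.freeze

  @current = FACTORY.dup

  class << self
    # Return a copy of the defaults currently in effect.
    def current
      @current.dup
    end

    # Overwrite some defaults; later readers see the new values.
    def change(options)
      @current = @current.merge(options)
      nil
    end

    # Go back to the factory defaults.
    def reset
      @current = FACTORY.dup
      nil
    end
  end
end

OptionDefaults.change(:recursive => true, :has_headers => false)
p OptionDefaults.current[:recursive]   # true
OptionDefaults.reset
p OptionDefaults.current[:recursive]   # false
```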

Instance Method Details

#from(source, options) #from(source)

This method returns an undefined value.

Specify the source of the csv file(s). There can be several calls to this method on a given instance of RStore::CSV. This method has to be called before #run.

Examples:

store = RStore::CSV.new
# fetching data from a file
store.from '../christmas/children/toys.csv'
# fetching data from a directory
store.from '../easter/children', :recursive => true
# fetching data from a URL
store.from 'www.example.com/sweets.csv', :selector => 'pre div.line'

Overloads:

  • #from(source, options)

    Options Hash (options):

    • :has_headers (Boolean)

      When set to false, the first line of a file is processed as data, otherwise it is discarded. (default: true)

    • :recursive (Boolean)

      When set to true and a directory is given, recursively search for files. Non-csv files are skipped. (default: false)

    • :selector (String)

      Mandatory CSS selector when fetching data from a URL. Uses the same syntax as Nokogiri. (default: "")

    • :col_sep (String)

      The String placed between each field. (default: ",")

    • :row_sep (String, Symbol)

      The String appended to the end of each row. (default: :auto)

    • :quote_char (String)

      The character used to quote fields. (default: '"')

    • :field_size_limit (Integer, Nil)

      The maximum size CSV will read ahead looking for the closing quote for a field. (default: nil)

    • :skip_blanks (Boolean)

      When set to a true value, CSV will skip over any rows with no content. (default: false)

    • :digit_seps (Array)

      The thousands separator and decimal mark used for numbers in the data source (default: [',', '.']). Different countries use different thousands separators and decimal marks, and setting this option ensures that parsing of these numbers succeeds. Note that all numbers will still be stored in the format that Ruby recognizes, that is, with a point (.) as the decimal mark.
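
The effect of :digit_seps can be illustrated with a small sketch. The helper normalize_number below is hypothetical and not part of RStore; it only shows how a [thousands separator, decimal mark] pair maps a number string onto the notation Ruby parses.

```ruby
# Hypothetical helper, not part of RStore: rewrite a number string that uses
# the given [thousands_separator, decimal_mark] pair into Ruby's own notation.
def normalize_number(text, digit_seps = [',', '.'])
  thousands_sep, decimal_mark = digit_seps
  # Drop every thousands separator, then turn the decimal mark into a point.
  text.gsub(thousands_sep, '').sub(decimal_mark, '.')
end

puts normalize_number('1,234.56')                     # 1234.56
puts normalize_number('1.234,56', ['.', ','])         # 1234.56
puts Float(normalize_number('1.234,56', ['.', ',']))  # 1234.56
```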



# File 'lib/rstore/csv.rb', line 89

def from source, options={}
  crawler = FileCrawler.new(source, :csv, options)
  @data_hash.merge!(crawler.data_hash)
  @from = true
end

#ran_once? ⇒ Boolean

Test if the data has been inserted into the database table.



# File 'lib/rstore/csv.rb', line 261

def ran_once?
  @run == true
end

#run

This method returns an undefined value.

Start processing the csv files, storing the data in a database table. Both #from and #to have to be called before this method. Once the data has been inserted, subsequent calls to #run are ignored.

Raises:

  • (Exception)

    If #from or #to has not been called before #run.


# File 'lib/rstore/csv.rb', line 132

def run
  return  if ran_once?   # Ignore subsequent calls to #run
  raise Exception, "At least one method 'from' has to be called before method 'run'"  unless @from == true
  raise Exception, "Method 'to' has to be called before method 'run'"                 unless @to   == true

  @data_hash.each do |path, data|
    content = read_data(data)
    @data_array << Data.new(path, content, :raw, data.options)
  end

  @database.connect do |db|

    create_table(db)
    name = @table.name

    prepared_data_array = @data_array.map do |data|
      data.parse_csv.convert_fields(db, name)
    end

    insert_all(prepared_data_array, db, name)

    @run = true
    message = <<-TEXT.gsub(/^\s+/, '')
    ===============================
    All data has been successfully inserted into table '#{database.name}.#{table.name}'
    -------------------------------
    You can retrieve all table data with the following code:
    -------------------------------
    #{self.class}.query('#{database.name}.#{table.name}') do |table|
      table.all
    end
    ===============================
    TEXT
    puts message
  end
end
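
The call-order checks at the top of #run can be tried in isolation with the sketch below. GuardedStore and its inserts counter are hypothetical; the class keeps only the three tracking flags from the listing above and replaces the actual file reading and database work with a counter.

```ruby
# Hypothetical stand-in for RStore::CSV that keeps only the call-tracking flags.
class GuardedStore
  attr_reader :inserts

  def initialize
    @from = @to = @run = false
    @inserts = 0   # counts how often the "insert" step actually happened
  end

  def from(_source)
    @from = true
  end

  def to(_db_table)
    @to = true
  end

  def ran_once?
    @run == true
  end

  def run
    return if ran_once?   # ignore subsequent calls
    raise Exception, "At least one method 'from' has to be called before method 'run'" unless @from
    raise Exception, "Method 'to' has to be called before method 'run'" unless @to
    @inserts += 1         # placeholder for reading, parsing and inserting
    @run = true
  end
end

store = GuardedStore.new
begin
  store.run               # neither from nor to has been called yet
rescue Exception => e
  puts e.message
end

store.from 'toys.csv'
store.to   'company.products'
store.run
store.run                 # no effect, data was already inserted
puts store.inserts        # 1
```

Since the listing raises Exception rather than a StandardError subclass, a bare rescue does not catch it; the sketch therefore rescues Exception explicitly.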

#to(db_table)

This method returns an undefined value.

Choose the database table to store the csv data into. This method has to be called before #run.

Examples:

store = RStore::CSV.new
store.to('company.products')


# File 'lib/rstore/csv.rb', line 107

def to db_table
  @database, @table = CSV.database_table(db_table)
  @to       = true
end