Class: RemoteTable::Properties
- Inherits: Object (hierarchy: Object, RemoteTable::Properties)
- Defined in: lib/remote_table/properties.rb
Overview
Represents the properties of a RemoteTable, whether they are explicitly set by the user or inferred automatically.
Instance Attribute Summary
- #current_options ⇒ Object (readonly)
  Returns the value of attribute current_options.
- #t ⇒ Object (readonly)
  Returns the value of attribute t.
Instance Method Summary
- #column_css ⇒ Object
  The CSS selector used to find columns.
- #column_xpath ⇒ Object
  The XPath used to find columns.
- #compression ⇒ Object
  The compression type.
- #crop ⇒ Object
  Crop rows after this line.
- #cut ⇒ Object
  Cut columns up to this character.
- #delimiter ⇒ Object
  The delimiter.
- #errata ⇒ Object
  A hash of options to create a new Errata instance (see the Errata gem at github.com/seamusabshere/errata) to be used on every row.
- #external_encoding ⇒ Object
- #external_encoding_iconv ⇒ Object
- #filename ⇒ Object
  The filename, which can be used to pick a file out of an archive.
- #form_data ⇒ Object
  Form data to send in with the download request.
- #format ⇒ Object
  Get the format in the form of RemoteTable::Format::Excel, etc.
- #glob ⇒ Object
  The glob used to pick a file out of an archive.
- #headers ⇒ Object
  The headers specified by the user.
- #initialize(t) ⇒ Properties (constructor)
  A new instance of Properties.
- #internal_encoding ⇒ Object
- #keep_blank_rows ⇒ Object
  Whether to keep blank rows.
- #output_class ⇒ Object
- #packing ⇒ Object
  The packing type.
- #reject ⇒ Object
  A proc to call to decide whether to return a row.
- #row_css ⇒ Object
  The CSS selector used to find rows.
- #row_xpath ⇒ Object
  The XPath used to find rows.
- #schema ⇒ Object
  The fixed-width schema, given as an array.
- #schema_name ⇒ Object
  The name of the fixed-width schema according to FixedWidth.
- #select ⇒ Object
  A proc to call to decide whether to return a row.
- #sheet ⇒ Object
  The sheet specified by the user as a number or a string.
- #skip ⇒ Object
  How many rows to skip.
- #streaming ⇒ Object
  Whether to stream the rows without caching them.
- #update(options) ⇒ Object
- #uri ⇒ Object
  The parsed URI of the file to get.
- #use_first_row_as_header? ⇒ Boolean
- #warn_on_multiple_downloads ⇒ Object
  Defaults to true.
Constructor Details
#initialize(t) ⇒ Properties
Returns a new instance of Properties.
# File 'lib/remote_table/properties.rb', line 8
def initialize(t)
  @t = t
  @current_options = t.options.symbolize_keys
end
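The constructor normalizes the user's options so that every later lookup can use symbol keys. A minimal sketch of that normalization in plain Ruby follows; `symbolize_keys` here is a hypothetical stand-in for the ActiveSupport method of the same name, and it assumes every key responds to `to_sym`.

```ruby
# Hypothetical plain-Ruby stand-in for ActiveSupport's Hash#symbolize_keys.
# Assumption: every key responds to to_sym.
def symbolize_keys(hash)
  hash.transform_keys(&:to_sym)
end

# Mixed string and symbol keys, as a caller might pass them.
user_options = { 'delimiter' => ';', :skip => 2 }
current_options = symbolize_keys(user_options)

# Lookups by symbol now work regardless of how the key was given.
puts current_options[:delimiter]
puts current_options[:skip]
```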
Instance Attribute Details
#current_options ⇒ Object (readonly)
Returns the value of attribute current_options.
# File 'lib/remote_table/properties.rb', line 6
def current_options
  @current_options
end
#t ⇒ Object (readonly)
Returns the value of attribute t.
# File 'lib/remote_table/properties.rb', line 5
def t
  @t
end
Instance Method Details
#column_css ⇒ Object
The CSS selector used to find columns
# File 'lib/remote_table/properties.rb', line 116
def column_css
  current_options[:column_css]
end
#column_xpath ⇒ Object
The XPath used to find columns
# File 'lib/remote_table/properties.rb', line 106
def column_xpath
  current_options[:column_xpath]
end
#compression ⇒ Object
The compression type.
Default: guessed from URI.
Can be specified as: :gz, :zip, :bz2, :exe (treated as :zip)
# File 'lib/remote_table/properties.rb', line 125
def compression
  if current_options.has_key?(:compression)
    return current_options[:compression]
  end
  case ::File.extname(uri.path).downcase
  when /gz/, /gunzip/
    :gz
  when /zip/
    :zip
  when /bz2/, /bunzip2/
    :bz2
  when /exe/
    :exe
  end
end
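The guessing logic can be tried outside the class. The sketch below mirrors the method above as a standalone function; `guess_compression` and the example.com URLs are hypothetical names chosen for illustration, not part of RemoteTable.

```ruby
require 'uri'

# Standalone sketch of the extension-based guess shown above.
# guess_compression is a hypothetical helper, not part of RemoteTable.
def guess_compression(url, options = {})
  # An explicit option always wins over guessing.
  return options[:compression] if options.has_key?(:compression)
  case ::File.extname(::URI.parse(url).path).downcase
  when /gz/, /gunzip/ then :gz
  when /zip/ then :zip
  when /bz2/, /bunzip2/ then :bz2
  when /exe/ then :exe
  end
end

p guess_compression('http://example.com/data.csv.gz')
p guess_compression('http://example.com/08data.zip')
p guess_compression('http://example.com/data.csv')
p guess_compression('http://example.com/data.bin', :compression => :bz2)
```

Only the last extension is inspected, so a plain `.csv` URL yields `nil` (no compression).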
#crop ⇒ Object
Crop rows after this line
# File 'lib/remote_table/properties.rb', line 177
def crop
  current_options[:crop]
end
#cut ⇒ Object
Cut columns up to this character
# File 'lib/remote_table/properties.rb', line 172
def cut
  current_options[:cut]
end
#delimiter ⇒ Object
The delimiter
Default: “,”
# File 'lib/remote_table/properties.rb', line 96
def delimiter
  current_options[:delimiter] || ','
end
#errata ⇒ Object
A hash of options to create a new Errata instance (see the Errata gem at github.com/seamusabshere/errata) to be used on every row.
# File 'lib/remote_table/properties.rb', line 212
def errata
  return unless current_options.has_key? :errata
  @errata ||= if current_options[:errata].is_a? ::Hash
    ::Errata.new current_options[:errata]
  else
    current_options[:errata]
  end
end
#external_encoding ⇒ Object
# File 'lib/remote_table/properties.rb', line 85
def external_encoding
  'UTF-8'
end
#external_encoding_iconv ⇒ Object
# File 'lib/remote_table/properties.rb', line 89
def external_encoding_iconv
  'UTF-8//TRANSLIT'
end
#filename ⇒ Object
The filename, which can be used to pick a file out of an archive.
Example:
RemoteTable.new 'http://www.fueleconomy.gov/FEG/epadata/08data.zip', :filename => '2008_FE_guide_ALL_rel_dates_-no sales-for DOE-5-1-08.csv'
# File 'lib/remote_table/properties.rb', line 167
def filename
  current_options[:filename]
end
#form_data ⇒ Object
Form data to send in with the download request
# File 'lib/remote_table/properties.rb', line 70
def form_data
  current_options[:form_data]
end
#format ⇒ Object
Get the format in the form of RemoteTable::Format::Excel, etc.
Note: treats all spreadsheets.google.com and docs.google.com URLs as Format::Delimited (i.e., CSV)
Default: guessed from file extension (which is usually the same as the URI, but sometimes not if you pick out a specific file from an archive)
Can be specified as: :xlsx, :xls, :delimited (aka :csv and :tsv), :ods, :fixed_width, :html, :xml
# File 'lib/remote_table/properties.rb', line 228
def format
  return Format::Delimited if uri.host == 'spreadsheets.google.com' or @uri.host == 'docs.google.com'
  clue = if current_options.has_key?(:format)
    current_options[:format]
  else
    t.local_file.path
  end
  case clue.to_s.downcase
  when /xlsx/, /excelx/
    Format::Excelx
  when /xls/, /excel/
    Format::Excel
  when /csv/, /tsv/, /delimited/
    Format::Delimited
  when /ods/, /open_?office/
    Format::OpenOffice
  when /fixed_?width/
    Format::FixedWidth
  when /htm/
    Format::HTML
  when /xml/
    Format::XML
  else
    Format::Delimited
  end
end
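The order of the `when` clauses matters, because each pattern matches anywhere in the clue. The sketch below reproduces the matching with symbols in place of the Format classes; `guess_format` and the sample paths are hypothetical and exist only for illustration.

```ruby
# Standalone sketch of the clue matching shown above, returning symbols
# instead of RemoteTable::Format classes. guess_format is a hypothetical name.
def guess_format(clue)
  case clue.to_s.downcase
  when /xlsx/, /excelx/ then :excelx      # must come before /xls/
  when /xls/, /excel/ then :excel
  when /csv/, /tsv/, /delimited/ then :delimited
  when /ods/, /open_?office/ then :open_office
  when /fixed_?width/ then :fixed_width
  when /htm/ then :html
  when /xml/ then :xml
  else :delimited                         # fallback when nothing matches
  end
end

p guess_format('/tmp/report.xlsx')
p guess_format(:xls)
p guess_format('/tmp/data.txt')
p guess_format('/methods/data.txt')
```

Since the patterns are unanchored, a path such as `/methods/data.txt` matches `/ods/` and is taken for an OpenOffice file. Passing `:format` explicitly avoids that kind of surprise.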
#glob ⇒ Object
The glob used to pick a file out of an archive.
Example:
RemoteTable.new 'http://www.fueleconomy.gov/FEG/epadata/08data.zip', :glob => '/*.csv'
# File 'lib/remote_table/properties.rb', line 159
def glob
  current_options[:glob]
end
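How the glob is applied is not shown on this page. One plausible reading, sketched below, is that the archive is unpacked into a temporary directory and the glob is appended to that directory's path. That is an assumption, and `pick_by_glob` and the file names are hypothetical.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: pick the first file under dir that matches glob.
# Assumption: the glob is appended to the unpack directory, which is why
# the documented example starts with a slash ('/*.csv').
def pick_by_glob(dir, glob)
  ::Dir[dir + glob].sort.first
end

picked = nil
Dir.mktmpdir do |dir|
  # Simulate an unpacked archive holding two files.
  FileUtils.touch(File.join(dir, 'readme.txt'))
  FileUtils.touch(File.join(dir, 'data.csv'))
  picked = File.basename(pick_by_glob(dir, '/*.csv'))
end

puts picked
```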
#headers ⇒ Object
The headers specified by the user
Default: :first_row
# File 'lib/remote_table/properties.rb', line 43
def headers
  current_options[:headers].nil? ? :first_row : current_options[:headers]
end
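The explicit `nil?` check is what lets a caller pass `:headers => false`. With `||` the `false` would be replaced by the default. A small sketch of the difference, using a hypothetical `headers_for` function over a plain options hash:

```ruby
# Hypothetical standalone version of the default logic shown above.
def headers_for(options)
  options[:headers].nil? ? :first_row : options[:headers]
end

p headers_for({})                           # nothing given: use the default
p headers_for(:headers => false)            # explicit false survives
p headers_for(:headers => %w[make model])   # explicit list survives

# For contrast: || cannot tell false from "not given".
p({ :headers => false }[:headers] || :first_row)
```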
#internal_encoding ⇒ Object
# File 'lib/remote_table/properties.rb', line 81
def internal_encoding
  (current_options[:encoding] || 'UTF-8').upcase
end
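This page does not show how the encodings are used. A reasonable guess, given that `external_encoding` is fixed at UTF-8, is that `internal_encoding` names the encoding of the source file and rows are transcoded from it to UTF-8. The sketch below illustrates that assumption with Ruby's built-in `String#encode`; `to_utf8` is a hypothetical helper.

```ruby
# Hypothetical helper. Assumption: internal_encoding describes the source
# bytes, and the output should be UTF-8 (the external_encoding shown above).
def to_utf8(bytes, encoding_option = nil)
  internal_encoding = (encoding_option || 'UTF-8').upcase
  bytes.dup.force_encoding(internal_encoding).encode('UTF-8')
end

# "café" as Latin-1 bytes: the final byte 0xE9 is not valid UTF-8 on its own.
latin1 = "caf\xE9".b
converted = to_utf8(latin1, 'iso-8859-1')

puts converted
puts converted.encoding
```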
#keep_blank_rows ⇒ Object
Whether to keep blank rows
Default: false
# File 'lib/remote_table/properties.rb', line 65
def keep_blank_rows
  current_options[:keep_blank_rows] || false
end
#output_class ⇒ Object
# File 'lib/remote_table/properties.rb', line 51
def output_class
  headers == false ? ::Array : ::ActiveSupport::OrderedHash
end
#packing ⇒ Object
The packing type.
Default: guessed from URI.
Can be specified as: :tar
# File 'lib/remote_table/properties.rb', line 146
def packing
  if current_options.has_key?(:packing)
    return current_options[:packing]
  end
  if uri.path =~ %r{\.tar(?:\.|$)}i
    :tar
  end
end
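The regular expression accepts `.tar` only when it ends the path or is followed by another dot. A quick sketch of what it does and does not match, with a hypothetical `tar_packed?` helper and made-up paths:

```ruby
# Same pattern as in the method above.
TAR_PATTERN = %r{\.tar(?:\.|$)}i

# Hypothetical helper wrapping the match in a boolean.
def tar_packed?(path)
  !!(path =~ TAR_PATTERN)
end

p tar_packed?('/downloads/data.tar')      # bare tarball
p tar_packed?('/downloads/data.tar.gz')   # compressed tarball
p tar_packed?('/downloads/DATA.TAR.BZ2')  # case-insensitive
p tar_packed?('/downloads/data.tgz')      # not recognized by this pattern
```

A `.tgz` URL is therefore not detected as tar-packed, so `:packing => :tar` would have to be given explicitly in that case.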
#reject ⇒ Object
A proc called with each row to decide whether to return it; rows for which it returns a true value are skipped.
# File 'lib/remote_table/properties.rb', line 207
def reject
  current_options[:reject]
end
#row_css ⇒ Object
The CSS selector used to find rows
# File 'lib/remote_table/properties.rb', line 111
def row_css
  current_options[:row_css]
end
#row_xpath ⇒ Object
The XPath used to find rows
# File 'lib/remote_table/properties.rb', line 101
def row_xpath
  current_options[:row_xpath]
end
#schema ⇒ Object
The fixed-width schema, given as an array
Example:
RemoteTable.new('http://cloud.github.com/downloads/seamusabshere/remote_table/test2.fixed_width.txt',
:format => :fixed_width,
:skip => 1,
:schema => [[ 'header4', 10, { :type => :string } ],
[ 'spacer', 1 ],
[ 'header5', 10, { :type => :string } ],
[ 'spacer', 12 ],
[ 'header6', 10, { :type => :string } ]])
# File 'lib/remote_table/properties.rb', line 192
def schema
  current_options[:schema]
end
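To make the schema format concrete, here is a rough sketch of how a list of `[name, width, options]` entries can slice one fixed-width line. It is not the FixedWidth gem's implementation. `parse_fixed_width` is hypothetical, and dropping columns named `'spacer'` is an assumption based on the example above.

```ruby
# Schema in the same shape as the documented example.
SCHEMA = [['header4', 10, { :type => :string }],
          ['spacer', 1],
          ['header5', 10, { :type => :string }],
          ['spacer', 12],
          ['header6', 10, { :type => :string }]]

# Hypothetical parser: walk the schema, cutting `width` characters per entry.
# Assumption: entries named 'spacer' are padding and are not returned.
def parse_fixed_width(line, schema)
  offset = 0
  schema.each_with_object({}) do |(name, width, _opts), row|
    field = line[offset, width].to_s
    offset += width
    next if name == 'spacer'
    row[name] = field.strip
  end
end

# Build a 43-character line that fits the schema widths (10 + 1 + 10 + 12 + 10).
line = 'alpha'.ljust(10) + ' ' + 'beta'.ljust(10) + ' ' * 12 + 'gamma'.ljust(10)
row = parse_fixed_width(line, SCHEMA)
p row
```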
#schema_name ⇒ Object
The name of the fixed-width schema according to FixedWidth
# File 'lib/remote_table/properties.rb', line 197
def schema_name
  current_options[:schema_name]
end
#select ⇒ Object
A proc called with each row to decide whether to return it; only rows for which it returns a true value are returned.
# File 'lib/remote_table/properties.rb', line 202
def select
  current_options[:select]
end
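The code that applies the `:select` and `:reject` procs lives outside this class. The sketch below assumes they behave like `Enumerable#select` and `Enumerable#reject`, which the names suggest but this page does not confirm. `filter_rows` and the sample rows are hypothetical.

```ruby
# Hypothetical filter. Assumption: a row is returned only if the :select proc
# (when given) accepts it and the :reject proc (when given) does not.
def filter_rows(rows, options = {})
  rows.select do |row|
    next false if options[:select] && !options[:select].call(row)
    next false if options[:reject] && options[:reject].call(row)
    true
  end
end

rows = [
  { 'year' => 2008, 'mpg' => 30 },
  { 'year' => 2007, 'mpg' => 28 },
  { 'year' => 2008, 'mpg' => nil }
]

kept = filter_rows(rows,
                   :select => lambda { |row| row['year'] == 2008 },
                   :reject => lambda { |row| row['mpg'].nil? })
p kept
```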
#sheet ⇒ Object
The sheet specified by the user as a number or a string
Default: 0
# File 'lib/remote_table/properties.rb', line 58
def sheet
  current_options[:sheet] || 0
end
#skip ⇒ Object
How many rows to skip
Default: 0
# File 'lib/remote_table/properties.rb', line 77
def skip
  current_options[:skip] || 0
end
#streaming ⇒ Object
Whether to stream the rows without caching them. Saves memory, but you have to re-download the file every time you…
- call []
- call each
Defaults to false.
# File 'lib/remote_table/properties.rb', line 31
def streaming
  current_options[:streaming] || false
end
#update(options) ⇒ Object
# File 'lib/remote_table/properties.rb', line 13
def update(options)
  current_options.update options
end
#uri ⇒ Object
The parsed URI of the file to get.
# File 'lib/remote_table/properties.rb', line 18
def uri
  return @uri if @uri.is_a?(::URI)
  @uri = ::URI.parse t.url
  if @uri.host == 'spreadsheets.google.com' or @uri.host == 'docs.google.com'
    @uri.query = 'output=csv&' + @uri.query.sub(/\&?output=.*?(\&|\z)/, '\1')
  end
  @uri
end
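For Google spreadsheet URLs, the method strips any existing `output=` parameter and prepends `output=csv`. The sketch below isolates that rewrite. `force_csv_output` and the `key=abc` query are hypothetical, and treating a missing query string as empty is an added assumption (the method above calls `sub` on the query directly).

```ruby
require 'uri'

# Hypothetical standalone version of the query rewrite shown above.
def force_csv_output(url)
  uri = ::URI.parse(url)
  # Assumption: a URL without a query string is treated as an empty query.
  query = uri.query.to_s
  # Remove an existing output=... parameter, keeping a following '&' if any.
  uri.query = 'output=csv&' + query.sub(/\&?output=.*?(\&|\z)/, '\1')
  uri
end

puts force_csv_output('http://spreadsheets.google.com/pub?key=abc&output=html').query
puts force_csv_output('http://spreadsheets.google.com/pub?key=abc&output=html&gid=0').query
puts force_csv_output('http://spreadsheets.google.com/pub?key=abc').query
```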
#use_first_row_as_header? ⇒ Boolean
# File 'lib/remote_table/properties.rb', line 47
def use_first_row_as_header?
  headers == :first_row
end
#warn_on_multiple_downloads ⇒ Object
Whether to warn on multiple downloads. Defaults to true.
# File 'lib/remote_table/properties.rb', line 36
def warn_on_multiple_downloads
  current_options[:warn_on_multiple_downloads] != false
end
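Comparing against `false` is how this option gets a default of true: anything other than an explicit `false`, including an absent key or `nil`, leaves the warning on. A short sketch with a hypothetical `warn_on_multiple_downloads?` function over a plain hash:

```ruby
# Hypothetical standalone version of the check shown above.
def warn_on_multiple_downloads?(options)
  options[:warn_on_multiple_downloads] != false
end

p warn_on_multiple_downloads?({})                                     # key absent
p warn_on_multiple_downloads?(:warn_on_multiple_downloads => nil)     # nil
p warn_on_multiple_downloads?(:warn_on_multiple_downloads => false)   # explicit opt-out
```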