Class: Fech::Filing
- Inherits: Object
- Defined in: lib/fech/filing.rb
Overview
Fech::Filing downloads an electronic filing given its ID and searches its rows by row type. Using a child Translator object, it maps the data in each row into a labeled Hash at runtime. Additional Translations may be added to change how the data is mapped and cleaned.
Direct Known Subclasses
Constant Summary
- FIRST_V3_FILING = 11850
  First filing number using the version >= 3.00 format. Note that there are plenty of < v3 filings after this, so readable? still needs to be checked.
Instance Attribute Summary
- #download_dir ⇒ Object
  Returns the value of attribute download_dir.
- #filing_id ⇒ Object
  Returns the value of attribute filing_id.
- #translator ⇒ Object
  Returns the value of attribute translator.
Class Method Summary
- .map_for(row_type, opts = {}) ⇒ Object
  Returns the column names for given row type and version in the order they appear in row data.
Instance Method Summary
- #amendment? ⇒ Boolean
  Whether this filing amends a previous filing or not.
- #amends ⇒ Object
  Returns the filing ID of the past filing this one amends, nil if this is a first-draft filing.
- #custom_file_path ⇒ Object
  The file path where custom versions of a filing are to be saved.
- #delimiter ⇒ String
  The delimiter used in the filing’s version.
- #download ⇒ Object
  Saves the filing data from the FEC website into the default download directory.
- #each_row(opts = {}) {|Array| ... } ⇒ Object
  Iterates over and yields the Filing’s lines.
- #each_row_with_index(&block) ⇒ Object
  Wrapper around #each_row to include indexes.
- #file_contents ⇒ Object
  The raw contents of the Filing.
- #file_name ⇒ Object
- #file_path ⇒ Object
  The location of the Filing on the file system.
- #filing_url ⇒ Object
- #filing_version ⇒ Object
  The version of the FEC software used to generate this Filing.
- #fix_f99_contents ⇒ Object
  Handle the contents of F99s by removing the [BEGINTEXT] and [ENDTEXT] delimiters and putting the text content onto the same line as the summary.
- #form_type ⇒ Object
  Determine the form type of the filing before it’s been parsed.
- #hash_zip(keys, values) ⇒ Fech::Mapped, Hash
  Combines an array of keys and values into an Fech::Mapped object, a type of Hash.
- #header(opts = {}) ⇒ Hash
  Access the header (first) line of the filing, containing information about the filing’s version and metadata about the software used to file it.
- #initialize(filing_id, opts = {}) ⇒ Filing (constructor)
  Create a new Filing object, assign the download directory to system’s temp folder by default.
- #map(row, opts = {}) ⇒ Object
  Maps a raw row to a labeled hash following any rules given in the filing’s Translator based on its version and row type.
- #map_for(row_type) ⇒ Object
  Returns the column names for given row type and the filing’s version in the order they appear in row data.
- #mappings ⇒ Object
  Gets or creates the Mappings instance for this filing_version.
- #parse_filing_version ⇒ Object
  Pulls out the version number from the header line.
- #parse_row?(row, opts = {}) ⇒ Boolean
  Decides what to do with a given row.
- #readable? ⇒ Boolean
  Only FEC format 3.00 and later is supported.
- #resave_f99_contents ⇒ Object
  Resave the “fixed” version of an F99.
- #rows_like(row_type, opts = {}) {|Hash| ... } ⇒ Array
  Access all lines of the filing that match a given row type.
- #summary ⇒ Hash
  Access the summary (second) line of the filing, containing aggregate and top-level information about the filing.
- #translate {|t| ... } ⇒ Object
Constructor Details
#initialize(filing_id, opts = {}) ⇒ Filing
Create a new Filing object, assign the download directory to system’s temp folder by default.
# File 'lib/fech/filing.rb', line 23
def initialize(filing_id, opts={})
  @filing_id = filing_id
  @download_dir = opts[:download_dir] || Dir.tmpdir
  @translator = Fech::Translator.new(:include => opts[:translate])
  @quote_char = opts[:quote_char] || '"'
  @csv_parser = opts[:csv_parser] || Fech::Csv
  @resaved = false
  @customized = false
  @encoding = opts[:encoding] || 'iso-8859-1:utf-8'
end
Instance Attribute Details
#download_dir ⇒ Object
Returns the value of attribute download_dir.
# File 'lib/fech/filing.rb', line 16
def download_dir
  @download_dir
end
#filing_id ⇒ Object
Returns the value of attribute filing_id.
# File 'lib/fech/filing.rb', line 16
def filing_id
  @filing_id
end
#translator ⇒ Object
Returns the value of attribute translator.
# File 'lib/fech/filing.rb', line 16
def translator
  @translator
end
Class Method Details
.map_for(row_type, opts = {}) ⇒ Object
Returns the column names for given row type and version in the order they appear in row data.
# File 'lib/fech/filing.rb', line 168
def self.map_for(row_type, opts={})
  Fech::Mappings.for_row(row_type, opts)
end
Instance Method Details
#amendment? ⇒ Boolean
Whether this filing amends a previous filing or not.
# File 'lib/fech/filing.rb', line 183
def amendment?
  !amends.nil?
end
#amends ⇒ Object
Returns the filing ID of the past filing this one amends, or nil if this is a first-draft filing. The :report_id field in the HDR line references the amended filing.
# File 'lib/fech/filing.rb', line 190
def amends
  header[:report_id]
end
#custom_file_path ⇒ Object
The file path where custom versions of a filing are to be saved.
# File 'lib/fech/filing.rb', line 259
def custom_file_path
  File.join(download_dir, "fech_#{file_name}")
end
#delimiter ⇒ String
Returns the delimiter used in the filing’s version.
# File 'lib/fech/filing.rb', line 334
def delimiter
  filing_version.to_f < 6 ? "," : "\034"
end
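The version check in #delimiter can be tried in isolation. The following is a minimal sketch rather than Fech's own code: `delimiter_for` and `split_row` are hypothetical helper names, the sample header values are made up, and `String#split` stands in for the CSV parser, so quoted fields are not handled.

```ruby
# ASCII file separator (octal 034, hex 0x1C).
FS = "\034"

# Hypothetical helper mirroring the rule in #delimiter: versions below 6
# are comma-separated, later versions use the file separator.
def delimiter_for(version)
  version.to_f < 6 ? "," : FS
end

# Naive split for illustration only; it ignores CSV quoting.
def split_row(line, version)
  line.chomp.split(delimiter_for(version))
end

p split_row("HDR,FEC,5.00,ExampleSoft\n", "5.00")
p split_row(["HDR", "FEC", "8.0", "ExampleSoft"].join(FS) + "\n", "8.0")
```

Both calls should print the same four-element shape, which is the point of routing every read through one delimiter lookup.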
#download ⇒ Object
Saves the filing data from the FEC website into the default download directory.
# File 'lib/fech/filing.rb', line 36
def download
  File.open(file_path, 'w') do |file|
    begin
      file << open(filing_url).read
    rescue
      file << open(filing_url).read.ensure_encoding('UTF-8',
        :external_encoding => Encoding::UTF_8,
        :invalid_characters => :drop)
    end
  end
  self
end
#each_row(opts = {}) {|Array| ... } ⇒ Object
Iterates over and yields the Filing’s lines.
# File 'lib/fech/filing.rb', line 309
def each_row(opts={}, &block)
  unless File.exists?(file_path)
    raise "File #{file_path} does not exist. Try invoking the .download method on this Filing object."
  end

  # If this is an F99, we need to parse it differently.
  resave_f99_contents if ['F99', '"F99"'].include? form_type

  c = 0
  @csv_parser.parse_row(@customized ? custom_file_path : file_path, opts.merge(:col_sep => delimiter, :quote_char => @quote_char, :skip_blanks => true, :encoding => @encoding)) do |row|
    if opts[:with_index]
      yield [row, c]
      c += 1
    else
      yield row
    end
  end
end
#each_row_with_index(&block) ⇒ Object
Wrapper around #each_row that also yields each row’s index.
# File 'lib/fech/filing.rb', line 329
def each_row_with_index(&block)
  each_row(:with_index => true, &block)
end
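The :with_index option yields a single two-element Array, which a block written as `|row, index|` destructures. A small sketch of that pattern, assuming in-memory lines instead of a downloaded file; `each_sample_row` is a hypothetical name and the sample rows are made up:

```ruby
# Hypothetical helper: yields each non-blank, comma-separated line as an
# Array, optionally paired with a running index.
def each_sample_row(lines, opts = {})
  count = 0
  lines.each do |line|
    next if line.strip.empty?      # comparable to :skip_blanks => true
    row = line.chomp.split(",")
    if opts[:with_index]
      yield [row, count]           # one Array; |row, index| destructures it
      count += 1
    else
      yield row
    end
  end
end

# Returns the second non-blank row, skipping the header at index 0.
def second_row(lines)
  each_sample_row(lines, :with_index => true) do |row, index|
    next if index == 0
    return row
  end
  nil
end

lines = ["HDR,FEC,5.00\n", "\n", "F3,C00000000\n", "SA11AI,C00000000\n"]
p second_row(lines)
```

Because blank lines are skipped before the counter advances, the index counts parsed rows rather than physical lines in the file.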
#file_contents ⇒ Object
The raw contents of the Filing.
# File 'lib/fech/filing.rb', line 236
def file_contents
  File.open(file_path, 'r')
end
#file_name ⇒ Object
# File 'lib/fech/filing.rb', line 297
def file_name
  "#{filing_id}.fec"
end
#file_path ⇒ Object
The location of the Filing on the file system.
# File 'lib/fech/filing.rb', line 231
def file_path
  File.join(download_dir, file_name)
end
#filing_url ⇒ Object
# File 'lib/fech/filing.rb', line 301
def filing_url
  "http://docquery.fec.gov/dcdev/posted/#{filing_id}.fec"
end
#filing_version ⇒ Object
The version of the FEC software used to generate this Filing.
# File 'lib/fech/filing.rb', line 204
def filing_version
  @filing_version ||= parse_filing_version
end
#fix_f99_contents ⇒ Object
Handle the contents of F99s by removing the [BEGINTEXT] and [ENDTEXT] delimiters and putting the text content onto the same line as the summary.
# File 'lib/fech/filing.rb', line 267
def fix_f99_contents
  @customized = true
  content = file_contents.read

  if RUBY_VERSION > "1.9.2"
    content.encode!('UTF-16', 'UTF-8', :invalid => :replace, :undef => :replace, :replace => '?')
    content.encode!('UTF-8', 'UTF-16')
  else
    require 'iconv'
    ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
    content = ic.iconv(content + ' ')[0..-2] # add valid byte before converting, then remove it
  end

  regex = /\n\[BEGINTEXT\]\n(.*?)\[ENDTEXT\]\n/mi # some use eg [EndText]
  match = content.match(regex)
  if match
    repl = match[1].gsub(/"/, '""')
    content.gsub(regex, "#{delimiter}\"#{repl}\"")
  else
    content
  end
end
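To see what the F99 rewrite does to the text, the regex step can be run on a small string. This is a sketch under assumptions: `inline_f99_text` is a hypothetical name, the sample filing is invented, the encoding cleanup is left out, and the block form of `gsub` is used here (unlike the source) so that backslashes in the free text are not read as back-references.

```ruby
# Same pattern as in #fix_f99_contents: case-insensitive, and "." spans
# newlines so multi-line text is captured.
F99_TEXT = /\n\[BEGINTEXT\]\n(.*?)\[ENDTEXT\]\n/mi

# Hypothetical helper: moves the free text onto the preceding row as one
# quoted field, doubling any embedded quotes.
def inline_f99_text(content, delimiter)
  match = content.match(F99_TEXT)
  return content unless match

  quoted = match[1].gsub('"', '""')
  content.gsub(F99_TEXT) { "#{delimiter}\"#{quoted}\"" }
end

sample = "F99,C00000000,Example Committee\n" \
         "[BEGINTEXT]\n" \
         "See the \"attached\" letter.\n" \
         "[EndText]\n"

puts inline_f99_text(sample, ",")
```

The newline before the marker is part of the match, which is what pulls the text up onto the summary row; the newline inside the captured text stays within the quoted field.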
#form_type ⇒ Object
Determine the form type of the filing before it’s been parsed. This is needed for the F99 special case.
# File 'lib/fech/filing.rb', line 243
def form_type
  if RUBY_VERSION >= "2.0"
    lines = file_contents.each_line
  else
    lines = file_contents.lines
  end

  lines.each_with_index do |row, index|
    next if index == 0
    return row.split(delimiter).first
  end
end
#hash_zip(keys, values) ⇒ Fech::Mapped, Hash
Combines an array of keys and values into an Fech::Mapped object, a type of Hash.
# File 'lib/fech/filing.rb', line 199
def hash_zip(keys, values)
  Fech::Mapped.new(self, values.first).merge(Hash[*keys.zip(values).flatten])
end
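The zip-and-splat idiom used in #hash_zip can be looked at with plain Hashes. This sketch assumes nothing about Fech::Mapped: `zip_with_splat` and `zip_direct` are hypothetical names, the field names are illustrative, and `Array#to_h` (Ruby 2.1 and later) is offered only as a comparison, not as what the library does.

```ruby
# The idiom from #hash_zip, returning a plain Hash.
def zip_with_splat(keys, values)
  Hash[*keys.zip(values).flatten]
end

# Equivalent for flat values, without flattening.
def zip_direct(keys, values)
  keys.zip(values).to_h
end

keys   = [:form_type, :committee_id, :committee_name]
values = ["F3", "C00000000", "Example Committee"]

p zip_with_splat(keys, values)
p zip_direct(keys, values)

# flatten recurses into nested Arrays, so an Array value breaks the pairing.
begin
  zip_with_splat([:a, :b], ["x", ["y", "z"]])
rescue ArgumentError => e
  puts "splat form failed: #{e.class}"
end
p zip_direct([:a, :b], ["x", ["y", "z"]])
```

For rows made of strings the two forms agree, so the difference only matters if a value is itself an Array.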
#header(opts = {}) ⇒ Hash
Access the header (first) line of the filing, containing information about the filing’s version and metadata about the software used to file it.
# File 'lib/fech/filing.rb', line 51
def header(opts={})
  each_row do |row|
    return parse_row?(row)
  end
end
#map(row, opts = {}) ⇒ Object
Maps a raw row to a labeled hash following any rules given in the filing’s Translator based on its version and row type. Finds the correct map for a given row, performs any matching Translations on the individual values, and returns either the entire dataset, or just those fields requested.
# File 'lib/fech/filing.rb', line 119
def map(row, opts={})
  data = Fech::Mapped.new(self, row.first)
  full_row_map = map_for(row.first)

  # If specific fields were asked for, return only those
  if opts[:include]
    row_map = full_row_map.select { |k| opts[:include].include?(k) }
  else
    row_map = full_row_map
  end

  # Inserts the row into data, performing any specified preprocessing
  # on individual cells along the way
  row_map.each_with_index do |field, index|
    value = row[full_row_map.index(field)]
    translator.get_translations(:row => row.first,
        :version => filing_version, :action => :convert,
        :field => field).each do |translation|
      # User's Procs should be given each field's value as context
      value = translation[:proc].call(value)
    end
    data[field] = value
  end

  # Performs any specified group preprocessing / combinations
  combinations = translator.get_translations(:row => row.first,
        :version => filing_version,
        :action => :combine)
  row_hash = hash_zip(row_map, row) if combinations
  combinations.each do |translation|
    # User's Procs should be given the entire row as context
    value = translation[:proc].call(row_hash)
    field = translation[:field].source.gsub(/[\^\$]*/, "").to_sym
    data[field] = value
  end

  data
end
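The per-field half of #map (the :convert translations) can be sketched without the Translator. Everything named here is an assumption for illustration: `ROW_MAP`, `CONVERSIONS`, and `map_row` are hypothetical, the column names and sample row are made up, and the :combine step is not covered.

```ruby
require "date"

# Hypothetical column map and per-field conversion Procs; in Fech the map
# comes from Fech::Mappings and the Procs from the filing's Translator.
ROW_MAP = [:form_type, :committee_id, :date_signed]
CONVERSIONS = {
  :date_signed => [lambda { |value| Date.strptime(value, "%Y%m%d") }]
}

def map_row(row, only = nil)
  fields = only ? ROW_MAP.select { |field| only.include?(field) } : ROW_MAP
  fields.each_with_object({}) do |field, data|
    # Look the value up by its position in the full map, so filtering
    # fields never shifts the columns.
    value = row[ROW_MAP.index(field)]
    CONVERSIONS.fetch(field, []).each { |convert| value = convert.call(value) }
    data[field] = value
  end
end

row = ["F3", "C00000000", "20120131"]
p map_row(row)
p map_row(row, [:date_signed])
```

Indexing into the full map is the detail worth noticing: requesting a subset of fields changes which keys come back, not which column each key reads from.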
#map_for(row_type) ⇒ Object
Returns the column names for given row type and the filing’s version in the order they appear in row data.
# File 'lib/fech/filing.rb', line 160
def map_for(row_type)
  mappings.for_row(row_type)
end
#mappings ⇒ Object
Gets or creates the Mappings instance for this filing_version.
# File 'lib/fech/filing.rb', line 226
def mappings
  @mapping ||= Fech::Mappings.new(filing_version)
end
#parse_filing_version ⇒ Object
Pulls out the version number from the header line. Must parse this line manually, since we don’t know the version yet, and thus the delimiter type is still a mystery.
# File 'lib/fech/filing.rb', line 211
def parse_filing_version
  first = File.open(file_path).first
  if first.index("\034").nil?
    @csv_parser.parse(first).flatten[2]
  else
    @csv_parser.parse(first, :col_sep => "\034").flatten[2]
  end
end
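The delimiter sniffing in #parse_filing_version can be reproduced with Ruby's standard CSV library. This is a sketch, assuming the stdlib `CSV.parse_line` in place of the configurable `@csv_parser`; `version_from_header` is a hypothetical name and the header lines are invented samples.

```ruby
require "csv"

FS = "\034" # ASCII file separator used by newer filings

# Hypothetical helper: choose the separator by looking for FS in the raw
# header line, then read the third column as the version string.
def version_from_header(first_line)
  col_sep = first_line.include?(FS) ? FS : ","
  CSV.parse_line(first_line, col_sep: col_sep)[2]
end

puts version_from_header("HDR,FEC,5.00,ExampleSoft,1.0\n")
puts version_from_header(["HDR", "FEC", "8.0", "ExampleSoft", "1.0"].join(FS) + "\n")
```

The version comes back as a String, which is why the surrounding methods call `to_f` or `to_i` on it before comparing.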
#parse_row?(row, opts = {}) ⇒ Boolean
Decides what to do with a given row. If the row’s type matches the desired type, or if no type was specified, it will run the row through #map. If :raw was passed true, a flat, unmapped data array will be returned.
# File 'lib/fech/filing.rb', line 99
def parse_row?(row, opts={})
  return false if row.nil? || row.empty?

  # Always parse, unless :parse_if is given and does not match row
  if opts[:parse_if].nil? || \
      Fech.regexify(opts[:parse_if]).match(row.first.downcase)
    opts[:raw] ? row : map(row, opts)
  else
    false
  end
end
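The three possible results of #parse_row? (false, the raw Array, or a mapped Hash) can be sketched as follows. The helpers are stand-ins, not Fech's code: `to_pattern` is an assumed approximation of `Fech.regexify`, whose behaviour is not shown on this page, and `label_row` is a hypothetical placeholder for #map.

```ruby
# Assumed stand-in for Fech.regexify: anchor a String or Symbol at the
# start of the row type, ignoring case. The real helper may differ.
def to_pattern(label)
  return label if label.is_a?(Regexp)
  /\A#{Regexp.escape(label.to_s)}/i
end

# Hypothetical placeholder for #map: label the first two columns only.
def label_row(row)
  { :form_type => row[0], :committee_id => row[1] }
end

def parse_row(row, opts = {})
  return false if row.nil? || row.empty?

  wanted = opts[:parse_if]
  return false unless wanted.nil? || to_pattern(wanted).match(row.first.downcase)

  opts[:raw] ? row : label_row(row)
end

row = ["SA11AI", "C00000000", "Example Donor"]
p parse_row(row)                                  # mapped Hash
p parse_row(row, :parse_if => :sa, :raw => true)  # raw Array
p parse_row(row, :parse_if => "sb")               # false
p parse_row([])                                   # false
```

Returning false rather than nil lets a caller tell a skipped row apart from a row that was parsed.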
#readable? ⇒ Boolean
Only FEC format 3.00 and later is supported.
# File 'lib/fech/filing.rb', line 221
def readable?
  filing_version.to_i >= 3
end
#resave_f99_contents ⇒ Object
Resave the “fixed” version of an F99.
# File 'lib/fech/filing.rb', line 291
def resave_f99_contents
  return true if @resaved
  File.open(custom_file_path, 'w') { |f| f.write(fix_f99_contents) }
  @resaved = true
end
#rows_like(row_type, opts = {}) {|Hash| ... } ⇒ Array
Access all lines of the filing that match a given row type. Will return an Array of all available lines if called directly, or will yield the mapped rows one by one if a block is passed.
# File 'lib/fech/filing.rb', line 78
def rows_like(row_type, opts={}, &block)
  data = []
  each_row(:row_type => row_type) do |row|
    value = parse_row?(row, opts.merge(:parse_if => row_type))
    next if value == false
    if block_given?
      yield value
    else
      data << value if value
    end
  end
  block_given? ? nil : data
end
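The two calling styles of #rows_like, collecting into an Array or streaming to a block, follow a common Ruby pattern. A minimal sketch over in-memory rows, where `rows_matching` is a hypothetical name, the sample rows are made up, and a simple prefix test stands in for the regex match:

```ruby
# Hypothetical helper: returns matching rows as an Array when called
# without a block, or yields them one by one and returns nil with a block.
def rows_matching(rows, prefix)
  wanted = prefix.to_s.downcase
  collected = []
  rows.each do |row|
    next unless row.first.to_s.downcase.start_with?(wanted)
    if block_given?
      yield row
    else
      collected << row
    end
  end
  block_given? ? nil : collected
end

rows = [
  ["HDR", "FEC", "8.0"],
  ["SA11AI", "C00000000", "Example Donor"],
  ["SB17", "C00000000", "Example Vendor"],
  ["SA11B", "C00000000", "Another Donor"]
]

p rows_matching(rows, :sa)
rows_matching(rows, :sb) { |row| puts row.last }
```

The block form never builds the result Array's contents, which matters for filings with many itemized rows.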
#summary ⇒ Hash
Access the summary (second) line of the filing, containing aggregate and top-level information about the filing.
# File 'lib/fech/filing.rb', line 60
def summary
  each_row_with_index do |row, index|
    next if index == 0
    return parse_row?(row)
  end
end
#translate {|t| ... } ⇒ Object
# File 'lib/fech/filing.rb', line 174
def translate(&block)
  if block_given?
    yield translator
  else
    translator
  end
end