FastCSV
A fast Ragel-based CSV parser.
Usage
require 'fastcsv'
# Read from file.
File.open(filename) do |f|
  FastCSV.raw_parse(f) do |row|
    # do stuff
  end
end
# Read from an IO object.
FastCSV.raw_parse(StringIO.new("foo,bar\n")) do |row|
  # do stuff
end
# Read from a string.
FastCSV.raw_parse("foo,bar\n") do |row|
  # do stuff
end
# Transcode like with the CSV module.
FastCSV.raw_parse("\xF1\n", encoding: 'iso-8859-1:utf-8') do |row|
  # ["ñ"]
end
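To make the transcoding example above concrete, here is a minimal sketch in plain Ruby (no FastCSV involved) of what an "iso-8859-1:utf-8" conversion does to the single byte 0xF1. It assumes the option string is read as "external:internal", as with Ruby's IO encodings; the variable names are illustrative only.

```ruby
# Assumption: "iso-8859-1:utf-8" means "external:internal", as in Ruby's IO.
raw    = "\xF1".b                            # one raw byte, 0xF1
latin1 = raw.force_encoding('iso-8859-1')    # label the byte as ISO-8859-1
utf8   = latin1.encode('utf-8')              # transcode to UTF-8

puts utf8       # ñ
p utf8.bytes    # [195, 177]
```

The single ISO-8859-1 byte becomes a two-byte UTF-8 sequence, which is why the row yielded above contains "ñ" rather than an invalid byte.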
Development
ragel -G2 ext/fastcsv/fastcsv.rl
ragel -Vp ext/fastcsv/fastcsv.rl | dot -Tpng -o machine.png
rake compile
gem uninstall fastcsv
rake install
Why?
We evaluated many Ruby CSV gems, and each was either too slow or had implementation errors. rcsv is fast and libcsv-based, but it skips blank rows (Ruby's CSV module returns an empty array for each one) and silently fails on input with an unclosed quote. Nonetheless, it's an excellent alternative if you find errors in FastCSV! We looked for Ragel-based CSV parsers to copy, but they either had implementation errors or could not handle large inputs. commas looks good, but it performs a memory check on each character, which is overkill.
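The blank-row behavior mentioned above can be checked against Ruby's own CSV module. This is a small sketch using only the standard csv library, not FastCSV or rcsv; the sample input is made up for illustration.

```ruby
require 'csv'

# Hypothetical input: two data rows separated by one blank line.
rows = CSV.parse("foo\n\nbar\n")

p rows    # [["foo"], [], ["bar"]]
```

The blank line is kept as an empty array, so row positions still line up with line numbers in the input. A parser that skips blank rows would return only two rows here.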
Bugs? Questions?
This project's main repository is on GitHub: http://github.com/opennorth/fastcsv, where your contributions, forks, bug reports, feature requests, and feedback are greatly welcomed.
Acknowledgements
Started as a Ruby 2.1 fork of MoonWolf's CSVScan, found in this commit. CSVScan uses Ragel code from HPricot from this commit.
Copyright (c) 2014 Open North Inc., released under the MIT license