Wrapper for the wv2 library: parses Microsoft Word files. So far it fires callbacks to a TextHandler, SubDocumentHandler, TableHandler and InlineReplacementHandler if any of these are registered with the Parser.
Tested with Ruby 1.8; let us know about other versions!
Decompress the archive and enter its top directory. Then type:
   $ ruby install.rb config [-- --with-wv2-include,lib,dir=path_to_wv2]
   $ ruby install.rb setup
  ($ su)
   # ruby install.rb install
You can also install files into your favorite directory by supplying install.rb some options. Try "ruby install.rb --help".
  require 'rwv2'
  require 'rwv2/handlers'

  class TextHandler < Rwv2::TextHandler
    def run_of_text(text, character_properties)
      puts text
    end
  end

  parser = Rwv2.create_parser('test/data/test2.doc')
  parser.set_text_handler(TextHandler.new)
  parser.parse
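If you want the text as a string rather than on stdout, a handler can collect the runs instead of printing them. The following is only a minimal sketch: the class name CollectingTextHandler and its chunks/text methods are made up for illustration, and it assumes that run_of_text receives something that converts cleanly with to_s. The stand-in Rwv2::TextHandler in the rescue branch is not part of rwv2; it is only there so the sketch can be tried without the extension installed.

```ruby
# Load rwv2 if it is installed; otherwise define a bare stand-in so this
# sketch can be run without the extension. The stand-in is NOT part of rwv2.
begin
  require 'rwv2'
  require 'rwv2/handlers'
rescue LoadError
  module Rwv2
    class TextHandler; end
  end
end

# Hypothetical handler: gathers every run of text instead of printing it.
class CollectingTextHandler < Rwv2::TextHandler
  # Created lazily so the superclass constructor is left untouched.
  def chunks
    @chunks ||= []
  end

  # Same callback signature as in the example above; the character
  # properties are ignored here.
  def run_of_text(text, character_properties)
    chunks << text.to_s
  end

  # The collected document text as one string.
  def text
    chunks.join
  end
end

# Usage, assuming rwv2 is installed and the test file exists:
#   handler = CollectingTextHandler.new
#   parser = Rwv2.create_parser('test/data/test2.doc')
#   parser.set_text_handler(handler)
#   parser.parse
#   puts handler.text
```

The runs are joined without a separator; whether paragraph breaks arrive as part of the text is something I have not checked here.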
Rwv2 does not yet support the full set of Word file properties.
Notably missing are:
- Font Family Name (FFN)
- Tab Descriptor (TabDescriptor)
- Word-internal Date and Time (DTTM)
- Shading Descriptor (SHD)
- Paragraph Height (PHE)
- Border Code (BRC)
- Table Autoformat
- Autonumbering
and many more - I'm taking the YAGNI (you aren't gonna need it) approach to most of these, if you actually do need one of them or any other feature let me know...
wvWare writes errors, warnings and informational messages directly to std::cerr. This could possibly be caught by replacing cerr's buffer. The tricky part is then to raise, warn or ignore according to the buffer's content, probably within a separate thread...
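Until something like that exists inside the extension, a rough workaround from the Ruby side may be to redirect file descriptor 2 around the call to parse. Note that this is a different technique from replacing cerr's buffer in C++: it captures everything written to stderr by the whole process while the block runs, and it involves no separate thread. The helper names capture_stderr and classify are hypothetical, and the patterns in classify are placeholders, since the exact wording of the wv2 messages is not documented here.

```ruby
require 'tempfile'

# Hypothetical helper: runs the block with file descriptor 2 pointed at a
# temporary file and returns whatever was written to it as a String.
def capture_stderr
  saved = STDERR.dup                 # keep a handle on the real stderr
  Tempfile.create('rwv2-stderr') do |tmp|
    begin
      STDERR.reopen(tmp)             # fd 2 now refers to the temp file
      yield
      STDERR.flush
    ensure
      STDERR.reopen(saved)           # always restore the real stderr
      saved.close
    end
    tmp.rewind
    tmp.read
  end
end

# Decide what to do with the captured text. The patterns are placeholders
# and would need to be adapted to the messages wv2 actually emits.
def classify(messages)
  case messages
  when /error/i then :raise
  when /warn/i  then :warn
  else               :ignore
  end
end

# Usage, assuming a parser set up as in the example above:
#   messages = capture_stderr { parser.parse }
#   raise messages if classify(messages) == :raise
```

Because the redirection is process-wide, anything else that writes to stderr while the block runs ends up in the captured text as well.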
Some of the testing is unclean, as I've only tested with
URL: download.ywesee.com/ruby/rwv2
Authors:
Rwv2 was written by Hannes Wyss <email@example.com>
wvWare was written by Caolán McNamara and is currently (22.8.2003) maintained by Dom Lachowicz