Wrapper for the wv2 library: parses Microsoft Word files. So far it fires callbacks to a TextHandler, SubDocumentHandler, TableHandler and InlineReplacementHandler if any of these are registered with the Parser.




  • libwv2

  • Tested with Ruby 1.8; let us know about other versions!


Decompress the archive and enter its top-level directory. Then type:

  $ ruby install.rb config [-- --with-wv2-include,lib,dir=path_to_wv2]
  $ ruby install.rb setup
  ($ su)
  # ruby install.rb install

You can also install files into your favorite directory by supplying setup.rb some options. Try "ruby setup.rb --help".


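  require 'rwv2'
  require 'rwv2/handlers'

  # Print every run of text the parser encounters.
  class TextHandler < Rwv2::TextHandler
    def run_of_text(text, character_properties)
      puts text
    end
  end

  parser = Rwv2.create_parser('test/data/test2.doc')
  # Register the handler before parsing so its callbacks are fired.
  parser.set_text_handler(TextHandler.new)
  parser.parse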
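Instead of printing each run, a handler can accumulate the text for later use. The following is a minimal sketch, not part of Rwv2: CollectingTextHandler and its text method are hypothetical names, the fallback stub exists only so the sketch can run without the extension installed, and it assumes Rwv2::TextHandler can be constructed without arguments.

```ruby
begin
  require 'rwv2'
  require 'rwv2/handlers'
rescue LoadError
  # Stand-in used only when the extension is not installed, so that the
  # sketch below can still be loaded and tried out.
  module Rwv2
    class TextHandler
    end
  end
end

# Hypothetical handler: gathers all runs of text instead of printing them.
class CollectingTextHandler < Rwv2::TextHandler
  def initialize
    super  # assumption: the base class needs no constructor arguments
    @runs = []
  end

  # Called once per run of text; the character properties are ignored here.
  def run_of_text(text, character_properties)
    @runs << text
  end

  # The collected runs joined into one string.
  def text
    @runs.join
  end
end

# With the extension installed, usage would follow the synopsis:
#   handler = CollectingTextHandler.new
#   parser = Rwv2.create_parser('test/data/test2.doc')
#   parser.set_text_handler(handler)
#   parser.parse
#   puts handler.text
```

Keeping the runs in an array and joining on demand avoids depending on any particular string-mutation behavior across Ruby versions.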


  • Rwv2 does not yet support the full set of Word file properties.
    Notably missing are:

      - Font Family Name (FFN)
      - Tab Descriptor (TabDescriptor)
      - Word-internal Date and Time (DTTM)
      - Shading Descriptor (SHD)
      - Paragraph Height (PHE)
      - Border Code (BRC)
      - Table Autoformat
      - Autonumbering

    and many more. I'm taking the YAGNI (you aren't gonna need it) approach to most of these; if you actually need one of them, or any other feature, let me know.

  • wvWare writes errors, warnings, and informational messages directly to std::cerr. This can possibly be caught by replacing cerr's buffer. The tricky part is then to raise, warn, or ignore according to the buffer's content, probably within a separate thread.

  • Some of the testing is unclean, as I've only tested with OpenOffice-exported Word files.

  • Documentation
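For the std::cerr item above, one possible workaround on the Ruby side is to redirect file descriptor 2 for the duration of the parse and inspect what was written afterwards. This is a different technique from replacing cerr's buffer in C++. It is a minimal sketch, assuming wv2's messages reach the process's standard error; capture_stderr is a hypothetical helper, not part of Rwv2, and the message text is made up.

```ruby
require 'tempfile'

# Hypothetical helper, not part of Rwv2: redirects file descriptor 2 into a
# temporary file while the block runs and returns everything written to it.
def capture_stderr
  saved = $stderr.dup
  Tempfile.create('rwv2-stderr') do |tmp|
    begin
      $stderr.reopen(tmp)    # fd 2 now points at the temporary file
      yield
      $stderr.flush
    ensure
      $stderr.reopen(saved)  # always restore the original stderr
      saved.close
    end
    tmp.rewind
    tmp.read
  end
end

output = capture_stderr do
  # parser.parse would go here; a plain write stands in for wv2's output.
  $stderr.puts 'hypothetical wv2 warning'
end

# Decide afterwards whether to raise, warn, or ignore.
warn "wv2 reported: #{output}" unless output.empty?
```

Because the redirection happens at the file-descriptor level, it captures everything the process writes to standard error during the block, including output from other threads, so the block is best kept as short as possible.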



URL:

Authors: Rwv2 was written by Hannes Wyss. wvWare was written by Caolán McNamara and is currently (22.8.2003) maintained by Dom Lachowicz.