Class: Treat::Workers::Formatters::Readers::ODT

Inherits:
Object
Defined in:
lib/treat/workers/formatters/readers/odt.rb

Overview

A reader for the ODT (OpenDocument Text) format used by OpenOffice.

Based on work by Mark Watson, licensed under the GPL.

Original project website: www.markwatson.com/opensource/

Todo: reimplement with Nokogiri and use XML node information to better translate the format of the text.

Defined Under Namespace

Classes: ODTXmlHandler

Class Method Summary

Class Method Details

.read(document, options = {}) ⇒ Object

Extract the readable text from an ODT file.

Options: none.

Raises:

  • Treat::Exception if content.xml cannot be read from the archive

# File 'lib/treat/workers/formatters/readers/odt.rb', line 21

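def self.read(document, options = {})
  f = nil
  # Open the archive read-only. Passing Zip::ZipFile::CREATE here would
  # create an empty archive whenever document.file does not exist.
  Zip::ZipFile.open(document.file) do |zipfile|
    # The body text of an ODT document is stored in content.xml.
    f = zipfile.read('content.xml')
  end
  raise Treat::Exception,
  "Couldn't unzip ODT file " +
  "#{document.file}!" unless f
  # Stream-parse the XML; the handler accumulates the plain text.
  xml_h = ODTXmlHandler.new
  REXML::Document.parse_stream(f, xml_h)

  document.value = xml_h.plain_text
  document.set :format, 'odt'
  document
end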
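
The source of ODTXmlHandler is not shown on this page. As a rough illustration of the stream-parsing step that .read relies on, the sketch below shows how a REXML stream listener might collect plain text from XML shaped like content.xml. The class name PlainTextListener, the sample XML, and its namespace URIs are hypothetical, and this is not Treat's actual handler.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical stand-in for ODTXmlHandler: collects character data and
# appends a newline whenever a text:p (paragraph) element closes.
class PlainTextListener
  include REXML::StreamListener

  attr_reader :plain_text

  def initialize
    @plain_text = +''
  end

  # Called by REXML for each run of character data between tags.
  def text(value)
    @plain_text << value
  end

  # Called by REXML when an element closes.
  def tag_end(name)
    @plain_text << "\n" if name == 'text:p'
  end
end

# Made-up sample input; a real content.xml is far larger.
SAMPLE_XML =
  '<office:text xmlns:office="urn:example:office" xmlns:text="urn:example:text">' \
  '<text:p>Hello</text:p><text:p>World</text:p></office:text>'

listener = PlainTextListener.new
REXML::Document.parse_stream(SAMPLE_XML, listener)
puts listener.plain_text
```

A stream listener avoids building the whole document tree in memory, which may be why the original reader uses parse_stream. The trade-off is that formatting information carried by the XML nodes is mostly discarded, as the Todo in the overview notes.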