= RODS - Ruby Open Document Spreadsheet
This class provides a convenient interface for reading and writing 
spreadsheets conforming to Open Document Format v1.1.
Installiation of an office-application (LibreOffice, OpenOffice.org) is not required as the code directly 
manipulates the XML-files in the zipped *.ods-container. The mere prerequisites are 
  gem install zip
  gem install rods

On my website http://ruby.homelinux.com/ruby/rods/ you can see the output of the script below.
You can contact me at [email protected] (and tell me about experiences, problems you encountered or drop me a line, if you like it ;-).

link:images/Rods.jpg

= Copyright
Copyright (c) <em>Dr. Heinz Breinlinger</em> (2011).
Licensed under the same terms as Ruby. No warranty is provided.

= Changelog
* 0.8.0
  * new function
    * insertColumn
  cf. Example 0.8.0 below
* 0.7.5
  * new functions
    * insertRowAbove, insertRowBelow, insertCellBefore, insertCellAfter
  cf. Example 0.7.5 below
* 0.7.0 
  * added new function
    * getCellsAndIndicesFor(expression)
    This function returns all cells and their indices for which 
    the external representation matches the given expression.
  cf. Example 0.7.0 below
* 0.6.2
  * added style-attributes
    * text-underline-style
    * text-underline-width
    * text-underline-color
  cf. Example 0.6.2 below
* 0.6.1
  * added new functions
    * insertTableBefore()
    * insertTableAfter()
  cf. Example 0.6.1 below
* 0.6.0
  * changed interface from
    * getNextRow, getPreviousRow, getNextCell, getPreviousCell to
    * getNextExistentRow, getPreviousExistentRow, getNextExistentCell, getPreviousExistentCell
  See explanation below
* 0.5.1 
  * made readCellFromRow() more user-friendly by returning defaults for text and type where 
    till recently error-messages were thrown due to missing values
* 0.5.0
  * new performance-boosters for reading lots of rows (cf. chapter and second example-script below)
    * getNextRow, getPreviousRow, getNextCell, getPreviousCell
* 0.1.1 
  * Bug-Fix: readCellFromRow() did not return value in some cases
* 0.1.0 
  * improved support for compound-formulas 
      (=SUMME(WENN((A$12:A$985="x")*(I$12:I$983="Hauskonto");G$12:G$983)))
  * logging set as default

= Disclaimer

The example code was tested on
* Linux
  * Ubuntu 10.10 64-Bit (German)
  * OpenOffice.org 3.2 (German)
  * Ruby 1.9.1 and 1.8.7
* Mac
  * OS X 10.6 (German)
  * OpenOffice.org 3.2 (German)
  * Ruby 1.8.7 (1.9.1 not tried)
* Windows 
  * WinXP 32-Bit (German)
  * OpenOffice.org 3.0 (German)
  * Ruby 1.9.1 and 1.8.7

= What you must know
When you open a (new) spreadsheet in an application like OpenOffice.org, you'll see an infinity of empty cells and
might be tempted to assume that all these cells actually "exist" in the underlying file format/XML-structure.
This is not the case ! A "newly born" spreadsheet does not contain even a single cell ! They're created on demand along
with all other cells of lesser x- or y-coordinate.
That is: If you write a cell with coordinates (10,10) all cells from (1,1) to (10,10)
spring into existence or vice versa: If the cell farthest down the sheet and farthest to the right
having a visible value has coordinates (90,87), then all cells from row 1-90 and column 1-87 exist, 
but no cell out of this virutal rectangle is "alive".
This is very important for using the interface appropriately: All routines matching 'Existent' merely return "living"
cells, while all other routines bearing a "get|Get" create a row or cell if necessary (and implicitly all other cells
within the new boundaries).
In other words: Use
* getNextExistentRow, getPreviousExistentRow, getNextExistentCell, getPreviousExistentCell
  whenever you just want to read values you already have (seemlessly skipping visibly emtpy lines and columns) and
* getCell, writeGetCell and writeGetCellFromRow whenever you want a cell to be created if it does not exist (which
  is undoubtedly the case, when you want to write to it or assign it a style) !
The first return nil, if an element does not exist yet, the second always return the desired element.

= Howto
  $ sudo gem install rods (root-privileges necessary)

  # coding: UTF-8
  #
  # Author: Dr. Heinz Breinlinger
  #
  require 'rubygems'
  require 'rods'
  # 
  #-----------------------------------------------------------------
  # Open a spreadsheet in the working directory.
  # If the file is elsewhere, prepend the path according to your 
  # operating systems notation, e.g. "/home/heinz/myfile.ods".
  #-----------------------------------------------------------------
  mySheet=Rods.new("Template.ods") # empty or prefilled sheet
  #-----------------------------------------------------------------
  # Rename, insert and delete some tables
  #-----------------------------------------------------------------
  mySheet.renameTable("Tabelle1","not needed") 
  mySheet.insertTable("example")  
  #-----------------------------------------------------------------
  # Very important: tell RODS, what table you want to work with subsequently, 
  # if the table is not the first in the spreadsheet.
  #-----------------------------------------------------------------
  mySheet.setCurrentTable("example")
  mySheet.deleteTable("Tabelle2")
  mySheet.deleteTable("Tabelle3")
  #-----------------------------------------------------------------
  # Fill the sheet with values. All values are strings at the 
  # user-level and are accompanied by a type-token.
  # The type affects the internal representation and the style of
  # the corresponding cell.
  # When the type is a formula, the resulting type is appended.
  # Valid types are
  #  "string","time","date","float","percent","currency","formula",
  #  "formula:float","formula:time","formula:date","formula:currency".
  #-----------------------------------------------------------------
  mySheet.writeCell(1,1,"date","31.12.2010") 
  mySheet.writeCell(2,1,"formula:date","=A1+1") 
  #-----------------------------------------------------------------
  # Alter the data-style for the following 2 date-values.
  #
  # Data-styles affect, how values are displayed, i.e. their "meaning".
  # Within that "meaning" styles (not data-styles !) affect the look
  # (color, indentation, 'italic', etc.)
  # Data-styles are predefined by RODS and (so far) unique for every type.
  # Only date-values have 2 different data-styles implemented.
  #-----------------------------------------------------------------
  mySheet.setDateFormat("myDateDay")  # "05.01.2011" -> "Mi" (if German)
  #-----------------------------------------------------------------
  # Add 2 formulas with date-values as results.
  #-----------------------------------------------------------------
  mySheet.writeCell(1,2,"formula:date","=A1")
  mySheet.writeCell(2,2,"formula:date","=A2")
  #-----------------------------------------------------------------
  # Reset the data-style for date-values
  #-----------------------------------------------------------------
  mySheet.setDateFormat("myDate")     # back to "DD.MM.YYYY"
  #-----------------------------------------------------------------
  # Continue with 2 time-values
  #-----------------------------------------------------------------
  mySheet.writeCell(1,3,"time","13:37") 
  mySheet.writeCell(2,3,"time","20:15")
  #-----------------------------------------------------------------
  # Write 2 currency-values
  #-----------------------------------------------------------------
  mySheet.writeCell(1,4,"currency","19,99") 
  mySheet.writeCell(2,4,"currency","-7,78")
  #-----------------------------------------------------------------
  # Insert a formula for the time-difference
  #-----------------------------------------------------------------
  cell=mySheet.writeGetCell(3,3,"formula:time","=C2-C1")
  #-----------------------------------------------------------------
  # Set a border, center the text, change background- and font-color.
  #-----------------------------------------------------------------
  mySheet.setAttributes(cell,{ "border" => "0.01cm solid turquoise",
                               "text-align" => "center",           
                               "background-color" => "yellow2",   
                               "color" => "blue"})               
  #-----------------------------------------------------------------
  # Inset a formula for the sum of the above currency-values
  #-----------------------------------------------------------------
  cell=mySheet.writeGetCell(3,4,"formula:currency","=D2+D1")
  #-----------------------------------------------------------------
  # Apply different borders and display the font as italic and bold.
  #-----------------------------------------------------------------
  mySheet.setAttributes(cell,{ "border-right" => "0.05cm solid magenta4",
                               "border-bottom" => "0.03cm solid lightgreen",
                               "border-top" => "0.08cm solid salmon",
                               "font-style" => "italic",
                               "font-weight" => "bold"})
  #-----------------------------------------------------------------
  # Create a new style for percent-values and apply it to a new
  # percent-value.
  # Be aware that a valid data-style is chosen ("myPercentFormat"
  # is RODS' default-data-style for percent-values.)
  #-----------------------------------------------------------------
  mySheet.writeStyleAbbr({"name" => "myNewPercentStyle",        
                          "margin-left" => "0.3cm",
                          "text-align" => "start",
                          "color" => "blue",
                          "border" => "0.01cm solid black",
                          "font-style" => "italic",
                          "data-style-name" => "myPercentFormat", # <- data-style !
                          "font-weight" => "bold"})
  cell=mySheet.writeGetCell(4,2,"percent","4,71")
  mySheet.setStyle(cell,"myNewPercentStyle")
  #-----------------------------------------------------------------
  # Add a comment-field, change the font-color and font-style.
  #-----------------------------------------------------------------
  mySheet.writeComment(cell,"by Dr. Heinz Breinlinger")
  cell=mySheet.writeGetCell(4,3,"formula:time","=B4*C3")
  mySheet.setAttributes(cell,{ "color" => "lightmagenta",
                               "font-style" => "italic"})
  #-----------------------------------------------------------------
  # Insert a formula for the percentage of a currency-value.
  #-----------------------------------------------------------------
  cell=mySheet.writeGetCell(4,4,"formula:currency","=B4*D3")
  #-----------------------------------------------------------------
  # Change some attributes of the cell.
  #-----------------------------------------------------------------
  mySheet.setAttributes(cell,{ "color" => "turquoise7",
                               "text-align" => "center",
                               "font-style" => "bold"})
  #-----------------------------------------------------------------
  # Create a new style and apply it to 2 new text-values.
  #-----------------------------------------------------------------
  mySheet.writeStyleAbbr({"name" => "myBold",
                         "text-align" => "end",
                         "font-weight" => "bold",
                         "background-color" => "purple"})
  cell=mySheet.writeGetCell(3,1,"string","Diff/Sum")
  mySheet.setStyle(cell,"myBold")
  cell=mySheet.writeGetCell(4,1,"string","Percent")
  mySheet.setStyle(cell,"myBold")
  #-----------------------------------------------------------------
  # Insert a text with an annotation.
  #-----------------------------------------------------------------
  cell=mySheet.writeGetCell(6,1,"string","annotation")
  mySheet.writeComment(cell,"C3,C4,D3,D4 are formulas")
  #-----------------------------------------------------------------
  # Draw a long green vertical line to frame what we created.
  #-----------------------------------------------------------------
  1.upto(7){ |row|
    cell=mySheet.getCell(row,5)
    mySheet.setAttributes(cell,{ "border-right" => "0.07cm solid green6" }) 
  }
  #-----------------------------------------------------------------
  # Complete it with a red horicontal line right across.
  #-----------------------------------------------------------------
  1.upto(5){ |col|
    cell=mySheet.getCell(7,col)
    mySheet.setAttributes(cell,{ "border-bottom" => "0.085cm solid red5" }) # 
  }
  #-----------------------------------------------------------------
  # Read and calculate 2 currency-values.
  #-----------------------------------------------------------------
  amount=0.0
  1.upto(2){ |i|
    row=mySheet.getRow(i)
    text,type=mySheet.readCellFromRow(row,4)
    if(type == "currency")
      amount+=text.to_f
    end
  }
  #-----------------------------------------------------------------
  # Delete the table we do not need.
  #-----------------------------------------------------------------
  mySheet.deleteTable("not needed")
  #-----------------------------------------------------------------
  # Save the sheet under a different name (if you don not want to
  # override the original).
  #-----------------------------------------------------------------
  mySheet.saveAs("Example.ods")
  #-----------------------------------------------------------------
  # Print the result of the former computation.
  #-----------------------------------------------------------------
  puts("Sums up to: #{amount}")  # -> "Sums up to: 12.21"
  #-----------------------------------------------------------------
  # Open the file we wrote in the previous step again.
  #-----------------------------------------------------------------
  mySheet=Rods.new("Example.ods") 
  #-----------------------------------------------------------------
  # Set the current table for subsequent operations.
  # This is not necessary here as the table of interest is the 
  # first in the spreadsheet and automatically becomes the default-table.
  #-----------------------------------------------------------------
  mySheet.setCurrentTable("example") 
  #-----------------------------------------------------------------
  # Read a currency-value from the sheet and print it.
  # Remember that all values are passed to and from the spreadsheet
  # as strings which get their meaning by the accompanying types.
  #-----------------------------------------------------------------
  text,type=mySheet.readCell(2,4)
  if(text && type)
    if(type == "currency")
      puts("not so much: #{text} bucks") # -> "not so much: -7.78 bucks"
    end
  end

= Example 0.8.0

  #  #!/usr/bin/ruby
  #  # coding: UTF-8
  #  #
  #  # Author: Dr. Heinz Breinlinger
  #  #
  #  require 'rubygems'
  #  require 'rods'
  #
  #  mySheet=Rods.new("Template.ods")
  #  1.upto(3){ |row|
  #    1.upto(4){ |col|
  #      mySheet.writeCell(row,col,"time","13:47")
  #    }
  #  }
  #  #-----------------------------------------------------
  #  # inserting new column -> shifting existing columns
  #  #-----------------------------------------------------
  #  mySheet.insertColumn(3) 
  #  1.upto(3){ |row|
  #    mySheet.writeCell(row,3,"string","o' clock")
  #  }
  #  mySheet.saveAs("Test7.ods")

= Example 0.7.5

  #  #!/usr/bin/ruby
  #  # coding: UTF-8
  #  #
  #  # Author: Dr. Heinz Breinlinger
  #  #
  #  require 'rubygems'
  #  require 'rods'
  #
  #  mySheet=Rods.new("Template.ods")
  #  mySheet.writeCell(1,1,"string","oneOne")
  #  mySheet.writeCell(1,2,"string","oneTwo")
  #  mySheet.writeCell(2,1,"string","twoOne") # finally becomes cell 3,1
  #  mySheet.writeCell(2,2,"string","twoTwo") # finally becomes cell 3,2
  #  row=mySheet.getRow(1)
  #  newRow=mySheet.insertRowBelow(row)
  #  mySheet.writeCell(2,1,"string","twoNewOne")
  #  cell=mySheet.writeGetCell(2,2,"string","moved") # finally becomes cell "2,3"
  #  mySheet.insertCellBefore(cell)
  #  mySheet.writeCell(2,2,"string","twoNewTwoNew")  # new cell "2,2"
  #  mySheet.saveAs("Test6.ods")

= Example 0.7.0

  #!/usr/bin/ruby
  # coding: UTF-8
  #
  # Author: Dr. Heinz Breinlinger
  #
  require 'rubygems'
  require 'rods'

  mySheet=Rods.new("Konten.ods")
  # Finds all cells with content 'content' and returns them along with the
  # indices of row and column as an array of hashes.
  # [{ :cell => cell,
  #    :row  => rowIndex,
  #    :col  => colIndex},
  #  { :cell => cell,
  #    :row  => rowIndex,
  #    :col  => colIndex}]
  #  
  # Regular expressions for 'content' are allowed but must be enclosed in 
  # single (not double) quotes !
  #
  # In case of no matches at all, an empty array is returned.
  # 
  # Keep in mind that the content of a cell with a formula is not the formula, but the
  # current value of the computed result.
  #
  # Also consider that you have to search for the external (i.e. visible)
  # represenation of a cell's content, not it's internal computational value.
  # For instance, when looking for a currency value of 1525 (that is shown as
  # '1.525 $'), you'll have to code
  #
  #   result=mySheet.getCellsAndIndicesFor('1[.,]525')
  #
  # The following finds all occurences of a comma- or dot-separated number,
  # consisting of 1 digit before and 2 digits behind the decimal-separator.
  #-------------------------------------------------------------------------
  result=mySheet.getCellsAndIndicesFor('\d{1}[.,]\d{2}')
  result.each{ |cellHash|
    puts("----------------------------------------------")
    puts("Node: #{cellHash[:cell]}") # Be aware: Prints out whole node ! ;-)
    puts("Row: #{cellHash[:row]}")
    puts("Column: #{cellHash[:col]}")
  }
  puts("done")

= Example for additions in 0.6.2

  #!/usr/bin/ruby
  # coding: UTF-8
  #
  # Author: Dr. Heinz Breinlinger
  #
  require 'rubygems'
  require 'rods'

  mySheet=Rods.new("Template.ods")
  cell=mySheet.writeGetCell(3,3,"string","Underline")
  mySheet.setAttributes(cell,{ "style:text-underline-color" => "blue",
                               "style:text-underline-style" => "solid",
                               "style:text-underline-width" => "auto"})
  cell=mySheet.writeGetCell(4,4,"string","Underline_Default_with_Black")
  #----------------------------------------------------------------------
  # if not specified otherwise, width and color are set to default
  # - black
  # - solid
  #----------------------------------------------------------------------
  mySheet.setAttributes(cell,{ "style:text-underline-style" => "solid" })
  mySheet.saveAs("Test3.ods")
  puts("done")

= Example for additions in 0.6.1

  #!/usr/bin/ruby
  # coding: UTF-8
  #
  # Author: Dr. Heinz Breinlinger
  #
  require 'rubygems'
  require 'rods'

  mySheet=Rods.new("Template.ods")
  mySheet.insertTableAfter("Tabelle1","Neue Tabelle")
  mySheet.insertTableAfter("Neue Tabelle","Neue Tabelle2")
  mySheet.insertTableAfter("Neue Tabelle2","Neue Tabelle3")
  mySheet.insertTableAfter("Tabelle3","Neue Tabelle4")
  mySheet.insertTableBefore("Tabelle1","Vor1")
  mySheet.insertTableBefore("Neue Tabelle4","Vor4")
  mySheet.saveAs("Test2.ods")
  puts("done")

= Caveat

The XML-structure of a <file>.ods is 
* first rows 
* then cells

As a result

  1.upto(500){ |i|
    text1,type1=readCell(i,3)  # XML-Parser starts from the top-node 
    text2,type2=readCell(i,4)  # XML-Parser starts from the top-node
    puts("Read #{text1} of #{type1} and #{text2} of #{type2}")
  }

is significantly slower than the slight variation

  1.upto(500){ |i|
    row=getRow(i)                      # XML-Parser starts from top-node
    text1,type1=readCellFromRow(row,3) # XML-Parser continues from row-node
    text2,type2=readCellFromRow(row,4) # XML-Parser continues from row-node
    puts("Read #{text1} of #{type1} and #{text2} of #{type2}
  }

This difference hardly matters while dealing with small documents, but degrades performance significantly when you 
move up or down a sheet with a lot of rows and process several cells on the same row !

Provided you just want to read exisiting cells (cf. explanation above), the following is a real speed-buster.

  # coding: UTF-8
  #
  # Author: Dr. Heinz Breinlinger
  #
  require 'rubygems'
  require 'rods'

  mySheet=Rods.new("Konten.ods")
  sum=0.0
  i=0
  row=mySheet.getRow(1)
  #---------------------------------------------------------------------------------
  # The routines
  # - getNextExistentRow(row)
  # - getPreviousExistentRow(row)
  # - getNextExistentCell(cell)
  # - getPreviousExistentCell(cell)
  # allow the XML-Parser to just continue from the "adjacent" node and 
  # return the previsous/next element without having to start from the top-node 
  # of the document over and over again !
  #---------------------------------------------------------------------------------
  while(row=mySheet.getNextExistentRow(row))  
    i+=1
    puts("#{i}")
    confirmed,type=mySheet.readCellFromRow(row,1)
    next if (! confirmed || confirmed.empty?())
    account,type=mySheet.readCellFromRow(row,9)
    next if (! account || account.empty?())
    transfer,type=mySheet.readCellFromRow(row,7)
    next if (! transfer || transfer.empty?())
    if(confirmed.match(/x/i) && account.match(/Hauskonto/))
      puts("#{i}: --> #{transfer} €")
      sum+=transfer.to_f
    end
  end
  puts("------------")
  puts("Sum: #{sum}")

On the ATOM-Nettop I developed the gem, even the first script above took just 2-3 seconds and on my Core-i7-Notebook it was finished so quickly
that I supposed it had halted on an immediate error, so: don't be concerned and just experiment :-).

= Standards
* Open Document Format for Office Applications (Open Document) v1.1 based on OpenOffice.org (OOo)
  * http://www.oasis-open.org/specs/#opendocumentv1.1
    * Hint: Download the pdf-version. Reading the html-version online was to slow at least 
      on my attempts.
* W3C Extensible Stylesheet Language (XSL) Version 1.0 (W3C Recommendation 15 October 2001)
  * http://www.w3.org/TR/2001/REC-xsl-20011015/