Introduction

The goal of the project is to convert .msg files into proper rfc2822 emails, independent of outlook or any other platform dependencies. It’s currently pure ruby, so it should be easy to get started with.

There’s also work-in-progress pst support (currently outlook 97 only, unfortunately), based on libpst. This makes the project more of a general ruby mapi message store conversion library, though some significant cleaning up has to happen first.

It draws on msgconvert.pl, but tries to take a cleaner and more complete approach. Neither is complete yet, but I think this project provides a clean foundation on which to build a good converter for msg files, for use in outlook migrations etc.

I am happy to accept patches, give commit bits etc.

Please let me know how it works for you; any feedback is welcome.

Features

Broad features of the project:

  • Can be used as a general msg library, for cases where converting to, and working on, a standard format doesn’t make sense.

  • Supports conversion of msg files to standard formats, like rfc2822 emails, vCards, etc.

  • Well commented, and easily extended.

  • Most key .msg structures are understood, and the parsing code should only require minor tweaks. Most of the remaining work is in achieving high-fidelity conversion to standard formats (see [TODO]).

Features of the lower-level msg handling:

  • Supports both types of property storage (large ones in substg files, and small ones in the properties file).

  • Complete support for named properties in different GUID namespaces.

  • Support for mapping property codes to symbolic names, with many included.

  • RTF decompression support included, as well as HTML extraction from RTF where appropriate (both in pure ruby, see lib/msg/rtf.rb)

  • Initial RTF converter, for providing a readable body when only RTF exists (needs work)

  • Initial support for handling embedded ole files, converting nested .msg files to message/rfc822 attachments, and serializing others as ole file attachments (allows you to view embedded excel for example).
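
To make the two kinds of property storage more concrete: inside the ole container, each large property lives in its own stream named __substg1.0_ followed by eight hex digits, where the first four are the property code and the last four are the property type. The following is only a minimal sketch of splitting such a name and looking up a symbolic name. The helpers (parse_substg_name, describe_substg) and the tiny lookup tables are hypothetical and are not this library's API; the library's own mapping is much larger, and multi-valued properties (whose stream names carry an extra suffix) are ignored here.

```ruby
# Hypothetical helpers for illustration only, not part of the library.
SubstgKey = Struct.new(:code, :type)

# "__substg1.0_" + 4 hex digits of property code + 4 hex digits of type.
SUBSTG_RE = /\A__substg1\.0_(\h{4})(\h{4})\z/

# A handful of well-known mapi property codes and types.
SAMPLE_NAMES = {
  0x0037 => :subject,
  0x0e1d => :normalized_subject,
  0x1000 => :body
}.freeze

SAMPLE_TYPES = {
  0x001e => :string8,
  0x001f => :unicode,
  0x0102 => :binary
}.freeze

# Returns a SubstgKey, or nil if the name is not a single-valued substg stream.
def parse_substg_name(name)
  m = SUBSTG_RE.match(name)
  return nil unless m
  SubstgKey.new(m[1].hex, m[2].hex)
end

# Returns [symbolic_name, symbolic_type], falling back to hex-based symbols
# for codes and types missing from the sample tables.
def describe_substg(name)
  key = parse_substg_name(name)
  return nil unless key
  sym  = SAMPLE_NAMES.fetch(key.code) { format('unknown_%04x', key.code).to_sym }
  type = SAMPLE_TYPES.fetch(key.type) { format('type_%04x', key.type).to_sym }
  [sym, type]
end

p describe_substg('__substg1.0_0037001E')
# => [:subject, :string8]
```

Small fixed-size properties do not get a stream of their own, which is why a name such as __properties_version1.0 yields nil here.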

Usage

At the command line, it is simple to convert individual msg files to .eml, or to convert a batch to an mbox format file. See help for details:

msgtool -c some_email.msg > some_email.eml
msgtool -m *.msg > mbox
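
For orientation, an mbox file is just a concatenation of rfc2822 messages, each preceded by a "From " separator line. The sketch below shows one common convention (prefixing body lines that start with "From " with ">"), given already serialised messages as strings. The helper name append_to_mbox and its sender and time parameters are hypothetical, and the exact separator line and quoting that msgtool emits may differ.

```ruby
require 'stringio'

# Hypothetical helper: append one serialised rfc2822 message to an mbox stream.
# io can be anything that responds to <<, such as a File or StringIO.
def append_to_mbox(io, message, sender: 'MAILER-DAEMON', time: Time.now)
  # Separator line: "From <sender> <asctime-style date>".
  io << "From #{sender} #{time.utc.strftime('%a %b %e %H:%M:%S %Y')}\n"
  message.each_line do |line|
    line = line.chomp                        # strips "\n" or "\r\n"
    io << '>' if line.start_with?('From ')   # avoid fake separator lines
    io << line << "\n"
  end
  io << "\n"                                 # blank line ends the message
end

out = StringIO.new
append_to_mbox(out, "Subject: one\n\nFrom here on, quoted.\n")
append_to_mbox(out, "Subject: two\n\nSecond body.\n")
puts out.string
```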

The library also offers a fairly complete, easy-to-use high-level API:

require 'msg'

msg = Msg.open filename

# access the 3 main data stores, if you want to poke at the msg
# internals
msg.recipients
# => [#<Recipient:'\'Marley, Bob\' <[email protected]>'>]
msg.attachments
# => [#<Attachment filename='blah1.tif'>, #<Attachment filename='blah2.tif'>]
msg.properties
# => #<Properties ... normalized_subject='Testing' ... 
# creation_time=#<DateTime: 2454042.45074714,0,2299161> ...>

To completely abstract away all msg peculiarities, convert the msg to a mime object. The message as a whole, and some of its main parts, support conversion to mime objects.

msg.attachments.first.to_mime
# => #<Mime content_type='application/octet-stream'>
mime = msg.to_mime
puts mime.to_tree
# =>
- #<Mime content_type='multipart/mixed'>
  |- #<Mime content_type='multipart/alternative'>
  |  |- #<Mime content_type='text/plain'>
  |  \- #<Mime content_type='text/html'>
  |- #<Mime content_type='application/octet-stream'>
  \- #<Mime content_type='application/octet-stream'>

# convert the mime object to serialised form, including attachments etc.
# (holding it all in memory is not ideal, but it's wip).
puts mime.to_s
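
The tree printed by to_tree above reflects how multipart messages nest: each part has a content type and, for multipart types, a list of child parts. As a rough illustration of that structure only, here is a self-contained sketch that builds the same shape and prints it in a similar layout. Part is a hypothetical stand-in, not the library's Mime class, and its to_tree is not the library's implementation.

```ruby
# Hypothetical stand-in for a mime part: a content type plus child parts.
Part = Struct.new(:content_type, :parts) do
  def label
    "#<Part content_type='#{content_type}'>"
  end

  # Render this part and all of its descendants as an ascii tree.
  def to_tree
    lines = ["- #{label}\n"]
    append_children(lines, '  ')
    lines.join
  end

  # Append one line per descendant; the last child of each level uses "\-"
  # and stops drawing the vertical bar for its own children.
  def append_children(lines, indent)
    children = parts || []
    children.each_with_index do |child, i|
      last = (i == children.length - 1)
      lines << "#{indent}#{last ? '\\-' : '|-'} #{child.label}\n"
      child.append_children(lines, indent + (last ? '   ' : '|  '))
    end
  end
end

tree = Part.new('multipart/mixed', [
  Part.new('multipart/alternative', [
    Part.new('text/plain'),
    Part.new('text/html')
  ]),
  Part.new('application/octet-stream'),
  Part.new('application/octet-stream')
])
puts tree.to_tree
```

The multipart/alternative branch holds the plain text and html renderings of the same body, while the remaining parts under multipart/mixed are the attachments.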

Other

For more information, see