Class: Msg

Inherits: Object
Defined in: lib/msg.rb, lib/msg/rtf.rb, lib/msg/properties.rb

Overview

Introduction

Primary class interface to the vagaries of .msg files.

The core of the work is done by the Msg::Properties class.

Defined Under Namespace

Modules: RTF
Classes: Attachment, Properties, Recipient

Constant Summary

VERSION = '1.3.1'

SUPPORT_DIR = File.dirname(__FILE__) + '/..'
We look here for the YAML files in data/, and the exe files for support decoding at the moment.

Log = Logger.new_with_callstack

Constructor Details

#initialize(root) ⇒ Msg

Creates an Msg from root, an Ole::Storage::Dirent object.



# File 'lib/msg.rb', line 44

def initialize root
	@root = root
	@close_parent = false
	@attachments = []
	@recipients = []
	@properties = Properties.load @root

	# process the children which aren't properties
	@properties.unused.each do |child|
		if child.dir?
			case child.name
			# these first 2 will actually be of the form
			# 1\.0_#([0-9A-Z]{8}), where $1 is the 0 based index number in hex
			# should i parse that and use it as an index?
			when /__attach_version1\.0_/
				attach = Attachment.new(child)
				@attachments << attach if attach.valid?
			when /__recip_version1\.0_/
				@recipients << Recipient.new(child)
			when /__nameid_version1\.0/
				# FIXME: ignore nameid quietly at the moment
			else ignore child
			end
		end
	end

	# if these headers exist at all, they can be helpful. we may however get a
	# application/ms-tnef mime root, which means there will be little other than
	# headers. we may get nothing.
	# and other times, when received from external, we get the full cigar, boundaries
	# etc and all.
	# sometimes its multipart, with no boundaries. that throws an error. so we'll be more
	# forgiving here
	@mime = Mime.new props.transport_message_headers.to_s, true
	populate_headers
end
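
The comment in the constructor asks whether the hex suffix of the attachment and recipient directory names should be parsed and used as an index. A minimal sketch of that idea follows. `classify_child` is a hypothetical helper, not part of Msg, and the name pattern (a `#` followed by eight hex digits) is assumed from the comment above.

```ruby
# Hypothetical helper: classify an OLE directory name the way Msg#initialize
# does, and also extract the 0-based index encoded as 8 hex digits.
def classify_child(name)
  case name
  when /__attach_version1\.0_#([0-9A-F]{8})/i
    [:attachment, $1.hex]
  when /__recip_version1\.0_#([0-9A-F]{8})/i
    [:recipient, $1.hex]
  when /__nameid_version1\.0/
    [:nameid, nil]
  else
    [:ignored, nil]
  end
end

p classify_child('__attach_version1.0_#0000000A')
p classify_child('__recip_version1.0_#00000000')
p classify_child('__nameid_version1.0')
p classify_child('__substg1.0_0037001E')
```

With an index available, attachments and recipients could be stored in their original order rather than in directory iteration order.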

Instance Attribute Details

#attachments ⇒ Object (readonly)

Returns the value of attribute attachments.



# File 'lib/msg.rb', line 30

def attachments
  @attachments
end

#close_parent ⇒ Object

Returns the value of attribute close_parent.



# File 'lib/msg.rb', line 31

def close_parent
  @close_parent
end

#headers ⇒ Object (readonly)

Returns the value of attribute headers.



# File 'lib/msg.rb', line 30

def headers
  @headers
end

#properties ⇒ Object (readonly) Also known as: props

Returns the value of attribute properties.



# File 'lib/msg.rb', line 30

def properties
  @properties
end

#recipients ⇒ Object (readonly)

Returns the value of attribute recipients.



# File 'lib/msg.rb', line 30

def recipients
  @recipients
end

#root ⇒ Object (readonly)

Returns the value of attribute root.



# File 'lib/msg.rb', line 30

def root
  @root
end

Class Method Details

.open(arg, mode = nil) ⇒ Object

Alternate constructor that creates an Msg directly from arg and mode. Both are passed straight through to Ole::Storage.open, so arg is either a filename or a seekable IO object.



# File 'lib/msg.rb', line 36

def self.open arg, mode=nil
	msg = Msg.new Ole::Storage.open(arg, mode).root
	# we will close the ole when we are #closed
	msg.close_parent = true
	msg
end

Instance Method Details

#body_to_mime ⇒ Object



# File 'lib/msg.rb', line 234

def body_to_mime
	# to create the body
	# should have some options about serializing rtf. and possibly options to check the rtf
	# for rtf2html conversion, stripping those html tags or other similar stuff. maybe want to
	# ignore it in the cases where it is generated from incoming html. but keep it if it was the
	# source for html and plaintext.
	if props.body_rtf or props.body_html
		# should plain come first?
		mime = Mime.new "Content-Type: multipart/alternative\r\n\r\n"
		# its actually possible for plain body to be empty, but the others not.
		# if i can get an html version, then maybe a callout to lynx can be made...
		mime.parts << Mime.new("Content-Type: text/plain\r\n\r\n" + props.body) if props.body
		# this may be automatically unwrapped from the rtf if the rtf includes the html
		mime.parts << Mime.new("Content-Type: text/html\r\n\r\n"  + props.body_html) if props.body_html
		# temporarily disabled the rtf. its just showing up as an attachment anyway.
		#mime.parts << Mime.new("Content-Type: text/rtf\r\n\r\n"   + props.body_rtf)  if props.body_rtf
		# its thus currently possible to get no body at all if the only body is rtf. that is not
		# really acceptable FIXME
		mime
	else
		# check no header case. content type? etc?. not sure if my Mime class will accept
		Log.debug "taking that other path"
		# body can be nil, hence the to_s
		Mime.new "Content-Type: text/plain\r\n\r\n" + props.body.to_s
	end
end
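
The branching above can be summarised without the Mime class. The sketch below is an illustration only: `body_structure` is a hypothetical function that takes the three body values and returns the content types that body_to_mime would produce. It makes the FIXME case (RTF as the only body) easy to see.

```ruby
# Hypothetical summary of body_to_mime's branching. Returns a pair of
# [root content type, array of part content types].
def body_structure(plain, html, rtf)
  if rtf || html
    parts = []
    parts << 'text/plain' if plain
    parts << 'text/html'  if html
    # text/rtf is deliberately not added, mirroring the disabled line above
    ['multipart/alternative', parts]
  else
    # plain may be nil here; the real method falls back to an empty string
    ['text/plain', []]
  end
end

p body_structure('hi', '<p>hi</p>', nil)
p body_structure('hi', nil, nil)
p body_structure(nil, nil, '{\rtf1 hi}')
```

The last call returns a multipart/alternative root with no parts at all, which is the situation the FIXME describes.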

#close ⇒ Object



# File 'lib/msg.rb', line 81

def close
	@root.ole.close if @close_parent
end
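
The close_parent flag acts as an ownership marker: the storage is closed only when the Msg opened it itself through Msg.open. The sketch below shows the same pattern with stand-in classes. `FakeStorage` and `Wrapper` are hypothetical names used purely for illustration and are not part of the library.

```ruby
# Stand-in for Ole::Storage: only records whether it was closed.
class FakeStorage
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

# Stand-in for Msg: closes the storage only if it opened it itself.
class Wrapper
  attr_accessor :close_parent
  attr_reader :storage

  def initialize(storage)
    @storage = storage
    @close_parent = false
  end

  # Mirrors Msg.open: the wrapper created the storage, so it owns it.
  def self.open
    wrapper = new(FakeStorage.new)
    wrapper.close_parent = true
    wrapper
  end

  def close
    @storage.close if @close_parent
  end
end

borrowed = FakeStorage.new
Wrapper.new(borrowed).close
p borrowed.closed        # the caller still owns the storage

owned = Wrapper.open
owned.close
p owned.storage.closed   # the wrapper owned it, so it was closed
```

A caller that builds the object from an existing storage stays responsible for closing that storage.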

#convert ⇒ Object


Beginnings of the conversion code. The method body currently contains only comments.



# File 'lib/msg.rb', line 223

def convert
	# 
	# for now, multiplex between returning a Mime object,
	# a Vpim::Vcard object,
	# a Vpim::Vcalendar object
	#
	# all of which should support a common serialization,
	# to save the result to a file.
	#
end

#ignore(obj) ⇒ Object



# File 'lib/msg.rb', line 199

def ignore obj
	Log.warn "* ignoring #{obj.name} (#{obj.type.to_s})"
end

#inspect ⇒ Object



# File 'lib/msg.rb', line 213

def inspect
	str = %w[from to cc bcc subject type].map do |key|
		send(key) and "#{key}=#{send(key).inspect}"
	end.compact.join(' ')
	"#<Msg #{str}>"
end

#populate_headers ⇒ Object

Copies data from the msg properties storage to standard MIME headers. In some files the existing transport headers hold a great deal of information while Msg#props holds practically nothing, probably because the file came from a TNEF-to-msg conversion done by Exchange.



# File 'lib/msg.rb', line 92

def populate_headers
	# construct a From value
	# should this kind of thing only be done when headers don't exist already? maybe not. if its
	# sent, then modified and saved, the headers could be wrong?
	# hmmm. i just had an example where a mail is sent, from an internal user, but it has transport
	# headers, i think because one recipient was external. the only place the senders email address
	# exists is in the transport headers. so its maybe not good to overwrite from.
	# recipients however usually have smtp address available.
	# maybe we'll do it for all addresses that are smtp? (is that equivalent to 
	# sender_email_address !~ /^\//
	name, email = props.sender_name, props.sender_email_address
	if props.sender_addrtype == 'SMTP'
		headers['From'] = if name and email and name != email
			[%{"#{name}" <#{email}>}]
		else
			[email || name]
		end
	elsif !headers.has_key?('From')
		# some messages were never sent, so that sender stuff isn't filled out. need to find another
		# way to get something
		# what about marking whether we think the email was sent or not? or draft?
		# for partition into an eventual Inbox, Sent, Draft mbox set?
		# i've now seen cases where this stuff is missing, but exists in transport message headers,
		# so maybe i should inhibit this in that case.
		if email
			Log.warn "* no smtp sender email address available (only X.400). creating fake one"
			# this is crap. though i've specially picked the logic so that it generates the correct
			# email addresses in my case (for my organisation).
			# this user stuff will give valid email i think, based on alias.
			user = name ? name.sub(/(.*), (.*)/, "\\2.\\1") : email[/\w+$/].downcase
			domain = (email[%r{^/O=([^/]+)}i, 1].downcase + '.com' rescue email)
			headers['From'] = [name ? %{"#{name}" <#{user}@#{domain}>} : "<#{user}@#{domain}>" ]
		elsif name
			# we only have a name? thats screwed up.
			Log.warn "* no smtp sender email address available (only name). creating fake one"
			headers['From'] = [%{"#{name}"}]
		else
			Log.warn "* no sender email address available at all. FIXME"
		end
	# else we leave the transport message header version
	end

	# for all of this stuff, i'm assigning in utf8 strings.
	# thats ok i suppose, maybe i can say its the job of the mime class to handle that.
	# but a lot of the headers are overloaded in different ways. plain string, many strings
	# other stuff. what happens to a person who has a " in their name etc etc. encoded words
	# i suppose. but that then happens before assignment. and can't be automatically undone
	# until the header is decomposed into recipients.
	recips_by_type = recipients.group_by { |r| r.type }
	# i want the types in a specific order.
	[:to, :cc, :bcc].each do |type|
		# don't know why i bother, but if we can, we try to sort recipients by the numerical part
		# of the ole name, or just leave it if we can't
		recips = recips_by_type[type]
		recips = (recips.sort_by { |r| r.obj.name[/\d{8}$/].hex } rescue recips)
		# switched to using , for separation, not ;. see issue #4
		# recips.empty? is strange. i wouldn't have thought it possible, but it was right?
		headers[type.to_s.sub(/^(.)/) { $1.upcase }] = [recips.join(', ')] unless recips.empty?
	end
	headers['Subject'] = [props.subject] if props.subject

	# fill in a date value. by default, we won't mess with existing value here
	if !headers.has_key?('Date')
		# we want to get a received date, as i understand it.
		# use this preference order, or pull the most recent?
		keys = %w[message_delivery_time client_submit_time last_modification_time creation_time]
		time = keys.each { |key| break time if time = props.send(key) }
		time = nil unless Date === time
		# can employ other methods for getting a time. heres one in a similar vein to msgconvert.pl,
		# ie taking the time from an ole object
		time ||= @root.ole.dirents.map(&:time).compact.sort.last

		# now convert and store
		# this is a little funky. not sure about time zone stuff either?
		# actually seems ok. maybe its always UTC and interpreted anyway. or can be timezoneless.
		# i have no timezone info anyway.
		# in gmail, i see stuff like 15 Jan 2007 00:48:19 -0000, and it displays as 11:48.
		# can also add .localtime here if desired. but that feels wrong.
		require 'time'
		headers['Date'] = [Time.iso8601(time.to_s).rfc2822] if time
	end

	# some very simplistic mapping between internet message headers and the
	# mapi properties
	# any of these could be causing duplicates due to case issues. the hack in #to_mime
	# just stops re-duplication at that point. need to move some smarts into the mime
	# code to handle it.
	mapi_header_map = [
		[:internet_message_id, 'Message-ID'],
		[:in_reply_to_id, 'In-Reply-To'],
		# don't set these values if they're equal to the defaults anyway
		[:importance, 'Importance', proc { |val| val.to_s == '1' ? nil : val }],
		[:priority, 'Priority', proc { |val| val.to_s == '1' ? nil : val }],
		[:sensitivity, 'Sensitivity', proc { |val| val.to_s == '0' ? nil : val }],
		# yeah?
		[:conversation_topic, 'Thread-Topic'],
		# not sure of the distinction here
		# :originator_delivery_report_requested ??
		[:read_receipt_requested, 'Disposition-Notification-To', proc { |val| from }]
	]
	mapi_header_map.each do |mapi, mime, *f|
		next unless q = val = props.send(mapi) or headers.has_key?(mime)
		next if f[0] and !(val = f[0].call(val))
		headers[mime] = [val.to_s]
	end
end
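
The X.400 fallback in the middle of populate_headers is easier to follow in isolation. The sketch below restates it as a standalone function. `fake_from` is a hypothetical name, and the sample address is invented for illustration. As the source comments warn, the resulting address is a guess tuned to one organisation's naming scheme, not a reliable mapping.

```ruby
# Hypothetical helper mirroring the X.400 fallback in populate_headers.
# name may be nil; email is an X.400-style address such as "/O=ACME/.../CN=JSMITH".
def fake_from(name, email)
  # "Surname, Given" becomes "Given.Surname"; otherwise use the trailing word of the address
  user = name ? name.sub(/(.*), (.*)/, '\2.\1') : email[/\w+$/].downcase
  # take the organisation from the leading /O= component, if there is one
  org = email[%r{^/O=([^/]+)}i, 1]
  domain = org ? org.downcase + '.com' : email
  name ? %{"#{name}" <#{user}@#{domain}>} : "<#{user}@#{domain}>"
end

x400 = '/O=ACME/OU=SITE/CN=RECIPIENTS/CN=JSMITH'
puts fake_from('Smith, John', x400)
puts fake_from(nil, x400)
```

The explicit nil check on the organisation replaces the inline `rescue` of the original, which gives the same fallback without hiding unrelated errors.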

#to_mime ⇒ Object



# File 'lib/msg.rb', line 261

def to_mime
	# intended to be used for IPM.Note, which is the email type. can use it for others if desired,
	# YMMV
	Log.warn "to_mime used on a #{props.message_class}" unless props.message_class == 'IPM.Note'
	# we always have a body
	mime = body = body_to_mime

	# If we have attachments, we take the current mime root (body), and make it the first child
	# of a new tree that will contain body and attachments.
	unless attachments.empty?
		mime = Mime.new "Content-Type: multipart/mixed\r\n\r\n"
		mime.parts << body
		# i don't know any better way to do this. need multipart/related for inline images
		# referenced by cid: urls to work, but don't want to use it otherwise...
		related = false
		attachments.each do |attach|
			part = attach.to_mime
			related = true if part.headers.has_key?('Content-ID') or part.headers.has_key?('Content-Location')
			mime.parts << part
		end
		mime.headers['Content-Type'] = ['multipart/related'] if related
	end

	# at this point, mime is either
	# - a single text/plain, consisting of the body ('taking that other path' above. rare)
	# - a multipart/alternative, consisting of a few bodies (plain and html body. common)
	# - a multipart/mixed, consisting of 1 of the above 2 types of bodies, and attachments.
	# we add this standard preamble if its multipart
	# FIXME preamble.replace, and body.replace both suck.
	# preamble= is doable. body= wasn't being done because body will get rewritten from parts
	# if multipart, and is only there readonly. can do that, or do a reparse...
	# The way i do this means that only the first preamble will say it, not preambles of nested
	# multipart chunks.
	mime.preamble.replace "This is a multi-part message in MIME format.\r\n" if mime.multipart?

	# now that we have a root, we can mix in all our headers
	headers.each do |key, vals|
		# don't overwrite the content-type, encoding style stuff
		next if mime.headers.has_key? key
		# some new temporary hacks
		next if key =~ /content-type/i and vals[0] =~ /base64/
		next if mime.headers.keys.map(&:downcase).include? key.downcase
		mime.headers[key] += vals
	end
	# just a stupid hack to make the content-type header last, when using OrderedHash
	mime.headers['Content-Type'] = mime.headers.delete 'Content-Type'

	mime
end
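
The choice of root content type in to_mime depends only on whether there are attachments and whether any of them carries a Content-ID or Content-Location header. The sketch below isolates that decision. `root_content_type` is a hypothetical function, and each attachment is represented only by a plain hash of its headers.

```ruby
# Hypothetical summary of how to_mime picks the root content type.
# body_type is the content type returned by body_to_mime; each element of
# attachment_headers is a hash of one attachment's MIME headers.
def root_content_type(body_type, attachment_headers)
  return body_type if attachment_headers.empty?

  related = attachment_headers.any? do |headers|
    headers.key?('Content-ID') || headers.key?('Content-Location')
  end
  related ? 'multipart/related' : 'multipart/mixed'
end

p root_content_type('multipart/alternative', [])
p root_content_type('multipart/alternative', [{ 'Content-Type' => ['application/pdf'] }])
p root_content_type('multipart/alternative', [{ 'Content-ID' => ['<img1>'] }])
```

Note that a single inline image switches the whole message to multipart/related, including any ordinary attachments alongside it.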

#to_vcard ⇒ Object



# File 'lib/msg.rb', line 311

def to_vcard
	require 'rubygems'
	require 'vpim/vcard'
	# a very incomplete mapping, but its a start...
	# can't find where to set a lot of stuff, like zipcode, jobtitle etc
	# FIXME all the .to_s stuff is because i was too lazy to not set if nil. and setting when nil breaks
	# the Vcard#to_s later. find a neater way that scales to many properties like this.
	# property map perhaps, like:
	# { 
	#   :location => 'work',
	#   :street   => :business_address_street,
	#   :locality => proc { |props| [props.business_address_city, props.business_address_state].compact.join ', ' },
	#   ...
	# and then have the vcard filled in according to this (1-way) translation map.
	card = Vpim::Vcard::Maker.make2 do |m|
		# these are all standard mapi properties
		m.add_name do |n|
			n.given = props.given_name.to_s
			n.family = props.surname.to_s
			n.fullname = props.subject.to_s
		end

		# outlook seems to eschew the mapi properties this time,
		# like postal_address, street_address, home_address_city
		# so we use the named properties
		m.add_addr do |a|
			a.location = 'work'
			a.street = props.business_address_street.to_s
			# i think i can just assign the array
			a.locality = [props.business_address_city, props.business_address_state].compact.join ', '
			a.country = props.business_address_country.to_s
			a.postalcode = props.business_address_postal_code.to_s
		end

		# right type?
		m.birthday = props.birthday if props.birthday
		m.nickname = props.nickname.to_s

		# photo available?
		# FIXME finish, emails, telephones etc
	end
end
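
The comment at the top of to_vcard proposes a one-way property map in place of the repeated `.to_s` calls. A minimal sketch of that idea follows, without Vpim. `translate`, `address_map` and the Struct standing in for Msg::Properties are all hypothetical, and the property names are taken from the method above.

```ruby
# Apply a translation map to props. Symbols name a property to read and
# procs compute a value from several. Fields whose value is nil or empty
# are left out, so nothing needs a defensive .to_s.
def translate(props, map)
  map.each_with_object({}) do |(field, source), out|
    value = source.respond_to?(:call) ? source.call(props) : props.public_send(source)
    out[field] = value unless value.nil? || value.to_s.empty?
  end
end

# Hypothetical one-way map for the address block.
address_map = {
  street:     :business_address_street,
  locality:   proc { |pr| [pr.business_address_city, pr.business_address_state].compact.join(', ') },
  country:    :business_address_country,
  postalcode: :business_address_postal_code
}

# Stand-in for Msg::Properties, for illustration only.
props_class = Struct.new(:business_address_street, :business_address_city,
                         :business_address_state, :business_address_country,
                         :business_address_postal_code)

props = props_class.new('1 Main St', 'Springfield', nil, 'USA', nil)
p translate(props, address_map)
```

The resulting hash could then be assigned field by field inside the `add_addr` block, setting only the fields that are present.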

#type ⇒ Object

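Returns the lower-cased part of the message class that follows "IPM." (for example "note"), or nil if it cannot be determined. Possibly redundant.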



# File 'lib/msg.rb', line 204

def type
	props.message_class[/IPM\.(.*)/, 1].downcase rescue nil
end
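
The extraction in #type can be tried on its own. The sketch below is a standalone restatement under the assumption that the message class is a string or nil. `message_type` is a hypothetical name, and the safe-navigation operator stands in for the inline `rescue nil`.

```ruby
# Hypothetical standalone version of Msg#type: returns the part of the
# message class after "IPM.", lower-cased, or nil when there is no match.
def message_type(message_class)
  message_class.to_s[/IPM\.(.*)/, 1]&.downcase
end

p message_type('IPM.Note')
p message_type('IPM.Contact')
p message_type(nil)
```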