Class: RMail::Mailbox::MBoxReader

Inherits:
Parser::PushbackReader
Defined in:
lib/rmail/mailbox/mboxreader.rb

Overview

Class that can parse Unix mbox style mailboxes. These mailboxes separate individual messages with a line beginning with the string “From ”.
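
To make the delimiter rule above concrete, here is a minimal stand-alone sketch that splits mbox text on lines beginning with "From ". The helper name `split_mbox` is hypothetical and is not part of RMail; keeping the delimiter line with the message that follows it is a choice made for this sketch, not a statement about what MBoxReader returns.

```ruby
# Hypothetical helper (not part of RMail): split mbox text into raw messages.
# A new message starts at every line that begins with "From ".
def split_mbox(text, sep = "\n")
  messages = []
  text.each_line(sep) do |line|
    # Start a new message on a delimiter line, or for leading text
    # that appears before any delimiter.
    messages << "".dup if line.start_with?("From ") || messages.empty?
    messages.last << line
  end
  messages
end

MBOX = "From alice@example.com Mon Jan  1 00:00:00 2024\n" \
       "Subject: one\n\nfirst body\n" \
       "From bob@example.com Tue Jan  2 00:00:00 2024\n" \
       "Subject: two\n\nsecond body\n"

SPLIT_MESSAGES = split_mbox(MBOX)
```

Real mbox files have edge cases (for example body lines that themselves start with "From ") that this sketch does not handle.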

Typical usage:

require 'rmail'

File.open("file.mbox") do |file|
  RMail::Mailbox::MBoxReader.new(file).each_message do |input|
    # 'input' is the reader itself, positioned at the current message
    message = RMail::Parser.read(input)
    # do something with the message
  end
end

Or see RMail::Mailbox.parse_mbox for a more convenient interface.

Instance Attribute Summary

Attributes inherited from Parser::PushbackReader

#chunk_size

Instance Method Summary

Methods inherited from Parser::PushbackReader

maybe_contains_re, #pushback, #read, #standard_read_chunk

Constructor Details

#initialize(input, line_separator = $/) ⇒ MBoxReader

Creates a new MBoxReader that reads from ‘input’ with lines that end with ‘line_separator’.

‘input’ can either be an IO source (an object that responds to the “read” method in the same way as a standard IO object) or a String.

‘line_separator’ defaults to $/. Useful values are probably limited to “\n” (Unix) and “\r\n” (DOS/Windows).
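
As an illustration of why the separator matters, the sketch below rebuilds the "entire From_ line" pattern in the same way the constructor source in this section does, and tries it against Unix and DOS/Windows text. The helper name `from_line_re` and the sample strings are assumptions made for the example.

```ruby
# Hypothetical helper: same pattern shape as @entire_from_re in the
# constructor source. The separator is interpolated unescaped, so this is
# only meaningful for plain separators such as "\n" or "\r\n".
def from_line_re(sep)
  /\A#{sep}From .*?#{sep}/
end

UNIX_TEXT = "\nFrom alice@example.com Mon Jan  1 00:00:00 2024\nSubject: hi\n"
DOS_TEXT  = UNIX_TEXT.gsub("\n", "\r\n")

UNIX_ON_UNIX = from_line_re("\n").match?(UNIX_TEXT)
DOS_ON_DOS   = from_line_re("\r\n").match?(DOS_TEXT)
DOS_ON_UNIX  = from_line_re("\r\n").match?(UNIX_TEXT)
```

A pattern built for one separator does not match text that uses the other, so the value passed as ‘line_separator’ should agree with the line endings of the mailbox being read.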

# File 'lib/rmail/mailbox/mboxreader.rb', line 61

def initialize(input, line_separator = $/)
  super(input)
  @end_of_message = false
  @chunk_minsize = 0
  @sep = line_separator
  @tail = nil

  # This regexp will match a From_ header, or some prefix.
  re_string = RMail::Parser::PushbackReader.
    maybe_contains_re("#{@sep}From ")
  @partial_from_re = Regexp.new(re_string)

  # This regexp will match an entire From_ header.
  @entire_from_re = /\A#{@sep}From .*?#{@sep}/
end

Instance Method Details

#each_message ⇒ Object

Yields self once per message until #eof returns true, calling #next after each yield.

This method makes it simple to read messages successively out of the mailbox. See the class description for a code example.
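
The control flow described above can be modelled without the library. In the sketch below, `ToyReader` is a hypothetical stand-in that holds messages in memory; only the shape of the read / next / eof protocol is meant to match, not RMail's actual parsing.

```ruby
# ToyReader is a hypothetical stand-in, not RMail's class. It only models
# the read / next / eof protocol over an in-memory list of messages.
class ToyReader
  def initialize(messages)
    @pending = messages.dup
    @consumed = false
  end

  # Returns the current message once, then nil until #next is called.
  def read
    return nil if @consumed || @pending.empty?
    @consumed = true
    @pending.shift
  end

  # Re-arms #read for the following message.
  def next
    @consumed = false
  end

  # True when no further message is left to read.
  def eof
    @pending.empty?
  end

  # Same loop shape as the each_message source in this section.
  def each_message
    until eof
      yield self
      self.next
    end
  end
end

SEEN = []
ToyReader.new(["first\n", "second\n"]).each_message do |input|
  # In this toy the block must call read, otherwise the loop never ends.
  SEEN << input.read
end
```

The block receives the reader itself rather than a message string, which is why the class-level example passes it straight to a parser.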

# File 'lib/rmail/mailbox/mboxreader.rb', line 128

def each_message
  while !eof
    yield self
    self.next
  end
end

#eof ⇒ Object

Returns true if the next call to read_chunk will return nil.

# File 'lib/rmail/mailbox/mboxreader.rb', line 120

def eof
  parent_eof and @tail.nil?
end

#next ⇒ Object

Advances to the next message to be read. Call this after #read returns nil.

Note: Once #read returns nil, you can call #eof before or after calling #next to tell if there actually is a next message to read.

# File 'lib/rmail/mailbox/mboxreader.rb', line 112

def next
  @end_of_message = false
  @tail = nil
end

#parent_eof ⇒ Object

# File 'lib/rmail/mailbox/mboxreader.rb', line 117

alias_method :parent_eof, :eof

#parent_read_chunk ⇒ Object

# File 'lib/rmail/mailbox/mboxreader.rb', line 77

alias_method :parent_read_chunk, :read_chunk

#read_chunk(size) ⇒ Object

Reads some data from the current message and returns it. The ‘size’ argument is only a suggestion; the returned string can be larger or smaller. When ‘size’ is nil, the entire message is returned.

Once all data from the current message has been read, #read returns nil and #next must be called to begin reading from the next message. You can use #eof to tell if there is any more data to be read from the input source.
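
Judging from the source listing for this method, it also remembers the tail of the data returned so far and, when the message ends without a trailing line separator, returns the separator as one extra chunk. The stand-alone sketch below models that bookkeeping; the function name `with_trailing_separator` is hypothetical, and the `chunks` array stands in for the successive results of the lower-level read.

```ruby
# Hypothetical model of the tail bookkeeping; not RMail code.
# `chunks` are the pieces of one message; a final nil (end of message)
# is appended internally.
def with_trailing_separator(chunks, sep = "\n")
  tail = nil
  out = []
  (chunks + [nil]).each do |chunk|
    if chunk
      if chunk.length > sep.length
        # Long chunk: only its last sep.length characters matter.
        tail = chunk[-sep.length..-1]
      else
        # Short chunk: accumulate, since the separator may span chunks.
        tail = (tail || "") + chunk
      end
    elsif tail
      # End of message: emit a separator if the data did not end with one.
      chunk = sep if tail[-sep.length..-1] != sep
      tail = nil
    end
    out << chunk if chunk
  end
  out.join
end

NO_NEWLINE   = with_trailing_separator(["Subject: hi\n\nbo", "dy"])
HAS_NEWLINE  = with_trailing_separator(["body\n"])
SPLIT_CRLF   = with_trailing_separator(["body\r", "\n"], "\r\n")
```

Accumulating short chunks is what lets a separator that is split across two reads, as in the last call, still be recognized.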

# File 'lib/rmail/mailbox/mboxreader.rb', line 88

def read_chunk(size)
  chunk = read_chunk_low(size)
  if chunk
    if chunk.length > @sep.length
      @tail = chunk[-@sep.length .. -1]
    else
      @tail ||= ''
      @tail << chunk
    end
  elsif @tail
    if @tail[-@sep.length .. -1] != @sep
      chunk = @sep
    end
    @tail = nil
  end
  chunk
end