Class: Mechanize::PluggableParser

Inherits:
Object
Defined in:
lib/mechanize/pluggable_parsers.rb

Overview

This class registers and maintains the pluggable parsers that Mechanize uses.

Mechanize allows different parsers for different content types. Mechanize uses PluggableParser to determine which parser to use for any content type. To use your own pluggable parser or to change the default pluggable parsers, register them with this class.

The default parser for unregistered content types is Mechanize::File.

The module Mechanize::Parser provides basic functionality for any content type, so you may use it in custom parsers you write. For small files you wish to perform in-memory operations on, you should subclass Mechanize::File. For large files you should subclass Mechanize::Download as the content is only loaded into memory in small chunks.

Example

To create your own parser, write a class whose constructor takes four parameters: uri, response, body, and code. Here is an example of registering a pluggable parser that handles CSV files:

require 'csv'

class CSVParser < Mechanize::File
  attr_reader :csv

  def initialize uri = nil, response = nil, body = nil, code = nil
    super uri, response, body, code
    @csv = CSV.parse body
  end
end

agent = Mechanize.new
agent.pluggable_parser.csv = CSVParser
agent.get('http://example.com/test.csv')  # => CSVParser

Now any response with a content type of ‘text/csv’ will initialize a CSVParser and return that object to the caller.
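
As an illustration of what the csv reader would hold, the sketch below runs CSV.parse on a small, made-up body string using only Ruby's csv library. The sample data and the variable names are assumptions for demonstration; no HTTP request is made.

```ruby
require 'csv'

# Hypothetical response body; in the parser above this would be the
# body argument passed to CSVParser#initialize.
sample_body = "name,price\nwidget,3\ngadget,5\n"

# CSV.parse returns an array of rows, each row an array of strings.
rows = CSV.parse(sample_body)

# Split the first row (column names) from the remaining records.
header, *records = rows
puts header.inspect   # ["name", "price"]
puts records.length   # 2
```

Note that every field comes back as a string, so numeric columns need explicit conversion by the caller.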

To register a pluggable parser for a content type that PluggableParser has no shortcut method for, use the hash syntax:

agent.pluggable_parser['text/something'] = SomeClass

To set the default parser, use #default=:

agent.pluggable_parser.default = Mechanize::Download

Now all unknown content types will be saved to disk and not loaded into memory.

Constant Summary

CONTENT_TYPES =
{
  :html  => 'text/html',
  :wap   => 'application/vnd.wap.xhtml+xml',
  :xhtml => 'application/xhtml+xml',
  :pdf   => 'application/pdf',
  :csv   => 'text/csv',
  :xml   => 'text/xml',
}

Constructor Details

#initialize ⇒ PluggableParser

Returns a new instance of PluggableParser.



# File 'lib/mechanize/pluggable_parsers.rb', line 71

def initialize
  @parsers = {
    CONTENT_TYPES[:html]  => Mechanize::Page,
    CONTENT_TYPES[:xhtml] => Mechanize::Page,
    CONTENT_TYPES[:wap]   => Mechanize::Page,
  }

  @default = Mechanize::File
end

Instance Attribute Details

#default ⇒ Object

Returns the value of attribute default, the parser class that #parser falls back to for content types with no registered parser.



# File 'lib/mechanize/pluggable_parsers.rb', line 69

def default
  @default
end

Instance Method Details

#[](content_type) ⇒ Object

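Retrieves the parser registered for content_type content, or nil when none is registered. Unlike #parser, this method does not fall back to the default parser.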



# File 'lib/mechanize/pluggable_parsers.rb', line 132

def [](content_type)
  @parsers[content_type]
end

#[]=(content_type, klass) ⇒ Object

Sets the parser for content_type content to klass



# File 'lib/mechanize/pluggable_parsers.rb', line 139

def []=(content_type, klass)
  @parsers[content_type] = klass
end

#csv=(klass) ⇒ Object

Registers klass as the parser for text/csv content



# File 'lib/mechanize/pluggable_parsers.rb', line 118

def csv=(klass)
  register_parser(CONTENT_TYPES[:csv], klass)
end

#html=(klass) ⇒ Object

Registers klass as the parser for text/html and application/xhtml+xml content



# File 'lib/mechanize/pluggable_parsers.rb', line 96

def html=(klass)
  register_parser(CONTENT_TYPES[:html], klass)
  register_parser(CONTENT_TYPES[:xhtml], klass)
end
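
The effect of html= can be traced with a small stand-in. The StandInRegistry class below is hypothetical: it copies only the constructor defaults and the html= logic shown on this page, with symbols in place of parser classes, so it runs without Mechanize installed. It suggests that html= leaves the WAP entry set in the constructor untouched.

```ruby
# Hypothetical stand-in mirroring the html= logic shown above.
# Symbols (:page, :custom) stand in for real parser classes.
class StandInRegistry
  CONTENT_TYPES = {
    html:  'text/html',
    wap:   'application/vnd.wap.xhtml+xml',
    xhtml: 'application/xhtml+xml',
  }.freeze

  attr_reader :parsers

  def initialize
    # Same three defaults as the constructor on this page.
    @parsers = {
      CONTENT_TYPES[:html]  => :page,
      CONTENT_TYPES[:xhtml] => :page,
      CONTENT_TYPES[:wap]   => :page,
    }
  end

  # Replaces the parser for both HTML and XHTML, nothing else.
  def html=(klass)
    @parsers[CONTENT_TYPES[:html]]  = klass
    @parsers[CONTENT_TYPES[:xhtml]] = klass
  end
end

registry = StandInRegistry.new
registry.html = :custom

registry.parsers.each { |type, parser| puts "#{type} => #{parser}" }
```

Since this page lists no wap= shortcut, replacing the WAP parser as well would go through the hash syntax with 'application/vnd.wap.xhtml+xml'.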

#parser(content_type) ⇒ Object

Returns the parser registered for the given content_type. Falls back to the default parser when content_type is nil or has no registered parser.



# File 'lib/mechanize/pluggable_parsers.rb', line 84

def parser(content_type)
  content_type.nil? ? default : @parsers[content_type] || default
end
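
The difference between #parser and #[] is easiest to see side by side. The LookupSketch class below is a hypothetical stand-in that copies the two method bodies shown on this page and uses symbols in place of parser classes, so it runs without Mechanize.

```ruby
# Hypothetical stand-in copying the #parser and #[] bodies from this page.
class LookupSketch
  attr_accessor :default

  def initialize
    @parsers = { 'text/html' => :page }
    @default = :file
  end

  # Falls back to the default for nil or unregistered content types.
  def parser(content_type)
    content_type.nil? ? default : @parsers[content_type] || default
  end

  # Plain hash lookup with no fallback.
  def [](content_type)
    @parsers[content_type]
  end
end

lookup = LookupSketch.new
puts lookup.parser('text/html').inspect   # :page
puts lookup.parser('image/png').inspect   # :file
puts lookup.parser(nil).inspect           # :file
puts lookup['image/png'].inspect          # nil
```

In practice this means #[] is suited to checking whether a type has an explicit registration, while #parser always yields something usable.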

#pdf=(klass) ⇒ Object

Registers klass as the parser for application/pdf content



# File 'lib/mechanize/pluggable_parsers.rb', line 111

def pdf=(klass)
  register_parser(CONTENT_TYPES[:pdf], klass)
end

#register_parser(content_type, klass) ⇒ Object

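Registers klass as the parser for content_type content. The shortcut setters (#csv=, #html=, #pdf=, #xhtml=, #xml=) call this method; it is marked :nodoc: in the source.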



# File 'lib/mechanize/pluggable_parsers.rb', line 88

def register_parser(content_type, klass) # :nodoc:
  @parsers[content_type] = klass
end

#xhtml=(klass) ⇒ Object

Registers klass as the parser for application/xhtml+xml content



# File 'lib/mechanize/pluggable_parsers.rb', line 104

def xhtml=(klass)
  register_parser(CONTENT_TYPES[:xhtml], klass)
end

#xml=(klass) ⇒ Object

Registers klass as the parser for text/xml content



# File 'lib/mechanize/pluggable_parsers.rb', line 125

def xml=(klass)
  register_parser(CONTENT_TYPES[:xml], klass)
end