Class: Http::NativeParser

Inherits:
Object
Defined in:
lib/http/native_parser.rb

Overview

This is a pure-Ruby implementation of the HTTP parser. It is also the reference implementation for this library. A C implementation is planned for performance reasons, and it will have to pass the same specs as this one.

Defined Under Namespace

Classes: MethodInfo

Constant Summary

DefaultOptions =

The default set of parse options for the request.

{
  # maximum length of an individual header line.
  :max_header_length => 10240, 
  # maximum number of headers that can be passed to the server
  :max_headers => 100,
  # the size of the request body before it will be spilled
  # to a tempfile instead of being stored in memory.
  :min_tempfile_size => 1048576,
  # the class to use to create and manage the temporary file.
  # Must conform to the same interface as the stdlib Tempfile class
  :tempfile_class => Tempfile,
}
Methods =
{
  "OPTIONS" => MethodInfo[false, true],
  "GET" => MethodInfo[false, false],
  "HEAD" => MethodInfo[false, false],
  "POST" => MethodInfo[true, true],
  "PUT" => MethodInfo[true, true],
  "DELETE" => MethodInfo[false, false],
  "TRACE" => MethodInfo[false, false],
  "CONNECT" => MethodInfo[false, false],
}
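
The table above can be read as a lookup from method name to a pair of flags. The following is a minimal sketch, assuming MethodInfo is a two-member Struct whose member order (must_have_body, can_have_body) is inferred from the accessor calls made elsewhere in this class. METHODS is a shortened stand-in for the real table and body_allowed? is a hypothetical helper, not part of the library.

```ruby
# Assumption: MethodInfo is a Struct with these two members, in this order.
MethodInfo = Struct.new(:must_have_body, :can_have_body)

# Shortened stand-in for the Methods table.
METHODS = {
  "GET"     => MethodInfo[false, false],
  "POST"    => MethodInfo[true, true],
  "OPTIONS" => MethodInfo[false, true],
}

# Hypothetical helper: nil-safe lookup that answers false for unknown methods.
def body_allowed?(http_method)
  info = METHODS[http_method]
  info ? info.can_have_body : false
end

puts body_allowed?("POST")    # true
puts body_allowed?("GET")     # false
puts body_allowed?(nil)       # false
```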
RequestLineMatch =

Regex used to match the Request-Line

%r{^([a-zA-Z]+) (.+) HTTP/([0-9]+)\.([0-9]+)\r?\n}
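
As an illustration of what this pattern captures, the sketch below applies a copy of the regex to a sample Request-Line. REQUEST_LINE and split_request_line are hypothetical names used only here, and upcasing the method is an assumption based on the description of the method attribute.

```ruby
# Local copy of the pattern shown above.
REQUEST_LINE = %r{^([a-zA-Z]+) (.+) HTTP/([0-9]+)\.([0-9]+)\r?\n}

# Hypothetical helper: returns [method, path, [major, minor]],
# or nil when the line is incomplete or malformed.
def split_request_line(line)
  m = REQUEST_LINE.match(line)
  return nil unless m
  [m[1].upcase, m[2], [m[3].to_i, m[4].to_i]]
end

p split_request_line("get /index.html?x=1 HTTP/1.1\r\n")
p split_request_line("GET /incomplete HTTP/1.1") # no line ending yet, so nil
```

Because the pattern requires the trailing newline, a line that has only partly arrived does not match.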
HeaderLineMatch =

Regex used to match a header line. Lines suspected of being headers are also checked against HeaderContinueMatch to handle multiline (folded) headers.

%r{^([a-zA-Z-]+):[ \t]*([[:print:]]+?)\r?\n}
HeaderContinueMatch =
%r{^[ \t]+([[:print:]]+?)\r?\n}
EmptyLineMatch =
%r{^\r?\n}
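
To show how the three patterns above fit together, here is a minimal sketch that reads a header block with StringScanner. read_headers is a hypothetical helper, and joining a continuation line onto the previous value with a single space is an assumption; the library may fold lines differently.

```ruby
require "strscan"

# Local copies of the patterns shown above.
HEADER_LINE     = %r{^([a-zA-Z-]+):[ \t]*([[:print:]]+?)\r?\n}
HEADER_CONTINUE = %r{^[ \t]+([[:print:]]+?)\r?\n}
EMPTY_LINE      = %r{^\r?\n}

# Hypothetical helper: collects headers up to the blank line that ends them.
def read_headers(text)
  scanner = StringScanner.new(text)
  headers = {}
  last = nil
  until scanner.eos?
    if scanner.scan(HEADER_LINE)
      last = scanner[1]
      headers[last] = scanner[2]
    elsif last && scanner.scan(HEADER_CONTINUE)
      # Assumption: folded lines are appended with one space.
      headers[last] += " " + scanner[1]
    elsif scanner.scan(EMPTY_LINE)
      break
    else
      raise ArgumentError, "malformed header near byte #{scanner.pos}"
    end
  end
  headers
end

raw = "Host: example.com\r\nAccept: text/html,\r\n application/xml\r\n\r\nbody"
p read_headers(raw)
```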
ChunkSizeLineMatch =

Regex used to match a size specification for a chunked segment

%r{^[0-9a-fA-F]+\r?\n}
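
The size line is hexadecimal, and the pattern has no capture group, so the whole match has to be stripped and converted. A minimal sketch, with chunk_size as a hypothetical helper name:

```ruby
# Local copy of the pattern shown above.
CHUNK_SIZE_LINE = %r{^[0-9a-fA-F]+\r?\n}

# Hypothetical helper: returns the chunk size as an Integer, or nil when
# the text does not contain a complete size line.
def chunk_size(line)
  m = CHUNK_SIZE_LINE.match(line)
  m && m[0].strip.hex
end

p chunk_size("1a\r\n")  # 26
p chunk_size("0\r\n")   # 0, which marks the last chunk in chunked encoding
p chunk_size("xyz\r\n") # nil
```

Note that the pattern as written does not accept chunk extensions such as "1a;name=value".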
AnyLineMatch =

Used as a fallback in error detection for a malformed request line or header.

%r{^.+?\r?\n}

Constructor Details

#initialize(options = DefaultOptions) ⇒ NativeParser

Returns a new instance of NativeParser.



# File 'lib/http/native_parser.rb', line 76

def initialize(options = DefaultOptions)
  @method = nil
  @path = nil
  @version = nil
  @headers = {}
  @body = nil
  @state = :request_line
  @options = DefaultOptions.merge(options)
end
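
Because the constructor merges the given hash into DefaultOptions, a caller only needs to pass the keys it wants to change. The sketch below reproduces that merge with a local copy of the defaults; DEFAULT_OPTIONS and effective_options are hypothetical names for illustration.

```ruby
require "tempfile"

# Local copy of the defaults listed under DefaultOptions.
DEFAULT_OPTIONS = {
  :max_header_length => 10240,
  :max_headers => 100,
  :min_tempfile_size => 1048576,
  :tempfile_class => Tempfile,
}

# Hypothetical helper mirroring DefaultOptions.merge(options).
def effective_options(overrides = {})
  DEFAULT_OPTIONS.merge(overrides)
end

opts = effective_options(:max_headers => 20)
puts opts[:max_headers]        # 20
puts opts[:max_header_length]  # 10240
```

Hash#merge returns a new hash, so the defaults themselves are left unchanged.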

Instance Attribute Details

#body ⇒ Object (readonly)

The body of the request as a stream object. May be either a StringIO or a Tempfile, depending on request length.



# File 'lib/http/native_parser.rb', line 32

def body
  @body
end
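
The choice between the two stream types is presumably driven by the :min_tempfile_size and :tempfile_class options. The following is only a sketch of that idea: body_stream_for is a hypothetical helper, and both the use of the declared length and the >= comparison are assumptions rather than the library's confirmed behaviour.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper: picks an in-memory or on-disk stream by body length.
def body_stream_for(length, min_tempfile_size = 1048576, tempfile_class = Tempfile)
  if length >= min_tempfile_size
    tempfile_class.new("http_body") # assumed basename for the temp file
  else
    StringIO.new
  end
end

small = body_stream_for(512)
large = body_stream_for(2 * 1048576)
puts small.class
puts large.class
large.close! # close and delete the temporary file
```

Both classes respond to read, write and rewind, which is why callers can treat the body as a generic stream.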

#headers ⇒ Object (readonly)

A hash of headers passed to the server with the request. All header names are normalized to ALLCAPS_WITH_UNDERSCORES for consistency’s sake (for example, Content-Type becomes CONTENT_TYPE).



# File 'lib/http/native_parser.rb', line 28

def headers
  @headers
end
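
The normalization described for this attribute can be sketched in one line. normalize_header_name is a hypothetical helper; the library's own implementation is not shown on this page and may differ.

```ruby
# Hypothetical helper: "Content-Type" style names to ALLCAPS_WITH_UNDERSCORES.
def normalize_header_name(name)
  name.upcase.tr("-", "_")
end

puts normalize_header_name("Content-Type")    # CONTENT_TYPE
puts normalize_header_name("x-forwarded-for") # X_FORWARDED_FOR
```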

#method ⇒ Object (readonly)

The HTTP method string used. Will always be a string and all-capsed. Valid values are the keys of Methods: “OPTIONS”, “GET”, “HEAD”, “POST”, “PUT”, “DELETE”, “TRACE”, “CONNECT”. Other values will cause an exception, since the parser then cannot tell whether the request has a body.



# File 'lib/http/native_parser.rb', line 15

def method
  @method
end

#path ⇒ Object (readonly)

The path given by the client as a string. No processing is done on this and nearly anything is considered valid.



# File 'lib/http/native_parser.rb', line 19

def path
  @path
end

#version ⇒ Object (readonly)

The HTTP version of the request as an array of two integers. [1,0] and [1,1] are the most likely values currently.



# File 'lib/http/native_parser.rb', line 23

def version
  @version
end

Instance Method Details

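#can_have_body? ⇒ Boolean

Returns true if the HTTP method being parsed (if known at this point in the parse) can have a body. The documented intent is to return false if the method hasn’t been determined yet; note, however, that the source shown here calls can_have_body on Methods[@method] without a nil check, so it relies on @method already being a key of Methods.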

Returns:

  • (Boolean)


# File 'lib/http/native_parser.rb', line 96

def can_have_body?
  Methods[@method].can_have_body
end

#done? ⇒ Boolean

Returns true if the request is completely done.

Returns:

  • (Boolean)


# File 'lib/http/native_parser.rb', line 317

def done?
  @state == :done
end

#done_body? ⇒ Boolean

Returns true if the request’s body has been consumed (equivalent to done?).

Returns:

  • (Boolean)


# File 'lib/http/native_parser.rb', line 330

def done_body?
  done?
end

#done_headers? ⇒ Boolean

Returns true if all the headers from the request have been consumed.

Returns:

  • (Boolean)


# File 'lib/http/native_parser.rb', line 326

def done_headers?
  [:body_identity, :body_chunked, :body_chunked_tail, :done].include?(@state)
end

#done_request_line? ⇒ Boolean

Returns true if the parser has parsed the Request-Line (for example, GET / HTTP/1.1).

Returns:

  • (Boolean)


# File 'lib/http/native_parser.rb', line 322

def done_request_line?
  [:headers, :body_identity, :body_chunked, :body_chunked_tail, :done].include?(@state)
end
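
The done_* predicates all test membership of the current state in a fixed list, so each state implies a fixed set of answers. The sketch below tabulates that relationship using the state names that appear in the listings on this page; progress and the two constants are hypothetical names.

```ruby
# States in which the headers have been fully consumed (from done_headers?).
HEADERS_DONE_STATES = [:body_identity, :body_chunked, :body_chunked_tail, :done]
# States in which the Request-Line has been parsed (from done_request_line?).
REQUEST_LINE_DONE_STATES = [:headers] + HEADERS_DONE_STATES

# Hypothetical helper: which parts of the request are finished in a state.
def progress(state)
  {
    :request_line => REQUEST_LINE_DONE_STATES.include?(state),
    :headers => HEADERS_DONE_STATES.include?(state),
    :body => state == :done,
  }
end

p progress(:request_line)
p progress(:headers)
p progress(:body_chunked)
p progress(:done)
```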

#fill_rack_env(env = {}) ⇒ Object

Fills the given Rack environment hash with the information gleaned from the parsed request and returns it. Only the subset that the parser library can determine is set; in particular, the only rack.* variable set is rack.input. You should also have defaults in place for SERVER_NAME and SERVER_PORT, as Rack requires them: they are taken from the Host header only when SERVER_NAME is not already set, and SERVER_PORT only when that header includes a port.



# File 'lib/http/native_parser.rb', line 295

def fill_rack_env(env = {})
  env["rack.input"] = @body || StringIO.new
  env["REQUEST_METHOD"] = @method
  env["SCRIPT_NAME"] = ""
  env["REQUEST_URI"] = @path
  env["PATH_INFO"], query = @path.split("?", 2)
  env["QUERY_STRING"] = query || ""
  if (@headers["HOST"] && !env["SERVER_NAME"])
    env["SERVER_NAME"], port = @headers["HOST"].split(":", 2)
    env["SERVER_PORT"] = port if port
  end
  @headers.each do |key, val|
    if (key == 'CONTENT_LENGTH' || key == 'CONTENT_TYPE')
      env[key] = val
    else
      env["HTTP_#{key}"] = val
    end
  end
  return env
end
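
The three string operations in this method are easy to check in isolation. The sketch below pulls them out into hypothetical helpers (split_target, split_host, env_key) that mirror the lines above; they are not part of the library.

```ruby
# Mirrors the PATH_INFO / QUERY_STRING split: only the first "?" separates.
def split_target(path)
  path_info, query = path.split("?", 2)
  [path_info, query || ""]
end

# Mirrors the Host header split: the port is nil when none is given.
def split_host(host)
  name, port = host.split(":", 2)
  [name, port]
end

# Mirrors the key mapping: two headers keep their name, the rest get HTTP_.
def env_key(header)
  %w[CONTENT_LENGTH CONTENT_TYPE].include?(header) ? header : "HTTP_#{header}"
end

p split_target("/search?q=a?b")   # ["/search", "q=a?b"]
p split_target("/")               # ["/", ""]
p split_host("example.com:8080")  # ["example.com", "8080"]
p env_key("USER_AGENT")           # "HTTP_USER_AGENT"
```

Note that the port stays a string, since the method copies it into the environment without conversion.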

#has_body? ⇒ Boolean

Returns a truthy value (the body stream itself) if the request has a body, and nil otherwise.

Returns:

  • (Boolean)


# File 'lib/http/native_parser.rb', line 101

def has_body?
  @body
end

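#must_have_body? ⇒ Boolean

Returns true if the HTTP method being parsed (if known at this point in the parse) must have a body. The documented intent is to return false if the method hasn’t been determined yet; note, however, that the source shown here calls must_have_body on Methods[@method] without a nil check, so it relies on @method already being a key of Methods.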

Returns:

  • (Boolean)


# File 'lib/http/native_parser.rb', line 89

def must_have_body?
  Methods[@method].must_have_body
end

#parse(str) ⇒ Object

Takes a string and runs a copy of it through the parser, leaving str itself untouched. The parser does not consume anything it can’t completely parse, and the unparsed remainder of the copy is not handed back through str, so you should always pass complete request chunks (lines or body data) to this method. It’s mostly for testing and convenience. In practical use, you want parse!, which removes parsed data from the string you pass in.



# File 'lib/http/native_parser.rb', line 111

def parse(str)
  parse!(str.dup)
end

#parse!(str) ⇒ Object

Consumes as much of str as it can and then removes it from str. This allows you to iteratively pass data into the parser as it comes from the client.



# File 'lib/http/native_parser.rb', line 266

def parse!(str)
  scanner = StringScanner.new(str)
  begin
    while (!scanner.eos?)
      start_pos = scanner.pos
      send(:"parse_#{@state}", scanner)
      if (scanner.pos == start_pos)
        # if we didn't move forward, we've run out of useful string so throw it back.
        return str
      end
    end
  ensure
    # clear out whatever we managed to scan.
    str[0, scanner.pos] = ""
  end
end
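
The consume-then-truncate pattern used by parse! can be tried without the library. The sketch below uses a hypothetical LineCollector class as a stand-in for the parser: it accepts only complete lines, leaves any partial line in the caller's buffer, and trims the buffer in an ensure block in the same way as the method above.

```ruby
require "strscan"

# Hypothetical stand-in for the parser: collects complete lines only.
class LineCollector
  attr_reader :lines

  def initialize
    @lines = []
  end

  # Consumes every complete line at the front of str and deletes it in place.
  def feed!(str)
    scanner = StringScanner.new(str)
    begin
      until scanner.eos?
        line = scanner.scan(/.*?\r?\n/)
        break unless line # no full line left; wait for more data
        @lines << line.chomp
      end
    ensure
      # Remove whatever was scanned, even if an error was raised.
      str[0, scanner.pos] = ""
    end
    str
  end
end

buffer = +""
collector = LineCollector.new
["GET / HT", "TP/1.1\r\nHost: exam", "ple.com\r\n"].each do |packet|
  buffer << packet
  collector.feed!(buffer)
end
p collector.lines # ["GET / HTTP/1.1", "Host: example.com"]
p buffer          # ""
```

Passing buffer.dup instead of buffer corresponds to the non-destructive parse: the data is processed, but the caller's string keeps its contents.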