Class: ApacheLogRegex

Inherits:
Object
Defined in:
lib/apache_log_regex.rb,
lib/apache_log_regex/version.rb

Overview

Apache Log Regex

Parse a line from an Apache log file into a hash.

This is a Ruby port of Peter Hickman’s Apache::LogRegex 1.4 Perl module, available at cpan.uwinnipeg.ca/~peterhi/Apache-LogRegex.

Example Usage

The following is the simplest usage example. It parses the `access.log` file line by line, prints each parsed line, and reports any line that does not match the format.

format = '%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"'
parser = ApacheLogRegex.new(format)

File.foreach('/var/apache/access.log') do |line|
  begin
    # parse! raises ParseError on a mismatch; parse would return nil instead.
    puts parser.parse!(line).inspect
    # {"%r"=>"GET /blog/index.xml HTTP/1.1", "%h"=>"87.18.183.252", ... }
  rescue ApacheLogRegex::ParseError => e
    puts "Error parsing log file: " + e.message
  end
end

More often, you will want to collect the parsed lines and use them later in your program. The following example reads all log lines, parses each one, and returns an array of hashes, with nil in place of any line that does not match the format.

format = '%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"'
parser = ApacheLogRegex.new(format)

results = File.readlines('/var/apache/access.log').collect do |line|
  # parse returns nil (it does not raise) when the line doesn't match the format.
  parser.parse(line)
  # {"%r"=>"GET /blog/index.xml HTTP/1.1", "%h"=>"87.18.183.252", ... }
end
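
Once the lines are collected, the array can be post-processed with plain Enumerable methods. The following is a minimal sketch, assuming the array holds one hash per parsed line and nil for each line that failed to parse. The sample rows are made up for illustration and do not come from a real log file.

```ruby
# Hypothetical collected results: one hash per parsed line, nil for failures.
results = [
  { "%h" => "87.18.183.252", "%>s" => "200", "%b" => "527" },
  nil,
  { "%h" => "10.0.0.7",      "%>s" => "404", "%b" => "-" },
  { "%h" => "87.18.183.252", "%>s" => "200", "%b" => "1024" }
]

# Drop the lines that could not be parsed.
parsed = results.compact

# Count requests per HTTP status code.
hits_by_status = parsed.group_by { |row| row["%>s"] }.transform_values(&:size)

# Sum the response sizes; Apache logs "-" when no bytes were sent,
# and "-".to_i is 0 in Ruby.
total_bytes = parsed.sum { |row| row["%b"].to_i }

puts hits_by_status.inspect
puts total_bytes
```

All values in the parsed hashes are strings, so numeric fields such as `%b` need an explicit conversion before arithmetic.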

Defined Under Namespace

Modules: Version
Classes: ParseError

Constant Summary

NAME = 'ApacheLogRegex'
GEM = 'apachelogregex'
AUTHOR = 'Simone Carletti <[email protected]>'
VERSION = Version::STRING
STATUS = 'alpha'
BUILD = ''.match(/(\d+)/).to_a.first

Constructor Details

#initialize(format) ⇒ ApacheLogRegex

Initializes a new parser instance with given log format.



# File 'lib/apache_log_regex.rb', line 96

def initialize(format)
  @regexp = nil
  @names  = []
  @format = parse_format(format)
end

Instance Attribute Details

#format ⇒ Object (readonly)

The normalized log file format. Some common formats:

Common Log Format (CLF)
'%h %l %u %t \"%r\" %>s %b'

Common Log Format with Virtual Host
'%v %h %l %u %t \"%r\" %>s %b'

NCSA extended/combined log format
'%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"'
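
The formats above can be kept in a lookup table and split into their directives. This is only a sketch: `LOG_FORMATS` and `directives` are hypothetical names, not part of the library, and the splitting rule (whitespace-separated tokens with the escaped quotes removed) is an assumption about how field names relate to the format string, based on the `"%r"` and `"%h"` keys shown in the usage examples.

```ruby
# Hypothetical lookup table holding the formats listed above.
# Single-quoted strings keep the backslash in \" as a literal character.
LOG_FORMATS = {
  common:       '%h %l %u %t \"%r\" %>s %b',
  common_vhost: '%v %h %l %u %t \"%r\" %>s %b',
  combined:     '%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"'
}.freeze

# Hypothetical helper: split a format into directives and drop the
# escaped quotes around quoted fields. The library's own logic may differ.
def directives(format)
  format.split(' ').map { |token| token.gsub('\"', '') }
end

puts directives(LOG_FORMATS[:common]).inspect
puts directives(LOG_FORMATS[:combined]).last
```

A selected entry such as `LOG_FORMATS[:combined]` could then be passed to `ApacheLogRegex.new`.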


# File 'lib/apache_log_regex.rb', line 86

def format
  @format
end

#names ⇒ Object (readonly)

The list of field names extracted from the log format.



# File 'lib/apache_log_regex.rb', line 92

def names
  @names
end

#regexp ⇒ Object (readonly)

Regexp instance used for parsing a log line.



# File 'lib/apache_log_regex.rb', line 89

def regexp
  @regexp
end

Instance Method Details

#parse(line) ⇒ Object

Parses line according to the current log format and returns a hash of log field => value on success. Returns nil if line doesn’t match the current log format.



# File 'lib/apache_log_regex.rb', line 105

def parse(line)
  row = line.to_s
  row.chomp!
  row.strip!
  return unless match = regexp.match(row)

  data = {}
  names.each_with_index { |field, index| data[field] = match[index + 1] } # [0] == line
  data
end
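
The mapping from capture groups to field names can be reproduced without the library. The sketch below is self-contained and makes two assumptions: `NAMES` and `REGEXP` are hand-written stand-ins for what a parser might hold for the Common Log Format (the regexp the library actually generates may differ), and `sketch_parse` is a hypothetical helper, not the library's method. The sample line is invented.

```ruby
# Hand-written stand-ins for the parser's field names and regexp (assumptions).
NAMES  = %w[%h %l %u %t %r %>s %b].freeze
REGEXP = /\A(\S+) (\S+) (\S+) (\[[^\]]+\]) "([^"]*)" (\S+) (\S+)\z/

# Hypothetical helper mirroring the shape of #parse.
def sketch_parse(line)
  row = line.to_s.strip          # strip returns a copy; the caller's string is untouched
  match = REGEXP.match(row)
  return nil unless match

  # captures skips match[0] (the whole line), so capture N pairs with name N.
  NAMES.zip(match.captures).to_h
end

LINE = '87.18.183.252 - - [13/Aug/2008:00:50:49 -0700] "GET /blog/index.xml HTTP/1.1" 302 527'

parsed = sketch_parse(LINE)
puts parsed["%h"]
puts parsed["%r"]
puts sketch_parse("not a log line").inspect
```

Unlike the source shown above, which calls `chomp!` and `strip!` on the result of `line.to_s`, this sketch works on a copy, so a String argument is never modified in place.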

#parse!(line) ⇒ Object

Same as ApacheLogRegex#parse but raises a ParseError if line doesn’t match current format.

Raises:
ParseError: if line doesn’t match the current format



# File 'lib/apache_log_regex.rb', line 123

def parse!(line)
  parse(line) || raise(ParseError, "Invalid format `%s` for line `%s`" % [format, line])
end
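
The difference between the nil-returning and the raising variant can be illustrated with a toy parser. This is a sketch only: `SketchParser`, `SketchParseError`, and the two-field "host status" format are hypothetical stand-ins chosen to keep the example self-contained, not the library's own classes or format.

```ruby
# Hypothetical stand-in for ApacheLogRegex::ParseError.
class SketchParseError < StandardError; end

# Toy parser for lines of the form "<host> <status>" (an invented format).
class SketchParser
  PATTERN = /\A(\S+) (\d{3})\z/

  # Returns a hash on success, nil on mismatch.
  def parse(line)
    match = PATTERN.match(line.to_s.strip)
    match && { "%h" => match[1], "%>s" => match[2] }
  end

  # Same as parse, but raises instead of returning nil.
  def parse!(line)
    parse(line) || raise(SketchParseError, "Invalid line `#{line}`")
  end
end

parser = SketchParser.new
lines  = ["10.0.0.7 200", "not a log line", "10.0.0.8 404"]

rows = []
rejected = []
lines.each do |line|
  begin
    rows << parser.parse!(line)
  rescue SketchParseError => e
    rejected << e.message      # keep the reason so bad lines can be reported
  end
end

puts rows.size
puts rejected.inspect
```

The raising variant suits cases where a malformed line should be reported or should abort processing, while the nil-returning variant is convenient when bad lines are simply skipped.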