File: README — Documentation for iGEL-ua

About ua_parser

ua_parser will become a Ruby gem that identifies user agents such as browsers and crawlers from the user agent string they send. I plan to extract as much of the available information as possible, such as the GUI language of the browser or the e-mail address provided by a bot.
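To illustrate the idea, here is a minimal sketch of regexp-based identification. It is not the ua_parser API: the names UA_PATTERNS, LANGUAGE, EMAIL and sketch_identify are made up for this example, and the patterns are simplified assumptions that cover only a few agents.

```ruby
# Hypothetical sketch, not the actual ua_parser code.
# Order matters: Opera and Chrome are tested before the agents they imitate.
UA_PATTERNS = [
  [:opera,   /Opera[\/ ](\d+\.\d+)/],
  [:chrome,  /Chrome\/(\d+\.\d+)/],
  [:firefox, /Firefox\/(\d+\.\d+)/],
  [:msie,    /MSIE (\d+\.\d+)/]
].freeze

# A two-letter language code (optionally with a region) standing alone
# between separators, e.g. "; de;" or "; en-US)".
LANGUAGE = /[;(]\s*([a-z]{2}(?:-[A-Za-z]{2})?)\s*[;)]/

# A deliberately loose e-mail pattern; real addresses can be more complex.
EMAIL = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/

def sketch_identify(ua)
  name = nil
  version = nil
  UA_PATTERNS.each do |candidate, pattern|
    match = pattern.match(ua)
    next unless match
    name = candidate
    version = match[1]
    break
  end
  { name: name, version: version, language: ua[LANGUAGE, 1], email: ua[EMAIL] }
end

ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 9.50"
p sketch_identify(ua)
```

Because Opera is tested first, the string above is reported as Opera even though it also contains an MSIE token.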

I tried to identify the most common user agents first, to reduce the number of regexps needed for them. But I guess it could still be improved a lot. Of course I’d like to get feedback. Even if you just want to correct my crappy English, send me an e-mail. ;-)

Project status (as of 2009-01-25):

Right now, the project is at a very early stage. In my sample of 14 million hits, ua_parser can identify about 96.5% of all hits.

I tried to cover as much as possible with tests. At the moment, I have 99 tests implemented.

Known browsers:

Chrome
Firefox and most other Gecko-based browsers
Internet Explorer
Opera, both as itself and when pretending to be Internet Explorer or Firefox
Safari, version 3 and later

Known bots:

Baidubot
gigabot
gonzo (of suchen.de)
Googlebot, Googlebot-Images, Mediapartners-Google
mj12bot
msnbot and msnbot-media
seekbot
speedy spider
twiceler (of cuil.com)
Yahoo! Slurp
yeti (of naver.com)

Other known agents:

Apache httpd
Jakarta Commons httpclient
Java
libwww-perl
SVN
TortoiseSVN
veoh service

Also, ua_parser tries to identify bots and feed readers even if it doesn’t know about them. That way, the share of hits that can at least be classified should be close to 100%.
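One way such a fallback could work is a keyword heuristic. The following is only a rough sketch under my own assumptions: the hint lists and the name sketch_guess_kind are hypothetical, and the real rules in ua_parser may look quite different.

```ruby
# Hypothetical fallback heuristic, not the actual ua_parser rules.
# Assumption: unknown bots often name themselves "bot", "crawler" or
# "spider", or include a URL pointing to an info page.
BOT_HINT  = /bot|crawl|spider|https?:\/\//i
# Assumption: unknown feed readers usually mention "feed" or "rss".
FEED_HINT = /feed|rss/i

def sketch_guess_kind(ua)
  # Feed readers are checked first, since they often include a URL too.
  return :feedreader if ua =~ FEED_HINT
  return :bot        if ua =~ BOT_HINT
  :unknown
end

p sketch_guess_kind("ExampleCrawler/0.1 (+http://example.com/crawler)")
p sketch_guess_kind("ExampleFeedFetcher/2.0")
```

Substring hints like these will produce false positives, so they only make sense as a last step, after all known agents have been ruled out.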