About ua_parser

ua_parser will become a ruby gem to identify user agents like browsers or crawlers by the provided user agent string. I’m planning try to get most of the available information like GUI language of the browser or email addresses provided by a bot out of it.

I tried to identify common user agents first, reducing the necessary regexps for them. But I guess, it could be improved alot. Of course I’d like to get feedback. Even if you just revise my crappy English, send me an e-mail. ;-)

Project status (as of 2009-01-25):

Right know, the project is at a very early state. Of my 14 million hits sample, ua_parser can identify about 96.5 % of all hits.

I tried to cover as much as possible with tests. At the moment, I have 99 tests implemented.

Known browsers:

  • Chrome

  • Firefox and most other gecko based browsers

  • Internet Explorer

  • Opera, pure and pretending to be an Internet Explorer or Firefox

  • Safari >= Version 3

Known bots:

  • Baidubot

  • gigabot

  • gonzo (of suchen.de)

  • Googlebot, Googlebot-Images, Mediapartners-Google

  • mj12bot

  • msnbot and msnbot-media

  • seekbot

  • speedy spider

  • twiceler (of cuil.com)

  • Yahoo! Slurp

  • yeti (of naver.com)

Other known agents

  • Apache httpd

  • Jakarta Commons httpclient

  • Java

  • libwww-perl

  • SVN

  • TortoiseSVN

  • veoh service

Also, ua_parser tries to identify bots and feedreader, even if it doesn’t know about them. That way, the results should be close to 100%.