Module: Geocoder::US

Defined in:
lib/geocoder/us.rb,
lib/geocoder/us/numbers.rb,
lib/geocoder/us/database.rb,
lib/geocoder/us/constants.rb,
lib/geocoder/us/address.rb

Overview

Loads the Geocoder::US::Database and Geocoder::US::Address classes.

General usage is as follows:

>> require 'geocoder/us'
>> db = Geocoder::US::Database.new("/opt/tiger/geocoder.db")
>> p db.geocode("1600 Pennsylvania Av, Washington DC")

[{:pretyp=>"", :street=>"Pennsylvania", :sufdir=>"NW", :zip=>"20502",
  :lon=>-77.037528, :number=>"1600", :fips_county=>"11001", :predir=>"",
  :precision=>:range, :city=>"Washington", :lat=>38.898746, :suftyp=>"Ave",
  :state=>"DC", :prequal=>"", :sufqual=>"", :score=>0.906, :prenum=>""}]

See Geocoder::US::Database and README.txt for more details.
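
As a rough illustration of consuming that return value, the sketch below picks the highest-scoring candidate and formats it as a single line. It works on a literal copy of the sample output above, so no database is needed. The helper name format_result and the order in which the street components are joined are assumptions made for this example, not part of the library.

```ruby
# Sample result copied from the output shown above; no database is required.
results = [
  { :pretyp => "", :street => "Pennsylvania", :sufdir => "NW", :zip => "20502",
    :lon => -77.037528, :number => "1600", :fips_county => "11001", :predir => "",
    :precision => :range, :city => "Washington", :lat => 38.898746, :suftyp => "Ave",
    :state => "DC", :prequal => "", :sufqual => "", :score => 0.906, :prenum => "" }
]

# Hypothetical helper: join the non-empty street components into one line.
# The component order used here is an assumption for illustration.
def format_result(r)
  street_line = [r[:number], r[:predir], r[:pretyp], r[:street],
                 r[:suftyp], r[:sufdir]].reject { |part| part.to_s.empty? }.join(" ")
  "#{street_line}, #{r[:city]}, #{r[:state]} #{r[:zip]}"
end

# Pick the candidate with the highest :score.
best = results.max_by { |r| r[:score] }

puts format_result(best)   # 1600 Pennsylvania Ave NW, Washington, DC 20502
puts "#{best[:lat]}, #{best[:lon]}"
```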

Defined Under Namespace

Classes: Address, Database, Import, Map, NumberMap

Constant Summary

Cardinals =

The Cardinals constant maps digits to cardinal number words and back.

NumberMap[%w[
  zero one two three four five six seven eight nine ten
  eleven twelve thirteen fourteen fifteen sixteen seventeen
  eighteen nineteen
]]
Cardinal_Tens =
%w[ twenty thirty forty fifty sixty seventy eighty ninety ]
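
To make the "digits to words and back" idea concrete, here is a minimal sketch of a two-way lookup combined with the tens words to spell out 0 through 99. The class TwoWayMap and the method cardinal_word are hypothetical stand-ins written for this example. They are not the library's NumberMap, whose actual interface may differ.

```ruby
# Hypothetical two-way lookup; NOT the library's NumberMap implementation.
class TwoWayMap
  def initialize(words)
    @to_word = words                       # index  -> word
    @to_number = {}                        # word   -> index
    words.each_with_index { |word, i| @to_number[word] = i }
  end

  # Integer keys return the word; anything else is looked up as a word.
  def [](key)
    key.is_a?(Integer) ? @to_word[key] : @to_number[key.to_s.downcase]
  end
end

cardinals = TwoWayMap.new(%w[
  zero one two three four five six seven eight nine ten
  eleven twelve thirteen fourteen fifteen sixteen seventeen
  eighteen nineteen
])
cardinal_tens = %w[ twenty thirty forty fifty sixty seventy eighty ninety ]

# Hypothetical helper: spell out an integer from 0 to 99.
def cardinal_word(n, cardinals, tens)
  raise ArgumentError, "expected 0..99" unless (0..99).cover?(n)
  return cardinals[n] if n < 20
  word = tens[n / 10 - 2]                  # 20 -> index 0, 30 -> index 1, ...
  (n % 10).zero? ? word : "#{word}-#{cardinals[n % 10]}"
end

puts cardinals[7]                                   # seven
puts cardinals["Twelve"]                            # 12
puts cardinal_word(42, cardinals, cardinal_tens)    # forty-two
```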
Ordinals =

The Ordinals constant maps digits to ordinal number words and back.

NumberMap[%w[
  zeroth first second third fourth fifth sixth seventh eighth ninth
  tenth eleventh twelfth thirteenth fourteenth fifteenth sixteenth
  seventeenth eighteenth nineteenth
]]
Directional =

The Directional constant maps compass direction words in English and Spanish to their one- or two-letter abbreviations. See 2008 TIGER/Line technical documentation Appendix C for more details.

Prefix_Qualifier =

The Prefix_Qualifier constant maps feature prefix qualifiers to their abbreviations. See 2008 TIGER/Line technical documentation Appendix D.

Suffix_Qualifier =

The Suffix_Qualifier constant maps feature suffix qualifiers to their abbreviations. See 2008 TIGER/Line technical documentation Appendix D.

Prefix_Canonical =

The Prefix_Canonical constant maps canonical TIGER/Line street type prefixes to their abbreviations. This list is the subset of the list from 2008 TIGER/Line technical documentation Appendix E that was extracted from a TIGER/Line database import.

{
  "Arcade"                            => "Arc",
  "Autopista"                         => "Autopista",
  "Avenida"                           => "Ave",
  "Avenue"                            => "Ave",
  "Boulevard"                         => "Blvd",
  "Bulevar"                           => "Bulevar",
  "Bureau of Indian Affairs Highway"  => "BIA Hwy",
  "Bureau of Indian Affairs Road"     => "BIA Rd",
  "Bureau of Indian Affairs Route"    => "BIA Rte",
  "Bureau of Land Management Road"    => "BLM Rd",
  "Bypass"                            => "Byp",
  "Calle"                             => "Cll",
  "Calleja"                           => "Calleja",
  "Callejón"                          => "Callejón",
  "Caminito"                          => "Cmt",
  "Camino"                            => "Cam",
  "Carretera"                         => "Carr",
  "Cerrada"                           => "Cer",
  "Círculo"                           => "Cír",
  "Commons"                           => "Cmns",
  "Corte"                             => "Corte",
  "County Highway"                    => "Co Hwy",
  "County Lane"                       => "Co Ln",
  "County Road"                       => "Co Rd",
  "County Route"                      => "Co Rte",
  "County State Aid Highway"          => "Co St Aid Hwy",
  "County Trunk Highway"              => "Co Trunk Hwy",
  "County Trunk Road"                 => "Co Trunk Rd",
  "Court"                             => "Ct",
  "Delta Road"                        => "Delta Rd",
  "District of Columbia Highway"      => "DC Hwy",
  "Driveway"                          => "Driveway",
  "Entrada"                           => "Ent",
  "Expreso"                           => "Expreso",
  "Expressway"                        => "Expy",
  "Farm Road"                         => "Farm Rd",
  "Farm-to-Market Road"               => "FM",
  "Fire Control Road"                 => "Fire Cntrl Rd",
  "Fire District Road"                => "Fire Dist Rd",
  "Fire Lane"                         => "Fire Ln",
  "Fire Road"                         => "Fire Rd",
  "Fire Route"                        => "Fire Rte",
  "Fire Trail"                        => "Fire Trl",
  "Forest Highway"                    => "Forest Hwy",
  "Forest Road"                       => "Forest Rd",
  "Forest Route"                      => "Forest Rte",
  "Forest Service Road"               => "FS Rd",
  "Highway"                           => "Hwy",
  "Indian Route"                      => "Indian Rte",
  "Indian Service Route"              => "Indian Svc Rte",
  "Interstate Highway"                => "I-",
  "Lane"                              => "Ln",
  "Logging Road"                      => "Logging Rd",
  "Loop"                              => "Loop",
  "National Forest Development Road"  => "Nat For Dev Rd",
  "Navajo Service Route"              => "Navajo Svc Rte",
  "Parish Road"                       => "Parish Rd",
  "Pasaje"                            => "Pasaje",
  "Paseo"                             => "Pso",
  "Passage"                           => "Psge",
  "Placita"                           => "Pla",
  "Plaza"                             => "Plz",
  "Point"                             => "Pt",
  "Puente"                            => "Puente",
  "Ranch Road"                        => "Ranch Rd",
  "Ranch to Market Road"              => "RM",
  "Reservation Highway"               => "Resvn Hwy",
  "Road"                              => "Rd",
  "Route"                             => "Rte",
  "Row"                               => "Row",
  "Rue"                               => "Rue",
  "Ruta"                              => "Ruta",
  "Sector"                            => "Sec",
  "Sendero"                           => "Sendero",
  "Service Road"                      => "Svc Rd",
  "Skyway"                            => "Skwy",
  "Square"                            => "Sq",
  "State Forest Service Road"         => "St FS Rd",
  "State Highway"                     => "State Hwy",
  "State Loop"                        => "State Loop",
  "State Road"                        => "State Rd",
  "State Route"                       => "State Rte",
  "State Spur"                        => "State Spur",
  "State Trunk Highway"               => "St Trunk Hwy",
  "Terrace"                           => "Ter",
  "Town Highway"                      => "Town Hwy",
  "Town Road"                         => "Town Rd",
  "Township Highway"                  => "Twp Hwy",
  "Township Road"                     => "Twp Rd",
  "Trail"                             => "Trl",
  "Tribal Road"                       => "Tribal Rd",
  "Tunnel"                            => "Tunl",
  "US Forest Service Highway"         => "USFS Hwy",
  "US Forest Service Road"            => "USFS Rd",
  "US Highway"                        => "US Hwy",
  "US Route"                          => "US Rte",
  "Vereda"                            => "Ver",
  "Via"                               => "Via",
  "Vista"                             => "Vis",
}
Prefix_Alternate =

The Prefix_Alternate constant maps alternate prefix street types to their canonical abbreviations. This list was merged in from the USPS list at www.usps.com/ncsc/lookups/abbr_suffix.txt.

{
  "Av"			=> "Ave",
  "Aven"			=> "Ave",
  "Avenu"			=> "Ave",
  "Avenue"			=> "Ave",
  "Avn"			=> "Ave",
  "Avnue"			=> "Ave",
  "Boul"			=> "Blvd",
  "Boulv"			=> "Blvd",
  "Bypa"			=> "Byp",
  "Bypas"			=> "Byp",
  "Byps"			=> "Byp",
  "Crt"			=> "Ct",
  "Exp"			=> "Expy",
  "Expr"			=> "Expy",
  "Express"			=> "Expy",
  "Expw"			=> "Expy",
  "Highwy"			=> "Hwy",
  "Hiway"			=> "Hwy",
  "Hiwy"			=> "Hwy",
  "Hway"			=> "Hwy",
  "La"			=> "Ln",
  "Lanes"			=> "Ln",
  "Loops"			=> "Loop",
  "Plza"			=> "Plz",
  "Sqr"			=> "Sq",
  "Sqre"			=> "Sq",
  "Squ"			=> "Sq",
  "Terr"			=> "Ter",
  "Tr"			=> "Trl",
  "Trails"			=> "Trl",
  "Trls"			=> "Trl",
  "Tunel"			=> "Tunl",
  "Tunls"			=> "Tunl",
  "Tunnels"			=> "Tunl",
  "Tunnl"			=> "Tunl",
  "Vdct"			=> "Via",
  "Viadct"			=> "Via",
  "Viaduct"			=> "Via",
  "Vist"			=> "Vis",
  "Vst"			=> "Vis",
  "Vsta"			=> "Vis"
}
Prefix_Type =

The Prefix_Type constant merges the canonical prefix type abbreviations with their USPS-accepted alternates.
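
The merge can be pictured with plain hashes. The sketch below uses a few entries copied from Prefix_Canonical and Prefix_Alternate above and combines them with Hash#merge. The local variable names are made up for this example, and the library may build Prefix_Type differently (for instance with its own Map class).

```ruby
# A few entries copied from Prefix_Canonical and Prefix_Alternate above.
prefix_canonical = { "Avenue" => "Ave", "Boulevard" => "Blvd", "Highway" => "Hwy" }
prefix_alternate = { "Av" => "Ave", "Boul" => "Blvd", "Hiway" => "Hwy" }

# One lookup table covering canonical names and accepted alternates.
# This is an illustrative merge, not necessarily how the library does it.
prefix_type = prefix_canonical.merge(prefix_alternate)

puts prefix_type["Highway"]   # Hwy
puts prefix_type["Hiway"]     # Hwy
puts prefix_type.size         # 6
```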

Suffix_Canonical =

The Suffix_Canonical constant maps canonical TIGER/Line street type suffixes to their abbreviations. This list is the subset of the list from 2008 TIGER/Line technical documentation Appendix E that was extracted from a TIGER/Line database import.

{
  "Alley"                             => "Aly",
  "Arcade"                            => "Arc",
  "Avenida"                           => "Ave",
  "Avenue"                            => "Ave",
  "Beltway"                           => "Beltway",
  "Boulevard"                         => "Blvd",
  "Bridge"                            => "Brg",
  "Bypass"                            => "Byp",
  "Causeway"                          => "Cswy",
  "Circle"                            => "Cir",
  "Common"                            => "Cmn",
  "Commons"                           => "Cmns",
  "Corners"                           => "Cors",
  "Court"                             => "Ct",
  "Courts"                            => "Cts",
  "Crescent"                          => "Cres",
  "Crest"                             => "Crst",
  "Crossing"                          => "Xing",
  "Cutoff"                            => "Cutoff",
  "Drive"                             => "Dr",
  "Driveway"                          => "Driveway",
  "Esplanade"                         => "Esplanade",
  "Estates"                           => "Ests",
  "Expressway"                        => "Expy",
  "Forest Highway"                    => "Forest Hwy",
  "Fork"                              => "Frk",
  "Four-Wheel Drive Trail"            => "4WD Trl",
  "Freeway"                           => "Fwy",
  "Grade"                             => "Grade",
  "Heights"                           => "Hts",
  "Highway"                           => "Hwy",
  "Jeep Trail"                        => "Jeep Trl",
  "Landing"                           => "Lndg",
  "Lane"                              => "Ln",
  "Logging Road"                      => "Logging Rd",
  "Loop"                              => "Loop",
  "Motorway"                          => "Mtwy",
  "Oval"                              => "Oval",
  "Overpass"                          => "Opas",
  "Parkway"                           => "Pkwy",
  "Pass"                              => "Pass",
  "Passage"                           => "Psge",
  "Path"                              => "Path",
  "Pike"                              => "Pike",
  "Place"                             => "Pl",
  "Plaza"                             => "Plz",
  "Point"                             => "Pt",
  "Pointe"                            => "Pointe",
  "Promenade"                         => "Promenade",
  "Railroad"                          => "RR",
  "Railway"                           => "Rlwy",
  "Ramp"                              => "Ramp",
  "River"                             => "Riv",
  "Road"                              => "Rd",
  "Roadway"                           => "Roadway",
  "Route"                             => "Rte",
  "Row"                               => "Row",
  "Rue"                               => "Rue",
  "Service Road"                      => "Svc Rd",
  "Skyway"                            => "Skwy",
  "Spur"                              => "Spur",
  "Square"                            => "Sq",
  "Stravenue"                         => "Stra",
  "Street"                            => "St",
  "Strip"                             => "Strip",
  "Terrace"                           => "Ter",
  "Thoroughfare"                      => "Thoroughfare",
  "Tollway"                           => "Tollway",
  "Trace"                             => "Trce",
  "Trafficway"                        => "Trfy",
  "Trail"                             => "Trl",
  "Trolley"                           => "Trolley",
  "Truck Trail"                       => "Truck Trl",
  "Tunnel"                            => "Tunl",
  "Turnpike"                          => "Tpke",
  "Viaduct"                           => "Viaduct",
  "View"                              => "Vw",
  "Vista"                             => "Vis",
  "Walk"                              => "Walk",
  "Walkway"                           => "Walkway",
  "Way"                               => "Way",
}
Suffix_Alternate =

The Suffix_Alternate constant maps alternate suffix street types to their canonical abbreviations. This list was merged in from the USPS list at www.usps.com/ncsc/lookups/abbr_suffix.txt.

{
  "Allee"			=> "Aly",
  "Ally"			=> "Aly",
  "Av"			=> "Ave",
  "Aven"			=> "Ave",
  "Avenu"			=> "Ave",
  "Avenue"			=> "Ave",
  "Avn"			=> "Ave",
  "Avnue"			=> "Ave",
  "Boul"			=> "Blvd",
  "Boulv"			=> "Blvd",
  "Brdge"			=> "Brg",
  "Bypa"			=> "Byp",
  "Bypas"			=> "Byp",
  "Byps"			=> "Byp",
  "Causway"			=> "Cswy",
  "Circ"			=> "Cir",
  "Circl"			=> "Cir",
  "Crcl"			=> "Cir",
  "Crcle"			=> "Cir",
  "Crecent"			=> "Cres",
  "Cresent"			=> "Cres",
  "Crscnt"			=> "Cres",
  "Crsent"			=> "Cres",
  "Crsnt"			=> "Cres",
  "Crssing"			=> "Xing",
  "Crssng"			=> "Xing",
  "Crt"			=> "Ct",
  "Driv"			=> "Dr",
  "Drv"			=> "Dr",
  "Exp"			=> "Expy",
  "Expr"			=> "Expy",
  "Express"			=> "Expy",
  "Expw"			=> "Expy",
  "Freewy"			=> "Fwy",
  "Frway"			=> "Fwy",
  "Frwy"			=> "Fwy",
  "Height"			=> "Hts",
  "Hgts"			=> "Hts",
  "Highwy"			=> "Hwy",
  "Hiway"			=> "Hwy",
  "Hiwy"			=> "Hwy",
  "Ht"			=> "Hts",
  "Hway"			=> "Hwy",
  "La"			=> "Ln",
  "Lanes"			=> "Ln",
  "Lndng"			=> "Lndg",
  "Loops"			=> "Loop",
  "Ovl"			=> "Oval",
  "Parkways"			=> "Pkwy",
  "Parkwy"			=> "Pkwy",
  "Paths"			=> "Path",
  "Pikes"			=> "Pike",
  "Pkway"			=> "Pkwy",
  "Pkwys"			=> "Pkwy",
  "Pky"			=> "Pkwy",
  "Plza"			=> "Plz",
  "Rivr"			=> "Riv",
  "Rvr"			=> "Riv",
  "Spurs"			=> "Spur",
  "Sqr"			=> "Sq",
  "Sqre"			=> "Sq",
  "Squ"			=> "Sq",
  "Str"			=> "St",
  "Strav"			=> "Stra",
  "Strave"			=> "Stra",
  "Straven"			=> "Stra",
  "Stravn"			=> "Stra",
  "Strt"			=> "St",
  "Strvn"			=> "Stra",
  "Strvnue"			=> "Stra",
  "Terr"			=> "Ter",
  "Tpk"			=> "Tpke",
  "Tr"			=> "Trl",
  "Traces"			=> "Trce",
  "Trails"			=> "Trl",
  "Trls"			=> "Trl",
  "Trnpk"			=> "Tpke",
  "Trpk"			=> "Tpke",
  "Tunel"			=> "Tunl",
  "Tunls"			=> "Tunl",
  "Tunnels"			=> "Tunl",
  "Tunnl"			=> "Tunl",
  "Turnpk"			=> "Tpke",
  "Vist"			=> "Vis",
  "Vst"			=> "Vis",
  "Vsta"			=> "Vis",
  "Walks"			=> "Walk",
  "Wy"			=> "Way",
}
Suffix_Type =

The Suffix_Type constant merges the canonical suffix type abbreviations with their USPS-accepted alternates.
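
One way such a merged table could be put to work is to build a single regular expression from its keys and use it to normalize the trailing street type of an input string. The sketch below does this for a handful of entries copied from Suffix_Canonical and Suffix_Alternate. SUFFIX_SAMPLE, SUFFIX_LOOKUP, SUFFIX_RE and normalize_suffix are hypothetical names for this example only, and nothing here is claimed to match the library's own parsing.

```ruby
# A few entries copied from Suffix_Canonical and Suffix_Alternate above.
SUFFIX_SAMPLE = {
  "Street" => "St", "Avenue" => "Ave", "Parkway" => "Pkwy",
  "Str" => "St", "Av" => "Ave", "Pky" => "Pkwy"
}

# Lowercased keys so that lookups ignore the case of the input.
SUFFIX_LOOKUP = SUFFIX_SAMPLE.map { |word, abbr| [word.downcase, abbr] }.to_h

# Alternation of all keys, longest first, anchored at the end of the string.
alternatives = SUFFIX_SAMPLE.keys.sort_by { |k| -k.length }
                            .map { |k| Regexp.escape(k) }.join("|")
SUFFIX_RE = /\b(#{alternatives})\.?\s*\z/i

# Hypothetical helper: replace a trailing street type with its abbreviation.
def normalize_suffix(street)
  street.sub(SUFFIX_RE) { SUFFIX_LOOKUP.fetch($1.downcase) }
end

puts normalize_suffix("Main Street")       # Main St
puts normalize_suffix("Pennsylvania av")   # Pennsylvania Ave
puts normalize_suffix("Broadway")          # Broadway
```

Sorting the keys longest-first keeps the alternation from preferring a short alternate when a longer name starts with the same letters.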

Unit_Type =

The Unit_Type constant lists acceptable USPS unit type abbreviations from www.usps.com/ncsc/lookups/abbr_sud.txt.

Std_Abbr =
Name_Abbr =

The Name_Abbr constant maps common toponym abbreviations to their full word equivalents. This list was constructed partly by hand, and partly by matching USPS alternate abbreviations with feature names found in the TIGER/Line dataset.

State =

The State constant maps US state and territory names to their 2-letter USPS abbreviations.

Match =

The Match constant defines the regular expressions used to match address tokens during parsing.

{
  # FIXME: shouldn't have to anchor :number and :zip at start/end
  :number   => /^(\d+\W|[a-z]+)?(\d+)([a-z]?)\b/io,
  :street   => /(?:\b(?:\d+\w*|[a-z'-]+)\s*)+/io,
  :city     => /(?:\b[a-z'-]+\s*)+/io,
  :state    => Regexp.new(State.regexp.source + "\s*$", Regexp::IGNORECASE),
  :zip      => /(\d{5})(?:-\d{4})?\s*$/o,
  :at       => /\s(at|@|and|&)\s/io,
  :po_box   => /\b[P|p]*(OST|ost)*\.*\s*[O|o|0]*(ffice|FFICE)*\.*\s*[B|b][O|o|0][X|x]\b/
}
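
To see what some of these patterns capture, the sketch below copies the :number, :zip and :at expressions into a local hash and applies them to sample strings. The local variable names and sample inputs are chosen for this example. How the library itself sequences these patterns during parsing is not shown here.

```ruby
# Patterns copied verbatim from the Match constant above.
match = {
  :number => /^(\d+\W|[a-z]+)?(\d+)([a-z]?)\b/io,
  :zip    => /(\d{5})(?:-\d{4})?\s*$/o,
  :at     => /\s(at|@|and|&)\s/io
}

address = "1600 Pennsylvania Av, Washington DC 20502-0001"

# String#[] with a capture index returns just that group.
number = address[match[:number], 2]   # house number, second capture group
zip    = address[match[:zip], 1]      # five-digit ZIP without the +4 part

puts number   # 1600
puts zip      # 20502

# The :at pattern detects intersections such as "X and Y".
intersection = "Mission St and Valencia St, San Francisco CA"
m = match[:at].match(intersection)
puts m.pre_match    # Mission St
puts m.post_match   # Valencia St, San Francisco CA
```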
VERSION =
"2.0.0"