Geo::StreetAddress::US--3pm

US(3pm)               User Contributed Perl Documentation              US(3pm)



NAME
       Geo::StreetAddress::US - Perl extension for parsing US street addresses

SYNOPSIS
         use Geo::StreetAddress::US;

         my $location = Geo::StreetAddress::US->parse_location(
                       "1005 Gravenstein Hwy N, Sebastopol CA 95472" );

         my $corner = Geo::StreetAddress::US->parse_location(
                       "Hollywood & Vine, Los Angeles, CA" );

         my $address = Geo::StreetAddress::US->parse_address(
                       "1600 Pennsylvania Ave, Washington, DC" );

         my $intersection = Geo::StreetAddress::US->parse_intersection(
                       "Mission Street at Valencia Street, San Francisco, CA" );

         my $normal = Geo::StreetAddress::US->normalize_address( \%spec );
             # %spec is an address or intersection specifier hash;
             # the parse_* methods call this automatically...

DESCRIPTION
       Geo::StreetAddress::US is a regex-based street address and street
       intersection parser for the United States. Its basic goal is to be as
       forgiving as possible when parsing user-provided address strings.
       Geo::StreetAddress::US knows about directional prefixes and suffixes,
       fractional building numbers, building units, grid-based addresses (such
       as those used in parts of Utah), 5 and 9 digit ZIP codes, and all of
       the official USPS abbreviations for street types and state names.

RETURN VALUES
       Most Geo::StreetAddress::US methods return a reference to a hash
       containing address or intersection information; normalize_address()
       also takes one as its argument. This "address specifier" hash may
       contain any of the following fields for a given address. If a given
       field is not present in the address, the corresponding key will be set
       to "undef" in the hash.

       ADDRESS SPECIFIER


       number
           House or street number.

       prefix
           Directional prefix for the street, such as N, NE, E, etc.  A given
           prefix should be one to two characters long.

       street
           Name of the street, without directional or type qualifiers.

       type
           Abbreviated street type, e.g. Rd, St, Ave, etc. See the USPS
           official type abbreviations at
           <http://www.usps.com/ncsc/lookups/abbr_suffix.txt> for a list of
           abbreviations used.

       suffix
           Directional suffix for the street, as above.

       city
           Name of the city, town, or other locale that the address is situ-
           ated in.

       state
           The state which the address is situated in, given as its two-letter
           postal abbreviation. See
           <http://www.usps.com/ncsc/lookups/abbr_state.txt> for a list of
           abbreviations used.

       zip Five digit ZIP postal code for the address, including leading zero,
           if needed.
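
       As a minimal sketch of how these fields fit together, the helper
       format_address() below joins the defined fields of an address
       specifier back into a single line. The helper is hypothetical and not
       part of the module, and the hash is written by hand for illustration
       rather than captured from the parser.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Geo::StreetAddress::US: rebuilds a
# one-line address from the defined fields of an address specifier.
sub format_address {
    my ($spec) = @_;

    # Street line: number, prefix, street, type, suffix, skipping undef.
    my $street = join ' ',
        grep { defined($_) && length($_) }
        @{$spec}{qw(number prefix street type suffix)};

    # Locale: "city, state", followed by the ZIP code after a space.
    my $place = join ', ',
        grep { defined($_) && length($_) } @{$spec}{qw(city state)};
    $place = join ' ',
        grep { defined($_) && length($_) } ( $place, $spec->{zip} );

    return join ', ', grep { length($_) } ( $street, $place );
}

# Hand-written specifier; absent fields are undef, as described above.
my %spec = (
    number => '1005', prefix => undef, street => 'Gravenstein',
    type   => 'Hwy',  suffix => 'N',   city   => 'Sebastopol',
    state  => 'CA',   zip    => '95472',
);

print format_address( \%spec ), "\n";
```

       Skipping undefined fields means the same helper works for addresses
       that lack a prefix, suffix, or ZIP code.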

       INTERSECTION SPECIFIER


       prefix1, prefix2
           Directional prefixes for the streets in question.

       street1, street2
           Names of the streets in question.

       type1, type2
           Street types for the streets in question.

       suffix1, suffix2
           Directional suffixes for the streets in question.

       city
           City or locale containing the intersection, as above.

       state
           State abbreviation, as above.

       zip Five digit ZIP code, as above.

GLOBAL VARIABLES
       Geo::StreetAddress::US contains a number of global variables which it
       uses to recognize different bits of US street addresses. Although you
       will probably not need them, they are documented here for complete-
       ness's sake.

       %Directional
           Maps directional names (north, northeast, etc.) to abbreviations
           (N, NE, etc.).

       %Direction_Code
           Maps directional abbreviations to directional names.

       %Street_Type
           Maps lowercased USPS standard street types to their canonical
           postal abbreviations as found in TIGER/Line.  See
           eg/get_street_abbrev.pl in the distribution for how this map was
           generated.

       %State_Code
           Maps lowercased US state and territory names to their canonical
           two-letter postal abbreviations. See eg/get_state_abbrev.pl in the
           distribution for how this map was generated.

       %State_FIPS
           Maps two-digit FIPS-55 US state and territory codes (including the
           leading zero!) as found in TIGER/Line to the state's canonical
           two-letter postal abbreviation. See eg/get_state_fips.pl in the
           distribution for how this map was generated. Yes, I know the FIPS
           data also has the state names. Oops.

       %Addr_Match
           A hash of compiled regular expressions corresponding to different
           types of address or address portions. Defined regexen include type,
           number, fraction, state, direct(ion), dircode, zip, corner, street,
           place, address, and intersection.
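
       The relationship between %Directional and %Direction_Code can be
       pictured as one hash being the reverse of the other. The sketch below
       uses a small hand-written stand-in, %directional, rather than the
       module's own data; the lowercase key spelling is an assumption.

```perl
use strict;
use warnings;

# Hand-written stand-in for a few %Directional entries (assumed spelling).
my %directional = (
    north     => 'N',
    northeast => 'NE',
    east      => 'E',
    south     => 'S',
);

# Reversing name => abbreviation gives abbreviation => name, which is the
# direction %Direction_Code is documented to map in.
my %direction_code = reverse %directional;

for my $abbr ( sort keys %direction_code ) {
    print "$abbr => $direction_code{$abbr}\n";
}
```

       If the module is installed, the real hashes should be reachable as
       package variables such as %Geo::StreetAddress::US::Directional,
       assuming they are declared as package globals.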

CLASS METHODS
       Geo::StreetAddress::US->parse_location( $string )
           Parses any address or intersection string and returns the
           appropriate specifier, by calling parse_intersection() or
           parse_address() as needed.
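
           Because parse_location() may hand back either kind of specifier,
           a caller usually needs to tell them apart. A minimal sketch,
           assuming only the documented key names: the helper
           specifier_kind() is hypothetical and not part of the module, and
           the first three calls use hand-written hashes in place of parser
           output.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: reports which kind of
# specifier a parse_*() call returned, based on the documented keys.
sub specifier_kind {
    my ($spec) = @_;
    return 'unparsed'     unless defined $spec;
    return 'intersection' if exists $spec->{street1};
    return 'address'      if exists $spec->{street};
    return 'unknown';
}

# Hand-written specifiers standing in for parser output.
print specifier_kind( { street1 => 'Hollywood', street2 => 'Vine' } ), "\n";
print specifier_kind( { number => '1600', street => 'Pennsylvania' } ), "\n";
print specifier_kind(undef), "\n";

# With the module installed, the same check applies to real results.
if ( eval { require Geo::StreetAddress::US; 1 } ) {
    my $spec = Geo::StreetAddress::US->parse_location(
        "Hollywood & Vine, Los Angeles, CA" );
    print specifier_kind($spec), "\n";
}
```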

       Geo::StreetAddress::US->parse_address( $address_string )
           Parses a street address into an address specifier, returning
           undef if the address cannot be parsed. You probably want to use
           parse_location() instead.

       Geo::StreetAddress::US->parse_intersection( $intersection_string )
           Parses an intersection string into an intersection specifier,
           returning undef if the intersection cannot be parsed. You
           probably want to use parse_location() instead.

       Geo::StreetAddress::US->normalize_address( $spec )
           Takes an address or intersection specifier, and normalizes its
           components, stripping out all leading and trailing whitespace and
           punctuation, and substituting official abbreviations for prefix,
           suffix, type, and state values. Also, city names that are
           prefixed with a directional abbreviation (e.g. N, NE, etc.) have
           the abbreviation expanded. The normalized specifier is returned.

           Typically, you won't need to use this method, as the "parse_*()"
           methods call it for you.

           N.B., "normalize_address()" crops 9-digit ZIP codes to 5 digits.
           This is for the benefit of Geo::Coder::US and may not be what you
           want. E-mail me if this is a problem and I'll see what I can do
           to fix it.
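
           If the last four digits matter, one possible workaround is to
           capture them from the raw string before parsing. This is a
           sketch only: extract_zip4() is a hypothetical helper, it handles
           just the hyphenated form at the end of the string, and the ZIP+4
           value shown is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: pulls a trailing hyphenated
# ZIP+4 out of the raw input so the caller can keep the last four digits.
# Returns an empty list when no ZIP+4 is found.
sub extract_zip4 {
    my ($string) = @_;
    return $string =~ /\b(\d{5})-(\d{4})\s*$/ ? ( $1, $2 ) : ();
}

my $input = "1005 Gravenstein Hwy N, Sebastopol CA 95472-1234";

# List assignment in boolean context is true only when something matched.
if ( my ( $zip, $plus4 ) = extract_zip4($input) ) {
    print "ZIP $zip, +4 $plus4\n";
}
```

           The captured digits can then be stored next to the specifier
           returned by the parser.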

BUGS, CAVEATS, MISCELLANY
       Geo::StreetAddress::US might not correctly parse house numbers that
       contain hyphens, such as those used in parts of Queens, New York. Also,
       some addresses in rural Michigan and Illinois may contain letter pre-
       fixes to the building number that may cause problems. Fixing these edge
       cases is on the to-do list, to be sure. Patches welcome!
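
       Until such cases are handled, a caller could flag suspicious input
       before trusting the parse. A minimal sketch, assuming the hyphenated
       number appears at the start of the string; has_hyphenated_number() is
       a hypothetical helper and the sample addresses are made up.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: true when the string starts
# with a hyphenated house number such as "37-10".
sub has_hyphenated_number {
    my ($string) = @_;
    return $string =~ /^\s*\d+-\d+\b/ ? 1 : 0;
}

for my $input ( "37-10 Example Ave, Queens, NY",
                "1005 Gravenstein Hwy N, Sebastopol CA 95472" ) {
    my $note = has_hyphenated_number($input) ? "check by hand" : "ok";
    print "$input: $note\n";
}
```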

       This software was originally part of Geo::Coder::US (q.v.) but was
       split apart into an independent module for your convenience. Therefore
       it has some behaviors which were designed for Geo::Coder::US, but which
       may not be right for your purposes. If this turns out to be the case,
       please let me know.

       Geo::StreetAddress::US does NOT perform USPS-certified address normal-
       ization.

SEE ALSO
       This software was originally part of Geo::Coder::US(3pm).

       Lingua::EN::AddressParse(3pm) and Geo::PostalAddress(3pm) both do some-
       thing very similar to Geo::StreetAddress::US, but are either too
       strict/limited in their address parsing, or not really specific enough
       in how they break down addresses (for my purposes). If you want USPS-
       style address standardization, try Scrape::USPS::ZipLookup(3pm). Be
       aware, however, that it scrapes a form on the USPS website in a way
       that may not be officially permitted and might break at any time. If
       this module does not do what you want, you might give the others a try.
       All three modules are available from the CPAN.

       You can see Geo::StreetAddress::US in action at <http://geocoder.us/>.

APPRECIATION
       Many thanks to Dave Rolsky for submitting a very useful patch to fix
       fractional house numbers, dotted directionals, and other kinds of edge
       cases, e.g. South St. He even submitted additional tests!

AUTHOR
       Schuyler D. Erle <schuyler AT geocoder.us>

COPYRIGHT AND LICENSE
       Copyright (C) 2005 by Schuyler D. Erle.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself, either Perl version 5.8.4 or, at
       your option, any later version of Perl 5 you may have available.



perl v5.8.7                       2005-05-17                           US(3pm)
 
