Understanding the Process
The next step is to geocode the data file against the locator file. Geocoding works by comparing each record in the data file (street address, city, state and zip code) to the locator file. Each record is processed independently. The locator contains thousands of records, so the search is first narrowed by zip code, city and state. Once the search is down to the street, the location of the individual address is estimated from where its number falls within the address range of a street segment, in proportion to the length of that segment. For example, if the address is 445 W. Maple St. in Louisville, KY, the street is looked up in the database for the appropriate state, city and zip code. The located street is broken into blocks or segments, such as the 400 block of W. Maple Street, the 500 block, the 600 block and so on. Suppose the 400 block has a length of 0.2 miles and an address range of 401 to 451, odd numbers only (by convention, odd numbers are on one side of the street and even numbers on the other). There are potentially 26 addresses on this side of the street (401, 403 and so on up to 451), and 445 is the 23rd address in that sequence.
Therefore the address is 23/26, or about 88%, of the way down the 0.2 mile segment, on the odd-numbered side of the street. Multiplying the segment length by this fraction places the point about 0.176 miles (0.88 × 0.2) down the street. Because the coordinates of the starting and ending locations of the segment are known, the x and y coordinates of the address can be determined from this distance.
Each point is also given a confidence score based on how well the address in the record matches the name in the locator file. A tolerance level (minimum confidence) is set for the process: the lower the tolerance level, the more matches will occur, but the quality of those matches will be diminished. Unmatched data points can also be matched manually by the user. Note: post office boxes cannot be coded accurately. In general, if 85% or more of the records geocode, a good sample has been obtained. New streets usually do not code properly because they are not yet in the locator database. Typographical errors are also a major source of error, as is putting apartment numbers in the address line. In general, rural areas are harder to code than urban areas.
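The W. Maple St. arithmetic can be checked with a short Python sketch. This is only an illustration of the interpolation described here, not the method used by any particular geocoding software. The Segment class, the locate function and the start and end coordinates are all hypothetical, and the coordinates are made-up planar values in miles chosen so the result is easy to read.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Segment:
    """One side of one block of a street (hypothetical layout, for illustration only)."""
    low: int                    # lowest house number on this side of the block
    high: int                   # highest house number on this side of the block
    length_miles: float         # length of the block
    start: Tuple[float, float]  # (x, y) at the low-number end (assumed values)
    end: Tuple[float, float]    # (x, y) at the high-number end (assumed values)


def locate(house_number: int, seg: Segment) -> dict:
    """Place a house number along a segment by its position in the address range."""
    if not seg.low <= house_number <= seg.high:
        raise ValueError("house number is outside this block's address range")
    if (house_number - seg.low) % 2 != 0:
        raise ValueError("house number is on the other side of the street")

    total = (seg.high - seg.low) // 2 + 1        # potential addresses on this side
    ordinal = (house_number - seg.low) // 2 + 1  # position of this address in the sequence
    fraction = ordinal / total                   # share of the block already travelled

    # Straight-line interpolation between the two known end points.
    x = seg.start[0] + fraction * (seg.end[0] - seg.start[0])
    y = seg.start[1] + fraction * (seg.end[1] - seg.start[1])

    return {
        "total": total,
        "ordinal": ordinal,
        "fraction": fraction,
        "distance_miles": fraction * seg.length_miles,
        "x": x,
        "y": y,
    }


# The 400 block of W. Maple St.: 401 to 451 (odd side), 0.2 miles long.
block_400 = Segment(low=401, high=451, length_miles=0.2,
                    start=(0.0, 0.0), end=(0.2, 0.0))
result = locate(445, block_400)  # 23rd of 26 addresses, about 88% of the way along
```

Without rounding, 23/26 of 0.2 miles is slightly more than the 0.176 miles obtained from the rounded 88%. The difference is a few feet, well within the uncertainty of this kind of estimate.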