Basic purpose: accessing the LODES data
OnTheMap (http://lehd.did.census.gov/led/datatools/onthemap.html) is a web-based, interactive mapping application released by the LEHD program at the U.S. Census Bureau. Its objective is to show on maps where people work and where workers live, with companion reports on their age, earnings, and industry distributions. The underlying data (LEHD Origin-Destination Employment Statistics, LODES) are public-use data available for access and download on the Cornell VirtualRDC (http://www.vrdc.cornell.edu/news/?page_id=4), an internet-accessible computing environment dedicated to the exploration and development of synthetic data.
What can be downloaded and accessed
Since 2005, the U.S. Census Bureau has released multiple versions of the LODES data underlying the OnTheMap (http://lehd.did.census.gov/led/datatools/onthemap.html) application. This site holds:
- OnTheMap 2.0 data files (archived offline)
- OnTheMap 3.0 data files
- OnTheMap 4.0 data files
- LODES 5.0 data files (hosted on the Census website)
What users should know about the data
The place of residence counts are generated from a synthetic data model that conditions on disclosure-proofed place of work counts and other observable characteristics. Each of the implicate files available for OD and the RAC represents an independent draw from the synthetic data model. Detailed information on the full OTM data and the synthetic data model can be found in the data documentation (also see updates for OTM v3).
The U.S. Census Bureau encourages use of the multiple implicates of the OTM data. LEHD Program research has found that three (3) implicates are usually sufficient to determine the extent to which the confidentiality protections affect the statistical results. Users who wish to explore the LODES data with additional implicates should contact LEHD directly.
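One simple way to gauge how much the confidentiality protections affect a result is to compute the same statistic on each implicate and then look at the average and the spread across implicates. The sketch below illustrates that idea only; the function name `combine_implicates` and the sample counts are made up for illustration and do not come from the OTM documentation. It reports the between-implicate variance, not a full variance estimate; the complete combining rules are given in the papers cited in this section.

```python
from statistics import mean, variance


def combine_implicates(estimates):
    """Summarize one statistic computed separately on each implicate.

    Returns (point_estimate, between_variance): the average of the
    per-implicate estimates and the sample variance across them.
    This is an illustrative helper, not part of any LEHD tooling.
    """
    if len(estimates) < 2:
        raise ValueError("need at least two implicates to measure spread")
    q_bar = mean(estimates)
    b = variance(estimates)  # sample variance, divisor m - 1
    return q_bar, b


# Made-up counts of resident workers in one area, one value per implicate.
counts = [120.0, 126.0, 123.0]
q_bar, b = combine_implicates(counts)
print(q_bar, b)  # 123.0 9.0
```

A between-implicate variance that is small relative to the point estimate suggests that the synthesis has little effect on that particular statistic.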
The base geography for version 3.0 of OnTheMap is TIGER 2006 Second Edition. An archival copy can be found here. Version 4.0 uses TIGER (TBD).
For further information on how to properly analyze multiply synthesized or imputed datasets, see:
- Raghunathan, Reiter, and Rubin (2003), "Multiple Imputation for Statistical Disclosure Limitation," Journal of Official Statistics, 19:1, pp. 1-16
- Reiter (2004), "New Approaches to Data Dissemination: A glimpse into the future (?)," Chance, 2004:17, pp. 12-16
- Abowd and Lane (2003), "Synthetic data and confidentiality protection," Technical paper TP-2003-10, LEHD, U.S. Census Bureau
Related user community mailing lists:
- led-qwi@lists.census.gov (QWI user community)
- lehd-ltd@lists.census.gov (Local Transportation Dynamics community under development)
- ctpp_news@chrispy.net (Census Transportation Planning Package community)
Technical requirements
Downloading data and analyzing it on your own computer
To analyze the data on their own computers, users need to supply their own statistical software and, depending on the analysis, a significant amount of memory. The data can be downloaded with a regular web browser from the OnTheMap Download Area (http://www.vrdc.cornell.edu/onthemap/data/). The programs are available on the VirtualRDC OTM website (http://www.vrdc.cornell.edu/onthemap/).
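When memory is the limiting factor, one option is to stream a downloaded file row by row instead of loading it whole. The sketch below assumes a comma-separated origin-destination file with a header row; the column names `w_geocode`, `h_geocode`, and `S000` and the sample rows are assumptions made for illustration, so check them against the documentation for the version you download.

```python
import csv
from collections import Counter


def jobs_by_workplace(lines, geo_col="w_geocode", count_col="S000"):
    """Sum job counts per workplace geography, one row at a time.

    `lines` is any iterable of CSV text lines (an open file works),
    so only the running totals are held in memory.
    The default column names are assumptions, not confirmed names.
    """
    totals = Counter()
    for row in csv.DictReader(lines):
        totals[row[geo_col]] += int(row[count_col])
    return totals


# Made-up rows standing in for a downloaded file.
sample = [
    "w_geocode,h_geocode,S000",
    "A,X,3",
    "A,Y,2",
    "B,X,4",
]
totals = jobs_by_workplace(sample)
print(totals["A"], totals["B"])  # 5 4
```

For a real file, pass an open file handle instead of the sample list, for example `with open(path, newline="") as f: totals = jobs_by_workplace(f)`, where `path` points to a file you have already downloaded and decompressed.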
Where to get help
For further information and assistance, contact the VirtualRDC administrators (mailto:virtualrdc@cornell.edu).
Funding and disclaimers
The VirtualRDC is not affiliated with the U.S. Census Bureau. All data made available at this facility are public-use data. The VirtualRDC is partially funded by NSF Grants #0427889 (http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0427889), #0339191 (http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0339191), and #9978093 (http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=9978093), and by donations from Novell (http://www.novell.com/linux/) and Intel (http://www.intel.com).