Econ 424: Computer Methods in Economics
Spring 2008
Session 1: Tuesday & Thursday 8:00am-9:15am
Session 2: Tuesday & Thursday 9:30am-10:45am
Plant Sciences 1129
Instructor:
Amy Knaup (Session 1)
Ginger Zhe Jin (Session 2)
All grades are available online.
Contact and office hours
In any case, the best way to reach us is via e-mail.
| | Session 1 | Session 2 |
|---|---|---|
| Instructor | Amy Knaup | Ginger Z. Jin |
| Office | 5110 Tydings | 3115 H Tydings |
| Office hours | Monday 11am-12pm | Wednesday 12:30-1:30pm |
| Phone | 301-405-3526 | 301-405-3484 |
| E-mail | knaup@econ.umd.edu | jin@econ.umd.edu |
| Web | | www.glue.umd.edu/~ginger/ |
Comments and suggestions for the class design are always welcome.
Goal
As a first step toward hi-tech economics, Econ 424 introduces the most basic data-handling techniques used in economic studies. The ultimate goal is three-fold. At the end of the semester:
- students should feel comfortable locating, collecting and analyzing real data;
- students should be able to read and interpret statistics generated by other people;
- given a real data set, students should be able to generate basic statistics and interpret them in a way that makes statistical and economic sense.
In order to fulfill this goal, all classes, including the mid-term and final, will meet in a computer lab and use two popular statistical software packages, Excel and SAS. In addition, students will learn how to use the World Wide Web and how to complete computer projects. Through hands-on experience, students are expected to master both packages at the introductory level and apply them to economic issues in the real world.
Prerequisites
Eligible students must major in Economics and have completed Econ 305 (Intermediate Macroeconomics Theory and Policy), Econ 306 (Intermediate Microeconomics Theory) and Econ 321 (Economic Statistics). We will devote a couple of classes to introducing Excel and SAS, so experience with either software is not required. However, if you need extra help getting started, please contact us as soon as possible.
Waiting List Policy
Because lab capacity is limited, each session can only accommodate 36 students. If you are number x on the waiting list, you won't get into the class unless x enrolled students drop the class during the semester.
Recommended Textbooks
#1 --- Malcolm Getz, "e.stat for Business and Economics" (CD-ROM), published by South-Western, a division of Thomson Learning. ISBN: 0-324-00895-3.
#2 --- Lora D. Delwiche and Susan J. Slaughter, "The Little SAS Book: A Primer," third edition (paperback). ISBN: 1-59047-333-7.
#3 --- Ron Cody & Ray Pass, "SAS Programming by Example", published by SAS Institute Inc. ISBN: 1-55533-681-7.
All books are available at Amazon. We won't use the SAS books until after the midterm.
Evaluation
Grades for the course will be based on the mid-term (30), the final (30) and six projects (40 in total).
Details are described below.
Class attendance
Hands-on teaching is much more effective than remote communication by email. If you miss a class, you can download the lecture notes or consult your classmates. If you still have questions after reading the lecture notes, you are welcome to contact us by email or in person. Please don't expect the instructor or the teaching assistant to re-lecture every point covered in the missed class.
The course involves one mid-term and one final (grading weights in parentheses):
(30) Mid-term (new schedule!): Session 1 (March 27, 8-9:15am), Session 2 (March 27, 9:30-10:45am), open book, on-line in Plant Sciences 1129 (the same room for every class meeting).
Old mid-term examples:
Spring 2003: Exam text, Data, Answer Key
Spring 2002: Exam text, Data, Answer Key
Fall 2001: Exam text, Data, Answer Key
Spring 2001: Exam text and data, Answer Key
(30) Final: cumulative, Session 1 (May 20, 10:30am-12:30pm), Session 2 (May 16, 8am-10am), open book, on-line in Plant Sciences 1129 (the same room for every class meeting).
Practice Final:
Hardcopy handout (.doc)
Excel data set (.xls)
SAS data set 1 (.csv)
SAS data set 2 (.csv)
SAS program (.sas)
Answer Key (Corrected!)
Old final examples:
Fall 2001:
Hardcopy handout (.doc)
Excel data set (.xls)
SAS data set (.sas)
SAS program (.lst)
Answer Key
Spring 2001:
Hardcopy handout (.doc)
Excel data set (.xls)
SAS data set (.csv)
SAS program (.sas)
SAS output (.lst)
Answer Key
Attention: Grades will be posted online as econ424-spring2008-publicgrade.xls. The same link is provided at the beginning of the syllabus. The file is listed by class ID, and a class ID will be assigned to everyone in the first class. Please remember your class ID so you can check your grades online at any time.
If you are going to miss the midterm or the final for a legitimate reason (following university policy), please notify the instructor AT LEAST 12 HOURS IN ADVANCE. Any excuse delivered after the exam is invalid and will result in a test score of zero.
Each student will complete six projects during the course. They fall into two categories: one requires you to choose an interesting topic and the other is based on a given topic. In either case, the quality of presentation matters; treat the project as though you were giving it to your new employer. It need not be fancy, but it must be clear, to the point and well explained. Let a classmate critique your project before you hand it in (examples of critiques). You may revise your project in light of your classmate's critique before submitting it.
You are required to submit projects by email to both Ginger Jin and Amy Knaup simultaneously (so that we have a backup in case something goes wrong), with your name and project number in the subject line. Please name the files attached to the email following the convention yourlastname_yourfirstname_project_number.
For example, if your name is Joe Smith, your first project should be named Smith_Joe_project_1.* where * denotes the file's extension. No project report will be fully considered unless it is submitted by the deadline. If a submission is less than two hours late, 20% of the full points (for that project) is deducted automatically. Delays of more than two hours are unacceptable. Should there be a legitimate reason for the delay (following university policy), the grading points will be carried over to your midterm or the final (whichever comes first).
Each project is described below, with grading weights in parentheses:
(5) Project 1: Descriptive Statistics
Develop some original (unpublished) data. Make histograms and compute descriptive statistics for two random variables. Each variable should have 30 or more observations. Make clear the method used to generate the sample. Descriptive statistics should include the minimum, maximum, mean, median, quartiles, variance, standard deviation, trimmed mean, skewness and kurtosis for each variable. Make a separate histogram for each variable. If you believe the data allow you to compare the two variables, plot a relative frequency polygon for each variable and put the two polygons in one chart. What do you learn from this chart?
The final report should include (1) one Excel sheet as a work sheet showing all the detailed step-by-step calculations, and (2) a second Excel sheet or a .doc file that presents your key results and explains why you are interested in these two variables, what question you have in mind, and what you have learned from the data summary. The second Excel sheet or the .doc file should be clear, concise and to the point, as if you were submitting a summary report to your employer!
Examples: Go to two car dealerships and collect data on the sticker prices of cars. Drive through two areas to collect posted gasoline prices. Use the internet to find prices of comparable products from two sources. Survey students about their daily commuting time and methods. Survey freshmen and seniors about their monthly expenditure on long distance phone calls.
Project 1 is due at 11:59pm on Feb. 19. Each student should submit his/her own original report, including a data set in Excel format, summary statistics in the Excel file, and a separate Word file (or a second Excel sheet with a text box) describing why you collected this data set. Duplicates are not acceptable.
(5) Project 2: Monte Carlo Study
This project is designed to help you understand data simulation, the central limit theorem and sample size.
1. Focus on a normal distribution, choosing the mean and the standard deviation at your own discretion. Take this normal distribution as the "population".
2. From the population, draw 100 random samples, each with 30 observations. Calculate the sample mean for each sample. What does the Central Limit Theorem predict for the distribution of the sample mean?
3. Observe the distribution of the sample mean. In one chart, plot a relative frequency polygon for the sample mean and a relative frequency polygon for the raw data you have drawn from the population. How does the distribution of the sample mean compare to the population? Is it similar to what the Central Limit Theorem predicts? Explain it in a summary paragraph.
4. Repeat the above exercises (1, 2, 3) for two different sample sizes, one bigger than 30 and one smaller than 30 (you have discretion in choosing the sample sizes). How does sample size affect your results? Explain.
5. Repeat the above exercises (1, 2, 3, 4) for a uniform distribution ranging from a to b (you have discretion in choosing a and b). How do the results differ? Explain.
Project 2 is due at 11:59pm, March 4. Each student should submit his/her own Excel file with simulated data and a text box summarizing answers to the questions listed above. Duplicate data or duplicate summaries are not acceptable.
(5) Project 3: Regression
This project is designed for you to carry out basic data cleaning on a public data set and to perform mean estimates, mean comparisons, and ordinary least squares regressions on the cleaned data. On www.census.gov/hhes/www/income/4person.html, the Census Bureau posts the estimated median income for 4-person families by state and year. Please go to the link and complete the following tasks (always use a 95% confidence level, i.e. significance level alpha = 0.05):
1. Before working on a data set, we must understand how the data owner constructs the data. Such information is usually described in the "notes" section below the table(s). The data we are going to work on were drawn from several surveys. Could you tell us the names of the surveys? Who conducted them originally?
2. The data appear like a table on the computer screen, but they are not in an excel format yet. Please copy and paste the data into an excel sheet, with the following variables:
STATE -- State name
MedFamInc2002 -- Median Income for 4-person Families in Calendar Year 2002
MedFamInc2001 -- Median Income for 4-person Families in Calendar Year 2001
(... continue for each calendar year available on the website)
MedFamInc1974 -- Median Income for 4-person Families in Calendar Year 1974
In this excel file, take each row of the data as an OBSERVATION, and each column of the data as a VARIABLE. How many observations and how many variables do we have in the data set?
Hint: after you copy and paste a block of text, you need to use "Data - text to columns" to parse the text into multiple columns. Given the data structure on the website, you need to copy and paste several times. Each time, you must make sure numbers on the same row correspond to the same STATE. The original data also report median family income for the whole United States. Treat it as an additional "state" when you copy and paste the data, and record its name for the variable "STATE" as "United States."
3. Which state has the highest median family income in calendar year 2004? Which state has the lowest median family income in year 1994? Which state's median family income is the closest to that of the whole United States in 1984? Hint: you can answer these questions by sorting the data.
4. Now delete the row labeled as "United States" and focus on the 51 states (including District of Columbia) in year 2004. For easy view of the table, you may want to hide the columns for other years. Compute the mean of MedFamInc2004 and its confidence interval. Conduct a statistical test of whether the mean of MedFamInc2004 is equal to $62,732. Set your null hypothesis as (mean of MedFamInc2004=$62,732) and your alternative hypothesis as (mean of MedFamInc2004 not equal to $62,732). Hint: is this a one-tail or two-tail test?
5. Compare years 2004 and 2003 for the 51 states. Conduct a statistical test of whether the mean of MedFamInc2004 is equal to the mean of MedFamInc2003. Since we expect the economy to grow from 2003 to 2004, set your null as (mean of MedFamInc2004 = mean of MedFamInc2003) and your alternative hypothesis as (mean of MedFamInc2004 > mean of MedFamInc2003). Hint: is this a one-tail or two-tail test? You are comparing two samples; are they independent or matched pairs?
6. Focus on years 2004 and 1974 for the 51 states. Please draw a scatterplot with MedFamInc2004 on the y-axis and MedFamInc1974 on the x-axis. (Hint: in the Excel chart wizard, choose the chart type "x-y plot".) What do you learn from the graph? Use the Excel function CORREL to calculate the correlation coefficient between MedFamInc2004 and MedFamInc1974.
7. Continue from task 6, what would you get if you run a regression where the dependent variable is MedFamInc2004 and the independent variable is MedFamInc1974 (including intercept)? What model does the regression imply? (Write down the regression equation.) How do you interpret the economic meaning of the coefficient of MedFamInc1974? Is the coefficient of MedFamInc1974 significantly different from zero? Use "equal to zero" as your null hypothesis and "not equal to zero" as your alternative hypothesis.
Project 3's deadline is 11:59pm, March 31 (new schedule). Each student submits one excel file as the working sheet and one separate word file answering all the questions.
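As a reminder of the arithmetic behind tasks 4 and 7, the test statistics can be written out as below. This is a sketch in standard textbook notation rather than something taken from the course notes: $\bar{x}$ and $s$ are assumed to denote the sample mean and sample standard deviation of MedFamInc2004 across the $n = 51$ states, and $\hat{\beta}_1$ and $\mathrm{se}(\hat{\beta}_1)$ the slope estimate and its standard error as reported in the regression output.

```latex
% Task 4: one-sample t statistic for H0: mean of MedFamInc2004 = 62,732
t = \frac{\bar{x} - 62732}{s / \sqrt{n}}, \qquad n = 51, \quad \text{degrees of freedom} = n - 1 = 50

% Task 7: regression with intercept, and the t statistic for H0: \beta_1 = 0
\mathrm{MedFamInc2004}_i = \beta_0 + \beta_1 \, \mathrm{MedFamInc1974}_i + \varepsilon_i

t = \frac{\hat{\beta}_1}{\mathrm{se}(\hat{\beta}_1)}, \qquad \text{degrees of freedom} = n - 2 = 49
```

Each statistic is compared with the critical value of the t distribution at the stated degrees of freedom, using the tail or tails implied by the alternative hypothesis.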
(5) Project 4: Data cleaning by SAS (I)
You are expected to complete the following steps:
1. Save the 2001 Washingtonian ratings of the top 100 restaurants in the DC area into two comma-delimited (.csv) files, the first for the sheet "restaurant rating" and the second for the sheet "city-demographics." (Note that you cannot save multiple Excel data sheets into one .csv file.)
2. Read in the comma-delimited files by SAS.
- In the "city-demographics" data sheet, some rows are hidden because I used them to calculate the average demographics for restaurants that serve multiple zip codes.
- When you save the sheet into a .csv file, it includes all the rows, hidden or not. So you need to use the command "if ... then delete;" in a "data" step to delete these hidden rows once you have read in the data by SAS.
3. Generate summary statistics by SAS:
- The unit of observation in each data sheet
- Mean, standard deviation, min, max, median, quartiles for at least one numeric variable in the "restaurant rating" sheet and one numeric variable in the "city-demographics" sheet. You can use the original variables in the data or create new variables from the data.
- One frequency chart for each of the above two variables.
- Tabulations for at least two categorical variables from the "city-demographics" sheet. You can create categorical variables from the given continuous variables. For example, you can group all observations into three classes by per capita income: high-income, medium-income or low-income. Feel free to apply any definition of these classes as long as it is reasonable and informative. First construct a one-dimensional table for each variable and then construct a two-dimensional table for both variables.
- Draw a graph to show the relationship of two variables from the same data sheet. You have the freedom to define new variables and choose the two you are most interested in.
Project 4 is due at 11:59pm, April 17. Each student should turn in his/her own answers. Each student's report should include a SAS program, a log file, an output file, and a separate Word file answering the questions. If you edit the output file with CLEAR answers to all questions, you can omit the Word file.
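The read-in, clean-up and summary steps of Project 4 might be sketched in SAS roughly as follows. This is only an illustration under stated assumptions: the file name, the variable names (location, population, percapinc), the income cutoffs and the rule used to spot the hidden rows are all hypothetical and would need to be replaced by whatever the actual data sheet contains.

```sas
/* Sketch only. The file name, the variable names and the delete rule
   are hypothetical placeholders, not the actual layout of the data. */
data citydemo;
    infile 'city-demographics.csv' dsd firstobs=2 truncover;  /* comma-delimited, skip header row */
    length location $ 40;
    input location $ population percapinc;
    /* placeholder rule: replace with whatever identifies the hidden rows */
    if location = "" then delete;
run;

/* Build a categorical variable from a continuous one (cutoffs are illustrative) */
data citydemo2;
    set citydemo;
    length incclass $ 6;
    if percapinc = . then incclass = "";
    else if percapinc >= 40000 then incclass = "high";
    else if percapinc >= 25000 then incclass = "medium";
    else incclass = "low";
run;

/* Mean, standard deviation, min, quartiles, median and max for one numeric variable */
proc means data=citydemo2 n mean std min q1 median q3 max;
    var percapinc;
run;

/* One-dimensional table for the categorical variable */
proc freq data=citydemo2;
    tables incclass;
run;
```

A two-dimensional table uses the same procedure with two categorical variables joined by an asterisk in the TABLES statement, for example `tables a*b;` for hypothetical variables a and b.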
(5) Project 5: Data cleaning by SAS (II)
This project continues from Project 4. Please complete the following steps:
1. Reshape the "restaurant-rating" data. In the rating data, each observation is a restaurant. Because a restaurant may serve multiple locations, we want to reshape the data so that each observation is a restaurant-location. To do this:
- Name the location columns "location1", "location2", "location3" and "location4", so that "location1" holds the first location that occurs in each record, "location2" holds the second location that occurs in each record, etc.
- Form a file that focuses on each restaurant's primary location. To do this, suppose you have read the rating data sheet into a data set called "ratedata". Use the command:
data rateloc1;
set ratedata;
location=location1; /* focus on location1 only */
drop location1 location2 location3 location4;
run;
Now "rateloc1" includes only primary locations.
- Form another file that only includes restaurants' secondary locations. Use the command:
data rateloc2;
set ratedata;
location=location2; /* focus on location2 only */
drop location1 location2 location3 location4;
if location="" then delete;
run;
Now "rateloc2" should only include information on restaurants' secondary locations.
Question: Why do I need the command "if location="" then delete;" for file rateloc2 but not for file rateloc1?
- Similarly, form a file "rateloc3" for restaurants' third locations, and "rateloc4" for restaurants' fourth locations.
- Append "rateloc1", "rateloc2", "rateloc3" and "rateloc4" into one file and call it "rateloc". In this new file each observation is defined by restaurant-location.
Questions: How many observations do you have in "rateloc1", "rateloc2", "rateloc3", "rateloc4" and "rateloc"? What is the difference between "ratedata" and "rateloc"?
- Merge "rateloc" with the city demographics data by restaurant location. You must sort both data sets by "location" before merging.
Questions: How many observations do you have after the merge? How many restaurants are there in each location? What is the average rating of restaurants in each location?
2. One newspaper article has challenged the authority of the Washingtonian restaurant ratings by arguing that the ratings disproportionately favor West-European cuisine styles (including Modern American) and restaurants located in very rich areas such as Bethesda.
- To address this concern, can you tell me (1) How many Washingtonian-rated restaurants are West-European in style? What is their average rating? (2) How many Washingtonian-rated restaurants are located in rich neighborhoods? What is their average rating? (3) How many are West-European in style and located in rich areas? What is their average rating? Feel free to define "West European Styles" and "Rich Neighborhoods" in whichever way you like, but you must describe your criteria clearly.
- Divide restaurants into two groups: group 1 for restaurants with West-European styles or located in rich areas, and group 2 for all the others. Can you test whether group 1 restaurants on average earn better ratings than group 2? By better ratings, I mean group 1's average rating is statistically better than group 2's average rating. Use "average rating of group 1=average rating of group 2" as the null hypothesis and "average rating of group 1>average rating of group 2" as the alternative.
Project 5 is due at 11:59pm, May 1. Each student's report should include a SAS program, a log file, an output file, and a separate Word file answering the questions.
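The append, sort, merge and group-comparison steps of Project 5 might look roughly like the sketch below. It assumes "rateloc1" to "rateloc4" already exist as described above. Everything else is hypothetical and introduced only for illustration: the demographics data set name (citydemo), the variable names (rating, cuisine, percapinc, westrich), the cuisine list and the income cutoff.

```sas
/* Sketch only. citydemo, rating, cuisine, percapinc and westrich are
   hypothetical names; the cuisine list and income cutoff are placeholders. */
data rateloc;
    set rateloc1 rateloc2 rateloc3 rateloc4;   /* stack the four files */
run;

/* Both data sets must be sorted by the merge key */
proc sort data=rateloc;  by location; run;
proc sort data=citydemo; by location; run;

data ratedemo;
    merge rateloc citydemo;
    by location;
    /* 1 if West-European style or rich area, 0 otherwise */
    westrich = (cuisine in ("French", "Italian", "Modern American")
                or percapinc >= 40000);
run;

/* Number of restaurants and average rating in each location */
proc means data=ratedemo n mean;
    class location;
    var rating;
run;

/* Compare the average rating of the two groups */
proc ttest data=ratedemo;
    class westrich;
    var rating;
run;
```

PROC TTEST reports two-sided p-values by default, so the one-sided alternative stated in the project needs one further step of interpretation.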
(15) Project 6: Create your own analysis
This project is a comprehensive review of everything you learn in this class. You are given the freedom to create your own statistical study, and hopefully you can show it to your mother and your future job interviewer! Start from an interesting idea. Get the data from your own collection (say, the data you submitted in Project 1) or from a published source (trade magazines, consumer guides, mail-order catalogs, internet shopping sites, and public-use data sets available in the library or on-line). Define two or more variables and explore their relationship through summary statistics, graphs, and regressions. Formulate and test appropriate hypotheses. Feel free to use Excel, SAS or both to facilitate your analysis.
Project 6 is due at 11:59pm on May 20. The final report should include all the computer files you have used for data processing, and a separate Word file describing your research question, why the data are suitable for this question, how you answer the question, your conclusions and the limitations.
Jan. 29: Introduction
Introduce instructor
Discuss syllabus
Introduce textbooks
Login IDs and other computer clarification
Questionnaire
Jan. 30: Introduction to Excel
Manage a small data set collected from the questionnaire
Assign Project 1
Readings: Getz e.stat Chapters 1, 2.
Follow the Excel tutoring from the OIT peer training program.
If you need extra help in Excel, you can sign up for the workshop(s) offered by the OIT peer training program or check other links for computer-related resources.
Feb. 5: Data collection and data description
Lecture on data collection.
Role playing
Divide students into six groups, each representing an institution involved in the subprime mortgage financial crisis. The names of the six institutions are:
- Federal Reserve
- Wall Street Journal
- New York Stock Exchange
- Countrywide Financial
- National Association of Realtors
- European Central Bank
Each group will discuss for 20 minutes and make a mini presentation on (1) one question about the subprime crisis you would like to answer, (2) the best data set available in your institution for the question, and (3) the methodology you would like to use to answer the question.
Before the class, we will allocate 6 seats to each group. You can choose the group that is most interesting to you, subject to seat availability. First come, first served!
Feb. 7: Data Description
Follow the story in Getz e.stat Chapter 4
Mean (weighted and non-weighted)
Use first-day questionnaire as an example
Median
Order statistics
Variance and standard deviation
Skewness
Readings: All sections in e.stat Chapters 3 and 4 except 4.14 and 4.15.
Feb. 12: Histogram
Histogram
Relative frequency polygon
Readings: e.stat 4.3, 4.4, 4.5.
Feb. 14: Probability
Distinguish population and sample
Flip coins and dice
Theory and practice do not necessarily match; law of large numbers
Marginal and conditional probability, statistical independence, expectation.
Readings: Getz e.stat Chapter 5.1-5.6, 6.1-6.7.
Feb. 19: Distribution and simulation
Bernoulli Process (flip coins or roll dice)
Uniform PDF
Normal PDF
Data simulation and the law of large numbers
Readings: Getz e.stat Chapters 7, 8. Emphasis: 7.1-7.5, 7.12, 7.14, 7.18-7.20, 8.1-8.3, 8.8-8.13.
Project 1 due. Assign Project 2
Feb. 21: Practice on simulation and the law of large numbers
Feb. 26: Mean estimation
Use Monte Carlo to generate estimates for population mean
t distribution
Confidence interval
Testing
Why does sample size matter?
Readings: Getz e.stat Chapters 11, 12. Emphasis: 11.1-11.4, 11.6, 11.7, 11.9, 12.1-12.9, 12.10-12.12.
Feb. 28: Hypothesis Testing
Null hypothesis versus alternatives
Type I and Type II errors
One-tail test vs. two-tail test
Readings: Getz e.stat Chapter 13.1-13.10, 13.13, 13.15-13.18.
March 4: Testing of two samples
Testing equal mean
Independent samples
Matched pairs
Readings: Getz e.stat Chapter 14.1-14.5, 14.8-14.11, 14.14-14.16.
Project 2 due. Assign Project 3
March 6: Regression
Scatter plot
basic regression theory
R-squared, F test
Standard error of coefficients
Testing
Results Interpretation
Readings: Getz e.stat Chapters 19, 20, 21, 22. Emphasis: all sections in Chapter 19, 21.6-21.7, 21.10, 21.12-21.14, 22.1-22.5, 22.7
March 11: Practice of Regression
March 13-25: Review of data analysis in Excel
March 27: Midterm. Project 3 due on March 31.
April 1: Review of Midterm and Introduction to SAS
Shortcomings of Excel
Introduce SAS programming rationale
Read in data
Assign Project 4
Readings: The Little SAS Book Chapters 1 and 2.
April 3 - April 10: Data manipulation of a single data set
Read in complicated data
Data recoding
Converting Date to values
SAS functions
Use SAS to generate and present sample statistics
Readings: The Little SAS Book Chapters 2,3,4.
April 15 - April 22: Data manipulation of multiple data sets
Sort a data set
Generate a subset of data
Add in new observations to a data set
Merge and update data sets
Readings: Cody & Pass SAS Programming by Example Chapters 3, 4.
Project 4 due on April 17.
Assign Project 5 on April 15.
April 24 - April 29: Hypothesis testing in SAS
May 1 - May 8: Regression and Testing
Generate dummy variables
Imputing missing values
Regression command
Testing linear restrictions
Comprehensive examples
Project 5 due on May 1.
Assign Project 6 on May 1.
May 13: Review of Excel and SAS materials, practice final
May 16: Session 2 Final 8am-10am Plant Sciences 1129
May 20: Session 1 Final 10:30am-12:30pm Plant Sciences 1129
Project 6 Due on May 20.
Feb. 19 -- Project 1 due
Mar. 4 -- Project 2 due
Mar. 27 -- Midterm
Mar. 31 -- Project 3 due
Apr. 17 -- Project 4 due
May 1 -- Project 5 due
May 16 -- Session 2 Final
May 20 -- Session 1 Final
May 20 -- Project 6 due
Excel examples
Notes for Excel.ppt
data-summary-formula.doc
data-summary-example.xls
coinflip-dierolling-example.xls
simulation-formula.xls
show-central-limit-theorem.xls
SAS program
Notes for SAS Programming.ppt
sas-class1.sas
sas-class2.sas
sas-example-merge-meancomp.sas
A comprehensive example: reg-cityreg.sas
Final review
final-review.ppt
practice-final.doc
data used in practice final
On-line tutoring for Intermediate Excel - made available by UMD peer training program
MS Excel help
Hands-on Tutor for Windows, MS Office and Internet, a CD-ROM published by the Corporation for Research and Educational Networking (CREN), is available at the UMD information technology library (Computer and Space Science building #1400) for a free on-site review or an on-site purchase for $20.
Microsoft Frequently-asked-questions and highlights for Excel 2000
SAS on-line help
SAS Institute, Inc.
Samples & SAS Notes
UMD on campus SAS help
SAS Topics - Data Management (offered by UCLA)
SAS Topics - Regression (offered by UCLA)
Links to public data sets: on-campus, U.S. domestic and international
- UMD library data resources
Data by subjects including Agricultural/Life Sciences, Arts and Humanities, Business/Economics, Conferences and Proceedings, Education, Engineering/Technology, General/Multidisciplinary, Government, History, Law and Public Affairs, Mathematics/Computer Sciences, Medicine and Health, Newspapers/Current Events, Physical Sciences, Social Sciences.
- Common links for federal agencies (FEDSTATS)
Statistics from over 100 U.S. federal agencies. Data available under Topic links - A to Z (alphabetical listing of topics); Mapstats with statistical profiles of states, counties, congressional districts, and federal judicial districts; Statistics by geography from U.S. agencies with international comparisons, national, state, country, and local; Statistical reference shelf with published collections of statistics available online including Statistical Abstract of the United States, State and Metropolitan Area Data Book, Health United States, Digest of Education Statistics 1999, The Condition of Education 2000, Projection of Education Statistics to 2010, Energy Information Administration's Quick Stats, Report on the American Workforce 1999. Broad topics include agriculture, crime, population and demographics, education, energy and environment, labor force, business and banking, health, international, national accounts. Link to other statistical sites under Additional links.
- U.S. Census Bureau
Social and economic statistics for the U.S. available under Census 2000 - Resident Population and Apportionment, Product Overview and Schedule, American Fact Finder; People - Topics include estimates, projections, population profile, age births, children, deaths, education, fertility, immigration, income, marital status, occupation, poverty, race, school enrollment, voting and registration, etc.; Business - Topics include companies, small business, concentration ratios, construction, government, international trade, manufacturing, mining, retail, services, transportation, wholesale, etc.
- Bureau of Labor Statistics
Labor and economic data for the U.S. available under Data - Most Requested Data, Selective Access, Economy at a Glance, News Releases, Series Report. Topics include prices, employment and unemployment, compensation and working conditions, productivity, geographical access. Link to other web sources for U.S. and international data under Other Statistical Sites.
- National Science Foundation
Scientific and engineering data for the U.S. Data available under Science Statistics - Science and Engineering Indicators 2000. Data also available under Science Statistics - Databases. Computer-Aided Science Policy Analysis and Research (WebCASPAR) with information about academic science and engineering resources; Scientists and Engineers Data System (SESTAT) with information about employment, educational, and demographic characteristics of scientists and engineers in the U.S.; Social and Economic Implications of Information Technologies with implications of IT for the home, education, community, government, science, employment and work, commerce, productivity, institutional structure, globalization, and selected policy issues.
- National Opinion Research Center (NORC)
NORC is a non-profit corporation affiliated with the University of Chicago that conducts survey research in the public interest for government agencies, educational institutions, private foundations, non-profit organizations, and private corporations. We collect data to help policy makers, researchers, educators, and others address the crucial issues facing the government, organizations, and the public. One example of a NORC survey is the General Social Survey (GSS).
- Centers for Medicare & Medicaid Services, previously Health Care Financing Administration (HCFA)
CMS (previously HCFA) is the federal agency that administers Medicare, Medicaid and the State Children's Health Insurance Program. HCFA also posts Medicare data and Medicaid data for public use. Data is available under Stats and Data - Data, including Public Use Data Files (PUF's) and 1999 Resource-Based Practice Expense Data Files; Stats and Data - Statistics with Medicare Estimated Benefit Payments by State (1997, 1998, 1999), National Healthcare Indicators and Expenditures (state healthcare expenditures, healthcare indicators, national health expenditures, national healthcare expenditures projections), Medicare Enrolment (Medicare county enrolment, Medicare national enrolment trends, Medicare state enrolment for 1998, Medicare state and county enrolment for 1998), Medicare Utilization and Expenditure Tables (Medicare provider and analysis review), Clinical Practice Expense Program (CPEP); Stats and Data - Publications with Publications and Data, HCFA's annual performance plan, resource-based malpractice RVU's, Statistical Publications; Stats and Data - Other Statistical and Data Links with social security income (SSI) data, Medicare and Medicaid data, links to other government statistical sites.
- Centers for Disease Control and Prevention (CDC)
Data on health, disease, and disease prevention. Data available under Data and Statistics - Scientific Data including hazardous substance release/health effects database (HAZDAT), web-based injury statistics query and reporting system (WISQARS); Data and Statistics - Surveillance with assisted reproductive technology success rates, behavioral risk factor surveillance system, cancer registries program, HIV/AIDS surveillance report, pregnancy risk assessment monitoring system, sexually transmitted diseases, tuberculosis surveillance reports, youth risk behavior surveillance system; Data and Statistics - Health Statistics - National Center for Health Statistics - Surveys and Data; Data and Statistics - Laboratory Information.
- Agency for Healthcare Research and Quality (AHRQ)
Data on healthcare for the U.S. Data available under Data and Surveys - MEPS (Medical Expenditure Panel Survey) - MEPS Website - Data and Publications, which includes data on households, nursing homes, insurance, and projected health spending. Data also available under Data and Surveys - HCUP (Healthcare Cost and Utilization Project) - What's New, which has data on hospitalization and under Data and Surveys - HCUP (Healthcare Cost and Utilization Project) - HCUP Data, which includes nationwide inpatient sample (NIS), state inpatient databases (SID), and state ambulatory surgery databases (SASD).
- State and local governments
The Library of Congress provides links to state and local government information. Through the state name "Maryland", you can access the Maryland state government and local county/city governments such as Annapolis City, Baltimore City, Dorchester County, Montgomery County and Prince George's County. While the main purpose of these websites is not to provide statistical data for economic studies, you may find lists of historical data available from these governments, for example land records and military records. Even if the data you are looking for are not available on these websites, you may learn whom to contact for further information.
International Data sets
- World Bank
Social and economic indicators for 206 countries and 17 country groups (based on regions, income, and economic development) reported in the World Development Indicators book. Data available under Country Data, Data by Topic, and Data Query. Broad topics include agriculture, domestic finance, development assistance, early childhood development, education, environment, gender, governance, health and population, HIV/Aids, informatics, infrastructure, industry, international economics, labor and employment, macroeconomics and growth, poverty, private sector development, public sector management, rural development, social development, transition, urban development.
- International Monetary Fund (IMF)
Financial data for member countries. Data available under About the IMF - World Economic Outlook - Statistical Appendix (downloadable PDF file): topics include output, inflation, financial policies, foreign trade, current account transactions, balance of payments and external financing, external debt and debt service, flow of funds. Data also available under IMF Finances - Daily and Monthly Exchange Rates and under IMF Finances - Member Financial Data (by country and by all countries) including disbursements and repayments, projected obligations to the IMF, IMF credit outstanding, lending arrangements, SDR allocations and holdings, arrears.
- United Nations
Social and economic data available for countries. Data available under MBS On-line - Economic Statistics - Monthly Bulletin of Statistics (MBS On-line) sample (free access) - Data Selection: broad topics include construction, electricity, finance, food, fuel imports, industrial production, labor force, metals, mining, motor vehicles, paper, petroleum products, population, prices, rubber, textile, trade, transport. Data also available under MBS On-line - Demographic and Social Statistics: topics are population of capital cities and social indicators such as population, youth and elderly populations, human settlements, water supply and sanitation, housing, health, child-bearing, education, literacy, income and economic activity, unemployment.