School of Economics
ASSIGNMENT
Semester 2 - 2024
ECMT1010 Introduction to Economic Statistics
Due: 11.59PM Friday 25 October 2024
Instructions
i.  Use your assigned data set (see below).  Use of the wrong data set will result in low marks (because most of your answers will be wrong).
ii.  Submit your answers as a PDF file.  DO NOT include a cover page as Turnitin will detect this as a similarity match.
iii. Answer all questions.  Show numerical answers to 3 decimal places.  Carry out all tests at a 5% level of significance.
iv. Work submitted after the due date is subject to a penalty of 1 mark (5%) per calendar day late. Work submitted after 11.59PM Sunday 10 November 2024 will receive a mark of 0.
Aim: In this assignment you will use Excel and StatKey to analyse data.
Data description: Your assigned data set contains height and weight information on 400 randomly selected American adults. The adults were weighed in 2004 and again in 2011.
• Your assigned data set is available on Canvas.
• The ID column identifies each individual (and can be ignored), FEMALE is each adult’s sex (female = 1, male = 0), HEIGHT is each adult’s height (in inches), WEIGHT04 is each adult’s weight in 2004, and WEIGHT11 is each adult’s weight in 2011 (weight is measured in pounds).
QUESTIONS
1. Provide a clearly-labelled scatter plot of height and weight in your 2004 sample.  Explain your choice of variables on the vertical axis and the horizontal axis of the scatter plot. Comment on the scatter plot. [2 marks]
2.  Estimate the simple regression equation for weight in 2004 and height.  Test for the statistical significance of the relationship.  List your notation, the null and alternative hypotheses, the test statistic, decision rule, and conclusion to the test. [2 marks]
3.  Give an interpretation of the intercept and slope in the regression equation. [2 marks]
4.  Convert the height and 2004 weight data to metric using the conversions 1 pound = 0.45359237 kilograms and 1 inch = 2.54 centimetres.  Estimate the regression equation using the metric data. What do you notice when you compare the R² for the metric data and non-metric data equations? How do you explain your finding? [2 marks]
5. You will see that the slope estimates from the metric data and non-metric data regression equations are quite different. Show how the slope estimate from the metric data can be derived exactly from the slope estimate from the non-metric data. [2 marks]
6.  Estimate the simple regression equation for weight in 2011 and height.  Explain why the R² for 2011 weight is lower than in the regression reported in Question 2. [2 marks]
7.  Use the bootstrap to construct a 99% confidence interval for the population height difference between females and males.  Show your bootstrap distribution along with the lower and upper bounds of the confidence interval. What does the confidence interval imply about the null hypothesis that there is no difference in the population between average female and male height? [2 marks]
8.  Test the hypothesis that, on average, males have gained more weight than females between 2004 and 2011.  List your notation, the null and alternative hypotheses, the test statistic, decision rule, and conclusion to the test. [2 marks]
9. According to Tucker & Parker (2022, Journal of Obesity), 16% of American adults gained more than 2% of their body weight per year over a 10-year period.  To test support for this hypothesis over the 7-year period in your sample, construct a new variable called GAIN14PCT in your data set such that GAIN14PCT = 1 if an individual gained more than 14% of their initial weight between 2004 and 2011, and GAIN14PCT = 0 otherwise.  What is the interpretation, in plain English, of the mean of GAIN14PCT? [2 marks]
10. Using GAIN14PCT, test support for the hypothesis in Question 9 over the 7-year period in your sample. List your notation, the null and alternative hypotheses, the test statistic, decision rule, and conclusion to the test. [2 marks]
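The data-preparation steps in Questions 4 and 9 (the metric conversion and the GAIN14PCT indicator) can be sketched in Python as below. This is only an illustration of the logic, not a required method: the assignment itself is to be done in Excel and StatKey, where the same rule could be written with an IF formula. The three rows are made-up values, not taken from any assigned data set, and the new column names HEIGHT_CM and WEIGHT04_KG are hypothetical.

```python
# Hypothetical rows for illustration only; NOT values from any assigned data set.
rows = [
    {"ID": 1, "FEMALE": 1, "HEIGHT": 64.0, "WEIGHT04": 140.0, "WEIGHT11": 165.0},
    {"ID": 2, "FEMALE": 0, "HEIGHT": 70.0, "WEIGHT04": 180.0, "WEIGHT11": 185.0},
    {"ID": 3, "FEMALE": 0, "HEIGHT": 68.0, "WEIGHT04": 200.0, "WEIGHT11": 228.0},
]

# Conversion factors stated in Question 4.
LB_TO_KG = 0.45359237
IN_TO_CM = 2.54


def gain14pct(weight04, weight11):
    """Return 1 if the gain is strictly more than 14% of initial weight, else 0."""
    relative_gain = (weight11 - weight04) / weight04
    return 1 if relative_gain > 0.14 else 0


for row in rows:
    # Metric versions of height and 2004 weight (column names are assumptions).
    row["HEIGHT_CM"] = row["HEIGHT"] * IN_TO_CM
    row["WEIGHT04_KG"] = row["WEIGHT04"] * LB_TO_KG
    # Indicator variable defined in Question 9.
    row["GAIN14PCT"] = gain14pct(row["WEIGHT04"], row["WEIGHT11"])

for row in rows:
    print(row["ID"], round(row["HEIGHT_CM"], 3), round(row["WEIGHT04_KG"], 3), row["GAIN14PCT"])
```

Note that the third hypothetical individual gained exactly 14% and is therefore coded 0, because the definition requires a gain of more than 14%.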