# Statistical Modelling for Business

Individual Assignment

Requirements:
• Complete your entire assignment in Jupyter Notebook, including your code and
markdown sections for your written answers. Use LaTeX in markdown sections
where needed.
• Submit the downloaded HTML file as your entire assignment. Take care with
presentation in this file; unavoidable error messages and page formatting
issues will, however, be ignored in marking.
• Only relevant analysis output (graphs, tables, etc.) should appear in the
assignment file, and each piece of output should appear together with the
discussion of that output.

This assignment follows the analysis conducted in the lectures on the dependence
between earnings and asset returns for companies listed on the NYSE. You will assess
whether earnings in one year (say t−1) affect asset returns in the subsequent year (say
t). In particular, you will assess whether returns are typically higher following
positive, compared to negative, earnings years, and whether there may be a linear
relationship between returns and lagged earnings.
Data: The data file for the analysis is “SampleData from US 90 08 wk3.csv” which
was sampled from “US 90 08 wk3.csv”.
Questions:
(a) Conduct an appropriate exploratory analysis on the asset returns, both individually
and in terms of one of the primary questions being considered in this assignment:
are returns in the subsequent year t typically higher following positive, compared to
negative, earnings years in year t − 1? Discuss any cleaning of the data you did,
including why and how you did it, or why you did not do it. (3 marks)
(b) Conduct the appropriate t-test (with α = 0.05), median and Mann-Whitney tests,
to assess whether returns are typically higher following positive, compared to negative,
earnings years. For median tests, use two-sided testing. Assess all assumptions made.
(10 marks)
(c) Which test’s result do you believe the most in part (b)? Discuss and explain. (2
marks)
(d) Conduct an appropriate exploratory analysis to assess whether there may be a
linear relationship between returns and lagged earnings. (3 marks)
(e) Conduct a simple linear regression analysis, using OLS estimation, for returns
on lagged earnings. Fully assess all assumptions of OLS. Also list and assess the
assumptions of LAD (no need to obtain the LAD estimates). Discuss any cleaning of
the data you did, including why and how you did it, or why you didn’t do it. (9 marks)
(f) Write a brief (< 0.5 page) report summarising and discussing your findings and
conclusions in layman’s terms. Include a discussion of whether you would recommend
an investment strategy based on your findings. (3 marks)
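
The three tests named in part (b) could be set up along the following lines. This is a minimal sketch that assumes NumPy and SciPy are installed, and it runs on synthetic data: `ret_pos` and `ret_neg` are hypothetical stand-ins for year-t returns following positive and negative earnings in year t−1, not values from the assignment data file. The one-sided alternative for the t-test and Mann-Whitney test, and the choice of Welch's unequal-variance t-test, are assumptions made for illustration; which form is appropriate depends on the assumption checks the question asks for.

```python
import numpy as np
from scipy import stats

# Hypothetical data (NOT from the assignment file): returns in year t,
# grouped by the sign of earnings in year t-1.
rng = np.random.default_rng(0)
ret_pos = rng.normal(loc=0.10, scale=0.30, size=400)
ret_neg = rng.normal(loc=0.00, scale=0.45, size=150)

alpha = 0.05

# Welch two-sample t-test (unequal variances assumed here).
# H0: mean(ret_pos) = mean(ret_neg), H1: mean(ret_pos) > mean(ret_neg).
t_res = stats.ttest_ind(ret_pos, ret_neg, equal_var=False, alternative="greater")
t_stat, t_p = t_res.statistic, t_res.pvalue

# Mood's median test; the chi-square test it is based on is two-sided.
# Index 0 is the test statistic and index 1 is the p-value.
med_res = stats.median_test(ret_pos, ret_neg)
med_stat, med_p = med_res[0], med_res[1]

# Mann-Whitney U test, one-sided in the same direction as the t-test.
mw_res = stats.mannwhitneyu(ret_pos, ret_neg, alternative="greater")
mw_stat, mw_p = mw_res.statistic, mw_res.pvalue

for name, p in [("Welch t-test", t_p), ("Median test", med_p), ("Mann-Whitney", mw_p)]:
    print(f"{name}: p-value = {p:.4f}, reject H0 at alpha = {alpha}: {p < alpha}")
```

With real data, the two arrays would instead come from splitting the returns column by the sign of lagged earnings, after any cleaning decisions have been made and justified.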
Task 2 (20 marks). Theoretical derivations:
Consider the population SLR model:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
and an observed, random sample of data $(y_1, x_1), \ldots, (y_n, x_n)$ from that model. An
OLS regression is run on this data.
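
For reference, the two first-order conditions that the hints in the questions refer to can be written out as follows. The notation $RSS(\beta_0, \beta_1)$ for the residual sum of squares as a function of candidate coefficient values is an assumption made here; the OLS estimates $\hat\beta_0$ and $\hat\beta_1$ are the values at which both derivatives equal zero.

```latex
\begin{align*}
RSS(\beta_0, \beta_1) &= \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \\
\frac{\partial RSS}{\partial \beta_0} &= -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) \\
\frac{\partial RSS}{\partial \beta_1} &= -2 \sum_{i=1}^{n} x_i \, (y_i - \beta_0 - \beta_1 x_i)
\end{align*}
```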
Questions:
(a) Show that the mean of the estimated residuals from the OLS regression exactly
equals 0, i.e. $\bar{e} = 0$. Hint: look at the first equation found when differentiating the
residual sum of squares with respect to β0. (2 marks)
(b) Show that the correlation between the estimated residuals and the observed x’s
exactly equals 0. How does this result relate to the 2nd LSA? Hint: look at the second
equation found when differentiating the residual sum of squares with respect to β1. (5
marks)
(c) Show that the equality TSS = RegSS + RSS holds, i.e. show that:
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Hint: add and subtract $\hat{y}_i$ inside the square on the left side of the equation. (6 marks)
(d) Explain and show why $SER^2 = \widehat{\mathrm{Var}}(\varepsilon) = \widehat{\mathrm{Var}}(Y \mid X)$. (7 marks)
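
The algebraic results in parts (a) to (c) can be checked numerically before attempting the proofs. The sketch below uses only the Python standard library and a small made-up data set (the values in `x` and `y` are hypothetical). It fits the SLR by the closed-form OLS formulas, then evaluates the residual mean, the residual-regressor correlation, and both sides of the sum-of-squares decomposition. It also computes SER² with the usual n − 2 divisor for a simple linear regression. This is a sanity check, not a derivation.

```python
import math

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 4.0, 5.0, 7.0, 8.0]
y = [1.5, 1.9, 3.2, 4.8, 5.1, 7.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for y = b0 + b1 * x + e.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Fitted values and estimated residuals.
y_hat = [b0 + b1 * xi for xi in x]
e = [yi - fi for yi, fi in zip(y, y_hat)]

# Part (a): mean of the residuals.
e_bar = sum(e) / n

# Part (b): sample correlation between residuals and x.
s_ex = sum((ei - e_bar) * (xi - x_bar) for ei, xi in zip(e, x))
s_ee = sum((ei - e_bar) ** 2 for ei in e)
corr_ex = s_ex / math.sqrt(s_ee * sxx)

# Part (c): sum-of-squares decomposition.
tss = sum((yi - y_bar) ** 2 for yi in y)
reg_ss = sum((fi - y_bar) ** 2 for fi in y_hat)
rss = sum(ei ** 2 for ei in e)

# Part (d): squared standard error of the regression.
ser_sq = rss / (n - 2)

print(f"mean residual      = {e_bar:.2e}")
print(f"corr(residual, x)  = {corr_ex:.2e}")
print(f"TSS - (RegSS+RSS)  = {tss - (reg_ss + rss):.2e}")
print(f"SER^2              = {ser_sq:.4f}")
```

The first three printed values should be zero up to floating-point rounding, which is what the proofs establish exactly.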
