Epidemiology and Statistics for Public Health

Value: 45% of course assessment
The assignment will be marked out of 72 and converted to a mark out of 45.
Answer all questions. Please show your working for calculations.
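As an illustration of the mark conversion described above, the short Python sketch below scales a raw mark out of 72 to a mark out of 45. The function name convert_mark and the example raw mark of 54 are hypothetical. No rounding rule is stated in this document, so none is applied here.

```python
def convert_mark(raw_mark: float, raw_total: float = 72, weight: float = 45) -> float:
    """Scale a raw mark out of raw_total to a mark out of weight (no rounding)."""
    if not 0 <= raw_mark <= raw_total:
        raise ValueError("raw mark must lie between 0 and the raw total")
    return raw_mark / raw_total * weight


# Hypothetical example: a raw mark of 54 out of 72
print(convert_mark(54))  # 33.75
```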
Formatting your assignment:
• Please submit your assignment online as one document in PDF format. This format
locks the content and preserves the formatting, graphs and Stata output, which is
particularly important if you are using a Mac computer;
• Please number every page and include your SID in the header of every page;
• Clearly number the answer to each question; only include your answer (not the question);
• Use 11-point or 12-point font and 1.5 line spacing (not double spacing);
• Format figures and tables appropriately according to the guidelines given in this course.
Any hand-drawn graphs should be scanned and pasted into your assignment; and
• Do not submit Stata data files.
Submitting your assignment
Please submit your assignment online through the course Moodle site BY 9.00 AM (AEST)
Tuesday 05 May 2020. Instructions about how to submit your assignment are provided on
pages 21-22 of the Course Outline and on the Moodle site (see Submitting your assignments
(A2 and A3)).
You must agree that the work is entirely your own before you can submit your assignment. You
will be submitting your assignment via Turnitin, which is a similarity detection program.
Late submission, collusion and plagiarism
Late submission without the permission of the course convenors will result in a penalty
of two percent (2%) per day. Workload is usually not considered a valid reason for requesting
an extension. However, given the current COVID-19 outbreak and major disruptions to your
work, extensions will be considered on a case-by-case basis. Extensions for health-related
reasons will be given with appropriate evidence.
You must submit your own work. All assignments will be electronically checked against all
other assignments submitted for this course. Evidence of collusion constitutes Academic
Misconduct and will be investigated. If proven, the grade of 0% will be awarded and your name
will be placed on the University Academic Misconduct Register.
PHCM9498 Epidemiology and Statistics – Assignment 2 2020
This assignment assesses your understanding of topics discussed in Modules 1 to 10 of the
course but focuses mainly on Modules 7 to 10.
It is important to present your answers to all questions with appropriate precision (i.e. in tables,
figures and text). Tables and figures should be presented in the way you would present them in
a scientific journal or a standard report. You should follow the presentation guidelines you have
been given in the course. When asked to write a conclusion or interpret values, your answer
should include relevant values you have calculated, and other data provided to you.
You will need to use Stata to answer Question 1 and Question 2. Question 3 requires you to
read the edited extract on Page 5.
Question 1 [25 marks]
A cross-sectional study was conducted to examine the effect of gestational age on systolic blood
pressure (SBP) of low birth weight babies weighing less than 1500 g. Data were collected on
60 such babies and posted on Moodle in the Excel file Assign2Q1.xls. The dataset contains the
following variables:
ID = participant ID number
sbp = systolic blood pressure (mmHg)
gestage = gestational age (weeks)
a) What are the study factor and the outcome factor? [1 mark]
b) To explore the association, calculate the correlation coefficient and interpret it. [3 marks]
c) Conduct a simple linear regression using Stata and report the Stata output. What are the
assumptions for a linear regression? Examine the assumptions with the support of relevant
graphs and statistics. [12 marks]
d) Write down the regression equation and interpret the regression coefficients and their 95%
confidence intervals from part c). [6 marks]
e) What is the expected systolic blood pressure of a newborn whose gestational age is 24
weeks? Show your workings. [3 marks]
Question 2 [15 marks]
A case-control study was planned to investigate whether there was an association between a
mother being diagnosed with toxaemia (a condition in pregnancy, also known as pre-eclampsia,
characterised by abrupt hypertension, albuminuria and oedema) and the baby being born with
low birth weight. The research team wished to recruit the cases and controls from antenatal
clinics. Based on a pilot study, the team expected the odds ratio for this association to be 2.5
and the prevalence of toxaemia among women giving birth to a normal weight baby to be 6%.
A two-sided significance test was to be used.
a) If an equal number of cases and controls could be recruited in this study, how many in each
group would be required to achieve 90% power at the 5% level of significance? Include a
screenshot of your Stata command and output with your response. [2 marks]
b) One of the researchers thought that prevalence of toxaemia among the controls would be
i. What effect will this have on the required sample size to detect an OR of 2.5 with the
same power and level of significance as in part a)? [2 marks]
ii. If the prevalence of toxaemia in the control group is uncertain, would it be preferable
to assume that 4% or 6% of the control mothers have the condition? Describe your
reason. [2 marks]
c) A similar study in the same source population found that approximately 80% of the mothers
approached for the study would agree to participate. From this information, how many
mothers of newborns will need to be approached to achieve the sample size that you
estimated in part a)? Show your workings. [3 marks]
d) The research team found that they did not have a sufficient number of cases as per the
calculation in part a) and decided to recruit 2 controls per case. How many cases and
controls would be required if the power, effect size, prevalence of exposure among the
controls and level of significance are the same as in part a)? Include a screenshot of your
Stata command and output with your response. [3 marks]
e) For 28 of the newborns, the Apgar score at 5 minutes (ranging from 0 to 10) was available.
A description of the Apgar score can be found here. The research team also wanted to examine
whether the Apgar score differs by history of toxaemia among the mothers. The Apgar score is
highly negatively skewed in both groups and for some of the newborns it was zero. What
statistical test is appropriate for this study? Explain the reason. [3 marks]
Question 3 [32 marks]
Read the Extract, which is a highly edited version of a published paper. Use the information
provided in the Extract to answer the following questions.
a) Using PICO format, write the research question for the study. [2 marks]
b) What type of study has been used to answer the research question? What would be the
best study design to answer this research question? Provide reasons for your answers.
[3 marks]
c) i. In your own words, briefly describe the source population and study participants.
[4 marks]
ii. Are there any issues of concern about selection bias in this study? Provide
reasons for your answer. [6 marks]
d) i. What is the main study factor? Are there any concerns about measurement error
of the study factor? [3 marks]
ii. What is the outcome factor? Are there any concerns about measurement error of
the outcome factor? [3 marks]
e) i. How have the researchers dealt with potential confounding in this study?
[2 marks]
ii. Are there any problems with confounding? Provide a reason for your answer.
[2 marks]
f) In your own words, briefly summarise the main results of the study shown in Table 2 of
the Extract. Please focus on three factors that are associated with LHC and one that is not
associated with LHC. [Note: You do not need to provide the formal explanation for each
set of values.] [4 marks]
g) What is your assessment of the internal and external validity of the study findings?
[3 marks]