# Random variables

   
Question 1, which covers topics in Unit 7, and Question 2, which covers
topics in Unit 8, form M248 TMA 04. Question 1 is marked out of 23;
Question 2 is marked out of 27.
Question 1 – 23 marks
You should be able to answer this question after working through Unit 7.
(a) Let X and Y be independent random variables both with the same
mean µ ≠ 0. Define a new random variable W = aX + bY, where a and
b are constants.
(i) Obtain an expression for E(W).
(ii) What constraint is there on the values of a and b so that W is an
unbiased estimator of µ? Hence write all unbiased versions of W as
a formula involving a, X and Y only (and not b).
(b) An otherwise fair six-sided die has been tampered with in an attempt to
cheat at a dice game. The effect is that the 1 and 6 faces have a
different probability of occurring than the 2, 3, 4 and 5 faces.
Let θ be the probability of obtaining a 1 on this biased die. Then the
outcomes of rolling the biased die have the following probability mass
function.
Table 1 The p.m.f. of outcomes of rolls of a biased die
Outcome      1   2            3            4            5            6
Probability  θ   (1 − 2θ)/4   (1 − 2θ)/4   (1 − 2θ)/4   (1 − 2θ)/4   θ
(i) By consideration of the p.m.f. in Table 1, explain why it is
necessary for θ to be such that 0 < θ < 1/2.
(ii) The value of θ is unknown. Data from which to estimate the value
of θ were obtained by rolling the biased die 1000 times. The result
of this experiment is shown in Table 2.
Table 2 Outcomes of 1000 independent rolls of a biased die
Outcome 1 2 3 4 5 6
Frequency 205 154 141 165 145 190
Show that the likelihood of θ based on these data is
$$L(\theta) = C\,\theta^{395}(1 - 2\theta)^{605},$$
where C is a positive constant, not dependent on θ.
(iii) Show that
$$L'(\theta) = C\,\theta^{394}(1 - 2\theta)^{604}(395 - 2000\theta).$$
(iv) What is the value of the maximum likelihood estimate, θ̂, of θ
based on these data? Justify your answer. What does the value of
θ̂ suggest about the value of θ for this biased die compared with
the value of θ associated with a fair, unbiased die?
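
As a numerical check on part (b), and not a substitute for the algebra the question asks for, the likelihood can be evaluated directly from Tables 1 and 2. The sketch below is in Python rather than Minitab; the names `pmf`, `log_likelihood`, `grid` and `theta_max` are illustrative choices, and the grid step of 0.0001 is arbitrary.

```python
import math

# Frequencies from Table 2 (outcomes 1..6 of 1000 rolls).
freq = {1: 205, 2: 154, 3: 141, 4: 165, 5: 145, 6: 190}

def pmf(outcome, theta):
    """P.m.f. from Table 1: theta for faces 1 and 6, (1 - 2*theta)/4 otherwise."""
    return theta if outcome in (1, 6) else (1 - 2 * theta) / 4

def log_likelihood(theta):
    """Log-likelihood of theta for the Table 2 data."""
    return sum(n * math.log(pmf(x, theta)) for x, n in freq.items())

# Exponents that appear in L(theta): counts of {1, 6} and of {2, 3, 4, 5}.
n_ends = freq[1] + freq[6]
n_middle = sum(freq[x] for x in (2, 3, 4, 5))

# Crude grid search over 0 < theta < 1/2 (step 0.0001, an arbitrary choice).
grid = [k / 10000 for k in range(1, 5000)]
theta_max = max(grid, key=log_likelihood)

print(n_ends, n_middle, theta_max)
```

The grid maximiser should agree with the root of the factor 395 − 2000θ in part (b)(iii); comparing it with 1/6 gives a feel for the comparison asked for in part (b)(iv).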

(c) Studies of the size and range of wild animal populations often involve
tagging observed individual animals and recording how many times each
is caught in a trap (from which it is then released back into the wild).
The dataset presented in Table 3 consists of the numbers of times each
of n = 334 wood mice were caught in a particular trap (over a two-year
time period). The data are also provided in the Minitab file
wood-mice.mwx.
Table 3 Numbers of trappings of wood mice
Times trapped 1 2 3 4 5 6 7 8 9
Frequency 71 59 41 39 20 26 19 12 9
Times trapped 10 11 12 13 14 15 16 17 18
Frequency 5 8 4 9 2 1 3 3 3
The geometric distribution with parameter p is a good model for these
data.

(i) What is the maximum likelihood estimator of p for a geometric model? [1]
(ii) What is the maximum likelihood estimate of p for the data in
Table 3? You are recommended to use Minitab to help you to
answer this part of the question. [2]
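
The question recommends Minitab for part (c)(ii); as a rough cross-check, the same summary statistics can be computed from Table 3 in a few lines of Python. This is only a sketch: it assumes the geometric model takes values 1, 2, ..., for which the maximum likelihood estimate of p is the reciprocal of the sample mean, and the variable names are illustrative.

```python
# Table 3: times trapped -> number of wood mice.
trappings = {
    1: 71, 2: 59, 3: 41, 4: 39, 5: 20, 6: 26, 7: 19, 8: 12, 9: 9,
    10: 5, 11: 8, 12: 4, 13: 9, 14: 2, 15: 1, 16: 3, 17: 3, 18: 3,
}

n = sum(trappings.values())                       # number of mice
total = sum(x * f for x, f in trappings.items())  # total number of trappings
sample_mean = total / n

# Assumption: geometric model on 1, 2, ..., for which the maximum
# likelihood estimate of p is the reciprocal of the sample mean.
p_hat = 1 / sample_mean

print(n, total, round(sample_mean, 4), round(p_hat, 4))
```

The first printed value should reproduce n = 334, which confirms that the table has been entered correctly.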
Question 2 – 27 marks
You should be able to answer this question after working through Unit 8.
(a) In this part of the question, you should calculate the required
confidence interval by hand, using tables, and show your working. (You
Modern aircraft cockpit windscreens are complex items, comprising
several layers of material and a heating system. Such windscreens are
replaced upon damage to any of their components. A dataset was
collected on the times to replacement of n = 84 windscreens of a
particular modern airliner. The sample mean windscreen replacement
time was 23 515 hours of flight. The sample standard deviation of
windscreen replacement times was 5168 hours of flight.
(i) Obtain an approximate 90% confidence interval for the mean
replacement time of this type of aircraft windscreen. What
property of the dataset justifies using this type of confidence
interval, and why? [6]
(ii) Interpret the particular confidence interval that you found in
part (a)(i) in terms of repeated experiments. [3]
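
Part (a) is to be done by hand with tables, so the following Python sketch is offered only as a way of checking hand arithmetic afterwards. It assumes the interval intended is the large-sample z-interval, mean ± z × s/√n; the variable names are illustrative, and `NormalDist` comes from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import NormalDist

n = 84
xbar = 23515.0   # sample mean replacement time (hours of flight)
s = 5168.0       # sample standard deviation (hours of flight)
level = 0.90

# Upper quantile of N(0, 1) leaving (1 - level)/2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - level) / 2)

ese = s / sqrt(n)            # estimated standard error of the mean
lower = xbar - z * ese
upper = xbar + z * ese

print(round(z, 3), round(lower), round(upper))
```

The quantile computed here is more precise than a value read from tables, so small differences in the final digits of the limits are to be expected.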

(b) In this part of the question, you should calculate the required
confidence interval by hand, using tables, and show your working. (You
In a large study of patients who were being treated for hypertension
(high blood pressure), 148 out of 5493 patients receiving the
conventional treatment for hypertension later suffered a stroke. Also,
192 out of 5492 patients receiving an alternative drug to treat their
hypertension later suffered a stroke.
(i) Obtain an approximate 95% confidence interval for the difference
in proportions between the number of conventionally treated
hypertension patients who later suffered a stroke and the number
of hypertension patients treated with the alternative drug who
later suffered a stroke. (You are advised to work with proportions
rounded to four decimal places throughout; also, you may assume
that the numbers involved are large enough that your
approximation is a good one.) [5]
(ii) Some clinicians had suggested that the proportions of hypertension
patients who suffered a stroke would not depend on which
treatment they were being given. Are the data consistent with that [2]
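
As with part (a), the calculation in part (b) is meant to be done by hand; the Python sketch below is only a possible check. It assumes the usual large-sample interval for a difference of two proportions, with estimated standard error √(p1(1 − p1)/n1 + p2(1 − p2)/n2), and rounds the proportions to four decimal places as the question advises. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 148, 5493   # strokes / patients, conventional treatment
x2, n2 = 192, 5492   # strokes / patients, alternative drug

# Proportions rounded to four decimal places, as the question advises.
p1 = round(x1 / n1, 4)
p2 = round(x2 / n2, 4)

d = p1 - p2
ese = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = NormalDist().inv_cdf(0.975)   # 0.975-quantile of N(0, 1)

lower, upper = d - z * ese, d + z * ese
print(round(lower, 4), round(upper, 4))
```

Whether the printed interval contains 0 is the feature relevant to part (b)(ii).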
(c) In various places in this module, data on the silver content of coins
minted in the reign of the twelfth-century Byzantine king Manuel I
Comnenus have been considered. The full dataset is in the Minitab file
coins.mwx. The dataset includes, among others, the values of the
silver content of nine coins from the first coinage (variable Coin1) and
seven from the fourth coinage (variable Coin4) which was produced a
number of years later. (For the purposes of this question, you can
ignore the variables Coin2 and Coin3.) In particular, in Activity 8 and
Exercise 2 of Computer Book B, it was argued that the silver contents
in both the first and the fourth coinages can be assumed to be normally
distributed. The question of interest is whether there were differences in
the silver content of coins minted early and late in Manuel’s reign. You
are about to investigate this question using a two-sample t-interval.
(i) Using Minitab, find either the sample standard deviations of the
two variables Coin1 and Coin4, or their sample variances. Hence
check for equality of variances using the rule of thumb given in
Subsection 4.4 of Unit 8. [3]
(ii) Whatever the outcome of part (c)(i), use Minitab to obtain a 90%
two-sample t-interval for the difference E(X₁) − E(X₄), where X₁
denotes the silver content in coins of the first coinage, and X₄
denotes the silver content in coins of the fourth coinage. State that
interval and comment briefly on what it tells us about the silver
content of coins in the earlier and later coinages. [3]
(iii) Name the distribution used in constructing the confidence interval
in part (c)(ii), state the value of its parameter, and show why the
parameter takes the value that it does. [2]
(iv) What would have been the outcome if you had obtained a 90%
two-sample t-interval for E(X₄) − E(X₁) instead of for
E(X₁) − E(X₄)? Justify your conclusion in terms of the derivative
of the parameter transformation involved. [3]