8.3.2 Finding Interval Estimators

Let $Z \sim N(0,1)$. Find $x_l$ and $x_h$ such that \begin{align}%\label{} P\bigg(x_l \leq Z \leq x_h \bigg)= 0.95 \end{align}
- Solution
Here, $\alpha=0.05$ and the CDF of $Z$ is given by the $\Phi$ function. Thus, we can choose
\begin{align}%\label{} x_l= \Phi^{-1}(0.025)=-1.96, \quad \textrm{and} \quad x_h= \Phi^{-1}(1-0.025)=1.96. \end{align}
Thus, for a standard normal random variable $Z$, we have
\begin{align}%\label{} P\bigg(-1.96 \leq Z \leq 1.96 \bigg)= 0.95. \end{align}
More generally, we can find a $(1-\alpha)$ interval for the standard normal random variable. Assume $Z \sim N(0,1)$. Let us define a notation that is commonly used. For any $p \in [0,1]$, we define $z_p$ as the real value for which
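\begin{align}%\label{} P(Z > z_p)=p. \end{align}
Equivalently,
\begin{align}%\label{} \Phi(z_p)=1-p, \quad z_p=\Phi^{-1}(1-p). \end{align}
By symmetry of the normal distribution, we also conclude
\begin{align}%\label{} z_{1-p}=-z_p. \end{align}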
Figure 8.3 shows $z_p$ and $z_{1-p}=-z_p$ on the real line. In MATLAB, to compute $z_p$ you can use the following command: $\mathtt{norminv(1-p)}$.
Figure 8.3 - By definition, $z_p$ is the real number for which we have $\Phi(z_p)=1-p$.
Figure 8.4 - A $(1-\alpha)$ interval for the $N(0,1)$ distribution. In particular, in this figure, we have $P\left(Z \in \big[-z_{\frac{\alpha}{2}},z_{\frac{\alpha}{2}}\big] \right)= 1-\alpha$.
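For readers without MATLAB, the same quantity can be computed with Python's standard library. The following is only a sketch: the helper name `z` is our own choice, and it relies on `statistics.NormalDist.inv_cdf`, which plays the role of $\Phi^{-1}$.

```python
from statistics import NormalDist

def z(p: float) -> float:
    """Return z_p, the real number with P(Z > z_p) = p for Z ~ N(0, 1).

    inv_cdf is only defined on the open interval (0, 1), so p must
    satisfy 0 < p < 1.
    """
    return NormalDist().inv_cdf(1 - p)

print(round(z(0.025), 2))   # 1.96
print(round(z(0.05), 3))    # 1.645
print(round(z(0.005), 3))   # 2.576

# Symmetry of the normal distribution: z_{1-p} = -z_p (up to rounding error).
print(abs(z(0.975) + z(0.025)) < 1e-9)   # True
```

The three printed values are the ones used for $95\%$, $90\%$, and $99\%$ intervals in the examples of this section.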
Now, let's talk about how we can find interval estimators. A general approach is to start with a point estimator $\hat{\Theta}$, such as the MLE, and create the interval $\big[\hat{\Theta}_l,\hat{\Theta}_h\big]$ around it such that $P\bigg(\theta \in \big[\hat{\Theta}_l,\hat{\Theta}_h\big] \bigg) \geq 1-\alpha$. How do we do this? Let's look at an example.
Let $X_1$, $X_2$, $X_3$, $...$, $X_n$ be a random sample from a normal distribution $N(\theta, 1)$. Find a $95 \%$ confidence interval for $\theta$.
- Solution
- Let's start with a point estimator $\hat{\Theta}$ for $\theta$. Since $\theta$ is the mean of the distribution, we can use the sample mean \begin{align}%\label{} \hat{\Theta}=\overline{X}=\frac{X_1+X_2+...+X_n}{n}. \end{align} Since $X_i \sim N(\theta, 1)$ and the $X_i$'s are independent, we conclude that \begin{align}%\label{} \overline{X} \sim N\left(\theta, \frac{1}{n}\right). \end{align} By normalizing $\overline{X}$, we conclude that the random variable \begin{align}%\label{} \frac{\overline{X}-\theta}{\frac{1}{\sqrt{n}}}=\sqrt{n}(\overline{X}-\theta) \end{align} has a $N(0,1)$ distribution. Therefore, by Example 8.12, we conclude \begin{align}%\label{} P\bigg(-1.96 \leq \sqrt{n}(\overline{X}-\theta) \leq 1.96 \bigg)=0.95 \end{align} which is equivalent to (by rearranging the terms) \begin{align}%\label{} P\bigg(\overline{X}-\frac{1.96}{\sqrt{n}} \leq \theta \leq \overline{X}+\frac{1.96}{\sqrt{n}}\bigg)=0.95 \end{align} Therefore, we can report the interval \begin{align}%\label{} [\hat{\Theta}_l, \hat{\Theta}_h]=\left[\overline{X}-\frac{1.96}{\sqrt{n}}, \overline{X}+\frac{1.96}{\sqrt{n}}\right] \end{align} as our $95 \%$ confidence interval for $\theta$.
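One way to build intuition for the statement $P\big(\overline{X}-\frac{1.96}{\sqrt{n}} \leq \theta \leq \overline{X}+\frac{1.96}{\sqrt{n}}\big)=0.95$ is to simulate it. The sketch below is an illustration, not part of the derivation: the function name `coverage`, the values $\theta=3$ and $n=20$, and the seed are all arbitrary assumptions. It draws many samples from $N(\theta,1)$ and reports the fraction of intervals that contain $\theta$.

```python
import random
from math import sqrt
from statistics import fmean

def coverage(theta, n, reps=4000, seed=1):
    """Fraction of simulated intervals [xbar - 1.96/sqrt(n), xbar + 1.96/sqrt(n)]
    that contain the true mean theta, for samples of size n from N(theta, 1)."""
    rng = random.Random(seed)
    half = 1.96 / sqrt(n)          # half-width of the interval
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(theta, 1.0) for _ in range(n))
        if xbar - half <= theta <= xbar + half:
            hits += 1
    return hits / reps

# theta = 3.0 and n = 20 are made-up values for illustration.
print(coverage(theta=3.0, n=20))   # should be close to 0.95
```

Note that the interval is random while $\theta$ is fixed: the simulation counts how often the random interval lands around the fixed parameter.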
At first, it might seem that our solution to Example 8.13 is not based on a systematic method. You might have asked: "How should I know that I need to work with the normalized $\overline{X}$?" However, by thinking more deeply about the way we solved this example, we can suggest a general method to solve confidence interval problems. The crucial fact about the random variable \begin{align}%\label{} \overline{X}-\theta \end{align} is that its distribution does not depend on the unknown parameter $\theta$. Thus, we could easily find a $95\%$ interval for the random variable $\sqrt{n}(\overline{X}-\theta)$ that did not depend on $\theta$. Such a random variable is called a pivot or a pivotal quantity. Let us define this more precisely.
Pivotal Quantity
Let $X_1$, $X_2$, $X_3$, $...$, $X_n$ be a random sample from a distribution with a parameter $\theta$ that is to be estimated. The random variable $Q$ is said to be a pivot or a pivotal quantity, if it has the following properties:
- It is a function of the observed data $X_1$, $X_2$, $X_3$, $...$, $X_n$ and the unknown parameter $\theta$, but it does not depend on any other unknown parameters: \begin{equation} Q=Q(X_1,X_2, \cdots, X_n, \theta). \end{equation}
- The probability distribution of $Q$ does not depend on $\theta$ or any other unknown parameters.
Check that the random variables $Q_1=\overline{X}-\theta$ and $Q_2=\sqrt{n}(\overline{X}-\theta)$ are both valid pivots in Example 8.13.
- Solution
- We note that $Q_1$ and $Q_2$ are, by definition, functions of $\overline{X}$ and $\theta$. Since \begin{align}%\label{} \overline{X}=\frac{X_1+X_2+...+X_n}{n}, \end{align} we conclude $Q_1$ and $Q_2$ are both functions of the observed data $X_1$, $X_2$, $X_3$, $...$, $X_n$ and the unknown parameter $\theta$, and they do not depend on any other unknown parameters. Also, since $\overline{X} \sim N\left(\theta, \frac{1}{n}\right)$, \begin{align}%\label{} Q_1 \sim N\left(0, \frac{1}{n}\right), \quad Q_2 \sim N(0,1). \end{align} Thus, their distributions do not depend on $\theta$ or any other unknown parameters. We conclude that $Q_1$ and $Q_2$ are both valid pivots.
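The pivot property can also be checked empirically. The following sketch (the function name `simulate_q2`, the sample size, the two values of $\theta$, and the seeds are assumptions chosen for illustration) simulates $Q_2=\sqrt{n}(\overline{X}-\theta)$ under two very different values of $\theta$ and compares the empirical mean and variance, which should be near $0$ and $1$ in both cases.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

def simulate_q2(theta, n=25, reps=5000, seed=0):
    """Draw reps copies of Q2 = sqrt(n) * (Xbar - theta), X_i ~ N(theta, 1)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        xbar = fmean(rng.gauss(theta, 1.0) for _ in range(n))
        draws.append(sqrt(n) * (xbar - theta))
    return draws

# Two arbitrary parameter values; the summary statistics should agree.
for theta, seed in ((0.0, 1), (50.0, 2)):
    q = simulate_q2(theta, seed=seed)
    print(theta, round(fmean(q), 2), round(pvariance(q), 2))
```

A simulation like this does not prove that $Q_2$ is a pivot, but a visible dependence on $\theta$ would indicate that a candidate quantity is not one.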
To summarize, here are the steps in the pivotal method for finding confidence intervals:
- First, find a pivotal quantity $Q(X_1,X_2, \cdots, X_n, \theta)$.
- Find an interval for $Q$ such that \begin{align}%\label{} P\big(q_l \leq Q \leq q_h\big)= 1-\alpha. \end{align}
- Using algebraic manipulations, convert the above equation to an equation of the form \begin{align}%\label{} P\big(\hat{\Theta}_l \leq \theta \leq \hat{\Theta}_h\big)= 1-\alpha. \end{align}
Let $X_1$, $X_2$, $X_3$, $...$, $X_n$ be a random sample from a distribution with known variance $\textrm{Var}(X_i)=\sigma^2$, and unknown mean $EX_i=\theta$. Find a $(1-\alpha)$ confidence interval for $\theta$. Assume that $n$ is large.
- Solution
- As usual, to find a confidence interval, we start with a point estimate. Since $\theta=EX_i$, a natural choice is the sample mean \begin{align}%\label{} \overline{X}=\frac{X_1+X_2+...+X_n}{n}. \end{align} Since $n$ is large, by the Central Limit Theorem (CLT), we conclude that \begin{align}%\label{} Q=\frac{\overline{X}-\theta}{\frac{\sigma}{\sqrt{n}}} \end{align} has approximately $N(0,1)$ distribution. In particular, $Q$ is a function of the $X_i$'s and $\theta$, and its distribution does not depend on $\theta$, or any other unknown parameters. Thus, $Q$ is a pivotal quantity. The next step is to find a $(1-\alpha)$ interval for $Q$. As we saw before, a $(1-\alpha)$ interval for the standard normal random variable $Q$ can be stated as \begin{align}%\label{} P\left(-z_{\frac{\alpha}{2}} \leq Q \leq z_{\frac{\alpha}{2}} \right)= 1-\alpha. \end{align} Therefore, \begin{align}%\label{} P\left(-z_{\frac{\alpha}{2}} \leq \frac{\overline{X}-\theta}{\frac{\sigma}{\sqrt{n}}} \leq z_{\frac{\alpha}{2}} \right)= 1-\alpha. \end{align} which is equivalent to \begin{align}%\label{} P\left(\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \leq \theta \leq \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \right)= 1-\alpha. \end{align} We conclude that $\left[\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\right]$ is a $(1-\alpha)100\%$ confidence interval for $\theta$.
The above example is our first important case of known interval estimators, so let's summarize what we have shown:
Assumptions: A random sample $X_1$, $X_2$, $X_3$, $...$, $X_n$ is given from a distribution with known variance $\textrm{Var}(X_i)=\sigma^2<\infty$; $n$ is large.
Parameter to be Estimated: $\theta=EX_i$.
Confidence Interval: $\left[\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\right]$ is approximately a $(1-\alpha)100\%$ confidence interval for $\theta$.
Note that to obtain the above interval, we used the CLT. Thus, what we found is an approximate confidence interval. Nevertheless, for large $n$, the approximation is very good.
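The rule above translates directly into a few lines of code. This is a minimal sketch assuming the data are available as a Python list and that $\sigma$ is known; the function name `mean_ci_known_sigma` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def mean_ci_known_sigma(sample, sigma, alpha=0.05):
    """Approximate (1 - alpha) confidence interval for the mean,
    assuming a known standard deviation sigma and a large sample."""
    n = len(sample)
    xbar = fmean(sample)                        # point estimate of theta
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    half = z * sigma / sqrt(n)                  # half-width of the interval
    return xbar - half, xbar + half

# Made-up measurements for illustration; in practice n should be large
# for the CLT approximation to be trustworthy.
data = [9.8, 10.4, 10.1, 9.5, 10.9, 10.2, 9.7, 10.3]
low, high = mean_ci_known_sigma(data, sigma=0.5, alpha=0.05)
print(f"[{low:.3f}, {high:.3f}]")
```

The half-width $z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}$ shrinks like $1/\sqrt{n}$, so halving the width of the interval requires four times as many observations.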
An engineer is measuring a quantity $\theta$. It is assumed that there is a random error in each measurement, so the engineer will take $n$ measurements and report the average of the measurements as the estimated value of $\theta$. Here, $n$ is assumed to be large enough so that the central limit theorem applies. If $X_i$ is the value that is obtained in the $i$th measurement, we assume that \begin{align}%\label{} X_i=\theta+W_i, \end{align} where $W_i$ is the error in the $i$th measurement. We assume that the $W_i$'s are i.i.d. with $EW_i=0$ and $\textrm{Var}(W_i)=4$ square units. The engineer reports the average of the measurements \begin{align}%\label{} \overline{X}=\frac{X_1+X_2+...+X_n}{n}. \end{align} How many measurements does the engineer need to make until he is $90\%$ sure that the final error is less than $0.25$ units? In other words, what should the value of $n$ be such that \begin{equation} P\big(\theta-0.25 \leq \overline{X} \leq \theta+0.25\big) \geq .90 \, ? \end{equation}
- Solution
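Note that, here, the $X_i$'s are i.i.d. with mean
\begin{align}%\label{} EX_i=\theta+EW_i=\theta, \end{align}
and variance
\begin{align}%\label{} \textrm{Var}(X_i)=\textrm{Var}(W_i)=4. \end{align}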
Thus, we can restate the problem using our confidence interval terminology: "Let $X_1$, $X_2$, $X_3$, $...$, $X_n$ be a random sample from a distribution with known variance $\textrm{Var}(X_i)=\sigma^2=4$. How large $n$ should be so that the interval
\begin{align}%\label{} \big[\overline{X}-0.25, \overline{X}+0.25\big] \end{align}
is a $90 \%$ confidence interval for $\theta=EX_i$?"
By our discussion above, the $90 \%$ confidence interval for $\theta=EX_i$ is given by \begin{equation} \left[\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\right]. \end{equation} Thus, we need \begin{equation} z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \leq 0.25, \end{equation} where $\sigma=2$ and $\alpha=1-0.90=0.1$. In particular, \begin{align}%\label{} z_{\frac{\alpha}{2}}=z_{0.05}=\Phi^{-1}(1-0.05)=1.645. \end{align} Thus, we need to have \begin{equation} 1.645 \frac{2}{\sqrt{n}} \leq 0.25, \quad \textrm{that is,} \quad n \geq \left(\frac{1.645 \times 2}{0.25}\right)^2 \approx 173.2. \end{equation} We conclude that $n \geq 174$ is sufficient.
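The sample-size calculation can be reproduced numerically. In the sketch below, the function name `sample_size` and its parameter names are our own; it solves $z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}} \leq \textrm{margin}$ for the smallest integer $n$.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, margin, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin,
    where alpha = 1 - confidence (CLT-based approximation)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / margin) ** 2)

# Values from the measurement example: sigma = 2, error 0.25, 90% confidence.
print(sample_size(sigma=2, margin=0.25, confidence=0.90))   # 174
```

Rounding up with `ceil` matters here: rounding $173.2$ down to $173$ would give an interval slightly wider than the target.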
Now suppose that $X_1$, $X_2$, $X_3$, $...$, $X_n$ is a random sample from a distribution with unknown variance $\textrm{Var}(X_i)=\sigma^2$. Our goal is to find a $1-\alpha$ confidence interval for $\theta=EX_i$. We also assume that $n$ is large. By the above discussion, we can say \begin{align}%\label{} P\left(\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \leq \theta \leq \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \right)= 1-\alpha. \end{align} However, there is a problem here. We do not know the value of $\sigma$. How do we deal with this issue? There are two general approaches: we can either find an upper bound for $\sigma$, or we can estimate $\sigma$.
- An upper bound for $\sigma$: Suppose that we can somehow show that \begin{align}%\label{} \sigma \leq \sigma_{max}, \end{align} where $\sigma_{max}<\infty$ is a real number. Then, if we replace $\sigma$ in $\left[\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\right]$ by $\sigma_{max}$, the interval can only get wider, so it contains $\theta$ with probability at least (approximately) $1-\alpha$. In other words, the interval \begin{align}%\label{} \left[\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma_{max}}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma_{max}}{\sqrt{n}}\right] \end{align} is still a valid $(1-\alpha)100\%$ confidence interval for $\theta$.
- Estimate $\sigma^2$: Note that here, since $n$ is large, we should be able to find a relatively good estimate for $\sigma^2$. After estimating $\sigma^2$, we can use that estimate and $\left[\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\right]$ to find an approximate $(1-\alpha)100\%$ confidence interval for $\theta$.
(Public Opinion Polling) We would like to estimate the portion of people who plan to vote for Candidate A in an upcoming election. It is assumed that the number of voters is large, and $\theta$ is the portion of voters who plan to vote for Candidate A. We define the random variable $X$ as follows. A voter is chosen uniformly at random among all voters and we ask her/him: "Do you plan to vote for Candidate $A$?" If she/he says "yes," then $X=1$, otherwise $X=0$. Then, \begin{align}%\label{} X \sim Bernoulli(\theta). \end{align} Let $X_1$, $X_2$, $X_3$, $...$, $X_n$ be a random sample from this distribution, which means that the $X_i$'s are i.i.d. and $X_i \sim Bernoulli(\theta)$. In other words, we randomly select $n$ voters (with replacement) and we ask each of them if they plan to vote for Candidate A. Find a $(1-\alpha)100\%$ confidence interval for $\theta$ based on $X_1$, $X_2$, $X_3$, $...$, $X_n$.
- Solution
- Note that, here, \begin{align}%\label{} EX_i=\theta. \end{align} Thus, we want to estimate the mean of the distribution. Note also that \begin{align}%\label{} \textrm{Var}(X_i)=\sigma^2=\theta(1-\theta). \end{align} Thus, to find $\sigma$, we need to know $\theta$. But $\theta$ is the parameter that we would like to estimate in the first place. By the above discussion, we know that if we can find an upper bound for $\sigma$, we can use it to build a confidence interval for $\theta$. Luckily, it is easy to find an upper bound for $\sigma$ in this problem. More specifically, define \begin{align}%\label{} f(\theta)=\theta(1-\theta), \quad \textrm{ for }\theta \in [0,1]. \end{align} By taking derivatives, you can show that the maximum value for $f(\theta)$ is obtained at $\theta=\frac{1}{2}$ and that \begin{align}%\label{} f(\theta)\leq f\left(\frac{1}{2} \right)=\frac{1}{4}, \quad \textrm{ for }\theta \in [0,1]. \end{align} Since $\sigma=\sqrt{f(\theta)}$, we conclude that \begin{align}%\label{} \sigma_{max}=\frac{1}{2} \end{align} is an upper bound for $\sigma$. We conclude that the interval \begin{align}%\label{} \left[\overline{X}- z_{\frac{\alpha}{2}} \frac{\sigma_{max}}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{\sigma_{max}}{\sqrt{n}}\right] \end{align} is a $(1-\alpha)100\%$ confidence interval for $\theta$, where $\sigma_{max}=\frac{1}{2}$. Thus, \begin{align}%\label{} \left[\overline{X}- \frac{z_{\frac{\alpha}{2}}}{2\sqrt{n}} , \overline{X}+ \frac{z_{\frac{\alpha}{2}}}{2\sqrt{n}}\right] \end{align} is a $(1-\alpha)100\%$ confidence interval for $\theta$. Note that we obtained the interval by using the CLT, so it is an approximate interval. Nevertheless, for large $n$, the approximation is very good. Also, since we have used an upper bound for $\sigma$, this confidence interval might be too conservative, especially if $\theta$ is far from $\frac{1}{2}$.
The above setting is another important case of known interval estimators, so let's summarize it:
Assumptions: A random sample $X_1$, $X_2$, $X_3$, $...$, $X_n$ is given from a $Bernoulli(\theta)$; $n$ is large.
Parameter to be Estimated: $\theta$
Confidence Interval: $\left[\overline{X}- \frac{z_{\frac{\alpha}{2}}}{2\sqrt{n}} , \overline{X}+ \frac{z_{\frac{\alpha}{2}}}{2\sqrt{n}}\right]$ is approximately a $(1-\alpha)100\%$ confidence interval for $\theta$. This is a conservative confidence interval as it is obtained using an upper bound for $\sigma$.
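As an illustration of this rule, the sketch below computes the conservative interval from a count of "yes" answers. The function name `conservative_proportion_ci` and the poll numbers ($540$ out of $1000$) are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

def conservative_proportion_ci(successes, n, alpha=0.05):
    """Approximate (1 - alpha) interval for a Bernoulli parameter theta,
    using the upper bound sigma_max = 1/2 for the standard deviation."""
    xbar = successes / n                        # sample proportion
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    half = z / (2 * sqrt(n))                    # z_{alpha/2} * sigma_max / sqrt(n)
    return xbar - half, xbar + half

# Hypothetical poll: 540 of 1000 sampled voters answer "yes".
low, high = conservative_proportion_ci(540, 1000)
print(f"[{low:.3f}, {high:.3f}]")   # [0.509, 0.571]
```

The half-width depends only on $n$ and $\alpha$, not on the observed proportion, which is what makes it possible to choose $n$ before any data are collected.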
There are two candidates in a presidential election: Candidate A and Candidate B. Let $\theta$ be the portion of people who plan to vote for Candidate A. Our goal is to find a confidence interval for $\theta$. Specifically, we choose a random sample (with replacement) of $n$ voters and ask them if they plan to vote for Candidate A. Our goal is to estimate the $\theta$ such that the margin of error is $3$ percentage points. Assume a $95 \%$ confidence level. That is, we would like to choose $n$ such that \begin{align}%\label{} P\left(\overline{X}-0.03 \leq \theta \leq \overline{X}+ 0.03 \right) \geq 0.95, \end{align} where $\overline{X}$ is the portion of people in our random sample that say they plan to vote for Candidate A. How large does $n$ need to be?
- Solution
- Based on the above discussion, \begin{align}%\label{} \left[\overline{X}- \frac{z_{\frac{\alpha}{2}}}{2\sqrt{n}} , \overline{X}+ \frac{z_{\frac{\alpha}{2}}}{2\sqrt{n}}\right] \end{align} is a valid $(1-\alpha)100\%$ confidence interval for $\theta$. Therefore, we need to have \begin{align}%\label{} \frac{z_{\frac{\alpha}{2}}}{2\sqrt{n}} \leq 0.03. \end{align} Here, $\alpha=0.05$, so $z_{\frac{\alpha}{2}}=z_{0.025}=1.96$. Therefore, we obtain \begin{align}%\label{} n \geq \left(\frac{1.96}{2 \times 0.03}\right)^2 \approx 1067.1. \end{align} We conclude $n \geq 1068$ is enough. The above calculation provides a reason why most polls before elections are conducted with a sample size of around one thousand.
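The same arithmetic can be checked in code, and repeated for other margins of error. This is a sketch: the function name `poll_sample_size` is our own, and the margins other than $0.03$ are extra illustrative values, not part of the example.

```python
from math import ceil
from statistics import NormalDist

def poll_sample_size(margin, confidence=0.95):
    """Smallest n with z_{alpha/2} / (2 * sqrt(n)) <= margin,
    based on the conservative bound sigma <= 1/2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z / (2 * margin)) ** 2)

print(poll_sample_size(0.03))   # 1068

# Required n grows like 1 / margin**2 as the margin shrinks.
for m in (0.05, 0.03, 0.01):
    print(m, poll_sample_size(m))
```

Because $n$ scales with $1/\textrm{margin}^2$, reducing the margin of error from $3$ to $1$ percentage point requires roughly nine times as many respondents.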
As we mentioned, the above calculation might be a little conservative. Another approach would be to estimate $\sigma^2$ instead of using an upper bound. In this example, the structure of the problem suggests a way to estimate $\sigma^2$. Specifically, since \begin{align}%\label{} \sigma^2=\theta(1-\theta), \end{align} we may use \begin{align}%\label{} \hat{\sigma}^2&= \hat{\theta}(1-\hat{\theta})\\ &=\overline{X}(1-\overline{X}) \end{align} as an estimate for $\sigma^2$, where $\hat{\theta}=\overline{X}$. The rationale behind this approximation is that since $n$ is large, $\overline{X}$ is likely a good estimate of $\theta$, thus $\hat{\sigma}^2= \hat{\theta}(1-\hat{\theta})$ is a good estimate of $\sigma^2$. After estimating $\sigma^2$, we can use $\left[\overline{X}- z_{\frac{\alpha}{2}} \frac{\hat{\sigma}}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{\hat{\sigma}}{\sqrt{n}}\right]$ as an approximate $(1-\alpha)100\%$ confidence interval for $\theta$. To summarize, we have the following confidence interval rule:
Assumptions: A random sample $X_1$, $X_2$, $X_3$, $...$, $X_n$ is given from a $Bernoulli(\theta)$; $n$ is large.
Parameter to be Estimated: $\theta$
Confidence Interval: $\left[\overline{X}- z_{\frac{\alpha}{2}}\sqrt{\frac{\overline{X}(1-\overline{X})}{n}} , \overline{X}+ z_{\frac{\alpha}{2}}\sqrt{\frac{\overline{X}(1-\overline{X})}{n}}\right]$ is approximately a $(1-\alpha)100\%$ confidence interval for $\theta$.
Again, the above confidence interval is an approximate confidence interval because we used two approximations: the CLT and an approximation for $\sigma^2$.
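To see how much the estimated-variance interval can differ from the conservative one, the sketch below computes both half-widths. The helper names and the poll numbers ($100$ "yes" answers out of $1000$) are hypothetical and chosen so that $\overline{X}$ is far from $\frac{1}{2}$.

```python
from math import sqrt
from statistics import NormalDist

def plug_in_proportion_ci(successes, n, alpha=0.05):
    """Approximate (1 - alpha) interval for theta using the estimate
    sigma_hat^2 = xbar * (1 - xbar) in place of the unknown variance."""
    xbar = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * sqrt(xbar * (1 - xbar) / n)
    return xbar - half, xbar + half

def conservative_half_width(n, alpha=0.05):
    """Half-width of the interval based on the bound sigma <= 1/2."""
    return NormalDist().inv_cdf(1 - alpha / 2) / (2 * sqrt(n))

# Hypothetical poll: 100 of 1000 sampled voters answer "yes".
low, high = plug_in_proportion_ci(100, 1000)
print(f"plug-in half-width:      {(high - low) / 2:.4f}")
print(f"conservative half-width: {conservative_half_width(1000):.4f}")
```

Since $\overline{X}(1-\overline{X}) \leq \frac{1}{4}$, the plug-in interval is never wider than the conservative one. One caveat: if every answer is the same ($\overline{X}=0$ or $1$), the estimated variance is zero and the interval collapses to a point, so the approximation should not be trusted in that case.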
The above scenario is a special case ($Bernoulli(\theta)$) for which we could come up with a point estimator for $\sigma^2$. Can we have a more general estimator for $\sigma^2$ that we can use for any distribution? We have already discussed such a point estimator and we called it the sample variance: \begin{align}%\label{} {S}^2=\frac{1}{n-1} \sum_{k=1}^n (X_k-\overline{X})^2=\frac{1}{n-1} \left(\sum_{k=1}^n X^2_k-n\overline{X}^2\right). \end{align} Thus, using the sample variance, ${S}^2$, we can have an estimate for $\sigma^2$. If $n$ is large, this estimate is likely to be close to the real value of $\sigma^2$. So let us summarize this discussion as follows:
Assumptions: A random sample $X_1$, $X_2$, $X_3$, $...$, $X_n$ is given from a distribution with unknown variance $\textrm{Var}(X_i)=\sigma^2<\infty$; $n$ is large.
Parameter to be Estimated: $\theta=EX_i$.
Confidence Interval: If $S$ is the sample standard deviation \begin{align}%\label{} S=\sqrt{\frac{1}{n-1} \sum_{k=1}^n (X_k-\overline{X})^2}=\sqrt{\frac{1}{n-1} \left(\sum_{k=1}^n X^2_k-n\overline{X}^2\right)}, \end{align} then the interval \begin{equation} \left[\overline{X}- z_{\frac{\alpha}{2}} \frac{S}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{S}{\sqrt{n}}\right] \end{equation} is approximately a $(1-\alpha)100\%$ confidence interval for $\theta$.
We have collected a random sample $X_1$, $X_2$, $X_3$, $...$, $X_{100}$ from an unknown distribution. The sample mean and the sample variance for this random sample are given by \begin{equation} \overline{X}=15.6, \quad S^2=8.4. \end{equation} Construct an approximate $99 \%$ confidence interval for $\theta=EX_i$.
- Solution
- Here, the interval \begin{equation} \left[\overline{X}- z_{\frac{\alpha}{2}} \frac{S}{\sqrt{n}} , \overline{X}+ z_{\frac{\alpha}{2}} \frac{S}{\sqrt{n}}\right] \end{equation} is approximately a $(1-\alpha)100\%$ confidence interval for $\theta$. Since $\alpha=0.01$, we have \begin{equation} z_{\frac{\alpha}{2}}=z_{0.005}=2.576. \end{equation} Using $n=100$, $\overline{X}=15.6$, $S^2=8.4$, we obtain the following interval \begin{equation} \left[15.6- 2.576 \frac{\sqrt{8.4}}{\sqrt{100}},15.6+ 2.576 \frac{\sqrt{8.4}}{\sqrt{100}}\right] \approx [14.85, 16.35]. \end{equation}
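The computation in this example can be verified with a short script. The function name `mean_ci_from_summary` is our own; it takes the summary statistics $\overline{X}$, $S^2$, and $n$ rather than the raw data.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_from_summary(xbar, s2, n, confidence):
    """Approximate confidence interval for the mean from the sample mean
    xbar, the sample variance s2 (n - 1 denominator), and a large n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    half = z * sqrt(s2 / n)                              # z * S / sqrt(n)
    return xbar - half, xbar + half

low, high = mean_ci_from_summary(xbar=15.6, s2=8.4, n=100, confidence=0.99)
print(f"[{low:.2f}, {high:.2f}]")   # [14.85, 16.35]
```

When raw data are available, `statistics.fmean` and `statistics.variance` (which uses the $n-1$ denominator, matching $S^2$) can supply `xbar` and `s2`.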