Archive for April, 2007

Network News

April 26, 2007

This is another hypothesis test. This time the claim made by the research involved corresponds to the alternative hypothesis.

Maybe the data this problem talks about is old. Can it really be 55% watching network news? What about all those people who only watch podcasts or only read Google News?

Problem adapted from Larson/Farber’s Elementary Statistics

How Do You Know Which Test to Use?

April 26, 2007

When you are doing a hypothesis test how do you decide which form to use? Say we are going to test the population proportion .27. Which of the three possibilities do we choose?

\begin{array}{lll} H_0 :\theta = .27 & H_0 :\theta = .27 & H_0 :\theta = .27 \\ H_a :\theta < .27 & H_a :\theta > .27 & H_a :\theta \ne .27 \end{array}

Part of what makes this hard to figure out is that “the claim” stated in a problem can correspond to either H_a or H_0. You can tell which form to use by the way the question is worded. Let’s look at some typical phrases used with these forms.

If you claim the population proportion “is smaller than 27%” or “is less than 27%” then you would use the form

\begin{array}{l} H_0 :\theta = .27 \\ H_a :\theta < .27 \quad \text{(your claim)} \end{array}

and your claim corresponds to the alternative hypothesis.

If you claim the population proportion “is greater than 27%” or “is more than 27%” then you would use the form

\begin{array}{l} H_0 :\theta = .27 \\ H_a :\theta > .27 \quad \text{(your claim)} \end{array}

and your claim corresponds to the alternative hypothesis.

If you claim the population proportion “differs from 27%” or “is not equal to 27%” then you would use this form

\begin{array}{l} H_0 :\theta = .27 \\ H_a :\theta \ne .27 \quad \text{(your claim)} \end{array}

and your claim corresponds to the alternative hypothesis.

If you claim the population proportion “is at most 27%” you will use

\begin{array}{l} H_0 :\theta = .27 \quad \text{(your claim)} \\ H_a :\theta > .27 \end{array}

since the alternative hypothesis to “is at most 27%” would be “is more than 27%”.

If you claim a population proportion “is at least 27%” you will use

\begin{array}{l} H_0 :\theta = .27 \quad \text{(your claim)} \\ H_a :\theta < .27 \end{array}

since the alternative hypothesis to “is at least 27%” would be “is less than 27%”.

Some books write these last two cases differently. The claim “is at most 27%” is written using the null hypothesis H_0 :\theta \leqslant .27 and the claim “is at least 27%” is written with the null hypothesis H_0 :\theta \geqslant .27. We’ll stick with our original three forms from the beginning of the post for everything we do, though.
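If it helps, the phrase-to-form rules above can be written down as a little lookup table. This is only a sketch in Python: the function name hypotheses_for_claim and the phrase keys are made up for illustration, and a real problem may word its claim differently.

```python
def hypotheses_for_claim(relation, p0):
    """Return (H0, Ha, which hypothesis carries the claim).

    `relation` is one of the typical phrases from the post; the keys
    below are illustrative, not an exhaustive list of wordings.
    """
    # phrase -> (sign used in Ha, hypothesis that matches the claim)
    forms = {
        "less than":    ("<", "Ha"),
        "greater than": (">", "Ha"),
        "not equal to": ("!=", "Ha"),
        "at most":      (">", "H0"),   # alternative to "at most" is "more than"
        "at least":     ("<", "H0"),   # alternative to "at least" is "less than"
    }
    sign, claim = forms[relation]
    return f"H0: theta = {p0}", f"Ha: theta {sign} {p0}", claim


print(hypotheses_for_claim("at most", 0.27))
```

Notice that H_0 always uses the equals sign here; only H_a and the location of the claim change.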

Do You Eat Breakfast

April 25, 2007

Here’s another example of a hypothesis test. Again you tell the type of test (right-tailed or left-tailed or two-tailed) from the form of the alternative hypothesis.

Notice in this one we got a fairly big p-value compared to the level of significance we were using. So we didn’t even come close to rejecting the null hypothesis this time.

Problem adapted from Larson/Farber’s Elementary Statistics 

Extraterrestrials

April 24, 2007

Here is an example of a two-tailed hypothesis test.

For a two-tailed test you find the z-value and then take the p-value to be twice the area of the tail beyond it. Since the alternative hypothesis for a two-tailed test is H_a :\theta \ne \theta_0, we have to allow that the population proportion \theta could be bigger than \theta_0 or smaller than \theta_0, so the test statistic could land in either tail. That is why we need two tails.
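The “twice the tail area” rule can be checked with a few lines of Python instead of the table. This is a sketch that assumes you already have the z-value; it uses NormalDist from Python’s standard statistics module, and the function name is hypothetical.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """p-value for a two-tailed z-test: twice the area beyond |z|."""
    # Area in one tail beyond |z| under the standard normal curve.
    tail = 1 - NormalDist().cdf(abs(z))
    return 2 * tail


# The sign of z does not matter for a two-tailed test.
print(two_tailed_p_value(1.96), two_tailed_p_value(-1.96))
```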

Hmm… even if USA Today is right I’m beginning to wonder under what conditions people did see an extraterrestrial. Late at night I bet…

Problem adapted from Larson/Farber’s Elementary Statistics

When to Use p-value, When to Use Level of Significance

April 24, 2007

When you do a hypothesis test, you use a sample to come up with a test statistic (z-value). Then you use the z-value to come up with a p-value, where the p-value is the area of the “tail(s)” involved.

Now the book says that you decide the result of the hypothesis test based on the size of the p-value (page 438). The smaller the p-value, the more evidence against the null hypothesis you have. This is what you do if you are not given a level of significance (i.e., no \alpha).

But what happens if the problem says to use a 1% or 5% level of significance (\alpha=.01 or \alpha=.05)?

This just means that when you are done you compare the p-value to this \alpha to decide whether you reject the null hypothesis or not.

If p \leqslant \alpha, then you reject the null hypothesis and say the result is “significant”.

If p >  \alpha, then you say there is not enough evidence to reject the null hypothesis and say the result is “not significant”.

Example (Using a level of significance in hypothesis test):

As an example, suppose you are using a level of significance \alpha=.05 and your p-value is .032. Then you reject the null hypothesis and say the result is significant.

Or suppose your level of significance is \alpha=.01 and your p-value is .025. Then you do not have enough evidence to reject the null hypothesis and say the result is not significant.

So the upshot is: when you have a level of significance, compare the p-value to it (\alpha) to decide whether to reject. If the problem doesn’t give a level of significance, then base your conclusion on the size of the p-value, using the table on p. 438 of our book.
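The comparison rule is simple enough to write as a tiny Python function. This is just a sketch of the rule stated above (the function name decide is made up); it checks the two examples from this post.

```python
def decide(p_value, alpha):
    """Compare a p-value to the level of significance alpha."""
    if p_value <= alpha:
        return "reject H0 (significant)"
    return "not enough evidence to reject H0 (not significant)"


# The two examples from the post.
print(decide(0.032, 0.05))  # p <= alpha
print(decide(0.025, 0.01))  # p > alpha
```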

Outlawing Cigarettes

April 20, 2007

This problem shows an example of a hypothesis test. This one is a right-tailed test, which you can tell from the form of the alternative hypothesis.

Note that if you are not using a level of significance to decide whether to reject or not, then you wind up with a p-value and make some conclusion about how much evidence there is against the null hypothesis based on the size of the p-value. The smaller the p-value the less likely you are to believe in the null hypothesis.
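For a one-tailed test the p-value is just the area of the single tail beyond your z-value. Here is a sketch in Python, assuming you already have z; the function name and the "right"/"left" labels are made up for illustration.

```python
from statistics import NormalDist


def one_tailed_p_value(z, tail="right"):
    """Area of one tail of the standard normal curve beyond z."""
    if tail == "right":
        # Right-tailed test (Ha uses >): area to the right of z.
        return 1 - NormalDist().cdf(z)
    if tail == "left":
        # Left-tailed test (Ha uses <): area to the left of z.
        return NormalDist().cdf(z)
    raise ValueError("tail must be 'right' or 'left'")


print(one_tailed_p_value(1.645, "right"))
```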

Problem adapted from Larson/Farber’s Elementary Statistics

 

Children Watching TV

April 14, 2007

Some people have been asking me to review how to find areas for normal distributions (from Module 4).  Since the rest of the course uses this heavily it is important we have this down cold. Here is a straightforward normal distribution problem where you have to find several areas by computing z-values and using the table. You should already know how to use the standard normal table (p. 579 in our book) before viewing this:

For these kinds of problems you always want to convert areas under the x-distribution into areas under the standard normal curve and use the table:

[Figure: areas under the x-distribution correspond to areas under the standard normal curve]

To change the x-values into z-values use the formula

z = \frac{{data - mean}}{{stddev}} = \frac{{x - \mu }}{{\sigma}}

Notice that the mean \mu corresponds to z=0.
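If you want to check your table work, here is a short Python sketch of the same conversion. It uses NormalDist from the standard library in place of the table, and the mean and standard deviation in the example (100 and 15) are made-up numbers, not the ones from the problem.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Convert an x-value into a z-value."""
    return (x - mu) / sigma


def area_between(x_low, x_high, mu, sigma):
    """Area under the x-distribution between two x-values."""
    std_normal = NormalDist()  # standard normal: mean 0, stddev 1
    z_low = z_score(x_low, mu, sigma)
    z_high = z_score(x_high, mu, sigma)
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


# Hypothetical numbers for illustration only.
print(z_score(100, 100, 15))           # the mean maps to z = 0
print(area_between(85, 115, 100, 15))  # within one stddev of the mean
```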

Students On Diets

April 13, 2007

Here is a problem involving sample proportions. This one is about dieting.

Make sure when you are doing these problems that you draw the curves that go with the information. It is much easier to understand what you are doing if you draw the curves and areas.

In all the problems using the sampling distribution you want to compute areas in some original distribution (the \hat p-distribution in this case) and you do that by changing things into z-scores and computing the areas under a standard normal curve instead:

[Figure: the \hat p-distribution (left) and the standard normal curve (right)]

The \theta is the population proportion and it corresponds to z = 0.

The \hat p is the sample proportion and it corresponds to a value in the left graph above. You change it into a z-value in the right graph above by using the formula for the z-score:

z = \frac{{data - mean}}{{stddev}} = \frac{{\hat p - \theta }}{{stddev}}

where the stddev is given in this case by:

stddev = \sqrt {\frac{{\theta (1 - \theta )}}{n}}
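The two formulas above fit together like this in Python. It is only a sketch: the function name is hypothetical and the numbers in the example are made up, not taken from the dieting problem.

```python
from math import sqrt


def proportion_z_score(p_hat, theta, n):
    """z-score of a sample proportion p_hat for samples of size n."""
    # Standard deviation of the p-hat distribution.
    stddev = sqrt(theta * (1 - theta) / n)
    return (p_hat - theta) / stddev


# Hypothetical numbers: theta = .50, n = 100, observed p_hat = .55.
print(proportion_z_score(0.55, 0.50, 100))
```

Once you have the z-score, you find the area the same way as for any other normal distribution problem.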

Problem adapted from Moore’s Basic Practice of Statistics

Having a Girl

April 10, 2007

This is another confidence interval problem involving estimating the percentage of adults who would prefer to have a girl if they could have only one child.

Notice in this one how it is pretty much impossible to even consider polling the entire population, which is all US adults. Looks like not many people would choose to have a girl if they could choose what their only child was!

Computing confidence intervals is pretty straightforward. Just get the margin of error E and subtract it from the sample proportion \hat p to get the left endpoint of the confidence interval. Then add it to \hat p to get the right endpoint of the confidence interval:

(\hat p - E,\hat p + E)

The confidence level then tells us how confident we are that the population proportion \theta lies in this interval.
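Here is a Python sketch of the interval calculation. Be aware of one assumption: this post does not spell out how to get E, so the sketch uses the usual large-sample formula E = z* times sqrt(p-hat(1 - p-hat)/n). The function name and the example numbers are made up.

```python
from math import sqrt


def confidence_interval(p_hat, n, z_star):
    """Return (left endpoint, right endpoint) for a population proportion."""
    # ASSUMPTION: margin of error from the usual large-sample formula,
    # which is not stated in this post.
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical sample: p_hat = .50 from n = 100, 95% confidence level.
print(confidence_interval(0.50, 100, 1.960))
```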

Problem adapted from Larson/Farber’s Elementary Statistics

 

Addicted to Smoking

April 8, 2007

This is a confidence interval problem.

Notice that in this example we computed the z* ourselves based on the 90% confidence level, but for most of the standard confidence levels (90%, 95%, 99%) you should just look them up in the table in our book on page 399.

For 90% confidence level, z*=1.645

For 95% confidence level, z*=1.960

For 99% confidence level, z*=2.576
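If you are curious where these table values come from, this Python sketch recomputes them. It uses NormalDist.inv_cdf from the standard statistics module; the function name z_star is made up for illustration.

```python
from statistics import NormalDist


def z_star(confidence_level):
    """Critical value z* that leaves the given area in the middle."""
    # Half of the leftover area sits in each tail, so z* is the point
    # with (1 - leftover/2) of the area to its left.
    area_to_left = 1 - (1 - confidence_level) / 2
    return NormalDist().inv_cdf(area_to_left)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_star(level), 3))
```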

This one has the memorable quote “If it says confidence interval you know it’s going to be a confidence interval problem.” Yeah!

Problem adapted from Larson/Farber’s Elementary Statistics