Working with Multinomial Proportions

MPA 6010

Ani Ruhil

1 / 35

Agenda

  1. Hypothesis tests with Multinomial Proportions

  2. Hypothesis tests for one-group proportions

  3. Hypothesis tests for two-group proportions

2 / 35

Multinomial data

3 / 35

Data from a single population are often categorical

For example,

  • market shares of Internet Explorer; Firefox; Safari; Chrome

  • voters identifying themselves as Democrats; Independents; Republicans

  • wealth classified as Poor; Low Income; Middle Income; High Income; Rich

  • Number of Gold, Silver, Bronze medals at the Olympic games

  • Individuals who Strongly Disagree, Neither Disagree nor Agree, or Strongly Agree with a statement

The hypothesis test then involves looking at the sample proportions vis-a-vis those we suspect/expect to be true for the population

4 / 35

Mechanics of Hypothesis Test for Multinomial Population

\(H_{0}: P_{a} = x\%, P_{b} = y\%, P_{c} = z\%\)

\(H_{1}:\) At least one proportion differs from its hypothesized value

Test Statistic is

$$\chi^{2} = \sum ^{k} _{i=1} \dfrac{(f_{i} - e_{i})^{2}}{e_{i}}$$

where ... \(f_{i}=\) Observed frequency, \(e_{i}=\) Expected frequency, and \(k=\) Number of categories

\(\chi^{2} \sim \chi^{2}_{k-1}\), i.e., the test statistic follows a \(\chi^{2}\) distribution with \(df=k-1\), if \(e_{i} \geq 5\) for all categories

Reject \(H_{0}\) if \(p-value \leq \alpha\) or, alternatively, Reject \(H_{0}\) if Calculated \(\chi^{2} \geq\) Critical \(\chi^{2}\)

5 / 35

The \(\chi^2\) distribution

Here is an applet that shows you the distribution

6 / 35

Example 1

Four health campaigns are aired and recall is recorded for a sample of 300 viewers. The null hypothesis is that each campaign is recalled by the same proportion of viewers.

\(H_{0}:P_{a}=0.25; P_{b}=0.25; P_{c}=0.25; P_{d}=0.25\) and \(H_{1}:\) Proportions are different

\(e_{a} = 0.25(300)=75; e_{b} = 0.25(300)=75; e_{c} = 0.25(300)=75; e_{d} = 0.25(300)=75\)

Campaign \(f_{i}\) \(e_{i}\) \((f_{i}-e_{i})\) \((f_{i}-e_{i})^{2}\) \((f_{i}-e_{i})^{2}/{e_{i}}\)
a 85 75 10 100 1.3333
b 95 75 20 400 5.3333
c 50 75 -25 625 8.3333
d 70 75 -5 25 0.3333
Total 300 300 15.3333

\(\chi^{2}_{df=3} = 15.3333\)

\(p-value < 0.005\); Reject \(H_{0}\); The proportions are different and so the health campaigns are not all equally effective
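The arithmetic in Example 1 can be checked with a short script. This is a minimal sketch in Python using only the standard library; `chi2_sf` is a hypothetical helper name introduced here (it is not part of the slides) that evaluates the upper-tail probability of the \(\chi^2\) distribution for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable, integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2               # exact tail for df = 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1    # exact tail for df = 1
    while k < df:
        # Step from k to k + 2 degrees of freedom
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q

# Observed recall counts and hypothesized proportions from Example 1
observed = {"a": 85, "b": 95, "c": 50, "d": 70}
null_props = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}

n = sum(observed.values())
expected = {c: null_props[c] * n for c in observed}
chi2 = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
df = len(observed) - 1
p_value = chi2_sf(chi2, df)

print(f"chi2 = {chi2:.4f}, df = {df}, p-value = {p_value:.5f}")
```

The same script works for any set of categories as long as the hypothesized proportions sum to 1 and every expected frequency is at least 5.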

7 / 35

Example 2

M&M/MARS' manufacturing plants have different color mixes, and these change over time. The 1997 color mix is shown in parentheses below. How well does the actual distribution of colors in a sample of 506 M&Ms match that prescribed by the company?

Colors \(f_{i}\) \(e_{i}\) \((f_{i}-e_{i})\) \((f_{i}-e_{i})^{2}\) \((f_{i}-e_{i})^{2}/{e_{i}}\)
Blue (10%) 38 50.6 -12.6 158.76 3.1375
Brown (30%) 177 151.8 25.2 635.04 4.1834
Green (10%) 36 50.6 -14.6 213.16 4.2126
Orange (10%) 41 50.6 -9.6 92.16 1.8213
Red (20%) 79 101.2 -22.2 492.84 4.8700
Yellow (20%) 135 101.2 33.8 1142.44 11.2889
Total 506 29.5138

\(\chi^{2}_{df=5} = 29.5138\), and the \(p-value < 0.005\); Reject \(H_{0}\); Data do not reflect 1997 color percentages
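It can also help to see which colors drive the result. The sketch below, in plain Python, recomputes the per-color contributions from Example 2 and compares the total with the tabled critical value for \(df=5\) at \(\alpha=0.005\) (16.750, taken from a standard \(\chi^2\) table rather than from the slides). Variable names are illustrative.

```python
# 1997 prescribed color mix and observed counts from Example 2
prescribed = {"Blue": 0.10, "Brown": 0.30, "Green": 0.10,
              "Orange": 0.10, "Red": 0.20, "Yellow": 0.20}
observed = {"Blue": 38, "Brown": 177, "Green": 36,
            "Orange": 41, "Red": 79, "Yellow": 135}

n = sum(observed.values())

# Contribution of each color: (f - e)^2 / e, with e = prescribed share * n
contrib = {}
for color, f in observed.items():
    e = prescribed[color] * n
    contrib[color] = (f - e) ** 2 / e

chi2 = sum(contrib.values())
largest = max(contrib, key=contrib.get)

# Critical value for df = 5, alpha = 0.005 (standard chi-squared table)
CRITICAL_005_DF5 = 16.750

for color, value in sorted(contrib.items(), key=lambda kv: -kv[1]):
    print(f"{color:<7} {value:8.4f}")
print(f"chi2 = {chi2:.4f}; reject H0 at alpha = 0.005: {chi2 >= CRITICAL_005_DF5}")
```

Yellow accounts for the largest share of the statistic, which matches the table above.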

8 / 35

The Chi-Squared test of Independence/Association

9 / 35

\(\chi^{2}\) tests can also be used to test independence of two variables

For example, look at the following contingency table/crosstabulation

Gender Light Regular Dark Total
Male 20 40 20 80
Female 30 30 10 70
Total 50 70 30 150

Research Question: Are coffee preferences independent of gender (i.e., is there any association between coffee preferences and gender)?

\(H_{0}:\) Coffee preference is independent of gender
\(H_{1}:\) Coffee preference is not independent of gender

10 / 35

For each cell in the contingency table, calculate

$$e_{ij} = \dfrac{\text{Row } i \text{ Total} \times \text{Column } j \text{ Total}}{\text{Sample Size}}$$

\(e_{11}=\dfrac{(80)(50)}{150}=\dfrac{4000}{150}=26.67\)

\(e_{12}=\dfrac{(80)(70)}{150}=\dfrac{5600}{150}=37.33\)

\(e_{13}=\dfrac{(80)(30)}{150}=\dfrac{2400}{150}=16.00\)

\(e_{21}=\dfrac{(70)(50)}{150}=\dfrac{3500}{150}=23.33\)

\(e_{22}=\dfrac{(70)(70)}{150}=\dfrac{4900}{150}=32.67\)

\(e_{23}=\dfrac{(70)(30)}{150}=\dfrac{2100}{150}=14.00\)
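The six expected frequencies above all follow from the row and column totals, so they can be generated in one pass. A minimal Python sketch, with the coffee table typed in as a nested list (the variable names are illustrative):

```python
# Observed counts: rows = Male, Female; columns = Light, Regular, Dark
observed = [
    [20, 40, 20],
    [30, 30, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# e_ij = (row i total * column j total) / sample size
expected = [[r * c / n for c in col_totals] for r in row_totals]

for label, row in zip(["Male", "Female"], expected):
    print(label, [round(e, 2) for e in row])
```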

11 / 35

Calculate, for each cell in the contingency table, \(\dfrac{(f_{ij}-e_{ij})^{2}}{e_{ij}}\)

Add the resulting value over all cells

This yields $$\chi^{2} = \sum_{i} \sum_{j} \dfrac{(f_{ij} - e_{ij})^{2}}{e_{ij}}$$

\(\chi^{2} \sim \chi^{2}_{(r-1)(c-1)}\), i.e., \(df=(r-1)(c-1)\), where \(r=\) number of rows and \(c=\) number of columns

Why is \(df = (r-1)(c-1)\)?

12 / 35

Degrees of freedom: \(df = (r-1)(c-1)\)

Gender Light Regular Dark Total
Male 20 40 ? 80
Female ? 30 10 70
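Total 50 70 30 150

With the row and column totals fixed, only \((2-1)(3-1)=2\) cells are free to vary: once two cells in a row are known (e.g., Male Light and Male Regular), every remaining cell, including those marked ?, follows from the totals.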
13 / 35

Completing the Coffee vs. Gender Example

Gender Coffee \(f_{ij}\) \(e_{ij}\) \((f_{ij}-e_{ij})\) \((f_{ij}-e_{ij})^{2}\) \((f_{ij}-e_{ij})^{2}/{e_{ij}}\)
Male Light 20 26.67 -6.67 44.49 1.67
Male Regular 40 37.33 2.67 7.13 0.19
Male Dark 20 16.00 4.00 16.00 1.00
Female Light 30 23.33 6.67 44.49 1.91
Female Regular 30 32.67 -2.67 7.13 0.22
Female Dark 10 14.00 -4.00 16.00 1.14
\(\chi^{2}\) 6.13

\(df=(r-1)(c-1)=(2-1)(3-1)=(1)(2)=2\)

\(p-value < 0.05\); Reject \(H_{0}\)

Coffee preferences and gender are not independent
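The whole coffee example can be reproduced in a few lines. The sketch below is plain Python and relies on one convenient fact: for \(df=2\) the upper-tail probability of the \(\chi^2\) distribution is exactly \(e^{-\chi^{2}/2}\). That shortcut is only valid for \(df=2\). Working from unrounded expected frequencies gives a statistic of about 6.12; the 6.13 in the table comes from summing cells that were already rounded.

```python
import math

# Rows = Male, Female; columns = Light, Regular, Dark
observed = [
    [20, 40, 20],
    [30, 30, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        e = r * c / n                        # expected frequency e_ij
        chi2 += (observed[i][j] - e) ** 2 / e

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Closed-form tail probability, valid because df == 2
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.4f}, df = {df}, p-value = {p_value:.4f}")
```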

14 / 35

Another Example

WA's Public Interest Research Group (PIRG) found in its recent study that 46% of full-time college students work 25 or more hours per week. A sample of 200 included 90 who worked 1-15 hours per week, 60 who worked 16-24 hours per week, and 50 who worked 25-34 hours per week. Students were also asked if their work had a positive, negative, or no effect on their grades. Use \(\alpha = 0.01\)

Hours Worked/Week Positive None Negative Total
1-15 hours 26.00 50.00 14.00 90.00
16-24 hours 16.00 27.00 17.00 60.00
25-34 hours 11.00 19.00 20.00 50.00
Total 53.00 96.00 51.00 200.00
15 / 35

Calculated Expected Frequencies

Hours Worked/Week Positive None Negative Total
1-15 hours 23.85 43.20 22.95 90.00
16-24 hours 15.90 28.80 15.30 60.00
25-34 hours 13.25 24.00 12.75 50.00
Total 53.00 96.00 51.00 200.00

Contributions to \(\chi^{2}\), \((f_{ij}-e_{ij})^{2}/e_{ij}\)

Hours Worked/Week Positive None Negative Total
1-15 hours 0.19 1.07 3.49 4.75
16-24 hours 0.00 0.11 0.19 0.30
25-34 hours 0.38 1.04 4.12 5.55
\(\chi^{2}_{df=4}\) 10.60

What is your decision? To Reject or not to Reject?
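One way to check the decision is to compute the statistic and its \(p\)-value directly. The following is a sketch in plain Python; `chi2_independence` and `chi2_sf_even` are hypothetical helper names, and the tail formula used is the closed form that holds only for even degrees of freedom.

```python
import math

def chi2_independence(table):
    """Return (chi2, df) for a table of observed counts given as a list of rows."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    chi2 = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            e = r * c / n
            chi2 += (table[i][j] - e) ** 2 / e
    return chi2, (len(rows) - 1) * (len(cols) - 1)

def chi2_sf_even(x, df):
    """Upper-tail probability for EVEN df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!"""
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Rows = 1-15, 16-24, 25-34 hours; columns = Positive, None, Negative
hours_table = [
    [26, 50, 14],
    [16, 27, 17],
    [11, 19, 20],
]

chi2, df = chi2_independence(hours_table)
p_value = chi2_sf_even(chi2, df)
alpha = 0.01
reject = p_value <= alpha

print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With \(\alpha = 0.01\) the \(p\)-value of roughly 0.03 is too large to reject \(H_0\); at \(\alpha = 0.05\) the decision would flip.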

16 / 35

Do More Working Hours Mean Poorer Grades?

How would you answer this question? Column vs Row Percentages

Hours/Week Positive None Negative Total
1-15 hours 28.89% 55.56% 15.56% 100.00%
16-24 hours 26.67% 45.00% 28.33% 100.00%
25-34 hours 22.00% 38.00% 40.00% 100.00%

Hours/Week Positive None Negative
1-15 hours 49.06% 52.08% 27.45%
16-24 hours 30.19% 28.13% 33.33%
25-34 hours 20.75% 19.79% 39.22%
Total 100% 100% 100%

As a student's hours worked per week increase, the share of students reporting a negative effect on grades rises (15.56% for 1-15 hours, 28.33% for 16-24 hours, 40.00% for 25-34 hours)
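Row and column percentages answer different questions, and both are easy to generate from the counts. A minimal Python sketch (variable names are illustrative): row percentages divide each cell by its row total, column percentages divide by the column total.

```python
# Rows = 1-15, 16-24, 25-34 hours; columns = Positive, None, Negative
counts = [
    [26, 50, 14],
    [16, 27, 17],
    [11, 19, 20],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]

# Row percentages: distribution of reported effects within each hours group
row_pct = [[100 * cell / row_totals[i] for cell in row]
           for i, row in enumerate(counts)]

# Column percentages: distribution of hours groups within each reported effect
col_pct = [[100 * cell / col_totals[j] for j, cell in enumerate(row)]
           for row in counts]

for label, row in zip(["1-15", "16-24", "25-34"], row_pct):
    print(label, [f"{v:.2f}%" for v in row])
```

Since hours worked is the explanatory variable here, the row percentages are the ones that speak to the question in the slide title.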

17 / 35

A Cautionary Tale

Table (a)

Hours worked/week Positive None Negative Total
1-15 hours 26 50 14 90
16-24 hours 16 27 17 60
25-34 hours 11 19 20 50
Total 53 96 51 200

Table (b)

Hours worked/week Positive None Negative Total
1-15 hours 260 500 140 900
16-24 hours 160 270 170 600
25-34 hours 110 190 200 500
Total 530 960 510 2000
18 / 35
Observed frequencies, Table (b)

Hours worked/week Positive None Negative Total
1-15 hours 260 500 140 900
16-24 hours 160 270 170 600
25-34 hours 110 190 200 500
Total 530 960 510 2000

Expected frequencies

Hours worked/week Positive None Negative Total
1-15 hours 238.50 432.00 229.50 900
16-24 hours 159.00 288.00 153.00 600
25-34 hours 132.50 240.00 127.50 500
Total 530 960 510 2000

Contributions to \(\chi^{2}\)

Hours worked/week Positive None Negative Total
1-15 hours 1.94 10.70 34.90 47.54
16-24 hours 0.01 1.13 1.89 3.02
25-34 hours 3.82 10.42 41.23 55.46
\(\chi^{2}_{df=4}\) 106.03

Tables (a) and (b) have identical row percentages, yet multiplying every count by 10 multiplies \(\chi^{2}\) by 10 (roughly 10.6 versus 106.03). Large samples will typically yield statistically significant results, so one also needs to focus on substantive significance: how large an effect does the independent variable have? See here for a beautiful piece on this issue. This is most important for policy analysis and program evaluation in our fields.
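The scaling effect is easy to demonstrate. The sketch below, in plain Python, computes \(\chi^2\) for Table (a) and for the same table with every count multiplied by 10. It also reports Cramér's V, \(\sqrt{\chi^{2} / (n \cdot \min(r-1, c-1))}\), as one possible measure of effect size. Cramér's V is not mentioned in the slides and is included here only as an illustration of a quantity that does not change with sample size.

```python
import math

def chi2_stat(table):
    """Pearson chi-squared statistic for a table of observed counts (list of rows)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return sum((table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
               for i in range(len(rows)) for j in range(len(cols)))

def cramers_v(table):
    """Cramer's V: an effect-size measure (an addition, not from the slides)."""
    n = sum(sum(r) for r in table)
    k = min(len(table) - 1, len(table[0]) - 1)
    return math.sqrt(chi2_stat(table) / (n * k))

table_a = [[26, 50, 14], [16, 27, 17], [11, 19, 20]]
table_b = [[10 * cell for cell in row] for row in table_a]

chi2_a, chi2_b = chi2_stat(table_a), chi2_stat(table_b)
v_a, v_b = cramers_v(table_a), cramers_v(table_b)

print(f"Table (a): chi2 = {chi2_a:.2f}, V = {v_a:.3f}")
print(f"Table (b): chi2 = {chi2_b:.2f}, V = {v_b:.3f}")
```

The test statistic grows tenfold while the effect-size measure stays put, which is the point of the cautionary tale.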

19 / 35

Distance to a Hospital and Visit Frequency

Using the data given below, test for an association between the proximity of residence to the hospital and the frequency of visits to the hospital's ER unit.

Frequency of Visits Close Medium Far Total
Low 1000 1030 1050 3080
Medium 525 520 515 1560
High 475 450 435 1360
Total 2000 2000 2000 6000
20 / 35

Calculating the expected frequencies ...

Frequency of Visits Close Medium Far
Low (3080 x 2000)/6000 = 1026.67 (3080 x 2000)/6000 = 1026.67 (3080 x 2000)/6000 = 1026.67
Medium (1560 x 2000)/6000 = 520.00 (1560 x 2000)/6000 = 520.00 (1560 x 2000)/6000 = 520.00
High (1360 x 2000)/6000 = 453.33 (1360 x 2000)/6000 = 453.33 (1360 x 2000)/6000 = 453.33

Use this online \(\chi^2\) calculator
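As an alternative to an online calculator, the test can be run with a few lines of Python. This is a sketch under two assumptions that are not stated on the slide: a significance level of \(\alpha = 0.05\), and the corresponding tabled critical value of 9.488 for \(df = 4\).

```python
# Rows = Low, Medium, High visit frequency; columns = Close, Medium, Far
visits = [
    [1000, 1030, 1050],
    [525, 520, 515],
    [475, 450, 435],
]

row_totals = [sum(row) for row in visits]
col_totals = [sum(col) for col in zip(*visits)]
n = sum(row_totals)

chi2 = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        e = r * c / n                      # expected frequency for cell (i, j)
        chi2 += (visits[i][j] - e) ** 2 / e

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Assumed alpha = 0.05; critical value for df = 4 from a standard table
CRITICAL_05_DF4 = 9.488
reject = chi2 >= CRITICAL_05_DF4

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0 at alpha = 0.05: {reject}")
```

Even with 6000 observations the statistic falls well short of the critical value, so these data give little evidence of an association between proximity and visit frequency.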

21 / 35

Fisher's Exact Test

The \(\chi^2\) test assumes that

(1) At least 80% of the cells in the table have expected frequencies \(\geq 5\), and

(2) No cell in the table has an expected frequency \(< 1\)

If these assumptions are violated, you can try to collapse some categories. For example, if the categories are 0, 1-2, 3-4, 5-6, and 7 or more, and the 7 or more category has an expected frequency \(< 1\), you can collapse it into the preceding category to generate a new category called 5 or more. This collapsing has to be defensible.

You can also collapse Strongly Agree and Agree Somewhat into Agree, and Strongly Disagree and Disagree Somewhat into Disagree, to generate a three-point classification of 'Disagree', 'Neither Disagree nor Agree', 'Agree', and so on. Again, the collapsing has to be defensible.

Otherwise you can rely on Fisher's Exact Test, provided you have small samples and/or a powerful computer
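The two expected-frequency rules can be checked mechanically before running a \(\chi^2\) test. Below is a minimal Python sketch; `check_chi2_assumptions` is a hypothetical helper name, and the counts are those of the pre-operative physical therapy table used in this deck's Fisher's Exact Test example.

```python
def expected_counts(table):
    """Expected frequencies under independence for a table given as a list of rows."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]

def check_chi2_assumptions(table):
    """Return (share of cells with e >= 5, smallest e, whether both rules hold)."""
    cells = [e for row in expected_counts(table) for e in row]
    share_ge_5 = sum(e >= 5 for e in cells) / len(cells)
    smallest = min(cells)
    ok = share_ge_5 >= 0.80 and smallest >= 1
    return share_ge_5, smallest, ok

# Rows = did / did not do pre-operative PT; columns = improves / does not improve
pt_table = [[15, 6], [7, 322]]

share, smallest, ok = check_chi2_assumptions(pt_table)
print(f"cells with e >= 5: {share:.0%}, smallest e: {smallest:.2f}, rules satisfied: {ok}")
```

Here only three of the four cells (75%) have an expected frequency of at least 5, so rule (1) fails even though no expected frequency is below 1.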

22 / 35

How does Fisher's Exact Test Work?

Therapy Patient Improves Patient does not improve Total
Did pre-operative PT 15 6 21
Did not do pre-operative PT 7 322 329
Total 22 328 350

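Fisher's Exact Test involves calculating the probability of ending up with the observed frequencies as recorded. It is computationally intensive because it involves enumerating, under the assumption that \(H_0\) is true, all possible \(2\times2\) tables that would yield the same row and column totals.

With cells \(a, b\) in the first row, \(c, d\) in the second row, and \(n = a+b+c+d\), the probability of any one such table is \(p = \dfrac{(a+b)!(c+d)!(a+c)!(b+d)!}{n!a!b!c!d!}\). The \(p-value\) is the sum of these probabilities over all tables that are as probable as, or less probable than, the observed table.

In this example the ensuing \(p-value < 2.2e-16\); so we reject \(H_0\). The patient improving is not independent of whether or not the patient was given pre-operative physical therapy.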
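For a \(2\times2\) table the enumeration is small enough to do directly. The following Python sketch uses `math.comb` from the standard library and rewrites the factorial formula in its equivalent binomial-coefficient form. `fisher_exact_two_sided` is a hypothetical function name, and the two-sided rule used (sum every table no more probable than the observed one) is one common convention, assumed here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Return (probability of the observed table, two-sided p-value) for a 2x2 table."""
    r1, r2, c1 = a + b, c + d, a + c          # fixed row totals and first column total
    total = comb(r1 + r2, c1)

    def prob(x):
        # Probability that the top-left cell equals x given the fixed margins
        return comb(r1, x) * comb(r2, c1 - x) / total

    lo, hi = max(0, c1 - r2), min(r1, c1)     # feasible range of the top-left cell
    p_obs = prob(a)
    # Small relative tolerance so ties with the observed table are counted
    p_value = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return p_obs, p_value

# Did PT: 15 improved, 6 did not; no PT: 7 improved, 322 did not
p_obs, p_value = fisher_exact_two_sided(15, 6, 7, 322)
print(f"P(observed table) = {p_obs:.3e}, two-sided p-value = {p_value:.3e}")
```

The result is on the order of \(10^{-16}\), which is why statistical software typically reports it only as being below 2.2e-16.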

See here for a wonderful example

The online calculator for Fisher's exact test can be found here

23 / 35

Hypothesis tests with proportions

24 / 35

One-Group Hypothesis Tests with \(z\)

  • Lower Tail Test \(H_{0}: p \geq p_{0}; H_{1}: p < p_{0}\)
  • Upper Tail Test \(H_{0}: p \leq p_{0}; H_{1}: p > p_{0}\)
  • Two Tailed Test \(H_{0}: p = p_{0}; H_{1}: p \neq p_{0}\)

Standard deviation under \(H_{0}\): \(s = \sqrt{p_0 \times \left( 1 - p_0 \right)}\) and Standard Error of \(\bar{p}= s_{\bar{p}} = \dfrac{s}{\sqrt{n}}\)

Test Statistic is \(z=\dfrac{\bar{p}-p_{0}}{s_{\bar{p}}}\) and \(df = n-1\)

Confidence Intervals calculated as: \(\bar{p} \pm z_{\alpha/2} (\bar{s}_{\bar{p}})\) and adjusted with a continuity correction of \(\pm \dfrac{0.5}{n}\), with \(\bar{s}_{\bar{p}} = \dfrac{\sqrt{ \bar{p} \times (1 - \bar{p}) }}{\sqrt{n}}\)

The sample size needed is calculated as before, except that the suspected standard deviation is typically set to \(0.5\) because

  • This yields the largest \(s\) ... \(\sqrt{0.5 \times (1 - 0.5)} = 0.5\) while \(\sqrt{0.1 \times (1 - 0.1)} = 0.3\)

  • Assuming a 50:50 split in the proportion is a good start unless we can assume otherwise
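To see why the 50:50 assumption is the conservative choice, it helps to compute the required sample size for two different guesses. The sketch below assumes the usual formula \(n = z_{\alpha/2}^{2} \, p(1-p) / E^{2}\), which is not written out in these slides, and a hypothetical margin of error \(E = 0.03\) with 95% confidence.

```python
from math import ceil
from statistics import NormalDist

def sample_size(p_guess, margin, confidence=0.95):
    """Smallest n so that the CI half-width is at most `margin` (assumed formula)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)

margin = 0.03                           # hypothetical margin of error
n_half = sample_size(0.5, margin)       # conservative 50:50 assumption
n_tenth = sample_size(0.1, margin)      # if we had reason to expect p near 0.1

print(f"p = 0.5 -> n = {n_half}; p = 0.1 -> n = {n_tenth}")
```

Because \(p(1-p)\) peaks at \(p = 0.5\), a sample sized under that assumption is large enough whatever the true proportion turns out to be.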

25 / 35

An Example

A Consumer Reports study done in 2010 finds that 64% of shoppers think generics are as good as national brands. In January of 2019 Heinz asks this question of 100 shoppers and finds that 52% say generics are as good as national brands. Have consumer preferences changed?

Given \(p_{0}=0.64; n=100; \bar{p}=0.52\); \(H_{0}: p = 0.64; H_{1}: p \neq 0.64\)

$$\sigma_{\bar{p}}={\sqrt{\dfrac{p_{0}(1-p_{0})}{n}}}={\sqrt{\dfrac{0.64(0.36)}{100}}}=0.048$$

$$z=\dfrac{\bar{p}-p_{0}}{\sigma_{\bar{p}}}=\dfrac{0.52-0.64}{0.048}=\dfrac{-0.12}{0.048}=-2.50$$

\(p-value\) is thus 0.0124 and with \(\alpha = 0.05\), we can reject the null hypothesis; consumer preferences appear to have changed
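The calculation above can be reproduced with Python's standard library. This sketch uses `statistics.NormalDist` for the normal tail areas and also reports a 95% confidence interval based on the sample proportion. The interval omits the continuity correction, which is a simplification made here.

```python
from math import sqrt
from statistics import NormalDist

p0, n, p_hat = 0.64, 100, 0.52
std_normal = NormalDist()

# Standard error under H0 and the test statistic
se_null = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null

# Two-tailed p-value: area in both tails beyond |z|
p_value = 2 * std_normal.cdf(-abs(z))

# 95% confidence interval uses the sample-based standard error
se_sample = sqrt(p_hat * (1 - p_hat) / n)
z_crit = std_normal.inv_cdf(0.975)
ci = (p_hat - z_crit * se_sample, p_hat + z_crit * se_sample)

print(f"z = {z:.2f}, p-value = {p_value:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The interval lies entirely below 0.64, which agrees with the decision to reject \(H_0\).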

Testing via the online calculator

26 / 35

Another Example

The Census Bureau found in 1990 that 24% of those who moved residences did so to be closer to work. In 2010, 90 out of a random sample of 300 movers gave the same reason. Are more people moving to be closer to work in 2010 than did so in 1990?

Given \(p_{0}=0.24; n=300; \bar{p}=\dfrac{90}{300}=0.30\); \(H_{0}: p \leq 0.24; H_{1}: p > 0.24\)

$$\sigma_{\bar{p}}={\sqrt{\dfrac{p_{0}(1-p_{0})}{n}}}={\sqrt{\dfrac{0.24(0.76)}{300}}}=0.0247$$

$$z=\dfrac{\bar{p}-p_{0}}{\sigma_{\bar{p}}}=\dfrac{0.30-0.24}{0.0247}=\dfrac{0.06}{0.0247} \approx 2.43$$

\(p-value\) is approximately 0.0075 and hence with \(\alpha = 0.05\) we can easily reject the null hypothesis, concluding that compared with the 1990 Census, by 2010 more people were relocating to be closer to work

27 / 35

Yet another problem

After a massive inventory the Athens Public Library finds 12% of its books missing. They institute anti-theft measures and after a year, draw a sample of 200 books to see how many are missing and find they cannot locate 14 books. Have the new measures reduced theft?

\(H_0: p \geq 0.12\); \(H_1: p < 0.12\)
\(\bar{p} = \dfrac{14}{200} = 0.07\)
\(s = \sqrt{p_0 \times (1 - p_0)} = \sqrt{0.12 \times (1 - 0.12)} = \sqrt{0.12 \times 0.88} = 0.3250\)
\(s_{\bar{p}} = \dfrac{s}{\sqrt{n}} = \dfrac{0.3250}{\sqrt{200}} = 0.02298\)
\(z = \dfrac{\bar{p} - p_0}{s_{\bar{p}}} = \dfrac{0.07 - 0.12}{0.02298} = \dfrac{-0.05}{0.02298} = -2.176\)
\(p-value = 0.015\) and so with \(\alpha = 0.05\) we reject \(H_0\); the data suggest that the measures have reduced thefts

What if you had used \(\alpha = 0.01\)? Would your conclusion have changed?
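The question about \(\alpha\) can be explored by computing the one-tailed \(p\)-value once and comparing it with each significance level. A sketch in Python, using the normal distribution from the standard library; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, n, missing = 0.12, 200, 14
p_hat = missing / n

# Standard error under H0 and the test statistic
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Lower-tail test: p-value is the area to the left of z
p_value = NormalDist().cdf(z)

# Decision at each significance level
decisions = {alpha: p_value <= alpha for alpha in (0.05, 0.01)}

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject' if reject else 'fail to reject'} H0")
```

With \(z\) of about \(-2.18\) and a \(p\)-value of about 0.015, the evidence is strong enough at \(\alpha = 0.05\) but not at \(\alpha = 0.01\).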

28 / 35

Two proportions

29 / 35

Two-Group Hypothesis Tests with z

Two groups so two proportions ... \(p_{1}; p_{2}\)

With \(n_{1}\) and \(n_{2}\), we have two sample proportions \(\bar{p_{1}}\) and \(\bar{p_{2}}\)

Point Estimate of the difference between the two groups is thus \(\bar{p_{1}} - \bar{p_{2}}\)

Standard deviation for each group is \(s_1 =\sqrt{\bar{p_1} \left(1-\bar{p_1} \right)}\) and \(s_2 =\sqrt{\bar{p_2} \left(1-\bar{p_2} \right)}\)

Standard error for each group is \(s_{\bar{p_1}} = \dfrac{s_1}{\sqrt{n_1}}\) and \(s_{\bar{p_2}} = \dfrac{s_2}{\sqrt{n_2}}\)

The overall standard error for both groups is \(s_{\bar{p_1} - \bar{p_2} } = \sqrt{\left( s_{\bar{p_1}} \right)^2 + \left(s_{\bar{p_2}} \right)^2 }\)

The test statistic is approximately standard normal, \(N(0, 1)\), if \(n_{1}p_{1}, n_{1}(1-p_{1}), n_{2}p_{2}, n_2(1-p_{2})\) are all \(\geq 5\)

30 / 35

Degrees of freedom now are \(df= n_1 + n_2 - 2\)

Interval Estimate is given by \(\bar{p_{1}} - \bar{p_{2}} \pm z_{\alpha/2} \left( s_{\bar{p_{1}} - \bar{p_{2}}} \right)\)

Hypothesis tests about \(p_{1}-p_{2} \cdots\)

  • \(H_{0}:p_{1}-p_{2} \geq 0; H_{1}:p_{1}-p_{2} < 0\)
  • \(H_{0}:p_{1}-p_{2} \leq 0; H_{1}:p_{1}-p_{2} > 0\)
  • \(H_{0}:p_{1}-p_{2}=0; H_{1}:p_{1}-p_{2} \neq 0\)

Assuming \(H_{0}\) is true is equivalent to saying \(p_{1}=p_{2}=p\)
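If \(p_{1}=p_{2}=p\) under \(H_0\), one common approach is to estimate that shared \(p\) by pooling both samples and to build the standard error from it. The worked example in these slides uses the unpooled standard error instead, so the sketch below is an alternative shown for comparison, not the deck's method. The counts are those of the message recall example in this deck.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 63, 150     # Commercial A: number recalling the message, sample size
x2, n2 = 60, 200     # Commercial B

p1_hat, p2_hat = x1 / n1, x2 / n2

# Pooled estimate of the common proportion under H0: p1 = p2 = p
p_pooled = (x1 + x2) / (n1 + n2)

# Standard error of the difference using the pooled proportion
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_pooled = (p1_hat - p2_hat) / se_pooled

# Two-tailed p-value from the standard normal distribution
p_value = 2 * NormalDist().cdf(-abs(z_pooled))

print(f"pooled p = {p_pooled:.4f}, z = {z_pooled:.4f}, p-value = {p_value:.4f}")
```

The pooled and unpooled statistics are usually close when the two sample proportions are not far apart, as is the case here.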

Testing via the online calculator

31 / 35

An Example with message recall

In a test of two anti-tobacco television commercials, random samples of television viewers were asked to recall the primary message in each. Let \(1=\) Commercial A and \(2 =\) Commercial B. We are given \(n_1=150\) and \(n_2=200\), and the numbers recalling the primary message were 63 and 60, respectively. Test the hypothesis of no difference in recall.

$$H_{0}:p_{1}-p_{2}=0; H_{1}:p_{1}-p_{2} \neq 0$$

$$\bar{p_{1}}=\dfrac{63}{150}=0.42; \bar{p_{2}}=\dfrac{60}{200}=0.30; df=n_1 + n_2 - 2 = 150+200-2=348$$

$$s_1 =\sqrt{\bar{p_1} \left(1-\bar{p_1} \right)} = \sqrt{ 0.42 \times (1 - 0.42) } = 0.4935$$

$$s_2 =\sqrt{\bar{p_2} \left(1-\bar{p_2} \right)} = \sqrt{ 0.30 \times (1 - 0.30) } = 0.4582$$

$$s_{\bar{p_1}} = \dfrac{s_1}{\sqrt{n_1}} = \dfrac{0.4935}{\sqrt{150}} = 0.0402; s_{\bar{p_2}} = \dfrac{s_2}{\sqrt{n_2}} = \dfrac{0.4582}{\sqrt{200}} = 0.0324$$

$$s_{\bar{p_1} - \bar{p_2} } = \sqrt{ \left( s_{\bar{p_1}} \right)^2 + \left(s_{\bar{p_2}} \right)^2 } = \sqrt{(0.0402)^2 + (0.0324)^2} = 0.0517$$

32 / 35

$$z=\dfrac{\bar{p_{1}} - \bar{p_{2}}}{s_{\bar{p_1} - \bar{p_2} }} =\dfrac{0.42-0.30}{0.0517} = \dfrac{0.12}{0.0517} = 2.3206$$

\(p-value = 0.0209\), so with \(\alpha = 0.05\) we reject \(H_{0}\); recall rates seem to differ across commercials

95% CI is \(\bar{p_{1}} - \bar{p_{2}} \pm z_{\alpha/2; df} ( s_{\bar{p_1} - \bar{p_2} }) = 0.12 \pm 1.967 (0.0517) = (0.0183; 0.2216) \cdots\) does not include \(H_0\) value of 0

What if we used \(\alpha=0.01\)? Would the conclusion change?
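The message recall example can be reproduced as follows. This Python sketch uses the unpooled standard error from the slides, with the standard normal distribution for the \(p\)-value and critical value. Its \(p\)-value (about 0.0203) and interval therefore differ slightly from the slide's 0.0209 and 1.967, which are consistent with a \(t\) reference distribution on 348 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 63, 150     # Commercial A
x2, n2 = 60, 200     # Commercial B
p1_hat, p2_hat = x1 / n1, x2 / n2
diff = p1_hat - p2_hat

# Unpooled standard error of the difference, as on the slides
se_diff = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z = diff / se_diff

std_normal = NormalDist()
p_value = 2 * std_normal.cdf(-abs(z))          # two-tailed, normal reference

# 95% confidence interval for p1 - p2 using the normal critical value
z_crit = std_normal.inv_cdf(0.975)
ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: reject H0 = {p_value <= alpha}")
print(f"z = {z:.4f}, p-value = {p_value:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Under either reference distribution the \(p\)-value sits between 0.01 and 0.05, so the conclusion changes when \(\alpha\) is lowered to 0.01.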

33 / 35

An Example with Helmet Laws

The Wisconsin legislature is considering a mandatory motorcycle helmet law. What legislators don't know is whether the law would encourage more people to use helmets. A Senator tells you that Minnesota has a similar law in use and so you conduct a random survey of registered motorcycle riders in each state. The results are given below:

Measure Minnesota Wisconsin
Sample Size 75 110
Number using helmets 37 28

  • Set up the correct hypotheses

  • Using \(\alpha=0.01\), carry out the test

  • State the conclusion of your hypothesis test

  • What if the numbers using helmets were 37 and 50, respectively?

34 / 35

Another Example: Racial Discrimination

The City Attorney for Columbus (OH) is gathering data for a racial discrimination lawsuit. When she asks 500 Latino residents of the city if they feel the city is racially biased, 354 reply in the affirmative. When she asks 300 non-Latino White residents the same question, 104 respond in the affirmative. Do these data suggest that Latinos perceive racial bias differently than do non-Latino Whites?

  • Set up the correct hypotheses

  • Using \(\alpha=0.05\), carry out the test

  • State the conclusion of your hypothesis test

  • What if the numbers replying in the affirmative were 275 and 144, respectively? What would you conclude?

  • What if we wanted to test whether the data suggest that Latinos perceive racial bias more often than do non-Latino Whites? What would the hypotheses be? The conclusion?

35 / 35
