DRAWING INFERENCES ABOUT POPULATION MEANS AND PROPORTIONS
Introduction
Applied researchers have used the chi-square statistic for more than a hundred years. For example, a researcher might conduct a test to determine the effectiveness of a cholesterol treatment and want to know whether there is evidence that the treatment is effective. This paper addresses how the researcher can use chi-square results to draw valid conclusions. It proceeds in five parts: outlining the procedure for testing hypotheses, formulating the null and alternative hypotheses, determining and calculating the test statistic, calculating the p-value, and deciding whether to reject or fail to reject the null hypothesis.
Question
A researcher conducts a test on the effectiveness of a cholesterol treatment on a total of 114 subjects. Assuming the tails of the distributions are normally distributed, is there evidence that the treatment is effective?
Table 1
| | Cholesterol Decreased | No Cholesterol Decrease | Total |
|---|---|---|---|
| Treatment | 38 | 18 | 56 |
| No treatment | 30 | 28 | 58 |
| Total | 68 | 46 | 114 |
- Procedure for testing hypotheses.
This problem can be solved using the chi-square statistic. Chi-square is a non-parametric test designed to analyze group differences when the dependent variable is categorical (McHugh, 2013). Like other non-parametric tests, it is robust with respect to the distribution of the data.
Procedure
Step 1: State the null hypothesis
The null hypothesis is a conjecture used in data analysis to propose that there is no difference between groups or no association between variables. It states what we expect to observe if the treatment makes no difference. The null hypothesis is usually denoted by H0.
Step 2: State the alternate hypothesis
In this step, outline the alternate hypothesis. The alternate hypothesis is the opposite of the null hypothesis; in other words, it proposes that there is a difference. The alternate hypothesis is usually denoted by HA.
Step 3: Set the alpha (α)
The possible outcomes of a hypothesis test can be summarized in the following decision table:
Table 2
| Decision | Actual: H0 is TRUE | Actual: H0 is FALSE |
|---|---|---|
| Accept H0 | Correct | Type II Error (β is the probability of Type II Error) |
| Reject H0 | Type I Error (α is the probability of Type I Error) | Correct |
It is important to set the alpha before the experiment because it fixes the probability of a Type I Error that the researcher is willing to tolerate (Pereira and Leslie, 2009). In most cases, the value is 0.05, which corresponds to a 95% confidence level.
Step 4: Collecting the Dataset
The data can be collected through observational or experimental designs. In this paper, the data were collected through an experiment.
Step 5: Choose and calculate the test statistic
The test statistic is chosen by identifying the objective of the analysis and the type of data involved. For example, an F-statistic is calculated when comparing treatment level means, and the computed F value is usually denoted Fcalculated. For categorical count data, as in this paper, the chi-square statistic is used instead.
Step 6: Identify the acceptance and rejection region
Most test statistics have a critical value that determines whether the null hypothesis is rejected. For the F-statistic, this value is usually obtained from tables and is referred to as Fcritical. The figure below shows the acceptance/rejection region, the F distribution, and Fcritical ("1.2 - The 7 Step Process of Statistical Hypothesis Testing | STAT 502", 2020).
Figure 1
Step 7: Conclude on the Null hypothesis
The p-value is the probability of obtaining an Fcalculated as large as or larger than the one observed if the null hypothesis is true. If Fcalculated = Fcritical, then the p-value equals alpha (α). If Fcalculated is larger than Fcritical, it falls in the rejection region and the p-value is less than alpha (α). Therefore, the following decision rule holds:
If the p-value is less than α, then we reject the null hypothesis (H0) in favor of the alternative hypothesis (HA). Otherwise, we fail to reject H0.
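The decision rule in Step 7 can be sketched as a small Python function. This is a minimal illustration rather than part of the cited procedure: the function name `decide` and the default `alpha` of 0.05 are assumptions chosen for this example.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the decision on H0 for a given p-value and significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    # Reject only when the p-value is strictly below alpha.
    return "reject H0" if p_value < alpha else "fail to reject H0"


if __name__ == "__main__":
    print(decide(0.03))    # a p-value below alpha
    print(decide(0.0792))  # the p-value reported later in this paper
```

Note that the second outcome is worded as "fail to reject" rather than "accept": a large p-value means the data are compatible with H0, not that H0 has been shown to be true.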
- Formulating the null and alternative hypotheses.
H0: Cholesterol levels and cholesterol treatment are independent.
HA: Cholesterol levels and cholesterol treatment are not independent.
- Calculating the Test statistic
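Degrees of freedom = $(r - 1)(c - 1) = (2 - 1)(2 - 1) = 1$, where $r$ and $c$ are the numbers of rows and columns in Table 1 (excluding totals).
The expected frequency for each cell is given by $E_{ij} = \frac{R_i \times C_j}{N}$, where $R_i$ is the total of row $i$, $C_j$ is the total of column $j$, and $N = 114$ is the total number of subjects:
$E_{11} = \frac{56 \times 68}{114} \approx 33.403$
$E_{12} = \frac{56 \times 46}{114} \approx 22.596$
$E_{21} = \frac{58 \times 68}{114} \approx 34.596$
$E_{22} = \frac{58 \times 46}{114} \approx 23.403$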
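As a check on the arithmetic, the expected frequencies can be recomputed from the row and column totals of Table 1. The sketch below is illustrative only; the helper names `expected_frequencies` and `degrees_of_freedom` are hypothetical, and it uses nothing beyond the Python standard library.

```python
# Observed counts from Table 1.
# Rows: treatment, no treatment. Columns: decreased, no decrease.
observed = [[38, 18],
            [30, 28]]


def expected_frequencies(table):
    """Expected count per cell under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a contingency table without totals."""
    return (len(table) - 1) * (len(table[0]) - 1)


expected = expected_frequencies(observed)

if __name__ == "__main__":
    for row in expected:
        print([round(value, 3) for value in row])
    print("df =", degrees_of_freedom(observed))
```

Small differences in the third decimal place against the figures quoted in this paper come from rounding versus truncation and do not affect the conclusion.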
Table 3
| | Cholesterol Decreased | No Cholesterol Decrease |
|---|---|---|
| Treatment | A | B |
| No treatment | C | D |
Table 4
| | A | B | C | D |
|---|---|---|---|---|
| Observed frequencies | 38 | 18 | 30 | 28 |
| Expected frequencies | 33.403 | 22.596 | 34.596 | 23.403 |
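The test statistic can be computed as follows, where $O$ and $E$ are the observed and expected frequencies of each cell:
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(38 - 33.403)^2}{33.403} + \frac{(18 - 22.596)^2}{22.596} + \frac{(30 - 34.596)^2}{34.596} + \frac{(28 - 23.403)^2}{23.403} \approx 3.081$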
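The Pearson chi-square statistic is the sum of (O − E)²/E over the four cells. A minimal sketch of that sum follows, assuming the observed counts of Table 1 and expected counts computed from its row and column totals; the function name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells of two equally shaped tables."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Observed counts (rows: treatment, no treatment).
observed = [[38, 18],
            [30, 28]]

# Expected counts under independence: row total * column total / 114.
expected = [[56 * 68 / 114, 56 * 46 / 114],
            [58 * 68 / 114, 58 * 46 / 114]]

statistic = chi_square_statistic(observed, expected)

if __name__ == "__main__":
    print(f"chi-square = {statistic:.3f}")
```

No continuity correction is applied here, so statistical packages that apply the Yates correction to 2 × 2 tables by default would report a smaller value.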
- Calculate the p-value. Show your work.
For the chi-square distribution, the critical value can be found in the table of probabilities.
Degrees of freedom = 1
Alpha = 0.05
The appropriate critical value is 3.841; therefore, the decision rule is as follows: Reject the null hypothesis if $\chi^2 \geq 3.841$.
Using the chi-square distribution table, the observed statistic $\chi^2 \approx 3.081$ with 1 degree of freedom gives a p-value of 0.0792.
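For 1 degree of freedom, the upper-tail probability can also be computed without tables: a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so P(χ² > x) = erfc(√(x/2)). The sketch below relies on that identity and on `math.erfc` from the Python standard library. It is valid only for 1 degree of freedom, and the function name `chi2_p_value_df1` is an assumption of this example.

```python
import math


def chi2_p_value_df1(x: float) -> float:
    """Upper-tail probability P(chi-square > x), valid for 1 degree of freedom only."""
    if x < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(x / 2.0))


ALPHA = 0.05
CRITICAL_VALUE = 3.841  # tabulated critical value for df = 1 and alpha = 0.05

# Statistic recomputed from the Table 1 counts (cells A, B, C, D).
observed = [38, 18, 30, 28]
expected = [56 * 68 / 114, 56 * 46 / 114, 58 * 68 / 114, 58 * 46 / 114]
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

p_value = chi2_p_value_df1(statistic)

if __name__ == "__main__":
    print(f"chi-square = {statistic:.3f}, p-value = {p_value:.4f}")
    print("reject H0" if p_value < ALPHA else "fail to reject H0")
```

Comparing the statistic with the critical value and comparing the p-value with alpha are two views of the same rule, so they always lead to the same decision.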
- Discuss whether there is enough evidence to reject the null hypothesis.
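In this analysis, we set our alpha to be 0.05. The p-value is greater than alpha (0.0792 > 0.05); hence, we do not reject the null hypothesis. Failing to reject the null hypothesis does not prove that cholesterol level and cholesterol treatment are independent. It means only that, at the 5% significance level, the data do not provide enough evidence that the treatment is effective.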
References
McHugh, M. L. (2013). The chi-square test of independence. Biochemia Medica, 23(2), 143-149.
Pereira, S. M., & Leslie, G. (2009). Hypothesis testing. Australian Critical Care, 22(4), 187-191.
1.2 - The 7 Step Process of Statistical Hypothesis Testing | STAT 502. (2020). Retrieved 12 June 2020, from https://online.stat.psu.edu/stat502/lesson/1/1.2