Basic Skills and Principles: Measurement
Inferential Statistics
The t-test: An Example
Let's assume that a researcher is attempting to determine the effects of pesticide pollution on the hatching success of fish eggs. He has noticed that fish eggs in streams near agricultural fields fare poorly, while those in undisturbed areas hatch successfully. The researcher sets up a lab experiment in which ten groups of fish eggs are allowed to develop in unmanipulated stream water, and ten groups of eggs develop in stream water with the addition of pesticide. The proportion of eggs hatching in each group is tallied (Table 1), and descriptive statistics for the two groups are compared.
| Statistic | No Pesticide | Pesticide |
| Mean proportion eggs hatching | 0.87 | 0.59 |
| Standard deviation | 0.07 | 0.09 |

| Group | Proportion eggs hatching: No Pesticide | Proportion eggs hatching: Pesticide |
| 1 | 0.80 | 0.50 |
| 2 | 0.76 | 0.45 |
| 3 | 0.81 | 0.68 |
| 4 | 0.90 | 0.77 |
| 5 | 0.95 | 0.64 |
| 6 | 0.84 | 0.60 |
| 7 | 0.88 | 0.54 |
| 8 | 0.99 | 0.57 |
| 9 | 0.86 | 0.62 |
| 10 | 0.93 | 0.57 |
Table 1. Effects of pesticide pollution on hatching success of fish eggs.

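The descriptive statistics can be recomputed from the raw values in Table 1. The following is a minimal Python sketch, assuming the sample standard deviation (n - 1 denominator) is the one intended; the variable and function names are illustrative and not part of the course materials.

```python
from statistics import mean, stdev

# Proportion of eggs hatching in each group, copied from Table 1.
no_pesticide = [0.80, 0.76, 0.81, 0.90, 0.95, 0.84, 0.88, 0.99, 0.86, 0.93]
pesticide = [0.50, 0.45, 0.68, 0.77, 0.64, 0.60, 0.54, 0.57, 0.62, 0.57]


def describe(values):
    """Return the mean and the sample standard deviation (n - 1 denominator)."""
    return mean(values), stdev(values)


for label, values in (("No Pesticide", no_pesticide), ("Pesticide", pesticide)):
    group_mean, group_sd = describe(values)
    print(f"{label}: mean = {group_mean:.2f}, SD = {group_sd:.2f}")
```

Rounded to two decimal places, the results agree with the descriptive statistics reported above.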
While the results indicate an adverse effect of pesticide exposure on egg hatching success, how can we be sure that the difference in hatching success is "significant", and didn't just occur by chance? After all, the two data sets do overlap: the lowest value in the no-pesticide groups (0.76) is below the highest value in the pesticide groups (0.77). To compare these two data sets, we must first state our hypothesis. Our null hypothesis is that the two groups are not different, and that both data sets belong to the same distribution.
H0: The mean proportion of eggs hatching in the two groups is not different
OR
H0: The two experimental groups are part of the same distribution
We then conduct a t-test on the data set, which examines the differences in mean and dispersion between the two groups and returns a p-value: the probability of observing a difference at least this large if the two groups really were part of the same distribution.
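For reference, one common form of the test, the unpaired (Student's) t-test with a pooled variance, can be written as below. The notation is introduced here for illustration: the x-bars are the group means, the s-squared terms are the sample variances, and the n terms are the group sizes. Whether a given calculator uses exactly this variant is an assumption.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}
         {\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

With the Table 1 data (ten groups each, means of about 0.872 and 0.594, sample variances of about 0.0052 and 0.0083), the pooled variance is about 0.0068, giving t of roughly 0.278 / 0.0368, or about 7.56, on 18 degrees of freedom.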
As the p-value returned by the test was less than 0.05, we can reject our null hypothesis and conclude that the two groups are indeed from different distributions, and that they are significantly different from one another. The researcher can therefore conclude that pesticide exposure reduces the hatching success of eggs of this species. All of the comparisons you will be making in laboratory exercises this semester will mimic this example, so you should take special care to ensure you understand the operation and usefulness of the t-test for comparing data sets.

| Comparison | t-statistic | p-value |
| t-test | 7.56 | p < 0.0005 |

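The t-statistic and p-value in the table above can be checked with a short script. This is a sketch under stated assumptions, not the calculator's actual implementation: it assumes an unpaired, pooled-variance t-test with a two-tailed p-value, and it approximates the p-value by numerically integrating the t distribution with Simpson's rule. All function names are hypothetical.

```python
import math
from statistics import mean, variance

# Proportion of eggs hatching in each group, copied from Table 1.
no_pesticide = [0.80, 0.76, 0.81, 0.90, 0.95, 0.84, 0.88, 0.99, 0.86, 0.93]
pesticide = [0.50, 0.45, 0.68, 0.77, 0.64, 0.60, 0.54, 0.57, 0.62, 0.57]


def pooled_t(a, b):
    """Unpaired t statistic with a pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value via Simpson's rule over [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # probability mass between -|t| and +|t|
    return max(0.0, 1 - central_area)


t_stat, df = pooled_t(no_pesticide, pesticide)
p_value = two_tailed_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.1e}")
```

The t statistic rounds to 7.56, and the computed p-value falls well below the 0.0005 bound reported in the table.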
Performing t-tests
To perform
t-tests on data sets, we suggest using an online t-test calculator
from Graphpad.com. It can be accessed at the address below, and
is exceptionally easy to use. WebStat, the program we are using
to calculate descriptive statistics and create graphs, also has
a t-test function, but we suggest you use the one below as it is
tailored to the needs and statistical sophistication of Science
1101 students.
http://www.graphpad.com/quickcalcs/index.cfm
(1.) Select the "Continuous data" option, then hit the "Continue" button.
(2.) Select the "t test to compare two means" option, then hit the "Continue" button.
(3.) Enter your data in the columns by group, select "Unpaired t-test", and then hit the "Calculate now" button. Your p-value and t statistic will be listed on the results page.
t-tests and Statistical Assumptions - A Word of Caution
Those of you familiar with the t-test have likely noticed that we are omitting a step in our use of the t-test: the testing of assumptions. For a given data set to be suitable for analysis with a t-test, it must meet two assumptions: (1.) the variances of the two groups being compared cannot differ significantly from one another, and (2.) the data must roughly fit a normal distribution. When statisticians and scientists conduct a t-test, they first verify these assumptions with statistical tests, and only proceed once the assumptions have been satisfied. If the variances in the two groups differ appreciably, the data can be mathematically "transformed" to bring the variances closer together. If the data are not normally distributed, they can be transformed for normality, or an alternative test that does not require a normal distribution can be used.
As these steps appreciably increase the statistical complexity of t-test analysis, we will not be testing data sets for assumptions in this course. You must therefore realize that the statistical rigor of your results may not be comparable to that in published scientific studies, and that we are consciously avoiding the use of assumption tests to simplify the statistical analyses conducted in the course.
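For readers curious about what checking the first assumption involves, the sketch below computes the ratio of the two sample variances for the Table 1 data. This is an informal illustration, not a procedure required in the course: a ratio close to 1 suggests similar variances, while a formal test would compare the ratio against an F distribution with 9 and 9 degrees of freedom. The function name is hypothetical.

```python
from statistics import variance

# Proportion of eggs hatching in each group, copied from Table 1.
no_pesticide = [0.80, 0.76, 0.81, 0.90, 0.95, 0.84, 0.88, 0.99, 0.86, 0.93]
pesticide = [0.50, 0.45, 0.68, 0.77, 0.64, 0.60, 0.54, 0.57, 0.62, 0.57]


def variance_ratio(a, b):
    """Ratio of the larger sample variance to the smaller one (always >= 1)."""
    var_a, var_b = variance(a), variance(b)
    return max(var_a, var_b) / min(var_a, var_b)


ratio = variance_ratio(no_pesticide, pesticide)
print(f"Variance ratio (larger / smaller) = {ratio:.2f}")
```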