Wilcoxon-Mann-Whitney Test with R
Data Description:
The dataset "alcohol.txt", located at "http://ramanujan.math.trinity.edu/ekwessi/misc/alcohol.txt", is from Eriksen, Bjornstad, and Gotestam (1986), who studied a social skills training program for alcoholics. Twenty-four "alcohol-dependent" male inpatients at an alcohol treatment center were randomly assigned to two groups. The control group patients were given a traditional treatment program, while the treatment group patients were given the traditional treatment plus a class in social skills training (SST). After being discharged from the program, each patient reported, at 2-week intervals, the quantity of alcohol consumed, the number of days prior to his first drink, the number of sober days, the days worked, the times admitted to an institution, and the nights slept at home. Reports were verified by other sources (wives or family members). One patient in the SST group, discovered to be an opiate addict, disappeared after discharge and submitted no reports.
We would like to know whether the patients in the SST group tend to have a lower alcohol intake than those in the control group. To do so, we will test the null hypothesis that there is no difference in alcohol intake between the two groups against the alternative that the SST group has a lower alcohol intake than the Control group.
Analysis:
We can perform a Wilcoxon-Mann-Whitney test in R with the command "wilcox.test".
>alcohol<-read.table(file="http://ramanujan.math.trinity.edu/ekwessi/misc/alcohol.txt",header=T) # Loading the data into the R workspace
>head(alcohol) # Observing the first 6 data points.
Control SST
1 1042 874
2 1617 389
3 1180 612
4 973 798
5 1552 1152
6 1251 893
>x=alcohol$Control # Renaming the variables.
>y=alcohol$SST
>wilcox.test(x,y,alternative="greater",exact=F,correct=F) # Performing the Wilcoxon-Mann-Whitney test (normal approximation, no continuity correction).
Wilcoxon rank sum test
data: x and y
W = 117, p-value = 0.0008481
alternative hypothesis: true location shift is greater than 0
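To see where the statistic W comes from, the sketch below recomputes it by hand on the six rows printed by head(alcohol) above. This is only an illustration: the vectors x6 and y6 and the other variable names are assumptions made for this example, and because only a subset of the data is used, the resulting W is not the W = 117 of the full analysis.

```r
# Illustrative subset: the six rows shown by head(alcohol), not the full data.
x6 <- c(1042, 1617, 1180, 973, 1552, 1251)  # Control
y6 <- c(874, 389, 612, 798, 1152, 893)      # SST

n1 <- length(x6)

# Rank all observations together, then sum the ranks of the Control values.
r <- rank(c(x6, y6))
rank_sum_x <- sum(r[seq_len(n1)])

# W as reported by wilcox.test: rank sum of x minus its smallest possible value.
W_manual <- rank_sum_x - n1 * (n1 + 1) / 2

# Equivalent view: number of (Control, SST) pairs with Control > SST
# (valid here because the subset has no ties).
W_pairs <- sum(outer(x6, y6, ">"))

# Same statistic from R's built-in test, for comparison.
W_r <- unname(wilcox.test(x6, y6, alternative = "greater",
                          exact = FALSE, correct = FALSE)$statistic)

print(c(manual = W_manual, pairs = W_pairs, wilcox.test = W_r))
```

The three values should agree, which shows that a large W means that Control values tend to exceed SST values.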
Interpretation of the Result:
The p-value (0.0008481) is small (<0.05); thus there is statistically significant evidence that the patients in the SST group have a lower alcohol intake than those in the Control group.
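Because the test was run with exact=F and correct=F, the p-value comes from a normal approximation to the distribution of W. The sketch below reproduces that calculation from the reported W = 117. The group sizes n1 = 12 and n2 = 11 are an assumption inferred from the data description (24 patients, one SST patient without reports), and the formula assumes no tied observations.

```r
# Assumed group sizes: 12 Control and 11 SST patients
# (24 patients randomized, one SST patient submitted no reports).
n1 <- 12
n2 <- 11
W  <- 117  # statistic reported by wilcox.test

# Mean and standard deviation of W under the null hypothesis (no ties).
mu_W    <- n1 * n2 / 2
sigma_W <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# One-sided p-value for alternative = "greater", no continuity correction.
z <- (W - mu_W) / sigma_W
p_approx <- pnorm(z, lower.tail = FALSE)

print(c(z = z, p = p_approx))
```

Under these assumptions the result should be close to the p-value printed by wilcox.test.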
Remark:
The parametric equivalent (two-sample t-test) is given below:
>t.test(x,y,alternative="greater", paired=F,var.equal=F)
Welch Two Sample t-test
data: x and y
t = 3.9747, df = 20.599, p-value = 0.0003559
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
258.5566 Inf
sample estimates:
mean of x mean of y
1196.1667 739.9091
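For readers who want to see what the Welch test computes, here is a minimal sketch of the t statistic and the Welch-Satterthwaite degrees of freedom, checked against t.test. It uses only the six rows printed by head(alcohol), so the numbers differ from the full-data output above. The names x6, y6, v1, v2 and the *_manual variables are assumptions made for this illustration.

```r
# Illustrative subset: the six rows shown by head(alcohol), not the full data.
x6 <- c(1042, 1617, 1180, 973, 1552, 1251)  # Control
y6 <- c(874, 389, 612, 798, 1152, 893)      # SST

# Estimated variance of each sample mean.
v1 <- var(x6) / length(x6)
v2 <- var(y6) / length(y6)

# Welch t statistic: difference in means over its standard error.
t_manual <- (mean(x6) - mean(y6)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_manual <- (v1 + v2)^2 /
  (v1^2 / (length(x6) - 1) + v2^2 / (length(y6) - 1))

# One-sided p-value for alternative = "greater".
p_manual <- pt(t_manual, df_manual, lower.tail = FALSE)

# Same test with R's built-in function, for comparison.
fit <- t.test(x6, y6, alternative = "greater", paired = FALSE, var.equal = FALSE)

print(c(t = t_manual, df = df_manual, p = p_manual))
print(fit)
```

The non-integer degrees of freedom in the output (such as df = 20.599 for the full data) come from this approximation, which does not assume equal variances.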
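Note that this parametric test is also valid, since the normality assumption is met for both groups. This can be checked with the Anderson-Darling test:
>library(nortest) # ad.test is provided by the nortest package, not base R.
>ad.test(x)
>ad.test(y)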
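If the nortest package is not available, a base-R alternative is the Shapiro-Wilk test (shapiro.test, in the stats package). This swaps in a different normality test from the Anderson-Darling test used above. The sketch runs it on the six rows printed by head(alcohol) only, so it illustrates the call rather than reproducing the full-data check. The names x6, y6, sw_x and sw_y are assumptions made for this example.

```r
# Illustrative subset: the six rows shown by head(alcohol), not the full data.
x6 <- c(1042, 1617, 1180, 973, 1552, 1251)  # Control
y6 <- c(874, 389, 612, 798, 1152, 893)      # SST

# Shapiro-Wilk test of normality for each group (base R, stats package).
# A large p-value means no evidence against normality; with only six
# observations per group the test has little power.
sw_x <- shapiro.test(x6)
sw_y <- shapiro.test(y6)

print(sw_x)
print(sw_y)
```

With the full data, the same calls would be shapiro.test(x) and shapiro.test(y).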