p-values from compare_means() and wilcox.test() are different #141

smshuai · 2018-12-03T19:50:01Z

Hello,

Below is an example:

val = c(-0.78137127,-0.86180992,-0.91177614,-0.95413924,-0.80979775,
        -0.70236469,-0.96355688,-0.84418155,-0.30040466,1.25324304,
        0.53833376,3.01788826,5.35022326)
grp = c(rep('G1', 8), rep('G2', 5))
df = data.frame(val, grp)
wilcox.test(val~grp, df)
library(ggpubr)
compare_means(val~grp, df)

The p-value from wilcox.test() is 0.001554, while the one from compare_means() is 0.00431.

Any clue about the cause? Thanks!

Session info:

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2.2 ggpubr_0.2     magrittr_1.5   ggplot2_3.0.0 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17     rstudioapi_0.7   bindr_0.1.1      tidyselect_0.2.4 munsell_0.5.0   
 [6] colorspace_1.3-2 R6_2.2.2         rlang_0.2.1      plyr_1.8.4       dplyr_0.7.6     
[11] tools_3.4.1      grid_3.4.1       gtable_0.2.0     utf8_1.1.4       cli_1.0.0       
[16] withr_2.1.2      yaml_2.1.19      lazyeval_0.2.1   assertthat_0.2.0 tibble_1.4.2    
[21] crayon_1.3.4     purrr_0.2.5      tidyr_0.8.1      glue_1.2.0       compiler_3.4.1  
[26] pillar_1.2.3     scales_0.5.0     pkgconfig_2.0.1


TapsiS · 2019-02-06T16:28:37Z

I am also getting different p-values. Here is an example:

library(ggpubr)

# example data
before <- c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
after <- c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)

# create a data frame
my_data <- data.frame(
  group = rep(c("before", "after"), each = 10),
  weight = c(before, after)
)

# standard Wilcoxon test
wilcox.test(after, before, paired = TRUE, alternative = "two.sided")

## V = 55, p-value = 0.001953

# using stat_compare_means
ggboxplot(my_data, x = "group", y = "weight",
          color = "group", palette = c("#00AFBB", "#E7B800"),
          add = "jitter", shape = "group") +
  stat_compare_means(comparisons = list(c("after", "before")), method = "wilcox.test", paired = TRUE,
                     method.args = list(alternative = "two.sided"))

## p-value 0.0059

stemicha · 2019-04-01T17:43:58Z

same here:
library(ggpubr)
library(tidyverse)

set.seed(666)

# generate data
df <- tibble(group = c(rep("group1", 5), rep("group2", 5)),
             int = c(rnorm(n = 5, mean = 100), rnorm(5, mean = 1000)))
# log2-transform the data
df$log2_int <- log2(df$int)

# on log2 data
# Wilcoxon test
wilcox.test(x = unlist(df %>% filter(group == "group1") %>% select(log2_int)),
            y = unlist(df %>% filter(group == "group2") %>% select(log2_int)),
            paired = FALSE,
            alternative = "two.sided")

## p-value = 0.007937

# ggpubr
compare_means(log2_int ~ group, df, method = "wilcox.test", paired = FALSE, alternative = "two.sided")

## p-value = 0.012


kassambara · 2019-06-03T06:00:56Z

Fixed now, thanks!

Previously, when method = "wilcox.test", compare_means() automatically set the option exact = FALSE. This is no longer the case.
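For reference, the effect of that option can be checked in base R alone. The sketch below reuses the data from the first post and assumes that the old compare_means() behaviour amounted to calling wilcox.test() with exact = FALSE; it does not call ggpubr. The names p_exact and p_approx are only illustrative.

```r
# Data from the first post in this thread
val <- c(-0.78137127, -0.86180992, -0.91177614, -0.95413924, -0.80979775,
         -0.70236469, -0.96355688, -0.84418155, -0.30040466, 1.25324304,
         0.53833376, 3.01788826, 5.35022326)
grp <- c(rep("G1", 8), rep("G2", 5))
df <- data.frame(val, grp)

# Default: small samples without ties, so wilcox.test() computes an exact p-value
p_exact <- wilcox.test(val ~ grp, data = df)$p.value

# exact = FALSE forces the normal approximation (with continuity correction)
p_approx <- wilcox.test(val ~ grp, data = df, exact = FALSE)$p.value

signif(p_exact, 4)   # 0.001554, the wilcox.test() value reported above
signif(p_approx, 3)  # 0.00431, the compare_means() value reported above

# Every G2 value exceeds every G1 value, so the exact two-sided p-value
# is 2 / (number of ways to choose the 5 G2 ranks out of 13)
2 / choose(13, 5)    # 0.001554
# Same reasoning for the 5-vs-5 example from stemicha
2 / choose(10, 5)    # 0.007937
```

With samples this small and fully separated, the normal approximation is noticeably more conservative than the exact test, which would account for the gap reported in the posts above.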

kazumits · 2019-06-11T06:55:09Z

Please also remove the exact=FALSE in stat_compare_means().

kassambara · 2019-06-11T19:55:34Z

removed now, thanks
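The paired example posted earlier for stat_compare_means() shows the same pattern. As a rough check in base R only (assuming, as discussed above, that the plotted p-value came from wilcox.test() with exact = FALSE), the two values can be reproduced like this:

```r
# Paired data from TapsiS's example
before <- c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
after  <- c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)

# Exact signed-rank test (the default for n < 50 with no ties or zero differences)
p_exact <- wilcox.test(after, before, paired = TRUE)$p.value

# Normal approximation with continuity correction
p_approx <- wilcox.test(after, before, paired = TRUE, exact = FALSE)$p.value

signif(p_exact, 4)   # 0.001953, as reported for wilcox.test()
signif(p_approx, 2)  # 0.0059, as reported for stat_compare_means()

# All 10 differences are positive, so the exact two-sided p-value is 2 / 2^10
2 / 2^10             # 0.001953125
```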

hitrp · 2021-04-13T08:24:50Z

I found that wilcox.test(df, mu = xx) and compare_means(df, ref.group = '.all.') give different results for a one-sample test. After digging into the details, it seems that compare_means() treats all values in df as the .all. group and runs a two-sample test (Wilcoxon rank-sum) rather than a one-sample test. This may violate the independence assumption, since each group's values also appear in the .all. group. Am I thinking about this correctly?

Here is a simple example that reproduces the results:

val = c(-0.78137127,-0.86180992,-0.91177614,-0.95413924,-0.80979775,
-0.70236469,-0.96355688,-0.84418155,-0.30040466,1.25324304,
0.53833376,3.01788826,5.35022326)
grp = c(rep('G1', 8), rep('G2', 5))
df = data.frame(val, grp)
wilcox.test(df[df['grp']=='G1','val'], mu=mean(df[,'val']))
compare_means(val~grp, df, ref.group='.all.')

Also, here is the data that compare_means() works with when handling '.all.', followed by the formula it tests:
   grp        val .group.
1   G1 -0.7813713      G1
2   G1 -0.8618099      G1
3   G1 -0.9117761      G1
4   G1 -0.9541392      G1
5   G1 -0.8097977      G1
6   G1 -0.7023647      G1
7   G1 -0.9635569      G1
8   G1 -0.8441815      G1
9   G2 -0.3004047      G2
10  G2  1.2532430      G2
11  G2  0.5383338      G2
12  G2  3.0178883      G2
13  G2  5.3502233      G2
14  G1 -0.7813713   .all.
15  G1 -0.8618099   .all.
16  G1 -0.9117761   .all.
17  G1 -0.9541392   .all.
18  G1 -0.8097977   .all.
19  G1 -0.7023647   .all.
20  G1 -0.9635569   .all.
21  G1 -0.8441815   .all.
22  G2 -0.3004047   .all.
23  G2  1.2532430   .all.
24  G2  0.5383338   .all.
25  G2  3.0178883   .all.
26  G2  5.3502233   .all.

val ~ .group.
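To make the distinction concrete, here is a minimal base-R sketch of the two tests being contrasted. It assumes compare_means() does what the table above suggests (each group is compared against the pooled values with a rank-sum test); that is an inference from the printed data, not confirmed ggpubr behaviour, and the object names are illustrative.

```r
val <- c(-0.78137127, -0.86180992, -0.91177614, -0.95413924, -0.80979775,
         -0.70236469, -0.96355688, -0.84418155, -0.30040466, 1.25324304,
         0.53833376, 3.01788826, 5.35022326)
grp <- c(rep("G1", 8), rep("G2", 5))
g1 <- val[grp == "G1"]

# One-sample test: Wilcoxon signed-rank of G1 against a fixed location mu
one_sample <- wilcox.test(g1, mu = mean(val))

# Two-sample test: Wilcoxon rank-sum of G1 against the pooled values.
# The pooled sample contains g1 itself, so the samples are not independent
# and every g1 value is tied; exact = FALSE avoids the ties warning.
two_sample <- wilcox.test(g1, val, exact = FALSE)

one_sample$method
two_sample$method
c(one_sample = one_sample$p.value, two_sample = two_sample$p.value)
```

The two calls answer different questions (location of G1 relative to a constant versus G1 relative to a sample that includes it), so differing p-values are expected.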

kassambara added a commit that referenced this issue Jun 3, 2019

removing the automatic option exact = FALSE used when method = "wilcox.test" #141

bd67fb7

kassambara closed this as completed Jun 3, 2019

kassambara added a commit that referenced this issue Jun 11, 2019

exact = FALSE removed #141

e229519

parrist mentioned this issue Jul 11, 2019

Different p-values using stat_compare_means and compare_means with wilcox.test #193

Closed
