
posthoc_conover_friedman calculates incorrect p-values #63

Closed
liborak opened this issue Nov 29, 2023 · 9 comments

liborak commented Nov 29, 2023

Describe the bug
I have identified a discrepancy between the p-values produced by 'scikit_posthocs.posthoc_conover_friedman' and the results obtained through manual calculation based on the cited references ([1] Conover & Iman, 1979; [2] Conover, 1999). Upon closer examination of '_posthocs.py', I believe the issue lies in the formula implemented on line 731 (the calculation of T2), which is inconsistent with the formula specified in references [1] and [2]:

# (...) 
# Below is the critical part of scikit_posthocs.posthoc_conover_friedman function:
x['mat'] = x.groupby(_block_col)[_y_col].rank()
R = x.groupby(_group_col)['mat'].sum()
A1 = (x['mat'] ** 2).sum()
m = 1
S2 = m/(m*k - 1.) * (A1 - m*k*n*(m*k + 1.)**2./4.)
T2 = 1. / S2 * (np.sum(R) - n * m * ((m * k + 1.) / 2.)**2.)  # line 731 in _posthocs.py
A = S2 * (2. * n * (m * k - 1.)) / (m * n * k - k - n + 1.)
B = 1. - T2 / (n * (m * k - 1.))
# (...)
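
For reference, the summation I read in [1] and [2] (each centered column rank sum is squared before summing) would look like this in the same notation; this is an illustrative sketch, not code from the library:

# Expected per [1] and [2]: square each centered rank sum R_j, then sum
T2 = 1. / S2 * np.sum((R - n * m * (m * k + 1.) / 2.) ** 2.)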

Dataset
See the code below.

To Reproduce

from scikit_posthocs import posthoc_conover_friedman

data=[
	[3.88, 30.58, 25.24, 4.44, 29.41, 38.87],
	[5.64, 30.14, 33.52, 7.94, 30.72, 33.12],
	[5.76, 16.92, 25.45, 4.04, 32.92, 39.15],
	[4.25, 23.19, 18.85, 4.40, 28.23, 28.06],
	[5.91, 26.74, 20.45, 4.23, 23.35, 38.23],
	[4.33, 10.91, 26.67, 4.36, 12.00, 26.65]
]

pval = posthoc_conover_friedman(data, p_adjust=None)

The script yields the following p-values:

          0         1         2         3         4         5
0  1.000000  0.036732  0.019296  0.771000  0.009817  0.001127
1  0.036732  1.000000  0.771000  0.067323  0.561483  0.153707
2  0.019296  0.771000  1.000000  0.036732  0.771000  0.250288
3  0.771000  0.067323  0.036732  1.000000  0.019296  0.002360
4  0.009817  0.561483  0.771000  0.019296  1.000000  0.385789
5  0.001127  0.153707  0.250288  0.002360  0.385789  1.000000
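
As a side note, the treatments can be given labels instead of the default 0-5 indices by passing a pandas DataFrame; a minimal sketch, assuming the default wide (non-melted) block-design convention (rows are blocks, columns are groups), in which case the column labels should carry through to the output matrix:

import pandas as pd
from scikit_posthocs import posthoc_conover_friedman

# "data" is the list of lists defined above; T1..T6 are hypothetical treatment labels
df = pd.DataFrame(data, columns=["T1", "T2", "T3", "T4", "T5", "T6"])
pval = posthoc_conover_friedman(df, p_adjust=None)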

Expected behavior
The correct results (as I believe) for the test data are reported on page 18 of "The Pairwise Multiple Comparison of Mean Ranks Package (PMCMR)" by Thorsten Pohlert (2016).

[image: screenshot of the unadjusted pairwise p-value table on page 18 of Pohlert (2016)]

I have verified these results using the script below and independently through manual calculation. The verification followed the methodology outlined in [2] Conover, 1999 (page 371) and shows substantial discrepancies from the p-values produced by scikit_posthocs.

import numpy as np
import scipy.stats as ss

# Data
data = np.array([
    # T1     T2     T3    T4    T5      T6      = treatments (k columns)
    [3.88, 30.58, 25.24, 4.44, 29.41, 38.87],   # Block 1
    [5.64, 30.14, 33.52, 7.94, 30.72, 33.12],   # Block 2
    [5.76, 16.92, 25.45, 4.04, 32.92, 39.15],   # Block 3
    [4.25, 23.19, 18.85, 4.40, 28.23, 28.06],   # Block 4
    [5.91, 26.74, 20.45, 4.23, 23.35, 38.23],   # Block 5
    [4.33, 10.91, 26.67, 4.36, 12.00, 26.65]    # Block 6
])

# Rank data within each block (row)
data_ranked = np.array([ss.rankdata(row) for row in data])

# Calculate Conover statistics (tval)
b, k = data.shape                      # b blocks, k treatments
A1 = np.sum(data_ranked**2)            # sum of squared within-block ranks
C1 = b*k*(k+1)**2/4
R = np.sum(data_ranked, axis=0)        # treatment (column) rank sums
T1 = (k-1)*(np.sum(R**2)-b*C1)/(A1-C1)
SE = ((A1-C1)*2*b/((b-1)*(k-1))*(1-T1/(b*(k-1))))**0.5

difR = np.zeros((k, k))
for i in range(k):
    for j in range(k):
        difR[i, j] = abs(R[i] - R[j])

tval = difR / SE

# Calculate p-values
pval = 2 * ss.t.sf(tval, df=(b*k - k - b + 1))

This yields the following p-values:

[[1.000000 0.000143 0.000030 0.555472 0.000007 0.000000]
 [0.000143 1.000000 0.555472 0.000666 0.243213 0.006214]
 [0.000030 0.555472 1.000000 0.000143 0.555472 0.024680]
 [0.555472 0.000666 0.000143 1.000000 0.000030 0.000000]
 [0.000007 0.243213 0.555472 0.000030 1.000000 0.085107]
 [0.000000 0.006214 0.024680 0.000000 0.085107 1.000000]]

These values are consistent with Pohlert's results.

System and package information (please complete the following information):

  • OS: Windows 10 Pro
  • Package version:
    • scipy 1.11.1
    • scikit_posthocs 0.8.0

Additional context
No p-value adjustment was used in the tested runs.

maximtrp (Owner) commented Nov 30, 2023

Hello! Thank you! Let's try to investigate this one more deeply. Could you please run the following R script and post your results here?

library(PMCMRplus)

y <- matrix(c(
  3.88, 5.64, 5.76, 4.25, 5.91, 4.33, 30.58, 30.14, 16.92,
  23.19, 26.74, 10.91, 25.24, 33.52, 25.45, 18.85, 20.45,
  26.67, 4.44, 7.94, 4.04, 4.4, 4.23, 4.36, 29.41, 30.72,
  32.92, 28.23, 23.35, 12, 38.87, 33.12, 39.15, 28.06, 38.23,
  26.65), nrow=6, ncol=6,
  dimnames=list(1:6, LETTERS[1:6])
)

frdAllPairsConoverTest(y, p.adjust.method = 'none')

Here are my results (equal to scikit-posthocs output):

  A      B      C      D      E     
B 0.0367 -      -      -      -     
C 0.0193 0.7710 -      -      -     
D 0.7710 0.0673 0.0367 -      -     
E 0.0098 0.5615 0.7710 0.0193 -     
F 0.0011 0.1537 0.2503 0.0024 0.3858

P value adjustment method: none

maximtrp (Owner) commented Nov 30, 2023

Also, if you look at the PMCMRplus code for frdAllPairsConoverTest, you will find that the scikit-posthocs implementation is very close to it, nearly identical in some lines.

I assume that you are using the latest version of PMCMRplus.

liborak (Author) commented Nov 30, 2023

Hello Max! I'm not familiar with R and don't have the capability to run R scripts. However, I've manually verified the results by substituting values into the formulas as outlined by Conover in his book "Practical Nonparametric Statistics" (3rd Edition, Wiley, 1999; pages 369-371). I suspect the parameter T2 is the critical piece here. The PMCMRplus code references Conover & Iman's original paper (1979), which specifies T2 as follows:

[image: excerpt from the paper showing Conover's formula for T2]

However, in PMCMRplus code (and also in scikit-posthocs), T2 is calculated as follows:

PMCMRplus:

  S2 <- m / (m * k - 1) * (sum(r ^ 2) - m * k * n *
                                 (m * k + 1) ^ 2 / 4)
  T2 <- 1 / S2 * (sum(R.sum) - n * m * ((m * k + 1) / 2) ^ 2)

scikit-posthocs:

  R = x.groupby(_group_col)['mat'].sum()
  S2 = m/(m*k - 1.) * (A1 - m*k*n*(m*k + 1.)**2./4.)
  T2 = 1. / S2 * (np.sum(R) - n * m * ((m * k + 1.) / 2.)**2.)

Indeed, the two implementations are the same. In both, however, the square is applied only to the term ((m * k + 1) / 2) and lies outside the summation over the rank sums R, whereas Conover squares the whole expression inside the summation (the one containing R and the terms with n, m, k). The two are not equivalent.
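
To make the difference concrete, here is a self-contained sketch contrasting the two variants on the data from this issue, reusing the n, m, k, S2 notation of the snippets above (an illustration only, not the library's code):

import numpy as np
import scipy.stats as ss

data = np.array([
    [3.88, 30.58, 25.24, 4.44, 29.41, 38.87],
    [5.64, 30.14, 33.52, 7.94, 30.72, 33.12],
    [5.76, 16.92, 25.45, 4.04, 32.92, 39.15],
    [4.25, 23.19, 18.85, 4.40, 28.23, 28.06],
    [5.91, 26.74, 20.45, 4.23, 23.35, 38.23],
    [4.33, 10.91, 26.67, 4.36, 12.00, 26.65]
])

n, k = data.shape   # n blocks (rows), k treatments (columns)
m = 1               # one observation per block and treatment

ranks = np.array([ss.rankdata(row) for row in data])  # rank within each block
R = ranks.sum(axis=0)                                 # treatment rank sums
A1 = np.sum(ranks ** 2)
S2 = m / (m * k - 1.) * (A1 - m * k * n * (m * k + 1.) ** 2. / 4.)

# As implemented in PMCMRplus/scikit-posthocs: the rank sums are summed
# unsquared and the constant term is squared outside the summation
T2_implemented = 1. / S2 * (np.sum(R) - n * m * ((m * k + 1.) / 2.) ** 2.)

# As I read Conover: each centered rank sum is squared inside the summation
T2_reference = 1. / S2 * np.sum((R - n * m * (m * k + 1.) / 2.) ** 2.)

print(T2_implemented, T2_reference)  # the two statistics differ

Since A and B (and hence the t-statistics and p-values) are built from T2 in the code shown above, the two variants lead to the different p-value matrices reported earlier.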

My results correspond to Pohlert's documentation for PMCMR, as shown in my bug report. Maybe the code changed in PMCMRplus (I didn't check); if so, a bug may have been introduced with the change.

I compared both of Conover's publications (1979 and 1999), and although he gives slightly different formulas in them, they produce the same results.

maximtrp (Owner) commented Dec 1, 2023

Interesting! Thank you! I need some time to look into it. If you have time, please report this in the PMCMRplus repo as well.

liborak (Author) commented Dec 4, 2023

Thank you. I have compared the code for Conover's test in PMCMR and PMCMRplus, and the calculations of T2 are indeed different in the two packages. The correct formula, i.e. the one consistent with the references, is the one in PMCMR. I have already reported this discrepancy to PMCMRplus.

maximtrp (Owner) commented Dec 5, 2023

Thank you!

liborak (Author) commented Dec 10, 2023

Here is an email response from the author & maintainer of the PMCMRplus package confirming the bug:

I can confirm the issue and I have already discovered the bug.
I will fix it as soon as possible and will submit it to CRAN.
The results from the older PMCMR package were correct.

liborak (Author) commented Dec 10, 2023

It seems that the bug has already been fixed in PMCMRplus version 1.9.10; see the release news below. Please note that the rdrr.io snippet for running the R code online uses version 1.7.0 of PMCMRplus and therefore still contains the bug.

[image: PMCMRplus 1.9.10 release news describing the fix]

maximtrp (Owner) commented:
Thank you! Fixed in the latest release.
