@daikonradish
Contributor

fixes #148.

The Kruskal-Wallis test requires some underlying variation within the dataset. That means the test should fail when all the provided samples consist of the same value.

This fix inserts a check for the first rank and the last rank within kruskalWallisRank. If the first rank is the same as the last rank, then there is no underlying variation, and Nothing is returned and propagated forward.
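To illustrate the check described above, here is a minimal Python sketch (not the library's Haskell code). The function name `pooled_ranks` is hypothetical, and `None` stands in for Haskell's `Nothing`. It pools the samples, assigns mid-ranks to ties, and bails out when the first and last ranks coincide.

```python
def pooled_ranks(samples):
    """Return per-sample mid-ranks, or None if the pooled data has no variation."""
    # Tag each value with the index of its sample, then sort by value.
    tagged = sorted((x, i) for i, s in enumerate(samples) for x in s)
    ranks = [0.0] * len(tagged)
    pos = 0
    while pos < len(tagged):
        end = pos
        while end < len(tagged) and tagged[end][0] == tagged[pos][0]:
            end += 1
        # 0-based positions pos..end-1 tie; they share the average 1-based rank.
        mid = (pos + 1 + end) / 2
        for j in range(pos, end):
            ranks[j] = mid
        pos = end
    # Degenerate case: the smallest and largest ranks coincide,
    # so every pooled value is identical.
    if not ranks or ranks[0] == ranks[-1]:
        return None
    out = [[] for _ in samples]
    for (_, i), r in zip(tagged, ranks):
        out[i].append(r)
    return out
```

Because the ranks are assigned in sorted order, comparing only the first and last rank is enough to detect that all values tie.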

@Shimuuar
Collaborator

kruskalWallis and kruskalWallisRank are part of the public API, so this is a breaking change. I'll try to recall how the Kruskal-Wallis test works and find a way to avoid breaking changes.

P.S. Build problems with GHC 9.0 and 9.2 are

@Shimuuar
Collaborator

P.P.S. The build problems with GHC 9.0 and 9.2 are due to a caching problem. Ignore them.

Collaborator

@Shimuuar left a comment

What is the behavior of existing implementations of the test? scipy throws an exception; R returns NaN for the p-value. So we need to decide what the desired behavior is. The standard formula is not defined in that case, but maybe there's a sensible limit.

Numerical experimentation suggests that there's no sensible limit. If we run the test on two large samples where almost all elements are identical except for a few, the p-value depends only on the number of elements that differ from the mode and is nearly independent of the sample size. I think returning Nothing is the correct choice.
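The experiment can be reproduced with a small pure-Python sketch. This is an illustration under stated assumptions, not the code used in the review: `kruskal_wallis_h`, `p_value_two_groups`, and `nearly_constant` are hypothetical names, the statistic uses the textbook tie correction, and the p-value uses the chi-square approximation with one degree of freedom (two groups).

```python
import math
from collections import Counter

def kruskal_wallis_h(samples):
    """Tie-corrected Kruskal-Wallis H, or None when all values are identical."""
    counts = Counter(x for s in samples for x in s)
    n_total = sum(counts.values())
    if len(counts) < 2:
        return None  # no variation: the tie correction would be zero
    # Mid-rank of each distinct value (average of the 1-based ranks it occupies).
    rank_of, start = {}, 0
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t + 1) / 2
        start += t
    rank_term = sum(sum(rank_of[x] for x in s) ** 2 / len(s) for s in samples)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N).
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
    return h / correction

def p_value_two_groups(h):
    """Chi-square survival function with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(h, 0.0) / 2))

def nearly_constant(n, m):
    """Two samples of size n; m elements of the first differ from the mode."""
    return [[1.0] * m + [0.0] * (n - m), [0.0] * n]

for n in (100, 10000):
    h = kruskal_wallis_h(nearly_constant(n, 3))
    print(n, h, p_value_two_groups(h))
```

For this particular construction the statistic works out to H = (N - 1) m / (N - m) with N = 2n, which tends to m as N grows. The p-value is therefore governed by m rather than by the sample size, and no single limiting value exists for the all-identical case.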

I'd like to avoid breaking changes if possible:

  • The kruskalWallisRank function is perfectly well defined for samples where all elements are the same. No need to change anything.
  • Same for kruskalWallis. It returns NaN, but that's quite a reasonable way to communicate failure in numeric code.

P.S. Please rebase on top of master. I've fixed CI failures.

Development

Successfully merging this pull request may close these issues.

Kruskal Wallis out of range

2 participants