Weighted Wilcoxon Rank Sum Test


The Wilcoxon rank sum test (WRST) is an extremely useful test from nonparametric statistics. One issue arises, however, in sample survey work. In a basic sample survey, a unit, say a business establishment, has a weight w_i, which means that the unit represents w_i units in the population (itself and w_i - 1 others). The issue is that the WRST does not incorporate survey weights. Additionally, the standard SAS procedures and R functions for doing a WRST do not accept survey weights. In this article I will give several ways (ranging from bad, to OK, to probably correct) to carry out a WRST using survey weights.

a) First, there is the naive way: create w_i copies of each observation, given the observation's weight w_i, and then run the Wilcoxon rank sum test as usual on this expanded dataset. I'd proceed with extreme caution here because the approach is ad hoc and artificially inflates the 'sample size', which tends to make p-values too small, but it seems a reasonable first stab at it in my opinion. I believe this because experience tells me that the impact of the sampling design and weights on statistical tests is sometimes great and sometimes not so great. With this approach, your dataset grows from size M (the number of sampled units) to size N (the sum of the weights). Again, beware.
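As a rough illustration of approach a), here is a minimal Python sketch using only the standard library. The data, the weights, and the helper names `expand` and `rank_sum_test` are all made up for this example, and the p-value comes from the usual large-sample normal approximation (midranks for ties, no continuity correction), so treat it as a sketch rather than a reference implementation.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (W, p), where W is the sum of the ranks of x in the pooled sample.
    Ties get midranks and the variance is tie-corrected.
    """
    pooled = sorted(list(x) + list(y))
    n1, n2 = len(x), len(y)
    n = n1 + n2
    midrank = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # Positions i+1 .. j share the same value; give them their average rank.
        midrank[pooled[i]] = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t**3 - t
        i = j
    w = sum(midrank[v] for v in x)
    mean_w = n1 * (n + 1) / 2.0
    var_w = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w <= 0:  # every value identical: no evidence either way
        return w, 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    return w, math.erfc(abs(z) / math.sqrt(2))

def expand(values, weights):
    """Repeat each value round(w_i) times (assumes near-integer weights)."""
    out = []
    for v, w in zip(values, weights):
        out.extend([v] * int(round(w)))
    return out

# Hypothetical survey data: two groups, one weight per sampled unit.
group_a, w_a = [12.1, 15.3, 9.8, 20.4], [3, 1, 5, 2]
group_b, w_b = [18.2, 22.5, 17.9, 25.1], [2, 4, 1, 3]

big_a, big_b = expand(group_a, w_a), expand(group_b, w_b)
print("expanded size N =", len(big_a) + len(big_b))
print("unweighted (W, p):", rank_sum_test(group_a, group_b))
print("expanded   (W, p):", rank_sum_test(big_a, big_b))
```

Comparing the two printed results on your own data gives a feel for how much the duplication alone moves the p-value. The duplicated values create heavy ties, which is another reason to read the expanded result with caution.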

b) Another way is to do the duplication described in a) above, but then randomly sample M observations with replacement from the N observations, so that you're back at a dataset of the "original" size M, and then carry out the Wilcoxon rank sum test as you normally would. The logic is that this resampled dataset reflects the weights better than ignoring the weights altogether.
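The resampling step in b) can be sketched without physically building the expanded dataset. For integer weights, drawing with replacement from the N duplicated rows is the same as drawing the original rows with probability proportional to their weights, which is what `random.choices` does. The records, weights, and function name below are hypothetical.

```python
import random

# Hypothetical records: (group label, value), with one survey weight per record.
records = [("A", 12.1), ("A", 15.3), ("A", 9.8), ("A", 20.4),
           ("B", 18.2), ("B", 22.5), ("B", 17.9), ("B", 25.1)]
weights = [3, 1, 5, 2, 2, 4, 1, 3]

def resample_to_original_size(records, weights, rng):
    """Draw M = len(records) rows with replacement, proportional to weight."""
    return rng.choices(records, weights=weights, k=len(records))

rng = random.Random(2024)  # fixed seed only so the example is reproducible
sample = resample_to_original_size(records, weights, rng)

# Split the resampled rows back into the two groups for the rank sum test.
resampled_a = [value for group, value in sample if group == "A"]
resampled_b = [value for group, value in sample if group == "B"]
print(len(sample), len(resampled_a), len(resampled_b))
```

Note that the group sizes in the resampled data are random, and with very small samples a group can occasionally come out empty, so it is worth checking for that before running the test.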

c) A way to improve on the above is to repeat the procedure described in b) many times, say 100, and report averages and other summaries over these 100 replications. One could, for example, examine the distributions of the Wilcoxon statistics and p-values to reach a conclusion.
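Here is one way the repeated version in c) might look in Python, again standard library only and with hypothetical data and function names. The test is computed in its equivalent Mann-Whitney U form with a normal-approximation p-value; the summaries at the end are just examples of what one might report.

```python
import random
import statistics
from collections import Counter
from math import erfc, sqrt

def rank_sum_u(x, y):
    """Mann-Whitney U form of the rank sum test; returns (U, two-sided p)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # U counts pairs with x > y, with ties counted as one half.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    ties = sum(t**3 - t for t in Counter(x + y).values())
    var = n1 * n2 / 12.0 * ((n + 1) - ties / (n * (n - 1)))
    if var <= 0:
        return u, 1.0
    z = (u - n1 * n2 / 2.0) / sqrt(var)
    return u, erfc(abs(z) / sqrt(2))

def replicate(records, weights, n_reps=100, seed=0):
    """Repeat the weighted resample-and-test step n_reps times."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_reps):
        sample = rng.choices(records, weights=weights, k=len(records))
        x = [v for g, v in sample if g == "A"]
        y = [v for g, v in sample if g == "B"]
        if x and y:  # skip the rare replicate where one group is empty
            results.append(rank_sum_u(x, y))
    return results

# Hypothetical records: (group label, value), with one survey weight per record.
records = [("A", 12.1), ("A", 15.3), ("A", 9.8), ("A", 20.4),
           ("B", 18.2), ("B", 22.5), ("B", 17.9), ("B", 25.1)]
weights = [3, 1, 5, 2, 2, 4, 1, 3]

results = replicate(records, weights, n_reps=100, seed=0)
p_values = [p for _, p in results]
print("replicates used:", len(results))
print("mean p:", statistics.mean(p_values))
print("median p:", statistics.median(p_values))
print("share with p < 0.05:", sum(p < 0.05 for p in p_values) / len(p_values))
```

Because the group sizes change from replicate to replicate, the raw U statistics are not on a common scale, so the p-values (or U divided by the number of pairs) are easier to compare across replicates.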

d) If you are willing to use a probably less powerful test than the Wilcoxon rank sum test, you can do the Wilcoxon signed rank test on the original data (keeping in mind that the signed rank test is designed for paired or one-sample data). To get this weighted, you can create the signed ranks yourself and then put these signed ranks through SAS's PROC SURVEYMEANS, which can incorporate the survey weights. Note, however, that this ultimately carries out a t-test to get the p-value.

e) A more sophisticated way, and probably the correct way mathematically speaking, is something I recently found in R: svyranktest in the 'survey' package (see https://www.rdocumentation.org/packages/survey/versions/3.32-1/topics/svyranktest), a nonparametric alternative to the weighted t-test (wtd.t.test) in R. Please note that I have never tried it, but I have heard good things from people who have.

The theory for this comes from:

T. Lumley and A.J. Scott (2013). Two-sample rank tests under complex sampling. Biometrika, 100, 831-842. (https://stattech.blogs.auckland.ac.nz/files/2012/06/ranktests-techrep.pdf)

f) Last, I would simply bypass nonparametric approaches entirely and try something like a weighted two-sample t-test! Nonparametric tests can definitely be a headache when working with survey data, but if sample sizes are "large enough", parametric and nonparametric tests tend to agree: if not in the values of the test statistics, then at least in the conclusions drawn from them.
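To give a feel for what a weighted two-sample comparison of means involves, here is a deliberately rough Python sketch. It is not wtd.t.test and not a design-based survey test: it assumes weighted means, a weighted variance, Kish's effective sample size (sum of weights squared over sum of squared weights) in place of n, and a normal reference distribution instead of a t distribution. All names and data are hypothetical.

```python
import math

def weighted_summary(values, weights):
    """Weighted mean, weighted variance, and Kish effective sample size."""
    sw = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / sw
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / sw
    n_eff = sw ** 2 / sum(w * w for w in weights)
    return mean, var, n_eff

def weighted_two_sample_z(x, wx, y, wy):
    """Welch-style z statistic on weighted means; returns (z, two-sided p).

    Assumption: the variance of each weighted mean is approximated by
    (weighted variance) / (effective sample size). Stratification and
    clustering in the sampling design are ignored.
    """
    m1, v1, n1 = weighted_summary(x, wx)
    m2, v2, n2 = weighted_summary(y, wy)
    se = math.sqrt(v1 / n1 + v2 / n2)
    z = (m1 - m2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical survey data: two groups, one weight per sampled unit.
group_a, w_a = [12.1, 15.3, 9.8, 20.4], [3, 1, 5, 2]
group_b, w_b = [18.2, 22.5, 17.9, 25.1], [2, 4, 1, 3]

print(weighted_two_sample_z(group_a, w_a, group_b, w_b))
```

For real survey work, a design-based routine that knows about strata and clusters is the safer choice; this sketch only shows where the weights enter the calculation.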

I hope any of this helps! Thank you for reading.
