weightedMedian - HenrikBengtsson/matrixStats GitHub Wiki
matrixStats: Benchmark report
This report benchmark the performance of weightedMedian() against alternative methods.
- apply() + limma::weighted.median()
- apply() + cwhmisc::w.median()
- apply() + laeken::weightedMedian()
> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), na_prob = 0) {
+ mode <- match.arg(mode)
+ if (mode == "logical") {
+ x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+ } else {
+ x <- runif(n, min = range[1], max = range[2])
+ }
+ storage.mode(x) <- mode
+ if (na_prob > 0)
+ x[sample(n, size = na_prob * n)] <- NA
+ x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+ set.seed(seed)
+ data <- list()
+ data[[1]] <- rvector(n = scale * 100, ...)
+ data[[2]] <- rvector(n = scale * 1000, ...)
+ data[[3]] <- rvector(n = scale * 10000, ...)
+ data[[4]] <- rvector(n = scale * 1e+05, ...)
+ data[[5]] <- rvector(n = scale * 1e+06, ...)
+ names(data) <- sprintf("n = %d", sapply(data, FUN = length))
+ data
+ }
> data <- rvectors(mode = "double")
> data <- data[1:3]
> x <- data[["n = 1000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3282833 175.4 5709258 305.0 5709258 305.0
Vcells 6806071 52.0 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMedian = weightedMedian(x, w = w, ties = "mean", na.rm = FALSE),
+ `limma::weighted.median` = limma_weighted.median(x, w = w, na.rm = FALSE), `cwhmisc::w.median` = cwhmisc_w.median(x,
+ w = w), `laeken::weightedMedian` = laeken_weightedMedian(x, w = w), unit = "ms")
Table: Benchmarking of weightedMedian(), limma::weighted.median(), cwhmisc::w.median() and laeken::weightedMedian() on n = 1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
expr | min | lq | mean | median | uq | max | |
---|---|---|---|---|---|---|---|
1 | weightedMedian | 0.041404 | 0.0478795 | 0.0495992 | 0.0498755 | 0.0512730 | 0.064162 |
2 | limma::weighted.median | 0.061138 | 0.0658225 | 0.0691307 | 0.0677805 | 0.0695805 | 0.176117 |
3 | cwhmisc::w.median | 0.083722 | 0.0920515 | 0.0949106 | 0.0946210 | 0.0981945 | 0.109411 |
4 | laeken::weightedMedian | 0.128815 | 0.1360515 | 0.1430878 | 0.1392040 | 0.1432375 | 0.458474 |
expr | min | lq | mean | median | uq | max | |
---|---|---|---|---|---|---|---|
1 | weightedMedian | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
2 | limma::weighted.median | 1.476621 | 1.374753 | 1.393786 | 1.358994 | 1.357059 | 2.744880 |
3 | cwhmisc::w.median | 2.022075 | 1.922566 | 1.913551 | 1.897144 | 1.915131 | 1.705231 |
4 | laeken::weightedMedian | 3.111173 | 2.841540 | 2.884880 | 2.791030 | 2.793624 | 7.145569 |
Figure: Benchmarking of weightedMedian(), limma::weighted.median(), cwhmisc::w.median() and laeken::weightedMedian() on n = 1000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n = 10000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3281145 175.3 5709258 305.0 5709258 305.0
Vcells 6406585 48.9 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMedian = weightedMedian(x, w = w, ties = "mean", na.rm = FALSE),
+ `limma::weighted.median` = limma_weighted.median(x, w = w, na.rm = FALSE), `cwhmisc::w.median` = cwhmisc_w.median(x,
+ w = w), `laeken::weightedMedian` = laeken_weightedMedian(x, w = w), unit = "ms")
Table: Benchmarking of weightedMedian(), limma::weighted.median(), cwhmisc::w.median() and laeken::weightedMedian() on n = 10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
expr | min | lq | mean | median | uq | max | |
---|---|---|---|---|---|---|---|
2 | limma::weighted.median | 0.548102 | 0.6031085 | 0.6093292 | 0.6062180 | 0.6124475 | 0.718089 |
1 | weightedMedian | 0.560750 | 0.6194825 | 0.6355163 | 0.6356275 | 0.6517530 | 0.754366 |
4 | laeken::weightedMedian | 0.639606 | 0.7083430 | 0.7198366 | 0.7179295 | 0.7318340 | 0.887779 |
3 | cwhmisc::w.median | 0.674368 | 0.7343000 | 0.8258406 | 0.7468600 | 0.7648505 | 7.813275 |
expr | min | lq | mean | median | uq | max | |
---|---|---|---|---|---|---|---|
2 | limma::weighted.median | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
1 | weightedMedian | 1.023076 | 1.027149 | 1.042977 | 1.048513 | 1.064178 | 1.050519 |
4 | laeken::weightedMedian | 1.166947 | 1.174487 | 1.181359 | 1.184276 | 1.194933 | 1.236308 |
3 | cwhmisc::w.median | 1.230369 | 1.217526 | 1.355328 | 1.231999 | 1.248843 | 10.880650 |
Figure: Benchmarking of weightedMedian(), limma::weighted.median(), cwhmisc::w.median() and laeken::weightedMedian() on n = 10000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n = 100000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3281226 175.3 5709258 305.0 5709258 305.0
Vcells 6497151 49.6 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMedian = weightedMedian(x, w = w, ties = "mean", na.rm = FALSE),
+ `limma::weighted.median` = limma_weighted.median(x, w = w, na.rm = FALSE), `cwhmisc::w.median` = cwhmisc_w.median(x,
+ w = w), `laeken::weightedMedian` = laeken_weightedMedian(x, w = w), unit = "ms")
Table: Benchmarking of weightedMedian(), limma::weighted.median(), cwhmisc::w.median() and laeken::weightedMedian() on n = 100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
expr | min | lq | mean | median | uq | max | |
---|---|---|---|---|---|---|---|
2 | limma::weighted.median | 4.734878 | 5.171025 | 5.623710 | 5.307669 | 5.583745 | 17.654005 |
4 | laeken::weightedMedian | 5.305862 | 5.653216 | 6.284446 | 5.776996 | 6.863891 | 18.354886 |
3 | cwhmisc::w.median | 6.091735 | 6.468177 | 7.590119 | 6.613154 | 7.929010 | 24.079816 |
1 | weightedMedian | 6.728040 | 7.339554 | 7.576383 | 7.496245 | 7.812422 | 9.315709 |
expr | min | lq | mean | median | uq | max | |
---|---|---|---|---|---|---|---|
2 | limma::weighted.median | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0000000 |
4 | laeken::weightedMedian | 1.120591 | 1.093249 | 1.117491 | 1.088424 | 1.229263 | 1.0397010 |
3 | cwhmisc::w.median | 1.286566 | 1.250850 | 1.349664 | 1.245962 | 1.420017 | 1.3639860 |
1 | weightedMedian | 1.420953 | 1.419361 | 1.347222 | 1.412342 | 1.399137 | 0.5276825 |
Figure: Benchmarking of weightedMedian(), limma::weighted.median(), cwhmisc::w.median() and laeken::weightedMedian() on n = 100000 data. Outliers are displayed as crosses. Times are in milliseconds.
R version 3.6.1 Patched (2019-08-27 r77078)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS
Matrix products: default
BLAS: /home/hb/software/R-devel/R-3-6-branch/lib/R/lib/libRblas.so
LAPACK: /home/hb/software/R-devel/R-3-6-branch/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] microbenchmark_1.4-6 matrixStats_0.55.0-9000 ggplot2_3.2.1
[4] knitr_1.24 R.devices_2.16.0 R.utils_2.9.0
[7] R.oo_1.22.0 R.methodsS3_1.7.1 history_0.0.0-9002
loaded via a namespace (and not attached):
[1] Biobase_2.45.0 bit64_0.9-7 splines_3.6.1
[4] network_1.15 assertthat_0.2.1 highr_0.8
[7] stats4_3.6.1 blob_1.2.0 robustbase_0.93-5
[10] pillar_1.4.2 RSQLite_2.1.2 backports_1.1.4
[13] lattice_0.20-38 limma_3.40.6 glue_1.3.1
[16] digest_0.6.20 colorspace_1.4-1 sandwich_2.5-1
[19] Matrix_1.2-17 XML_3.98-1.20 lpSolve_5.6.13.3
[22] pkgconfig_2.0.2 genefilter_1.66.0 purrr_0.3.2
[25] ergm_3.10.4 xtable_1.8-4 mvtnorm_1.0-11
[28] scales_1.0.0 tibble_2.1.3 annotate_1.62.0
[31] IRanges_2.18.2 TH.data_1.0-10 withr_2.1.2
[34] BiocGenerics_0.30.0 lazyeval_0.2.2 mime_0.7
[37] survival_2.44-1.1 magrittr_1.5 crayon_1.3.4
[40] statnet.common_4.3.0 memoise_1.1.0 laeken_0.5.0
[43] R.cache_0.13.0 MASS_7.3-51.4 R.rsp_0.43.1
[46] tools_3.6.1 multcomp_1.4-10 S4Vectors_0.22.1
[49] trust_0.1-7 munsell_0.5.0 AnnotationDbi_1.46.1
[52] compiler_3.6.1 rlang_0.4.0 grid_3.6.1
[55] RCurl_1.95-4.12 cwhmisc_6.6 rappdirs_0.3.1
[58] labeling_0.3 bitops_1.0-6 base64enc_0.1-3
[61] boot_1.3-23 gtable_0.3.0 codetools_0.2-16
[64] DBI_1.0.0 markdown_1.1 R6_2.4.0
[67] zoo_1.8-6 dplyr_0.8.3 bit_1.1-14
[70] zeallot_0.1.0 parallel_3.6.1 Rcpp_1.0.2
[73] vctrs_0.2.0 DEoptimR_1.0-8 tidyselect_0.2.5
[76] xfun_0.9 coda_0.19-3
Total processing time was 7.3 secs.
To reproduce this report, do:
html <- matrixStats:::benchmark('weightedMedian')
Copyright Henrik Bengtsson. Last updated on 2019-09-10 21:14:58 (-0700 UTC). Powered by RSP.
<script> var link = document.createElement('link'); link.rel = 'icon'; link.href = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAMAAABEpIrGAAAA21BMVEUAAAAAAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8AAP8BAf4CAv0DA/wdHeIeHuEfH+AgIN8hId4lJdomJtknJ9g+PsE/P8BAQL9yco10dIt1dYp3d4h4eIeVlWqWlmmXl2iYmGeZmWabm2Tn5xjo6Bfp6Rb39wj4+Af//wA2M9hbAAAASXRSTlMAAQIJCgsMJSYnKD4/QGRlZmhpamtsbautrrCxuru8y8zN5ebn6Pn6+///////////////////////////////////////////LsUNcQAAAS9JREFUOI29k21XgkAQhVcFytdSMqMETU26UVqGmpaiFbL//xc1cAhhwVNf6n5i5z67M2dmYOyfJZUqlVLhkKucG7cgmUZTybDz6g0iDeq51PUr37Ds2cy2/C9NeES5puDjxuUk1xnToZsg8pfA3avHQ3lLIi7iWRrkv/OYtkScxBIMgDee0ALoyxHQBJ68JLCjOtQIMIANF7QG9G9fNnHvisCHBVMKgSJgiz7nE+AoBKrAPA3MgepvgR9TSCasrCKH0eB1wBGBFdCO+nAGjMVGPcQb5bd6mQRegN6+1axOs9nGfYcCtfi4NQosdtH7dB+txFIpXQqN1p9B/asRHToyS0jRgpV7nk4nwcq1BJ+x3Gl/v7S9Wmpp/aGquum7w3ZDyrADFYrl8vHBH+ev9AUASW1dmU4h4wAAAABJRU5ErkJggg==" document.getElementsByTagName('head')[0].appendChild(link); </script>