weightedMean - HenrikBengtsson/matrixStats GitHub Wiki
matrixStats: Benchmark report
This report benchmarks the performance of weightedMean() against the following alternative methods:
- stats::weighted.mean()
- stats:::weighted.mean.default()
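All three functions compute the same quantity, sum(w * x) / sum(w), so the benchmark compares speed only. The two stats entries differ in that stats::weighted.mean() is the S3 generic, which dispatches to stats:::weighted.mean.default() for plain numeric vectors. The following is a minimal sketch, not taken from the report, that checks this equivalence; the variable names are illustrative, and the matrixStats comparison runs only if that package is installed.

```r
set.seed(1)
x <- runif(1000, min = -100, max = 100)
w <- runif(length(x))

# Textbook definition of the weighted mean
manual <- sum(w * x) / sum(w)

# The S3 generic and the method it dispatches to for plain numeric vectors
via_generic <- stats::weighted.mean(x, w = w, na.rm = FALSE)
via_default <- stats:::weighted.mean.default(x, w = w, na.rm = FALSE)

stopifnot(isTRUE(all.equal(manual, via_generic)),
          isTRUE(all.equal(manual, via_default)))

# Optional: compare against matrixStats when it is available
if (requireNamespace("matrixStats", quietly = TRUE)) {
  via_matrixStats <- matrixStats::weightedMean(x, w = w, na.rm = FALSE)
  stopifnot(isTRUE(all.equal(manual, via_matrixStats)))
}
```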
> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), na_prob = 0) {
+ mode <- match.arg(mode)
+ if (mode == "logical") {
+ x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+ } else {
+ x <- runif(n, min = range[1], max = range[2])
+ }
+ storage.mode(x) <- mode
+ if (na_prob > 0)
+ x[sample(n, size = na_prob * n)] <- NA
+ x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+ set.seed(seed)
+ data <- list()
+ data[[1]] <- rvector(n = scale * 100, ...)
+ data[[2]] <- rvector(n = scale * 1000, ...)
+ data[[3]] <- rvector(n = scale * 10000, ...)
+ data[[4]] <- rvector(n = scale * 1e+05, ...)
+ data[[5]] <- rvector(n = scale * 1e+06, ...)
+ names(data) <- sprintf("n = %d", sapply(data, FUN = length))
+ data
+ }
> data <- rvectors(mode = mode)  # here mode is "integer"
> data <- data[1:4]
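With the default scale = 10, rvectors() builds vectors of length 1,000 up to 10,000,000, and data[1:4] keeps the four smallest. That is why the benchmarks look up elements by names such as "n = 1000". The sketch below reproduces only the naming, integer coercion, and NA injection steps in isolation; `sizes`, `labels`, and `y` are illustrative names that do not appear in the report.

```r
scale <- 10
sizes <- scale * c(100, 1000, 10000, 1e+05, 1e+06)

# Same labelling scheme as rvectors(): names are built from the vector lengths
labels <- sprintf("n = %d", as.integer(sizes))
print(labels[1:4])

# rvector() draws doubles with runif() and then coerces them via storage.mode<-,
# so mode = "integer" truncates each value toward zero
set.seed(1)
x <- runif(5, min = -100, max = 100)
storage.mode(x) <- "integer"
str(x)

# With na_prob = 0.2 and n = 10, 0.2 * 10 = 2 positions are set to NA
y <- runif(10)
y[sample(10, size = 0.2 * 10)] <- NA
print(sum(is.na(y)))
```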
> x <- data[["n = 1000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3253690 173.8 5709258 305.0 5709258 305.0
Vcells 10683390 81.6 32912165 251.1 87357391 666.5
> stats <- microbenchmark(weightedMean = weightedMean(x, w = w, na.rm = FALSE), `stats::weighted.mean` = weighted.mean(x,
+ w = w, na.rm = FALSE), `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
+ unit = "ms")
Table: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on integer+n = 1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 0.001799 | 0.0019990 | 0.0022183 | 0.0021315 | 0.0022670 | 0.010164 |
3 | stats:::weighted.mean.default | 0.007978 | 0.0086310 | 0.0091380 | 0.0089075 | 0.0092150 | 0.017101 |
2 | stats::weighted.mean | 0.010164 | 0.0109585 | 0.0117989 | 0.0112575 | 0.0115765 | 0.050782 |
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
3 | stats:::weighted.mean.default | 4.434686 | 4.317659 | 4.119407 | 4.178982 | 4.064843 | 1.682507 |
2 | stats::weighted.mean | 5.649805 | 5.481991 | 5.318930 | 5.281492 | 5.106528 | 4.996261 |
Figure: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on integer+n = 1000 data. Outliers are displayed as crosses. Times are in milliseconds.
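The relative-time panel is the time panel divided, column by column, by the weightedMean row. As a worked check, the sketch below recomputes the relative times for the integer, n = 1000 case from the values printed in the top panel; the object names `times` and `relative` are illustrative. Because the printed times are rounded, agreement with the published ratios is only expected to a few decimal places.

```r
# Top panel for integer, n = 1000 (times in milliseconds), copied from the table
times <- rbind(
  weightedMean = c(min = 0.001799, lq = 0.0019990, mean = 0.0022183,
                   median = 0.0021315, uq = 0.0022670, max = 0.010164),
  `stats:::weighted.mean.default` = c(0.007978, 0.0086310, 0.0091380,
                                      0.0089075, 0.0092150, 0.017101),
  `stats::weighted.mean` = c(0.010164, 0.0109585, 0.0117989,
                             0.0112575, 0.0115765, 0.050782)
)

# Divide every row by the weightedMean row, one column at a time
relative <- sweep(times, MARGIN = 2, STATS = times["weightedMean", ], FUN = "/")
print(round(relative, 6))
```

The median column is usually the most informative one to read: the max column is dominated by occasional slow runs, which is why its ratios can differ sharply from the others.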
> x <- data[["n = 10000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3251465 173.7 5709258 305.0 5709258 305.0
Vcells 6807038 52.0 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMean = weightedMean(x, w = w, na.rm = FALSE), `stats::weighted.mean` = weighted.mean(x,
+ w = w, na.rm = FALSE), `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
+ unit = "ms")
Table: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on integer+n = 10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 0.011350 | 0.0115630 | 0.0119662 | 0.0118160 | 0.0119335 | 0.022352 |
3 | stats:::weighted.mean.default | 0.059287 | 0.0609965 | 0.0631667 | 0.0617805 | 0.0634735 | 0.105014 |
2 | stats::weighted.mean | 0.062161 | 0.0633895 | 0.0657534 | 0.0642060 | 0.0657900 | 0.095198 |
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
3 | stats:::weighted.mean.default | 5.223524 | 5.275145 | 5.278744 | 5.228546 | 5.318934 | 4.698193 |
2 | stats::weighted.mean | 5.476740 | 5.482098 | 5.494913 | 5.433819 | 5.513052 | 4.259037 |
Figure: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on integer+n = 10000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n = 100000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3251537 173.7 5709258 305.0 5709258 305.0
Vcells 6897598 52.7 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMean = weightedMean(x, w = w, na.rm = FALSE), `stats::weighted.mean` = weighted.mean(x,
+ w = w, na.rm = FALSE), `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
+ unit = "ms")
Table: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on integer+n = 100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 0.104369 | 0.108219 | 0.1186231 | 0.1164295 | 0.1246330 | 0.147534 |
3 | stats:::weighted.mean.default | 0.598650 | 0.627936 | 0.7646579 | 0.6632800 | 0.7220650 | 6.491542 |
2 | stats::weighted.mean | 0.612813 | 0.641460 | 0.8342262 | 0.6877985 | 0.7376575 | 6.687976 |
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.00000 |
3 | stats:::weighted.mean.default | 5.735899 | 5.802456 | 6.446112 | 5.696838 | 5.793530 | 44.00031 |
2 | stats::weighted.mean | 5.871600 | 5.927425 | 7.032577 | 5.907425 | 5.918637 | 45.33176 |
Figure: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on integer+n = 100000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n = 1000000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3251609 173.7 5709258 305.0 5709258 305.0
Vcells 7797647 59.5 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMean = weightedMean(x, w = w, na.rm = FALSE), `stats::weighted.mean` = weighted.mean(x,
+ w = w, na.rm = FALSE), `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
+ unit = "ms")
Table: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on integer+n = 1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.071714 | 1.250920 | 1.331716 | 1.336521 | 1.394437 | 1.857999 |
3 | stats:::weighted.mean.default | 6.385673 | 7.282114 | 12.330859 | 7.765860 | 13.345553 | 269.733846 |
2 | stats::weighted.mean | 6.654515 | 7.435343 | 12.565293 | 7.879948 | 13.545939 | 266.984058 |
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0000 |
3 | stats:::weighted.mean.default | 5.958374 | 5.821407 | 9.259373 | 5.810503 | 9.570567 | 145.1744 |
2 | stats::weighted.mean | 6.209226 | 5.943900 | 9.435413 | 5.895865 | 9.714271 | 143.6944 |
Figure: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on integer+n = 1000000 data. Outliers are displayed as crosses. Times are in milliseconds.
> data <- rvectors(mode = mode)  # here mode is "double"
> data <- data[1:4]
> x <- data[["n = 1000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3251681 173.7 5709258 305.0 5709258 305.0
Vcells 7354677 56.2 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMean = weightedMean(x, w = w, na.rm = FALSE), `stats::weighted.mean` = weighted.mean(x,
+ w = w, na.rm = FALSE), `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
+ unit = "ms")
Table: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on double+n = 1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 0.001873 | 0.002131 | 0.0023838 | 0.0023080 | 0.0024775 | 0.009741 |
3 | stats:::weighted.mean.default | 0.007760 | 0.008604 | 0.0090266 | 0.0088345 | 0.0091680 | 0.013523 |
2 | stats::weighted.mean | 0.010181 | 0.010803 | 0.0115920 | 0.0111110 | 0.0114995 | 0.047691 |
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
3 | stats:::weighted.mean.default | 4.143086 | 4.037541 | 3.786587 | 3.827773 | 3.700505 | 1.388256 |
2 | stats::weighted.mean | 5.435665 | 5.069451 | 4.862771 | 4.814125 | 4.641574 | 4.895904 |
Figure: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on double+n = 1000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n = 10000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3251753 173.7 5709258 305.0 5709258 305.0
Vcells 7363724 56.2 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMean = weightedMean(x, w = w, na.rm = FALSE), `stats::weighted.mean` = weighted.mean(x,
+ w = w, na.rm = FALSE), `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
+ unit = "ms")
Table: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on double+n = 10000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 0.011454 | 0.0116345 | 0.0125948 | 0.0119475 | 0.0137065 | 0.020861 |
3 | stats:::weighted.mean.default | 0.054687 | 0.0560620 | 0.0588038 | 0.0570060 | 0.0587000 | 0.094116 |
2 | stats::weighted.mean | 0.057985 | 0.0591660 | 0.0613805 | 0.0597465 | 0.0610195 | 0.091445 |
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
3 | stats:::weighted.mean.default | 4.774489 | 4.818600 | 4.668873 | 4.771375 | 4.282640 | 4.511577 |
2 | stats::weighted.mean | 5.062424 | 5.085393 | 4.873457 | 5.000753 | 4.451866 | 4.383539 |
Figure: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on double+n = 10000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n = 100000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3251825 173.7 5709258 305.0 5709258 305.0
Vcells 7454142 56.9 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMean = weightedMean(x, w = w, na.rm = FALSE), `stats::weighted.mean` = weighted.mean(x,
+ w = w, na.rm = FALSE), `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
+ unit = "ms")
Table: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on double+n = 100000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 0.104343 | 0.1205490 | 0.1331922 | 0.1317125 | 0.1415450 | 0.176506 |
3 | stats:::weighted.mean.default | 0.573308 | 0.6238395 | 0.8089427 | 0.6603815 | 0.7104230 | 6.749770 |
2 | stats::weighted.mean | 0.577805 | 0.6358815 | 0.7690570 | 0.6812455 | 0.7217285 | 7.441084 |
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.000000 | 1.000000 | 1.00000 | 1.000000 | 1.000000 | 1.00000 |
3 | stats:::weighted.mean.default | 5.494456 | 5.174987 | 6.07350 | 5.013810 | 5.019061 | 38.24102 |
2 | stats::weighted.mean | 5.537554 | 5.274880 | 5.77404 | 5.172216 | 5.098933 | 42.15768 |
Figure: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on double+n = 100000 data. Outliers are displayed as crosses. Times are in milliseconds.
> x <- data[["n = 1000000"]]
> w <- runif(length(x))
> gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 3251897 173.7 5709258 305.0 5709258 305.0
Vcells 8354578 63.8 26329732 200.9 87357391 666.5
> stats <- microbenchmark(weightedMean = weightedMean(x, w = w, na.rm = FALSE), `stats::weighted.mean` = weighted.mean(x,
+ w = w, na.rm = FALSE), `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
+ unit = "ms")
Table: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on double+n = 1000000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.186222 | 1.385917 | 1.475081 | 1.441063 | 1.503171 | 2.267329 |
2 | stats::weighted.mean | 6.686669 | 7.629785 | 9.613018 | 7.811571 | 9.362951 | 21.811615 |
3 | stats:::weighted.mean.default | 6.488990 | 7.586513 | 9.028397 | 7.840875 | 8.884047 | 17.437721 |
| | expr | min | lq | mean | median | uq | max |
---|---|---|---|---|---|---|---|
1 | weightedMean | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
2 | stats::weighted.mean | 5.636946 | 5.505223 | 6.516942 | 5.420701 | 6.228800 | 9.619960 |
3 | stats:::weighted.mean.default | 5.470300 | 5.474000 | 6.120610 | 5.441035 | 5.910203 | 7.690865 |
Figure: Benchmarking of weightedMean(), stats::weighted.mean() and stats:::weighted.mean.default() on double+n = 1000000 data. Outliers are displayed as crosses. Times are in milliseconds.
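The microbenchmark() calls in this report refer to weighted.mean.default() by its bare name, although stats does not export it. A standalone run therefore needs that name bound first. The sketch below shows one way a single benchmark could be reproduced outside the report; the assignment of `stats:::weighted.mean.default` to a local name is an assumption about how the report sets this up, `timings` and `have_pkgs` are illustrative names, and the timing step runs only if matrixStats and microbenchmark are installed.

```r
set.seed(1)
x <- runif(1000, min = -100, max = 100)
w <- runif(length(x))

# Assumption: bind the unexported S3 method to a local name so it can be called directly
weighted.mean.default <- stats:::weighted.mean.default

have_pkgs <- requireNamespace("matrixStats", quietly = TRUE) &&
  requireNamespace("microbenchmark", quietly = TRUE)

if (have_pkgs) {
  # Same three expressions as in the report, reported in milliseconds
  timings <- microbenchmark::microbenchmark(
    weightedMean = matrixStats::weightedMean(x, w = w, na.rm = FALSE),
    `stats::weighted.mean` = weighted.mean(x, w = w, na.rm = FALSE),
    `stats:::weighted.mean.default` = weighted.mean.default(x, w = w, na.rm = FALSE),
    unit = "ms"
  )
  print(timings)
}
```

Absolute timings will differ from the tables above, since they depend on the machine, the R version, and the package versions listed in the session information.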
R version 3.6.1 Patched (2019-08-27 r77078)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS
Matrix products: default
BLAS: /home/hb/software/R-devel/R-3-6-branch/lib/R/lib/libRblas.so
LAPACK: /home/hb/software/R-devel/R-3-6-branch/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] microbenchmark_1.4-6 matrixStats_0.55.0-9000 ggplot2_3.2.1
[4] knitr_1.24 R.devices_2.16.0 R.utils_2.9.0
[7] R.oo_1.22.0 R.methodsS3_1.7.1 history_0.0.0-9002
loaded via a namespace (and not attached):
[1] Biobase_2.45.0 bit64_0.9-7 splines_3.6.1
[4] network_1.15 assertthat_0.2.1 highr_0.8
[7] stats4_3.6.1 blob_1.2.0 robustbase_0.93-5
[10] pillar_1.4.2 RSQLite_2.1.2 backports_1.1.4
[13] lattice_0.20-38 glue_1.3.1 digest_0.6.20
[16] colorspace_1.4-1 sandwich_2.5-1 Matrix_1.2-17
[19] XML_3.98-1.20 lpSolve_5.6.13.3 pkgconfig_2.0.2
[22] genefilter_1.66.0 purrr_0.3.2 ergm_3.10.4
[25] xtable_1.8-4 mvtnorm_1.0-11 scales_1.0.0
[28] tibble_2.1.3 annotate_1.62.0 IRanges_2.18.2
[31] TH.data_1.0-10 withr_2.1.2 BiocGenerics_0.30.0
[34] lazyeval_0.2.2 mime_0.7 survival_2.44-1.1
[37] magrittr_1.5 crayon_1.3.4 statnet.common_4.3.0
[40] memoise_1.1.0 laeken_0.5.0 R.cache_0.13.0
[43] MASS_7.3-51.4 R.rsp_0.43.1 tools_3.6.1
[46] multcomp_1.4-10 S4Vectors_0.22.1 trust_0.1-7
[49] munsell_0.5.0 AnnotationDbi_1.46.1 compiler_3.6.1
[52] rlang_0.4.0 grid_3.6.1 RCurl_1.95-4.12
[55] cwhmisc_6.6 rappdirs_0.3.1 labeling_0.3
[58] bitops_1.0-6 base64enc_0.1-3 boot_1.3-23
[61] gtable_0.3.0 codetools_0.2-16 DBI_1.0.0
[64] markdown_1.1 R6_2.4.0 zoo_1.8-6
[67] dplyr_0.8.3 bit_1.1-14 zeallot_0.1.0
[70] parallel_3.6.1 Rcpp_1.0.2 vctrs_0.2.0
[73] DEoptimR_1.0-8 tidyselect_0.2.5 xfun_0.9
[76] coda_0.19-3
Total processing time was 15.42 secs.
To reproduce this report, do:
html <- matrixStats:::benchmark('weightedMean')
Copyright Henrik Bengtsson. Last updated on 2019-09-10 21:14:44 (-0700 UTC). Powered by RSP.