colRowTabulates_subset - HenrikBengtsson/matrixStats GitHub Wiki
matrixStats: Benchmark report
This report benchmarks the performance of colTabulates() and rowTabulates() on subsetted computations.
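The benchmarks below time three call patterns: tabulating a matrix that was subsetted beforehand (`X_S`), passing `rows` and `cols` to the function, and subsetting inline with `X[rows, cols]`. The following is a minimal sketch, not part of the original report, that checks the last two patterns give the same result on a small matrix. The matrix, the seed, and the variable `same` are illustrative assumptions, and the check is skipped if matrixStats is not installed.

```r
# Sketch: compare the two subsetting call patterns on a small integer matrix.
set.seed(1)
X <- matrix(sample(-10:10, size = 100, replace = TRUE), nrow = 10, ncol = 10)
rows <- sample.int(nrow(X), size = 7)
cols <- sample.int(ncol(X), size = 7)

same <- NA  # stays NA when matrixStats is not installed
if (requireNamespace("matrixStats", quietly = TRUE)) {
  # (a) subset first, then tabulate
  a <- matrixStats::colTabulates(X[rows, cols])
  # (b) let colTabulates() do the subsetting via its rows/cols arguments
  b <- matrixStats::colTabulates(X, rows = rows, cols = cols)
  same <- isTRUE(all.equal(a, b))
}
print(same)
```

The pre-subsetted variant (`X_S`) serves as the reference in the tables because its timing excludes the cost of subsetting.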
> rmatrix <- function(nrow, ncol, mode = c("logical", "double", "integer", "index"), range = c(-100, 
+     +100), na_prob = 0) {
+     mode <- match.arg(mode)
+     n <- nrow * ncol
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     }     else if (mode == "index") {
+         x <- seq_len(n)
+         mode <- "integer"
+     }     else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (na_prob > 0) 
+         x[sample(n, size = na_prob * n)] <- NA
+     dim(x) <- c(nrow, ncol)
+     x
+ }
> rmatrices <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rmatrix(nrow = scale * 1, ncol = scale * 1, ...)
+     data[[2]] <- rmatrix(nrow = scale * 10, ncol = scale * 10, ...)
+     data[[3]] <- rmatrix(nrow = scale * 100, ncol = scale * 1, ...)
+     data[[4]] <- t(data[[3]])
+     data[[5]] <- rmatrix(nrow = scale * 10, ncol = scale * 100, ...)
+     data[[6]] <- t(data[[5]])
+     names(data) <- sapply(data, FUN = function(x) paste(dim(x), collapse = "x"))
+     data
+ }
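As a quick orientation, the matrix shapes that `rmatrices()` produces at its default `scale = 10` can be listed without generating any data. This sketch mirrors only the dimension arithmetic of the function above; the names `dims` and `labels` are illustrative.

```r
# Dimension multipliers used by rmatrices(); entries 4 and 6 are the
# transposes of entries 3 and 5.
scale <- 10
dims <- list(c(1, 1), c(10, 10), c(100, 1), c(1, 100), c(10, 100), c(100, 10))
labels <- sapply(dims, function(d) paste(d * scale, collapse = "x"))
print(labels)
```

These labels are the names used to pick each data set later in the report, for example `data[["10x10"]]`.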
> data <- rmatrices(mode = "integer", range = c(-10, 10))
> X <- data[["10x10"]]
> rows <- sample.int(nrow(X), size = nrow(X) * 0.7)
> cols <- sample.int(ncol(X), size = ncol(X) * 0.7)
> X_S <- X[rows, cols]
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3205801 171.3    5709258 305.0  5709258 305.0
Vcells 6502589  49.7   22343563 170.5 56666022 432.4
> colStats <- microbenchmark(colTabulates_X_S = colTabulates(X_S, na.rm = FALSE), `colTabulates(X, rows, cols)` = colTabulates(X, 
+     rows = rows, cols = cols, na.rm = FALSE), `colTabulates(X[rows, cols])` = colTabulates(X[rows, 
+     cols], na.rm = FALSE), unit = "ms")
> X <- t(X)
> X_S <- t(X_S)
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204525 171.2    5709258 305.0  5709258 305.0
Vcells 6498843  49.6   22343563 170.5 56666022 432.4
> rowStats <- microbenchmark(rowTabulates_X_S = rowTabulates(X_S, na.rm = FALSE), `rowTabulates(X, cols, rows)` = rowTabulates(X, 
+     rows = cols, cols = rows, na.rm = FALSE), `rowTabulates(X[cols, rows])` = rowTabulates(X[cols, 
+     rows], na.rm = FALSE), unit = "ms")
Table: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 10x10 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 0.087540 | 0.0889065 | 0.0980791 | 0.0896510 | 0.0908495 | 0.429847 | 
| 2 | colTabulates(X, rows, cols) | 0.089037 | 0.0903495 | 0.0971355 | 0.0909390 | 0.0921390 | 0.219770 | 
| 3 | colTabulates(X[rows, cols]) | 0.089401 | 0.0902440 | 0.1008590 | 0.0912245 | 0.0924605 | 0.288946 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.0000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 2 | colTabulates(X, rows, cols) | 1.017101 | 1.016231 | 0.9903792 | 1.014367 | 1.014194 | 0.5112749 | 
| 3 | colTabulates(X[rows, cols]) | 1.021259 | 1.015044 | 1.0283440 | 1.017551 | 1.017733 | 0.6722066 | 
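The relative-time panel appears to be each statistic divided by the corresponding statistic of the first (reference) row. The sketch below recomputes three of the columns from the millisecond values in the top panel; the object names `times_ms` and `relative` are illustrative, and small differences in the last digit are expected because the inputs are already rounded.

```r
# Millisecond timings copied from the top panel (min, median, max only).
times_ms <- rbind(
  "colTabulates_X_S"            = c(min = 0.087540, median = 0.0896510, max = 0.429847),
  "colTabulates(X, rows, cols)" = c(min = 0.089037, median = 0.0909390, max = 0.219770),
  "colTabulates(X[rows, cols])" = c(min = 0.089401, median = 0.0912245, max = 0.288946)
)

# Divide every column by its value in the reference row.
relative <- sweep(times_ms, MARGIN = 2, STATS = times_ms[1, ], FUN = "/")
print(round(relative, 6))
```

A relative max below 1, as for the two non-reference rows here, only means that the reference expression happened to have the slowest single run.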
Table: Benchmarking of rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on 10x10 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 0.085704 | 0.0866480 | 0.0883837 | 0.0874370 | 0.0881800 | 0.102645 | 
| 3 | rowTabulates(X[cols, rows]) | 0.086742 | 0.0880295 | 0.0892940 | 0.0885225 | 0.0892075 | 0.130317 | 
| 2 | rowTabulates(X, cols, rows) | 0.086714 | 0.0881100 | 0.0931348 | 0.0885260 | 0.0894770 | 0.420335 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 
| 3 | rowTabulates(X[cols, rows]) | 1.012112 | 1.015944 | 1.010300 | 1.012415 | 1.011652 | 1.269589 | 
| 2 | rowTabulates(X, cols, rows) | 1.011785 | 1.016873 | 1.053755 | 1.012455 | 1.014709 | 4.095036 | 
Figure: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 10x10 data as well as rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 10x10 data (original and transposed).  The top panel shows times in microseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 2 | rowTabulates_X_S | 85.704 | 86.6480 | 88.38371 | 87.437 | 88.1800 | 102.645 | 
| 1 | colTabulates_X_S | 87.540 | 88.9065 | 98.07909 | 89.651 | 90.8495 | 429.847 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 2 | rowTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 
| 1 | colTabulates_X_S | 1.021423 | 1.026065 | 1.109697 | 1.025321 | 1.030273 | 4.187705 | 
Figure: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 10x10 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

> X <- data[["100x100"]]
> rows <- sample.int(nrow(X), size = nrow(X) * 0.7)
> cols <- sample.int(ncol(X), size = ncol(X) * 0.7)
> X_S <- X[rows, cols]
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3203486 171.1    5709258 305.0  5709258 305.0
Vcells 6167977  47.1   22343563 170.5 56666022 432.4
> colStats <- microbenchmark(colTabulates_X_S = colTabulates(X_S, na.rm = FALSE), `colTabulates(X, rows, cols)` = colTabulates(X, 
+     rows = rows, cols = cols, na.rm = FALSE), `colTabulates(X[rows, cols])` = colTabulates(X[rows, 
+     cols], na.rm = FALSE), unit = "ms")
> X <- t(X)
> X_S <- t(X_S)
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3203477 171.1    5709258 305.0  5709258 305.0
Vcells 6173055  47.1   22343563 170.5 56666022 432.4
> rowStats <- microbenchmark(rowTabulates_X_S = rowTabulates(X_S, na.rm = FALSE), `rowTabulates(X, cols, rows)` = rowTabulates(X, 
+     rows = cols, cols = rows, na.rm = FALSE), `rowTabulates(X[cols, rows])` = rowTabulates(X[cols, 
+     rows], na.rm = FALSE), unit = "ms")
Table: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 100x100 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 0.253419 | 0.2609700 | 0.2726863 | 0.2656830 | 0.280109 | 0.362675 | 
| 2 | colTabulates(X, rows, cols) | 0.262151 | 0.2719990 | 0.2841614 | 0.2791645 | 0.293225 | 0.412909 | 
| 3 | colTabulates(X[rows, cols]) | 0.261782 | 0.2737665 | 0.2917133 | 0.2794695 | 0.294835 | 0.622681 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 
| 2 | colTabulates(X, rows, cols) | 1.034457 | 1.042262 | 1.042082 | 1.050743 | 1.046825 | 1.138510 | 
| 3 | colTabulates(X[rows, cols]) | 1.033001 | 1.049034 | 1.069776 | 1.051891 | 1.052572 | 1.716912 | 
Table: Benchmarking of rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on 100x100 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 0.276059 | 0.2857250 | 0.2933003 | 0.2923040 | 0.2996070 | 0.335534 | 
| 2 | rowTabulates(X, cols, rows) | 0.285986 | 0.2949865 | 0.3022689 | 0.3019410 | 0.3090690 | 0.323716 | 
| 3 | rowTabulates(X[cols, rows]) | 0.283183 | 0.2971275 | 0.3120059 | 0.3023895 | 0.3084625 | 1.172120 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 2 | rowTabulates(X, cols, rows) | 1.035960 | 1.032414 | 1.030578 | 1.032969 | 1.031581 | 0.9647785 | 
| 3 | rowTabulates(X[cols, rows]) | 1.025806 | 1.039907 | 1.063776 | 1.034504 | 1.029557 | 3.4932973 | 
Figure: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 100x100 data as well as rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 100x100 data (original and transposed).  The top panel shows times in microseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 253.419 | 260.970 | 272.6863 | 265.683 | 280.109 | 362.675 | 
| 2 | rowTabulates_X_S | 276.059 | 285.725 | 293.3003 | 292.304 | 299.607 | 335.534 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 2 | rowTabulates_X_S | 1.089338 | 1.094858 | 1.075596 | 1.100198 | 1.069609 | 0.9251644 | 
Figure: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 100x100 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

> X <- data[["1000x10"]]
> rows <- sample.int(nrow(X), size = nrow(X) * 0.7)
> cols <- sample.int(ncol(X), size = ncol(X) * 0.7)
> X_S <- X[rows, cols]
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204238 171.2    5709258 305.0  5709258 305.0
Vcells 6172045  47.1   22343563 170.5 56666022 432.4
> colStats <- microbenchmark(colTabulates_X_S = colTabulates(X_S, na.rm = FALSE), `colTabulates(X, rows, cols)` = colTabulates(X, 
+     rows = rows, cols = cols, na.rm = FALSE), `colTabulates(X[rows, cols])` = colTabulates(X[rows, 
+     cols], na.rm = FALSE), unit = "ms")
> X <- t(X)
> X_S <- t(X_S)
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204229 171.2    5709258 305.0  5709258 305.0
Vcells 6177123  47.2   22343563 170.5 56666022 432.4
> rowStats <- microbenchmark(rowTabulates_X_S = rowTabulates(X_S, na.rm = FALSE), `rowTabulates(X, cols, rows)` = rowTabulates(X, 
+     rows = cols, cols = rows, na.rm = FALSE), `rowTabulates(X[cols, rows])` = rowTabulates(X[cols, 
+     rows], na.rm = FALSE), unit = "ms")
Table: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 1000x10 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 0.233580 | 0.2362520 | 0.2447179 | 0.2401835 | 0.2475065 | 0.329554 | 
| 3 | colTabulates(X[rows, cols]) | 0.241319 | 0.2446845 | 0.2515279 | 0.2484270 | 0.2547075 | 0.311018 | 
| 2 | colTabulates(X, rows, cols) | 0.241028 | 0.2443960 | 0.2583131 | 0.2500805 | 0.2600480 | 0.421301 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 3 | colTabulates(X[rows, cols]) | 1.033132 | 1.035693 | 1.027828 | 1.034322 | 1.029094 | 0.9437543 | 
| 2 | colTabulates(X, rows, cols) | 1.031886 | 1.034472 | 1.055554 | 1.041206 | 1.050671 | 1.2783975 | 
Table: Benchmarking of rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on 1000x10 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 0.276673 | 0.2801780 | 0.2889631 | 0.2851215 | 0.2919005 | 0.452778 | 
| 3 | rowTabulates(X[cols, rows]) | 0.287231 | 0.2913165 | 0.3008558 | 0.2962890 | 0.3055805 | 0.403375 | 
| 2 | rowTabulates(X, cols, rows) | 0.286890 | 0.2910215 | 0.2980739 | 0.2965630 | 0.3026750 | 0.340465 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 3 | rowTabulates(X[cols, rows]) | 1.038161 | 1.039755 | 1.041157 | 1.039168 | 1.046865 | 0.8908891 | 
| 2 | rowTabulates(X, cols, rows) | 1.036928 | 1.038702 | 1.031529 | 1.040129 | 1.036912 | 0.7519469 | 
Figure: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 1000x10 data as well as rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 1000x10 data (original and transposed).  The top panel shows times in microseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 233.580 | 236.252 | 244.7179 | 240.1835 | 247.5065 | 329.554 | 
| 2 | rowTabulates_X_S | 276.673 | 280.178 | 288.9631 | 285.1215 | 291.9005 | 452.778 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 
| 2 | rowTabulates_X_S | 1.184489 | 1.185929 | 1.180801 | 1.187099 | 1.179365 | 1.373911 | 
Figure: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 1000x10 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

> X <- data[["10x1000"]]
> rows <- sample.int(nrow(X), size = nrow(X) * 0.7)
> cols <- sample.int(ncol(X), size = ncol(X) * 0.7)
> X_S <- X[rows, cols]
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204444 171.2    5709258 305.0  5709258 305.0
Vcells 6172952  47.1   22343563 170.5 56666022 432.4
> colStats <- microbenchmark(colTabulates_X_S = colTabulates(X_S, na.rm = FALSE), `colTabulates(X, rows, cols)` = colTabulates(X, 
+     rows = rows, cols = cols, na.rm = FALSE), `colTabulates(X[rows, cols])` = colTabulates(X[rows, 
+     cols], na.rm = FALSE), unit = "ms")
> X <- t(X)
> X_S <- t(X_S)
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204435 171.2    5709258 305.0  5709258 305.0
Vcells 6178030  47.2   22345455 170.5 56666022 432.4
> rowStats <- microbenchmark(rowTabulates_X_S = rowTabulates(X_S, na.rm = FALSE), `rowTabulates(X, cols, rows)` = rowTabulates(X, 
+     rows = cols, cols = rows, na.rm = FALSE), `rowTabulates(X[cols, rows])` = rowTabulates(X[cols, 
+     rows], na.rm = FALSE), unit = "ms")
Table: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 10x1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 0.297480 | 0.3047505 | 2.8225353 | 0.3099065 | 0.3148250 | 250.891787 | 
| 3 | colTabulates(X[rows, cols]) | 0.304955 | 0.3150990 | 0.3232788 | 0.3190925 | 0.3238450 | 0.573301 | 
| 2 | colTabulates(X, rows, cols) | 0.305951 | 0.3151110 | 0.3291838 | 0.3212190 | 0.3255665 | 0.693542 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.0000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 3 | colTabulates(X[rows, cols]) | 1.025128 | 1.033957 | 0.1145349 | 1.029641 | 1.028651 | 0.0022851 | 
| 2 | colTabulates(X, rows, cols) | 1.028476 | 1.033997 | 0.1166270 | 1.036503 | 1.034119 | 0.0027643 | 
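The relative mean and max values in this panel are far below 1 because the reference row, colTabulates_X_S, includes one very slow run (max 250.89 ms against a median of about 0.31 ms). As a rough check, and assuming microbenchmark's default of 100 evaluations per expression (the calls in this report do not set `times`), the mean of the remaining runs can be backed out from the reported mean and max. The variable names are illustrative.

```r
n_runs   <- 100          # assumption: microbenchmark default, times = 100
mean_all <- 2.8225353    # mean of colTabulates_X_S in ms, from the top panel
max_run  <- 250.891787   # max of colTabulates_X_S in ms, from the top panel

# Mean of the other 99 runs once the single slowest run is removed.
mean_rest <- (n_runs * mean_all - max_run) / (n_runs - 1)
print(round(mean_rest, 4))
```

Under that assumption the remaining runs average about 0.32 ms, in line with the median, so the median and quartile columns are the more reliable comparison for this data set.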
Table: Benchmarking of rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on 10x1000 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 0.291203 | 0.3048035 | 0.3166833 | 0.3105615 | 0.3189695 | 0.485174 | 
| 3 | rowTabulates(X[cols, rows]) | 0.302955 | 0.3157510 | 0.3264460 | 0.3193135 | 0.3318315 | 0.524493 | 
| 2 | rowTabulates(X, cols, rows) | 0.301042 | 0.3162710 | 0.3273503 | 0.3213555 | 0.3335205 | 0.444905 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 3 | rowTabulates(X[cols, rows]) | 1.040357 | 1.035917 | 1.030828 | 1.028181 | 1.040324 | 1.0810410 | 
| 2 | rowTabulates(X, cols, rows) | 1.033787 | 1.037623 | 1.033683 | 1.034756 | 1.045619 | 0.9170009 | 
Figure: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 10x1000 data as well as rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 10x1000 data (original and transposed).  The top panel shows times in microseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 297.480 | 304.7505 | 2822.5354 | 309.9065 | 314.8250 | 250891.787 | 
| 2 | rowTabulates_X_S | 291.203 | 304.8035 | 316.6833 | 310.5615 | 318.9695 | 485.174 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.0000000 | 1.000000 | 1.0000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 2 | rowTabulates_X_S | 0.9788994 | 1.000174 | 0.1121982 | 1.002114 | 1.013165 | 0.0019338 | 
Figure: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 10x1000 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

> X <- data[["100x1000"]]
> rows <- sample.int(nrow(X), size = nrow(X) * 0.7)
> cols <- sample.int(ncol(X), size = ncol(X) * 0.7)
> X_S <- X[rows, cols]
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204643 171.2    5709258 305.0  5709258 305.0
Vcells 6195618  47.3   22345455 170.5 56666022 432.4
> colStats <- microbenchmark(colTabulates_X_S = colTabulates(X_S, na.rm = FALSE), `colTabulates(X, rows, cols)` = colTabulates(X, 
+     rows = rows, cols = cols, na.rm = FALSE), `colTabulates(X[rows, cols])` = colTabulates(X[rows, 
+     cols], na.rm = FALSE), unit = "ms")
> X <- t(X)
> X_S <- t(X_S)
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204634 171.2    5709258 305.0  5709258 305.0
Vcells 6245696  47.7   22345847 170.5 56666022 432.4
> rowStats <- microbenchmark(rowTabulates_X_S = rowTabulates(X_S, na.rm = FALSE), `rowTabulates(X, cols, rows)` = rowTabulates(X, 
+     rows = cols, cols = rows, na.rm = FALSE), `rowTabulates(X[cols, rows])` = rowTabulates(X[cols, 
+     rows], na.rm = FALSE), unit = "ms")
Table: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 100x1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.567425 | 1.639608 | 1.705160 | 1.655323 | 1.771980 | 2.380615 | 
| 2 | colTabulates(X, rows, cols) | 1.645321 | 1.694262 | 1.870166 | 1.728113 | 1.860080 | 8.493609 | 
| 3 | colTabulates(X[rows, cols]) | 1.642920 | 1.715765 | 4.394435 | 1.736148 | 1.947884 | 252.808727 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 
| 2 | colTabulates(X, rows, cols) | 1.049697 | 1.033334 | 1.096768 | 1.043974 | 1.049718 | 3.567821 | 
| 3 | colTabulates(X[rows, cols]) | 1.048165 | 1.046449 | 2.577139 | 1.048828 | 1.099269 | 106.194713 | 
Table: Benchmarking of rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on 100x1000 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 1.748535 | 1.842582 | 1.994827 | 1.887706 | 1.920214 | 8.649732 | 
| 2 | rowTabulates(X, cols, rows) | 1.823317 | 1.918680 | 2.068009 | 1.945528 | 1.992203 | 8.626591 | 
| 3 | rowTabulates(X[cols, rows]) | 1.831858 | 1.923616 | 1.991026 | 1.959248 | 2.003563 | 2.853159 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 1.000000 | 1.000000 | 1.0000000 | 1.000000 | 1.000000 | 1.0000000 | 
| 2 | rowTabulates(X, cols, rows) | 1.042768 | 1.041299 | 1.0366861 | 1.030631 | 1.037490 | 0.9973247 | 
| 3 | rowTabulates(X[cols, rows]) | 1.047653 | 1.043978 | 0.9980948 | 1.037899 | 1.043407 | 0.3298552 | 
Figure: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 100x1000 data as well as rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 100x1000 data (original and transposed).  The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.567425 | 1.639608 | 1.705160 | 1.655323 | 1.771980 | 2.380615 | 
| 2 | rowTabulates_X_S | 1.748535 | 1.842582 | 1.994827 | 1.887706 | 1.920214 | 8.649732 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 
| 2 | rowTabulates_X_S | 1.115546 | 1.123795 | 1.169876 | 1.140385 | 1.083654 | 3.633402 | 
Figure: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 100x1000 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

> X <- data[["1000x100"]]
> rows <- sample.int(nrow(X), size = nrow(X) * 0.7)
> cols <- sample.int(ncol(X), size = ncol(X) * 0.7)
> X_S <- X[rows, cols]
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204859 171.2    5709258 305.0  5709258 305.0
Vcells 6196447  47.3   22345847 170.5 56666022 432.4
> colStats <- microbenchmark(colTabulates_X_S = colTabulates(X_S, na.rm = FALSE), `colTabulates(X, rows, cols)` = colTabulates(X, 
+     rows = rows, cols = cols, na.rm = FALSE), `colTabulates(X[rows, cols])` = colTabulates(X[rows, 
+     cols], na.rm = FALSE), unit = "ms")
> X <- t(X)
> X_S <- t(X_S)
> gc()
          used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 3204850 171.2    5709258 305.0  5709258 305.0
Vcells 6246525  47.7   22345847 170.5 56666022 432.4
> rowStats <- microbenchmark(rowTabulates_X_S = rowTabulates(X_S, na.rm = FALSE), `rowTabulates(X, cols, rows)` = rowTabulates(X, 
+     rows = cols, cols = rows, na.rm = FALSE), `rowTabulates(X[cols, rows])` = rowTabulates(X[cols, 
+     rows], na.rm = FALSE), unit = "ms")
Table: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 1000x100 data. The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.418797 | 1.470277 | 1.498454 | 1.489486 | 1.504072 | 1.788356 | 
| 2 | colTabulates(X, rows, cols) | 1.485973 | 1.556599 | 1.660665 | 1.563964 | 1.601022 | 8.140991 | 
| 3 | colTabulates(X[rows, cols]) | 1.484032 | 1.556742 | 1.657618 | 1.565163 | 1.593916 | 8.321581 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 
| 2 | colTabulates(X, rows, cols) | 1.047347 | 1.058712 | 1.108252 | 1.050003 | 1.064458 | 4.552221 | 
| 3 | colTabulates(X[rows, cols]) | 1.045979 | 1.058808 | 1.106219 | 1.050808 | 1.059734 | 4.653202 | 
Table: Benchmarking of rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on 1000x100 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 1.819188 | 1.922722 | 2.013304 | 1.977278 | 2.113175 | 3.094882 | 
| 2 | rowTabulates(X, cols, rows) | 1.904025 | 2.013527 | 2.153842 | 2.059273 | 2.173602 | 8.314753 | 
| 3 | rowTabulates(X[cols, rows]) | 1.907501 | 2.010804 | 2.187695 | 2.080598 | 2.263897 | 8.807484 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | rowTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 
| 2 | rowTabulates(X, cols, rows) | 1.046634 | 1.047228 | 1.069805 | 1.041468 | 1.028595 | 2.686614 | 
| 3 | rowTabulates(X[cols, rows]) | 1.048545 | 1.045811 | 1.086620 | 1.052254 | 1.071325 | 2.845822 | 
Figure: Benchmarking of colTabulates_X_S(), colTabulates(X, rows, cols)() and colTabulates(X[rows, cols])() on 1000x100 data as well as rowTabulates_X_S(), rowTabulates(X, cols, rows)() and rowTabulates(X[cols, rows])() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 1000x100 data (original and transposed).  The top panel shows times in milliseconds and the bottom panel shows relative times.
| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.418797 | 1.470277 | 1.498454 | 1.489486 | 1.504072 | 1.788356 | 
| 2 | rowTabulates_X_S | 1.819188 | 1.922722 | 2.013304 | 1.977278 | 2.113175 | 3.094882 |

| | expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|---|
| 1 | colTabulates_X_S | 1.000000 | 1.000000 | 1.000000 | 1.00000 | 1.000000 | 1.000000 | 
| 2 | rowTabulates_X_S | 1.282205 | 1.307728 | 1.343587 | 1.32749 | 1.404969 | 1.730574 | 
Figure: Benchmarking of colTabulates_X_S() and rowTabulates_X_S() on 1000x100 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

R version 3.6.1 Patched (2019-08-27 r77078)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS
Matrix products: default
BLAS:   /home/hb/software/R-devel/R-3-6-branch/lib/R/lib/libRblas.so
LAPACK: /home/hb/software/R-devel/R-3-6-branch/lib/R/lib/libRlapack.so
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] microbenchmark_1.4-6    matrixStats_0.55.0-9000 ggplot2_3.2.1          
[4] knitr_1.24              R.devices_2.16.0        R.utils_2.9.0          
[7] R.oo_1.22.0             R.methodsS3_1.7.1       history_0.0.0-9002     
loaded via a namespace (and not attached):
 [1] Biobase_2.45.0       bit64_0.9-7          splines_3.6.1       
 [4] network_1.15         assertthat_0.2.1     highr_0.8           
 [7] stats4_3.6.1         blob_1.2.0           robustbase_0.93-5   
[10] pillar_1.4.2         RSQLite_2.1.2        backports_1.1.4     
[13] lattice_0.20-38      glue_1.3.1           digest_0.6.20       
[16] colorspace_1.4-1     sandwich_2.5-1       Matrix_1.2-17       
[19] XML_3.98-1.20        lpSolve_5.6.13.3     pkgconfig_2.0.2     
[22] genefilter_1.66.0    purrr_0.3.2          ergm_3.10.4         
[25] xtable_1.8-4         mvtnorm_1.0-11       scales_1.0.0        
[28] tibble_2.1.3         annotate_1.62.0      IRanges_2.18.2      
[31] TH.data_1.0-10       withr_2.1.2          BiocGenerics_0.30.0 
[34] lazyeval_0.2.2       mime_0.7             survival_2.44-1.1   
[37] magrittr_1.5         crayon_1.3.4         statnet.common_4.3.0
[40] memoise_1.1.0        laeken_0.5.0         R.cache_0.13.0      
[43] MASS_7.3-51.4        R.rsp_0.43.1         tools_3.6.1         
[46] multcomp_1.4-10      S4Vectors_0.22.1     trust_0.1-7         
[49] munsell_0.5.0        AnnotationDbi_1.46.1 compiler_3.6.1      
[52] rlang_0.4.0          grid_3.6.1           RCurl_1.95-4.12     
[55] cwhmisc_6.6          rappdirs_0.3.1       labeling_0.3        
[58] bitops_1.0-6         base64enc_0.1-3      boot_1.3-23         
[61] gtable_0.3.0         codetools_0.2-16     DBI_1.0.0           
[64] markdown_1.1         R6_2.4.0             zoo_1.8-6           
[67] dplyr_0.8.3          bit_1.1-14           zeallot_0.1.0       
[70] parallel_3.6.1       Rcpp_1.0.2           vctrs_0.2.0         
[73] DEoptimR_1.0-8       tidyselect_0.2.5     xfun_0.9            
[76] coda_0.19-3
Total processing time was 15.56 secs.
To reproduce this report, do:
html <- matrixStats:::benchmark('colRowTabulates_subset')
Copyright Dongcan Jiang. Last updated on 2019-09-10 20:53:52 (-0700 UTC). Powered by RSP.