matrixStats: Benchmark report

sum2() benchmarks

This report benchmarks the performance of sum2() against alternative methods.

Alternative methods

sum() + [()

where sum() + [() is implemented as follows:

> sum2_R <- function(x, na.rm = FALSE, idxs) {
+     sum(x[idxs], na.rm = na.rm)
+ }
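To make the baseline concrete, the following minimal sketch applies sum2_R() to a small hand-made vector. The vector `x`, the index vector `idxs`, and the result names are illustrative assumptions, not benchmark data. The cross-check against matrixStats::sum2() runs only if the package happens to be installed.

```r
# Baseline from this report: subset with `[`, then call sum().
sum2_R <- function(x, na.rm = FALSE, idxs) {
  sum(x[idxs], na.rm = na.rm)
}

# Small illustrative input (an assumption, not taken from the benchmark data).
x <- c(4L, NA, 7L, 1L, 9L)
idxs <- c(1L, 3L, 5L)

# Sum over the selected elements only: 4 + 7 + 9.
res_subset <- sum2_R(x, idxs = idxs)

# Sum over all elements, dropping the missing value: 4 + 7 + 1 + 9.
res_na <- sum2_R(x, na.rm = TRUE, idxs = seq_along(x))

# Optional cross-check against matrixStats::sum2(), if it is available.
if (requireNamespace("matrixStats", quietly = TRUE)) {
  stopifnot(isTRUE(all.equal(matrixStats::sum2(x, idxs = idxs), res_subset)))
}

print(c(subset = res_subset, na_removed = res_na))
```

The baseline allocates the intermediate vector x[idxs] before summing, which is the cost the subset benchmarks below are meant to expose.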

Data type "integer"

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), na_prob = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     } else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (na_prob > 0) 
+         x[sample(n, size = na_prob * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n = %d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = mode)
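The call above uses a variable `mode` that is not defined in the code shown; it is presumably set by whatever drives the report. The sketch below assumes `mode <- "integer"` for this section and illustrates, on a tiny vector, what the storage.mode() coercion in rvector() does and which element names rvectors() produces with its default scale = 10. The names `x_dbl`, `sizes`, and `nms` are hypothetical helpers for illustration.

```r
set.seed(1)

# Assumption: the report driver sets `mode` before calling rvectors().
mode <- "integer"

# rvector() draws uniform doubles and then coerces the storage mode.
x_dbl <- runif(5, min = -100, max = 100)
x <- x_dbl
storage.mode(x) <- mode  # for "integer", the fractional part is dropped

# With scale = 10, rvectors() creates vectors of these lengths ...
sizes <- 10 * c(100, 1000, 1e4, 1e5, 1e6)

# ... and names them after their lengths, as used in data[["n = 1000"]].
nms <- sprintf("n = %d", as.integer(sizes))

print(typeof(x))
print(nms)
```

This is why the results below are indexed as data[["n = 1000"]] through data[["n = 10000000"]].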

Results

n = 1000 vector

All elements

> x <- data[["n = 1000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3242120 173.2    5709258 305.0  5709258 305.0
Vcells 33422056 255.0   60231636 459.6 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 1000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	0.000696	0.0007135	0.0007671	0.0007280	0.0007650	0.003167
1	sum2	0.002496	0.0025470	0.0027997	0.0025965	0.0027015	0.018432

	expr	min	lq	mean	median	uq	max
2	sum	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
1	sum2	3.586207	3.569727	3.649806	3.566621	3.531372	5.820019

Figure: Benchmarking of sum2() and sum() on n = 1000+all data. Outliers are displayed as crosses. Times are in milliseconds.
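The relative times in the bottom panel can be reproduced from the top panel by dividing each row by the first (reference) row, column by column. The sketch below does this for the table above; the matrix name `abs_ms` and the use of sweep() are illustrative choices, not necessarily how the report computes them. Because the printed timings are rounded, some columns (for example the mean) match the report only to about three decimal places.

```r
# Absolute timings in milliseconds, copied from the top panel above.
abs_ms <- rbind(
  sum  = c(min = 0.000696, lq = 0.0007135, mean = 0.0007671,
           median = 0.0007280, uq = 0.0007650, max = 0.003167),
  sum2 = c(min = 0.002496, lq = 0.0025470, mean = 0.0027997,
           median = 0.0025965, uq = 0.0027015, max = 0.018432)
)

# Divide every row by the reference row ("sum"), one column at a time.
rel <- sweep(abs_ms, MARGIN = 2, STATS = abs_ms["sum", ], FUN = "/")

print(round(rel, 6))
```

Note that each column is normalized separately, which is why a relative maximum can fall below 1 in later tables even when the reference method is faster on every other statistic.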

A 20% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3239955 173.1    5709258 305.0  5709258 305.0
Vcells 11787296  90.0   48185309 367.7 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum+[()	0.001158	0.0013055	0.0016169	0.001460	0.0015950	0.012060
1	sum2	0.001956	0.0020410	0.0022577	0.002116	0.0021655	0.015296

	expr	min	lq	mean	median	uq	max
2	sum+[()	1.000000	1.000000	1.000000	1.000000	1.00000	1.000000
1	sum2	1.689119	1.563386	1.396355	1.449315	1.35768	1.268325

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.
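For reference, the subset indices used in this and the following subset benchmarks are a sorted random sample without replacement. The sketch below reproduces that step on a stand-in vector; the stand-in `x` and the seed are assumptions chosen only to make the example self-contained.

```r
set.seed(1)

# Stand-in for data[["n = 1000"]]; only its length matters here.
x <- seq_len(1000)

# Fraction of elements to keep, as printed in the transcript above.
subset <- 0.2

# Draw 20% of the positions without replacement and sort them ascending.
idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))

print(length(idxs))
print(head(idxs))
```

Sorting means both methods visit the selected elements in increasing memory order, so the comparison does not include the effect of random access patterns.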

A 40% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240021 173.1    5709258 305.0  5709258 305.0
Vcells 11787448  90.0   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum+[()	0.001665	0.0018235	0.0021305	0.0018975	0.0020215	0.021375
1	sum2	0.002296	0.0023755	0.0026987	0.0024770	0.0025750	0.022246

	expr	min	lq	mean	median	uq	max
2	sum+[()	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
1	sum2	1.378979	1.302715	1.266745	1.305402	1.273807	1.040749

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240084 173.1    5709258 305.0  5709258 305.0
Vcells 11788202  90.0   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum+[()	0.002515	0.0026475	0.0028923	0.0027495	0.002843	0.015982
1	sum2	0.003039	0.0030885	0.0033549	0.0032005	0.003283	0.016843

	expr	min	lq	mean	median	uq	max
2	sum+[()	1.00000	1.000000	1.000000	1.00000	1.000000	1.000000
1	sum2	1.20835	1.166572	1.159968	1.16403	1.154766	1.053873

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 10000 vector

All elements

> x <- data[["n = 10000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240085 173.1    5709258 305.0  5709258 305.0
Vcells 11787831  90.0   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 10000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	0.005396	0.0055005	0.0057501	0.0055365	0.0056140	0.015419
1	sum2	0.011895	0.0119915	0.0123481	0.0120420	0.0121775	0.033438

	expr	min	lq	mean	median	uq	max
2	sum	1.000000	1.000000	1.00000	1.00000	1.000000	1.000000
1	sum2	2.204411	2.180074	2.14744	2.17502	2.169131	2.168623

Figure: Benchmarking of sum2() and sum() on n = 10000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240210 173.1    5709258 305.0  5709258 305.0
Vcells 11788884  90.0   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.005390	0.005507	0.0058540	0.005587	0.0057530	0.027598
2	sum+[()	0.005109	0.005396	0.0061182	0.005596	0.0059905	0.037891

	expr	min	lq	mean	median	uq	max
1	sum2	1.0000000	1.0000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	0.9478664	0.9798438	1.045137	1.001611	1.041283	1.372962

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240273 173.1    5709258 305.0  5709258 305.0
Vcells 11790154  90.0   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.009012	0.0091635	0.0096359	0.009295	0.0094360	0.032655
2	sum+[()	0.009303	0.0096105	0.0104843	0.009792	0.0100435	0.046376

	expr	min	lq	mean	median	uq	max
1	sum2	1.00000	1.00000	1.000000	1.00000	1.000000	1.000000
2	sum+[()	1.03229	1.04878	1.088046	1.05347	1.064381	1.420181

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240336 173.1    5709258 305.0  5709258 305.0
Vcells 11792456  90.0   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.016272	0.0164560	0.0168704	0.0165820	0.0167310	0.039790
2	sum+[()	0.017655	0.0179795	0.0187115	0.0181925	0.0184385	0.045732

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.00000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.084993	1.09258	1.109133	1.097123	1.102056	1.149334

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 100000 vector

All elements

> x <- data[["n = 100000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240337 173.1    5709258 305.0  5709258 305.0
Vcells 11792391  90.0   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 100000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	0.051804	0.0520155	0.0527654	0.0531500	0.0532705	0.057469
1	sum2	0.105053	0.1052065	0.1058334	0.1053235	0.1054905	0.141959

	expr	min	lq	mean	median	uq	max
2	sum	1.000000	1.000000	1.000000	1.000000	1.00000	1.000000
1	sum2	2.027894	2.022599	2.005733	1.981628	1.98028	2.470184

Figure: Benchmarking of sum2() and sum() on n = 100000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240462 173.1    5709258 305.0  5709258 305.0
Vcells 11798845  90.1   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.038476	0.0386170	0.0394066	0.0386830	0.038786	0.101996
2	sum+[()	0.044886	0.0454885	0.0462292	0.0457295	0.046056	0.064244

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.00000	1.000000	1.00000	1.000000	1.0000000
2	sum+[()	1.166597	1.17794	1.173135	1.18216	1.187439	0.6298678

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240525 173.1    5709258 305.0  5709258 305.0
Vcells 11809255  90.1   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.072549	0.0727515	0.0746807	0.0730310	0.0746845	0.112514
2	sum+[()	0.083135	0.0841910	0.0881114	0.0850615	0.0866565	0.170008

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.145915	1.157241	1.179841	1.164731	1.160301	1.510994

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240588 173.1    5709258 305.0  5709258 305.0
Vcells 11829297  90.3   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.146397	0.146569	0.1477314	0.1466790	0.146982	0.205369
2	sum+[()	0.166384	0.168057	0.1697833	0.1686275	0.169579	0.196625

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.00000	1.000000	1.00000	1.000000
2	sum+[()	1.136526	1.146607	1.14927	1.149636	1.15374	0.957423

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 100000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 1000000 vector

All elements

> x <- data[["n = 1000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240589 173.1    5709258 305.0  5709258 305.0
Vcells 11829368  90.3   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 1000000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	0.501621	0.5194995	0.5432483	0.528009	0.531586	1.465403
1	sum2	1.004862	1.0307020	1.0333045	1.031092	1.034328	1.072684

	expr	min	lq	mean	median	uq	max
2	sum	1.00000	1.000000	1.000000	1.000000	1.000000	1.0000000
1	sum2	2.00323	1.984029	1.902085	1.952792	1.945739	0.7320061

Figure: Benchmarking of sum2() and sum() on n = 1000000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240714 173.1    5709258 305.0  5709258 305.0
Vcells 11889821  90.8   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.363530	0.3658475	0.3789617	0.3711655	0.3754525	0.788152
2	sum+[()	0.438339	0.4530720	0.4862498	0.4612240	0.4861020	1.021969

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.00000	1.000000
2	sum+[()	1.205785	1.238418	1.283111	1.242637	1.29471	1.296665

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240777 173.1    5709258 305.0  5709258 305.0
Vcells 11989865  91.5   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.723495	0.7295825	0.7508727	0.738803	0.7644305	0.905912
2	sum+[()	0.852317	0.8789855	1.1467230	1.315463	1.3500170	1.409307

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.178055	1.204779	1.527187	1.780533	1.766043	1.555678

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240840 173.1    5709258 305.0  5709258 305.0
Vcells 12190436  93.1   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	1.448964	1.463207	1.498134	1.492736	1.514633	2.004919
2	sum+[()	1.683709	1.713105	1.913191	1.738438	1.799778	7.315892

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.162009	1.170788	1.277049	1.164598	1.188259	3.648971

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 1000000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 10000000 vector

All elements

> x <- data[["n = 10000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240841 173.1    5709258 305.0  5709258 305.0
Vcells 12190065  93.1   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 10000000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	5.274421	5.401382	5.473633	5.477968	5.534058	5.646441
1	sum2	10.129370	10.226163	10.315181	10.285557	10.427081	10.569084

	expr	min	lq	mean	median	uq	max
2	sum	1.00000	1.00000	1.000000	1.000000	1.000000	1.000000
1	sum2	1.92047	1.89325	1.884522	1.877623	1.884166	1.871813

Figure: Benchmarking of sum2() and sum() on n = 10000000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3240966 173.1    5709258 305.0  5709258 305.0
Vcells 12790518  97.6   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	4.268098	4.333766	4.472542	4.370467	4.494381	5.251529
2	sum+[()	5.882755	7.707095	8.185736	7.785639	7.920075	18.760870

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.00000	1.000000	1.000000	1.000000
2	sum+[()	1.378308	1.778383	1.83022	1.781421	1.762217	3.572459

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241029 173.1    5709258 305.0  5709258 305.0
Vcells 13791193 105.3   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	7.501035	7.563637	7.807667	7.69571	7.867394	9.60192
2	sum+[()	9.847038	13.457123	13.709807	13.56098	13.789718	23.18444

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.312757	1.779187	1.755942	1.762148	1.752768	2.414563

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241092 173.1    5709258 305.0  5709258 305.0
Vcells 15791235 120.5   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	14.67454	15.31241	15.52328	15.64326	15.74342	17.15975
2	sum+[()	17.75512	19.81172	24.65424	26.20575	26.99759	33.87819

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.00000	1.000000	1.000000
2	sum+[()	1.209927	1.293834	1.588211	1.67521	1.714849	1.974282

Figure: Benchmarking of sum2() and sum+[()() on integer+n = 10000000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

Data type "double"

Data

> rvector <- function(n, mode = c("logical", "double", "integer"), range = c(-100, +100), na_prob = 0) {
+     mode <- match.arg(mode)
+     if (mode == "logical") {
+         x <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     } else {
+         x <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(x) <- mode
+     if (na_prob > 0) 
+         x[sample(n, size = na_prob * n)] <- NA
+     x
+ }
> rvectors <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rvector(n = scale * 100, ...)
+     data[[2]] <- rvector(n = scale * 1000, ...)
+     data[[3]] <- rvector(n = scale * 10000, ...)
+     data[[4]] <- rvector(n = scale * 1e+05, ...)
+     data[[5]] <- rvector(n = scale * 1e+06, ...)
+     names(data) <- sprintf("n = %d", sapply(data, FUN = length))
+     data
+ }
> data <- rvectors(mode = mode)

Results

n = 1000 vector

All elements

> x <- data[["n = 1000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241093 173.1    5709258 305.0  5709258 305.0
Vcells 21346364 162.9   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 1000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	0.000937	0.0009575	0.0010382	0.0009950	0.0010325	0.004678
1	sum2	0.002480	0.0025285	0.0029389	0.0025805	0.0027080	0.033102

	expr	min	lq	mean	median	uq	max
2	sum	1.000000	1.000000	1.000000	1.000000	1.00000	1.000000
1	sum2	2.646745	2.640731	2.830709	2.593467	2.62276	7.076101

Figure: Benchmarking of sum2() and sum() on n = 1000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241218 173.1    5709258 305.0  5709258 305.0
Vcells 17347680 132.4   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum+[()	0.001372	0.0015245	0.0017279	0.0016025	0.0017085	0.011558
1	sum2	0.001905	0.0020395	0.0022567	0.0021220	0.0021730	0.014957

	expr	min	lq	mean	median	uq	max
2	sum+[()	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
1	sum2	1.388484	1.337816	1.306039	1.324181	1.271876	1.294082

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241281 173.2    5709258 305.0  5709258 305.0
Vcells 17347824 132.4   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum+[()	0.001870	0.0019985	0.0022490	0.0020915	0.0022400	0.014002
1	sum2	0.002289	0.0023360	0.0026003	0.0024330	0.0025445	0.015861

	expr	min	lq	mean	median	uq	max
2	sum+[()	1.000000	1.000000	1.000000	1.00000	1.000000	1.000000
1	sum2	1.224064	1.168877	1.156198	1.16328	1.135938	1.132767

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 1000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241344 173.2    5709258 305.0  5709258 305.0
Vcells 17348066 132.4   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum+[()	0.002831	0.0030265	0.0032822	0.0031190	0.0032495	0.016792
1	sum2	0.003000	0.0030855	0.0033699	0.0032065	0.0032845	0.018411

	expr	min	lq	mean	median	uq	max
2	sum+[()	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
1	sum2	1.059696	1.019494	1.026723	1.028054	1.010771	1.096415

Figure: Benchmarking of sum2() and sum+[()() on double+n = 1000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 10000 vector

All elements

> x <- data[["n = 10000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241345 173.2    5709258 305.0  5709258 305.0
Vcells 17347695 132.4   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 10000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	0.007959	0.0080275	0.0081383	0.0080710	0.008106	0.014149
1	sum2	0.012019	0.0121200	0.0123893	0.0121935	0.012322	0.024850

	expr	min	lq	mean	median	uq	max
2	sum	1.000000	1.00000	1.000000	1.000000	1.000000	1.000000
1	sum2	1.510114	1.50981	1.522349	1.510779	1.520109	1.756308

Figure: Benchmarking of sum2() and sum() on n = 10000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241470 173.2    5709258 305.0  5709258 305.0
Vcells 17348748 132.4   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.005478	0.0056365	0.005936	0.005741	0.0058320	0.023643
2	sum+[()	0.006356	0.0066330	0.007286	0.006789	0.0070375	0.031268

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.160278	1.176794	1.227417	1.182547	1.206704	1.322506

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241533 173.2    5709258 305.0  5709258 305.0
Vcells 17350715 132.4   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.009143	0.0092735	0.0097529	0.0093860	0.0095245	0.042477
2	sum+[()	0.011242	0.0118460	0.0126331	0.0120905	0.0124100	0.041412

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.0000000
2	sum+[()	1.229575	1.277403	1.295323	1.288142	1.302955	0.9749276

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 10000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241596 173.2    5709258 305.0  5709258 305.0
Vcells 17352757 132.4   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.016381	0.016566	0.0169003	0.0167325	0.0168650	0.032969
2	sum+[()	0.021515	0.022339	0.0233113	0.0226445	0.0230925	0.049600

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.313412	1.348485	1.379346	1.353324	1.369256	1.504444

Figure: Benchmarking of sum2() and sum+[()() on double+n = 10000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 100000 vector

All elements

> x <- data[["n = 100000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241597 173.2    5709258 305.0  5709258 305.0
Vcells 17352386 132.4   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 100000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	0.077628	0.077864	0.0781125	0.0779455	0.0780470	0.081788
1	sum2	0.105096	0.105208	0.1056838	0.1053380	0.1054645	0.129068

	expr	min	lq	mean	median	uq	max
2	sum	1.000000	1.000000	1.000000	1.000000	1.000000	1.00000
1	sum2	1.353841	1.351176	1.352969	1.351431	1.351295	1.57808

Figure: Benchmarking of sum2() and sum() on n = 100000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241722 173.2    5709258 305.0  5709258 305.0
Vcells 17358840 132.5   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[()() on double+n = 100000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.037749	0.0385065	0.0400214	0.0388975	0.039074	0.116224
2	sum+[()	0.055322	0.0565515	0.0586653	0.0571910	0.058244	0.095883

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.0000	1.000000	1.0000000
2	sum+[()	1.465522	1.468622	1.465849	1.4703	1.490608	0.8249845

Figure: Benchmarking of sum2() and sum+[()() on double+n = 100000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241785 173.2    5709258 305.0  5709258 305.0
Vcells 17368884 132.6   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[() on double+n = 100000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.072660	0.0745435	0.0763096	0.074757	0.075596	0.113927
2	sum+[()	0.104418	0.1081720	0.1119452	0.109947	0.112189	0.195060

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.00000	1.000000
2	sum+[()	1.437077	1.451126	1.466987	1.470725	1.48406	1.712149

Figure: Benchmarking of sum2() and sum+[() on double+n = 100000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 100000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241848 173.2    5709258 305.0  5709258 305.0
Vcells 17390022 132.7   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[() on double+n = 100000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.146417	0.1468015	0.1525214	0.1505095	0.1545380	0.225430
2	sum+[()	0.195938	0.1993685	0.2304425	0.2070100	0.2145965	0.405332

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.338219	1.358082	1.510887	1.375395	1.388633	1.798039

Figure: Benchmarking of sum2() and sum+[() on double+n = 100000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 1000000 vector

All elements

> x <- data[["n = 1000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241850 173.2    5709258 305.0  5709258 305.0
Vcells 17389659 132.7   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 1000000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	0.772659	0.819567	0.8324265	0.834388	0.841484	0.915548
1	sum2	1.015984	1.059741	1.0764473	1.079457	1.086091	1.174845

	expr	min	lq	mean	median	uq	max
2	sum	1.000000	1.00000	1.000000	1.000000	1.000000	1.000000
1	sum2	1.314919	1.29305	1.293144	1.293711	1.290685	1.283215

Figure: Benchmarking of sum2() and sum() on n = 1000000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3241975 173.2    5709258 305.0  5709258 305.0
Vcells 17450112 133.2   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[() on double+n = 1000000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.507993	0.5601745	0.609585	0.609423	0.6560355	0.904999
2	sum+[()	0.722973	0.8044595	1.186473	1.314753	1.3507275	1.460954

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
2	sum+[()	1.423195	1.436087	1.946362	2.157373	2.058924	1.614316

Figure: Benchmarking of sum2() and sum+[() on double+n = 1000000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3242038 173.2    5709258 305.0  5709258 305.0
Vcells 17550156 133.9   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[() on double+n = 1000000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	0.79268	0.8674115	0.8985236	0.8871595	0.9208995	1.073021
2	sum+[()	1.24500	2.0127145	2.0613914	2.0419175	2.0710115	12.692151

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.000000	1.00000
2	sum+[()	1.570621	2.320369	2.294198	2.301635	2.248901	11.82843

Figure: Benchmarking of sum2() and sum+[() on double+n = 1000000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 1000000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3242101 173.2    5709258 305.0  5709258 305.0
Vcells 17750198 135.5   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[() on double+n = 1000000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	1.493228	1.545726	1.604957	1.590885	1.620530	2.112727
2	sum+[()	2.198436	3.745358	4.011810	3.793943	3.847101	14.064577

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.0000	1.000000	1.000000
2	sum+[()	1.472271	2.423041	2.499637	2.3848	2.373977	6.657073

Figure: Benchmarking of sum2() and sum+[() on double+n = 1000000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

n = 10000000 vector

All elements

> x <- data[["n = 10000000"]]
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3242102 173.2    5709258 305.0  5709258 305.0
Vcells 17749827 135.5   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x), sum = sum(x), unit = "ms")

Table: Benchmarking of sum2() and sum() on n = 10000000+all data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
2	sum	8.197766	8.482046	8.59797	8.567894	8.732012	8.946944
1	sum2	10.509965	10.721270	10.88363	10.896505	11.072421	11.386470

	expr	min	lq	mean	median	uq	max
2	sum	1.000000	1.000000	1.000000	1.000000	1.000000	1.000000
1	sum2	1.282052	1.263996	1.265837	1.271783	1.268026	1.272666

Figure: Benchmarking of sum2() and sum() on n = 10000000+all data. Outliers are displayed as crosses. Times are in milliseconds.

A 20% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.2
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3242227 173.2    5709258 305.0  5709258 305.0
Vcells 18350280 140.1   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[() on double+n = 10000000+0.2 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	6.700800	6.84244	7.015533	6.890695	7.098933	9.725433
2	sum+[()	8.996122	14.01923	14.510434	14.594601	14.820728	26.779905

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.000000	1.00000	1.000000
2	sum+[()	1.342545	2.048864	2.068329	2.118016	2.08774	2.753595

Figure: Benchmarking of sum2() and sum+[() on double+n = 10000000+0.2 data. Outliers are displayed as crosses. Times are in milliseconds.

A 40% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.4
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3242290 173.2    5709258 305.0  5709258 305.0
Vcells 19351676 147.7   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[() on double+n = 10000000+0.4 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	8.667995	8.779704	9.099489	8.914767	9.23872	10.65943
2	sum+[()	13.934503	20.904911	26.117672	21.364256	24.34419	284.23356

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.00000	1.000000	1.000000	1.000000	1.00000
2	sum+[()	1.607581	2.38105	2.870235	2.396502	2.635018	26.66498

Figure: Benchmarking of sum2() and sum+[() on double+n = 10000000+0.4 data. Outliers are displayed as crosses. Times are in milliseconds.

An 80% subset

> x <- data[["n = 10000000"]]
> subset
[1] 0.8
> idxs <- sort(sample(length(x), size = subset * length(x), replace = FALSE))
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  3242353 173.2    5709258 305.0  5709258 305.0
Vcells 21351718 163.0   38548248 294.1 87357391 666.5
> stats <- microbenchmark(sum2 = sum2(x, idxs = idxs), `sum+[()` = sum2_R(x, idxs = idxs), unit = "ms")

Table: Benchmarking of sum2() and sum+[() on double+n = 10000000+0.8 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

	expr	min	lq	mean	median	uq	max
1	sum2	15.06396	16.19000	16.74258	16.63105	16.86282	30.89018
2	sum+[()	24.13433	38.23597	46.48052	38.91965	54.78060	310.60984

	expr	min	lq	mean	median	uq	max
1	sum2	1.000000	1.000000	1.000000	1.00000	1.000000	1.0000
2	sum+[()	1.602124	2.361703	2.776187	2.34018	3.248603	10.0553

Figure: Benchmarking of sum2() and sum+[() on double+n = 10000000+0.8 data. Outliers are displayed as crosses. Times are in milliseconds.

Appendix

Session information

R version 3.6.1 Patched (2019-08-27 r77078)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /home/hb/software/R-devel/R-3-6-branch/lib/R/lib/libRblas.so
LAPACK: /home/hb/software/R-devel/R-3-6-branch/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] microbenchmark_1.4-6    matrixStats_0.55.0-9000 ggplot2_3.2.1          
[4] knitr_1.24              R.devices_2.16.0        R.utils_2.9.0          
[7] R.oo_1.22.0             R.methodsS3_1.7.1       history_0.0.0-9002     

loaded via a namespace (and not attached):
 [1] Biobase_2.45.0       bit64_0.9-7          splines_3.6.1       
 [4] network_1.15         assertthat_0.2.1     highr_0.8           
 [7] stats4_3.6.1         blob_1.2.0           robustbase_0.93-5   
[10] pillar_1.4.2         RSQLite_2.1.2        backports_1.1.4     
[13] lattice_0.20-38      glue_1.3.1           digest_0.6.20       
[16] colorspace_1.4-1     sandwich_2.5-1       Matrix_1.2-17       
[19] XML_3.98-1.20        lpSolve_5.6.13.3     pkgconfig_2.0.2     
[22] genefilter_1.66.0    purrr_0.3.2          ergm_3.10.4         
[25] xtable_1.8-4         mvtnorm_1.0-11       scales_1.0.0        
[28] tibble_2.1.3         annotate_1.62.0      IRanges_2.18.2      
[31] TH.data_1.0-10       withr_2.1.2          BiocGenerics_0.30.0 
[34] lazyeval_0.2.2       mime_0.7             survival_2.44-1.1   
[37] magrittr_1.5         crayon_1.3.4         statnet.common_4.3.0
[40] memoise_1.1.0        laeken_0.5.0         R.cache_0.13.0      
[43] MASS_7.3-51.4        R.rsp_0.43.1         tools_3.6.1         
[46] multcomp_1.4-10      S4Vectors_0.22.1     trust_0.1-7         
[49] munsell_0.5.0        AnnotationDbi_1.46.1 compiler_3.6.1      
[52] rlang_0.4.0          grid_3.6.1           RCurl_1.95-4.12     
[55] cwhmisc_6.6          rappdirs_0.3.1       labeling_0.3        
[58] bitops_1.0-6         base64enc_0.1-3      boot_1.3-23         
[61] gtable_0.3.0         codetools_0.2-16     DBI_1.0.0           
[64] markdown_1.1         R6_2.4.0             zoo_1.8-6           
[67] dplyr_0.8.3          bit_1.1-14           zeallot_0.1.0       
[70] parallel_3.6.1       Rcpp_1.0.2           vctrs_0.2.0         
[73] DEoptimR_1.0-8       tidyselect_0.2.5     xfun_0.9            
[76] coda_0.19-3

Total processing time was 1.17 mins.

Reproducibility

To reproduce this report, do:

html <- matrixStats:::benchmark('sum2')
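
To rerun a single comparison by hand rather than regenerate the whole report, something like the following sketch may be enough. It assumes the matrixStats and microbenchmark packages are installed. The vector is drawn directly with runif(), so it will not be identical to the data produced by rvectors() in the report, and the choices of `times = 100L` and the 20% subset are illustrative.

```r
library(matrixStats)
library(microbenchmark)

# Illustrative data: 100,000 doubles in [-100, 100]. This is not the
# exact vector used in the report, which is generated by rvectors().
set.seed(1)
x <- runif(1e5, min = -100, max = 100)

# A sorted 20% subset of the indices, built as in the report.
idxs <- sort(sample(length(x), size = 0.2 * length(x), replace = FALSE))

# The base-R alternative benchmarked in the report: subset, then sum.
sum2_R <- function(x, na.rm = FALSE, idxs) {
  sum(x[idxs], na.rm = na.rm)
}

# Sanity check: both approaches should agree up to numerical tolerance.
stopifnot(isTRUE(all.equal(sum2(x, idxs = idxs), sum2_R(x, idxs = idxs))))

# Time both expressions and report the summary in milliseconds.
stats <- microbenchmark(
  sum2      = sum2(x, idxs = idxs),
  `sum+[()` = sum2_R(x, idxs = idxs),
  unit = "ms", times = 100L
)
print(stats)
```

Absolute timings depend on hardware, R version, and memory state, so only the relative ordering should be compared with the tables above.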
