# Benchmark results: expand2b, SymEngine
A more recent version of this page can be found in the SymEngine Wiki.
This page keeps track of the results of the `expand2b` benchmark of SymEngine.
##### Benchmark used

```
f = (x + y + z + w)**15
f * (f + w)
```

The result is a 6272-term expression.
##### Results of expand2

`expand2` uses regular SymEngine expressions.
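As a reference point, here is a minimal sketch of this benchmark on top of the public SymEngine API (the real sources live in the repository's `benchmarks/` directory; the exact header set and the timing harness below are assumptions):

```cpp
#include <chrono>
#include <iostream>
// Header set is approximate; adjust to the SymEngine version in use.
#include <symengine/basic.h>
#include <symengine/add.h>
#include <symengine/mul.h>
#include <symengine/pow.h>
#include <symengine/symbol.h>
#include <symengine/integer.h>

using SymEngine::Basic;
using SymEngine::RCP;

int main()
{
    RCP<const Basic> x = SymEngine::symbol("x");
    RCP<const Basic> y = SymEngine::symbol("y");
    RCP<const Basic> z = SymEngine::symbol("z");
    RCP<const Basic> w = SymEngine::symbol("w");

    // f = (x + y + z + w)**15, expanded up front
    RCP<const Basic> s = SymEngine::add(SymEngine::add(x, y), SymEngine::add(z, w));
    RCP<const Basic> f = SymEngine::expand(SymEngine::pow(s, SymEngine::integer(15)));

    // Time only the multiply-and-expand step: f * (f + w)
    auto t1 = std::chrono::high_resolution_clock::now();
    RCP<const Basic> r = SymEngine::expand(SymEngine::mul(f, SymEngine::add(f, w)));
    auto t2 = std::chrono::high_resolution_clock::now();

    std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count()
              << "ms" << std::endl;
    return 0;
}
```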
###### On master

```
836ms
831ms
845ms
830ms
836ms
829ms
839ms
831ms
839ms
837ms
```

Maximum: 845ms
Minimum: 829ms
Average: 835.3ms
###### On packint

```
1106ms
1110ms
1133ms
1105ms
1103ms
1109ms
1105ms
1101ms
1107ms
1119ms
```

Maximum: 1133ms
Minimum: 1101ms
Average: 1109.8ms
##### Results of expand2b

`expand2b` uses the structure and `poly_mul` that reside in `rings`.
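The real structure lives in `rings.cpp`; as a rough illustration of the idea only (not the actual code), a `poly_mul` over packed exponents can look like the sketch below. All four exponents share one 64-bit key, so multiplying two monomials is a single integer addition:

```cpp
// Illustration only, not the rings.cpp implementation.
#include <cstdint>
#include <unordered_map>

// Packed monomial -> coefficient. Each of the 4 exponents gets a 16-bit
// field inside the 64-bit key, e.g. x^a y^b z^c w^d -> (a<<48)|(b<<32)|(c<<16)|d.
using poly_dict = std::unordered_map<std::uint64_t, long long>;

void poly_mul(const poly_dict &a, const poly_dict &b, poly_dict &out)
{
    for (const auto &ta : a) {
        for (const auto &tb : b) {
            // Adding the packed keys adds all four exponents at once,
            // provided no single exponent overflows its 16-bit field.
            out[ta.first + tb.first] += ta.second * tb.second;
        }
    }
}
```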
###### On master

Example run:

```
sumith@sumith-Lenovo-Z50-70:~/github/csympy/benchmarks$ sudo nice -n -19 ./expand2b
poly_mul start
poly_mul stop
95ms
number of terms: 6272
```
Result of 10 executions:
```
94ms
97ms
94ms
94ms
94ms
97ms
96ms
93ms
93ms
94ms
```

Maximum: 97ms
Minimum: 93ms
Average: 94.6ms
###### On packint

Example run:

```
sumith@sumith-Lenovo-Z50-70:~/github/csympy/benchmarks$ sudo nice -n -19 ./expand2b
poly_mul start
poly_mul stop
114ms
number of terms: 6272
```
Result of 10 executions:

```
106ms
105ms
108ms
106ms
110ms
106ms
106ms
107ms
106ms
106ms
```

Maximum: 110ms
Minimum: 105ms
Average: 106.6ms
##### Why is there a slowdown in the packint branch for expand and expand2b?
##### Results of expand2c

The most recent `expand2c` uses a structure backed by `piranha::integer` from Piranha. The new `rings.cpp` can be found here.
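Roughly, the change amounts to swapping the coefficient type of such a dictionary for Piranha's arbitrary-precision integer. A sketch under the same illustrative structure as above (the umbrella header and exact type name are assumptions):

```cpp
// Illustration only: same packed-monomial dictionary, but with
// piranha::integer coefficients so large coefficients cannot overflow.
#include <cstdint>
#include <unordered_map>
#include "piranha.hpp" // assumed umbrella header of the Piranha of that era

using poly_dict = std::unordered_map<std::uint64_t, piranha::integer>;

void poly_mul(const poly_dict &a, const poly_dict &b, poly_dict &out)
{
    for (const auto &ta : a)
        for (const auto &tb : b)
            out[ta.first + tb.first] += ta.second * tb.second;
}
```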
Example run:

```
sumith@sumith-Lenovo-Z50-70:~/github/csympy/benchmarks$ sudo nice -n -19 ./expand2c
poly_mul start
poly_mul stop
32ms
number of terms: 6272
```
Result of 10 executions:

```
27ms
27ms
26ms
26ms
27ms
26ms
26ms
26ms
27ms
26ms
```

Maximum: 27ms
Minimum: 26ms
Average: 26.4ms
##### Results of expand2d

Example run:

```
sumith@sumith-Lenovo-Z50-70:~/github/csympy/benchmarks$ sudo nice -n -19 ./expand2d
poly_mul start
poly_mul stop
23ms
number of terms: 6272
```
Here, `evaluate_sparsity()` gave the following result for the `hash_set`, i.e. how many buckets hold a given number of terms (0·11488 + 1·3605 + 2·1206 + 3·85 = 6272 terms across 16384 buckets):

| Terms per bucket | Number of buckets |
| --- | --- |
| 0 | 11488 |
| 1 | 3605 |
| 2 | 1206 |
| 3 | 85 |
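Such a sparsity report counts how many buckets of the hash table hold 0, 1, 2, ... entries. Piranha's `hash_set` provides `evaluate_sparsity()` for this; an equivalent for a standard unordered container would look like this sketch:

```cpp
// Illustrative sketch: bucket-occupancy histogram for a standard
// unordered container (Piranha's hash_set has evaluate_sparsity() built in).
#include <cstddef>
#include <map>

template <typename Map>
std::map<std::size_t, std::size_t> sparsity(const Map &m)
{
    std::map<std::size_t, std::size_t> hist; // entries per bucket -> bucket count
    for (std::size_t i = 0; i < m.bucket_count(); ++i)
        ++hist[m.bucket_size(i)];
    return hist;
}
```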
Result of 10 executions:

```
14ms
14ms
14ms
15ms
14ms
15ms
14ms
14ms
15ms
14ms
```

Maximum: 15ms
Minimum: 14ms
Average: 14.3ms
##### Piranha results

The `fateman1_perf` test was re-written to run the following benchmark.
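The rewritten code itself is not reproduced on this page; a minimal sketch of what it presumably computes, assuming the Piranha polynomial API of that era (`piranha::polynomial`, `piranha::monomial`, `piranha::math::pow`), is:

```cpp
// Sketch only: the rewritten fateman1_perf body as implied by this page's
// benchmark, f = (x + y + z + w)**15; f * (f + w). Types and calls are
// assumed from the Piranha API of that era.
#include "piranha.hpp"

void run_benchmark()
{
    using p_type = piranha::polynomial<piranha::integer, piranha::monomial<short>>;
    p_type x{"x"}, y{"y"}, z{"z"}, w{"w"};
    auto f = piranha::math::pow(x + y + z + w, 15); // f = (x + y + z + w)**15
    auto r = f * (f + w);                           // the timed multiplication
}
```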
Example run:

```
sumith@sumith-Lenovo-Z50-70:~/github/piranha/tests$ sudo nice -n -19 ./fateman1_perf 1
Running 1 test case...
0.013577s wall, 0.010000s user + 0.000000s system = 0.010000s CPU (73.7%)
*** No errors detected
Freeing MPFR caches.
Setting shutdown flag.
```
Result of 10 executions:

```
0.013577s wall
0.013190s wall
0.013875s wall
0.012964s wall
0.013724s wall
0.013539s wall
0.013469s wall
0.013343s wall
0.013011s wall
0.013515s wall
```

Maximum: 13.875ms
Minimum: 12.964ms
Average: 13.421ms

The wall time is used for the comparison and statistics.

Note: all of the above are the first 10 results of execution.
Inputs received from Ondřej Čertík and Francesco Biscani.
On a new branch:

Changes: used `arr_int4` (a `std::array<int, 4>`) instead of `vec_int` for the monomial multiplication; a sketch follows the timings below.

Result: a noticeable speedup.
```
81ms
79ms
81ms
81ms
80ms
80ms
80ms
82ms
81ms
80ms
```

Maximum: 82ms
Minimum: 79ms
Average: 80.5ms
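A sketch of what the change amounts to (the `arr_int4`/`vec_int` names are from the branch; the code below is an assumption): a fixed-size `std::array` keeps the four exponents on the stack, so the hot monomial multiplication does no heap allocation:

```cpp
#include <array>
#include <cstddef>
#include <vector>

using arr_int4 = std::array<int, 4>; // one exponent each for x, y, z, w
using vec_int  = std::vector<int>;   // the previous, heap-allocated variant

// Exponents add when monomials are multiplied; with std::array the result
// lives on the stack and the loop bound is a compile-time constant.
arr_int4 monomial_mul(const arr_int4 &a, const arr_int4 &b)
{
    arr_int4 r;
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = a[i] + b[i];
    return r;
}
```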
Using `std::valarray` instead, however, resulted in a slowdown, averaging around 112ms. There were very few instances where its syntactic sugar came in handy. Since we assume the bottleneck is memory-allocation time, `valarray` will probably not bring much over `vector`. Still, there might be situations in which it is worth using over `vector`; just something to keep in mind.
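For comparison, a `std::valarray` version of the same step is shorter thanks to the built-in element-wise operators, but each call still allocates a fresh `valarray`, which is where the assumed memory-allocation bottleneck bites (illustrative sketch):

```cpp
#include <valarray>

// The syntactic sugar: element-wise + comes for free with valarray,
// but the returned valarray is a new heap allocation on every call.
std::valarray<int> monomial_mul(const std::valarray<int> &a,
                                const std::valarray<int> &b)
{
    return a + b;
}
```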