Task 3 - WGierke/tuk1_numa_a GitHub Wiki

Based on your learnings from the experiments, what are NUMA-specific factors that could influence a query optimizer’s decision?

For this task, implement and measure the following query:

SELECT B.col1-8
FROM A, B
WHERE A.col9 = B.col1 and A.col10 = 42

Assume that both tables have a size of

a) 1 million rows and 200 columns
b) 100 million rows and 100 columns

Table A is always placed on Node 1. Table B is resident on Node 1 in the first case (local – local) and on Node 2 in the second case (local – remote). Column 10 in Table A contains uniformly distributed values between 0 and 99. Column 9 in Table A has values in the range 1 to 10,000, and Column 1 in Table B is unique, with values from 1 to the length of the table.
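The table sizes and value ranges above already determine rough cardinalities, which may help when reasoning about plans before measuring anything. The sketch below is only a back-of-the-envelope estimate. The 4-byte value width and the helper name `estimate` are assumptions made for illustration and are not given by the task. It relies on two observations: a uniform filter on 100 distinct values keeps about 1% of the rows, and every value of A.col9 (1 to 10,000) matches exactly one row of B, because B.col1 is unique and covers 1 to the table length.

```python
# Assumption (not stated in the task): every value is a 4-byte integer.
BYTES_PER_VALUE = 4


def estimate(rows, cols, filter_values=100, join_key_max=10_000):
    """Back-of-the-envelope sizes for one table configuration."""
    return {
        # Raw size of one table under the 4-byte assumption.
        "table_bytes": rows * cols * BYTES_PER_VALUE,
        # Expected rows of A with col10 = 42 (uniform over 100 values).
        # This is also the final result size, since each A row has
        # exactly one join partner in B.
        "rows_after_filter": rows // filter_values,
        # Joining before filtering produces one joined row per A row.
        "join_first_intermediate_rows": rows,
        # A.col9 only takes values 1..join_key_max, so at most this
        # many distinct rows of B are ever read.
        "distinct_b_rows_touched": min(join_key_max, rows),
    }


for rows, cols in [(1_000_000, 200), (100_000_000, 100)]:
    print(rows, cols, estimate(rows, cols))
```

Under these assumptions, case a) is roughly 0.8 GB per table and case b) roughly 40 GB per table, while the join only ever touches the first 10,000 rows of B. Whether that small working set stays in cache, and so hides the remote access cost, is something the measurements would have to show.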

Hints: Should we join first? Is materialization at the beginning helpful? Should we filter at the beginning or after the join?
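To make the hinted alternatives concrete, here is a minimal sketch of the two logical plans on small synthetic data. It is not the measured implementation and does no NUMA placement at all. It only shows that both plans return the same rows while probing B a very different number of times. All function names, the column-wise list layout, and the placeholder payload in B's columns 2 to 8 are assumptions for illustration. `B.col1-8` is read here as "columns 1 through 8 of B". Because B.col1 is unique and runs from 1 to the table length, the row with key `k` sits at position `k - 1`, so the join is a positional lookup.

```python
import random


def make_tables(n_rows, key_max=10_000, seed=0):
    """Build column-wise toy tables following the task's value ranges."""
    rng = random.Random(seed)
    a_col9 = [rng.randint(1, key_max) for _ in range(n_rows)]   # join key
    a_col10 = [rng.randint(0, 99) for _ in range(n_rows)]       # filter column
    # b_cols[0] is B.col1 = 1..n_rows; the other 7 columns are placeholders.
    b_cols = [[(i + 1) * (c + 1) for i in range(n_rows)] for c in range(8)]
    return a_col9, a_col10, b_cols


def filter_then_join(a_col9, a_col10, b_cols):
    """Plan 1: apply A.col10 = 42 first, then probe B for survivors only."""
    keys = [k for k, v in zip(a_col9, a_col10) if v == 42]
    rows = [tuple(col[k - 1] for col in b_cols) for k in keys]
    return rows, len(keys)  # second value: number of B probes


def join_then_filter(a_col9, a_col10, b_cols):
    """Plan 2: materialize the full join, then apply the filter."""
    joined = [(v, tuple(col[k - 1] for col in b_cols))
              for k, v in zip(a_col9, a_col10)]
    rows = [b_row for v, b_row in joined if v == 42]
    return rows, len(joined)  # second value: number of B probes


a_col9, a_col10, b_cols = make_tables(20_000)
rows_1, probes_1 = filter_then_join(a_col9, a_col10, b_cols)
rows_2, probes_2 = join_then_filter(a_col9, a_col10, b_cols)
print("same result:", rows_1 == rows_2)
print("B probes, filter first:", probes_1, "join first:", probes_2)
```

In the local – remote case every probe into B is a remote memory access, so the probe count is one of the quantities a NUMA-aware optimizer would plausibly weigh. The actual cost difference depends on the hardware and has to be measured.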