concept: Data Skew - davidkhala/data-warehouse GitHub Wiki
Data skew means the data is not distributed evenly across the distributions.
- Some distributions finish their portion of a parallel query before others. Because the query cannot complete until every distribution has finished processing, each query is only as fast as its slowest distribution.
- One processor core takes on far more work than the other cores and so becomes the bottleneck.
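
The effect can be illustrated with a small simulation. This is a minimal sketch, not how any particular warehouse engine assigns rows: the helper names `distribute` and `skew_ratio`, the CRC32-modulo hashing scheme, the count of 60 distributions, and the sample keys are all assumptions chosen for illustration.

```python
import zlib


def distribute(keys, n_distributions):
    """Count rows per distribution using a hypothetical hash-modulo scheme."""
    counts = [0] * n_distributions
    for key in keys:
        # CRC32 is used only because it is deterministic across runs.
        bucket = zlib.crc32(str(key).encode()) % n_distributions
        counts[bucket] += 1
    return counts


def skew_ratio(counts):
    """Largest distribution divided by the mean size; 1.0 means perfectly even."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean


# Assumed sample data: 6000 distinct keys vs. one key that dominates the column.
even_keys = range(6000)
skewed_keys = ["hot_customer"] * 5000 + list(range(1000))

even = distribute(even_keys, 60)
skewed = distribute(skewed_keys, 60)

print("even skew ratio:  ", round(skew_ratio(even), 2))
print("skewed skew ratio:", round(skew_ratio(skewed), 2))
```

If per-row work is roughly constant, the query's elapsed time tracks `max(counts)` rather than the average, so a high skew ratio translates directly into a slower query even though the total row count is the same.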