Partitioning Criteria vs Methods - rFronteddu/general_wiki GitHub Wiki
Summary
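- Partitioning criteria are what you use: the property of the data that decides where it goes.
- The partitioning method is how you use it: the technique that maps that property to a partition.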
Let’s say you’re storing log entries:
- Criteria: timestamp
- Method: Range partitioning (e.g., logs from Jan go in Partition A, Feb in Partition B)
Or:
- Criteria: user_id
- Method: Hash partitioning (distribute users across N shards)
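The two pairings above can be sketched in a few lines of Python. This is only an illustration: the function names, the `logs_YYYY_MM` partition naming, the record layout, and the shard count of 4 are assumptions, not part of any particular database.

```python
import hashlib
from datetime import datetime


def range_by_month(ts: datetime) -> str:
    """Method: range partitioning. Each calendar month is one range."""
    return f"logs_{ts.year:04d}_{ts.month:02d}"


def hash_mod(user_id: int, num_shards: int = 4) -> int:
    """Method: hash partitioning. A stable hash spreads users over N shards."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical log entry used only for this example.
log_entry = {"timestamp": datetime(2024, 1, 15, 9, 30), "user_id": 42, "msg": "login"}

# Criteria: timestamp, Method: range
month_partition = range_by_month(log_entry["timestamp"])

# Criteria: user_id, Method: hash
shard = hash_mod(log_entry["user_id"])
```

The criteria is just the field passed in; the method is the function applied to it. Swapping either one independently gives a different partitioning scheme.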
Methods
- Range Partition: Divides data into segments based on a specified range of values for a partition key column.
- Key/Hash Based: Divides a table based on a hash function applied to a specified column, typically the ID column. Remember consistent hashing: it limits how many keys move when the number of partitions changes.
- List: Each partition is assigned a list of values; a row goes to the partition whose list contains its partition key value.
- Vertical (or Column): Splits a table by columns based on the frequency or type of access, for example by separating frequently accessed columns from rarely accessed ones.
- Composite (or Hybrid): Combines multiple methods to create detailed and adaptable partitions, for example first range and then hash.
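The Key/Hash entry above mentions consistent hashing. A minimal sketch of a hash ring follows, assuming SHA-256 as the hash and 100 virtual nodes per shard; the class name, the `shard-*` names, and the vnode count are illustrative choices, not a reference implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Stable 64-bit position on the ring, derived from SHA-256.
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes, vnodes: int = 100):
        self._vnodes = vnodes
        self._ring = []       # sorted list of (position, node)
        self._positions = []  # positions only, for bisect
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node is placed at several points to even out the load.
        for i in range(self._vnodes):
            self._ring.append((_hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._positions = [pos for pos, _ in self._ring]

    def get_node(self, key) -> str:
        # Owner is the first ring point at or after the key, wrapping around.
        idx = bisect.bisect_left(self._positions, _hash(str(key))) % len(self._ring)
        return self._ring[idx][1]


ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
before = {uid: ring.get_node(uid) for uid in range(1000)}

ring.add_node("shard-d")
after = {uid: ring.get_node(uid) for uid in range(1000)}

# Keys whose owner changed after adding the fourth shard.
moved = [uid for uid in before if before[uid] != after[uid]]
```

With plain `hash(key) % N`, changing `N` from 3 to 4 would reassign most keys. On the ring, the only keys that move are those now owned by the new shard.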
More examples
- Partitioning Criteria: The basis or rule used to decide where data goes.
  - Think: What property of the data are we using to decide the partition?
  - Examples:
    - A date column (created_at)
    - A user ID
    - A product category
    - A hash of a key
- Partitioning Methods: The technique or algorithm used to apply the criteria.
  - Think: How do we use the chosen criteria to divide the data?
  - Examples:
    - Range method: Based on ranges of values (e.g., dates)
    - Hash method: Use a hash function on the value
    - List method: Explicitly map specific values to partitions
    - Composite method: Combine two or more methods
    - Vertical method: Split by columns instead of rows
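
The List, Composite, and Vertical methods can be sketched as follows. All names here (the category mapping, the `p_*` partition labels, the catch-all partition, the bucket count, and the sample row) are hypothetical and exist only to make the example concrete.

```python
import hashlib
from datetime import date

# List method: explicit value -> partition mapping (hypothetical categories).
CATEGORY_PARTITIONS = {
    "books": "p_media",
    "music": "p_media",
    "laptops": "p_electronics",
    "phones": "p_electronics",
}


def list_partition(category: str) -> str:
    # Unlisted values fall into a catch-all partition (a choice made for this sketch).
    return CATEGORY_PARTITIONS.get(category, "p_default")


def composite_partition(created_at: date, user_id: int, buckets: int = 4) -> str:
    # Composite method: range on the year first, then hash on user_id.
    digest = hashlib.sha256(str(user_id).encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % buckets
    return f"y{created_at.year}_h{bucket}"


def vertical_split(row: dict, hot_columns: set, key: str = "id"):
    # Vertical method: split by columns; both halves keep the key so they can be rejoined.
    hot = {c: v for c, v in row.items() if c in hot_columns or c == key}
    cold = {c: v for c, v in row.items() if c not in hot_columns or c == key}
    return hot, cold


row = {"id": 7, "name": "Ada", "last_login": "2024-02-01", "bio": "long text..."}
hot, cold = vertical_split(row, hot_columns={"name", "last_login"})
```

Here `hot` holds the frequently read columns and `cold` the rest, while `composite_partition` shows one criteria (`created_at`) handled by range and a second (`user_id`) handled by hash.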