2.2 Join Operation - KeonYeong/LKY GitHub Wiki
Overview :
Using 3 buffers, join is operated(2 buffers for table to be joined, 1 buffer for output buffer). Basic algorithm used for join operation is sort-merge join since the file system is composed of b+ tree. At first fetch the first pages from the each tables and result of join operation is written on output buffer. When the output buffer is full the data is written on disk.
Data Structure:
-
Buffer :: Modified
The frame is modified into Union structure in order to used as output buffer. Another type of data is initialized as char arrays. Hence when the result is made as string type "key,value,key,value". The result string is appended to the char array using 'strcat' method. When the buffer is full or soon to be fulled by the append, buffer is flushed to disk. Character array size is calculated according to the size of Buffer.
Functionality:
-
Overview
- fetchPage()
This is the function for fetching the pages onto buffer selected to be used in join.
- initSM()
This is the function of initializing the sort-merge join on b+tree file system.
-
join_table(): Main Function of join operation, sort-merge join is chosen for join method.
-
joinOut()
Function which write the result of join on output buffer.
- writePage()
When output buffer is full, this function is called and flush the page onto the disk.
- fetchPage()
Initialization:
- At first, we traverse the b+ tree to the leftmost side because the B+ tree is stored sequentially. Hence the leftmost side node contains the smallest keys of the tree. After then the node is fetched to the buffer. Same thing happens to the to-be-joined table but on the other buffer. Next the output buffer is initialized (memory is set to 0, and the first character of the array is set to 'null' character).
Join (Sort-merge):
- Sort-merging Join is chosen because the data to be joined are already sorted and stored. Hence in order to increase performance sort-merge join will help that. When the iterator reaches the end of the data on certain node, we look on the extra offset that the node has which refers to the sibling node (the next-smallest node). Then we fetch the new page on to the current buffer and iterate again.
- Sort-merging rule is simple. First when we have two tables to be joined. We choose the first table as the main table and second one as table to be joined. We give 'i' index to first table and 'j' index to second table. We iterate the first table until the data[i] value is bigger than or same as the data[j]. after then iterate the second table until the data[j] is bigger than or same as data[i]. If we couldn't find the same keys on both side and reach the end of the node, fetch another sibling page from each table to buffer and iterate again. If we finally get the data where data[i] == data[j], we get the key and value from each data and write on to the output buffer in order to save results. Whole this step is repeated until the one of the nodes becomes the last page of the table.
Output / Flush:
- When we write the join result on the output buffer, we save the result as a string type. Using 'sprintf' method, we save result as 'data[i].key,data[i].value,data[j].key,data[j].value' and append this string onto the output buffer using the 'strcat' method. Since there is no any index that implies the end of the buffer. Every time we check whether the 'length of buffer + length of result to be appended > buffer size' and if it does, we write the buffer to disk and if not we just append.
- When writing down to the disk, we open the new file and flush the buffer to the file made. After flush finishes, we initialize the output buffer again in order to reuse (memories set to 0, first character set to 'null' character).
Special:
-
Number of buffers
Since one page only has the 32 pairs of the records, I chose not to use many buffers. If we say we use several buffers, it's effective if we join in 2-ways (making the I/O specialist thread). However since the number of records are not numerous, the scheduling overhead and signaling overhead are not ignorable here. Hence in this project, number of buffers used is only 3 where 2 buffers are used as saving the records of two tables and the other one is used to save the result of join operation.
Perf:
-
Sample
This is the case of 524288 results made by the tables with 2^19 keys and 2^20 keys with the buffer size of 3.
- Since the keys are already sorted (because of the property of B+ tree), the performance seemed to be really fast even the number of keys to be joined are numerous.