3.4 Lock free MariaDB in ReadOnly Case - KeonYeong/LKY GitHub Wiki
Overall Description :
This project is to implement lock-free structure on the existing open-source database "MariaDB". Originally, the MariaDB was using the global lock in order to handle the lock table for the records in it. Through this project I tried to eliminate the global lock and check the improvement of performance. All the test cases were performed using 'sysbench'.
Original Design:
-
Insert + Delete The main bottleneck of the performance was due to the global mutex held before managing the lock table. Every time the record lock is added or deleted, the methods called lock_mutex_enter() and lock_mutex_exit() are called before and after the lock inserting/deleting method. Lock table is managed using hash. Page number containing the record is hashed and inserted into each buckets of hash table(lock table). Hence the unit of lock is page lock.
- Insert : Main method is lock_rec_lock(). Inside the lock_rec_lock(), there were main two functions inserting the record lock(One is lock_rec_lock_fast, the other one is lock_rec_lock_slow). At first all the locks are inserted through the lock_rec_lock_fast if there are no conflicts expected. However when conflicts are expected or the locks are already existing in the lock table. Ultimately the main function is operated through traversing the list and adding or deleting from the list in the hash bucket.
As shown in picture, mutex is held before and after the insertion.
This is the main function that insert the lock to hash table.
- Delete : Main methods are lock_release() and lock_rec_unlock(). For lock_release(), mutex is again held before and after the method call when in lock_rec_unlock() mutex is held inside the method. In lock_rec_unlock() unlocking and let the next lock granting the lock is the main operation it does by using lock_grant_and_move_on_rec() method. When the lock_release() just dequeue the lock from the hash table bucket using HASH_DELETE. Ultimately same as inserting again, it travers the list and delete the certain lock from the list.
Again, the mutex is held in the method.
Deleting the lock from hash table.
- Insert : Main method is lock_rec_lock(). Inside the lock_rec_lock(), there were main two functions inserting the record lock(One is lock_rec_lock_fast, the other one is lock_rec_lock_slow). At first all the locks are inserted through the lock_rec_lock_fast if there are no conflicts expected. However when conflicts are expected or the locks are already existing in the lock table. Ultimately the main function is operated through traversing the list and adding or deleting from the list in the hash bucket.
New Design:
-
Overall : First the unit of lock is decreased from page to the level of records. Hence when traversing the list, heap_no (record id) is added and checked during the loop. Moreover the tail is added to the bucket of the record in order to use atomic operation to insert. Every main operations are done by atmoic functions and global mutex locks are deleted. Besides in order to mark the deleted node, del_mark is added to the lock_t structure.
Tail is added to hash_cell_t structure.
del_mark and heap_no(Record ID) in lock_t structure.
While traversing, heap_no(Record ID) is also checked.
-
Insert : Several functions are modified, but main concept is that after traversing the list and grab the lock with same record. Checks whether there is conflict or not(unchanged). Then when inserting into the hash table, tail variable is used rather than the head(member 'node' in code) in hash_cell_t structure. Using the testAndSet atomic operation the new lock is added to tail atomically and change the previous next pointer to itself. Besides lock_rec_lock is also modified to lock_rec_lock2 and the format was reorganized.
ex) In hash_insert, test and set used.
-
Delete : In dequeue operation, Idea is that we compareAndSwap on the head(member 'node' in code) of hash_cell_t to the upcoming lock right after the head node, and make it new head (or sentinel) then finally retrieve the value from the node taken. In deleting the certain lock from the queue, after traversing the list and grabbing the interested lock. Checks whether there is conflict or not and also whether delete mark is already set or not, if delete mark is set we traverse again or fail the operation. If not then we delete the lock using the atomic operation. Deletion is lazily performed which means only a certain bit in lock ('del_mark' in this case) is set to 1 and the next pointer of the previous lock is connected to next pointer of interested lock. At last, garbage collector is created and periodically checks the nodes in hash tables for the deletion mark and frees if it's set.
Graph of performance:
This is the line graph showing the difference between the performance of original design and new design with threads number of 10, 20, 30. (The comparing criteria is the number of queries performed per second)
These are the final sysbench results of original and new design with 30 threads.
Result:
Global lock shows to be huge performance bottleneck on MariaDB since the lock overhead and increased contention in every searching are not ignorable. Hence by applying the atomic operations and certain lock-free concepts learned from the class, the elimination of global locks became possible which resulted in increase in performance with no invariant broken(good correctness).