Locking - nimrody/knowledgebase GitHub Wiki

  • Why locks are necessary in MVCC

    It depends on isolation level. Default isolation level in MySQL is REPEATABLE READ and not SERIALIZABLE. Without locks, lost updates are sometimes possible.

    The basic scenario where lost update is possible looks this way:

    session_1: BEGIN
    session_1: SELECT data from T1 where pk = 123
    
    session_2: BEGIN
    session_2: SELECT data from T1 where pk = 123
    session_2: UPDATE T1 set data = "value_2" where pk = 123
    session_2: COMMIT
    
    session_1: UPDATE T1 set data = "updated_value" where pk = 123
    session_1: COMMIT
    

    In this scenario, session-1 does not see "value-2", so "updated-value" does not take "value2" into account, and update of session-2 becomes lost. There are two ways to prevent this problem:

    1. Using SELECT FOR UPDATE instead of SELECT. This way session-1 take exclusive lock on the corresponding row, and session-2 execution is suspended until session-1 will commit.

    2. Using more strict SERIALIZABLE isolation level. In this mode, session-1 in MySQL will take shared lock on each selected row, and each SELECT effectively becomes SELECT ... LOCK IN SHARE MODE. This way session-2 also blocks until session-1 will commit. In PostgreSQL, SERIALIZED mode implemented the other way - SELECT works as usual (without locks), but during UPDATE session-1 will detect that the row was changed by concurrent transaction, and transaction of session-1 will be terminated with the exception "could not serialize access due to concurrent update".