Local Clone Flow

The current version is at https://github.com/facebook/mysql-5.6/wiki/Local-Clone-Flow

In a local clone, the client and the donor are the same instance on a single machine: the instance clones its own data to a specified file system path.

Single-Threaded Local Clone

Single-threaded local clone is the simplest case. The picture omits the InnoDB clone stages for simplicity.

[Figure: local_clone_cf-3]

The client, donor, and plugin roles are all played by a single thread that calls different functions. Each step below is repeated for every storage engine.

  1. Call clone_begin(HA_CLONE_MODE_START). The storage engine creates the clone data snapshot and assigns it a locator, which is passed to all subsequent calls. The task_id is set to 0 and is likewise passed to all subsequent calls.
  2. Call clone_apply_begin(HA_CLONE_MODE_START). The storage engine creates the clone target directories.
  3. Call clone_copy with the file_cbk callback.
    1. In a loop, the storage engine determines the next data chunk to send until none are left. For file data (as opposed to in-memory buffer data), a chunk is a (file, offset, count) triple. The offset is maintained implicitly by the file descriptor position.
    2. The storage engine calls file_cbk with the fd and count.
    3. file_cbk inside the clone plugin calls clone_apply with apply_file_cbk.
      1. The storage engine determines where to put the received file data and prepares a Ha_clone_file instance (a wrapper around an fd opened for writing).
      2. The storage engine calls apply_file_cbk with the Ha_clone_file instance.
        1. apply_file_cbk arranges for the data to be copied from the donor fd to the client fd, possibly using zero-copy optimisations.
  4. Call clone_end.
  5. Call clone_apply_end.
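
The call sequence above can be sketched as a small toy model in C++. This is only an illustration under stated assumptions: ToyEngine, Mode, run_local_clone, and every signature below are hypothetical stand-ins, not the real storage engine clone interface. Files are in-memory strings, there is a single storage engine, and the chunk offset is passed explicitly instead of being tracked by a file descriptor position.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical toy model of the flow; not the real MySQL clone interface.
enum class Mode { Start, AddTask };

struct ToyEngine {
  std::vector<std::string> donor;   // donor "files" (the snapshot contents)
  std::vector<std::string> client;  // client "files" (the clone target)
  std::vector<std::string> log;     // top-level calls, in order
  std::size_t chunk_size = 4;       // bytes per chunk (arbitrary choice)
  std::size_t chunks = 0;           // number of chunks passed to file_cbk

  using FileCbk = std::function<void(std::size_t file, std::size_t offset,
                                     std::size_t count)>;
  using ApplyFileCbk = std::function<void(std::string &dest)>;

  // Step 1 (donor side): create the snapshot; the first task gets id 0.
  int clone_begin(Mode) {
    log.push_back("clone_begin");
    chunks = 0;
    return 0;
  }

  // Step 2 (client side): create one empty target per donor file.
  void clone_apply_begin(Mode) {
    log.push_back("clone_apply_begin");
    client.assign(donor.size(), std::string());
  }

  // Step 3 (donor side): walk all chunks and hand each one to file_cbk.
  void clone_copy(int /*task_id*/, const FileCbk &file_cbk) {
    log.push_back("clone_copy");
    for (std::size_t f = 0; f < donor.size(); ++f) {
      for (std::size_t off = 0; off < donor[f].size(); off += chunk_size) {
        ++chunks;
        file_cbk(f, off, std::min(chunk_size, donor[f].size() - off));
      }
    }
  }

  // Step 3.3 (client side): pick the destination, then call apply_file_cbk.
  void clone_apply(int /*task_id*/, std::size_t file,
                   const ApplyFileCbk &apply_file_cbk) {
    assert(file < client.size());
    apply_file_cbk(client[file]);
  }

  void clone_end(int) { log.push_back("clone_end"); }              // step 4
  void clone_apply_end(int) { log.push_back("clone_apply_end"); }  // step 5
};

// Plugin role: a single thread drives the donor and client calls in order.
inline void run_local_clone(ToyEngine &se) {
  const int task_id = se.clone_begin(Mode::Start);
  se.clone_apply_begin(Mode::Start);
  // file_cbk: invoked by the donor side once per chunk.
  se.clone_copy(task_id, [&](std::size_t file, std::size_t offset,
                             std::size_t count) {
    // apply_file_cbk: copies the chunk from the donor file to the client file.
    se.clone_apply(task_id, file, [&](std::string &dest) {
      dest.append(se.donor[file], offset, count);
    });
  });
  se.clone_end(task_id);
  se.clone_apply_end(task_id);
}
```

To try it, fill ToyEngine::donor with a few strings, call run_local_clone, and compare client with donor. The nesting of the two lambdas mirrors the nesting of the list: the donor side calls back into the plugin, which calls the client side, which calls back into the plugin again to move the bytes.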

Local Clone with Multiple Threads

To reach the clone throughput target, the plugin may spawn (and later destroy) additional threads between the begin and end calls. Each such thread executes the same sequence of steps, except that it uses HA_CLONE_MODE_ADD_TASK instead of HA_CLONE_MODE_START in the begin calls. The task IDs tell the threads apart.

[Figure: local_clone_multiple_threads-5]

Here, each thread still goes through all three columns (plugin, donor, and client), but several threads do so concurrently.
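
A rough way to picture the multi-threaded case is the toy C++ model below. It is a sketch under assumptions, not the plugin's actual implementation: ToyEngine, Mode, run_local_clone, the shared atomic chunk counter, and the way task IDs are handed out are all hypothetical. There is a single in-memory "file", and the file_cbk / apply_file_cbk chain is collapsed into one callback for brevity.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy model; not the real MySQL clone interface.
enum class Mode { Start, AddTask };

class ToyEngine {
 public:
  using FileCbk = std::function<void(std::size_t offset, std::size_t count)>;

  ToyEngine(std::vector<char> data, std::size_t chunk_size)
      : donor_(std::move(data)), chunk_size_(chunk_size) {
    assert(chunk_size_ > 0);
  }

  // Start resets the shared state and yields task 0; AddTask joins the
  // existing clone and yields the next free task id.
  int clone_begin(Mode mode) {
    if (mode == Mode::Start) {
      next_chunk_ = 0;
      next_task_ = 0;
      ended_ = 0;
    }
    return next_task_++;
  }

  // Only the first task creates the target; added tasks reuse it.
  void clone_apply_begin(Mode mode) {
    if (mode == Mode::Start) client_.assign(donor_.size(), '\0');
  }

  // Each task claims chunks from a shared counter until none are left.
  // Returns how many chunks this task handled.
  std::size_t clone_copy(const FileCbk &file_cbk) {
    std::size_t handled = 0;
    for (;;) {
      const std::size_t off = next_chunk_.fetch_add(1) * chunk_size_;
      if (off >= donor_.size()) break;
      file_cbk(off, std::min(chunk_size_, donor_.size() - off));
      ++handled;
    }
    return handled;
  }

  // Chunks are disjoint byte ranges, so tasks never write the same bytes.
  void clone_apply(std::size_t off, std::size_t count) {
    assert(off + count <= client_.size());
    std::copy_n(donor_.begin() + off, count, client_.begin() + off);
  }

  void clone_end(int /*task_id*/) { ++ended_; }
  void clone_apply_end(int /*task_id*/) {}

  const std::vector<char> &client() const { return client_; }
  int ended_tasks() const { return ended_; }

 private:
  std::vector<char> donor_;
  std::vector<char> client_;
  std::size_t chunk_size_;
  std::atomic<std::size_t> next_chunk_{0};
  std::atomic<int> next_task_{0};
  std::atomic<int> ended_{0};
};

// Plugin role: the first thread starts the clone, then extra threads join
// it with Mode::AddTask. Returns the chunk count per task id.
inline std::vector<std::size_t> run_local_clone(ToyEngine &se,
                                                unsigned extra_threads) {
  std::vector<std::size_t> chunks_per_task(extra_threads + 1, 0);
  auto copy_step = [&](int task_id) {
    chunks_per_task[static_cast<std::size_t>(task_id)] = se.clone_copy(
        [&se](std::size_t off, std::size_t n) { se.clone_apply(off, n); });
  };

  const int main_task = se.clone_begin(Mode::Start);  // task 0
  se.clone_apply_begin(Mode::Start);

  std::vector<std::thread> workers;
  for (unsigned i = 0; i < extra_threads; ++i) {
    workers.emplace_back([&] {
      const int task_id = se.clone_begin(Mode::AddTask);
      se.clone_apply_begin(Mode::AddTask);
      copy_step(task_id);
      se.clone_end(task_id);
      se.clone_apply_end(task_id);
    });
  }

  copy_step(main_task);
  for (auto &t : workers) t.join();
  se.clone_end(main_task);
  se.clone_apply_end(main_task);
  return chunks_per_task;
}
```

In this sketch the task ID is what lets each thread record its own chunk count without locking, which echoes the role task IDs play in telling the threads apart. How many chunks each task ends up copying depends on scheduling, so only the total is deterministic.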