Local Clone Flow - laurynas-biveinis/mysql-5.6 GitHub Wiki
The current version is at https://github.com/facebook/mysql-5.6/wiki/Local-Clone-Flow
In a local clone, the client and the donor are the same instance on a single machine. The operation copies the current instance's data to a specified file system path.
Single-Threaded Local Clone
Single-threaded local clone is the simplest case. The picture is simplified by not considering InnoDB clone stages.
Client, donor, and plugin all run in a single thread that calls different functions. Each step below is repeated for every storage engine.
- Call `clone_begin(HA_CLONE_MODE_START)`. The storage engine creates the clone data snapshot and assigns a locator to it, which is then passed to all further calls. The `task_id` is set to `0`, which is also passed to all further calls.
- Call `clone_apply_begin(HA_CLONE_MODE_START)`. The storage engine creates the clone target directories.
- Call `clone_copy` with the `file_cbk` callback.
  - In a loop, the storage engine figures out which data chunk to send next until there are none left. In the case of file data (as opposed to in-memory buffer data), a chunk is `(file, offset, count)`. The `offset` is maintained implicitly by the file descriptor position.
  - The storage engine calls `file_cbk` with `fd` & `count`.
  - `file_cbk` inside the clone plugin calls `clone_apply` with `apply_file_cbk`.
    - The storage engine figures out where to put the received file data and prepares a `Ha_clone_file` instance (a wrapper around an fd opened for writing).
    - The storage engine calls `apply_file_cbk` with the `Ha_clone_file` instance.
    - `apply_file_cbk` arranges for data to be copied from the donor fd to the client fd, perhaps also by using zero-copy optimisations.
- Call `clone_end`.
- Call `clone_apply_end`.
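The call order above can be sketched as a small self-contained simulation. This is only an illustration under stated assumptions: `Mock_engine`, `File_cbk`, `run_local_clone`, and all member signatures are hypothetical stand-ins, not the real MySQL handlerton clone interface. File descriptors are modelled as in-memory strings, and the locator is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins: NOT the real MySQL clone API signatures.
enum Clone_mode { HA_CLONE_MODE_START, HA_CLONE_MODE_ADD_TASK };

// Models file_cbk: the real callback receives an fd and a byte count,
// here a chunk of bytes is passed directly (assumption for brevity).
using File_cbk = std::function<void(const std::string &chunk)>;

struct Mock_engine {
  std::vector<std::string> donor_chunks;  // snapshot data the donor side sends
  std::string target;                     // stands in for the client-side file
  std::vector<std::string> trace;         // records the order of calls

  // Creates the snapshot; returns the task_id (0 for the first task).
  unsigned clone_begin(Clone_mode) {
    trace.push_back("clone_begin");
    return 0;
  }

  // Would create the clone target directories.
  void clone_apply_begin(Clone_mode) { trace.push_back("clone_apply_begin"); }

  // Donor side: loop over the chunks and hand each one to the plugin callback.
  void clone_copy(unsigned, const File_cbk &file_cbk) {
    trace.push_back("clone_copy");
    for (const auto &chunk : donor_chunks) file_cbk(chunk);
  }

  // Client side: decide where the data goes, then copy it. The append below
  // stands in for apply_file_cbk copying from the donor fd to the client fd.
  void clone_apply(unsigned, const std::string &chunk) {
    trace.push_back("clone_apply");
    target += chunk;
  }

  void clone_end(unsigned) { trace.push_back("clone_end"); }
  void clone_apply_end(unsigned) { trace.push_back("clone_apply_end"); }
};

// Plugin-side driver: each step is iterated over all storage engines.
inline void run_local_clone(std::vector<Mock_engine> &engines) {
  std::vector<unsigned> task_ids;
  for (auto &e : engines)
    task_ids.push_back(e.clone_begin(HA_CLONE_MODE_START));
  for (auto &e : engines) e.clone_apply_begin(HA_CLONE_MODE_START);
  for (std::size_t i = 0; i < engines.size(); ++i) {
    Mock_engine &e = engines[i];
    const unsigned task_id = task_ids[i];
    assert(task_id == 0);  // single-threaded clone has only task 0
    // file_cbk lives in the plugin and forwards each chunk to clone_apply.
    e.clone_copy(task_id, [&e, task_id](const std::string &chunk) {
      e.clone_apply(task_id, chunk);
    });
  }
  for (std::size_t i = 0; i < engines.size(); ++i)
    engines[i].clone_end(task_ids[i]);
  for (std::size_t i = 0; i < engines.size(); ++i)
    engines[i].clone_apply_end(task_ids[i]);
}
```

Running `run_local_clone` on one mock engine and then inspecting its `trace` member shows the same thread passing through the plugin, donor, and client roles in the order listed above.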
Local Clone with Multiple Threads
To achieve the clone throughput target, the plugin may decide to spawn (and destroy) new threads between the `begin` and `end` calls. They execute the same step sequence with `HA_CLONE_MODE_ADD_TASK` instead of `HA_CLONE_MODE_START` for the `begin`-calling steps. The task IDs tell the threads apart.
Here, each thread also goes through all three columns (plugin, donor, & client), but more than one thread goes through them.
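A minimal sketch of the multi-threaded variant follows, again with hypothetical names (`Shared_snapshot`, `next_chunk`, `run_parallel_clone`) that are not part of the MySQL source. It assumes that tasks pull chunks from a shared counter; the page does not say how the storage engine actually divides the work, so treat that part purely as an illustration of task IDs telling threads apart.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins: NOT the real MySQL clone API.
enum Clone_mode { HA_CLONE_MODE_START, HA_CLONE_MODE_ADD_TASK };

// Engine-side state shared by every task of one clone operation.
class Shared_snapshot {
 public:
  explicit Shared_snapshot(std::size_t chunk_count) : m_chunks(chunk_count) {}

  // START creates the first task (ID 0); ADD_TASK attaches a further task
  // to the existing snapshot and returns the next free ID.
  unsigned clone_begin(Clone_mode mode) {
    std::lock_guard<std::mutex> guard(m_mutex);
    assert(mode == HA_CLONE_MODE_START ? m_next_task_id == 0
                                       : m_next_task_id > 0);
    return m_next_task_id++;
  }

  // Hands out the next unsent chunk index; returns false when none are left.
  // (Assumed work-distribution scheme, for illustration only.)
  bool next_chunk(std::size_t &chunk) {
    std::lock_guard<std::mutex> guard(m_mutex);
    if (m_next_chunk == m_chunks) return false;
    chunk = m_next_chunk++;
    return true;
  }

 private:
  std::mutex m_mutex;
  std::size_t m_chunks;
  std::size_t m_next_chunk = 0;
  unsigned m_next_task_id = 0;
};

// Returns, indexed by task_id, how many chunks each task copied.
inline std::vector<std::size_t> run_parallel_clone(std::size_t chunk_count,
                                                   unsigned extra_threads) {
  Shared_snapshot snapshot(chunk_count);
  std::vector<std::size_t> copied(extra_threads + 1u);

  // Every task runs the same copy loop; each writes only its own slot.
  auto copy_loop = [&](unsigned task_id) {
    std::size_t chunk;
    while (snapshot.next_chunk(chunk)) ++copied[task_id];
  };

  // The first thread begins the clone with HA_CLONE_MODE_START.
  const unsigned main_task = snapshot.clone_begin(HA_CLONE_MODE_START);

  // Additional threads join with HA_CLONE_MODE_ADD_TASK.
  std::vector<std::thread> workers;
  for (unsigned i = 0; i < extra_threads; ++i) {
    workers.emplace_back([&] {
      const unsigned task_id = snapshot.clone_begin(HA_CLONE_MODE_ADD_TASK);
      copy_loop(task_id);
    });
  }

  copy_loop(main_task);
  for (auto &w : workers) w.join();
  return copied;
}
```

How the chunks are split between tasks varies from run to run, but the per-task counts always add up to the total number of chunks, since each chunk is handed out exactly once.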