Difference between SHARED and LOCAL filesystem - WorkflowSim/WorkflowSim-1.0 GitHub Wiki
In SHARED mode, each data center has a single shared storage. In LOCAL mode, each VM also has its own local file system in addition to that storage.
For stage-in: in SHARED mode, a stage-in job moves all the input files to the shared storage at the beginning. In LOCAL mode, the input files of each task are moved to the VM that the task has been assigned to, either from the nearest VM that holds them (since each VM has a local file system as well) or from the shared file system, if available.
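The source selection described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not WorkflowSim's actual code: the class `StageInSketch`, the `Mode` enum, and the helper `pickSource` are hypothetical names, and the sketch assumes that "nearest" is given as a list of candidate sites ordered by distance from the target VM, with the shared storage treated as one of those sites.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from WorkflowSim itself.
class StageInSketch {
    enum Mode { SHARED, LOCAL }

    static final String SHARED_STORAGE = "shared";

    // Returns the site a task should read `file` from.
    // holders: file name -> sites that currently hold a copy (assumed input).
    // sitesNearestFirst: candidate sites ordered by distance from the target VM (assumed input).
    static String pickSource(Mode mode, String file,
                             Map<String, Set<String>> holders,
                             List<String> sitesNearestFirst) {
        if (mode == Mode.SHARED) {
            // The initial stage-in job has already placed every input file here.
            return SHARED_STORAGE;
        }
        // LOCAL: take the closest site that holds a copy of the file.
        Set<String> sites = holders.getOrDefault(file, Set.of());
        for (String site : sitesNearestFirst) {
            if (sites.contains(site)) {
                return site;
            }
        }
        throw new IllegalStateException("No replica of " + file);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> holders = Map.of(
                "a.dat", Set.of("vm2", SHARED_STORAGE),
                "b.dat", Set.of(SHARED_STORAGE));
        List<String> nearestFirst = List.of("vm1", "vm2", SHARED_STORAGE);

        System.out.println(pickSource(Mode.LOCAL, "a.dat", holders, nearestFirst));
        System.out.println(pickSource(Mode.LOCAL, "b.dat", holders, nearestFirst));
        System.out.println(pickSource(Mode.SHARED, "a.dat", holders, nearestFirst));
    }
}
```

In this example, `a.dat` is read from `vm2` in LOCAL mode because a nearby VM already holds it, while `b.dat` falls back to the shared storage.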
For data transfer cost: in SHARED mode, the transfer cost is assumed to be already included in the task execution time, so we do not calculate it for each job. We only calculate it for the initial stage-in job. In LOCAL mode, the data transfer cost is added for each job.
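As a rough illustration of this accounting rule, the sketch below adds a transfer term only where the text says one is calculated. It is not WorkflowSim's implementation: `TransferCostSketch`, `Mode`, `transferTime`, and `jobTime` are hypothetical names, and modeling the transfer cost as a time equal to data size divided by bandwidth is an assumption made for the example.

```java
// Hypothetical sketch; none of these names come from WorkflowSim itself.
class TransferCostSketch {
    enum Mode { SHARED, LOCAL }

    // Assumed cost model: transfer time = data size / bandwidth.
    static double transferTime(double sizeMB, double bandwidthMBps) {
        return sizeMB / bandwidthMBps;
    }

    // Total time charged to a job under the given file system mode.
    static double jobTime(Mode mode, boolean isStageInJob,
                          double runtimeSec, double inputMB, double bandwidthMBps) {
        // LOCAL: every job pays for moving its inputs.
        // SHARED: only the initial stage-in job pays; for other jobs the
        // transfer is treated as already included in runtimeSec.
        boolean addTransfer = (mode == Mode.LOCAL) || isStageInJob;
        return addTransfer
                ? runtimeSec + transferTime(inputMB, bandwidthMBps)
                : runtimeSec;
    }

    public static void main(String[] args) {
        // A compute job with a 10 s runtime and 100 MB of input over 50 MB/s.
        System.out.println(jobTime(Mode.SHARED, false, 10.0, 100.0, 50.0));
        System.out.println(jobTime(Mode.LOCAL, false, 10.0, 100.0, 50.0));
        // The stage-in job in SHARED mode: no computation, only the transfer.
        System.out.println(jobTime(Mode.SHARED, true, 0.0, 100.0, 50.0));
    }
}
```

With these numbers, the same compute job is charged 2 seconds more in LOCAL mode than in SHARED mode, which is why the choice of mode matters when comparing data-aware schedulers.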
Why do we need to distinguish them? In practice, a system has either a shared file system such as NFS or a distributed file system such as HDFS. If you have a data-aware algorithm that tries to improve data locality, you need to use LOCAL. Otherwise, if your algorithms do not consider data placement, you can use the SHARED file system to simplify your modeling.