BEP4: Remote Step Submission - Vipyr/BazaarCI GitHub Wiki
30 October 2019 - BEP Created
7 December 2019 - Minor P1 Mockup Updates
A natural next step for any process executor is to allow steps to execute in multiple independent processes. The machine hosting these processes shouldn't matter; each machine only needs to run the same version of Python and the same version of BazaarCI. Offloading work both to local OS processes and to cloud-provided containers, VMs, or functions is potentially very valuable. All of these execution strategies look very similar from the front-end, differing significantly only in the back-end. Each type should be a standalone plug-in that can be installed independently, with its own set of dependencies. User code should be able to declare and describe the submission type and location, then simply convert existing `Step` instances to submit remotely to their described targets.
Implement a `Remote` adapter class which can be specialized with different remote types:
- `MultiprocessingRemote`
- `DockerRemote`
- `AWSRemote`
- `AWSLambdaRemote`
- `AzureRemote`
- `AzureFunctionsRemote`
These `Remote` objects would implement a `__call__(self, step)` method that adds their setup and teardown routines to the `Step` instance via `set_run_behavior`.
Frontend Mockup:

```python
from bazaarci.runner import Graph, Product, MultiprocessingRemote, Step

g = Graph("g")
p = Product("p")
s1 = Step("Step1", g, target=lambda: print("Step1"))
s2 = Step("Step2", g, target=lambda: print("Step2"))
s1.produces(p)  # Step 1 produces p
s2.consumes(p)  # Step 2 consumes p

# Create a MultiprocessingRemote instance
mp_remote = MultiprocessingRemote()
# Inject setup and teardown run_strategy functions for s2
mp_remote(s2)

g.start()  # Begin graph execution
```
Backend Mockup:

```python
import multiprocessing
from functools import wraps


class Remote:
    def __call__(self, step):
        raise NotImplementedError("Remote subclasses must implement __call__(self, step)!")


class MultiprocessingRemote(Remote):
    @staticmethod
    def execute_in_process(func):
        @wraps(func)
        def wrapped(self):
            process = multiprocessing.Process(target=func)
            process.start()
            process.join()
        return wrapped

    def __call__(self, step):
        step.set_run_behavior(skip_if_redundant, wait_for_producers, MultiprocessingRemote.execute_in_process)
```
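Other specializations would follow the same shape, differing only in how the wrapped target is dispatched. As a self-contained illustration (a `ThreadRemote` is not part of the proposed list, and the `Step` stub below is a stand-in for the real `bazaarci.runner.Step`), a thread-backed variant might look like:

```python
import threading
from functools import wraps


class Step:
    """Minimal stand-in for bazaarci.runner.Step, for illustration only."""

    def __init__(self, name, target):
        self.name = name
        self.run = target

    def set_run_behavior(self, *behaviors):
        # Hypothetical implementation: apply each behavior (a decorator)
        # to the current run target, in order.
        for behavior in behaviors:
            self.run = behavior(self.run)


class Remote:
    def __call__(self, step):
        raise NotImplementedError("Remote subclasses must implement __call__(self, step)!")


class ThreadRemote(Remote):
    @staticmethod
    def execute_in_thread(func):
        @wraps(func)
        def wrapped():
            # Run the original target in a worker thread and wait for it.
            thread = threading.Thread(target=func)
            thread.start()
            thread.join()
        return wrapped

    def __call__(self, step):
        step.set_run_behavior(ThreadRemote.execute_in_thread)


results = []
s = Step("Step1", target=lambda: results.append("ran"))
ThreadRemote()(s)  # inject the threaded run behavior
s.run()            # target now executes on a worker thread
```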
For the sake of predictability, this BEP should land only after a way to retrieve the `Step` run behaviors, in order, has been implemented. Ideally, `step.set_run_behavior` would pass the existing behaviors in first. In the short term, this can be accomplished with `setattr(step, "run", MultiprocessingRemote.execute_in_process(step.run))`.