# Asynchronicity
## What is Asynchronicity?
Asynchronous programming simply means writing tasks that can operate concurrently, or at least without holding up the processing of code further along in an application or script. Scripts written while learning programming basics are almost exclusively synchronous, which keeps things simple but also brings limitations.
## Purpose and Use Cases
Asynchronicity allows concurrent execution of code. In general, programs execute so quickly that even when much of a function's work could be done asynchronously, it may not be worth the increased cost in code maintenance, edge-case handling, readability, and more.

However, some tasks are so much slower than typical script execution that performing them asynchronously can drastically improve execution speed, particularly when several slow, unrelated tasks would otherwise run sequentially.
Here are a few situations where you may want to consider asynchronous programming:
- API function calls or anything done through an external network
- UX improvement by reducing user wait time
- Functions, database queries, etc. with long run-times
- Any combination or multiples of the above sequentially
Of course, the main stipulation in deciding whether asynchronous programming is appropriate is whether it is possible at all. It will only work if each task performed asynchronously neither requires the completion of the other tasks first nor affects the way the other tasks behave.
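For instance, if a script needs to make two slow, unrelated calls (such as two external API requests), they can run concurrently. Below is a minimal sketch using the `Task` module covered later on this page; the two anonymous functions are stand-ins for real slow calls and simply sleep to simulate latency.

```elixir
# Stand-ins for two slow, unrelated calls (e.g. external API requests).
fetch_orders = fn -> Process.sleep(2000); :orders end
fetch_rates = fn -> Process.sleep(3000); :rates end

# Both tasks start immediately and run concurrently, so the total
# wait is roughly 3 seconds rather than the 5 seconds a sequential
# version would take.
order_task = Task.async(fetch_orders)
rates_task = Task.async(fetch_rates)

orders = Task.await(order_task)
rates = Task.await(rates_task)
```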
## Methods of Implementation
There are a few ways commonly employed to implement asynchronous programming: through the use of background processing (jobs, queues, and workers), or by using a task supervisor.
### Background Processing
To explain the terminology used here:
A job (or task) is placed into a queue which is being monitored by a worker. The worker performs the job while removing its entry from the queue.
Examples of background processing libraries include ActiveJob and Sidekiq for Ruby and Exq for Elixir.
This method of asynchronous programming may be advantageous if the job's completion has no bearing on the execution of the remaining code in the current script. The job is simply queued and forgotten by the current application runtime.
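As a rough sketch of what this queue-and-forget pattern can look like with Exq (the worker module, queue name, and argument here are illustrative assumptions, and Exq must already be configured and running in the application):

```elixir
defmodule MyApp.FulfillmentWorker do
  # Exq invokes perform/1 with the arguments supplied at enqueue time.
  def perform(fulfillment_id) do
    # The slow work happens here, long after the code that enqueued
    # the job has moved on.
    IO.puts("Processing fulfillment #{fulfillment_id}")
  end
end

# Somewhere in the main application flow: enqueue the job and carry on.
{:ok, _jid} = Exq.enqueue(Exq, "default", MyApp.FulfillmentWorker, [42])
```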
### Task Supervision
Contrary to background processing, task supervision is likely appropriate if the current application runtime needs the result of the asynchronous task, just perhaps not immediately. Elixir has a built-in module, [Task](https://hexdocs.pm/elixir/Task.html), for handling such tasks.
#### Elixir Task
An easy way to use the `Task` module is:

```elixir
Task.async(fn ->
  # Do something
end)
|> Task.await()
```
You can also start multiple tasks at once:

```elixir
[function_1, function_2, function_3]
|> Enum.map(&Task.async(&1))
|> Enum.map(&Task.await(&1))
```
#### Task.await vs Task.yield vs Task.yield_many
##### Task.await

The Task docs can better explain how `Task.await` works, but essentially:

- The function is called as soon as it is passed to `Task.async`
- `Task.await` gives the task a timeout (in ms, default 5000)
- The task is processed
- If it completes, processing is released back to the main application runtime
- If the timeout passes before completion, an error is raised
So while all of the tasks' functions run concurrently, the above snippet completes synchronously: each task is given its timeout sequentially, but the timeout is cut short as soon as the task completes (or the task may already be complete before its timeout even starts).

Therefore a list of tasks can take longer than a single timeout to finish without raising an error, as long as none of the individual timeouts expires. The following example will take almost 14 seconds to finish despite each timeout being 5 seconds.
```elixir
[fn -> Process.sleep(4000) end, fn -> Process.sleep(8800) end, fn -> Process.sleep(13600) end]
|> Enum.map(&Task.async(&1))
|> Enum.map(&Task.await(&1))
```
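By contrast, if an individual timeout does expire before its task completes, the awaiting process errors out, as described in the list above. A minimal sketch (the sleep and timeout values are arbitrary, chosen only so the timeout is guaranteed to expire):

```elixir
# The task sleeps longer than the 5-second await timeout,
# so this call fails after roughly 5 seconds with a timeout exit.
Task.async(fn -> Process.sleep(10_000) end)
|> Task.await(5_000)
```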
##### Task.yield
Unlike `Task.await`, `Task.yield` will not raise an exception if the timeout passes; instead it releases processing back to the main application runtime. The user can then decide how to handle unfinished tasks: shut down the task, wait longer, or whatever is appropriate. Just like `Task.await`, each yield is given its own timeout.

This code segment will take the full 15 seconds, since none of the functions finishes before its respective timeout expires.
```elixir
[fn -> Process.sleep(6000) end, fn -> Process.sleep(11000) end, fn -> Process.sleep(16000) end]
|> Enum.map(&Task.async(&1))
|> Enum.map(&Task.yield(&1))
```
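`Task.yield` returns `{:ok, result}` when the task finished in time, `{:exit, reason}` if it crashed, or `nil` when the timeout expired first, which is what lets the caller decide how to handle stragglers. A small sketch of one such choice, shutting down anything unfinished (discarding the unfinished work is an assumption for illustration, not the only option mentioned above):

```elixir
tasks =
  [fn -> Process.sleep(1_000) end, fn -> Process.sleep(10_000) end]
  |> Enum.map(&Task.async(&1))

Enum.map(tasks, fn task ->
  # Wait up to 2 seconds per task; if it still has not finished,
  # shut it down and record it as unfinished.
  case Task.yield(task, 2_000) || Task.shutdown(task) do
    {:ok, result} -> {:done, result}
    _ -> :unfinished
  end
end)
```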
##### Task.yield_many
In cases where UX may be affected by how efficient your processing is, it can be nice to know exactly what the maximum wait will be, rather than having it depend on the weak links (i.e. slow execution) and the order of tasks in the list. This is where `Task.yield_many` may be most appropriate. Rather than mapping over the tasks, you simply pass the list directly and provide a single total timeout.

The following will allow up to 5 seconds total for the tasks to complete, and its return value will show which ones completed before the timeout. See the docs to better understand how to read the output.
```elixir
[fn -> Process.sleep(6000) end, fn -> Process.sleep(11000) end, fn -> Process.sleep(16000) end]
|> Enum.map(&Task.async(&1))
|> Task.yield_many()
```
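`Task.yield_many` returns a list of `{task, result}` tuples, where each result is `{:ok, value}`, `{:exit, reason}`, or `nil` if that task had not finished when the shared timeout expired. A rough sketch of reading the output (shutting down the unfinished tasks at the end is an assumed choice for illustration):

```elixir
tasks =
  [fn -> :fast end, fn -> Process.sleep(10_000) end]
  |> Enum.map(&Task.async(&1))

tasks
|> Task.yield_many(5_000)
|> Enum.map(fn
  {_task, {:ok, value}} -> {:completed, value}
  {_task, {:exit, reason}} -> {:crashed, reason}
  {task, nil} ->
    # Still running when the shared timeout expired; stop it.
    Task.shutdown(task, :brutal_kill)
    :timed_out
end)
```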
## Concerns
These are some things to keep in mind when using asynchronous processing.
### Shopify Webhooks
Shopify webhooks are a great example of something that should be processed asynchronously, namely via background processing. Shopify will keep resending a webhook until you respond with a 200 status code, and after too many failures it may delete that webhook from your app.

To prevent this from happening, webhooks should simply be accepted with minimal manipulation and queued as jobs for a worker to perform later. Shopify is happy because all your webhooks are acknowledged, and you gain better control over the frequency and maximum attempt count of retries, better logging and error-handling potential, and the ability to choose when to re-run the worker after fixing any errors.
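A rough sketch of that pattern in a Phoenix controller, reusing Exq from above (the controller, worker module, and queue name are illustrative assumptions, and real code would also verify the webhook's HMAC before accepting it):

```elixir
defmodule MyAppWeb.WebhookController do
  use MyAppWeb, :controller

  def create(conn, params) do
    # Do as little as possible here: hand the payload off to a worker
    # and acknowledge immediately so Shopify sees a 200.
    {:ok, _jid} = Exq.enqueue(Exq, "webhooks", MyApp.WebhookWorker, [params])
    send_resp(conn, 200, "")
  end
end
```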
### Race Conditions
If your asynchronous processing manipulates the database, you should be mindful of possible race conditions. As an example from EasyPoints, a job was queued every few minutes via a cron job to award points for any fulfillments over a week old. It was possible for a customer to have multiple fulfillments created within a few-minute window, so the batch job would attempt to award them points for both at the same time.

Two separate instances of the worker responsible for awarding points would read the customer's current balance from the database at the same time and increase it by the awarded amount. Because both read the balance before either had finished, one of the workers overwrote the update the other had performed.

If you are performing batch jobs that manipulate the database, you may want to group all updates to a single entry programmatically, and then perform an update only once per entry per batch.
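A minimal sketch of that grouping step (the fulfillment shape and point values are made up for illustration; in real code the `IO.puts` line would be the single database update per customer):

```elixir
fulfillments = [
  %{customer_id: 1, points: 100},
  %{customer_id: 1, points: 250},
  %{customer_id: 2, points: 80}
]

fulfillments
|> Enum.group_by(& &1.customer_id)
|> Enum.each(fn {customer_id, entries} ->
  # One combined update per customer per batch,
  # rather than one update per fulfillment.
  total = entries |> Enum.map(& &1.points) |> Enum.sum()
  IO.puts("Award #{total} points to customer #{customer_id}")
end)
```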
### Idempotency
Idempotency is particularly important for background processing. Essentially, if a task is performed multiple times in a row, it should have the same impact as if it were performed once.
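Continuing the points example, one way to sketch this is to record which fulfillments have already been handled and skip them on repeat runs. The in-memory `MapSet` below is a stand-in for what would really be a database check:

```elixir
defmodule IdempotentAward do
  # Any fulfillment id already in the awarded set is skipped, so
  # running the job twice with the same input has the same effect
  # as running it once.
  def run(fulfillment_ids, awarded \\ MapSet.new()) do
    Enum.reduce(fulfillment_ids, awarded, fn id, acc ->
      if MapSet.member?(acc, id) do
        acc
      else
        IO.puts("Awarding points for fulfillment #{id}")
        MapSet.put(acc, id)
      end
    end)
  end
end

# The second run awards nothing new.
awarded = IdempotentAward.run([1, 2, 3])
IdempotentAward.run([1, 2, 3], awarded)
```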