# 4. Adding a GenServer Part 2 - abarr/remote GitHub Wiki
After ensuring I have a working test for the `GenServer`, it is back to the requirements.
## Requirements
As it stands, the updated list looks like this:

- `:remote` app launches the `GenServer` when the application starts - DONE
- Holds state `%{max_number: :integer, timestamp: :utc_datetime}` - DONE
  - `:timestamp` starts as `nil` and is updated each time someone queries the `GenServer` - DONE
  - `:timestamp` represents the last time someone queried the `GenServer` - TODO
- Every 60 seconds, the `GenServer` runs a job - DONE
- The `GenServer` updates every user's points value with a random value between 0..100 - TODO
- The `GenServer` updates `:max_number` in the state to a new random number between 0..100 - DONE
- The `GenServer` accepts a call to return Users with points greater than `:max_number`, with a limit of `2`, and the `:timestamp` from the previous call. - TODO
Given the second part of the second requirement depends on requirement number 3, I will come back to it after implementing the update.
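For reference, the state in the second requirement is just a two-key map. A minimal sketch of how it could start out (the shape is assumed from Part 1, with `:max_number` seeded and `:timestamp` left `nil` until the first query):

```elixir
# Illustrative initial state only - :max_number gets a random seed,
# :timestamp stays nil until someone queries the GenServer.
initial_state = %{max_number: Enum.random(0..100), timestamp: nil}
```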
## Updating 1,000,000 Rows
I tested several options.

- I used `Ecto.Repo.update_all/2`:

```
$ iex -S mix
Erlang/OTP 24 [erts-12.0] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1]

Interactive Elixir (1.13.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> Remote.Repo.update_all(Remote.Users.User, set: [points: Enum.random(0..100), updated_at: DateTime.utc_now()])
[debug] QUERY OK source="users" db=3870.1ms decode=0.6ms queue=0.1ms idle=278.6ms
UPDATE "users" AS u0 SET "points" = $1, "updated_at" = $2 [67, ~U[2022-01-10 10:17:03Z]]
{1000000, nil}
```
Unfortunately, all rows were updated to the same random number, so `Ecto.Repo.update_all/2` was not an option without additional looping to generate unique random numbers.
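The cause is worth spelling out: the `set:` keyword list is an ordinary Elixir term, so `Enum.random/1` runs once while the list is being built, and that single result is what gets bound as `$1` for every row. A minimal illustration (no database needed):

```elixir
# The keyword list is evaluated before any SQL is generated,
# so the random draw happens exactly once.
set = [points: Enum.random(0..100), updated_at: DateTime.utc_now()]

# Whatever landed here is the one value every row would receive.
[points: points, updated_at: _ts] = set
```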
- Rather than use `Enum`, I tried using raw SQL as a fragment, generating the number with the Postgres `random` function. I started by working out how to get a random `:integer`:
```sql
-- random() returns a number from 0.0 to 1.0
SELECT random();
-- 0.9474413789530018

-- floor() rounds the random number down
SELECT floor(random());
-- 0

-- multiply by (high - low) before flooring to spread over the range
SELECT floor(random() * (100 - 0));
-- 44

-- but with a low number other than 0, the result can fall below the low bound
SELECT floor(random() * (100 - 20));
-- 11

-- so, to account for variable ranges, add the low value back on
SELECT floor(random() * (100 - 20)) + 20;
-- 52

-- this actually returns type `double precision`, so it is also important to cast the value
SELECT cast(floor(random() * (100 - 20)) + 20 AS integer);
-- 68
```
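The same arithmetic can be mirrored in plain Elixir to sanity-check the bounds (a sketch; note that, like the SQL above, it yields values in `low..high - 1`, because `random()` never reaches 1.0):

```elixir
# floor(random() * (high - low)) + low, translated to Elixir.
# :rand.uniform/0 returns a float in [0.0, 1.0), like Postgres random().
random_in_range = fn low, high ->
  low + :erlang.floor(:rand.uniform() * (high - low))
end

random_in_range.(20, 100)
```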
With a SQL fragment to generate a random `:integer`, I then tested the code:
```elixir
import Ecto.Query

# max_value and min_value are the bounds passed in (e.g. 100 and 0)
update(User,
  set: [
    points:
      fragment(
        "cast(floor(random() * (? - ?)) + ? AS integer)",
        type(^max_value, :integer),
        type(^min_value, :integer),
        type(^min_value, :integer)
      ),
    updated_at: fragment("now() at time zone 'utc'")
  ]
)
|> Repo.update_all([])
```
This worked fine but still took between 5 and 9 secs.
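Timing claims like this are easy to capture with `:timer.tc/1`, which returns the runtime in microseconds. A sketch, with a stand-in function in place of the real update (the names here are illustrative, not the project's code):

```elixir
# Stand-in for the real update; mimics Repo.update_all/2's {count, nil} return.
update_points = fn _min, _max -> {1_000_000, nil} end

# :timer.tc/1 returns {microseconds, result}.
{micros, {count, nil}} = :timer.tc(fn -> update_points.(0, 100) end)
IO.puts("Updated #{count} rows in #{div(micros, 1_000)} ms")
```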
- Create a new table copying the `users` table, update the new table, then `DROP` the old `users` table and `ALTER` the new table to rename it:

```sql
CREATE TABLE updated_users AS
SELECT * FROM users;

UPDATE updated_users
SET points = floor(random() * (100 + 1)), updated_at = now() at time zone 'utc';

DROP TABLE IF EXISTS users;

ALTER TABLE updated_users RENAME TO users;
```
```
# Results
SELECT 1000000
UPDATE 1000000
DROP TABLE
ALTER TABLE
# 5.1s
```

This consistently took more than 5 secs; running each query separately reduced the `ALTER TABLE` to 18ms, but it was still long enough that a call to query the table while the new table was being copied and then updated would result in a timeout.
- Combine the `CREATE` and `UPDATE` queries to reduce the time the rows are locked. The default timeout when calling `GenServer.call/3` is 5 secs, so I was aiming for something that consistently updated 1,000,000 rows in less than half that time. I rewrote my query:
```sql
CREATE TABLE updated_users
(
  id,
  points,
  inserted_at,
  updated_at
) AS
SELECT id, cast(floor(random() * (100 - 0)) + 0 AS integer), inserted_at, now() at time zone 'utc'
FROM users;
```

```
# Result
SELECT 1000000
# 2.2s
```
By combining this with the `DROP` and `ALTER` queries, I updated 1,000,000 rows, giving points a new random `:integer` and `updated_at` the current UTC timestamp, in about 2 secs.
I created two functions to be used by the `handle_info/2` callback:
```elixir
def update_all_users_points(max, min) do
  Enum.each(update_sql(min, max), fn sql ->
    Ecto.Adapters.SQL.query(Repo, sql)
  end)
end

defp update_sql(min, max) do
  [
    """
    CREATE TABLE updated_users (id, points, inserted_at, updated_at) AS
    SELECT id, cast(floor(random() * (#{max} - #{min})) + #{min} AS integer), inserted_at, now() at time zone 'utc'
    FROM users;
    """,
    """
    DROP TABLE users;
    """,
    """
    ALTER TABLE updated_users RENAME TO users;
    """
  ]
end
```
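For context, here is a sketch of the GenServer that could drive these functions every 60 seconds. The module name and state shape are assumptions based on the requirements above, and the call to `update_all_users_points/2` is commented out so the sketch stands alone:

```elixir
defmodule Remote.UserServer do
  # Sketch only - module name and state shape are assumptions.
  use GenServer

  @tick :timer.minutes(1)

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  @impl true
  def init(:ok) do
    # Schedule the first 60-second job.
    Process.send_after(self(), :update, @tick)
    {:ok, %{max_number: Enum.random(0..100), timestamp: nil}}
  end

  @impl true
  def handle_info(:update, state) do
    # Would run the bulk update from above, e.g.:
    # update_all_users_points(100, 0)
    Process.send_after(self(), :update, @tick)
    {:noreply, %{state | max_number: Enum.random(0..100)}}
  end
end
```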