Protocol v4
This document describes the third revision of the Drawpile protocol. It is implemented in Drawpile 2.0.
The goals of this redesign were:
- Better decoupling of client features from the server
- Improved client version agnosticism
- Groundwork for semi-trusted servers
- Simplified and more robust server implementation
In the old protocol version, messages were divided into two types: Meta and Command. Command messages are used to modify the drawing board, while Meta messages are used for things such as user login notifications and ACL changes. The server was also responsible for upholding access controls by filtering out commands sent by locked users or aimed at locked layers. This naturally means the server must keep partial track of the drawing board state (which layers exist) to filter the commands correctly.
This causes several problems:
- Code duplication: the server needs implementations of layer creation, deletion, undo, etc. commands
- More potential bugs due to the above
- Coupling: the server must be able to parse Command messages, which means the client and server must support the exact same protocol
The new protocol divides the Meta message type into two: Control and Meta. Control messages are used only between the client and server and are never added to the session history. The new Meta type is the set of messages that are not Commands, but are to be included in the history (i.e. are recordable), such as user login/logout notifications and chat messages.
All message types are also divided into two categories: Transparent and Opaque. Transparent messages include all Control messages and some Meta messages. The rest of Meta and all of Command messages are Opaque.
Transparent messages are those messages the server must be able to parse or generate. (Administration commands, user login notifications, etc.) Opaque messages are those the server does not need to deserialize, but can treat as binary blobs. It is enough that the server adds the opaque messages to the session history and distributes them in a consistent order.
However, if the server knows the protocol version, it is allowed to validate Opaque messages and disconnect misbehaving clients. Clients MUST NOT rely on this and MUST perform validation themselves.
With the new message types, the protocol version scheme is updated to a namespaced three component (ns:X.Y.Z) semantic versioning scheme. The components are:
- ns - protocol namespace. The server can potentially support non-Drawpile clients.
- X - server version. This is incremented when server compatibility is broken (i.e. non-interoperable changes to Transparent messages)
- Y - major version. This is incremented when non-control messages are changed. If a Transparent meta message is changed, both this and the server version should change.
- Z - minor version. No changes to the messages, but client interpretation of some of them has changed. (e.g. meaning of brush size or brush rendering algorithm changed)
A server should accept any client whose server version (X) matches its own, since the server only needs to understand Transparent messages. All clients in a session should share the exact same version, however.
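The version scheme can be illustrated with a small parser. This is only a sketch: the names ProtocolVersion, parse_version and can_share_session are made up for this example, and the version strings in the usage note are illustrative values rather than ones defined by this page.

```python
from typing import NamedTuple


class ProtocolVersion(NamedTuple):
    namespace: str  # ns: protocol namespace
    server: int     # X: changes when Transparent messages change incompatibly
    major: int      # Y: changes when non-control messages change
    minor: int      # Z: changes when client interpretation changes


def parse_version(text: str) -> ProtocolVersion:
    """Parse a string of the form ns:X.Y.Z, raising ValueError if malformed."""
    namespace, sep, numbers = text.partition(":")
    parts = numbers.split(".")
    if not namespace or not sep or len(parts) != 3:
        raise ValueError(f"not an ns:X.Y.Z version string: {text!r}")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"version components must be numbers: {text!r}")
    x, y, z = (int(part) for part in parts)
    return ProtocolVersion(namespace, x, y, z)


def can_share_session(a: ProtocolVersion, b: ProtocolVersion) -> bool:
    """Clients in one session should share the exact same version."""
    return a == b
```

For example, parse_version("dp:4.20.1") yields the namespace "dp" with the components 4, 20 and 1, and two clients can share a session only when all four fields are equal.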
The new message categorization
- Control
  - Server message (new)
  - Expect n more bytes
  - Graceful disconnect notification
  - Ping
- Meta (transparent)
  - User join
  - User leave
  - Session ownership (new)
  - Session title change
- Meta (opaque)
  - (Recorded) Chat (server message separated to its own message type)
  - General ACL (new)
  - Layer ACL
  - User ACL (new)
  - Interval
  - Move pointer
  - Marker
- Command
  - No changes
In the previous version, the message type number identified whether a message belonged to the Meta or the Command type. In the new version, the server cares only whether a message is Transparent or Opaque, while the client is still more interested in the Meta/Command division.
As before, the range of type numbers is 0..255:
- 0..31: Control messages (transparent)
- 32..63: Meta messages (transparent)
- 64..127: Meta messages (opaque)
- 128..255: Command messages (opaque)
This is similar to the division in the old protocol where 0..127 were Meta messages and 128..255 Command messages. Messages in range 32..255 are recordable.
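The ranges above translate directly into a lookup. The following is a minimal sketch; the function names classify and is_recordable are made up for this example.

```python
from typing import Tuple


def classify(message_type: int) -> Tuple[str, str]:
    """Return (type, category) for a message type number in 0..255."""
    if not 0 <= message_type <= 255:
        raise ValueError("message type must be in 0..255")
    if message_type < 32:
        return ("control", "transparent")
    if message_type < 64:
        return ("meta", "transparent")
    if message_type < 128:
        return ("meta", "opaque")
    return ("command", "opaque")


def is_recordable(message_type: int) -> bool:
    """Everything except Control messages (0..31) goes into the history."""
    return 32 <= message_type <= 255
```

A server only needs the second element of the result, while a client would mostly look at the first.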
Removed features
The following old commands are removed:
- Login handshake: replaced by Server Message command
- User Attributes: replaced by User ACL and Session ownership. Immutable flags (such as Mod and Auth) are moved to Join message.
- Session config: replaced by Server Message and General ACL messages
- Session snapshotting: replaced by session resetting
In the new protocol, the Snapshot message is removed in favor of simpler session initialization and explicit session resetting.
Message format
Even though the server does not have to understand opaque messages, it should still enforce context IDs to prevent users from impersonating each other. To this end, the context ID is moved to the message envelope:
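struct Message {
    uint16_t payloadLength; /* length of the payload in bytes */
    uint8_t messageType;    /* type number in the range 0..255 */
    uint8_t contextId;      /* ID of the sending user; 0 is the server */
    char *payload;          /* payloadLength bytes of message data */
};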
As before, ID 0 is reserved for the server itself and clients that have not yet joined a session. Additionally, ID 255 is reserved for spectator mode clients. (Future feature.)
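As a rough illustration of the envelope, the sketch below packs and unpacks the four byte header. Two things are assumptions not stated on this page: the length field is taken to be big-endian (network byte order) and to count only the payload bytes. The function names are made up for this example.

```python
import struct

# Assumption: big-endian uint16 length, then uint8 type, then uint8 context ID.
HEADER = struct.Struct(">HBB")


def pack_message(message_type: int, context_id: int, payload: bytes) -> bytes:
    """Prefix the payload with the length/type/context ID envelope."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in a 16 bit length field")
    return HEADER.pack(len(payload), message_type, context_id) + payload


def unpack_message(buffer: bytes):
    """Return (type, context ID, payload, rest), or None if incomplete."""
    if len(buffer) < HEADER.size:
        return None
    length, message_type, context_id = HEADER.unpack_from(buffer)
    end = HEADER.size + length
    if len(buffer) < end:
        return None
    return message_type, context_id, buffer[HEADER.size:end], buffer[end:]
```

Because the context ID sits in the envelope, a server can compare it against the ID it assigned to the connection without ever looking inside an opaque payload.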
When a pre-v4 client connects to a v4 server, it will interpret the new contextId byte (which is always zero during login) as the first byte of the message payload. Since the login payload is a UTF-8 encoded string, the zero byte will cause the string to be interpreted as zero length. The client will display an "unsupported server" message and disconnect. Likewise, when a v4 client connects to an old server, the first byte of the hello message will be interpreted as the context ID (which is ignored at this point). Since the old hello message is not a valid JSON document, the new client will also disconnect and show an error message.
Access controls
From the client perspective, the biggest change is how access controls are handled. Previously, the server was the one solely responsible for enforcing user locks. However, now that the command data is opaque and the server no longer tracks the session state, locking must be performed on the client side.
This is accomplished by an ACL filter that sits between the network client and the local session history. The filter maintains its own state (general user locks and layer ACLs) and drops incoming messages from locked users or that target locked layers.
Note that the ACL filter is used only when receiving data from the network. In case of session recordings, the recording is assumed to be prefiltered. (TODO: this means serverside recordings must be filtered before they can be used.)
ACL changes are accepted from users designated as operators (session owners). Session ownership is assigned with the new Session ownership message. This message can be sent by other operators or by the server itself when it assigns initial ownership to the first user.
Clients that continuously misbehave (send commands even though they should know they are locked or send invalid data) can be automatically reported to the server using the Report abuse message. How the server responds is up to the implementation, but typically it should kick users reported by the majority of session participants.
The ACL related meta messages in the new protocol are:
- User ACL: set user specific locks (list of locked users)
- Layer ACL: set layer specific locks (layer general lock and exclusive access)
- General ACL: set lock settings related to the whole session
  - Session wide lock (locks everyone, owners included)
  - Default lock bit (are new users locked by default)
  - Layer control lock (limit layer creation/deletion/etc. to owners)
Clients should default to the most permissive settings. If no ACL messages are sent, everyone is allowed to draw on everything as well as modify layers and resize the canvas.
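The filter described in this section could be sketched roughly as follows. The message representation is entirely hypothetical: messages are shown as already decoded dictionaries, and the keys kind, ctx, owners, locked, layer, exclusive and session_locked are invented for this example. The default lock bit and the layer control lock are left out to keep the sketch short.

```python
class AclFilter:
    """Client-side filter between the network and the local session history."""

    def __init__(self):
        # Most permissive defaults: nobody and nothing is locked.
        self.owners = set()
        self.locked_users = set()
        self.layer_locks = {}  # layer ID -> (locked, exclusive user IDs)
        self.session_locked = False

    def accept(self, msg: dict) -> bool:
        """Update ACL state and return True if msg may enter the history."""
        kind, sender = msg["kind"], msg["ctx"]
        if kind == "session_owner":
            # Transparent message: assumed to be filtered by the server.
            self.owners = set(msg["owners"])
            return True
        if kind in ("user_acl", "layer_acl", "general_acl"):
            if sender not in self.owners:
                return False  # only operators may change access controls
            if kind == "user_acl":
                self.locked_users = set(msg["locked"])
            elif kind == "layer_acl":
                self.layer_locks[msg["layer"]] = (
                    msg["locked"], set(msg["exclusive"]))
            else:
                self.session_locked = msg["session_locked"]
            return True
        if kind == "command":
            if self.session_locked or sender in self.locked_users:
                return False
            locked, exclusive = self.layer_locks.get(
                msg.get("layer"), (False, set()))
            if locked or (exclusive and sender not in exclusive):
                return False
        return True
```

Every client runs the same filter over the same ordered message stream, which is what lets them agree on which commands were dropped without the server tracking any of this state.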
When the hosting user uploads the initial session state, the commands have already been filtered by the client's ACL filter. ACL change commands should not be included, except at the very end to set the configured settings.
Session states
There are four possible states for the session to be in:
Initialization: This is the initial state for a newly created session. While initializing, the hosting client uploads the initial state, which should correspond to the client's own session history. The init-complete command signals the end of the history upload and the server moves to the running state. If new users log in during the initialization state, any (non-control) messages they send are held in a queue. If the hosting user leaves before completing the initial upload, the session will move to the running state.
Running: This is the normal running state. Upon entering this state, any possibly queued messages are added to the session history. Messages received from users are added to the history as normal in this state. When the last client of a non-persistent session leaves, it will move to the shutdown state.
Reset: This state is similar to the initialization state. It can be triggered by session owners with the session-reset command. During reset, all input from users other than the resetter is queued like in the initialization state. The resetter uploads a new state snapshot, which will replace the existing session history. Once the snapshot is complete, it will be sent to all logged in users. If the resetter leaves before finishing the upload, the reset will be cancelled and the server returns to the running state as if nothing had happened.
Shutdown: This is a short-lived state in which the server prepares to shut down the session. No new logins are accepted in this state and when transitioning to it, all clients are disconnected. Once all users have disconnected, the session will be deleted.
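The four states and their transitions can be summarized as a small state machine. This is a sketch under simplifying assumptions rather than a server implementation: messages are opaque values in plain lists, ownership checks for session-reset are assumed to happen elsewhere, and all class and method names are made up for this example.

```python
from enum import Enum, auto


class State(Enum):
    INITIALIZATION = auto()
    RUNNING = auto()
    RESET = auto()
    SHUTDOWN = auto()


UPLOADING = (State.INITIALIZATION, State.RESET)


class Session:
    def __init__(self, host_id, persistent=False):
        self.state = State.INITIALIZATION
        self.uploader = host_id  # the host, or later the resetter
        self.persistent = persistent
        self.users = {host_id}
        self.history = []
        self.pending = []  # replacement history uploaded during a reset
        self.queue = []    # messages held back from other users

    def join(self, user):
        if self.state is State.SHUTDOWN:
            return False  # no new logins while shutting down
        self.users.add(user)
        return True

    def receive(self, user, msg):
        if self.state is State.RUNNING:
            self.history.append(msg)
        elif self.state in UPLOADING and user == self.uploader:
            initial = self.state is State.INITIALIZATION
            (self.history if initial else self.pending).append(msg)
        elif self.state in UPLOADING:
            self.queue.append(msg)

    def init_complete(self, user):
        if user == self.uploader and self.state in UPLOADING:
            if self.state is State.RESET:
                self.history = self.pending  # snapshot replaces old history
            self._enter_running()

    def session_reset(self, user):
        if self.state is State.RUNNING:
            self.state, self.uploader, self.pending = State.RESET, user, []

    def leave(self, user):
        self.users.discard(user)
        if user == self.uploader and self.state in UPLOADING:
            self._enter_running()  # an unfinished reset is simply dropped
        if not self.users and not self.persistent and self.state is State.RUNNING:
            self.state = State.SHUTDOWN

    def _enter_running(self):
        self.history.extend(self.queue)  # release the held messages
        self.queue, self.pending, self.uploader = [], [], None
        self.state = State.RUNNING
```

Keeping the reset upload in a separate list is one way to make cancellation trivial: the old history is never touched until init-complete arrives.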
Semi-trusted server
The division of the protocol to Transparent and Opaque parts makes it possible to encrypt the opaque messages, as the server does not need to be able to parse them. This makes it possible to implement servers where the administrator cannot access the session content. Metadata, such as the users who were logged in and how much they participated, cannot be protected, hence semi-trusted.
This feature is defined on the Session encryption page.
Design constraints:
- Only opaque messages can be encrypted
- Can be used with or without TLS
- Server cannot drop encrypted messages, since many/most encryption modes do not tolerate it
- Server must be able to relay crypto setup info during login
- Direct chat is not encrypted and should not be used in this mode
- New message type for crypto initialization?
Changed messages
With the exception of the context ID move, most messages remain unchanged from the previous protocol version.
Server message
The server message replaces both the login message type and the OPCMD type Chat messages. Server messages are now UTF-8 encoded JSON documents.
Messages sent by the client should always take the form:
{
    "cmd": "command name",
    "args": [ ... ],
    "kwargs": { ... }
}
The args is a list of positional arguments and kwargs is a map of named arguments. Both are optional.
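A client might build these documents with a small helper like the one below. It is only a sketch: make_command is a made-up name, and the resulting bytes would still need to be wrapped in a message envelope before being sent.

```python
import json


def make_command(cmd: str, *args, **kwargs) -> bytes:
    """Encode a client-to-server command as a UTF-8 JSON document."""
    doc = {"cmd": cmd}
    if args:
        doc["args"] = list(args)  # positional arguments, optional
    if kwargs:
        doc["kwargs"] = kwargs    # named arguments, optional
    return json.dumps(doc, ensure_ascii=False).encode("utf-8")
```

For example, make_command("init-complete") produces a document with only the cmd attribute, while make_command("sessionconf", closed=True) adds a kwargs map.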
Supported commands (after login) are:
- init-complete: signals the completion of session initial data upload
- sessionconf: adjust server related session parameters
- kick: kick a user off the session
- list-users: return a list of users (admin tool)
- session-status: return session status report (admin tool)
- kill-session: forcibly shut down the session (moderator only)
- announce-session: announce the session at a listing server
- unlist-session: cancel session announcement
Messages sent by the server should always contain the attribute type.
The form varies by type, but is typically like this:
{
    "type": "message",
    "error": "error code" (when type="error"),
    "message": "message contents or human readable error message"
}
Possible types are:
- message: a generic message (typically shown in the chat window)
- chat: off-the-record chat message
- alert: an alert type message (typically shown in the alert popup balloon)
- error: an error message (in response to a command)
- result: output of a successful command
- sessionconf: session settings update (see more below)
- login: login handshake message
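On the receiving side, a client could check for the type attribute before dispatching on it. The sketch below uses made-up function names and returns a plain string where a real client would update its UI.

```python
import json

# The types listed above.
KNOWN_TYPES = {"message", "chat", "alert", "error", "result",
               "sessionconf", "login"}


def parse_server_message(raw: bytes) -> dict:
    """Decode a server message and insist on the type attribute."""
    doc = json.loads(raw.decode("utf-8"))
    if not isinstance(doc, dict) or "type" not in doc:
        raise ValueError("server message lacks the type attribute")
    return doc


def describe(doc: dict) -> str:
    """Produce a short human readable summary of a server message."""
    kind = doc["type"]
    if kind == "error":
        return f"error {doc.get('error')}: {doc.get('message')}"
    if kind in KNOWN_TYPES:
        return f"{kind}: {doc.get('message', '')}"
    return f"unhandled type {kind}"
```

Treating unknown types as unhandled rather than fatal is a design choice of this sketch; it lets an older client keep working if a server adds new message types.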
Login
The login messages are fairly straightforward conversions of the old login messages to JSON.
See Login process for details.
Session initialization
Previously, the hosting client would send out the initial session state concurrently using Snapshot messages. However, this will not work in the new protocol, since we want to ensure that all Opaque messages sent by a client end up in the session history in the same order. (This is to make session encryption easier.)
Since server controlled session snapshotting is no longer supported, we can do away with the Snapshot message altogether. Instead, a fresh session starts out in a special initialization state. In this state, only the first user (the hosting user) may send commands. If any other user joins while initialization is incomplete, their messages will be held in a buffer until the session is ready.
During initialization, context IDs are not enforced for the hosting user, since they may be uploading a previously recorded session. Once the initial data has been uploaded, the hosting user will inform the server with a server command message:
{
    "cmd": "init-complete"
}
After this, any buffered commands will be added to the history and context IDs will be enforced for all users.
Note that clients connecting later will not be able to distinguish initialization commands from live session commands. Therefore, setting the session's access controls should be done at the very end of the initialization and initialization commands (if inherited from a previous session) should be pre-filtered. The Session Owner Change command should not be used during initialization.
Session initialization may contain User join messages for context IDs other than the hosting user. Once initialization is complete, the server should generate User leave (logout) messages for all non-existent users.
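Working out which users need a generated leave message could look something like this. The sketch assumes the server has reduced the uploaded transparent messages to hypothetical (kind, context ID) pairs; the function name and the pair format are invented for this example.

```python
def users_needing_leave(init_messages, connected_ids):
    """Return context IDs that joined in the upload but are not connected."""
    present = set()
    for kind, context_id in init_messages:
        if kind == "join":
            present.add(context_id)
        elif kind == "leave":
            present.discard(context_id)
    return sorted(present - set(connected_ids))
```

Users who both join and leave within the uploaded history need nothing further, so only the leftover IDs are returned.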
Session reset
Session resetting is a generalisation of the session snapshot feature from the old protocol. Unlike snapshotting, resetting is initiated by a (session owner) client. Like snapshotting, the reset replaces the session history, but while snapshotting was intended to be transparent to clients, resetting can be used to entirely change the session content.
The purpose of session resetting is to reduce the amount of history new users have to download when joining a long running session.
A session owner initiates the reset by sending a server command:
{
    "cmd": "session-reset"
}
Usually, as part of the reset procedure, the initiating client will lock the canvas and disable logins for the duration. (Any clientside locks will be cleared naturally by the reset.)
Upon receiving the session-reset command, the server will clear the session history and go into a session initialization state, like when the session was first created. All input from users other than the resetter will be put on hold until the reset is complete. When the server is ready, it will inform the resetter with a message:
{
    "type": "reset",
    "state": "init",
    "message": "Prepared to receive session data"
}
All further commands from the resetting user will then be treated like in the session initialization phase. Typically, the reset will be a snapshot of the current canvas, but it can also be a new blank slate, for example. As in initialization mode, the user sends an init-complete command to finish the reset. At this point, the server has a complete new session starting point and any queued messages (usually there should be none) can be added to it and all users unblocked.
If the resetter disconnects before initialization is complete, the new state is discarded and the original restored. The session will then continue as if nothing had happened.
The server will prepend the following messages to the reset:
- Join messages for currently logged in users
- Session ownership assignment
- The current session title
The resetter is responsible for setting any opaque access control flags.
When the server has the complete reset snapshot, it will inform all logged in users, the resetter included, before sending the new session content as normal:
{
    "type": "reset",
    "state": "reset",
    "message": "Session reset"
}
Upon receiving this message, clients must clear their session history and canvas content. This includes setting the canvas size to zero and clearing all access control flags, including session ownership status. (Note: this affects only the clientside flags: user count limit and other server enforced settings are not changed.)
Clients should not assume their brush settings (or any other state) were restored in the reset.
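On the client, handling the reset notification amounts to throwing away local session state. The sketch below is illustrative only: ClientSession and its fields are hypothetical stand-ins for whatever state a real client keeps, and user_limit represents a server enforced setting that the reset leaves alone.

```python
from dataclasses import dataclass, field


@dataclass
class ClientSession:
    history: list = field(default_factory=list)
    canvas_size: tuple = (0, 0)
    owners: set = field(default_factory=set)        # session ownership
    locked_users: set = field(default_factory=set)  # clientside ACL flag
    user_limit: int = 20  # hypothetical server enforced setting

    def handle_server_message(self, doc: dict) -> bool:
        """Clear local state on a completed reset; return True if handled."""
        if doc.get("type") == "reset" and doc.get("state") == "reset":
            self.history.clear()
            self.canvas_size = (0, 0)
            self.owners.clear()
            self.locked_users.clear()
            return True
        return False
```

The state "init" variant of the message is deliberately ignored here, since it only tells the resetter that the server is ready to receive data.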
If session encryption is used, all clients must change their keys before continuing. The resetter must do the key change as the first thing in the reset snapshot. Since messages sent before the reset (i.e. those held in the queue during the reset) cannot be decrypted afterwards, the resetter should lock all users, then wait for a while before starting the reset to minimize the chances of any message being enqueued.
Session ownership
Previously, session ownership/operator status was assigned with the User Attributes command, which is now removed.
The payload of the Session ownership (SO) message is a list of context IDs. The list implicitly contains the ID of the user sending the message (i.e. users cannot deop themselves). The new operator list replaces the existing one. List items referencing nonexistent users are silently ignored. (Such errors can happen when a user logs out at the same time they are granted ownership.)
The SO message may only be sent by a session owner or the server itself. Since this is a Transparent message directly related to a concept the server is aware of, the server is in charge of filtering it. Thus, the client ACL filter may accept all SO messages sent by the server.
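The rules for the SO message can be expressed as a small function. This is a sketch with made-up names; it also assumes that the implicit "sender is always included" rule does not apply when the sender is the server (ID 0), which this page does not spell out.

```python
SERVER_ID = 0


def apply_owner_message(sender, payload, owners, users):
    """Return the new owner set after an SO message from sender."""
    if sender != SERVER_ID and sender not in owners:
        return set(owners)  # not allowed to send it: nothing changes
    # The new list replaces the old one; nonexistent users are ignored.
    new_owners = {uid for uid in payload if uid in users}
    if sender != SERVER_ID:
        new_owners.add(sender)  # users cannot deop themselves
    return new_owners
```

For example, owner 1 sending the list [2, 9] in a session with users 1, 2 and 3 results in owners 1 and 2.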
Session config
Server message replaces the old SessionConf and SessionTitle messages. When sent by the client, the message format is:
{
    "cmd": "sessionconf",
    "kwargs": {
        "closed": bool,
        "persistent": bool,
        "preservechat": bool,
        "title": string,
        "nsfm": bool
    }
}
The keyword arguments are optional. Omitted settings will not be changed. Note: the session lock bits were moved to the General ACL message. The sessionconf message contains only information relevant to the server and, unlike ACL messages, should not affect the interpretation or filtering of other messages on the client side. E.g. changing the "closed" flag will not have any effect on the current users, but will determine whether or not the server will let new users in.
The nsfm flag tags the session as age-restricted. Once set, the flag cannot be unset except by resetting the session.
When session settings change, the server will send a message:
{
    "type": "sessionconf",
    "config": {
        (same properties as in sessionconf.kwargs)
    }
}
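On the server side, applying a sessionconf command could be sketched as a partial update. The function name is made up, settings are held in a plain dictionary, and silently ignoring unknown keys is an assumption of this example.

```python
def apply_sessionconf(current: dict, kwargs: dict) -> dict:
    """Return updated settings; omitted keys keep their current value."""
    updated = dict(current)
    for key, value in kwargs.items():
        if key not in current:
            continue  # assumption: unknown settings are ignored
        if key == "nsfm" and current["nsfm"] and not value:
            continue  # once set, nsfm can only be cleared by a reset
        updated[key] = value
    return updated
```

The returned dictionary is what the server would then send back to clients in the config attribute of its sessionconf message.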
Chat
In the old protocol, chat messages could be sent in two modes: direct (off-the-record) or recorded. In direct mode, the message was resent by the server directly to all current session participants and not stored in the session history. In recorded mode, the message was added to the session history like every other command. In protocol v4, the opaque/transparent distinction complicates things. There are two possible solutions:
- Separate messages for direct (transparent) and recorded (opaque) chat
- A single transparent chat message with a flag that determines its type
The problem with the latter approach is that transparent messages cannot be encrypted, and the content of the chat is very much something that should be encrypted. This also means that direct chat cannot (or should not) be used in encrypted mode.
In the new protocol, the first approach is used. Direct chat messages are sent using the server message command.
Sending a direct message:
{
    "cmd": "chat",
    "args": ["chat message content"],
    "kwargs": { ... extra options for clients ... }
}
The server redistributes the message:
{
    "type": "chat",
    "user": user ID,
    "message": "chat message",
    "options": { ... kwargs content ... }
}
The session config setting "preservechat" selects whether clients should send a direct or recorded chat message by default. In encrypted mode, clients should always default to recorded messages, or even disable direct message sending entirely.
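The choice of chat mode and the server's redistribution step can be sketched as follows. Both function names are made up for this example, and the strings "direct" and "recorded" are just labels used here, not protocol values.

```python
def default_chat_mode(preservechat: bool, encrypted: bool) -> str:
    """Pick the chat mode a client should use by default."""
    if encrypted or preservechat:
        return "recorded"  # opaque message, stored in the session history
    return "direct"        # transparent server message, not recorded


def redistribute_chat(sender_id: int, command: dict) -> dict:
    """Turn a client's direct chat command into the server's broadcast."""
    return {
        "type": "chat",
        "user": sender_id,
        "message": command["args"][0],
        "options": command.get("kwargs", {}),
    }
```

A stricter client could go further and refuse to send direct messages at all when encryption is enabled, as suggested above.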