teleop_web - dingdongdengdong/astra_ws GitHub Wiki
# astra_teleop_web

## Detailed Code Diagrams and Explanations for `astra_teleop_web`

This document provides detailed Mermaid diagrams and explanations for the Python files in the `astra_teleop_web` package, focusing on a comprehensive representation of the code flow, split into logical sections for readability.
## astra_teleop_web/src/astra_teleop_web/webserver.py
This script sets up and runs an aiohttp web server that handles WebRTC connections, serves static web files, and streams camera feeds to the client browser. This diagram is split into several parts to detail different aspects of the script.
### WebServer Class Initialization and Server Startup
```mermaid
graph TD
A[Start] --> B["Initialize WebServer Class"];
B --> C["Initialize track dict (head, wrist_left, wrist_right)"];
C --> D["Initialize datachannel dict (control)"];
D --> E["Initialize on_hand, on_pedal, on_control callbacks to None"];
E --> F["Create new asyncio event loop for server thread"];
F --> G["Start server in new thread (asyncio_run_thread_in_new_loop)"];
G --> H["Run Server Coroutine (async)"];
H --> I["Get running asyncio loop"];
I --> J["Create aiohttp Application"];
J --> K["Initialize pc dict {}"];
K --> L["Add on_shutdown hook"];
L --> M["Add POST route /offer"];
M --> N["Add static route / for static files"];
N --> O["Add on_response_prepare hook for CORS headers"];
O --> P{"cert.pem exists?"};
P -- No --> Q["Generate SSL certs (openssl)"];
Q --> R["Create SSL context"];
P -- Yes --> R;
R --> S["Setup aiohttp AppRunner"];
S --> T["Create aiohttp TCPSite (0.0.0.0:9443, ssl)"];
T --> U["Start TCP Site"];
U --> V["Log server start info"];
V --> W["Wait forever (serving)"];
W --> X["End Server Thread"];
```
Explanation:
This section of the diagram details the initial setup and startup of the `WebServer`. The `WebServer` class is initialized, setting up dictionaries to manage WebRTC media tracks and data channels, and initializing callback attributes that will be used to interface with the teleoperation logic. A new asyncio event loop is created specifically for the web server to run in a separate thread, allowing it to operate concurrently with other parts of the application. The `run_server` coroutine, executed within this new loop, sets up the aiohttp web application, defines routes for handling incoming requests (specifically `/offer` for WebRTC signaling), configures serving of static files, adds hooks for handling events like server shutdown and preparing responses (for CORS headers), and sets up SSL for secure connections. Finally, it starts the TCP site listener, making the server accessible.
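The thread-and-loop startup pattern described above can be sketched as follows. This is a minimal illustration, not the actual `asyncio_run_thread_in_new_loop` implementation; `run_in_new_loop` and `fake_server` are placeholder names.

```python
import asyncio
import threading

def run_in_new_loop(coro_factory):
    """Run an async coroutine in a dedicated thread with its own event loop,
    mirroring how the web server is started alongside the main program."""
    loop = asyncio.new_event_loop()

    def runner():
        asyncio.set_event_loop(loop)
        try:
            loop.run_until_complete(coro_factory())
        finally:
            loop.close()

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    return thread

# Example: a stand-in for the run_server coroutine.
results = []

async def fake_server():
    await asyncio.sleep(0.01)  # pretend to serve for a moment
    results.append("served")

t = run_in_new_loop(fake_server)
t.join(timeout=2)
```

The daemon thread means the server dies with the main process, which matches the "loop infinitely in the main thread" pattern shown later in this page.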
### Handling the `/offer` Request

```mermaid
graph TD
AA["POST /offer Request Received"] --> AB["Get JSON parameters from request"];
AB --> AC["Create RTCSessionDescription from offer parameters"];
AC --> AD{"pc dict contains 'head'?"};
AD -- Yes --> AE["Return HTTPBadRequest (Multiple connection)"];
AD -- No --> AF["Create new RTCPeerConnection"];
AF --> AG["Store PC in pc dict with key 'head'"];
AG --> AH["Define on_connectionstatechange handler"];
AH --> AI["Log connection state change"];
AI --> AJ{"Connection State is 'failed'?"};
AJ -- Yes --> AK["Close PC"];
AK --> AL["Delete pc['head']"];
AL --> AM["Set datachannel['control'] = None"];
AM --> AN["Set track dict values = None"];
AN --> AO["Handler End"];
AJ -- No --> AP{"Connection State is 'closed'?"};
AP -- Yes --> AL;
AP -- No --> AO;
AH --> AQ["Define on_datachannel handler"];
AQ --> AR["Log datachannel creation"];
AR --> AS{"Channel Label"};
AS -- "hand" --> AT["Define message handler for 'hand'"];
AT --> AU["Call on_hand callback with parsed message (camera_matrix, distortion_coefficients, corners, ids)"];
AU --> AV["Handler End"];
AS -- "pedal" --> AW["Define message handler for 'pedal'"];
AW --> AX["Call on_pedal callback with parsed message (pedal_real_values)"];
AX --> AV;
AS -- "control" --> AY["Set datachannel['control'] = channel"];
AY --> AZ["Define message handler for 'control'"];
AZ --> BA["Call on_control callback as asyncio task with parsed message (control_type)"];
BA --> AV;
AS -- Unknown --> BB["Raise Exception ('Unknown label')"];
BB --> AV;
AQ --> BC["Create FeedableVideoStreamTrack for head"];
BC --> BD["Add Head Track to PC (sendonly, mid 0)"];
BD --> BE["Create FeedableVideoStreamTrack for wrist_left"];
BE --> BF["Add Wrist_left Track to PC (sendonly, mid 1)"];
BF --> BG["Create FeedableVideoStreamTrack for wrist_right"];
BG --> BH["Add Wrist_right Track to PC (sendonly, mid 2)"];
BH --> BI["Set Remote Description with Offer"];
BI --> BJ["Create Answer"];
BJ --> BK["Set Local Description with Answer"];
BK --> BL["Return JSON Response (sdp, type)"];
BL --> AO;
```
Explanation:
This diagram details the handling of the incoming WebRTC offer from the client. When a POST request is received at the `/offer` endpoint, the server parses the SDP offer and type. It checks whether a connection already exists to prevent duplicates. A new `RTCPeerConnection` is created, and event handlers are set up for connection state changes (to clean up on failure or closure) and for incoming data channels. Based on the data channel's label ("hand", "pedal", "control"), specific message handlers are defined to parse the received JSON data and call the appropriate callbacks (`on_hand`, `on_pedal`, `on_control`). Three `FeedableVideoStreamTrack` instances are created for the camera feeds and added to the peer connection as sendonly transceivers. The server then sets the remote description with the client's offer, creates an answer, sets its local description, and returns the answer in a JSON response to the client, completing the WebRTC signaling handshake.
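The single-connection guard and state-change cleanup can be sketched without the aiortc specifics. The names below (`accept_offer`, `on_connection_state_change`, the `pcs` dict) are illustrative stand-ins for the handler logic, not the actual code:

```python
# Sketch of the duplicate-connection guard and cleanup bookkeeping in the
# /offer handler; the real code stores aiortc RTCPeerConnection objects.
pcs = {}

def accept_offer(key, make_pc):
    """Create and register a peer connection unless one is already active
    (the HTTPBadRequest 'Multiple connection' branch in the diagram)."""
    if key in pcs:
        return None  # caller responds with HTTPBadRequest
    pcs[key] = make_pc()
    return pcs[key]

def on_connection_state_change(key, state):
    """Drop the registered connection on 'failed' or 'closed', matching
    the on_connectionstatechange cleanup branches."""
    if state in ("failed", "closed"):
        pcs.pop(key, None)
```

Because the cleanup also clears the control data channel and track slots, a dropped connection leaves the server ready to accept a fresh offer.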
### `FeedableVideoStreamTrack` Functionality
```mermaid
graph TD
CA["FeedableVideoStreamTrack.feed called (image_with_timestamp)"] --> CB{"Queue Not Full?"};
CB -- Yes --> CC["Put image_with_timestamp in queue"];
CC --> CD["End feed"];
CB -- No --> CE["Try getting oldest item from queue"];
CE --> CF["Mark task done for retrieved item"];
CF --> CG["Log 'lost one image'"];
CG --> CH["Put new image_with_timestamp in queue"];
CH --> CD;
CE -- Queue Empty --> CI["Log 'times fly!'"];
CI --> CH;
DA["FeedableVideoStreamTrack.recv called (async)"] --> DB{"Ready State is 'live'?"};
DB -- No --> DC["Raise MediaStreamError"];
DB -- Yes --> DD["Run q.get in thread executor"];
DD --> DE["Get image_with_timestamp from queue"];
DE --> DF["Mark task done for retrieved item"];
DF --> DG["Create av.video.VideoFrame from image"];
DG --> DH["Set frame pts and time_base"];
DH --> DI["Return frame"];
```
Explanation:
This section focuses on the custom `FeedableVideoStreamTrack` class, which is crucial for injecting video frames into the WebRTC stream from other parts of the application. The `feed` method shows how new image frames are added to an internal queue, prioritizing the latest frame by potentially dropping older ones if the queue is full. The `recv` method, which is called by the `aiortc` library when it needs a frame to send over WebRTC, demonstrates how a frame is retrieved from the queue asynchronously, converted into an `av.video.VideoFrame`, and stamped with presentation time information before being returned.
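The drop-oldest behaviour of `feed` can be illustrated in isolation. The real class subclasses aiortc's `VideoStreamTrack`; this sketch keeps only the queue logic, with an illustrative class name:

```python
import queue

class FeedableQueueSketch:
    """Latest-frame-wins queue, sketching FeedableVideoStreamTrack.feed."""

    def __init__(self, maxsize=1):
        self.q = queue.Queue(maxsize=maxsize)

    def feed(self, image_with_timestamp):
        try:
            self.q.put_nowait(image_with_timestamp)   # fast path: room available
        except queue.Full:
            try:
                self.q.get_nowait()                   # drop the oldest frame
                self.q.task_done()                    # "lost one image"
            except queue.Empty:
                pass                                  # consumer raced us: "times fly!"
            self.q.put_nowait(image_with_timestamp)   # enqueue the latest frame
```

A bounded queue with drop-oldest semantics keeps end-to-end latency low: the encoder always sees the freshest camera frame rather than a backlog.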
### Helper and Camera Feeding Functions
```mermaid
graph TD
EA["control_datachannel_log called (message)"] --> EB{"Control Datachannel exists?"};
EB -- Yes --> EC["Send JSON message via control datachannel"];
EC --> ED["End log"];
EB -- No --> ED;
FA["track_feed called (name, image_with_timestamp)"] --> FB{"Track exists for name?"};
FB -- Yes --> FC["Call track.feed with image_with_timestamp"];
FC --> FD["End track_feed"];
FB -- No --> FD;
FC -- Exception --> FE["Pass (ignore error)"];
FE --> FD;
GA["feed_webserver called in thread (webserver, device)"] --> GB["Open Camera with cv2.VideoCapture"];
GB --> GC["Set Camera Properties"];
GC --> GD{"Loop indefinitely"};
GD --> GE["Read Camera Frame"];
GE --> GF["Convert frame to RGB"];
GF --> GG["Get current time_ns"];
GG --> GH["Prepare image_with_timestamp tuple"];
GH --> GI["Call webserver.track_feed for device name"];
GI -- Success --> GJ["Continue Loop"];
GI -- Exception --> GK["Pass (ignore error)"];
GK --> GJ;
GJ --> GD;
HA["Main execution block"] --> HB["Create WebServer Instance"];
HB --> HC["Start feed_webserver threads for 'head', 'wrist_left', 'wrist_right'"];
HC --> HD["Assign print function to webserver callbacks (on_hand, on_pedal, on_control)"];
HD --> HE["Loop infinitely (main thread)"];
HE --> HF["End Script"];
```
Explanation:
This final section for `webserver.py` details helper functions and the camera feeding process. `control_datachannel_log` provides a safe way to send messages to the client over the "control" data channel if it is available. `track_feed` allows feeding image data to a specific video track, handling cases where the track might not exist. The `feed_webserver` function, designed to run in a separate thread, continuously captures frames from a specified camera device using OpenCV, prepares the image data and timestamp, and calls `webserver.track_feed` to push the frame into the appropriate WebRTC video stream. The main execution block demonstrates how a `WebServer` instance is created, camera feeding threads are started for each camera, and placeholder callbacks are assigned before the main thread enters an infinite loop to keep the process running.
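The per-frame preparation step in `feed_webserver` (colour conversion plus a nanosecond timestamp) can be sketched as below. To keep the sketch dependency-free, `cv2.cvtColor` is replaced by a NumPy channel flip; `prepare_frame` is an illustrative name, not a function in the source:

```python
import time
import numpy as np

def prepare_frame(bgr_frame):
    """Return an (rgb_image, time_ns) tuple, as the feed_webserver loop
    does before calling webserver.track_feed for the device's track."""
    rgb = bgr_frame[:, :, ::-1]  # reverse channel order: BGR -> RGB
    return rgb, time.time_ns()
```

Pairing each frame with `time.time_ns()` at capture time lets the video track later derive a consistent presentation timestamp, independent of queueing delays.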
## astra_teleop_web/src/astra_teleop_web/teleoprator.py
This script defines the core teleoperation logic, processing input from the web interface and interacting with the robot's control system. This diagram is also split for better readability.
### `Teleopoperator` Class Initialization
```mermaid
graph TD
A[Start] --> B["Initialize Teleopoperator Class"];
B --> C["Instantiate WebServer"];
C --> D["Assign Teleopoperator methods to WebServer callbacks (on_hand, on_pedal, on_control)"];
D --> E["Initialize robot control callbacks to None (on_pub_goal, on_pub_gripper, etc.)"];
E --> F["Initialize state variables (teleop_mode, percise_mode, lift_distance, Tscam, Tcamgoal_last, gripper_lock, last_gripper_pos, far_seeing)"];
F --> G["Get solve function from astra_teleop.process"];
G --> H["End Initialization"];
```
Explanation:
This diagram shows the initial setup of the `Teleopoperator` class. It creates an instance of the `WebServer` to handle web communication. Crucially, it links its own methods (`hand_cb`, `pedal_cb`, `control_cb`) to the corresponding callback attributes of the `WebServer`, ensuring that it receives data from the web client. It also initializes several attributes that are expected to be assigned callback functions by the main robot control script, allowing the `Teleopoperator` to send commands and receive state information from the robot. Various state variables, including those related to teleoperation mode, precision settings, pose tracking, and gripper control, are initialized. Finally, it obtains the `solve` function from the `astra_teleop.process` module for performing ArUco pose estimation.
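The callback wiring between the two classes follows a plain attribute-assignment pattern, sketched here with stub classes (the stub names and simplified signatures are illustrative, not the real API):

```python
class WebServerStub:
    """Stand-in for WebServer: exposes callback slots the teleoperator fills."""
    def __init__(self):
        self.on_hand = None
        self.on_pedal = None
        self.on_control = None

class TeleoperatorSketch:
    """Wires its own methods into the web server's callback attributes,
    as Teleopoperator does in its constructor."""
    def __init__(self):
        self.webserver = WebServerStub()
        self.webserver.on_hand = self.hand_cb
        self.webserver.on_pedal = self.pedal_cb
        self.received = []

    def hand_cb(self, data):
        self.received.append(("hand", data))

    def pedal_cb(self, data):
        self.received.append(("pedal", data))
```

This inversion-of-control style keeps the web server free of teleoperation logic: it only invokes whatever callables were assigned to its slots.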
### Handling Input Callbacks (`hand_cb`, `pedal_cb`, `control_cb`)

```mermaid
graph TD
IA["hand_cb Called (camera_matrix, distortion_coefficients, corners, ids)"] --> IB["Call solve to get Tcamgoal transforms (left, right)"];
IB --> IC{"Tcamgoal not None for side?"};
IC -- Yes --> ID["Update Tcamgoal_last for side"];
ID --> IE{"Percise Mode Enabled?"};
IE -- Yes --> IF["Apply Low Pass Filter to Tcamgoal"];
IF --> IG["Update Tcamgoal_last with Filtered Pose"];
IE -- No --> IG2["Update Tcamgoal_last with Raw Pose"];
IG --> IH{"Teleop Mode Active?"};
IG2 --> IH;
IH -- Yes --> II{"Check for valid Tcamgoal_last and Tscam"};
II -- Valid --> IJ["Calculate Tsgoal = Tscam @ Tcamgoal_last"];
IJ --> IK["Call on_pub_goal with Tsgoal"];
IK --> IL["Update Head Tilt based on Lift Distance"];
IL --> IM["Call on_pub_head"];
IM --> IN["End hand_cb"];
II -- Invalid --> IO["Log and Raise Exception"];
IO --> IN;
IH -- No --> IN;
IC -- No --> IN;
JA["pedal_cb Called (pedal_real_values)"] --> JB["Process Pedal Values (clip, normalize)"];
JB --> JC{"Teleop Mode"};
JC -- "arm" --> JD["Process Arm Mode Pedals"];
JD --> JE["Calculate Lift Velocity / Update Tscam and Lift Distance"];
JE --> JF["Process Gripper Pedals / Lock"];
JF --> JG["Update last_gripper_pos if not locked"];
JG --> JH["Call on_pub_gripper"];
JH --> JI["Call on_cmd_vel with 0,0"];
JI --> JJ["End pedal_cb"];
JC -- "base" --> JK["Process Base Mode Pedals"];
JK --> JL["Calculate Linear / Angular Velocity"];
JL --> JM["Call on_cmd_vel with velocities"];
JM --> JJ;
JC -- Other --> JJ;
KA["control_cb Called (control_type)"] --> KB["Log Command Type"];
KB --> KC{"Control Command Type"};
KC -- "reset" --> KD["Update Teleop Mode to None"];
KD --> KE["Reset Gripper Position"];
KE --> KF["Call reset_arm (async)"];
KF --> KG["Call on_reset"];
KG --> KH["Log Reset Event"];
KH --> KI["End control_cb"];
KC -- "done" --> KJ["Call on_done"];
KJ --> KK["Log Done Event"];
KK --> KI;
KC -- "teleop_mode_*" --> KL["Update Teleop Mode / Percise Mode"];
KL --> KI;
KC -- "percise_mode_*" --> KM["Update Percise Mode"];
KM --> KN{"Teleop Mode is Arm?"};
KN -- Yes --> KO["Update Teleop Mode to Arm"];
KO --> KI;
KN -- No --> KI;
KC -- "gripper_lock_*" --> KP["Set Gripper Lock State"];
KP --> KQ["Log Gripper Lock State"];
KQ --> KI;
```
Explanation:
This diagram section details the main callback functions that process input from the web client. `hand_cb` is triggered by incoming ArUco data. It uses the `solve` function to get marker poses, applies filtering in precise mode, and, if a teleop mode is active and valid data is available, calculates the desired end-effector pose in the robot's base frame and sends it to the robot via `on_pub_goal`. It also updates the head tilt. `pedal_cb` handles pedal input, processing values and calculating lift/gripper commands in "arm" mode or base velocities in "base" mode, calling the corresponding robot control callbacks. `control_cb` processes discrete control commands, logging the command, updating the teleop mode, resetting the robot arms, or setting gripper lock states, and calling the relevant robot control callbacks.
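The core pose arithmetic in `hand_cb` is a chain of 4x4 homogeneous transforms. A minimal NumPy sketch (the `translation` helper is illustrative; the real poses also carry rotations):

```python
import numpy as np

def translation(x, y, z):
    """4x4 homogeneous transform with identity rotation (sketch helper)."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def compose_goal(Tscam, Tcamgoal):
    """Tsgoal = Tscam @ Tcamgoal: express the marker-derived goal pose
    (camera frame) in the robot base frame, as hand_cb does before
    calling on_pub_goal."""
    return Tscam @ Tcamgoal
```

Chaining transforms this way means the operator's hand motion, measured in the camera frame, is re-expressed in the robot base frame without any explicit angle arithmetic.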
### Reset and Mode Change Helper Methods
```mermaid
graph TD
LA["reset_Tscam Called (async)"] --> LB["Reset Tscam and Tcamgoal_last"];
LB --> LC{"Wait for new Tcamgoal_last (loop)"};
LC -- Available --> LD{"For Left and Right Side"};
LD --> LE["Get Current EEF Pose (on_get_current_eef_pose)"];
LE --> LF["Get Tcamgoal from Tcamgoal_last"];
LF --> LG["Calculate Tscam = Tsgoal @ inv(Tcamgoal)"];
LG --> LH["Log Tscam"];
LH --> LD;
LD -- Done --> LI["End reset_Tscam"];
LC -- Not Available --> LC;
MA["update_percise_mode Called (percise_mode)"] --> MB["Set percise_mode flag"];
MB --> MC["Update solve function scale"];
MC --> MD["Call reset_Tscam (async)"];
MD --> ME["End update_percise_mode"];
NA["reset_arm Called (async)"] --> NB["Set far_seeing and lift_distance"];
NB --> NC["Get Initial EEF Poses (on_get_initial_eef_pose)"];
NC --> ND{"Wait for Arm to Reach Goal Pose (loop)"};
ND -- Reached --> NE["End reset_arm"];
ND -- Not Reached --> NF{"For Left and Right Side"};
NF --> NG["Check Pos/Rot Distance to Goal"];
NG --> NH{"Within Tolerance?"};
NH -- No --> NI["Log Resetting Info"];
NI --> NJ["Call on_pub_goal"];
NJ --> NK["Call on_pub_gripper"];
NK --> NL["Update Head Tilt based on far_seeing"];
NL --> NM["Call on_pub_head"];
NM --> NN["Sleep"];
NN --> NF;
NH -- Yes --> NF;
```
Explanation:
This section details helper methods within `teleoprator.py` related to resetting the system and changing teleoperation modes. `reset_Tscam` asynchronously resets the camera-to-base transformation by waiting for new marker pose data from the camera and using the robot's current end-effector pose to recalculate the transform. `update_percise_mode` updates the precision setting and triggers a recalculation of `Tscam`. `reset_arm` asynchronously moves the robot arms to a predefined initial pose, waiting until they reach the target within a specified tolerance before completing, and also updates the head tilt.
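The recalibration step in `reset_Tscam` inverts the goal-composition chain: given the robot's current end-effector pose and the observed marker pose, the camera-to-base transform is recovered. A NumPy sketch (function name is illustrative):

```python
import numpy as np

def recalibrate_tscam(Tsgoal, Tcamgoal):
    """Tscam = Tsgoal @ inv(Tcamgoal): recover the camera-to-base transform
    from the current EEF pose and the latest marker observation."""
    return Tsgoal @ np.linalg.inv(Tcamgoal)
```

This is why teleoperation can resume from wherever the arms currently are: the transform is re-anchored so that the operator's current hand pose maps onto the robot's current end-effector pose.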
### Other Helper Methods
```mermaid
graph TD
OA["get_head_tilt"] --> OB["Calculate Head Tilt Angle based on lift_distance"];
OB --> OC["Return Angle"];
PA["error_cb Called (msg)"] --> PB["Call webserver.control_datachannel_log threadsafely"];
PB --> PC["End error_cb"];
```
Explanation:
This final section for `teleoprator.py` describes simpler helper functions. `get_head_tilt` calculates a suitable head tilt angle based on the robot arm's lift distance using a linear interpolation formula. `error_cb` provides a convenient way to send error messages back to the client via the control data channel, ensuring the message is sent safely from any thread.
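A linear-interpolation head tilt could look like the sketch below. The range constants here are placeholders chosen for illustration; the actual values in `teleoprator.py` may differ:

```python
def head_tilt_sketch(lift_distance, lift_lo=0.0, lift_hi=1.0,
                     tilt_lo=-0.5, tilt_hi=0.2):
    """Map lift distance to a head tilt angle (radians) by linear
    interpolation, clamped to the configured range."""
    t = (lift_distance - lift_lo) / (lift_hi - lift_lo)
    t = min(max(t, 0.0), 1.0)  # clamp outside the lift range
    return tilt_lo + t * (tilt_hi - tilt_lo)
```

Clamping keeps the head camera pointed at a sensible angle even if the lift briefly overshoots its nominal range.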
## Interaction between `webserver.py`, `teleoprator.py`, and Static Files

This diagram shows the overall flow of data and control between the client (web browser), server components, and the robot control system.
```mermaid
graph LR
A["Web Browser Client (Static Files)"] -->|HTTP/HTTPS Request| B[WebServer];
B -->|"Serve index.html, .js, .css"| A;
A -->|"WebRTC Offer (SDP, ICE)"| B;
B -->|"WebRTC Answer (SDP, ICE)"| A;
F[Camera Feed Threads] -->|Feed Frames| G[FeedableVideoStreamTrack];
G -->|Stream Video| B;
B -->|WebRTC Video Stream| A;
A -->|"WebRTC Data Channel (hand): ArUco Data from OpenCV.js"| B;
A -->|"WebRTC Data Channel (pedal): Pedal Data"| B;
A <-->|"WebRTC Data Channel (control): Commands/Status"| B;
B -->|Call hand_cb| C[Teleopoperator];
B -->|Call pedal_cb| C;
B -->|Call control_cb| C;
C -->|Call on_pub_goal| D[Robot Control System];
C -->|Call on_pub_gripper| D;
C -->|Call on_pub_head| D;
C -->|Call on_cmd_vel| D;
D -->|Call on_get_current_eef_pose| C;
D -->|Call on_get_initial_eef_pose| C;
D -->|Call on_reset| C;
D -->|Call on_done| C;
D -->|"Sensor Feedback (e.g., Current Pose)"| C;
```
Explanation:
This diagram provides an overview of how the different components of the `astra_teleop_web` system interact. The Web Browser Client first requests static files from the `WebServer` to load the interface. It then establishes a WebRTC connection. Camera Feed Threads capture images from the robot's cameras and feed them into `FeedableVideoStreamTrack` instances, which stream video to the client via the `WebServer`. The client, using JavaScript and OpenCV.js, processes its local camera feed to detect ArUco markers and sends this data, along with pedal inputs and control commands, to the `WebServer` over WebRTC data channels. The `WebServer` receives this data and triggers the corresponding callback methods in the `Teleopoperator`. The `Teleopoperator` processes these inputs, calculates the desired robot actions, and sends commands to the Robot Control System (the robot's low-level control interface) via a set of predefined callbacks. The Robot Control System executes the commands and can provide feedback back to the `Teleopoperator`. This interaction enables a user to remotely teleoperate the robot through the web interface using hand tracking and pedal controls.