LCR AGT Troubleshooting - directedmachines/customer-support GitHub Wiki

Overview

This page describes various scenarios and warning messages that might occur during autonomous operation, enabled through the Autonomous Ground Task (AGT) User Interface.

Symptoms

Motion disabled due to obstacles

The obstacle avoidance (O.A) path planner is a critical component of both manual and autonomous motion. Please review the Obstacle Avoidance Guide to better understand its behavior. The warnings it can raise are explained below.

To troubleshoot:

  • Look at the obstacle avoidance stream to see where obstacles are detected.
  • Ensure there are no exclusion areas defined in the AGT plan.
  • If no obvious obstacles are detected, review the UI stream for the active depth sensor. Detected obstacles may be caused by a corrupted depth image; see LCR Realsense Camera Troubleshooting - Corrupted depth image.

Heading Oscillation

If the LCR is not behaving as expected while inside a row, it is very likely that the plan is misconfigured. In rare cases, it might instead be due to:

  • Drive Train Faults

  • Sensor Faults

  • GPS antenna is not placed on top of the axle, at the center of the solar panel (for example, it got knocked off its magnetic base)

  • Magnetic heading is not correct and GPS-based corrections are not applied. Click on the map in the AGT UI and select "Display magnetic heading corrections" to see the pose estimator heading clusters. Large corrections (over 20 degrees) indicate a need for recalibration

  • To test for GPS or magnetics issues, perform the following steps:

  1. Move the robot into an open area
  2. Create two waypoints connected by a path that follows the North -> South azimuth
  3. Move the robot down the newly created path
  • If the magnetic heading indicator follows the direction of the robot, then the issue is with the GPS and not with the magnetometer. Otherwise, match the symptom to its likely cause:
  1. Magnetic indicator oscillates, even when the robot is stationary: a calibration is likely needed (see the Calibration Guide)
  2. Position indicator (green arrow) moves or jumps on the map, even when the robot is stationary: the GPS antenna is likely not connected properly to the GPS board, or the antenna has fallen off
  3. Robot path oscillates when navigating a straight path between WPs: the GPS antenna is not placed at the center of rotation
  • If the GPS reads a red "N/A", perform the following steps:
  1. In the manual UI, click the cog in the bottom left and search "Serial" for the serial device list (dashboard/devices/serial), and "System" for the system log (core/management/system-log).
  2. In the serial device list, the device appears under state(text) as "/dev/serial/by-id/usb-u-blox_AG_-_www.u-blox.com_u-blox_GNSS_receiver-if00": "../../ttyACM2". If it is not found, the GPS is unplugged; open an issue in GitHub to have it plugged back in.
  3. In the system log, search (ctrl+F) for "GPS". You should find "[GPS_ROV_HUB_BOARD configured in GPS_NORMAL mode]". If you don't, the GPS is not configured properly and the LCR should be restarted. If this does not fix the issue, create a GitHub issue for the GPS to be inspected.
  4. If everything is functioning properly, open the GPS Service Stream (click on stream(text)) and verify that it is updating by refreshing the page and checking that the values change. If it is not updating, restart the LCR. If this does not work, try to reboot the LCR. If the reboot fails, open an issue in GitHub for the LCR to be manually inspected.
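
The red "N/A" checks above reduce to a short decision sequence. The following Python sketch is illustrative only: it assumes you have pasted the text of the serial device list and the system log into strings and have already observed whether the GPS Service Stream is updating. The function name triage_gps_na and its return messages are hypothetical and are not part of the LCR software.

```python
# Illustrative triage of the red "N/A" GPS steps. Inputs are plain text copied
# from the UI pages; nothing here talks to the robot.

# Device name and log marker are quoted from the troubleshooting steps.
GPS_DEVICE = "usb-u-blox_AG_-_www.u-blox.com_u-blox_GNSS_receiver-if00"
GPS_CONFIGURED = "GPS_ROV_HUB_BOARD configured in GPS_NORMAL mode"


def triage_gps_na(serial_list_text: str, system_log_text: str,
                  stream_is_updating: bool) -> str:
    """Return the next action suggested by the troubleshooting steps."""
    if GPS_DEVICE not in serial_list_text:
        return "GPS unplugged: open a GitHub issue to have it plugged back in"
    if GPS_CONFIGURED not in system_log_text:
        return "GPS not configured: restart the LCR"
    if not stream_is_updating:
        return "GPS stream stale: restart the LCR, then reboot if needed"
    return "GPS looks healthy"


if __name__ == "__main__":
    serial_text = '"/dev/serial/by-id/' + GPS_DEVICE + '": "../../ttyACM2"'
    log_text = "[GPS_ROV_HUB_BOARD configured in GPS_NORMAL mode]"
    print(triage_gps_na(serial_text, log_text, stream_is_updating=True))
```

The checks run in the same order as the steps, so the first failing check determines the suggested action.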

For configuration related causes

  • Carefully review task settings (implement width, set task speed lower) and parameter domain settings
  • Review Guide 4 and Guide 5, section relating to structured row navigation
  • Tune the obstacle avoidance controller gains and/or the topological planner gains. This should be done by a software team member only.

Well tuned parameters, minimum oscillation (more than usual due to sky puppet)

Big Oscillation, Topological planner PID gains set too low, motion is under-damped

Slow motion in rows

When AGT is active, the robot's translation speed is dictated by the following:

  • Task Speed percent, set in Task Settings. A value of 20% is a good option for rows, for wheeled robots
  • Obstacle avoidance - proximity to obstacles will override the task speed and cause autonomy to slow down the robot

Obstacle avoidance uses a combination of parameters to determine translation speed, but it's important to balance how far the robot can "see" against translation speed. If the robot appears too slow, the bias may be set so that the robot gets very close to obstacles, or maxDetectionDistanceZMM may be set too high. A value of 5000mm is good for most cases; seeing farther in tight spaces will result in throttling down.

Please use the Point Cloud visualization to determine the obstacle avoidance impact to autonomy, explained in this guide

AGT UI map will not load or loads in offline mode (blue markers/lines)

Notifications

"Confirm direction (Axle / Casters)"

After switching Autonomous Ground Task (AGT) plans, or following a period of being idle, you must set the direction of the plan.

"select destination or route"

This is shown when first loading a plan, or after selecting "Clear route" on an AGT plan. Click on a waypoint or work area to set the destination. See Autonomous Task Guides for more information: https://github.com/directedmachines/customer-support/wiki/Land-Care-Robot#operation

Warnings

Please search this page for the text seen on the AGT screen, to find the correct sub section

"Communication error, reload page"

The UI lost its connection; refresh the page to fix. If the error persists, troubleshoot connectivity.

"Plan mismatch, reload page" or "Invalid version"

The UI has a plan loaded that does not match what the software runtime is using. This can happen when multiple users are using AGT and changing plans on the same robot. Only one user should actively use AGT to edit or change plans; other users should observe only.

"Too far from nearest path or waypoint (xx m)"

The LCR is too far from a known path or WP. Either expand the plan by adding a WP and path to the LCR's current location and then attempt autonomy, or drive the LCR closer using manual control.

"invalid initial position: move closer to a work area corner"

When creating or modifying a work area, the LCR must be close to one of the corners (vertices) of the polygon, for work area autonomy to start.

"Danger:obstacle detection disabled in " + label

This warning is displayed during an AGT work area where physical obstacles are ignored. This is dangerous and should only be done in areas with no known obstacles. To clear the warning, set "avoid obstacles" to all in the Work Area Settings.

"Obstacles detected, path blocked"

The depth sensor is reporting obstacles in the path. See LCR AGT Troubleshooting - Motion disabled due to obstacles

"left / right obstacle very close"

An obstacle is within the minimum safe distance on the left or right. Sometimes this can be a false positive caused by a corrupted depth sensor image (for example, due to lighting conditions or a failed calibration). Use the point cloud visualization and depth camera UI to see if the depth image is corrupted. Also, review depth sensor troubleshooting.

Screen Shot 2022-10-14 at 8 46 40 AM

"failure applying bias"

The LCR is inside a parameter domain that has been configured to apply a bias. Obstacle avoidance requires an object to be detected on the side of the bias, otherwise it disables motion. If an obstacle blocks the path, bias can also fail.

Misaligned, so structure is not seen on the bias side

Screen Shot 2022-10-14 at 2 03 11 PM

Obstacle on other side of bias (middle of row)

Screenshot 2023-07-13 at 2 29 06 AM

Diagnosis

  • Examine the point cloud view (remember to zoom out, so you can see far obstacles). Check whether obstacles are seen on the biased side
  • Use the color and depth images to determine if the robot is offset or not oriented properly. For example, in the image below we have a right-side bias, but the robot is facing left and is not seeing any obstacles to the right
  • Use manual control, carefully, to re-center and re-orient the robot so it sees structure on the biased side
  • If AGT keeps losing orientation, further parameter tuning might be required, for example reducing obstacle avoidance gains

Shadows Causing False Positive

In some cases a shadow with a defined line that extends into the horizon can confuse the sensors. The sensors will pick this up as a chasm in the depth view, because they attempt to project a ground plane based on the stereo sensors.

Whenever the bias fails or false obstacles are seen, ensure they are correlated with the point cloud. Obstacles that seem to appear below the ground plane show up as orange dots on the point cloud map, as seen in the image below. For further explanation, please see: Depth-Sensors-and-False-Obstacles

image

"Obstacle avoidance fault, check Manual Control UI"

Go to manual UI and follow LCR-Manual-Control-Troubleshooting

"camera sensor not detecting a ground surface"

Possible causes:

  • The robot camera is pitched up, looking mostly at sky. As a safety feature, autonomy will stop
  • Image is corrupted, likely camera failed to calibrate due to being blinded by direct sunlight

Mitigation

camera looking at sky

In the Manual UI, check if the robot is indeed pitched up. If so, simply drive forward until the camera sees ground again. Below are examples of the robot looking mostly at sky.

camera depth image corrupted

Navigate to the point cloud UI and check the depth image. If it appears corrupted and the robot was facing the sun, move the robot so it faces away from the sun, then restart the runtime. After the restart, verify the depth image. If the navigation path is aligned with the sun and the sun is low on the horizon, the depth camera sensor might continue having this issue, and navigation will need to be delayed until illumination conditions change.

"Fault detected, check Manual Control UI"

Go to manual UI and follow LCR-Manual-Control-Troubleshooting

"Unsafe to move, switch to Manual Control UI"

Go to manual UI and follow LCR-Manual-Control-Troubleshooting

"Motion disabled, enable in Manual Control UI"

Robot motion is software-disabled. This automatically happens after a reboot or triggering the E-stop. Switch to the manual control UI and enable motion.

"Weak GPS signal, task paused"

  • Potential fixes
    • Ensure the LCR has a clear view of the sky
    • Check that the GPS antenna is in place on top of the panel.
    • Wait about a minute for GPS quality to improve.

"GPS Sensor Fault"

  • The GPS sensor does not have a fix and autonomy will be disabled until the fault is cleared.
  • See the image to the right that shows the LCR indoors, the warning "GPS Sensor Fault" and the red "N/A" under the GPS accuracy in the top right-hand corner.
  • Potential fixes
    • Ensure the LCR has a clear view of the sky
    • Ensure the GPS antenna is intact and the SMA connector is tightly connected to the GPS board. Restart the robot from the UI.
    • Ensure the GPS board USB cable is connected to the USB Hub and that the usb-u-blox_AG_-_www.u-blox.com_u-blox_GNSS_receiver-if00 device shows up when running ls -al /dev/serial/by-id while SSH'd into the robot.
    • Check that the blue power LED is lit up, and the "GPS FIX" LED is flashing 1x/s. Details here.
    • Power cycle the GPS board by unplugging and replugging the USB connector. Restart the robot from the UI.
    • Enable the GPS configuration through CAP by setting the state variable initialConfigStage="CONFIGURE_GPS" and restarting the robot from the UI. You can also add "options":["CONSOLE_LOGGING"] to the ublox configuration file to see detailed logs.
    • Download UBLOX u-center on Windows (not u-center 2), connect the USB to the computer, and confirm you can receive GPS NMEA messages. If the problem persists, you can reinstall the latest firmware (install instructions here).
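
The device check in the list above (ls -al /dev/serial/by-id) can also be scripted when logged into the robot. This is a minimal sketch that only looks for the device name quoted above in a directory listing; the function name gps_device_present is hypothetical, and the directory is a parameter so the check can be exercised off-robot.

```python
import os

# Device name quoted from the troubleshooting list.
GPS_DEVICE = "usb-u-blox_AG_-_www.u-blox.com_u-blox_GNSS_receiver-if00"


def gps_device_present(by_id_dir: str = "/dev/serial/by-id") -> bool:
    """Return True if the u-blox GNSS receiver entry exists in by_id_dir."""
    try:
        return GPS_DEVICE in os.listdir(by_id_dir)
    except OSError:
        # The directory is missing or unreadable, e.g. when no USB serial
        # devices are attached at all.
        return False


if __name__ == "__main__":
    if gps_device_present():
        print("GPS board detected")
    else:
        print("GPS board not detected: check the USB cable and hub")
```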

"High current detected on Axle AUX Motor"

Self Disable Causes

Obstacles

"Obstacle encountered, policy set to STOP, switching to idle"

"Obstacle encountered during rotation"

  • If the obstacle was encountered during a zero turn and it is closer than half the implement width from the active "forward" end of the LCR, it will cause autonomy to stop. This behavior cannot be overridden, but you can change the implement width in task settings.
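
The half-implement-width rule above can be written as a one-line comparison. This is a sketch of the stated rule only, not the planner's actual code; the function name is hypothetical, and the 1415 mm width is borrowed from the implWidthMM value in the example plan tags later on this page.

```python
def obstacle_stops_rotation(distance_from_forward_end_mm: float,
                            implement_width_mm: float) -> bool:
    """Apply the stated rule: stop if the obstacle is closer than half the
    implement width, measured from the active "forward" end of the LCR."""
    return distance_from_forward_end_mm < implement_width_mm / 2.0


if __name__ == "__main__":
    width_mm = 1415.0  # half of this is 707.5 mm
    print(obstacle_stops_rotation(600.0, width_mm))  # closer than 707.5 mm
    print(obstacle_stops_rotation(800.0, width_mm))  # farther than 707.5 mm
```

Under this rule, reducing the implement width in task settings shrinks the distance at which an obstacle stops a zero turn.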

"missing forward depth profile"

There is no depth reading in the direction of motion. Troubleshoot the camera in the direction of motion: LCR Realsense Camera Troubleshooting

"depth sensor is idle"

Depth sensor is turned off when idle to save power. This fault should clear on its own, otherwise troubleshoot the camera in the direction of motion: LCR Realsense Camera Troubleshooting

"obstacle very close or depth sensor fault"

Obstacles are present in the direction of motion, or depth sensor for that direction (casters or axle FWD) has a fault. See LCR AGT Troubleshooting - Motion disabled due to obstacles

"path blocked, check depth sensors, obstacles in area"

Obstacles are present in the direction of motion, or depth sensor for that direction (casters or axle FWD) has a fault. See LCR AGT Troubleshooting - Motion disabled due to obstacles

"near obstacles in the rear (of current FWD direction)"

Obstacles are present opposite the direction of motion (casters or axle FWD), or depth sensor for that direction has a fault. See LCR AGT Troubleshooting - Motion disabled due to obstacles

"Disabled due to path blocked for 5 seconds"

Obstacles are present in the direction of motion, or depth sensor for that direction (casters or axle FWD) has a fault. See LCR AGT Troubleshooting - Motion disabled due to obstacles

GPS/IMU

"Low GPS accuracy (%d mm)"

This is a warning that pauses autonomy. Autonomy should resume on its own; otherwise, debug the GPS sensor.

"Magnetic calibration required"

See https://github.com/directedmachines/customer-support/wiki/LCR-Sensor-Troubleshooting#calibration

"Inertial (IMU) Sensor Fault"

See https://github.com/directedmachines/customer-support/wiki/LCR-Sensor-Troubleshooting#calibration

"Pose Estimation (GPS and/or IMU) Fault"

Generic catch-all for pose-estimator issues. Please check:

  • Pico Brain (004216) serial device is listed in the serial devices list
  • Pico Brain Stream Sample is updating with compass / IMU data

You might need to open the caster-side enclosure and check the USB-C connection to the Pico Brain device.

Electrical

"power draw too low on AUX motor"

This alert is sent when the AUX is enabled but the power draw is too low. It can be caused by the mower height being set too high so it is not cutting, by a broken mower belt, or by a loose mower blade.

"possible stall, current draw high on AUX motor"

This alert is sent when the AUX is enabled and the current draw is too high. It is usually caused by the mower stalling when something is blocking the motion of the blades. For our mower see the Indication of a Stalled Mower Troubleshooting Section.

"drive motors stalling, adjust implement/attachment"

This alert is sent when there is large current draw on the drive motors when turning. It is usually caused by the implement being too low to the ground and getting stuck. Raise the implement and resume autonomy.

Safety Features

"Large heading error, switching to IDLE"

Magnetic heading (combined with the GPS sensor) indicates that the robot orientation is too far from the path for the current waypoint.

"Lack of progress in waypoint navigation"

This is a general catchall for the LCR not progressing along a path. Check the path is not blocked and the robot can be driven in manual mode.

"lack of progress in work area"

This alert is sent when an autonomous task is completing a work area but the completion percentage is not increasing. This can be caused by skinny plans or if the LCR becomes stuck.

"low power"

The LCR is below 20% power. When in low-power mode the robot can drive but it will be slower and AUX will be disabled.

"web client idle for too long"

This happens when the state variable 'clientIdleThresholdMicros' is set in the 'navigation-autonomous-ground-tasks-default.json' and the AGT UI page has been idle for too long. This is a useful safety feature to auto-disable supervised autonomy if there is a spotty LTE connection.

Issues with Plan

"Invalid plan (at least one waypoint not connected)"

All waypoints and work areas must be connected for a path to be valid. See: LCR Autonomous Ground Task Guide Level 3

"Large plan, loading ..."

A plan is loading. If this is seen often you can split the plan into multiple smaller plans.

"destination not specified"

No destination selected, pick a waypoint or work area.

"plan ID does not match state: %s"

The plan ID does not match the most recent plan ID from the cloud. Go to the AGT UI, select View Auto Task -> synchronize to fix.

"W[X] does not reach A[X]"

Ensure that a waypoint is connected to a work area by directly clicking on a point of the work area (e.g. A0) when creating waypoint paths. See: LCR Autonomous Ground Task Guide Level 3

Misc

"completion threshold reached"

This alert is sent when an autonomous task is completing a work area and it reaches the completion threshold. If the task stopped earlier than desired, you can edit the work area setting "completion (%)". See: LCR Autonomous Ground Task Guide Level 1 - Work area settings

"destination reached: W[0|1| ...]"

This alert is sent when the destination waypoint is reached as expected.

Advanced Topics

Plan Copy between existing plans

When creating a new plan, you can use the Create / Restore option to copy an existing plan. If, however, you are trying to replace the contents of an existing plan (for example, to update it using a different plan), you can use a JSON file drag and drop.

Steps:

  • Load the plan you want to use as the source in the Fleet Management Plan UI
  • Open the JSON of that plan by removing the /ui suffix from the URL and opening it in a browser tab (for example: http://directedmachines.com/navigation/graphs/remote/mbb-marg5-puppet-time:SITEORCH-XXXXXXXX)
  • Click on the tab with raw JSON and "Download" as a JSON file
  • Edit the JSON file in a text editor and modify the "documentSelfLink" field value to match exactly the name of the plan you are using as the target:
    • Original: "documentSelfLink": "/navigation/graphs/remote/bad-name-inv2202-oa:SITESOLR-xxxxxxxxx"
    • Edited value: "documentSelfLink": "/navigation/graphs/remote/better-name-inv2202-oa:SITESOLR-xxxxxxxxx"
  • Save the JSON file contents
  • Use the fleet plan UI to load the plan you want to update with new contents (an empty plan, for example)
  • Click on the "Telemetry" button to show the telemetry time range dialog
  • Drag and drop the JSON file from your file explorer, anywhere in that dialog
  • The page will reload with the new plan contents. The plan name remains that of the target plan (the one loaded in the UI when you dragged and dropped the JSON file)
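
The manual "documentSelfLink" edit in the steps above can be scripted instead of done in a text editor. This is a minimal sketch under one assumption not confirmed by this page: that "documentSelfLink" is a top-level field of the downloaded JSON document. The function name retarget_plan and the file names in the usage comment are hypothetical.

```python
import json


def retarget_plan(source_path: str, target_self_link: str,
                  output_path: str) -> dict:
    """Copy a downloaded plan JSON, pointing documentSelfLink at the target."""
    with open(source_path, encoding="utf-8") as f:
        plan = json.load(f)
    # Assumption: documentSelfLink sits at the top level of the document.
    if "documentSelfLink" not in plan:
        raise KeyError("documentSelfLink not found at the top level")
    plan["documentSelfLink"] = target_self_link
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(plan, f, indent=2)
    return plan


# Example usage (hypothetical file names):
# retarget_plan(
#     "source-plan.json",
#     "/navigation/graphs/remote/better-name-inv2202-oa:SITESOLR-xxxxxxxxx",
#     "target-plan.json",
# )
```

The output file is then the one to drag and drop into the Telemetry dialog.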

Telemetry dialog where you drag and drop the JSON file

Screenshot 2023-08-05 at 8 43 47 AM

Plan Version Recovery

Plans are versioned documents that are indexed in every fleet node, with the latest version also present on edge devices: all robots associated with the same site for example, will have local versions matching the last time each synchronized to fleet nodes. The "undo" button in the UI, asks the NavigationGraphService, which is the code component managing plans, to restore a previous documentVersion and make it the most recent.

When a plan gets overwritten by a new version, on occasion, UNDO might not work. The previous versions, however, all exist in the fleet nodes index and can be retrieved as JSON documents, which can then be used to restore the latest version.

Recovery Steps

  • Log in to fleet management and navigate to the plan editor for the plan you wish to recover. For a deleted plan, please see the next section
  • Copy the link portion of the URL from the web browser address bar, as seen below (replace the "XXXX" with the proper robot id):
    • Complete URL: https://directedmachines.com/navigation/graphs/remote/default:LCR24ZS0-XXXXXXXX/ui
    • Link portion: /navigation/graphs/remote/default:LCR24ZS0-XXXXXXXX
  • In a different tab in the web browser, fill in the following URL, which issues a query to the index for the latest version (implicitly), and press enter to load the JSON document:
    • https://directedmachines.com/core/document-index?documentSelfLink=/navigation/graphs/remote/default:LCR24ZS0-XXXXXXXX
  • Scroll down the JSON document and note the documentVersion field: "documentVersion":78
  • Update the query URL to include a documentVersion query parameter, using the previous version:
    • https://directedmachines.com/core/document-index?documentSelfLink=/navigation/graphs/remote/default:LCR24ZS0-XXXXXXXX&documentVersion=77
  • Download the contents of the page by right-clicking somewhere in that tab and selecting Download file. This saves the JSON document as a file
  • In the original tab, with the plan UI loaded, drag and drop the JSON file into the Map Update dialog (click on the map to show the dialog, while in "frozen" mode)
  • plan contents are now updated to those of the JSON file
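
The URL manipulation in the recovery steps can be sketched in Python. This only builds the query string described above and does not contact the fleet nodes; the function names are hypothetical, and the endpoint is copied from the example URLs in the steps.

```python
from urllib.parse import urlsplit

# Index endpoint as it appears in the example URLs of the recovery steps.
INDEX_ENDPOINT = "https://directedmachines.com/core/document-index"


def link_from_ui_url(ui_url: str) -> str:
    """Extract the link portion of a plan editor URL, dropping a /ui suffix."""
    path = urlsplit(ui_url).path
    return path[: -len("/ui")] if path.endswith("/ui") else path


def previous_version_url(ui_url: str, latest_version: int) -> str:
    """Build the index query for the version just before latest_version."""
    if latest_version < 1:
        raise ValueError("no earlier version to recover")
    link = link_from_ui_url(ui_url)
    return (f"{INDEX_ENDPOINT}?documentSelfLink={link}"
            f"&documentVersion={latest_version - 1}")


if __name__ == "__main__":
    ui = ("https://directedmachines.com/navigation/graphs/remote/"
          "default:LCR24ZS0-XXXXXXXX/ui")
    print(previous_version_url(ui, 78))
```

With the documentVersion of 78 from the example, the printed URL ends in documentVersion=77, matching the query shown in the steps.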

Plan Deletion Recovery

The AGT Plans are stored as JSON files on each robot and also replicated on the fleet management nodes. When a plan is marked deleted, we add a tag, pendingDelete, in the tags section.

Using fleet management plan editor

The user-friendly way to restore a plan uses the fleet management plan editor (not available on the edge device AGT UI). Steps:

  • Login to fleet management
  • Click on site / robot associated with plan that was deleted
  • Select an existing plan, to load the plan edit UI
  • Click on the map
  • Click on "Create / Restore Plan"
  • Type the exact name of the plan that was deleted
  • Page will refresh and plan will load, with a warning: "Pending Delete"
  • To restore the plan, click "Edit Plan"

Screenshot 2023-07-09 at 8 39 48 PM Screenshot 2023-07-09 at 8 40 06 PM

Plan synchronization failure

If a plan is associated with a site, loading the AGT UI will cause a synchronization with the fleet cloud nodes. Sometimes the synchronization fails. To force a plan edit, add the following URL parameters and reload AGT. Then proceed to edit the plan, which will force synchronization with the fleet nodes:

?allowEdit&forceEdit

Using SSH and direct edit of JSON file

To recover the plan:

  • SSH into robot
  • edit plan json file, in dCentralizedSystems/cap-config/config
  • remove pendingDelete from the tags section
  • remove siteId from the tags section
  • save file
  • restart runtime
  • load AGT UI, and select the recovered plan
  • Edit plan
  • make a small change, anywhere, to cause changes to propagate to fleet nodes
  • Associate siteId (if applicable) with plan
  • freeze plan

Section in plan showing the tags and pendingDelete tag to be removed:

Before

"tags": { "googleMapTilt": "0", "reset": "false", "id": "LCR24ZS0-xxxxxxxxx", "siteId": "SITEORCH-74xxxxxxxx", "undoVersion": "85", "googleMapTypeId": "hybrid", "reqGenerateEdgeSamples": "true", "implWidthMM": "1415", "obstaclePolicy": "AVOID", "graphSyncTimeEpochSeconds": "1686597626", "pendingDelete": "true" },

After

"tags": { "googleMapTilt": "0", "reset": "false", "id": "LCR24ZS0-xxxxxxxxx", "undoVersion": "85", "googleMapTypeId": "hybrid", "reqGenerateEdgeSamples": "true", "implWidthMM": "1415", "obstaclePolicy": "AVOID", "graphSyncTimeEpochSeconds": "1686597626" }

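The tag cleanup shown in the Before and After snippets can also be done with a short script rather than a hand edit. This is a sketch under the assumption that "tags" is a top-level object in the plan JSON file, which this page does not confirm; the function name clear_delete_tags and the file name in the usage comment are hypothetical.

```python
import json


def clear_delete_tags(plan: dict) -> dict:
    """Remove the pendingDelete and siteId tags from a loaded plan, in place."""
    # Assumption: "tags" is a top-level object of the plan document.
    tags = plan.get("tags", {})
    tags.pop("pendingDelete", None)  # no error if the tag is already absent
    tags.pop("siteId", None)
    return plan


# Example usage on the robot (hypothetical file name):
# with open("my-plan.json", encoding="utf-8") as f:
#     plan = json.load(f)
# clear_delete_tags(plan)
# with open("my-plan.json", "w", encoding="utf-8") as f:
#     json.dump(plan, f, indent=2)
```

Loading the file with json.load also catches syntax problems, such as a missing comma, before the runtime is restarted.
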
AGT Page Reload Looping

If the AGT page partially loads and then automatically refreshes each time you attempt to load it, the robot is stuck in a reload loop. To resolve this, SSH into the robot and delete the host-specific default graph. Once SSH'd into the robot, run the following command: rm config/navigation-autonomous-ground-tasks-default.<hostname>.json where <hostname> is the robot's identification.
