Motion_Planning - RicoJia/notes GitHub Wiki

========================================================================

Basic Flow (academia, old school)

========================================================================

  1. Front End - Path Planning (High Dimension -> Rough Low Dimension Solution)

    • Discrete Space
    • Methods:
      1. A*, Dijkstra
      2. Jump Point Search (based on A*; in most cases it's better, but there are exceptions.)

      3. Sampling Based Method
        • PRM
        • RRT (sample random points, connect each new point to the nearest node in the tree, and hope for convergence; not optimal)
        • Optimal Sampling-based Methods (RRT*)
        • Informed RRT*
      4. Kinodynamic Path Finding: consider kinematic constraints
        • state-to-state boundary value optimal control problem (BVP)
        • State Lattice Approach (high dimension A*)
        • Kinodynamic RRT* (Common)
        • Hybrid A* (Common Method)
  2. Back End - Trajectory Generation & Optimization

    • Continuous Space
    • Methods
      1. Minimum SNAP Trajectory Generation
        • Differential Flatness?
        • Closed form solution to SNAP
        • Time Allocation
      2. Soft- and Hard-Constrained Trajectory Optimization. What constraints? Dynamic constraints
        • Soft: a penalty that encourages satisfying the constraint
        • Hard: strictly enforcing it.
      3. Markov Decision Process based (MDP)
        • Uncertainties & cost,
        • Minimax Cost
        • Value iteration, Real time Dynamic Programming
      4. MPC (Model Predictive Control)
        • Linear MPC
        • Non-linear MPC
  3. Other Navigation Methods:

    • End-to-End Navigation: directly from sensor information to navigation commands. Active research, but not practical yet
    • Navigation based on neural nets
  4. Now: realtime 3D point generation, estimate cost to get to a point

    1. Basic Requirements:
      • collision free
      • Kinematically executable
      • Smooth
    2. A good roboticist:
      • If the problem doesn't exist, it's not a problem
      • Must be an engineer first, know how things are implemented
      • Hot topics are just hot, don't follow them blindly
      • Don't just do things on simulation, do it in real life
      • Prefer simple yet effective solutions
      • Be honest, face problems
      • Either don't do it, or do it well
      • Before criticizing, try it.

========================================================================

Search Based Methods

========================================================================

Basic Concepts

  1. C-space (configuration space) is the space that contains all robot configurations.

    • a robot is a point in C-space.
    • need to represent the obstacles in C-space too. They can be simplified as balls (conservative, with inflation).

  2. Graph Search

    1. Setup: a map is a set of nodes with connectivity (undirected, cyclic, weighted)

    2. So finding a path means searching a huge graph, which yields a search tree. The tree might be too big, so we need heuristics to prune the graph

    3. Then backtrack from the goal node

  3. Completeness: the algorithm can find a path if at least one exists.

BFS, DFS

  1. BFS, DFS: use a queue / stack, add nodes one by one, and pop nodes FIFO / LIFO respectively

    - BFS is usually preferred, because DFS does more redundant search. When the cost of every edge is 1, BFS finds the best path.
  2. But what if the edge costs are not all 1? Use heuristics (estimated cost to the goal) in best-first search algorithms: Greedy Best-First Search, Dijkstra, A*

    • Cost has to do with energy and terrain
    • Open list (nodes to be examined), closed list (nodes that have been examined)
    • Open list can use: std::multimap, std::priority_queue, std::make_heap

Best-First Search Methods

  1. Greedy Best-First Search

    • Not optimal, because it doesn't consider the cost already paid to reach the current node.
    • But can be complete with a closed list, so it doesn't get stuck in a loop

    • Procedure
      1. Open List with start node, closed list.
      2. Loop until the open list is empty
        1. Pop the first node in the priority queue,
          • mark it as visited
          • and put it in closed list
        2. If the node is goal, quit
        3. For each neighbor node, If "unvisited":
          • mark as to_visit
          • calculate h(x) and update it with min(h(x), h(x)_0)
          • push on to priority queue
      3. Back track from goal node in closed_list
  2. Dijkstra: all nodes may be explored, like BFS.

    • uses only the cost-to-come g(x) of the current node; it can't see how far we are from the goal

    • Procedure: similar to greedy, except h(x) -> g(x) (existing cost)

  3. A*: same as Greedy, except the priority becomes f(x) = g(x) + h(x)

    • As long as the heuristic is less than or equal to the actual cost (admissible), the result is optimal
      • L2 norm, L3 norm... and the zero heuristic (Dijkstra) are all admissible, but they're not close to the true cost
    • In practice, designing the heuristic is what we need to spend time on.
    • A* is more goal-oriented
    • Weighted A*: use epsilon * h(x). Can be orders of magnitude faster than A*. (A minimal A* sketch appears at the end of this section.)
  4. Heuristics:

    • There is a closed-form "best heuristic": the cost with no obstacles (i.e., diagonal distance)

    • Tie breaker: prefer straight line over big turns: (similar triangle when straight)

    • Heuristics Comparison
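
A minimal Python sketch of grid A* with the open list as a heap and the diagonal-distance heuristic from above. The `free(cell)` predicate and the 8-connected move set are assumptions for illustration; setting `w > 1` gives weighted A*.

```python
import heapq
import math

def octile(a, b):
    # Admissible "no-obstacle" heuristic on an 8-connected grid (diagonal distance).
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return (dx + dy) + (math.sqrt(2) - 2) * min(dx, dy)

def a_star(start, goal, free, w=1.0):
    """free(cell) -> bool says whether a cell is traversable; w > 1 gives weighted A*."""
    moves = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    g = {start: 0.0}
    parent = {start: None}
    open_list = [(w * octile(start, goal), start)]   # entries are (f = g + w*h, node)
    closed = set()
    while open_list:
        _, cur = heapq.heappop(open_list)
        if cur in closed:
            continue
        closed.add(cur)
        if cur == goal:                      # back-track from the goal via parents
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for dx, dy in moves:
            nxt = (cur[0] + dx, cur[1] + dy)
            if not free(nxt) or nxt in closed:
                continue
            g_new = g[cur] + math.hypot(dx, dy)
            if g_new < g.get(nxt, float("inf")):
                g[nxt] = g_new
                parent[nxt] = cur
                heapq.heappush(open_list, (g_new + w * octile(nxt, goal), nxt))
    return None  # no path: the open list emptied without reaching the goal
```
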

Jump Point Search (JPS)

  1. Jump Point Search: truly outperforms the tie breaker, by not expanding neighbors that can be reached at the same cost g(x) some other way. The biggest difference from A* is the neighbor list: "jump neighbors"

    1. Philosophy:
      1. Why choose a point as a neighbor if there's a cheaper / equally expensive way to get to it?
      2. We only focus on the horizontal, vertical, and diagonal directions, because the other rows, columns, and diagonal cells will be visited later
      3. Prefer straight lines: horizontal, diagonal, vertical.
      4. We may visit more cells, but queue pushes and pops are only O(log n)
    2. Reality: it can be many times slower than A* if the map is big
  2. Forced neighbors: neighbors that can only be reached cheapest through the current node, because the alternative route is blocked by an obstacle. Therefore, a node with at least one forced neighbor must be evaluated in the future.

    1. Two types of nodes with forced neighbors:

      • For completeness, the goal is treated as a node with at least one forced neighbor
      • by pattern matching
    2. add the node with a forced neighbor to OL.

      • if we hit an obstacle/wall right on, it's not considered a node with forced neighbors

    3. Now you must be wondering: are we missing something if we don't add the obstacle/wall? Nope. think: when there's no obstacle, goal node will be added in one scan as a "node with a forced neighbor". Then it gets evaluated, popped, and the path will be a simple straight line!

  3. Our goal is to find nodes with forced neighbors, scanning in 8 directions, as new jump-point neighbors

  4. For one quadrant (out of 4):

    1. from the current "look-ahead" base point, look diagonally by one cell. If that cell has a forced neighbor, add it to the open list

    2. from the current "look-ahead" base point, look straight. We only look straight, and not at the row above/below us, because those are cheaper to reach later (when we move 1 cell diagonally)

    3. Before this step, if the current look-ahead base point has a forced neighbor, add it to the OL. Otherwise, if still nothing is found, move diagonally (a small jump-function sketch follows this list)

  5. The BEST tutorial I found is actually this, with pseudocode
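
A small sketch of the forced-neighbor pattern matching and the recursive jump scan described above, assuming a `free(cell)` predicate on a uniform-cost 8-connected grid (returning False outside the map). The full planner still needs pruned-neighbor selection and the A* loop; this only shows how a jump point is found.

```python
def has_forced_neighbor(free, x, y, dx, dy):
    # Pattern matching for forced neighbors on a uniform-cost 8-connected grid.
    if dx != 0 and dy != 0:      # diagonal move
        return (not free((x - dx, y)) and free((x - dx, y + dy))) or \
               (not free((x, y - dy)) and free((x + dx, y - dy)))
    if dx != 0:                  # horizontal move
        return (not free((x, y + 1)) and free((x + dx, y + 1))) or \
               (not free((x, y - 1)) and free((x + dx, y - 1)))
    # vertical move
    return (not free((x + 1, y)) and free((x + 1, y + dy))) or \
           (not free((x - 1, y)) and free((x - 1, y + dy)))

def jump(free, x, y, dx, dy, goal):
    """Walk from (x, y) in direction (dx, dy); return the next jump point or None."""
    nx, ny = x + dx, y + dy
    if not free((nx, ny)):                 # blocked or off the map: nothing to add
        return None
    if (nx, ny) == goal or has_forced_neighbor(free, nx, ny, dx, dy):
        return (nx, ny)
    if dx != 0 and dy != 0:
        # A diagonal step is a jump point if a straight scan from it finds one.
        if jump(free, nx, ny, dx, 0, goal) or jump(free, nx, ny, 0, dy, goal):
            return (nx, ny)
    return jump(free, nx, ny, dx, dy, goal)
```
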

State Lattice Methods

  1. From the current state, steer at different angles

  2. Disadvantages:

    • Graph is too big, takes too long

Hybrid A* (TODO)

  1. Basic Principle:
    1. Maintain the grid. But within each grid cell:

      • keep the best state-lattice node that falls in that cell
      • use that state-lattice node for expansion
    2. Illustrations

========================================================================

Sampling Based Methods

========================================================================

PRM

  1. PRM (early): sample random points, remove redundant points, and hope the roadmap converges

RRT

  1. OMPL example: https://ompl.kavrakilab.org/OptimalPlanning_8cpp_source.html (a minimal RRT sketch also appears at the end of this section)

    1. Sample a new point: not occupied, goal-biased (select the goal with a certain probability)
    2. Find the nearest node in a KD tree
    3. X_new is the point in X_sample's direction from X_nearest, a fixed step x_dist away?
    4. Add X_new to the KD tree if the segment X_nearest -> X_new is collision-free
      • Save the parent of each node
    5. If X_new is close to the goal, backtrack
  2. Disadvantages: In cluttered space, might take a long time to sample there.

  1. nice example: https://github.com/AtsushiSakai/PythonRobotics#optimal-trajectory-in-a-frenet-frame
  2. Frenet-Serret frame planning?
  3. Ego Graph
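
A minimal RRT sketch following the steps above (goal-biased sampling, nearest lookup, steering, collision check, parent bookkeeping, back-tracking). `collision_free` and `sample` are assumed helpers, and a linear nearest-neighbor search stands in for the KD tree.

```python
import math
import random

def rrt(start, goal, collision_free, sample, step=0.5, goal_bias=0.1,
        goal_tol=0.5, max_iters=5000):
    """Minimal 2D RRT. collision_free(a, b) checks the segment a->b;
    sample() draws a random free point. Points are (x, y) tuples."""
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        x_sample = goal if random.random() < goal_bias else sample()   # goal biasing
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], x_sample))      # nearest node
        x_near = nodes[i_near]
        d = math.dist(x_near, x_sample)
        if d < 1e-9:
            continue
        # steer: x_new lies toward x_sample, at most `step` away from x_near
        x_new = tuple(n + min(step, d) / d * (s - n) for n, s in zip(x_near, x_sample))
        if not collision_free(x_near, x_new):
            continue
        nodes.append(x_new)
        parent[len(nodes) - 1] = i_near        # save the parent for back-tracking
        if math.dist(x_new, goal) < goal_tol:  # close enough to the goal: back-track
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None
```
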

RRT*

  1. Working Principle: when adding a new point, 1. find the minimum-total-cost vertex within a radius to be its parent, 2. rewire the graph (see the sketch after this list)

    • Consider the total cost through the nearest few neighbors: if f(a->new) < f(b->new) < f(c->new), then a connects to new

    • Rewire: if going through the new node is cheaper for an existing neighbor, re-attach that neighbor to the new node

    • After you find a solution, keep iterating.

  2. Advantages: will (asymptotically) converge to an optimal path

  3. Disadvantages:

    • Vertex search and rewiring take a lot of time!
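
A sketch of the two extra RRT* steps (choose the minimum-cost parent within a radius, then rewire), assuming the tree is stored as index-keyed `parent`/`cost` dicts and `near_ids` already holds the neighbors inside the radius. `collision_free(a, b)` is an assumed helper.

```python
import math

def choose_parent_and_rewire(nodes, parent, cost, x_new, near_ids, collision_free):
    """One RRT* step after x_new has been sampled and steered.
    nodes: list of points; parent/cost: dicts keyed by node index."""
    new_id = len(nodes)
    # 1. choose the parent that minimizes the cost-to-come of x_new
    best_parent, best_cost = None, float("inf")
    for i in near_ids:
        c = cost[i] + math.dist(nodes[i], x_new)
        if c < best_cost and collision_free(nodes[i], x_new):
            best_parent, best_cost = i, c
    if best_parent is None:
        return None
    nodes.append(x_new)
    parent[new_id], cost[new_id] = best_parent, best_cost
    # 2. rewire: if going through x_new is cheaper for a near node, re-attach it
    for i in near_ids:
        c_through_new = cost[new_id] + math.dist(x_new, nodes[i])
        if c_through_new < cost[i] and collision_free(x_new, nodes[i]):
            parent[i], cost[i] = new_id, c_through_new
    return new_id
```
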

Informed RRT

  1. Draw an ellipse around the already-found path and keep shrinking the ellipse; this works better than RRT*

RRT Connect:

- Grows trees from both the start and the goal.
- Questions:
    1. How do the two trees connect to each other?
    2. In cluttered space, how do they meet?

    <p align="center">
    <img src="https://user-images.githubusercontent.com/39393023/141825998-8ffd1db5-b1e9-4f16-9313-77de546d1987.JPEG" height="600" width="width"/>
    </p>

Trajectory Optimization

  1. Minisnap - Vijay Kumar, getting waypoints, flight through hoops

  2. Working Principle?

    • After generating a trajectory through the path waypoints, if a segment has a collision, insert the midpoint of the two waypoints as a new waypoint and regenerate.

========================================================================

Kinodynamic Planning

========================================================================

  1. Kinematics: position, collision with obstacles. Dynamics: force, acceleration, velocity. Why kinodynamic?

    • The old-school pipeline is coarse path planning followed by fine trajectory optimization. But if path planning is a bit finer, trajectory optimization becomes easier
    • Consider this scenario:

  2. Car Models

  3. Kinodynamic RRT*:

    • Path added as a curve.
    • In RRT, the straight-line edge might hit a collision, but in kinodynamic RRT* the corresponding feasible curve may not.

State Lattice Planning

  1. A lattice graph is not just a grid plus theta: each node is a state (x, y, ...), and each edge is an action. Two ways to build it: sample in control space, or sample in state space.

  2. Sampling in control Space. Easy to implement, but you need to forward simulate each state.

    1. For every state in search tree, sample a control u

    2. Integrate motion, see if there're collisions

    3. Add the motion as an edge to the graph

    4. E.g., drones get their input as acceleration or jerk (the derivative of acceleration). The whole model is an "integrator". Since A is nilpotent, its powers eventually become zero, so forward simulation has a closed form (see the integrator sketch after this list).

    5. Disadvantages:

      1. Not goal oriented, you just randomly take actions
  3. Sampling in State space: Higher efficiency, needs to solve "boundary value problem"(BVP)

    1. Given start state and interval T, for every neighbor,

    2. solve for the control.

    3. Repeat the same for the outer neighbors.

    4. Question: why two layers, if the next waypoint is not known yet?
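
Referring back to the integrator model in the control-space sampling notes above: a small sketch of a 1D triple integrator with jerk input, whose A matrix is nilpotent (A^3 = 0), so forward-simulating each sampled control edge has an exact closed form. The model and numbers are illustrative assumptions.

```python
import numpy as np

# 1D triple integrator: state [p, v, a], input u = jerk.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
B = np.array([[0.0], [0.0], [1.0]])

def propagate(x0, u, T):
    """Closed-form state after applying constant jerk u for time T."""
    eAT = np.eye(3) + A * T + (A @ A) * T**2 / 2              # series stops: A^3 = 0
    # integral of e^{A(T-s)} B ds for constant u
    int_term = (np.eye(3) * T + A * T**2 / 2 + (A @ A) * T**3 / 6) @ B
    return eAT @ x0 + int_term.flatten() * u

# sample a few candidate jerks from the current state and forward-simulate each edge
x0 = np.array([0.0, 1.0, 0.0])
edges = {u: propagate(x0, u, T=1.0) for u in (-1.0, 0.0, 1.0)}
```
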

========================================================================

Trajectory Optimization

========================================================================

  1. Why do we still need to optimize the path? Because Hybrid A* might not give you the best path

    • We need a smooth path, i.e., the velocity needs to be continuous.
  2. Differential Flatness - for a system x_dot = f(x, u), if an algebraic combination of some of (x, u) and their derivatives can represent the whole system, then that subset of (x, u) is called the flat outputs

  3. Quadrotor Example of differential flatness

    1. 12 states, including xyz, RPY, and their velocities in the body frame. Then Newton's equation describes the net force, and Euler's equation the net torque. Note the rotor forces must be perpendicular to the rotor plane

      • Notations: [w_x, w_y, w_z] is the angular velocity expressed in the BODY frame. zW = [0, 0, 1]; zB is the z column of R_WB, i.e., of the body frame [xB, yB, zB] expressed in the world frame

      • From Newton's eqn, we can get [xB, yB, zB] (see the sketch after this list)

    2. Getting w x zB: we can see it can also be represented using (x, y, z) and their derivatives

      • Here, u1 is the known input thrust, a_dot is x''', and zB is the Z axis of the body frame expressed in the world frame. So w x zB can be represented by (x, y, z, theta) and their derivatives of various orders

      • Represent wx, wy in body frame using (x,y,z, theta)

      • Represent wz in body frame using (x,y,z, theta)

  4. Differential Flatness: the drone is an underactuated system, so we can represent the system using 4 parameters (flat outputs).

  5. Main controller layout: note we plan in the space of (x, y, z), their first and second derivatives, and yaw (theta above, psi here). Our job in trajectory optimization is to figure out these 4 sets of values
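
For reference, the Newton step mentioned above can be written as follows (standard minimum-snap-style notation; treat the exact symbols as assumptions since the original figures are not reproduced here):

```math
m\ddot{\mathbf{r}} = -mg\,\mathbf{z}_W + u_1\,\mathbf{z}_B
\;\;\Rightarrow\;\;
\mathbf{z}_B = \frac{\mathbf{t}}{\lVert\mathbf{t}\rVert},\qquad
\mathbf{t} = \begin{bmatrix}\ddot{x}\\ \ddot{y}\\ \ddot{z}+g\end{bmatrix},\qquad
u_1 = m\lVert\mathbf{t}\rVert
```

So zB (and, together with the yaw angle, xB and yB) is an algebraic function of the flat outputs and their derivatives.
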

Minisnap

  1. For a 2D example, if we are given p, v, a as the start and end conditions, we need a 5th-order polynomial for each segment. If given p, v, a, j, we need 7th order. Also note the boundary velocity can be non-zero

    But we may want higher order to ensure "smoothness"
  2. Minimum snap: from above, we know the angular acceleration can be represented by snap and higher derivatives. So the cost function J (see below) is the integral of (snap)^2, which is why it's called minimum snap

  3. Smoothness: with N = 5 for one segment, we can minimize jerk. A mistake in the figure below: the constraints should be x, x', x''. Then we can write x''' in terms of p and t, and minimize it.
  4. Important note: for numerical stability, we should solve each segment with its time starting from 0 (i.e., shift each segment so it starts at T0 = 0)

  5. We define the cost J as the integral of the squared 4th derivative of the trajectory over time. Eventually, we want to minimize this.

  6. The constraints come in two types: derivative and continuity; derivative constraints can represent continuity implicitly

    1. Derivative: when velocity, accelerations are given, we can write the following matrix for one trajectory (say [0, T])

    2. Continuity: when velocity and accelerations are not given, we take the above, and apply this constraint:

  7. GOAL: This is a convex optimization problem:

    • J definition: for ALL trajectories:

    • And now we are trying to optimize J over all polynomial parameters P.

  8. A few challenges:

    1. Challenge 1: Numerical instability. Solution: convert P to x_dots, such as velocity and acceleration

    2. We use a matrix to convert params to x_dots, using the conversion matrix C, which also allows us to enforce constraints (p, v, a at start and end, and p at each waypoint)

      • Note C is, and we can enforce the continuity constraints here.

    3. Then J can be written in this form:

    4. Challenge 2: There are fixed variables we don't care about. So we separate the fixed variables and the variables for optimization.

    5. So we can separate J into a fixed part (Xf) and a free part (Xp). Then, we get the derivative of this quadratic form, and get:

  9. This optimization is applied to all waypoints, so it minimizes the whole cost.

  10. If the trajectory hits an obstacle, add another waypoint in the middle and replan. The reasoning: after inserting enough points, the trajectory will get close to the original piecewise path. But there's no guarantee on efficiency, etc.

  11. Convex Solution:

    1. OOQP: a good QP solver, better than MOSEK. CVX is a MATLAB wrapper

Minisnap Example: hw_5, using qp solver

  1. Minisnap setup - because we're trying to optimize J = integral(x''''^2), and x'''' is snap!

  2. Q is still defined as above. Here is its implementation (a small sketch of building Q also appears at the end of this section):

  3. Constraints

    1. The polynomials and their derivatives

    2. "positional" Constraints (derivative constraints)

    3. Continuity Constraints (necessary when the derivative constraints don't already force the end point of one segment and the start point of the next to be the same!)

    4. So to sum up, A_eq is finally:
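
A small sketch of assembling the per-segment Q referenced in item 2, assuming the monomial basis x(t) = sum_i p_i t^i; the actual homework code may order coefficients or scale time differently.

```python
import numpy as np

def snap_cost_matrix(n_order, T):
    """Q for one segment so that J = p^T Q p, with p = [p_0 ... p_n],
    x(t) = sum_i p_i t^i, and J = integral_0^T (d^4 x / dt^4)^2 dt."""
    Q = np.zeros((n_order + 1, n_order + 1))
    for i in range(4, n_order + 1):
        for j in range(4, n_order + 1):
            ci = i * (i - 1) * (i - 2) * (i - 3)   # 4th-derivative coefficient of t^i
            cj = j * (j - 1) * (j - 2) * (j - 3)
            Q[i, j] = ci * cj * T ** (i + j - 7) / (i + j - 7)
    return Q

# e.g. a 7th-order segment of duration 1.0 s
Q = snap_cost_matrix(7, 1.0)
```
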

Hard-Constraint Minisnap

  1. Get Q: the key piece is the M matrix:

========================================================================

MDP based Planning

========================================================================

  1. MDP Probability Model Formalization

    1. There is uncertainty in both sensing and execution.

      • Including where to go when exploring an unknown space; this uses deep reinforcement learning
    2. Depending on the robot's knowledge of this uncertainty, there are two types of MDP:

      1. Probabilistic: there is a model of the noise
      2. Non-deterministic: there is no model of the noise
    3. Defined with: X (state), theta (uncertainty), U (action), L (cost). Our objective is to minimize the total cost.

    4. The state transition is a sum over all possible noise values.

      • In practice, p(x_k+1 | x_k, theta_k, u_k) is set to 1.
    5. Normally we compute the "cost-to-goal", which can be considered the opposite of a reward

    6. E.g.

  2. Non-deterministic: want to minimize max cost-to-goal (minimax)

    • Dijkstra:

      1. Initialize costs G to inf, and G(goal) to 0
      2. Push goal to open list
      3. while not finding start:
        • pop first item from OL
        • Evaluate each neighbor:
          • G_temp(k) = max over theta_k of [ L(x_k, theta_k, u_k) + G(x_k+1) ]; the max refers to the "worst-case noise"
          • G(k) = min(G, G_temp)
      4. Illustration

    • Will it ever get stuck in a loop? No, because costs only add up. Once we find G_temp > G, we stop adding that neighbor and quit the loop

    • A* - just add admissible heuristic to start

    • Disadvantages:

      1. only applies to graph, not on continuous space
      2. Overly pessimistic
  3. Probabilistic

    1. Since we have a model of the uncertainty, we can estimate the expectation of the cost-to-goal (Bellman optimality equation)

    2. If we knew the true cost-to-goal, we could choose the route easily. But we don't know it yet

    3. One method to estimate the true cost-to-goal is "value iteration" (a toy sketch appears at the end of this section):

      1. Initialize every cost-to-goal to 0

      2. Define a (random) evaluation order over the states

      3. While not converged:

        1. for all states:
          • G_temp(k) = min over u_k of { E[ L(x_k, theta_k, u_k) + G(x_k+1) ] }
          • G(k) = min(G, G_temp)
      4. Minimal example

    4. E.g., value iteration

      1. calculations for each iteration

      2. Because the cost is positive, an infinite loop is not the optimal option.

    5. But we don't want to do this for every state. So we have "real-time dynamic programming":

      1. Initialize each state with its distance to the goal
      2. Until the total error < THRE: 1. Pick a route based on epsilon-greedy 2. for all states along it: - G_temp(k) = min over u_k of { E[ L(x_k, theta_k, u_k) + G(x_k+1) ] } - G(k) = min(G, G_temp)
    6. For example

      1. Iteration 1

      2. Iteration 2

      3. Iteration 3

  4. value iteration (VI) convergence proof

    1. This is called the "undiscounted case". We need all costs to be non-negative. Then we know we won't have an infinite loop, since that would make the cost infinite. So we must have a finite solution and finite state values.

    2. BV(x_k) is the new value after one backup; it is <= the value obtained with any fixed u_k.

    3. Then because all costs are positive, from child state to parent state, child state cost must > any parent state cost * probability

    4. Then we know BV(xk) must be closer to V*(xk) than V(xk)
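
A toy value-iteration sketch over the cost-to-goal, following the procedure above. The tabular interfaces `P[s][a]` (a list of `(probability, next_state)` pairs) and stage cost `L[s][a]` are assumptions for illustration, not a specific library.

```python
import numpy as np

def value_iteration(n_states, actions, P, L, goal, tol=1e-6):
    """Estimate the cost-to-goal G for every state by repeated Bellman backups."""
    G = np.zeros(n_states)                # initialize every cost-to-goal to 0
    while True:
        G_new = G.copy()
        for s in range(n_states):
            if s == goal:                 # the goal keeps cost-to-goal 0
                continue
            # Bellman backup: best action's expected stage cost + successor cost-to-goal
            G_new[s] = min(
                sum(p * (L[s][a] + G[s2]) for p, s2 in P[s][a])
                for a in actions
            )
        if np.max(np.abs(G_new - G)) < tol:
            return G_new
        G = G_new
```
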

========================================================================

Model Predictive Control

========================================================================

  1. It's also known as "receding horizon control" (RHC). Receding horizon means "moving time horizon". At each step, you optimize a cost that considers control effort and final state, under disturbances and constraints. This used to be slow, but now there is code generation for RHC solvers.

    1. "Model" includes the system model (like F = ma) and the problem model (like min(dist))
    2. Prediction: what will F be like in 3 s?
    3. Control: choosing the best policy (like choosing the best F at each time)
    4. Summary: we have an associate (robust) controller, plus a trajectory generator that uses a receding horizon (prediction model) to minimize a cost (say the SSE of all states). No disturbance is considered here yet. Output = reference trajectory. Tools: linear MPC and PSO (Particle Swarm Optimization)
  2. Problem Model: final state and process

  3. However, at each time step, how do we represent/predict the state?

    1. zero-order hold (discretization)
    2. polynomial
    3. neural network
  4. Tools for non-convex optimization: Sequential Quadratic Programming (SQP), or Particle Swarm Optimization (PSO)

  5. To optimize quickly, "tube-based MPC" was invented, which has

    • a mathematical "nominal system" model without disturbances.
    • plus an "associate controller" running at a higher frequency to fight the disturbances
    • It is called "tube-based MPC" because it works as long as the error stays within a range (the tube)
    • Also, the MATLAB MPC toolbox can generate C++ code

  6. In MPC, there are 4 major steps (a small sketch of steps 1-2 appears at the end of this section)

    1. Build the prediction model over the time horizon

      • Matrix representation

    2. Set up the optimization: this has an analytic solution. Note: for adding a tracking reference signal r(t) at a given time, Bp becomes Bp - r(t)

    3. Solve the optimization

      • Note: quadprog() only accepts constraints of the form Ax < b, so the constraints need to be rearranged accordingly
    4. Recover the trajectory. If there is a constraint a > -1 and initially a < -1, soft constraints pull it back instead of returning "no solution"

    5. Note: there is overshoot because we are also minimizing the total jerk

    6. Soft Constraints
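
A minimal sketch of steps 1-2 for the unconstrained linear case: stack the prediction model into matrices and solve the quadratic cost analytically. Constrained versions pass H, f plus inequality constraints to a QP solver such as quadprog; the symbols here are illustrative assumptions, not the exact homework notation.

```python
import numpy as np

def prediction_matrices(A, B, N):
    """Stack x_1..x_N as X = Phi x0 + Gamma U for x_{k+1} = A x_k + B u_k."""
    n, m = B.shape
    Phi = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(N)])
    Gamma = np.zeros((N * n, N * m))
    for row in range(N):
        for col in range(row + 1):
            Gamma[row * n:(row + 1) * n, col * m:(col + 1) * m] = \
                np.linalg.matrix_power(A, row - col) @ B
    return Phi, Gamma

def solve_mpc(A, B, x0, X_ref, N, rho=0.1):
    """Unconstrained linear MPC: min ||X - X_ref||^2 + rho * ||U||^2 in closed form."""
    Phi, Gamma = prediction_matrices(A, B, N)
    H = Gamma.T @ Gamma + rho * np.eye(Gamma.shape[1])
    f = Gamma.T @ (Phi @ x0 - X_ref)
    return np.linalg.solve(H, -f)     # stacked optimal inputs u_0..u_{N-1}
```
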

JLT + MPC

  1. Jerk-Limited Trajectory (JLT)

    • Solving for the trajectory directly is very time consuming. Another way is to view this as a boundary value problem: use bang-bang control to generate optimal trajectories efficiently, then pick the highest-scoring one
    • Jerk feels like a change in force.
  2. JLT example

    1. Just a simple model, a 3rd-order integrator

    2. We can apply bang-bang control on jerk, whose duration is determined by the acceleration time

    3. One application is Safety Corridor:

      1. from a to b plan a route
      2. Use binary search to find point where we can plan a route to C.
      3. e.g.,

  3. JLT + MPC

    1. This is local planning. A costmap is built where each cell's cost is its distance to the closest obstacle. We generate a bunch of JLT trajectories as guesses, then use the MPC cost function to evaluate each trajectory.

    2. How do we generate those trajectories? Use PSO (Particle Swarm Optimization), a gradient-free optimization method (a minimal PSO sketch appears at the end of this section)

    3. PSO: eventually, the particles converge to a single point (in xy space)

    4. PSO algo

      1. initialize particles (theta); each particle is a guess of the v and w at the end point of a trajectory after time tf. The initial (v, w) values are random
        • Note: a particle is always a guess of parameters.
      2. For each iteration:
        1. For each particle:
          1. given current v,w, current_pos and guess v,w, generate a traj with 40 points (x,y)
            • A neural network can be used for prediction: [theta_init, theta_end, v_init, v_end]
          2. get the cumulated cost of the trajectory
            1. In the costmap, the global goal is already represented by the cell with 0 cost. The cost also encompasses the distance to the nearest obstacle, like log(dist_to_nearest_obs).
            2. Therefore, the (v,w) with lowest cost must be collision free, and will lead to the goal
          3. If cost < cost*, choose the traj as the best traj; if cost < historical_cost, save the state of the current particle
          4. save the historical best cost and current global best velocity
        2. For each particle:
          1. get its direction to the globally best particle as vf1, the direction to its own historical best as vf2, and its current velocity as vf3; then vf = rand1 * vf1 + rand2 * vf2 + rand3 * vf3
        3. Move the robot using (v,w) for 0.1s
    5. Example:
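
A minimal gradient-free PSO sketch matching the update above (blend of current velocity, pull toward each particle's historical best, pull toward the global best). The `trajectory_cost` function in the usage comment is hypothetical; in the planner it would roll out a JLT trajectory for the guessed (v, w) and accumulate costmap costs.

```python
import random

def pso(cost, dim, n_particles=30, iters=50, w=0.7, c1=1.5, c2=1.5, lo=-1.0, hi=1.0):
    """Gradient-free PSO over a dim-dimensional parameter vector (e.g. a (v, w) guess)."""
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_p = [p[:] for p in pos]                     # each particle's historical best
    best_p_cost = [cost(p) for p in pos]
    g_idx = min(range(n_particles), key=lambda i: best_p_cost[i])
    best_g, best_g_cost = best_p[g_idx][:], best_p_cost[g_idx]   # global best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # blend current velocity, pull to own best, pull to global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_p[i][d] - pos[i][d])
                             + c2 * r2 * (best_g[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < best_p_cost[i]:
                best_p[i], best_p_cost[i] = pos[i][:], c
                if c < best_g_cost:
                    best_g, best_g_cost = pos[i][:], c
    return best_g, best_g_cost

# e.g. best_vw, _ = pso(lambda p: trajectory_cost(p), dim=2)  # trajectory_cost is hypothetical
```
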

========================================================================

Maps

========================================================================

  1. 2.5D map: https://github.com/anybotics/grid_map

    1. Fixed camera angle

    2. every 2D pixel has an elevation
    3. The grid can be partitioned densely; a grid map can be queried in O(1)
  2. An octree can be used to represent 3D occupancy (octomap):

    • each node has 8 children

    • So one 3D position leads to this "sparse" constellation

    • Indexing time: searching through the tree takes O(log n)

  3. Voxel Hashing

    • Most sparse, used for RGBD reconstruction
    • principle

  4. Point Cloud: point cloud library

    • unordered,
    • No indexed queries
    • Rotational camera might be cheaper
  5. Distance Field maps: TSDF (Truncated Signed Distance Functions), ESDF (Euclidean Signed Distance Functions)

    • Positive distance from the camera to the obstacle surface; from the obstacle surface onward (away from the camera) the distance is negative.

    • But it only cares about distance within a certain area. Outside the area is NA. So not very useful

    • ESDF instead has all the values. FIESTA Link (see the distance-transform sketch at the end of this section)

    • Voronoi Diagram Map: at the center of each polygon stands a cell tower. Points within a polygon have that center as their closest cell tower

      • Skeleton Map? Extracts "skeleton" out of ESDF and does planning on a voronoi diagram map?

  6. Compliant control
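
As a small illustration of the distance-field maps in item 5: an ESDF-style map can be built from an occupancy grid with a Euclidean distance transform (a toy grid, not FIESTA itself).

```python
import numpy as np
from scipy import ndimage

# occupancy grid: True = obstacle, False = free (toy map)
occ = np.zeros((50, 50), dtype=bool)
occ[20:30, 20:30] = True

resolution = 0.05  # meters per cell
# Euclidean distance from every free cell to the nearest obstacle cell
dist_to_obs = ndimage.distance_transform_edt(~occ) * resolution
# signed version: negative inside obstacles (distance to the nearest free cell)
esdf = dist_to_obs - ndimage.distance_transform_edt(occ) * resolution
```
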

⚠️ **GitHub.com Fallback** ⚠️