white_paper - RicoJia/notes GitHub Wiki
There is no right or wrong here. It's OK if we don't have the same level of achievement as others; just do what you enjoy doing, man. Some people work in restaurants, some just want to chill. I like building robots. Nav and computer vision are the goals of the current phase. Building robots is also very simple: you run your programs on the robot and see what comes out. So relax, none of this is really high-stakes.
Beyond work, we are living a life now, and a life should have balance. Have a bi-focus for the week: each day, work on two things alternately. Focus for 8 hours, no more. This period has two difficulties: "aloneness" (we need to find a team environment) and "parallelism" (we need to balance time between finding work and projects).
Along the way, you might be distracted by various things: people's new job postings, new job opportunities that attract you... But don't forget WHO YOU ARE and what you are interested in.
- Mumble Rover: a ROS 2 rover with 3D SLAM, 3D localization, and 3D navigation. This should run in a simulator first, then on a real robot (with at least a new microSD card, or even an SSD), RPi 5 (Oct. 2023).
- ROS 2 Notes
- ROS 2 is a high priority:
- Create an image for navigation.
- Gazebo or Isaac Sim?
- Create an image + byobu
- Physical Robot:
- Test program TODO (HIGH)
- Between the RPi and the rover IMU
- Between the Nvidia Nano and the IMU
- Burning question: is the IMU on the Waveshare board good enough? IMU calibration (a minimal bias-check sketch follows this list).
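A quick sanity check for that burning question, assuming a stationary log saved as `imu_still.csv` with columns `[ax, ay, az, gx, gy, gz]` (the file name and layout are my assumptions); a proper calibration would use Allan variance, this only answers "is it obviously bad?":

```python
# Hypothetical sketch: estimate IMU bias/noise from a stationary log.
import numpy as np

def static_imu_stats(samples: np.ndarray):
    bias = samples.mean(axis=0)      # stationary mean ~= sensor bias
    noise_std = samples.std(axis=0)  # spread ~= white-noise level
    return bias, noise_std

samples = np.loadtxt("imu_still.csv", delimiter=",")  # ~30 s of data
bias, noise = static_imu_stats(samples)
bias[2] -= 9.81  # z-accel reads gravity while sitting level; remove it
print("bias:", bias, "\nnoise std:", noise)
```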
- The SLAM class: 2D SLAM, 3D SLAM
- Coursera Study: All the way to transformer (Nov 26 - Jan 2, D)
- The transformer project (training has taken 397 h; needs 580 h, i.e. 24 days, to reach 0.054). 6 more hours, then release the weights.
- Deep Learning Hands-On: train MobileNet V2, DeepLab V3+, ViT, YOLO, SSD.
- Train on Pascal VOC for single-label classification (D)
- Train on the COCO dataset for multi-label classification (D)
- Assemble and train the MobileNet V2 framework (On-hold, Can unblock on Jan 3 - Jan 6, 8h input)
- Try a ViT model (2-work-week project; can be on hold until the SLAM class finishes IMU, integration, and 2D SLAM)
- Try the existing ViT model and weights on object detection. (5h)
- Try an existing backbone while putting together a transformer (25h)
- Train another backbone (e.g., VGG) (optional, 5h + 20h training hours)
- Udacity Machine Learning
On workdays, 2 h for personal growth (8 am - 10 am); on weekends, 5-8 h of work. 15 h a week is a safe bet; 20 h is a good goal.
Friday:
- Grocery shopping (1.5 h)
Saturday:
- Houston Robotics events (1:00 pm - 5:00 pm)
- Cooking (1.5 h)
Sunday will be an off day for me.
- Find a Chinese immigrant community. Is there a group chat?
- Start a setup.sh for setting up a new laptop (D)
- Find a soccer team
- Document progress, and questions along the way (D)
- Need to find a co-working space, get to know some like-minded people (D)
- Lidar, new base
- Google what each of the items below does:
- TensorFlow Serving / TorchServe: Serving models in production.
- TensorBoard: For monitoring training processes and visualizing metrics.
- Seaborn: For creating detailed plots and charts.
- DVC (Data Version Control): Managing datasets and model versions.
- Experience with transfer learning, fine-tuning models.
- Olah's LSTM article (0.5 day)
- Student t-distribution (0.5 day)
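For reference while writing this up, the density with ν degrees of freedom (a standard result):

```latex
f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}
            {\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}
       \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}
% As \nu \to \infty this converges to the standard normal pdf;
% small \nu gives the heavy tails that make it robust to outliers.
```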
- Write about t-SNE
- Look at Autograd and how it works: https://pytorch.org/docs/stable/notes/autograd.html (1 day)
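A tiny warm-up to poke at while reading the autograd notes (the example itself is mine, not from the docs):

```python
# Build a two-node graph, backprop, and inspect the recorded grad_fns.
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # autograd records the Pow and Sum ops
y.backward()         # reverse-mode pass over the recorded graph
print(x.grad)        # tensor([4., 6.]) == dy/dx = 2x

z = x * 2
print(z.grad_fn)     # <MulBackward0 ...>: the graph edge back to x
```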
- Set up the deblurring network. (1 week)
- Rerun the 3D model with the deblurring network. (Week 2)
- Add to resume (3 days):
- Add PyTorch and deep learning to the resume
- Add a section to the personal website to showcase your projects
- Try SAM (1 week)
- Try RGBD SLAM / ORB-SLAM, the official one. (4 days)
- Try FiftyOne for SAM 2
- More: face recognition
- Gradient clipping
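A minimal sketch of where clipping sits in a PyTorch training step (`model`, `loss_fn`, `batch` are placeholders, not from these notes):

```python
import torch

def train_step(model, loss_fn, optimizer, batch, max_norm=1.0):
    optimizer.zero_grad()
    inputs, targets = batch
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    # Rescale all gradients so their global L2 norm is at most max_norm;
    # this guards RNN/transformer training against exploding gradients.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss.item()
```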
- TODO: Try backward prop for layer norm. See Karpathy's tutorial.
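A sketch to work through, checked against autograd; the compact formula follows the same pattern Karpathy derives for batch norm (variable names are mine):

```python
# Manual layer-norm backward (gamma=1, beta=0), verified against autograd.
import torch

N, D, eps = 4, 8, 1e-5
x = torch.randn(N, D, dtype=torch.float64, requires_grad=True)
g = torch.randn(N, D, dtype=torch.float64)   # upstream grad (dout)

mu = x.mean(-1, keepdim=True)
var = x.var(-1, unbiased=False, keepdim=True)
inv_std = (var + eps).rsqrt()
xhat = (x - mu) * inv_std                    # forward pass

xhat.backward(g)                             # autograd reference

# dL/dx = inv_std/D * (D*g - sum(g) - xhat * sum(g*xhat))
dx = inv_std / D * (D * g
                    - g.sum(-1, keepdim=True)
                    - xhat * (g * xhat).sum(-1, keepdim=True))
print(torch.allclose(dx, x.grad))            # True
```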
Week 3: try 3D SLAM with handwritten math; try 3D SLAM with a stereo camera and build a 3D model of the place.
Week 4: test the motors on the robot so we can drive it with odom; build a costmap of the place from the 3D model, then use SBPL with TEB as the local planner. (This should give you a quick result on the robot side of things.)
Then: develop an online local 3D model on the robot? (Other methods, like ORB-SLAM2.) Playing around with Isaac Sim would be nice. Then explore ROS 2 navigation: ROS Navigation, OpenVSLAM.
For the second iteration of deep learning, we should try implementing LeNet-5, AlexNet, VGG-16, etc.
- Edge detection (a minimal OpenCV sketch for these items follows this list)
- Smoothing
- Gradient Calculation
- Optical Flow
- OpenCV DNN
- Object Tracking
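The OpenCV warm-up referenced above, covering smoothing, edges, and gradients (`image.png` is a placeholder path):

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)  # placeholder input
blur = cv2.GaussianBlur(img, (5, 5), sigmaX=1.4)     # smoothing
edges = cv2.Canny(blur, 50, 150)                     # hysteresis thresholds
gx = cv2.Sobel(blur, cv2.CV_32F, 1, 0, ksize=3)      # gradient in x
gy = cv2.Sobel(blur, cv2.CV_32F, 0, 1, ksize=3)      # gradient in y
cv2.imwrite("edges.png", edges)
```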
- In-depth scenario questions: ask how you would act, like "flight simulator" training
- Object Detection (唐宇迪)
- Mask RCNN: https://edu.51cto.com/course/20420.html
- Facebook Impl: https://github.com/facebookresearch/maskrcnn-benchmark
- YOLOv1
- YOLOv2
- YOLOv3
- YOLOv4
- YOLOv5
- EfficientNet
- EfficientDet
- DETR
- Deformable DETR
- FCOS
- YOLOv6
- YOLOv7
- YOLOv8
- Semantic Segmentation
- UNet
- U2Net?
- DeepLab V1, V2, V3
- Mask2former V1, V2
- SSD?
- Object tracking, instance segmentation, semantic change detection
- Solid understanding of machine learning training/deployment pipelines and their implementation.
- Deploying models with TensorRT and ONNX (Serve, Scale AI)
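A hedged sketch of the ONNX half of that pipeline: export a torchvision model, then hand it to TensorRT's `trtexec` (the model choice and tensor names are mine):

```python
import torch
import torchvision

model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
dummy = torch.randn(1, 3, 224, 224)   # example input fixes the shapes
torch.onnx.export(
    model, dummy, "mobilenet_v2.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}},  # allow variable batch size
    opset_version=17,
)
# Then e.g.:  trtexec --onnx=mobilenet_v2.onnx --saveEngine=model.plan
```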
- Multi-sensor feature extraction and fusion, object detection and tracking, 3D estimation, and embodied AI with Transformer-based models. (Serve)
- Traversability prediction.
- Writing and maintaining automated continuous integration tests (D)
- Dice, focal, and Tversky losses?
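A minimal soft Dice loss for binary segmentation as a starting point; Tversky generalizes it by weighting false positives and false negatives separately, while focal loss instead re-weights cross-entropy toward hard pixels:

```python
import torch

def dice_loss(logits, target, smooth=1.0):
    """logits, target: (N, H, W); target in {0, 1}."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2))
    denom = prob.sum(dim=(1, 2)) + target.sum(dim=(1, 2))
    dice = (2 * inter + smooth) / (denom + smooth)  # soft Dice per image
    return 1 - dice.mean()
```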
- Mixed-precision training: https://pytorch.org/blog/what-every-user-should-know-about-mixed-precision-training-in-pytorch/
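The skeleton the linked post builds up to, on a toy model (needs a CUDA device; the model and data here are stand-ins):

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10).cuda()              # toy stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):                            # toy loop, random batches
    x = torch.randn(32, 128, device="cuda")
    y = torch.randint(0, 10, (32,), device="cuda")
    optimizer.zero_grad()
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = loss_fn(model(x), y)            # runs fp16 where safe
    scaler.scale(loss).backward()              # scale to avoid underflow
    scaler.step(optimizer)                     # unscales; skips on inf/nan
    scaler.update()
```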
- Visual Transformer
- NLP theory: Tokenization and Embeddings:
- Know how images are represented as pixel grids and how they can be transformed into sequences of patches.
- Patch Embeddings: understand how images are divided into patches and how each patch is embedded into a vector to serve as input tokens (see the sketch after this list).
- Reading path:
- "Attention is All You Need"
- "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale", 2020
- NLP applications
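The patch-embedding trick, as commonly implemented (e.g., in timm): a single strided conv is equivalent to "cut into 16x16 patches, flatten, and project":

```python
import torch
import torch.nn as nn

img_size, patch, dim = 224, 16, 768
proj = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)

x = torch.randn(1, 3, img_size, img_size)
tokens = proj(x)                            # (1, 768, 14, 14)
tokens = tokens.flatten(2).transpose(1, 2)  # (1, 196, 768): 196 tokens
print(tokens.shape)
```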
Papers to Read (NLP concepts like tokenization, ...)
- Shallow layer Conv
- GRU
- LSTM (hard, vanishing gradient theory)
- RNN/LSTM: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
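A minimal `nn.LSTM` call to keep shapes straight while reading (sizes are arbitrary); the gating is what keeps the cell-state gradient path alive, which is the vanishing-gradient fix the theory item above refers to:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=64, hidden_size=128, num_layers=2,
               batch_first=True)
x = torch.randn(32, 10, 64)     # 32 sequences, 10 steps, 64 features
out, (h_n, c_n) = lstm(x)
print(out.shape)   # (32, 10, 128): hidden state at every time step
print(h_n.shape)   # (2, 32, 128): last step's hidden state, per layer
```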
- Above and beyond: GANs and VAEs for tasks like image synthesis.
- You're familiar with simulators such as Omniverse, OpenAI Gym, MuJoCo, Unity, or other video game environments.
- Design, train and deploy learning-based perception models for on-robot perception systems. Perception models should be able to do multi-modal learning capturing different semantics such as segmentation, object detection, scene understanding and tracking.
- Build a foundational model for vision, language, and action that can exhibit good reasoning and maneuvering capability. Understand transformer-based ML architectures really well.
- Deep Lab Implementation
- Create docker image (0.5h)
- Use rwthik?
- See if you are on the Orin Nano (not sure how to differentiate?). 1. Rename the Dockerfile
- code
- Coco Data loader
- Deep Lab v3 custom implementation
- Finish the DeepLab vids. Organize them.
- Might need to do resnet first
- Watch radeontop and nvidia-smi
- Create your own folder, share trained weights on google.
- RGBD SLAM: being able to see RGBD SLAM integration with handwritten optimization.