Useful Links (Tools, Models, Projects, Datasets, Simulators) - feliyur/exercises GitHub Wiki
Tools / Models / Paper Code
Vision
| Name / Link | Description | License |
|---|---|---|
| DepthAnything | Monocular depth prediction | Apache 2.0 |
| Segment Anything 2 | Segmentation | |
| timm (https://timm.fast.ai/) | Library of PyTorch vision models. | Apache 2.0 |
Video Generation
| Name / Link | Description | License |
|---|---|---|
| Sora | ||
| Veo | ||
| GAIA-2 | Driving simulation | proprietary |
Automotive
| Name / Link | Description | License |
|---|---|---|
| Patchwork++ | Ground segmentation for LiDAR point clouds | GPL-3.0 |
Audio and Speech
| Name / Link | Description | License |
|---|---|---|
| CosyVoice 3 | Voice Synthesis | Apache 2.0 |
| F5-TTS | Voice Synthesis | |
| Spark-TTS | Voice Synthesis | |
| NVIDIA Parakeet | Speech to text | |
| PlayDiffusion | Text to speech | |
LLM Assistant Chat
| Name / Link | Description | License |
|---|---|---|
| ChatGPT | ||
| Claude | ||
| DeepSeek | | |
LLMs
| Name / Link | Description | License |
|---|---|---|
| Gemini | ||
| Gemini Diffusion | | |
Coding LLM / Agents / Tools
| Name / Link | Description | License | Notes |
|---|---|---|---|
| GitHub Copilot | | | |
| Cursor | |||
| Claude 4 Sonnet / Opus | |||
| Devstral | Top open-source | ||
| Codestral | Available on Ollama. | | |
Mistral agents API https://mistral.ai/news/agents-api
Model usage examples https://github.com/unslothai/notebooks
Auto web-surf agent https://huggingface.co/blog/Hcompany/holo1
Robotics
| Name / Link | Description | License |
|---|---|---|
| PyRoki | Open-source Python library for inverse kinematics | |
Tools
| Name / Link | Description | License |
|---|---|---|
| latexify | Compile Python code into LaTeX formulae | MIT |
| dacite | Instantiates a dataclass object from a dictionary, including nested dataclasses | MIT |
| easydict, addict | Attribute dict implementations in Python | LGPL-3.0 (easydict), MIT (addict) |
| Open3D | Library for visualization and manipulation of 3D data. | MIT |
| Compiler Explorer | Compile and run C++ code online | |
| KDbg | Graphical front end for the gdb debugger | GPL-2.0 |
| bagpy | Load ROS bagfiles | MIT |
| pygeodesy | Geodesy utils | MIT |
| json crack | Visualize config | MIT |
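
For readers unfamiliar with the "attribute dict" idea behind easydict and addict, here is a minimal standard-library sketch. `AttrDict` and the `config` values are hypothetical names made up for illustration; this is not how either library is implemented, and both handle many cases this sketch ignores.

```python
class AttrDict(dict):
    """Hypothetical minimal attribute dict: keys can be read and set as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods keep working.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested plain dicts on first access so chained access works,
        # and store the wrapper back so later writes are kept.
        if isinstance(value, dict) and not isinstance(value, AttrDict):
            value = AttrDict(value)
            self[name] = value
        return value

    def __setattr__(self, name, value):
        # Attribute assignment writes a dictionary key.
        self[name] = value


# Illustrative config; the keys and values are assumptions, not from any real project.
config = AttrDict({"camera": {"fx": 525.0, "fy": 525.0}, "name": "kinect"})
config.camera.fx = 500.0
print(config.name, config["camera"]["fx"])
```

One limitation of this approach: keys that collide with `dict` method names (for example `items`) are only reachable with `config["items"]`, because normal attribute lookup finds the method first.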
Datasets
| Name / Link | Description |
|---|---|
| CamVid | Motion-based Segmentation and Recognition Dataset |
| Make3D | Range image dataset |
| NYU Depth | Indoor Segmentation and Support Inference from RGBD Images |
| CityScapes | Semantic Understanding of Urban Street Scenes |
| KITTI | Outdoor driving datasets / vision benchmark suite. 3D Object Benchmark |
| Pascal3D+ | Massive 3D Object detection and pose estimation |
| Dubrovnik6K | Location recognition in urban outdoors |
| DeepLoc | Large-scale urban outdoor localization dataset |
| CambridgeLandmarks | Localization dataset collected with a smartphone for PoseNet. Includes images, camera poses and SfM reconstructions. |
| Cars Dataset | Stanford Cars Dataset. 16,185 images of 196 classes of cars |
| ModelNet | Collection of 3D CAD models of objects, some annotated with orientation |
| SUN Dataset | Object detection, scene recognition. Collection of annotated images covering a large variety of environmental scenes, places and the objects within. |
| BigBIRD | 3D Database of Object Instances. 125 objects, images, RGB-D point clouds, pose information and segmentation, reconstructed meshes. |
| TUM RGB-D Dataset | Kinect data. Color and depth images of a Microsoft Kinect sensor along the ground-truth trajectory of the sensor. Indoors. |
| Matterport3D | Indoor environments, RGB-D, segmentation |
| Active Vision Dataset | "simulation of motion for object instance recognition in real-world environments" - RGB-D images, bounding boxes. |
| Objectron | Object-centric video clips with 3D detections and ground truth poses. |
| ScanNet | RGB-D (indoor) video dataset annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations. |
| SceneNet | Photorealistic synthetic indoor trajectories with ShapeNet objects. |
| ShapeNet | Richly-annotated, large-scale dataset of 3D shapes. |
| Oxford RobotCar | RGB, Lidar, Radar |
| Driving Video Datasets | |
| BDD100K and BDD | Annotated driving videos from Berkeley. |
| Small Datasets | |
| Alderley Day/Night Dataset | Day/night street videos for the same route, with frame correspondences. |
Simulators
| Name / Link | Description | Popularity | Accessed Date |
|---|---|---|---|
| AI2-THOR | Photorealistic, interactive indoor environments for AI agents, with physics. Documentation. | 1 | Nov 2020 |
| VizDoom | Doom-based AI Research Platform for Reinforcement Learning from Raw Visual Information | 1 | 2018 |
| OpenAI Gym | A toolkit (and environments) for developing and comparing reinforcement learning algorithms | 1 | 2018 |
| House3D | "Based on" Princeton's SUNCG. Has depth annotation and semantic labels. | 1 | 2018 |
| G2D | Allows collecting images and depth along a specified track in the GTA V environment. Requires buying GTA V for ~$40 (e.g. from Steam). Windows only. Similar project: DeepGTAV. | 1 | 2018 |
| DeepMind Lab | 3D environments, prefer speed over realism | 1 | 2018 |
| AirSim | Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research, outdoor | 2 | 2018 |
| ESIM | Event camera simulator. Based on ROS. Very limited in input format as is, input either from UnrealCV or rendering of scene file (.obj). | 1 | 2018 |
| Carla | Autonomous driving simulator. Reasonably realistic | 1 | Feb 2019 |
| EuroPilot | Python interface for an autonomous driving simulator based on Euro Truck Simulator 2 (technically, it captures screen output, so it can be used with any game) | 1 | Feb 2019 |
Mapping / SLAM
| Name / Link | Description | License | Note |
|---|---|---|---|
| OmniMapper | SLAM framework built on top of ROS + GTSAM | MIT-like | From Henrik Christensen's group. |
| COLMAP | Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline (Schönberger, Pollefeys, Frahm) | BSD (Commercial) | |
| Hydra | Scene graph construction / semantic mapping. | MIT | From Luca Carlone's group |