12 Problems

  1. Potential of Robotics: Most robotics research is focused either on niche applications like space and deep-sea exploration, or on problems motivated by the money-making agendas of big tech companies - self-driving, drone delivery, etc. Is there any potential in robotics for solving problems motivated by the needs of a poor economy?

  2. Transfer Learning: The idea that you can transfer what you learnt from one problem to help you solve another problem better or faster almost seems too good to be true. How far can we take this? Is there a theoretical framework to characterize this?

  3. Planning with a Poor Dynamics Model: What are the different ways to deal with an imperfect model? Can I leverage prior knowledge to limit the damage done to my plans?

  4. Stiffness in Manipulation: Most of manipulation is limited to position control. It is known that variable impedance control is critical for more involved and interesting tasks such as inserting a key into a keyhole, rotating a handle, or wiping a surface. Transferring gains learnt in simulation is difficult because they depend heavily on the dynamics properties, and there is no easy way to learn gains from human demos. How should we deal with this? (A minimal variable-impedance control law is sketched after this list.)

  5. Force Sensing for Robust Manipulation: The Franka provides joint torques and end-effector forces and torques. How can I leverage these for robust manipulation under uncertainty? (See the guarded-move sketch after this list.)

  6. Planning in a Continuous MDP with a Goal: RL usually tries to learn the value function or the policy for the whole state space. Is finding the optimal policy for a fixed start state easier than the full problem? Policy search methods and CMA-ES are greedy and often converge prematurely. How do I get nice exploratory behavior, especially when I have a goal function? (A cross-entropy-method sketch for this setting follows the list.)

  7. Set up a Chess board: Autonomously plan and learn skills that allow the robot to open and set up a chess board.

  8. Policy Learning as Supervised Learning: How should we represent actions/trajectories, and how do we generate good data with low variance? (A behavioral-cloning sketch follows the list.)

  9. Partial Observability: How to learn abstractions and hierarchical policies under state uncertainty?

  10. Open-loop vs Closed-loop Policies: Is learning open-loop policies easier than learning closed-loop policies? The search space for the latter is clearly bigger, but the reward signal for the former is much sparser. Which situations is each better suited for? (The two parameterizations are sketched after this list.)
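
A minimal sketch for problem 4, assuming a joint-space PD impedance law with per-joint gains; gravity and Coriolis compensation are assumed to be handled by the robot's low-level controller, and all names and numbers are placeholders:

```python
import numpy as np

def impedance_torque(q, qd, q_des, qd_des, kp, kd):
    """Joint torques for a PD impedance law with per-joint (possibly
    time-varying) stiffness kp and damping kd, each of shape (n_joints,)."""
    return kp * (q_des - q) + kd * (qd_des - qd)

# Hypothetical gain schedule: stiff proximal joints, compliant wrist,
# e.g. while wiping a surface whose exact height is uncertain.
kp = np.array([400.0, 400.0, 400.0, 200.0, 50.0, 50.0, 20.0])
kd = 2.0 * np.sqrt(kp)  # roughly critically damped
```

The control law itself is standard; the open question above is how to choose or learn such gain schedules per task.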
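
For problem 5, one simple way to use the wrench estimate is a guarded move: stop or switch behavior on contact instead of trusting the pose estimate. The callbacks `read_wrench`, `move_towards`, and `at_pose` are hypothetical stand-ins for whatever interface exposes the robot, not any real Franka API:

```python
import numpy as np

def guarded_insert(read_wrench, move_towards, at_pose, target_pose,
                   force_limit=5.0, step=1e-3):
    """Move towards target_pose, but stop as soon as the external force
    estimate exceeds force_limit (N), e.g. when a key meets the keyhole."""
    while not at_pose(target_pose):
        f_ext = read_wrench()[:3]          # external force estimate
        if np.linalg.norm(f_ext) > force_limit:
            return "contact"               # react: comply, probe, or re-plan
        move_towards(target_pose, step)    # small Cartesian step
    return "reached"
```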
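
For problem 6, a minimal cross-entropy-method sketch (a simpler relative of CMA-ES) that searches over open-loop action sequences from the fixed start state; the variance floor is one crude way to delay the premature convergence complained about above. `rollout_cost` is a hypothetical simulator call returning, say, the final distance to the goal:

```python
import numpy as np

def cem(rollout_cost, horizon, act_dim, iters=50, pop=64, n_elite=8,
        init_std=1.0, min_std=0.1):
    """Cross-entropy method over open-loop action sequences (horizon, act_dim)."""
    mean = np.zeros((horizon, act_dim))
    std = np.full((horizon, act_dim), init_std)
    for _ in range(iters):
        samples = mean + std * np.random.randn(pop, horizon, act_dim)
        costs = np.array([rollout_cost(a) for a in samples])
        elite = samples[np.argsort(costs)[:n_elite]]   # lowest cost = best
        mean, std = elite.mean(axis=0), elite.std(axis=0)
        std = np.maximum(std, min_std)  # variance floor: keep exploring
    return mean
```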
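
For problem 8, the simplest instantiation is behavioral cloning: regress demonstrated actions on states. The network sizes here are placeholders, and this does not answer the representation and data-variance questions raised above:

```python
import torch
import torch.nn as nn

def behavior_clone(states, actions, epochs=200, lr=1e-3):
    """states: (N, s_dim) float tensor, actions: (N, a_dim) float tensor."""
    policy = nn.Sequential(
        nn.Linear(states.shape[1], 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, actions.shape[1]),
    )
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(policy(states), actions)
        loss.backward()
        opt.step()
    return policy
```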
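
For problem 10, the two parameterizations can be written side by side for a horizon-H task; the linear feedback policy is only illustrative, since useful closed-loop policies generally need a much richer function class:

```python
import numpy as np

H, s_dim, a_dim = 100, 12, 7   # placeholder sizes

# Open-loop: a fixed action sequence, H * a_dim free parameters,
# executed blindly from the start state.
open_loop = np.zeros((H, a_dim))
act_open = lambda t, s: open_loop[t]

# Closed-loop (illustrative linear feedback): parameters couple to the state,
# so the policy reacts to disturbances but must be fit over the state space.
K, b = np.zeros((a_dim, s_dim)), np.zeros(a_dim)
act_closed = lambda t, s: K @ s + b
```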