# Gotchas
- The container on a GCP VM created from the Colab marketplace image seems to work fine for this.
- When an OOM error occurs across consecutive fine-tuning/inference runs, try restarting the container or the machine; that frees up some memory. Of course, this won't help when the available resources are simply insufficient for the task. A lighter-weight option is sketched below.
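
Before resorting to a full restart, it can sometimes be enough to release cached GPU memory inside the Python process between runs. A minimal sketch, assuming a PyTorch-based workflow; the `model` and `optimizer` names are hypothetical placeholders for whatever objects your previous run holds:

```python
import gc

import torch

# Drop references to the previous run's large objects so Python can
# reclaim them (hypothetical names -- substitute your own variables).
del model, optimizer

# Collect dangling Python references, then ask PyTorch's caching
# allocator to release CUDA blocks it is holding but no longer using.
gc.collect()
torch.cuda.empty_cache()

# Optional sanity check: how much GPU memory is still in use.
print(f"allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1e9:.2f} GB")
```

If memory stays high even after this (e.g. because a crashed process still holds the GPU), restarting the container or machine remains the reliable fallback.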