Hacks I had to do for HuggingFace Transformers

High-Level Steps

  • git clone https://github.com/huggingface/transformers
  • Create a Python venv, install torch, then install from requirements.txt and requirements-dev.txt, plus whatever other packages the errors asked for... (a setup sketch follows this list)
  • Apply the code fix-ups described below
  • Install the LHAMa transformers package in editable mode: python -m pip install -e . from the local transformers directory containing setup.py
  • Let 'er train!
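For reference, here is a minimal sketch of those steps. It assumes Python 3.7 (matching the command below) and a venv named .venv; the requirements file names come from the steps above.

```bash
# Minimal setup sketch; the .venv name is just an example.
git clone https://github.com/huggingface/transformers
cd transformers

# Isolated environment
python3.7 -m venv .venv
source .venv/bin/activate

# Install PyTorch first, then the repo requirements,
# then anything else that errors ask for.
pip install torch
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Editable install of the local transformers package
# (run from the directory containing setup.py)
python -m pip install -e .
```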

Config I Ran

    python3.7 -m torch.distributed.launch --nproc_per_node=1 ./examples/run_squad.py \
      --model_type bert \
      --model_name_or_path bert-large-uncased-whole-word-masking \
      --do_train \
      --do_lower_case \
      --do_eval \
      --train_file input/train-v2.0_trunc.json \
      --predict_file input/dev-v2.0.json \
      --learning_rate 3e-5 \
      --num_train_epochs 1 \
      --max_seq_length 384 \
      --doc_stride 128 \
      --output_dir ../models/wwm_uncased_finetuned_squad/ \
      --per_gpu_eval_batch_size=3 \
      --per_gpu_train_batch_size=3 \
      --no_cuda \
      --version_2_with_negative \
      --local_rank=-1

Code Changes

  • Comment out the references to torch.distributed.barrier() in run_squad.py (the barrier calls fail when no distributed process group has been initialized)
  • Import the warmup scheduler in run_squad.py with from transformers import WarmupLinearSchedule as get_linear_schedule_with_warmup
  • Change the instantiation of the scheduler in run_squad.py (line ~105) to scheduler = get_linear_schedule_with_warmup(optimizer, args.warmup_steps, t_total) (see the sketch after this list)
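Putting the last two items together, the patched section of run_squad.py looks roughly like the sketch below. This is only an approximation: it assumes a transformers version that still ships WarmupLinearSchedule, and the surrounding names (AdamW, optimizer_grouped_parameters, args, t_total) come from the stock script rather than from this wiki.

```python
# Sketch of the patched optimizer/scheduler setup in run_squad.py (approximate).
from transformers import AdamW
# Hack: alias the old scheduler class to the name the newer script expects.
from transformers import WarmupLinearSchedule as get_linear_schedule_with_warmup

# optimizer_grouped_parameters, args, and t_total are defined earlier in the script.
optimizer = AdamW(optimizer_grouped_parameters, lr=args.learning_rate, eps=args.adam_epsilon)

# WarmupLinearSchedule takes (optimizer, warmup_steps, t_total) positionally,
# so the newer keyword arguments (num_warmup_steps=..., num_training_steps=...) are dropped.
scheduler = get_linear_schedule_with_warmup(optimizer, args.warmup_steps, t_total)
```

WarmupLinearSchedule does the same thing as get_linear_schedule_with_warmup (linear warmup for warmup_steps, then linear decay to zero over t_total steps), so training behavior should be unchanged.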

Reference Links