Hacks I had to do for HuggingFace Transformers

High-Level Steps

  • git clone https://github.com/huggingface/transformers
  • Create a Python venv, install torch, then install from requirements.txt and requirements-dev.txt, plus whatever other packages the errors asked for... (a setup sketch follows this list)
  • Apply the code fix-ups described below
  • Install the LHAMa transformers package in editable mode: python -m pip install -e . from the local transformers directory containing setup.py
  • Let 'er train!
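For reference, here is a minimal sketch of those steps. It assumes Python 3.7 (matching the command below) and a venv named .venv; the requirements file names come from the steps above.

```bash
# Minimal setup sketch; the .venv name is just an example.
git clone https://github.com/huggingface/transformers
cd transformers

# Isolated environment
python3.7 -m venv .venv
source .venv/bin/activate

# Install PyTorch first, then the repo requirements,
# then anything else that errors ask for.
pip install torch
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Editable install of the local transformers package
# (run from the directory containing setup.py)
python -m pip install -e .
```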

Config I Ran

    python3.7 -m torch.distributed.launch --nproc_per_node=1 ./examples/run_squad.py \
      --model_type bert \
      --model_name_or_path bert-large-uncased-whole-word-masking \
      --do_train \
      --do_lower_case \
      --do_eval \
      --train_file input/train-v2.0_trunc.json \
      --predict_file input/dev-v2.0.json \
      --learning_rate 3e-5 \
      --num_train_epochs 1 \
      --max_seq_length 384 \
      --doc_stride 128 \
      --output_dir ../models/wwm_uncased_finetuned_squad/ \
      --per_gpu_eval_batch_size=3 \
      --per_gpu_train_batch_size=3 \
      --no_cuda \
      --version_2_with_negative \
      --local_rank=-1

Code Changes

  • Comment out the references to torch.distributed.barrier() in run_squad.py (the barrier calls fail when no distributed process group has been initialized)
  • Import the warmup scheduler in run_squad.py with from transformers import WarmupLinearSchedule as get_linear_schedule_with_warmup
  • Change the instantiation of the scheduler in run_squad.py (line ~105) to scheduler = get_linear_schedule_with_warmup(optimizer, args.warmup_steps, t_total) (see the sketch after this list)
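Putting the last two items together, the patched section of run_squad.py looks roughly like the sketch below. This is only an approximation: it assumes a transformers version that still ships WarmupLinearSchedule, and the surrounding names (AdamW, optimizer_grouped_parameters, args, t_total) come from the stock script rather than from this wiki.

```python
# Sketch of the patched optimizer/scheduler setup in run_squad.py (approximate).
from transformers import AdamW
# Hack: alias the old scheduler class to the name the newer script expects.
from transformers import WarmupLinearSchedule as get_linear_schedule_with_warmup

# optimizer_grouped_parameters, args, and t_total are defined earlier in the script.
optimizer = AdamW(optimizer_grouped_parameters, lr=args.learning_rate, eps=args.adam_epsilon)

# WarmupLinearSchedule takes (optimizer, warmup_steps, t_total) positionally,
# so the newer keyword arguments (num_warmup_steps=..., num_training_steps=...) are dropped.
scheduler = get_linear_schedule_with_warmup(optimizer, args.warmup_steps, t_total)
```

WarmupLinearSchedule does the same thing as get_linear_schedule_with_warmup (linear warmup for warmup_steps, then linear decay to zero over t_total steps), so training behavior should be unchanged.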

Reference Links