# Hacks I had to do for HuggingFace Transformers
## High-Level Steps
- `git clone https://github.com/huggingface/transformers`
- Create a Python venv, install `torch`, install from `requirements.txt` and `requirements-dev.txt`, and install other required packages as errors popped up (see the setup sketch after this list)
- Apply the code fix-ups (see **Code Changes** below)
- Install the LHAMa transformers package: `python -m pip install -e .` from the local transformers directory containing `setup.py`
- Let 'er train!
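For reference, the environment setup above looks roughly like the following shell session. This is a sketch, not a verified script: the venv name and the Python 3.7 interpreter are assumptions carried over from the launch command below, and additional packages may still need to be installed as import errors show up.

```bash
# Clone the HuggingFace transformers repo and work inside a fresh virtual environment
git clone https://github.com/huggingface/transformers
cd transformers

python3.7 -m venv venv
source venv/bin/activate

# Install PyTorch first, then the repo's own requirements
python -m pip install torch
python -m pip install -r requirements.txt
python -m pip install -r requirements-dev.txt

# ...apply the fix-ups under "Code Changes", then install the local package in editable mode
python -m pip install -e .
```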
## Config I Ran
```bash
python3.7 -m torch.distributed.launch --nproc_per_node=1 ./examples/run_squad.py \
  --model_type bert \
  --model_name_or_path bert-large-uncased-whole-word-masking \
  --do_train \
  --do_lower_case \
  --do_eval \
  --train_file input/train-v2.0_trunc.json \
  --predict_file input/dev-v2.0.json \
  --learning_rate 3e-5 \
  --num_train_epochs 1 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir ../models/wwm_uncased_finetuned_squad/ \
  --per_gpu_eval_batch_size=3 \
  --per_gpu_train_batch_size=3 \
  --no_cuda \
  --version_2_with_negative \
  --local_rank=-1
```
## Code Changes
- Comment out references in `run_squad.py` to `torch.distributed.barrier()`
- Reference the warmup class in `run_squad.py` with `from transformers import WarmupLinearSchedule as get_linear_schedule_with_warmup`
- Change the instantiation of the scheduler in `run_squad.py` (around line 105) to `scheduler = get_linear_schedule_with_warmup(optimizer, args.warmup_steps, t_total)` (sketched below)
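Taken together, the scheduler-related fix-ups look roughly like the snippet below. This is a hedged sketch rather than a diff of `run_squad.py`: the toy model and step counts are placeholders, and it assumes the installed transformers version still ships the older `WarmupLinearSchedule` class (whose constructor takes the warmup steps and total steps positionally), which is why aliasing it to the `get_linear_schedule_with_warmup` name the script expects works.

```python
# Minimal sketch of the warmup fix-up outside run_squad.py, using a toy model.
# Assumes an installed transformers version that still ships the older
# WarmupLinearSchedule class (constructor: optimizer, warmup_steps, t_total).
import torch
from transformers import AdamW
from transformers import WarmupLinearSchedule as get_linear_schedule_with_warmup

model = torch.nn.Linear(10, 2)          # stand-in for the BERT model
optimizer = AdamW(model.parameters(), lr=3e-5)

warmup_steps = 0                        # mirrors the run_squad.py default for args.warmup_steps
t_total = 1000                          # total number of training steps

# Old-style positional call: (optimizer, warmup_steps, t_total)
scheduler = get_linear_schedule_with_warmup(optimizer, warmup_steps, t_total)

for step in range(t_total):
    optimizer.step()
    scheduler.step()                    # linear warmup, then linear decay of the LR
```

Aliasing at the import keeps the rest of the script untouched; the only other edit is the one instantiation line around line 105.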