Training of a kraken model for German newspapers
Ground Truth
Deutscher Reichsanzeiger
Ground truth for the German newspaper "Deutscher Reichsanzeiger und Preußischer Staatsanzeiger" (1819–1945)
Dataset: https://github.com/UB-Mannheim/reichsanzeiger-gt
Austrian Newspapers
NewsEye / READ OCR training dataset from Austrian Newspapers (1864–1911)
Dataset: https://github.com/UB-Mannheim/AustrianNewspapers
Neue Zürcher Zeitung
Ground truth for the Swiss newspaper "Neue Zürcher Zeitung" (1780–1947)
Dataset: https://github.com/UB-Mannheim/NZZ-black-letter-ground-truth
(based on STRÖBEL, Phillip; CLEMATIDE, Simon. Improving OCR of black letter in historical newspapers: the unreasonable effectiveness of HTR models on low-resolution images. 2019.)
Hakenkreuzbanner
Ground truth for a political newspaper of the Mannheim region (1931–1945)
Dataset: https://github.com/UB-Mannheim/hkb-gt
Evaluation set (!)
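Since hkb-gt is held out from training, it can later be used to score the trained models. A minimal sketch with ketos test (the model path is a placeholder; check ketos test --help for the exact flags of your kraken version):
# Sketch: evaluate a trained model on the held-out Hakenkreuzbanner ground truth
ketos test -f xml -m german_newspapers_best.mlmodel ./hkb-gt/data/**/*.xml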
Preparing data for training
# Clone the ground truth repositories
git clone https://github.com/UB-Mannheim/reichsanzeiger-gt
git clone https://github.com/UB-Mannheim/AustrianNewspapers
git clone https://github.com/UB-Mannheim/NZZ-black-letter-ground-truth
git clone https://github.com/UB-Mannheim/hkb-gt
# Download the page images and move them next to the PAGE XML files
./reichsanzeiger-gt/data/download_images.sh
mv ./reichsanzeiger-gt/data/images/* ./reichsanzeiger-gt/data/reichsanzeiger-1820-1939/GT-PAGE
./hkb-gt/data/download_images_to_page.sh
# Collect the absolute paths of all PAGE XML files into one list
fdfind -a --full-path './reichsanzeiger-gt/data/reichsanzeiger-1820-1939/' -e xml >> german_newspapers.list
fdfind -a --full-path './NZZ-black-letter-ground-truth/data' -e xml >> german_newspapers.list
fdfind -a --full-path './AustrianNewspapers/data' -e xml >> german_newspapers.list
fdfind -a --full-path './hkb-gt/data' -e xml >> german_newspapers.list
# Shuffle the list so later splits draw evenly from all datasets
shuf < german_newspapers.list > shuf_german_newspapers.list
# Avoid CPU oversubscription during training
export OMP_NUM_THREADS=1
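ketos splits off validation data internally by default, but carving a fixed held-out split from the shuffled list makes the topology comparison below reproducible across runs. A minimal sketch (file names are illustrative):
# Hold out ~10% of the shuffled list as a fixed validation set (sketch)
total=$(wc -l < shuf_german_newspapers.list)
head -n $((total / 10)) shuf_german_newspapers.list > val_german_newspapers.list
tail -n +$((total / 10 + 1)) shuf_german_newspapers.list > train_german_newspapers.list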
Training with a base model
Download the base model german_print
wget https://ub-backup.bib.uni-mannheim.de/~stweil/tesstrain/kraken/german_print/german_print_best.mlmodel
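No fine-tuning call is shown here, but a sketch of how the base model would be loaded looks like this: -i/--load continues from the downloaded weights, and --resize reconciles the output layer with the alphabet of the new dataset (the option value is union in current kraken; older releases call it add). Verify against ketos train --help for your version:
# Sketch: fine-tune the german_print base model instead of training from scratch
ketos train -f binary -i german_print_best.mlmodel --resize union -o ./german_newspapers_from_base german_newspapers.arrow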
Creating an Apache Arrow file with ketos; the resulting german_newspapers.arrow is about 4 GB in size.
time ketos compile --format-type xml --files ./shuf_german_newspapers.list --workers 12 -o ./german_newspapers.arrow
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 236003/236003 -:--:-- -:--:--
Output file written to ./german_newspapers.arrow
ketos compile --format-type xml --files ./shuf_german_newspapers.list --workers 12 -o ./german_newspapers.arrow  25965,31s user 14011,24s system 1173% cpu 56:45,47 total
Comparing different neural network topologies for training a base model
Here we test the topologies most commonly used in the Kraken (kraken, htru) and Transkribus (htr+) communities, a topology generated by OpenAI's GPT (gpt), and a unique variant derived from these topologies (sgd).
Interestingly, the results showed no significant differences in training cycles or achieved accuracies among most of these topologies. This suggests that the choice of topology may be less critical than previously thought, as long as a basic level of complexity is maintained.
The larger topologies did not show a noticeable positive impact on performance (on our validation set). This may indicate that the training data (only printed text, no handwriting) and its variation in characters and fonts are too low in complexity to fully exploit the potential of these networks. When training on larger and more varied datasets, especially mixed ones (prints and manuscripts), it might therefore be useful to choose a more complex topology.
More detailed information about the specific network topologies, their parameters, and the accuracies achieved can be found in the attached tables. This provides a comprehensive overview and allows interested parties to review the results in detail and evaluate them for their own applications.
Information about the parameters and training
Data preparation
The dataset contains Reichsanzeiger-gt, AustrianNewspapers, and the NZZ black letter ground truth.
time ketos compile --format-type xml --files ./shuf_german_newspapers_31-12-2023.list --workers 12 -o ./german_newspapers_2023_12.arrow
Extracting lines ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 222610/222610 -:--:-- -:--:--
Output file written to ./german_newspapers_2023_12.arrow
ketos compile --format-type xml --files ./shuf_german_newspapers_31-12-2023.list --workers 12 -o ./german_newspapers_2023_12.arrow  21375,98s user 34524,70s system 1569% cpu 59:21,94 total
kraken
This network topology is often used and recommended by Benjamin Kiessling, the main developer of kraken.
It is rather small, but performs quite well in the evaluation.
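For readers unfamiliar with the spec string passed via -s, the kraken VGSL blocks used below decode roughly as follows (a sketch; see kraken's VGSL documentation for the full grammar):
# [1,120,0,1  input plane: batch 1, height 120, variable width, 1 channel
# Cr3,13,32   3x13 convolution with 32 filters and ReLU activation
# Do0.1,2     2D dropout with probability 0.1
# Mp2,2       2x2 max pooling
# S1(1x0)1,3  reshape that collapses the height dimension into the channels
# Lbx200      bidirectional LSTM with 200 hidden units per direction
# Do          final dropout with the default probability
# Gn32        (in the specs further below) group normalization with 32 groups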
time nice ketos train -f binary -o ./20231231/kraken/german_newspapers -d cuda:0 --lag 10 -r 0.0001 -B 4 -w 0 -s '[1,120,0,1 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 S1(1x0)1,3 Lbx200 Do0.1,2 Lbx200 Do0.1,2 Lbx200 Do]' german_newspapers_2023_12.arrow
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A4000 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃ In sizes                 ┃ Out sizes                ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │ 0      │ ?                        │ ?                        │
│ 1  │ val_wer   │ WordErrorRate            │ 0      │ ?                        │ ?                        │
│ 2  │ net       │ MultiParamSequential     │ 4.1 M  │ [[1, 1, 120, 400], '?']  │ [[1, 264, 1, 50], '?']   │
│ 3  │ net.C_0   │ ActConv2D                │ 1.3 K  │ [[1, 1, 120, 400], '?']  │ [[1, 32, 120, 400], '?'] │
│ 4  │ net.Do_1  │ Dropout                  │ 0      │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 5  │ net.Mp_2  │ MaxPool                  │ 0      │ [[1, 32, 120, 400], '?'] │ [[1, 32, 60, 200], '?']  │
│ 6  │ net.C_3   │ ActConv2D                │ 40.0 K │ [[1, 32, 60, 200], '?']  │ [[1, 32, 60, 200], '?']  │
│ 7  │ net.Do_4  │ Dropout                  │ 0      │ [[1, 32, 60, 200], '?']  │ [[1, 32, 60, 200], '?']  │
│ 8  │ net.Mp_5  │ MaxPool                  │ 0      │ [[1, 32, 60, 200], '?']  │ [[1, 32, 30, 100], '?']  │
│ 9  │ net.C_6   │ ActConv2D                │ 55.4 K │ [[1, 32, 30, 100], '?']  │ [[1, 64, 30, 100], '?']  │
│ 10 │ net.Do_7  │ Dropout                  │ 0      │ [[1, 64, 30, 100], '?']  │ [[1, 64, 30, 100], '?']  │
│ 11 │ net.Mp_8  │ MaxPool                  │ 0      │ [[1, 64, 30, 100], '?']  │ [[1, 64, 15, 50], '?']   │
│ 12 │ net.C_9   │ ActConv2D                │ 110 K  │ [[1, 64, 15, 50], '?']   │ [[1, 64, 15, 50], '?']   │
│ 13 │ net.Do_10 │ Dropout                  │ 0      │ [[1, 64, 15, 50], '?']   │ [[1, 64, 15, 50], '?']   │
│ 14 │ net.S_11  │ Reshape                  │ 0      │ [[1, 64, 15, 50], '?']   │ [[1, 960, 1, 50], '?']   │
│ 15 │ net.L_12  │ TransposedSummarizingRNN │ 1.9 M  │ [[1, 960, 1, 50], '?']   │ [[1, 400, 1, 50], '?']   │
│ 16 │ net.Do_13 │ Dropout                  │ 0      │ [[1, 400, 1, 50], '?']   │ [[1, 400, 1, 50], '?']   │
│ 17 │ net.L_14  │ TransposedSummarizingRNN │ 963 K  │ [[1, 400, 1, 50], '?']   │ [[1, 400, 1, 50], '?']   │
│ 18 │ net.Do_15 │ Dropout                  │ 0      │ [[1, 400, 1, 50], '?']   │ [[1, 400, 1, 50], '?']   │
│ 19 │ net.L_16  │ TransposedSummarizingRNN │ 963 K  │ [[1, 400, 1, 50], '?']   │ [[1, 400, 1, 50], '?']   │
│ 20 │ net.Do_17 │ Dropout                  │ 0      │ [[1, 400, 1, 50], '?']   │ [[1, 400, 1, 50], '?']   │
│ 21 │ net.O_18  │ LinSoftmax               │ 105 K  │ [[1, 400, 1, 50], '?']   │ [[1, 264, 1, 50], '?']   │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.1 M
Non-trainable params: 0
Total params: 4.1 M
Total estimated model params size (MB): 16
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:03 • 0:00:00 14.33it/s val_accuracy: 0.985 val_word_accuracy: 0.928 early_stopping: 0/10 0.98519
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:05 • 0:00:00 14.21it/s val_accuracy: 0.989 val_word_accuracy: 0.949 early_stopping: 0/10 0.98946
stage 2/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:05 • 0:00:00 14.16it/s val_accuracy: 0.991 val_word_accuracy: 0.956 early_stopping: 0/10 0.99098
stage 3/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:52 • 0:00:00 14.31it/s val_accuracy: 0.992 val_word_accuracy: 0.961 early_stopping: 0/10 0.99178
stage 4/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.44it/s val_accuracy: 0.992 val_word_accuracy: 0.962 early_stopping: 0/10 0.99218
stage 5/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.26it/s val_accuracy: 0.993 val_word_accuracy: 0.966 early_stopping: 0/10 0.99289
stage 6/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.13it/s val_accuracy: 0.993 val_word_accuracy: 0.966 early_stopping: 0/10 0.99292
stage 7/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.20it/s val_accuracy: 0.993 val_word_accuracy: 0.966 early_stopping: 1/10 0.99292
stage 8/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:59 • 0:00:00 14.02it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 0/10 0.99344
stage 9/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:54 • 0:00:00 14.13it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 1/10 0.99344
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:01 • 0:00:00 14.28it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 2/10 0.99344
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:02 • 0:00:00 14.25it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99362
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:03 • 0:00:00 14.16it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99363
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:00 • 0:00:00 14.30it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99371
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:56 • 0:00:00 14.11it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99380
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:59 • 0:00:00 14.20it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99385
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:54 • 0:00:00 14.08it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99390
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.18it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99390
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:56 • 0:00:00 14.34it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 2/10 0.99390
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:02 • 0:00:00 13.84it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99395
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:52 • 0:00:00 14.17it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99401
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:54 • 0:00:00 14.28it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 1/10 0.99401
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:58 • 0:00:00 14.34it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99402
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:55 • 0:00:00 14.00it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99406
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:56 • 0:00:00 14.13it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99414
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.17it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 1/10 0.99414
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:03 • 0:00:00 14.15it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 2/10 0.99414
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:06 • 0:00:00 13.90it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99419
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:01 • 0:00:00 14.10it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 1/10 0.99419
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:51 • 0:00:00 14.54it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 2/10 0.99419
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.15it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 3/10 0.99419
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.12it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 4/10 0.99419
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:58 • 0:00:00 14.48it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 5/10 0.99419
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:58 • 0:00:00 14.24it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 6/10 0.99419
stage 34/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:55 • 0:00:00 14.31it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 7/10 0.99419
stage 35/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:57 • 0:00:00 14.18it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 8/10 0.99419
stage 36/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:52 • 0:00:00 14.09it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 9/10 0.99419
stage 37/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:55 • 0:00:00 13.97it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99427
stage 38/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:03 • 0:00:00 14.23it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99431
stage 39/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:00 • 0:00:00 14.23it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 1/10 0.99431
stage 40/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:54 • 0:00:00 14.38it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 2/10 0.99431
stage 41/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:55 • 0:00:00 14.08it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 3/10 0.99431
stage 42/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:51 • 0:00:00 14.06it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 4/10 0.99431
stage 43/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:00:04 • 0:00:00 13.78it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 5/10 0.99431
stage 44/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:54 • 0:00:00 13.70it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 6/10 0.99431
stage 45/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:59 • 0:00:00 14.05it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 7/10 0.99431
stage 46/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:00:00 • 0:00:00 13.61it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 8/10 0.99431
stage 47/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:00:03 • 0:00:00 13.76it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 9/10 0.99431
stage 48/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:58 • 0:00:00 13.85it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 10/10 0.99431
Moving best model ./20231231/kraken/german_newspapers_38.mlmodel (0.9943079948425293) to ./20231231/kraken/german_newspapers_best.mlmodel
nice ketos train -f binary -o ./20231231/kraken/german_newspapers -d cuda:0 --lag 10 -r 0.0001 -B 4 -w 0 -s '[1,120,0,1 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,13,32 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 Mp2,2 Cr3,9,64 Do0.1,2 S1(1x0)1,3 Lbx200 Do0.1,2 Lbx200 Do0.1,2 Lbx200 Do]' german_newspapers_2023_12.arrow  210504,49s user 7813,78s system 119% cpu 50:42:36,20 total
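Once a best model exists, it can be applied to new page images. A minimal sketch using the kraken CLI with baseline segmentation (input and output names are placeholders):
# Sketch: segment a page and recognize it with the newly trained model
kraken -i page.png page.txt segment -bl ocr -m ./20231231/kraken/german_newspapers_best.mlmodel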
htru
This network topology is often used by Thibault Clérice and Alix Chagué, the main developers of HTR-United.
It is quite complex and could potentially outperform smaller networks if manuscripts or mixed datasets were used.
time nice ketos train -f binary -o ./20231231/htru/german_newspapers -d cuda:0 --lag 10 -r 0.0001 -B 4 -w 0 -s '[1,120,0,1 Cr4,2,32,4,2 Gn32 Cr4,2,64,1,1 Gn32 Mp4,2,4,2 Cr3,3,128,1,1 Gn32 Mp1,2,1,2 S1(1x0)1,3 Lbx256 Do0.5 Lbx256 Do0.5 Lbx256 Do0.5]' german_newspapers_2023_12.arrow
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A4000 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃ In sizes                 ┃ Out sizes                ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │ 0      │ ?                        │ ?                        │
│ 1  │ val_wer   │ WordErrorRate            │ 0      │ ?                        │ ?                        │
│ 2  │ net       │ MultiParamSequential     │ 5.7 M  │ [[1, 1, 120, 400], '?']  │ [[1, 264, 1, 49], '?']   │
│ 3  │ net.C_0   │ ActConv2D                │ 288    │ [[1, 1, 120, 400], '?']  │ [[1, 32, 30, 200], '?']  │
│ 4  │ net.Gn_1  │ GroupNorm                │ 64     │ [[1, 32, 30, 200], '?']  │ [[1, 32, 30, 200], '?']  │
│ 5  │ net.C_2   │ ActConv2D                │ 16.4 K │ [[1, 32, 30, 200], '?']  │ [[1, 64, 29, 199], '?']  │
│ 6  │ net.Gn_3  │ GroupNorm                │ 128    │ [[1, 64, 29, 199], '?']  │ [[1, 64, 29, 199], '?']  │
│ 7  │ net.Mp_4  │ MaxPool                  │ 0      │ [[1, 64, 29, 199], '?']  │ [[1, 64, 7, 99], '?']    │
│ 8  │ net.C_5   │ ActConv2D                │ 73.9 K │ [[1, 64, 7, 99], '?']    │ [[1, 128, 7, 99], '?']   │
│ 9  │ net.Gn_6  │ GroupNorm                │ 256    │ [[1, 128, 7, 99], '?']   │ [[1, 128, 7, 99], '?']   │
│ 10 │ net.Mp_7  │ MaxPool                  │ 0      │ [[1, 128, 7, 99], '?']   │ [[1, 128, 7, 49], '?']   │
│ 11 │ net.S_8   │ Reshape                  │ 0      │ [[1, 128, 7, 49], '?']   │ [[1, 896, 1, 49], '?']   │
│ 12 │ net.L_9   │ TransposedSummarizingRNN │ 2.4 M  │ [[1, 896, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 13 │ net.Do_10 │ Dropout                  │ 0      │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 14 │ net.L_11  │ TransposedSummarizingRNN │ 1.6 M  │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 15 │ net.Do_12 │ Dropout                  │ 0      │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 16 │ net.L_13  │ TransposedSummarizingRNN │ 1.6 M  │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 17 │ net.Do_14 │ Dropout                  │ 0      │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 18 │ net.O_15  │ LinSoftmax               │ 135 K  │ [[1, 512, 1, 49], '?']   │ [[1, 264, 1, 49], '?']   │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 5.7 M
Non-trainable params: 0
Total params: 5.7 M
Total estimated model params size (MB): 22
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:48:05 • 0:00:00 17.44it/s val_accuracy: 0.987 val_word_accuracy: 0.935 early_stopping: 0/10 0.98660
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:49:45 • 0:00:00 16.97it/s val_accuracy: 0.99 val_word_accuracy: 0.951 early_stopping: 0/10 0.98987
stage 2/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:49:49 • 0:00:00 16.99it/s val_accuracy: 0.991 val_word_accuracy: 0.957 early_stopping: 0/10 0.99111
stage 3/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:36 • 0:00:00 16.62it/s val_accuracy: 0.992 val_word_accuracy: 0.96 early_stopping: 0/10 0.99188
stage 4/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:32 • 0:00:00 16.26it/s val_accuracy: 0.992 val_word_accuracy: 0.964 early_stopping: 0/10 0.99247
stage 5/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:42 • 0:00:00 16.14it/s val_accuracy: 0.993 val_word_accuracy: 0.965 early_stopping: 0/10 0.99282
stage 6/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:55:55 • 0:00:00 13.90it/s val_accuracy: 0.992 val_word_accuracy: 0.963 early_stopping: 1/10 0.99282
stage 7/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:22 • 0:00:00 16.70it/s val_accuracy: 0.993 val_word_accuracy: 0.966 early_stopping: 0/10 0.99306
stage 8/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:55:13 • 0:00:00 13.74it/s val_accuracy: 0.993 val_word_accuracy: 0.967 early_stopping: 0/10 0.99325
stage 9/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:18 • 0:00:00 16.38it/s val_accuracy: 0.993 val_word_accuracy: 0.967 early_stopping: 1/10 0.99325
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:37 • 0:00:00 16.44it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 0/10 0.99350
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:28 • 0:00:00 16.18it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99383
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:08 • 0:00:00 16.45it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99383
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:30 • 0:00:00 16.49it/s val_accuracy: 0.994 val_word_accuracy: 0.969 early_stopping: 2/10 0.99383
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:32 • 0:00:00 16.42it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 3/10 0.99383
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:32 • 0:00:00 16.24it/s val_accuracy: 0.994 val_word_accuracy: 0.969 early_stopping: 4/10 0.99383
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:31 • 0:00:00 16.26it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99404
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:51 • 0:00:00 14.36it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99408
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:07 • 0:00:00 16.57it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 1/10 0.99408
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:35 • 0:00:00 16.37it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99420
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:38 • 0:00:00 14.41it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99424
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:23 • 0:00:00 14.08it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 1/10 0.99424
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:45 • 0:00:00 14.34it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99425
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:49 • 0:00:00 16.64it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 1/10 0.99425
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:37 • 0:00:00 16.34it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 2/10 0.99425
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:36 • 0:00:00 16.78it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99426
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:38 • 0:00:00 16.71it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99434
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:37 • 0:00:00 16.63it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 1/10 0.99434
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:38 • 0:00:00 16.63it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99438
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:22 • 0:00:00 16.68it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 1/10 0.99438
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:37 • 0:00:00 16.46it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 2/10 0.99438
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:41 • 0:00:00 16.21it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 3/10 0.99438
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:41 • 0:00:00 16.32it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 4/10 0.99438
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:37 • 0:00:00 16.66it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 5/10 0.99438
stage 34/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:42 • 0:00:00 16.47it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99439
stage 35/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:43 • 0:00:00 16.42it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99442
stage 36/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:43 • 0:00:00 16.43it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99442
stage 37/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:54:10 • 0:00:00 16.80it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 2/10 0.99442
stage 38/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:29 • 0:00:00 16.31it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 3/10 0.99442
stage 39/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:43 • 0:00:00 16.28it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 4/10 0.99442
stage 40/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:44 • 0:00:00 16.29it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 5/10 0.99442
stage 41/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:21 • 0:00:00 16.64it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 6/10 0.99442
stage 42/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:53:59 • 0:00:00 16.45it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 7/10 0.99442
stage 43/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:29 • 0:00:00 16.48it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 8/10 0.99442
stage 44/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:41 • 0:00:00 16.55it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 9/10 0.99442
stage 45/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:44 • 0:00:00 16.68it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 10/10 0.99442
Moving best model ./20231231/htru/german_newspapers_35.mlmodel (0.9944181442260742) to ./20231231/htru/german_newspapers_best.mlmodel
nice ketos train -f binary -o ./20231231/htru/german_newspapers -d cuda:0 --lag 10 -r 0.0001 -B 4 -w 0 -s '[1,120,0,1 Cr4,2,32,4,2 Gn32 Cr4,2,64,1,1 Gn32 Mp4,2,4,2 Cr3,3,128,1,1 Gn32 Mp1,2,1,2 S1(1x0)1,3 Lbx256 Do0.5 Lbx256 Do0.5 Lbx256 Do0.5]' german_newspapers_2023_12.arrow  183509,75s user 9714,60s system 128% cpu 41:43:41,61 total
htr+
This network topology was developed by CITlab and is used in the Transkribus project.
It is quite complex and could potentially outperform smaller networks if manuscripts or mixed datasets were used.
time nice ketos train -f binary -o ./20231231/htr+/german_newspapers -d cuda:0 --lag 10 -r 0.0001 -B 4 -w 0 -s '[1,128,0,1 Cr4,2,8,4,2 Cr4,2,32,1,1 Mp4,2,4,2 Cr3,3,64,1,1 Mp1,2,1,2 S1(1x0)1,3 Lbx256 Do0.5 Lbx256 Do0.5 Lbx256 Do0.5]' german_newspapers_2023_12.arrow
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A4000 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃ In sizes                 ┃ Out sizes                ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │ 0      │ ?                        │ ?                        │
│ 1  │ val_wer   │ WordErrorRate            │ 0      │ ?                        │ ?                        │
│ 2  │ net       │ MultiParamSequential     │ 4.8 M  │ [[1, 1, 128, 400], '?']  │ [[1, 264, 1, 49], '?']   │
│ 3  │ net.C_0   │ ActConv2D                │ 72     │ [[1, 1, 128, 400], '?']  │ [[1, 8, 32, 200], '?']   │
│ 4  │ net.C_1   │ ActConv2D                │ 2.1 K  │ [[1, 8, 32, 200], '?']   │ [[1, 32, 31, 199], '?']  │
│ 5  │ net.Mp_2  │ MaxPool                  │ 0      │ [[1, 32, 31, 199], '?']  │ [[1, 32, 7, 99], '?']    │
│ 6  │ net.C_3   │ ActConv2D                │ 18.5 K │ [[1, 32, 7, 99], '?']    │ [[1, 64, 7, 99], '?']    │
│ 7  │ net.Mp_4  │ MaxPool                  │ 0      │ [[1, 64, 7, 99], '?']    │ [[1, 64, 7, 49], '?']    │
│ 8  │ net.S_5   │ Reshape                  │ 0      │ [[1, 64, 7, 49], '?']    │ [[1, 448, 1, 49], '?']   │
│ 9  │ net.L_6   │ TransposedSummarizingRNN │ 1.4 M  │ [[1, 448, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 10 │ net.Do_7  │ Dropout                  │ 0      │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 11 │ net.L_8   │ TransposedSummarizingRNN │ 1.6 M  │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 12 │ net.Do_9  │ Dropout                  │ 0      │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 13 │ net.L_10  │ TransposedSummarizingRNN │ 1.6 M  │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 14 │ net.Do_11 │ Dropout                  │ 0      │ [[1, 512, 1, 49], '?']   │ [[1, 512, 1, 49], '?']   │
│ 15 │ net.O_12  │ LinSoftmax               │ 135 K  │ [[1, 512, 1, 49], '?']   │ [[1, 264, 1, 49], '?']   │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 4.8 M
Non-trainable params: 0
Total params: 4.8 M
Total estimated model params size (MB): 19
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:17 • 0:00:00 18.54it/s val_accuracy: 0.983 val_word_accuracy: 0.921 early_stopping: 0/10 0.98301
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:46:34 • 0:00:00 17.92it/s val_accuracy: 0.988 val_word_accuracy: 0.944 early_stopping: 0/10 0.98826
stage 2/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:48:04 • 0:00:00 17.12it/s val_accuracy: 0.991 val_word_accuracy: 0.955 early_stopping: 0/10 0.99067
stage 3/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:48:17 • 0:00:00 17.64it/s val_accuracy: 0.992 val_word_accuracy: 0.96 early_stopping: 0/10 0.99153
stage 4/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:47:39 • 0:00:00 17.59it/s val_accuracy: 0.992 val_word_accuracy: 0.962 early_stopping: 0/10 0.99213
stage 5/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:47:42 • 0:00:00 17.28it/s val_accuracy: 0.993 val_word_accuracy: 0.964 early_stopping: 0/10 0.99261
stage 6/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:47:39 • 0:00:00 17.36it/s val_accuracy: 0.993 val_word_accuracy: 0.966 early_stopping: 0/10 0.99285
stage 7/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:47:45 • 0:00:00 17.65it/s val_accuracy: 0.993 val_word_accuracy: 0.967 early_stopping: 0/10 0.99313
stage 8/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:46:58 • 0:00:00 18.58it/s val_accuracy: 0.993 val_word_accuracy: 0.967 early_stopping: 1/10 0.99313
stage 9/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:45 • 0:00:00 18.53it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 0/10 0.99342
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:43 • 0:00:00 18.07it/s val_accuracy: 0.993 val_word_accuracy: 0.967 early_stopping: 1/10 0.99342
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:43 • 0:00:00 18.41it/s val_accuracy: 0.994 val_word_accuracy: 0.969 early_stopping: 0/10 0.99369
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:46 • 0:00:00 18.34it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99382
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:41 • 0:00:00 18.84it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99389
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:45 • 0:00:00 18.38it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99389
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:45 • 0:00:00 18.49it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99389
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:42 • 0:00:00 18.49it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99403
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:45 • 0:00:00 18.31it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 1/10 0.99403
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:45 • 0:00:00 18.46it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99410
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:46 • 0:00:00 18.41it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99416
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:47 • 0:00:00 18.18it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99416
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:46 • 0:00:00 18.49it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 2/10 0.99416
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:47 • 0:00:00 18.20it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99419
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:49 • 0:00:00 18.49it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99421
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:42 • 0:00:00 18.48it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99430
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:44 • 0:00:00 18.24it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 1/10 0.99430
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:46 • 0:00:00 18.01it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 2/10 0.99430
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:45:44 • 0:00:00 18.34it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 3/10 0.99430
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:18 • 0:00:00 17.95it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 4/10 0.99430
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:53:32 • 0:00:00 14.57it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 0/10 0.99438
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:54:49 • 0:00:00 15.01it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 1/10 0.99438
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:54:25 • 0:00:00 15.30it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 2/10 0.99438
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:48:15 • 0:00:00 17.84it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 3/10 0.99438
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:53:27 • 0:00:00 14.30it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 4/10 0.99438
stage 34/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:49:38 • 0:00:00 17.53it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 5/10 0.99438
stage 35/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:49:29 • 0:00:00 14.25it/s val_accuracy: 0.995 val_word_accuracy: 0.974 early_stopping: 0/10 0.99451
stage 36/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:54:41 • 0:00:00 17.24it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 1/10 0.99451
stage 37/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:52:41 • 0:00:00 17.80it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 2/10 0.99451
stage 38/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:51:58 • 0:00:00 14.66it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 3/10 0.99451
stage 39/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:51:29 • 0:00:00 17.50it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 4/10 0.99451
stage 40/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:47:38 • 0:00:00 17.61it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 5/10 0.99451
stage 41/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:47:39 • 0:00:00 17.47it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 6/10 0.99451
stage 42/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:47:39 • 0:00:00 17.41it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 7/10 0.99451
stage 43/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:50:48 • 0:00:00 14.55it/s val_accuracy: 0.994 val_word_accuracy: 0.974 early_stopping: 8/10 0.99451
stage 44/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:49:31 • 0:00:00 17.31it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 9/10 0.99451
stage 45/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:47:42 • 0:00:00 17.41it/s val_accuracy: 0.994 val_word_accuracy: 0.973 early_stopping: 10/10 0.99451
Moving best model ./20231231/htr+/german_newspapers_35.mlmodel (0.9945129752159119) to ./20231231/htr+/german_newspapers_best.mlmodel
nice ketos train -f binary -o ./20231231/htr+/german_newspapers -d cuda:0 --lag 10 -r 0.0001 -B 4 -w 0 -s '[1,128,0,1 Cr4,2,8,4,2 Cr4,2,32,1,1 Mp4,2,4,2 Cr3,3,64,1,1 Mp1,2,1,2 S1(1x0)1,3 Lbx256 Do0.5 Lbx256 Do0.5 Lbx256 Do0.5]' german_newspapers_2023_12.arrow  171697,09s user 11665,18s system 130% cpu 38:55:34,52 total
gpt
This network topology was recommended by ChatGPT, given the other networks as input.
It is quite complex and could potentially outperform smaller networks if manuscripts or mixed datasets were used.
time nice ketos train -f binary -o ./20231231/gpt/german_newspapers -d cuda:0 --lag 10 -r 0.0001 -B 4 -w 0 -s '[1,120,0,1 Cr3,3,32,1,1 Gn32 Mp2,2 Cr3,3,64,1,1 Gn64 Mp2,2,2,2 Cr3,3,128,1,1 Gn128 Mp2,2,2,2 Cr3,3,256,1,1 Gn256 Mp2,2,2,2 S1(1x0)1,3 Lbx256 Do0.2 Lbx256 Do0.2 Lbx256 Do0.2]' german_newspapers_2023_12.arrow
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A4000 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃ In sizes                 ┃ Out sizes                ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │ 0      │ ?                        │ ?                        │
│ 1  │ val_wer   │ WordErrorRate            │ 0      │ ?                        │ ?                        │
│ 2  │ net       │ MultiParamSequential     │ 7.9 M  │ [[1, 1, 120, 400], '?']  │ [[1, 264, 1, 25], '?']   │
│ 3  │ net.C_0   │ ActConv2D                │ 320    │ [[1, 1, 120, 400], '?']  │ [[1, 32, 120, 400], '?'] │
│ 4  │ net.Gn_1  │ GroupNorm                │ 64     │ [[1, 32, 120, 400], '?'] │ [[1, 32, 120, 400], '?'] │
│ 5  │ net.Mp_2  │ MaxPool                  │ 0      │ [[1, 32, 120, 400], '?'] │ [[1, 32, 60, 200], '?']  │
│ 6  │ net.C_3   │ ActConv2D                │ 18.5 K │ [[1, 32, 60, 200], '?']  │ [[1, 64, 60, 200], '?']  │
│ 7  │ net.Gn_4  │ GroupNorm                │ 128    │ [[1, 64, 60, 200], '?']  │ [[1, 64, 60, 200], '?']  │
│ 8  │ net.Mp_5  │ MaxPool                  │ 0      │ [[1, 64, 60, 200], '?']  │ [[1, 64, 30, 100], '?']  │
│ 9  │ net.C_6   │ ActConv2D                │ 73.9 K │ [[1, 64, 30, 100], '?']  │ [[1, 128, 30, 100], '?'] │
│ 10 │ net.Gn_7  │ GroupNorm                │ 256    │ [[1, 128, 30, 100], '?'] │ [[1, 128, 30, 100], '?'] │
│ 11 │ net.Mp_8  │ MaxPool                  │ 0      │ [[1, 128, 30, 100], '?'] │ [[1, 128, 15, 50], '?']  │
│ 12 │ net.C_9   │ ActConv2D                │ 295 K  │ [[1, 128, 15, 50], '?']  │ [[1, 256, 15, 50], '?']  │
│ 13 │ net.Gn_10 │ GroupNorm                │ 512    │ [[1, 256, 15, 50], '?']  │ [[1, 256, 15, 50], '?']  │
│ 14 │ net.Mp_11 │ MaxPool                  │ 0      │ [[1, 256, 15, 50], '?']  │ [[1, 256, 7, 25], '?']   │
│ 15 │ net.S_12  │ Reshape                  │ 0      │ [[1, 256, 7, 25], '?']   │ [[1, 1792, 1, 25], '?']  │
│ 16 │ net.L_13  │ TransposedSummarizingRNN │ 4.2 M  │ [[1, 1792, 1, 25], '?']  │ [[1, 512, 1, 25], '?']   │
│ 17 │ net.Do_14 │ Dropout                  │ 0      │ [[1, 512, 1, 25], '?']   │ [[1, 512, 1, 25], '?']   │
│ 18 │ net.L_15  │ TransposedSummarizingRNN │ 1.6 M  │ [[1, 512, 1, 25], '?']   │ [[1, 512, 1, 25], '?']   │
│ 19 │ net.Do_16 │ Dropout                  │ 0      │ [[1, 512, 1, 25], '?']   │ [[1, 512, 1, 25], '?']   │
│ 20 │ net.L_17  │ TransposedSummarizingRNN │ 1.6 M  │ [[1, 512, 1, 25], '?']   │ [[1, 512, 1, 25], '?']   │
│ 21 │ net.Do_18 │ Dropout                  │ 0      │ [[1, 512, 1, 25], '?']   │ [[1, 512, 1, 25], '?']   │
│ 22 │ net.O_19  │ LinSoftmax               │ 135 K  │ [[1, 512, 1, 25], '?']   │ [[1, 264, 1, 25], '?']   │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 7.9 M
Non-trainable params: 0
Total params: 7.9 M
Total estimated model params size (MB): 31
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:08 • 0:00:00 15.31it/s val_accuracy: 0.987 val_word_accuracy: 0.934 early_stopping: 0/10 0.98657
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:26 • 0:00:00 14.48it/s val_accuracy: 0.99 val_word_accuracy: 0.951 early_stopping: 0/10 0.98966
stage 2/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:58 • 0:00:00 14.20it/s val_accuracy: 0.991 val_word_accuracy: 0.957 early_stopping: 0/10 0.99089
stage 3/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:46 • 0:00:00 14.48it/s val_accuracy: 0.991 val_word_accuracy: 0.96 early_stopping: 0/10 0.99148
stage 4/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:43 • 0:00:00 13.63it/s val_accuracy: 0.992 val_word_accuracy: 0.963 early_stopping: 0/10 0.99219
stage 5/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:51 • 0:00:00 14.57it/s val_accuracy: 0.992 val_word_accuracy: 0.964 early_stopping: 0/10 0.99229
stage 6/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:56 • 0:00:00 13.41it/s val_accuracy: 0.993 val_word_accuracy: 0.965 early_stopping: 0/10 0.99257
stage 7/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:14 • 0:00:00 14.22it/s val_accuracy: 0.993 val_word_accuracy: 0.966 early_stopping: 0/10 0.99277
stage 8/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:20 • 0:00:00 14.38it/s val_accuracy: 0.993 val_word_accuracy: 0.967 early_stopping: 0/10 0.99284
stage 9/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:21 • 0:00:00 14.67it/s val_accuracy: 0.993 val_word_accuracy: 0.967 early_stopping: 0/10 0.99288
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:21 • 0:00:00 14.70it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 0/10 0.99317
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:36 • 0:00:00 13.09it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 1/10 0.99317
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:00:46 • 0:00:00 14.51it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 2/10 0.99317
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:53 • 0:00:00 14.80it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 0/10 0.99323
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:41 • 0:00:00 14.99it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 0/10 0.99326
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:41 • 0:00:00 14.80it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 1/10 0.99326
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:44 • 0:00:00 14.63it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 0/10 0.99327
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:42 • 0:00:00 14.58it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 0/10 0.99345
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:42 • 0:00:00 14.83it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 1/10 0.99345
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:43 • 0:00:00 14.85it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99353
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:46 • 0:00:00 14.61it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 1/10 0.99353
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:12 • 0:00:00 14.19it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 2/10 0.99353
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:22 • 0:00:00 14.09it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 3/10 0.99353
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:00:30 • 0:00:00 14.71it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99356
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:27 • 0:00:00 15.03it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 1/10 0.99356
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:42 • 0:00:00 14.42it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 2/10 0.99356
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:46 • 0:00:00 14.44it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 3/10 0.99356
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:01 • 0:00:00 14.50it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 4/10 0.99356
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:45 • 0:00:00 14.54it/s val_accuracy: 0.993 val_word_accuracy: 0.969 early_stopping: 5/10 0.99356
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:40 • 0:00:00 14.19it/s val_accuracy: 0.994 val_word_accuracy: 0.969 early_stopping: 6/10 0.99356
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:45 • 0:00:00 13.70it/s val_accuracy: 0.994 val_word_accuracy: 0.969 early_stopping: 7/10 0.99356
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:43 • 0:00:00 14.88it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 8/10 0.99356
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:29 • 0:00:00 14.86it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99357
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:00 • 0:00:00 13.28it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99358
stage 34/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:43 • 0:00:00 15.01it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99360
stage 35/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:25 • 0:00:00 14.71it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99360
stage 36/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:10 • 0:00:00 14.92it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 2/10 0.99360
stage 37/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:06 • 0:00:00 14.31it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99364
stage 38/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:20 • 0:00:00 13.65it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99364
stage 39/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:10 • 0:00:00 14.97it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 2/10 0.99364
stage 40/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:05 • 0:00:00 14.74it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 3/10 0.99364
stage 41/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:58 • 0:00:00 14.84it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 4/10 0.99364
stage 42/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:26 • 0:00:00 13.70it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 5/10 0.99364
stage 43/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:31 • 0:00:00 14.74it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99369
stage 44/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:08 • 0:00:00 14.99it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99369
stage 45/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:00:15 • 0:00:00 14.65it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 2/10 0.99369
stage 46/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:55 • 0:00:00 14.42it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99370
stage 47/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:27 • 0:00:00 14.38it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99370
stage 48/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:55 • 0:00:00 14.74it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99371
stage 49/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:27 • 0:00:00 14.39it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99371
stage 50/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:13 • 0:00:00 14.28it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99372
stage 51/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:59:01 • 0:00:00 14.41it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99372
stage 52/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:58:44 • 0:00:00 14.40it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 2/10 0.99372
stage 53/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:38 • 0:00:00 14.47it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 3/10 0.99372
stage 54/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:31 • 0:00:00 14.33it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 4/10 0.99372
stage 55/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:33 • 0:00:00 14.31it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99380
stage 56/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:26 • 0:00:00 14.73it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99380
stage 57/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:57:31 • 0:00:00 14.40it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 2/10 0.99380
stage 58/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:36 • 0:00:00 15.01it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 3/10 0.99380
stage 59/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:34 • 0:00:00 14.69it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99384
stage 60/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:37 • 0:00:00 14.88it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99384
stage 61/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:39 • 0:00:00 14.71it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 2/10 0.99384
stage 62/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:35 • 0:00:00 14.78it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 3/10 0.99384
stage 63/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:38 • 0:00:00 14.74it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 4/10 0.99384
stage 64/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:36 • 0:00:00 14.83it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 5/10 0.99384
stage 65/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:36 • 0:00:00 15.05it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 6/10 0.99384
stage 66/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:37 • 0:00:00 14.84it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 7/10 0.99384
stage 67/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:36 • 0:00:00 14.76it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 8/10 0.99384
stage 68/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:55 • 0:00:00 14.55it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 9/10 0.99384
stage 69/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 0:56:38 • 0:00:00 14.83it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 10/10 0.99384
Moving best model ./20231231/gpt/german_newspapers_56.mlmodel (0.9943616986274719) to ./20231231/gpt/german_newspapers_best.mlmodel
sgd
This network topology was developed by Jan Kamlah in the OCR-D project, based on the other networks as input.
It is quite complex and could potentially outperform smaller networks if manuscripts or mixed datasets were used.
time nice ketos train -f binary -o ./20231231/sgd/german_newspapers -d cuda:0 --lag 10 -r 0.0001 -B 4 -w 0 -s '[1,144,0,1 Cr4,2,16,1,1 Mp4,2 Cr2,2,48,1,1 Gn24 Mp2,2 Cr2,2,72,1,1 Gn36 Mp2,2 S1(1x0)1,3 Lbx288 Do0.2,2 Lbx288 Do0.2,2 Lbx288]' german_newspapers_2023_12.arrow
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA RTX A4000 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Name      ┃ Type                     ┃ Params ┃ In sizes                 ┃ Out sizes                ┃
┡━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 0  │ val_cer   │ CharErrorRate            │ 0      │ ?                        │ ?                        │
│ 1  │ val_wer   │ WordErrorRate            │ 0      │ ?                        │ ?                        │
│ 2  │ net       │ MultiParamSequential     │ 6.2 M  │ [[1, 1, 144, 400], '?']  │ [[1, 264, 1, 49], '?']   │
│ 3  │ net.C_0   │ ActConv2D                │ 144    │ [[1, 1, 144, 400], '?']  │ [[1, 16, 143, 399], '?'] │
│ 4  │ net.Mp_1  │ MaxPool                  │ 0      │ [[1, 16, 143, 399], '?'] │ [[1, 16, 35, 199], '?']  │
│ 5  │ net.C_2   │ ActConv2D                │ 3.1 K  │ [[1, 16, 35, 199], '?']  │ [[1, 48, 34, 198], '?']  │
│ 6  │ net.Gn_3  │ GroupNorm                │ 96     │ [[1, 48, 34, 198], '?']  │ [[1, 48, 34, 198], '?']  │
│ 7  │ net.Mp_4  │ MaxPool                  │ 0      │ [[1, 48, 34, 198], '?']  │ [[1, 48, 17, 99], '?']   │
│ 8  │ net.C_5   │ ActConv2D                │ 13.9 K │ [[1, 48, 17, 99], '?']   │ [[1, 72, 16, 98], '?']   │
│ 9  │ net.Gn_6  │ GroupNorm                │ 144    │ [[1, 72, 16, 98], '?']   │ [[1, 72, 16, 98], '?']   │
│ 10 │ net.Mp_7  │ MaxPool                  │ 0      │ [[1, 72, 16, 98], '?']   │ [[1, 72, 8, 49], '?']    │
│ 11 │ net.S_8   │ Reshape                  │ 0      │ [[1, 72, 8, 49], '?']    │ [[1, 576, 1, 49], '?']   │
│ 12 │ net.L_9   │ TransposedSummarizingRNN │ 2.0 M  │ [[1, 576, 1, 49], '?']   │ [[1, 576, 1, 49], '?']   │
│ 13 │ net.Do_10 │ Dropout                  │ 0      │ [[1, 576, 1, 49], '?']   │ [[1, 576, 1, 49], '?']   │
│ 14 │ net.L_11  │ TransposedSummarizingRNN │ 2.0 M  │ [[1, 576, 1, 49], '?']   │ [[1, 576, 1, 49], '?']   │
│ 15 │ net.Do_12 │ Dropout                  │ 0      │ [[1, 576, 1, 49], '?']   │ [[1, 576, 1, 49], '?']   │
│ 16 │ net.L_13  │ TransposedSummarizingRNN │ 2.0 M  │ [[1, 576, 1, 49], '?']   │ [[1, 576, 1, 49], '?']   │
│ 17 │ net.O_14  │ LinSoftmax               │ 152 K  │ [[1, 576, 1, 49], '?']   │ [[1, 264, 1, 49], '?']   │
└────┴───────────┴──────────────────────────┴────────┴──────────────────────────┴──────────────────────────┘
Trainable params: 6.2 M
Non-trainable params: 0
Total params: 6.2 M
Total estimated model params size (MB): 24
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:04:58 • 0:00:00 13.09it/s val_accuracy: 0.988 val_word_accuracy: 0.939 early_stopping: 0/10 0.98759
stage 1/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:29 • 0:00:00 12.79it/s val_accuracy: 0.991 val_word_accuracy: 0.953 early_stopping: 0/10 0.99055
stage 2/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:34 • 0:00:00 12.94it/s val_accuracy: 0.992 val_word_accuracy: 0.958 early_stopping: 0/10 0.99168
stage 3/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:33 • 0:00:00 12.86it/s val_accuracy: 0.992 val_word_accuracy: 0.962 early_stopping: 0/10 0.99236
stage 4/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:37 • 0:00:00 12.87it/s val_accuracy: 0.993 val_word_accuracy: 0.964 early_stopping: 0/10 0.99273
stage 5/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:41 • 0:00:00 12.74it/s val_accuracy: 0.993 val_word_accuracy: 0.966 early_stopping: 0/10 0.99314
stage 6/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:47 • 0:00:00 12.27it/s val_accuracy: 0.993 val_word_accuracy: 0.968 early_stopping: 0/10 0.99344
stage 7/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:48 • 0:00:00 12.95it/s val_accuracy: 0.994 val_word_accuracy: 0.968 early_stopping: 0/10 0.99355
stage 8/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:07:48 • 0:00:00 12.51it/s val_accuracy: 0.993 val_word_accuracy: 0.967 early_stopping: 1/10 0.99355
stage 9/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:48 • 0:00:00 12.70it/s val_accuracy: 0.994 val_word_accuracy: 0.969 early_stopping: 0/10 0.99366
stage 10/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:07:07 • 0:00:00 12.74it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99392
stage 11/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:43 • 0:00:00 12.55it/s val_accuracy: 0.994 val_word_accuracy: 0.969 early_stopping: 1/10 0.99392
stage 12/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:31 • 0:00:00 12.47it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99397
stage 13/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:42 • 0:00:00 12.41it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99400
stage 14/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:32 • 0:00:00 12.41it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 0/10 0.99403
stage 15/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:19 • 0:00:00 12.77it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99403
stage 16/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:27 • 0:00:00 12.84it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 2/10 0.99403
stage 17/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:25 • 0:00:00 12.74it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 3/10 0.99403
stage 18/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:26 • 0:00:00 12.89it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99414
stage 19/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:29 • 0:00:00 12.79it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99414
stage 20/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:34 • 0:00:00 12.92it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 2/10 0.99414
stage 21/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:25 • 0:00:00 12.79it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 3/10 0.99414
stage 22/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:26 • 0:00:00 12.94it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99416
stage 23/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:26 • 0:00:00 12.60it/s val_accuracy: 0.994 val_word_accuracy: 0.97 early_stopping: 1/10 0.99416
stage 24/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:31 • 0:00:00 12.81it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 2/10 0.99416
stage 25/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:28 • 0:00:00 12.73it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 0/10 0.99426
stage 26/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:27 • 0:00:00 12.62it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99426
stage 27/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:31 • 0:00:00 12.73it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 2/10 0.99426
stage 28/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:26 • 0:00:00 12.78it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99429
stage 29/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:34 • 0:00:00 11.81it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99429
stage 30/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:30 • 0:00:00 12.41it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 0/10 0.99436
stage 31/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:07:38 • 0:00:00 12.34it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 1/10 0.99436
stage 32/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:08:01 • 0:00:00 12.13it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 2/10 0.99436
stage 33/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:07:04 • 0:00:00 12.69it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 3/10 0.99436
stage 34/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:15 • 0:00:00 12.95it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 4/10 0.99436
stage 35/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:07:30 • 0:00:00 11.97it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 5/10 0.99436
stage 36/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:06 • 0:00:00 12.78it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 6/10 0.99436
stage 37/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:40 • 0:00:00 12.85it/s val_accuracy: 0.994 val_word_accuracy: 0.972 early_stopping: 7/10 0.99436
stage 38/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:06:42 • 0:00:00 12.65it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 8/10 0.99436
stage 39/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:41 • 0:00:00 12.94it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 9/10 0.99436
stage 40/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50088/50088 1:05:43 • 0:00:00 12.49it/s val_accuracy: 0.994 val_word_accuracy: 0.971 early_stopping: 10/10 0.99436
Moving best model ./20231231/sgd/german_newspapers_30.mlmodel (0.9943616986274719) to ./20231231/sgd/german_newspapers_best.mlmodel
nice ketos train -f binary -o ./20231231/sgd/german_newspapers -d cuda:0 199080,08s user 13389,96s system 123% cpu 47:37:04,78 total
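For readers unfamiliar with kraken's VGSL notation, here is an annotated reading of the sgd spec above. The interpretation is our own and should be checked against the kraken VGSL documentation, but it matches the model summary printed by the trainer:
# 1,144,0,1      input: batch 1, line height normalized to 144 px, variable width, 1 channel
# Cr4,2,16,1,1   4x2 convolution, 16 filters, stride 1x1, ReLU        (net.C_0)
# Mp4,2          4x2 max pooling                                      (net.Mp_1)
# Cr2,2,48,1,1   2x2 convolution, 48 filters, ReLU                    (net.C_2)
# Gn24           group normalization with 24 groups                   (net.Gn_3)
# Mp2,2          2x2 max pooling                                      (net.Mp_4)
# Cr2,2,72,1,1   2x2 convolution, 72 filters, ReLU                    (net.C_5)
# Gn36           group normalization with 36 groups                   (net.Gn_6)
# Mp2,2          2x2 max pooling                                      (net.Mp_7)
# S1(1x0)1,3     reshape: fold the remaining height into the channels (net.S_8)
# Lbx288         bidirectional LSTM along the width axis, 288 units   (net.L_9)
# Do0.2,2        dropout, p = 0.2                                     (net.Do_10)
# Lbx288 Do0.2,2 Lbx288   two further BiLSTM layers with dropout      (net.L_11 to net.L_13)
# The final LinSoftmax output layer (net.O_14) is not part of the spec; ketos
# appends it automatically, sized to the alphabet of the training data.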
Evaluation
The evaluation set contains pages from the newspaper Hakenkreuzbanner that were not included in the training data.
The evaluation also includes a newly trained Tesseract model (german_newspapers) and the currently best Tesseract model for historical sources (frak2021).
All topologies give quite similar results; training on a larger dataset would be needed to probe the potential of the more complex network structures.
For this or similar datasets we recommend the kraken default topology, as it is the smallest model and gives the best results.
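The per-page scores below were produced with our evaluation scripts. A comparable character accuracy for a single model can be obtained with kraken's own ketos test; the following is only a sketch, with an illustrative model path and the fdfind file selection reused from the data preparation above, not the exact command behind the tables:
# Sketch: score one trained model against the held-out Hakenkreuzbanner ground truth
ketos test --format-type xml -m ./20231231/default/german_newspapers_best.mlmodel $(fdfind -a --full-path './hkb-gt/data' -e xml)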
🏆 Top models for /home/jkamlah/Documents/projects/OCR-D/Models/Evaluation/german_print/hkb_1941-09-03_0005
🥇 99.58 Kraken_german_newspapers_default_deep
🥈 99.57 Kraken_german_newspapers_htr+
🥉 99.57 Kraken_german_newspapers_default
✔️ 99.56 Kraken_german_newspapers_sdg
✔️ 99.56 Kraken_german_newspapers_gpt
✔️ 99.16 Tesseract_german_newspapers
✔️ 98.02 Tesseract_frak2021
✔️ 97.47 Kraken_digi_tue
🏆 Top models for /home/jkamlah/Documents/projects/OCR-D/Models/Evaluation/german_print/hkb_1931-01-03_0003
🥇 99.77 Kraken_german_newspapers_default
🥈 99.76 Kraken_german_newspapers_default_deep
🥉 99.70 Kraken_german_newspapers_gpt
✔️ 99.68 Kraken_german_newspapers_htr+
✔️ 99.61 Kraken_german_newspapers_sdg
✔️ 99.50 Kraken_digi_tue
✔️ 99.12 Tesseract_german_newspapers
✔️ 98.58 Tesseract_frak2021
🏆 Top models for /home/jkamlah/Documents/projects/OCR-D/Models/Evaluation/german_print/hkb_1943-04-01_0008
🥇 99.62 Kraken_german_newspapers_htr+
🥈 99.55 Kraken_german_newspapers_default
🥉 99.54 Kraken_german_newspapers_gpt
✔️ 99.52 Kraken_german_newspapers_default_deep
✔️ 99.48 Kraken_german_newspapers_sdg
✔️ 99.11 Tesseract_german_newspapers
✔️ 98.75 Kraken_digi_tue
✔️ 97.95 Tesseract_frak2021
🏆 Top models for /home/jkamlah/Documents/projects/OCR-D/Models/Evaluation/german_print/hkb_1931-01-03_0011
🥇 100.00 Kraken_german_newspapers_sdg
🥈 100.00 Kraken_german_newspapers_default_deep
🥉 100.00 Kraken_german_newspapers_default
✔️ 99.80 Kraken_german_newspapers_gpt
✔️ 99.79 Kraken_german_newspapers_htr+
✔️ 99.52 Kraken_digi_tue
✔️ 99.41 Tesseract_german_newspapers
✔️ 99.12 Tesseract_frak2021
🏆 Top models for /home/jkamlah/Documents/projects/OCR-D/Models/Evaluation/german_print/hkb_1937-11-21_0026
🥇 99.77 Kraken_german_newspapers_sdg
🥈 99.77 Kraken_german_newspapers_default
🥉 99.77 Kraken_german_newspapers_gpt
✔️ 99.73 Kraken_german_newspapers_default_deep
✔️ 99.68 Kraken_german_newspapers_htr+
✔️ 99.48 Tesseract_german_newspapers
✔️ 99.25 Kraken_digi_tue
✔️ 99.01 Tesseract_frak2021
🏆 Top models for /home/jkamlah/Documents/projects/OCR-D/Models/Evaluation/german_print/hkb_1945-01-01_0022
🥇 99.01 Kraken_german_newspapers_default_deep
🥈 99.00 Kraken_german_newspapers_default
🥉 98.96 Kraken_german_newspapers_htr+
✔️ 98.80 Kraken_german_newspapers_gpt
✔️ 98.79 Kraken_german_newspapers_sdg
✔️ 97.63 Kraken_digi_tue
✔️ 96.20 Tesseract_frak2021
✔️ 96.20 Tesseract_german_newspapers
🔽 Top models over all
🥇 99.61 Kraken_german_newspapers_default
🥈 99.60 Kraken_german_newspapers_default_deep
🥉 99.55 Kraken_german_newspapers_htr+
✔️ 99.54 Kraken_german_newspapers_sdg
✔️ 99.53 Kraken_german_newspapers_gpt
✔️ 98.75 Tesseract_german_newspapers
✔️ 98.69 Kraken_digi_tue
✔️ 98.15 Tesseract_frak2021
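To apply the recommended model to new scans, the standard kraken pipeline (baseline segmentation followed by recognition) can be used. A minimal sketch, with illustrative file names and assuming the best default model was stored under ./20231231/default/ analogous to the gpt and sgd runs above:
# Sketch: OCR a single newspaper page with the trained model
kraken -i page.png page.txt segment -bl ocr -m ./20231231/default/german_newspapers_best.mlmodel
Passing the -a flag to kraken before the subcommands should serialize the result as ALTO XML instead of plain text, which is usually more convenient for newspaper layouts.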