2018 05 16 - PaddlePaddle/Paddle GitHub Wiki
- add inplace attribute, for memory reuse
- https://github.com/PaddlePaddle/Paddle/pull/10665
- eigen ci testing
- https://github.com/PaddlePaddle/Paddle/pull/10731
- fix compile testing
- https://github.com/PaddlePaddle/Paddle/pull/10667
- https://github.com/PaddlePaddle/Paddle/pull/10629
- https://github.com/PaddlePaddle/Paddle/pull/10450
- https://github.com/PaddlePaddle/Paddle/pull/10657
- benchmarkSuite framework, for consistent checking
- https://github.com/PaddlePaddle/Paddle/pull/10646
- https://github.com/PaddlePaddle/Paddle/pull/10624
- support uint8 in data_type transform
- https://github.com/PaddlePaddle/Paddle/pull/10715
- refine batch norm op
- https://github.com/PaddlePaddle/Paddle/pull/10502
- Build Scripts:
- Add a new feature to run a single unit test: https://github.com/PaddlePaddle/Paddle/pull/10678
- Generate dockerfile after each build: https://github.com/PaddlePaddle/Paddle/pull/10719
- Cleanup old build scripts: https://github.com/PaddlePaddle/Paddle/pull/10721
- Teamcity CI
- Using different machines to debug the “test_all_ops” failure.
- [WIP] Try to enable ccache to speed up build
- [WIP] Debug build “C_API” failure in nightly build
- Fix document error: https://github.com/PaddlePaddle/Paddle/pull/10726
- Fix data_feeder LoD bug
- Simplify recognize digits example code
- [WIP] Make the creation of LoDTensor more user friendly in book examples
- Fix signed/unsigned comparison warning
- Review:
- mkldnn:
- plan the July 5th roadmap with Intel:
- First, implement inference optimization for the OCR Vehicle Plate Recognition model: https://github.com/PaddlePaddle/Paddle/issues/10685
- Profile the OCR Vehicle Plate Recognition model: 8%~10% slower than the online version.
- [merge] update mklml version, disable building tests and examples when installing mkldnn, fix compiler error on gcc 4.8, fix dead link in mkldnn doc: https://github.com/PaddlePaddle/Paddle/pull/10571, https://github.com/PaddlePaddle/Paddle/pull/10567, https://github.com/intel/mkl-dnn/pull/238
- plan the July 5th roadmap with Intel:
- [merge] implement conversion of the relu op to tensorrt, with its unit test: https://github.com/PaddlePaddle/Paddle/pull/10495
- code review:
- mkldnn:
- [merge] Reuse softmax mkldnn primitives: https://github.com/PaddlePaddle/Paddle/pull/10576
- Patch mkldnn for build on gcc 4.8.2: https://github.com/PaddlePaddle/Paddle/pull/10616
- [bug] remove light_imdb & tiny_imdb: https://github.com/PaddlePaddle/models/pull/907
- [bug] Add profile in aishell example: https://github.com/PaddlePaddle/models/pull/910
- crnn-ctc label was not found: https://github.com/PaddlePaddle/models/pull/912
- inference:
- mkldnn:
- Test the performance of some TCP frameworks
- Help the OCR group solve:
- Core dump after calling the inference API.
- Slowdown when linking with the fluid library.
- Bug fixes and enhancements
- [API] Trainer.train
- Reader related jobs:
- Support uint8 of tensor
- [WIP] RandomCropOp
- AWS tool: vgg16 dist train as an example integration with CE
- tool adjustment done
- vgg16 dist train error https://github.com/PaddlePaddle/Paddle/issues/10720
- Trying to bring nightly back online
- reviews
- [Merged] Adapt the converter to tensorrt backend
- Add Inception_v4 model config in Fluid API
- Add inception-v4 to supported models in onnx converter
- [DeepASR] Add profile in aishell example
- [WIP] Add multiple groups in conv_transpose_op to support face detection model
Code Review:
- https://github.com/PaddlePaddle/Paddle/pull/10621
- https://github.com/PaddlePaddle/paddle-onnx/pull/48
- benchmark scripts polish https://github.com/PaddlePaddle/Paddle/pull/10707
- fix possible memory leak https://github.com/PaddlePaddle/Paddle/pull/10663
- NCCL2 dist train perf tests: https://docs.google.com/spreadsheets/d/1D5Xc_TfGfMV5aKh4ZJS_b4js3Mnn06H1Po0iuECZLr4/edit?pli=1#gid=1453458907
- NMT:
- Add beamsearch decoder using while_op in Transformer (WIP).
- Refine Transformer code.
- Review:
- Enhanced is_empty_op
- Enhancing assign_value_op
- Found a bug in conditional block
- Simple RNN beam search
- Enhance reduce op
- Add dice loss:
- PR:
- Balance parameter_opt between cards
- Feature/fuse reduce op
- Update SE-Resnext
- The current SE-Resnext 152 benchmark result on P40
- Fix pe bug [prevent loss.grad from being optimized by mem_opt]
- Refine fetch op handle
- Review:
- Fix a profiler race condition
- Delete prefetch_ctx_ after use
- Our tests interfere with each other and cause random failure
- Revert "CI: rerun failed tests. (#10536)"
- Add a multi-dim add layer test.
- Add comment to explain how to run inference test
- timeline for distributed training
- add some instructions for running vgg in distributed mode
- allow inference test to generate timeline
- CI merge with latest develop to avoid stale PR
- code review for distributed training, multi-gpu training and inference.