2018 02 28 - PaddlePaddle/Paddle GitHub Wiki
Main focus: ParallelDo in Multiple GPUs
Issue:
- The problem of improving the performance of Parallel_Do
PR:
- Backward on parallel do using nccl:
- Python framework:
Review:
- https://github.com/PaddlePaddle/Paddle/pull/8516
- https://github.com/PaddlePaddle/Paddle/pull/8489
- https://github.com/PaddlePaddle/Paddle/pull/8402
- https://github.com/PaddlePaddle/Paddle/pull/8471
Inference:
- Integrate float16 into data_type_transform:
- Enable is_test for batch norm and dropout op:
- [WIP] add float16 GEMM GPU function in math_function
- Review:
Inference: PR/issue:
- Profiling C++ inference api for recognize digits model: https://github.com/PaddlePaddle/Paddle/pull/8497
- Results for this analysis: https://github.com/sidgoyal78/paddle_notes/blob/master/benchmark/recoginze_digits.md
- Survey TensorRT for inference: https://github.com/PaddlePaddle/Paddle/issues/8492
- CSP
- Exposing Channel to be used as a Variable and integrating with Fluid https://github.com/PaddlePaddle/Paddle/pull/8486
- Add unit tests for ChannelHolder https://github.com/PaddlePaddle/Paddle/pull/8486
- Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops https://github.com/PaddlePaddle/Paddle/pull/8593
- Adding more unit tests for ChannelHolder class https://github.com/PaddlePaddle/Paddle/pull/8668
- Review:
- Exposing Channel to be used as a Variable and integrating with Fluid: https://github.com/PaddlePaddle/Paddle/pull/8486
- Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops: https://github.com/PaddlePaddle/Paddle/pull/8593
- nvcc fatal errors on TeamCity: https://github.com/PaddlePaddle/Paddle/issues/8501
- PR review:
- inference:
- refine inference_lib_dist after code move, and add it to docker/build.sh: https://github.com/PaddlePaddle/Paddle/pull/8379
- combine batch_size_like.cc into batch_size_like.h: https://github.com/PaddlePaddle/Paddle/pull/8604
- compile:
- Move Fluid C++ code from /paddle to /paddle/fluid:
- move Fluid API doc/code out of V2 API doc/code:
- set the default value of WITH_FAST_BUNDLE_TEST to OFF: https://github.com/PaddlePaddle/Paddle/pull/8563
- document
- reduce doc build time in Travis CI (from 30+ min to 4 min):
- Adjust the structure of API, Operators, cluster and quick start (Both Chinese and English):
- update generate_paddle_docs.sh in paddlepaddle.org repo: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/409
- code review:
- Simplify the cmake of inference: https://github.com/PaddlePaddle/Paddle/pull/8272
- Refine cmake for cudnn op: https://github.com/PaddlePaddle/Paddle/pull/8591
- [Intel] MKLDNN conv2d and pool2d OP kernels added: https://github.com/PaddlePaddle/Paddle/pull/8451
- [doc] add introduction:
- fix distributed training bug, make sure the demo code works.
- fix v2 async sgd update, https://github.com/PaddlePaddle/Paddle/pull/8474
- English tutorial doc on cloud repo, https://github.com/PaddlePaddle/cloud/pull/621
- review:
- https://github.com/PaddlePaddle/cloud/pull/617#pullrequestreview-99173766
- https://github.com/PaddlePaddle/Paddle/pull/8538#pullrequestreview-99175816
- https://github.com/PaddlePaddle/Paddle/pull/8656#pullrequestreview-100022647
- https://github.com/PaddlePaddle/Paddle/pull/8634#pullrequestreview-99954888
- turn on the cmake flag WITH_DISTRIBUTE on CI so that the wheel package supports distributed training.
- Enhance layer_generator
- https://github.com/PaddlePaddle/Paddle/pull/8543
- mean(x=layer_out) --> mean(layer_out)
- Moving unique_name to python. We can reset the unique_name generator now
- Demo about switching optimizers
- Demo about stacked denoising autoencoder
- Make global_step a global variable in Fluid
- Several Enhancements
- learning rate decay (https://github.com/PaddlePaddle/Paddle/issues/7769)
- Fix compare op https://github.com/PaddlePaddle/Paddle/pull/8532
- create learning rate for multiple programs https://github.com/PaddlePaddle/Paddle/pull/8545
- change learning_rate_decay to learning_rate_scheduler https://github.com/PaddlePaddle/Paddle/pull/8583
- multi gpu profile
- parallel-do should not merge the gradients of parameters with stop_gradient=True https://github.com/PaddlePaddle/Paddle/pull/8652
- se_resnet_50 multi-gpu profile https://github.com/PaddlePaddle/Paddle/issues/8661
- add c-api quick start https://github.com/PaddlePaddle/Paddle/pull/8566
- discuss
- The problem of improving the performance of Parallel_Do https://github.com/PaddlePaddle/Paddle/issues/8592
- Review
- Fine Tune MNIST by Adam and SGD https://github.com/PaddlePaddle/Paddle/pull/8570
- simplify shape inference code https://github.com/PaddlePaddle/Paddle/pull/8087
- Enhance layer_function_generator https://github.com/PaddlePaddle/Paddle/pull/8543
- Moving unique_name to python https://github.com/PaddlePaddle/Paddle/pull/8524
- A new design of model save/load:
- [WIP] Disassemble evaluator:
- Reviews:
- SSD on Fluid:
- [Merged] Enhance bipartite_match_op to support argmax matching after bipartite matching.
- [Merged] Register more data type for reshape operator.
- [Merged] Enable the SSD loss to support normalization by the total number of output locations.
- [Merged] Fix the backward transpiler bug in ssd_loss API.
- Verify the correctness of SSD loss:
- [WIP] Verify the correctness of detection output
- Review:
- Fix box coder op: https://github.com/PaddlePaddle/Paddle/pull/8647
- Other:
- Help Huawei debug the AR demo on Fluid.
DeepASR:
- Convergence verification on single GPU
- Performance profiling
- Some enhancements
- Fix the profiler's bug in multi-gpu mode
Code Review:
Main focus: ParallelDo in Multiple GPUs
- PR:
- Reviews:
- https://github.com/PaddlePaddle/Paddle/pull/8665#pullrequestreview-100223755
- https://github.com/PaddlePaddle/Paddle/issues/8504#event-1488164262
- https://github.com/PaddlePaddle/Paddle/issues/8480#event-1488164544
- https://github.com/PaddlePaddle/Paddle/pull/8471#pullrequestreview-98784602
- https://github.com/PaddlePaddle/Paddle/issues/8500
- https://github.com/PaddlePaddle/Paddle/issues/8592#issuecomment-368693237
- https://github.com/PaddlePaddle/Paddle/pull/8550#pullrequestreview-100248472
- CI fixes:
- fluid
- PR, conv sequence to sequence
- finished, tuning with data
- PR, v2 API doc overview: https://github.com/PaddlePaddle/Paddle/pull/8547
- PR, conv sequence to sequence
- visualdl
- Distribute training:
- TensorFlow: https://github.com/PaddlePaddle/Paddle/pull/8522
- Change script to support distributed TensorFlow on k8s: https://github.com/PaddlePaddle/cloud/pull/617
- Document:
- cluster train: https://github.com/PaddlePaddle/Paddle/pull/8622
- Fix document generation bugs: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/418
- Review:
- Inference Framework
- [Merged] Refine the inference API and unittest
- Write a basic user guide of Fluid inference
- Review
- combine batch_size_like.cc into batch_size_like.h: https://github.com/PaddlePaddle/Paddle/pull/8604
- Inference example and unittest for NMT model: https://github.com/PaddlePaddle/Paddle/pull/8314
- Get rid of the dependency of Go compiler when WITH_GOLANG is OFF
- Profile data reader for DeepASR https://github.com/PaddlePaddle/models/issues/673
- RNN Beam search https://github.com/PaddlePaddle/Paddle/issues/8603 https://github.com/PaddlePaddle/models/pull/675
- Debug and tune the Transformer model against the PyTorch reference implementation.
- PR:
- Remove the losses from paddings in Transformer
- Add learning rate scheduling in Transformer
- CI speed up
- Bisect culprit commit:
- Timeline Profiler
- Reviews
- PR
- [WIP] Refine concat_op
- Refine Sum in elementwise_op_function
- Refine cmake for cudnn op
- Add tuple type
- Fix conv_op bug
- refine FAQ doc
- fix get_mid_dims annotation
- Review
- Refine Sum in elementwise_op_function
- Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops
- Add unit tests for ChannelHolder
- Extend current profiler for timeline and more features
- Enhance bipartite_match_op to support argmax matching after bipartite matching
- Add ceil_mode option for pool2d and pool3d
- CI updates
- CI docker graph location update, TeamCity data relocation to larger storage
- daily TeamCity backup
- daily docker prune
- Review
- PR
- GPU perf
- https://github.com/PaddlePaddle/Paddle/pull/8550
- https://github.com/PaddlePaddle/Paddle/issues/8638
- https://github.com/PaddlePaddle/Paddle/pull/8634
- https://github.com/PaddlePaddle/Paddle/pull/8573
- https://github.com/PaddlePaddle/Paddle/pull/8538
- https://github.com/PaddlePaddle/Paddle/pull/8512
- review: https://github.com/PaddlePaddle/Paddle/pull/8600
- review: https://github.com/PaddlePaddle/cloud/pull/617
- reviews etc.
- EDL:
PaddlePaddle.org
- Fix issue of MathJax equations and images not rendering correctly when user clicks on a new link. (https://github.com/PaddlePaddle/PaddlePaddle.org/pull/412)
- Fix issue with permalink not showing entire title (https://github.com/PaddlePaddle/PaddlePaddle.org/pull/415)
Paddle
- Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops (https://github.com/PaddlePaddle/Paddle/pull/8593)
- Unittests concurrency (https://github.com/PaddlePaddle/Paddle/pull/8666)
Other
- Worked on Visual Debugger Tech Talk with Varun
- Fix error message on charts: https://github.com/PaddlePaddle/VisualDL/pull/279
- Add the Histogram related Vue files: https://github.com/PaddlePaddle/VisualDL/pull/278
- Add the Graph.vue and the Config.vue for Graph tab: https://github.com/PaddlePaddle/VisualDL/pull/277
- Fix the incorrect pagination issue: https://github.com/PaddlePaddle/VisualDL/pull/276
- Allow the navigation bar to persist the selected item style: https://github.com/PaddlePaddle/VisualDL/pull/274
- Include the pymdownx extensions: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/425
- Fix incorrect markdown and RST file: https://github.com/PaddlePaddle/Paddle/pull/8667
- Fix scalar issues: https://github.com/PaddlePaddle/VisualDL/pull/283
- Show Scalar Data and add ExpandPanel https://github.com/PaddlePaddle/VisualDL/pull/272
- Add theme and UI https://github.com/PaddlePaddle/VisualDL/pull/280
- Fix font size and reorganize css and stylus https://github.com/PaddlePaddle/VisualDL/pull/287
- Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops (https://github.com/PaddlePaddle/Paddle/pull/8593)
- Worked on Visual Debugger Tech Talk with Thuan