2017 12 27 - PaddlePaddle/Paddle GitHub Wiki
- Gradient Check of RNN - [WIP] https://github.com/PaddlePaddle/Paddle/pull/7068
- Tensor::has_nan/has_inf
- Evaluator raises an exception on NaN/Inf
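
As an illustration of the NaN/Inf items above, here is a minimal sketch in plain Python. It is not the `Tensor::has_nan`/`has_inf` implementation; the function names and the choice of `FloatingPointError` are assumptions made for the example.

```python
import math

def has_nan(values):
    """Return True if any element of the flat sequence is NaN."""
    return any(math.isnan(v) for v in values)

def has_inf(values):
    """Return True if any element is +Inf or -Inf."""
    return any(math.isinf(v) for v in values)

def check_finite(name, values):
    """Raise instead of letting NaN/Inf propagate silently (hypothetical helper)."""
    if has_nan(values) or has_inf(values):
        raise FloatingPointError(f"{name} contains NaN or Inf")

check_finite("loss", [0.25, 1.5])  # finite values pass without raising
```

Failing fast in the evaluator makes a diverging run stop at the first bad value rather than reporting meaningless metrics.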
 
- Rename API of DevCtx
- Polish Scope::LocalVarNames
- Speed up ColwiseSum in CPU
- Rewrite AdamOp
- Set RelWithDebInfo
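
For context on the AdamOp item, the sketch below shows the standard Adam update applied element by element. It is a reference for the math only, assuming the usual bias-corrected form; `adam_step` and its parameters are illustrative names, not the Paddle kernel.

```python
import math

def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one in-place Adam update; t is the 1-based step count."""
    # Fold the bias correction of both moments into the learning rate.
    lr_t = lr * math.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)
    for i in range(len(params)):
        g = grads[i]
        m[i] = beta1 * m[i] + (1.0 - beta1) * g      # first-moment estimate
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g  # second-moment estimate
        params[i] -= lr_t * m[i] / (math.sqrt(v[i]) + eps)
    return params

params, m, v = [1.0], [0.0], [0.0]
adam_step(params, [0.5], m, v, t=1, lr=0.1)
```

Each index is updated independently of the others, which is why this kind of loop maps naturally onto a per-element functor.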
Multi-device:
- add data layout
- add library type
- refine OpKernelType
- add memory switch mechanism in operator kernel switch
- cache memory in local scope
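
One way to picture the multi-device items above is a kernel registry keyed by place, data type, data layout and library type. The sketch below assumes that reading; `KernelKey`, `register` and `pick` are hypothetical names, and the real OpKernelType may differ in detail.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KernelKey:
    """Hypothetical key describing what a kernel expects."""
    place: str        # e.g. "CPUPlace" or "CUDAPlace"
    data_type: str    # e.g. "float32"
    data_layout: str  # e.g. "NCHW" or "NHWC"
    library: str      # e.g. "plain", "cuDNN", "MKLDNN"

_KERNELS = {}

def register(op_name, key, fn):
    """Register one kernel function for an operator under a key."""
    _KERNELS[(op_name, key)] = fn

def pick(op_name, key):
    """Look up the kernel for a key; raises KeyError if none is registered."""
    return _KERNELS[(op_name, key)]

cpu = KernelKey("CPUPlace", "float32", "NCHW", "plain")
gpu = KernelKey("CUDAPlace", "float32", "NCHW", "cuDNN")
register("relu", cpu, lambda xs: [max(x, 0.0) for x in xs])
register("relu", gpu, lambda xs: [max(x, 0.0) for x in xs])  # stand-in for a GPU kernel
```

Because the key is frozen and hashable, several kernels can be registered for the same operator and selected with a single dictionary lookup.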
Fix and Enhance
- update docs on supporting new devices
- remove unused place
- remove unused usage_stat script: https://github.com/PaddlePaddle/Paddle/pull/6880
- unify the indentation of license: https://github.com/PaddlePaddle/Paddle/pull/7022
- refine CMakeLists.txt for adding ops that need DEPS: https://github.com/PaddlePaddle/Paddle/pull/7067
- MKL
- update alexnet training data: https://github.com/PaddlePaddle/Paddle/pull/6878
- Add "download mklml failed" into FAQ: https://github.com/PaddlePaddle/Paddle/pull/7009
- code review:
- enable alexnet benchmark: https://github.com/PaddlePaddle/Paddle/pull/6852
- use small samples to infer openblas: https://github.com/PaddlePaddle/Paddle/pull/6755
- enable MKL Packed Recurrent Layer: https://github.com/PaddlePaddle/Paddle/pull/6719
 
 
- doc
- Complete refactor of backward
- Update DataFeeder and inference model io according to users' feedback
- Other improvements and fixes:
- Reviews:
- Update doc of V2 api
- Performance validation of understand_sentiment in fluid https://github.com/PaddlePaddle/Paddle/pull/7004 https://github.com/PaddlePaddle/Paddle/issues/7046
- Add GPU support for NCE_layer.
- [WIP] Implement adaptive softmax.
- Book.04 word2vec speed performance comparison with V2.
- Set up ONNX environment and learned how it should interact with VisualDL
- Finished graph data design for graph in VisualDL
- Add edges to graph proto so that frontend can render more easily (WIP)
- Updated data format design for VisualDL
- Serialize and Deserialize SelectedRows, https://github.com/PaddlePaddle/Paddle/pull/7042
- BlockingCounter for ThreadPool, https://github.com/PaddlePaddle/Paddle/pull/7000
- Bug fix
- install python-tk, https://github.com/PaddlePaddle/Paddle/pull/7095
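
The BlockingCounter item above refers to a common synchronization pattern: a caller blocks until a known number of tasks have reported completion. A minimal sketch with Python's standard library follows; the class and method names are assumptions and do not mirror the C++ code in the PR.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class BlockingCounter:
    """Block in wait() until decrement() has been called `initial_count` times."""

    def __init__(self, initial_count):
        self._count = initial_count
        self._cond = threading.Condition()

    def decrement(self):
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()  # wake every waiter once all tasks are done

    def wait(self):
        with self._cond:
            while self._count > 0:       # loop guards against spurious wakeups
                self._cond.wait()

results = []
results_lock = threading.Lock()
counter = BlockingCounter(4)

def task(i):
    with results_lock:
        results.append(i * i)
    counter.decrement()

with ThreadPoolExecutor(max_workers=2) as pool:
    for i in range(4):
        pool.submit(task, i)
    counter.wait()  # returns only after all four tasks have finished
```

The counter lets the submitting thread wait for a batch without holding on to each task's handle.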
 
- PR Review:
- Profiling:
- Refine how the activation type is fetched in the LSTM operator to speed it up.
- Speed up data reader for IMDB dataset.
- Optimize the rowwise add function.
- Speed on a model with three stacked LSTMs:
- GPU: 166.95994s -> 87.30287s
- CPU: 385.2211s -> 294.90407s
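
To put the timings above in relative terms, the short sketch below computes the speedup ratio and the share of time saved. Only the four timing values come from these notes; the helper name is illustrative.

```python
# (before, after) wall-clock seconds, copied from the notes above
timings = {"GPU": (166.95994, 87.30287), "CPU": (385.2211, 294.90407)}

def speedup(before, after):
    """Ratio of old time to new time; values above 1 mean faster."""
    return before / after

for device, (before, after) in timings.items():
    saved = (1 - after / before) * 100
    print(f"{device}: {speedup(before, after):.2f}x, {saved:.1f}% less time")
```

This works out to roughly 1.9x on GPU and 1.3x on CPU.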
 
 
- Benchmark Model:
- Make the ResNet of TensorFlow consistent with Paddle
 
- Mobile:
- Code Review:
- Implement ResNeXt for image classification
- Working on SENet [WIP]
- Multi Device
- Code optimization
- Review
- Doc:
- Polish accuracy doc: https://github.com/PaddlePaddle/Paddle/pull/7091
- Fix transpose op doc: https://github.com/PaddlePaddle/Paddle/pull/7020
 
- Models test:
- Use 'time' to monitor resources while running model training
- Add script to analyze the training log
 
- VGG16 performance comparison with TensorFlow
- Convergence comparison with TF on CPU
- Speed comparison with TF on CPU
- Internal convergence comparison on CPU and GPU
- Memory allocation comparison with TF
- Update and merge the VGG16 benchmark scripts
 
- Add the parsing part for the profiling tool
- Polish the doc of cross_entropy_op
- Fix problems in two docs
- Code Review:
PR
- Refine cos-sim-op
- Refine sgd-op
- Add conv2d_python doc
- Fix embedding example
Performance analysis: ResNet and VGG16
Review
- remove GPU Sync Interface
- Refine CUDA profiler and delete the test file
- Use for_range to rewrite adam
- Speed up data reader for IMDB dataset.
- Optimize the rowwise add function
- Add vgg16 benchmark configuration
- detection_output op (for SSD): in progress, under code review
- norm op: in progress, under code review
- run caffe ssd demo
- Framework
- Add a simple C++ inference example for fluid
 
- Mobile
- Always link protobuf-lite for mobile inference
 
- Multi box loss operator: https://github.com/PaddlePaddle/Paddle/pull/6946
- code review:
- run paddle v2 SSD demo
- fluid
- VisualDL with @longfei @daming
- models ci with @haoshuang reviews https://github.com/PaddlePaddle/regtest/pull/8#pullrequestreview-85679209 https://github.com/PaddlePaddle/regtest/pull/9#pullrequestreview-85760472
- improve send/recv op:
- single thread block => async: https://github.com/PaddlePaddle/Paddle/compare/develop...gongweibao:asyncsendrecv?expand=1
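
The move from a single blocking thread to asynchronous sends can be pictured as follows. This is only a sketch of the general idea using a thread pool; `send` is a hypothetical stand-in for the RPC, and nothing here reflects the actual send/recv op code.

```python
from concurrent.futures import ThreadPoolExecutor

def send(endpoint, payload):
    """Hypothetical stand-in for a blocking RPC; returns an acknowledgement."""
    return f"{endpoint}:ack:{len(payload)}"

def send_all_async(endpoints, payload, max_workers=4):
    """Issue all sends concurrently, then collect the replies in endpoint order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(send, ep, payload) for ep in endpoints]
        return [f.result() for f in futures]

acks = send_all_async(["ps0", "ps1"], b"grad")
```

With blocking calls the total latency is the sum over endpoints; issuing them concurrently brings it closer to the slowest single call.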
 
- Fix bugs:
- create vars bugs: https://github.com/PaddlePaddle/Paddle/pull/7060
- Fix demo code bug in usage doc: https://github.com/PaddlePaddle/cloud/pull/535
 
- ISSUE:
- code review:
- refine distributed transpiler
- add scatter functors
- Fluid
- add DataType Transform
- Fix ThreadPool
- add multi kernel register
- add data layout in Tensor
- switch GPUPlace with CUDAPlace
- fix/copyfrom context
- refine op_kernel key
- remove GPU Sync interface
- switch OperatorBase Run with place / Global DeviceContext
- fix/Place
 
- Benchmark
- Reviews:
- https://github.com/dzhwinter/benchmark/pull/35
- https://github.com/dzhwinter/benchmark/pull/34
- PR & Review
- [optimized] https://github.com/PaddlePaddle/Paddle/pull/7034
- [multi-thread] https://github.com/PaddlePaddle/Paddle/pull/6751
- [CAPI-doc] https://github.com/PaddlePaddle/Paddle/pull/6596
 
- Add stacked dynamic lstm model for fluid 
 https://github.com/dzhwinter/benchmark/pull/34
- Add seq2seq model for tf 
 https://github.com/dzhwinter/benchmark/pull/31
- Add stacked dynamic lstm model for tf 
 https://github.com/dzhwinter/benchmark/pull/35
- Code Review 
 https://github.com/PaddlePaddle/Paddle/pull/6986#pullrequestreview-85519491
 https://github.com/PaddlePaddle/Paddle/pull/6779#pullrequestreview-85241336