[20220524] Compression Roadmap 2022 - microsoft/nni GitHub Wiki
Design Note
Compression is simulated by replacing selected nodes in the graph with wrapped versions. Note that wrapping only the current node is not always equivalent to the real compressed model, because the compression of one node can change what the surrounding nodes compute.
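One case where wrapping a single node diverges from real compression is a pruned channel followed by a layer that adds a per-channel shift. The toy sketch below is plain Python, not NNI code: `conv`, `batch_norm`, `head`, and `MaskedConv` are hypothetical stand-ins that reduce each layer to scalar arithmetic so the mismatch is easy to see.

```python
# Hypothetical illustration; none of these names are NNI APIs.

def conv(x, weights):
    # One output channel per weight: a 1x1 convolution on a scalar input.
    return [w * x for w in weights]

def batch_norm(channels, shifts):
    # Inference-time batch norm reduced to a per-channel shift (scale fixed at 1).
    return [c + b for c, b in zip(channels, shifts)]

def head(channels, weights):
    # Final linear layer that sums the weighted channels.
    return sum(c * w for c, w in zip(channels, weights))

class MaskedConv:
    """Wrapper that simulates pruning by masking the wrapped node's weights."""

    def __init__(self, weights, mask):
        self.weights = weights
        self.mask = mask

    def __call__(self, x):
        masked = [w * m for w, m in zip(self.weights, self.mask)]
        return conv(x, masked)

conv_w = [2.0, 3.0]
bn_shift = [0.5, 1.0]
head_w = [1.0, 4.0]
x = 1.0

# Simulated: channel 1 of the conv is masked; batch norm and head are untouched.
wrapped = MaskedConv(conv_w, mask=[1, 0])
simulated = head(batch_norm(wrapped(x), bn_shift), head_w)

# Actual: channel 1 is physically removed from the conv, batch norm, and head.
compressed = head(batch_norm(conv(x, conv_w[:1]), bn_shift[:1]), head_w[:1])

print(simulated, compressed)  # 6.5 2.5
```

The masked channel still carries the batch-norm shift into the head, so the simulated output differs from the compressed one unless the wrapper also accounts for the downstream nodes.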
Work Items
base
evaluator (handle train, validate, hook, patch...)
config list refactor
specify compression target (input, output, weight, ...)
specify compression algo (including related parameters, such as sparse pattern and quant bits)
Support for variable compression targets
compressor & wrapper refactor: provide a unified interface for parsing the config list
basic pruner refactor & quantizer design
super compressor? Can most existing basic pruners/quantizers be implemented by configuring a super compressor?
universal wrapper
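As a rough illustration of the config list items above, the sketch below shows one possible shape for a config entry that names both the compression target and the algorithm parameters, plus a single parsing function that flattens it. The schema (`op_names`, `targets`, `algo`, `sparse_ratio`, `pattern`, `quant_bits`) and the names `TargetSetting` and `parse_config_list` are assumptions made for this example, not the NNI config format.

```python
from dataclasses import dataclass, field

# Hypothetical schema; the real NNI config list may look different.
VALID_TARGETS = {"input", "output", "weight"}

@dataclass
class TargetSetting:
    op_name: str
    target: str
    algo: str
    params: dict = field(default_factory=dict)

def parse_config_list(config_list):
    """Flatten a config list into one setting per (op, target) pair."""
    settings = []
    for config in config_list:
        for op_name in config["op_names"]:
            for target, spec in config["targets"].items():
                if target not in VALID_TARGETS:
                    raise ValueError(f"unknown compression target: {target}")
                spec = dict(spec)  # copy so the caller's dict is not mutated
                algo = spec.pop("algo")
                settings.append(TargetSetting(op_name, target, algo, spec))
    return settings

config_list = [
    {
        "op_names": ["conv1", "conv2"],
        "targets": {
            "weight": {"algo": "pruning", "sparse_ratio": 0.5, "pattern": "channel"},
            "output": {"algo": "quantization", "quant_bits": 8},
        },
    }
]

settings = parse_config_list(config_list)
for setting in settings:
    print(setting)
```

With a single parser like this, pruners and quantizers could consume the same normalized settings instead of each interpreting the raw config on its own.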
pruning
refactor sparse pattern
metric calculator & sparsity allocator
migrate to evaluator
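The split between a metric calculator and a sparsity allocator can be pictured as two small functions: one scores each channel, the other turns scores and a target ratio into a mask. The sketch below assumes an L1-norm metric and a per-layer "prune the lowest scores" rule. `l1_metric` and `allocate_mask` are hypothetical names for this example, and the weights are plain nested lists rather than tensors.

```python
# Hypothetical illustration of the metric calculator / sparsity allocator split.

def l1_metric(weight):
    # weight: a list of output channels, each a list of floats.
    return [sum(abs(v) for v in channel) for channel in weight]

def allocate_mask(metrics, sparse_ratio):
    # Prune the channels with the lowest metric until the ratio is met.
    num_pruned = int(len(metrics) * sparse_ratio)
    order = sorted(range(len(metrics)), key=lambda i: metrics[i])
    pruned = set(order[:num_pruned])
    return [0 if i in pruned else 1 for i in range(len(metrics))]

weight = [[0.1, -0.2], [1.0, 2.0], [-0.05, 0.0], [0.5, -0.5]]
metrics = l1_metric(weight)
mask = allocate_mask(metrics, sparse_ratio=0.5)
print(mask)  # [0, 1, 0, 1]
```

Keeping the two steps separate means a different metric or a different allocation rule can be swapped in without touching the other half.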
quantization
refactor design (key consideration: experiment, evaluator, wrapper, conv-bn-fusion)
experiment
wrap tuner as strategy
support more pruners & quantizers
a good strategy (how to search in search space)
speedup
make mask propagation a standalone module
quantization speedup: support more backends
benchmark
visualization