Misc Ideas - cr88192/bgbtech_misc GitHub Wiki
Possible stand alone stream compressor for BTLZH.
- Would serve a similar purpose and have a similar interface to GZIP or XZ.
- Reuse GZIP format, but with BTLZH payload.
- Use BTIC CTLV or BTLV2
- Compressed buffer would consist of a number of compressed segments, each encoded independently.
- Segment size will be specified in a header, but is likely measured in MB for normal files.
- Likely, the segment size will be between 1x and 4x the size of the sliding window.
- params:WORD
- Indicates which optional parameters are present.
- segSize:BYTE
- log2 of segment size.
- winSize:BYTE
- log2 of window size.
- Each will consist of a Context-Dependent Data marker, which in turn holds compressed data in Zlib format.
- Either Deflate or BTLZH or BTLZA may be used.
Unary prefix followed by ((Q+1)*(k+(Q>>2))>>1) bits.
- k=0: 0, 10, 110, 1110, 111100 111101 ...
- / k=1: 00 01, 1000..1011, 110000..110111, 111000000..111011111, 111100000000..111101111111, ...
- k=1: 0, 100 101, 1100..1101, 111000..111011, 11110000..11110111, ...
- k=2: 00 01, 1000..1011, 110000..110111, 11100000..11101111, ...
- ...
Base (for Q):
- k=0: 0, 1, 2, 3, 4, 6, ...
- k=1: 0, 1, 3, 5, 9, ...
- k=2: 0, 2, 6, 14, 30, ...
Idea: Scheme similar to AdRice, but tweaking the code scheme.
As before, the Rk factor will specify a constant number of extra bits.
So, Prefixes:
- 0, 0
- 1, 10
- 2, 110
- 3, 1110
- 4+, 1111 (Gamma)
- Encode Q-4 using a Gamma code.
- Gamma code consists of a Unary code (N) followed by N bits.
AdRice + Damage Control
AdRiceDC:
Q=0..7: Encoded as AdRice Otherwise, use a VLI (with multiples of 5 bits) 0- 31: 1111-1111 0xxx-xx 32-1023: 1111-1111 10xx-xxxx xxxx 1024-32767: 1111-1111 110x-xxxx xxxx-xxxx xx 32768- 1M: 1111-1111 1110-xxxx xxxx-xxxx xxxx-xxxx 1M- 128M: 1111-1111 1111-0xxx xxxx-xxxx xxxx-xxxx xxxx-xx 128M- 1G: 1111-1111 1111-10xx xxxx-xxxx xxxx-xxxx xxxx-xxxx xxxx
In VLI case, K is increased by 2 for each multiple of 5 bits in the suffix.
For AdRice:
- 0: k=k-1
- 1: k=k
- 2: k=k+1
- 3: k=k+1
- 4..7: k=k+2
- Works, but contrived.
- Initial form was distribution sensitive, better for some cases, but worse for others.
- Changing cutoff from 8 to (8+k) and 5 to (5+k) seems to have improved results.