Strangely enough, for these many years I've learnt programming, never have I tried compressing a file with hand-written code.
Huffman encoding is one of the most classic lossless compression algorithms. Here I'm using it to compress/decompress files as a coding practice.
It turns out to be more difficult than I expected, as there are quite a lot of things to consider about:
- Binary format design
- File IO
- Binary tree
- Priority queue (optional, for optimization of codebook generation)
- Bit stream reader/writer (the most tricky and buggy one)
- State machine (not necessary if performance is not the concern)
What is the probability that the compressed size is exactly the same? I've encountered one in the first few minutes of testing.
Decompressor