extmod/uzlib: Add gzip compression support. #5613

Closed
andrewleech wants to merge 3 commits into micropython:master from andrewleech:gzip

Conversation

@andrewleech
Contributor

Exposes basic compression support from uzlib. I originally wrote this nearly a year ago so don't really remember too much about it.

Pushed up to support #5590

Doesn't have any unit tests written... probably needs some first. Possibly some #define's to enable/disable the compress functionality?

Comment thread extmod/moduzlib.c
memset(comp, 0, sizeof(*comp));

comp->dict_size = 32768;
comp->hash_bits = 12;
Contributor

This and the previous line could use a comment stating why these values were chosen, I think (I know nothing about gzip though).

Contributor

@codefreax codefreax Feb 13, 2020

Would it make sense to expose these settings in the API?
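
For context, a rough sketch of what the two values quoted above imply. It assumes, without confirmation from this thread, that the hash table holds `1 << hash_bits` pointer-sized slots; `ENTRY_SIZE` is a hypothetical stand-in for the pointer size on a 32-bit target.

```python
# Rough arithmetic for the two compressor settings quoted in the review.
DICT_SIZE = 32768   # value taken from the patch
HASH_BITS = 12      # value taken from the patch

# DEFLATE back-references can reach at most 32 KiB (2**15 bytes),
# which is also what wbits=15 means in zlib-style APIs.
MAX_DEFLATE_WINDOW = 1 << 15

# ASSUMPTION: one pointer-sized slot per hash bucket (4 bytes on a 32-bit MCU).
ENTRY_SIZE = 4

hash_entries = 1 << HASH_BITS
hash_table_bytes = hash_entries * ENTRY_SIZE

print("dict_size is the maximum DEFLATE window:", DICT_SIZE == MAX_DEFLATE_WINDOW)
print("hash buckets:", hash_entries, "table bytes (assumed):", hash_table_bytes)
```

If the assumption holds, lowering `hash_bits` would trade compression ratio for RAM, which is one argument for exposing the settings.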

Comment thread extmod/moduzlib.c Outdated
uzlib_compress(comp, bufinfo.buf, len);
zlib_finish_block(&comp->out);

printf("compressed from %u to %u raw bytes\n", len, comp->out.outlen);
Contributor

Use DEBUG_printf ?

Comment thread extmod/moduzlib.c Outdated

printf("compressed from %u to %u raw bytes\n", len, comp->out.outlen);

mp_uint_t dest_buf_size = (comp->out.outlen + 6 + (3*sizeof(int)));
Contributor

Same remark as earlier, what are these magic constants?
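
One plausible reading of those constants, sketched in Python: a gzip member without optional fields (RFC 1952) has a 10-byte header and an 8-byte trailer, and `6 + 3*sizeof(int)` equals 18 only when `int` is 4 bytes wide. The names below are illustrative and do not come from the patch.

```python
import struct

# gzip member layout (RFC 1952) with no optional header fields:
#   header : ID1 ID2 CM FLG | MTIME (4 bytes, little-endian) | XFL OS
#   trailer: CRC32 (4 bytes, little-endian) | ISIZE (4 bytes, little-endian)
GZIP_HEADER_FMT = "<BBBBIBB"
GZIP_TRAILER_FMT = "<II"

HEADER_SIZE = struct.calcsize(GZIP_HEADER_FMT)
TRAILER_SIZE = struct.calcsize(GZIP_TRAILER_FMT)

def gzip_buffer_size(deflate_len):
    """Bytes needed for header + raw DEFLATE stream + trailer."""
    return HEADER_SIZE + deflate_len + TRAILER_SIZE

# ASSUMPTION: sizeof(int) == 4 on the target; the patch's expression
# only matches the gzip framing overhead in that case.
SIZEOF_INT = 4
patch_overhead = 6 + 3 * SIZEOF_INT

print(HEADER_SIZE, TRAILER_SIZE, patch_overhead)
```

Spelling the overhead as named header and trailer sizes removes the dependency on the platform's `int` width.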

@stinos
Contributor

stinos commented Feb 6, 2020

There's MICROPY_PY_UZLIB already, but I guess splitting that up in separate compress/decompress doesn't hurt and can be used to keep the original behavior (i.e. not enabling compression by default).

A bunch of tests would be nice indeed, and fairly easy to write (assuming asserting that decompress(compress(x)) == x is sufficient for proving it works)
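
A minimal sketch of such a round-trip test. It uses CPython's `zlib.compress` and `zlib.decompress` as stand-ins, since the compress function from this PR is not available there; on a MicroPython build with this patch, the two callables would be swapped for the port's own functions. `check_round_trip` is a hypothetical helper name.

```python
import os
import zlib

def check_round_trip(compress, decompress, samples):
    """Assert decompress(compress(x)) == x for every sample; return the count."""
    for data in samples:
        restored = decompress(compress(data))
        assert bytes(restored) == data, "round trip failed for %d bytes" % len(data)
    return len(samples)

# Cover the empty input, tiny inputs, highly repetitive data,
# every byte value, and incompressible random data.
samples = [
    b"",
    b"a",
    b"Hello World!",
    b"abc" * 1000,
    bytes(range(256)),
    os.urandom(512),
]

checked = check_round_trip(zlib.compress, zlib.decompress, samples)
print("round-tripped", checked, "samples")
```

A round trip alone does not prove the output is valid gzip for other tools, so decoding the result with an independent implementation would be a useful second check.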

@dpgeorge added the extmod label (Relates to extmod/ directory in source) Feb 6, 2020
Comment thread extmod/moduzlib.c Outdated
Comment on lines +239 to +241
int mtime = 0;
memcpy(&dest_buf[i], &mtime, sizeof(mtime));
i += sizeof(mtime);
Contributor

According to Wikipedia, mtime must be 4 bytes. Use uint32_t here?

Comment thread extmod/moduzlib.c Outdated
Comment on lines +216 to +217
struct uzlib_comp *comp = m_new_obj(struct uzlib_comp);
memset(comp, 0, sizeof(*comp));
Contributor

nit: Merge into m_new0(struct uzlib_comp, 1);?

Comment thread extmod/moduzlib.c Outdated
Comment on lines +235 to +238
dest_buf[i++] = 0x1f;
dest_buf[i++] = 0x8b;
dest_buf[i++] = 0x08;
dest_buf[i++] = 0x00; // FLG
Contributor

nit: Can you add a few more comments here and below where there are more magic numbers?
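
As an illustration of the kind of annotation being asked for, here is the fixed 10-byte gzip header from RFC 1952 built field by field in Python. The first four bytes match the values in the patch; the XFL and OS values are choices made for this sketch, not necessarily what the patch writes.

```python
import struct

def gzip_header(mtime=0):
    """Build the fixed 10-byte gzip header (RFC 1952, no optional fields)."""
    return struct.pack(
        "<BBBBIBB",
        0x1F,                # ID1: first magic byte
        0x8B,                # ID2: second magic byte
        0x08,                # CM: compression method 8 = DEFLATE
        0x00,                # FLG: no FTEXT/FHCRC/FEXTRA/FNAME/FCOMMENT
        mtime & 0xFFFFFFFF,  # MTIME: 4-byte little-endian Unix time, 0 = unknown
        0x00,                # XFL: no extra flags (choice made for this sketch)
        0xFF,                # OS: 255 = unknown (choice made for this sketch)
    )

header = gzip_header()
print(header.hex())
```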

Comment thread extmod/moduzlib.c Outdated
Comment on lines +248 to +250
unsigned int crc = ~uzlib_crc32(bufinfo.buf, len, ~0);
memcpy(&dest_buf[i], &crc, sizeof(crc));
i += sizeof(crc);
Contributor

uint32_t?

Comment thread extmod/moduzlib.c Outdated
mp_obj_t data = args[0];
mp_buffer_info_t bufinfo;
mp_get_buffer_raise(data, &bufinfo, MP_BUFFER_READ);
int len = bufinfo.len;
Contributor

This is written to the output; should this be uint32_t (or whatever the right size is)?
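
To make the width question concrete: the gzip trailer consists of two 4-byte little-endian fields, CRC32 and ISIZE (the input length modulo 2**32), so a fixed-width 32-bit type matches the format on every platform. The sketch below uses CPython's `zlib` and `gzip` modules, not the PR's C code, and checks a hand-assembled member with an independent decoder; the helper names are hypothetical.

```python
import gzip
import struct
import zlib

def gzip_trailer(data):
    """CRC32 and ISIZE, each packed as exactly 4 little-endian bytes."""
    crc = zlib.crc32(data) & 0xFFFFFFFF
    isize = len(data) & 0xFFFFFFFF  # length modulo 2**32
    return struct.pack("<II", crc, isize)

def raw_deflate(data):
    """Raw DEFLATE stream with no zlib or gzip framing (wbits=-15)."""
    comp = zlib.compressobj(wbits=-15)
    return comp.compress(data) + comp.flush()

# Fixed header: magic, CM=8, FLG=0, MTIME=0, XFL=0, OS=255 (unknown).
HEADER = bytes([0x1F, 0x8B, 0x08, 0x00, 0, 0, 0, 0, 0x00, 0xFF])

data = b"Hello World!"
member = HEADER + raw_deflate(data) + gzip_trailer(data)

# gzip.decompress verifies both the CRC32 and the ISIZE field.
restored = gzip.decompress(member)
print(restored)
```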

@QAMU

QAMU commented Apr 27, 2020

Hello,
With this commit, how do I compress data in MicroPython?
Can I use uzlib.compress(data)?

@andrewleech
Contributor Author

Hi @QAMU I think it worked like uzlib.gzip(data)
I haven't used it for some time though, not sure it's ever likely to get completed enough to merge.

@QAMU

QAMU commented Apr 28, 2020

Hi @andrewleech,
Thanks for your reply. Can I use uzlib.decompress() to decompress the output of uzlib.gzip()?

@andrewleech
Contributor Author

yep the two should work together well

@harbaum

harbaum commented Feb 22, 2021

It seems compression and decompression don't work with each other:

>>> uzlib.decompress(uzlib.gzip("Hello World!"))
compressed from 12 to 14 raw bytes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: -3

@harbaum

harbaum commented Feb 27, 2021

It seems compression and decompression don't work with each other:

They do if one removes the gzip header, crc and length fields and uses the correct wbits:

>>> uzlib.decompress(uzlib.gzip("Hello World!")[10:-8], -15)
compressed from 12 to 14 raw bytes
bytearray(b'Hello World!')
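
The same behaviour can be reproduced on CPython, which may help explain the `-3` error: a decompressor expecting a zlib header rejects the gzip magic bytes, while stripping the 10-byte header and 8-byte trailer leaves a raw DEFLATE stream that decodes with `wbits=-15`. This sketch uses CPython's `zlib` with `wbits=31` as a stand-in for the PR's `uzlib.gzip`; `gzip_compress` is a hypothetical helper.

```python
import zlib

def gzip_compress(data):
    """gzip-framed output via zlib (wbits=31 selects the gzip wrapper)."""
    comp = zlib.compressobj(wbits=31)
    return comp.compress(data) + comp.flush()

data = b"Hello World!"
gz = gzip_compress(data)

# Default wbits expects a zlib header, so gzip input is rejected
# (CPython reports this as zlib error -3, "incorrect header check").
try:
    zlib.decompress(gz)
    default_rejected = False
except zlib.error:
    default_rejected = True

# Drop the 10-byte header and the 8-byte CRC32/ISIZE trailer,
# then decode the remaining raw DEFLATE stream.
restored = zlib.decompress(gz[10:-8], -15)

print(default_rejected, restored)
```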

@harbaum

harbaum commented Mar 1, 2021

Updated attempt here.

@QAMU

QAMU commented Mar 1, 2021

to compress use:

import uzlib

def compress(buffer):
    # uzlib.gzip() is the function added by this PR
    encoded = uzlib.gzip(buffer)
    return encoded

to decompress:

import uzlib 
import uio
def decompress(buffer):
    file = uio.BytesIO(buffer)
    decoded = uzlib.DecompIO(file, 31)
    return decoded.read()

@harbaum harbaum left a comment

to decompress:

import uzlib 
import uio
def decompress(buffer):
    file = uio.BytesIO(buffer)
    decoded = uzlib.DecompIO(file, 31)
    return decoded.read()

Why not:

def decompress(buffer):
  return uzlib.decompress(buffer[10:-8], -15)

Seems more lightweight for an embedded solution.
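
One trade-off worth weighing, sketched here with CPython's `zlib` as a stand-in for uzlib: slicing off the header and trailer skips the CRC32 and length check, and it assumes the header is exactly 10 bytes (no optional fields such as a file name). The helper names are hypothetical.

```python
import zlib

def gzip_compress(data):
    """gzip-framed output via zlib (wbits=31 selects the gzip wrapper)."""
    comp = zlib.compressobj(wbits=31)
    return comp.compress(data) + comp.flush()

def decompress_sliced(buf):
    """Lightweight: ignore the framing and decode the raw DEFLATE stream."""
    return zlib.decompress(buf[10:-8], -15)

def decompress_checked(buf):
    """Parse the gzip framing, including CRC32 and ISIZE verification."""
    return zlib.decompress(buf, 31)

corrupted = bytearray(gzip_compress(b"Hello World!"))
corrupted[-8] ^= 0xFF  # flip bits in the first byte of the stored CRC32

# The sliced variant never looks at the trailer, so it succeeds silently.
sliced_result = decompress_sliced(bytes(corrupted))

# The framing-aware variant notices the CRC mismatch.
try:
    decompress_checked(bytes(corrupted))
    crc_error_detected = False
except zlib.error:
    crc_error_detected = True

print(sliced_result, crc_error_detected)
```

For data that already travels over a checksummed channel the sliced form may well be enough; otherwise the integrity check is the main thing it gives up.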

Comment thread extmod/moduzlib.c
memset(comp->hash_table, 0, hash_size);

zlib_start_block(&comp->out);
uzlib_compress(comp, bufinfo.buf, len);

uzlib_compress as well as zlib_start_block internally (re-)allocate a buffer at comp->out which is never free'd. As a result the system runs out of memory after a few compression runs.

A free(comp->out.outbuf); after the subsequent memcpy solved this.


uzlib_compress as well as zlib_start_block internally (re-)allocate a buffer at comp->out which is never free'd. As a result the system runs out of memory after a few compression runs.

A free(comp->out.outbuf); after the subsequent memcpy solved this.

I'm trying to encode a string so that it fits into the shortest possible SMS payload.
Question: how can I add this gzip uzlib extension to my current 1.14 toolchain for ESP32?

@br0kenpixel br0kenpixel mentioned this pull request May 6, 2021
@andrewleech andrewleech force-pushed the gzip branch 3 times, most recently from f69e550 to 106f843 Compare July 13, 2021 08:06
@andrewleech
Contributor Author

@harbaum Thanks for your additions in #6972, there is some great work there. I've rebased your changes on my original branch as a separate commit to keep your attribution.

In addition, I've added support for gzip header in the decompress(bytes, 31) function to allow it to work directly on the gzip compressed data.

The docs have been updated and I've added a basic compress unit test.

@andrewleech
Contributor Author

Ok I've broken the unit tests with the last additions of gzip decompress and rebase onto current master... and it's too big for tiny ports - so probably does need a new feature flag. Either that, or just build & distribute in the dynamic C module version of the uzlib.

tannewt added a commit to tannewt/circuitpython that referenced this pull request Dec 29, 2021
clear out interrupt when freeing the timer
@dpgeorge
Member

Superseded by #11905.

@dpgeorge dpgeorge closed this Jul 21, 2023