py/modmicropython: Add micropython.memmove() and micropython.memset().#12487
projectgus wants to merge 1 commit into
Codecov Report
Attention: Patch coverage is
Coverage diff, master vs. #12487:
Coverage: 98.38% -> 98.18% (-0.20%)
Files: 158 -> 158
Lines: 20940 -> 20981 (+41)
Hits: 20602 -> 20601 (-1)
Misses: 338 -> 380 (+42)
This was based on a discussion about providing a more optimal way to copy data between buffers. However, based on the benchmarking so far, it seems like it might not be worth the overhead.
Signed-off-by: Angus Gratton <[email protected]>
bbc7eaf to b55ac53
After applying the thread-local slice optimisation and poking around with This is another very short-lived heap allocation, but it looks like it would be much harder to optimise than the thread-local slice case.
This is an automated heads-up that we've just merged a Pull Request See #13763 A search suggests this PR might apply the STATIC macro to some C code. If it Although this is an automated message, feel free to @-reply to me directly if
This was based on a discussion about providing a more optimal way to copy data between buffers. However, based on benchmarks so far, it seems like it might not be worth it compared to optimising "copy to/from slice" code paths written in idiomatic Python.
Summary
Adds two functions to the micropython module, gated behind a new config option:
- micropython.memmove(dest, dest_idx, src, src_idx, [len]) - an optimised equivalent of dest[dest_idx:dest_idx+len] = src[src_idx:src_idx+len]. Copies memory contents with the semantics of C memmove, hence the name. The len argument is optional; the length defaults to the minimum of the lengths of the source and destination regions.
- micropython.memset(dest, dest_idx=0, c=0, len=len(dest)-dest_idx) - an optimised equivalent of dest[dest_idx:] = bytes([c]*len). Modelled on C's memset.
Unlike assigning to a slice, the destination buffer size never changes as a result of calling either of these functions. Out-of-bounds assignment raises an exception.
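To make the described semantics concrete, here is a minimal pure-Python model of the two functions. It is only a sketch of the behaviour stated in the summary, not the C implementation in this PR. The names memmove_ref and memset_ref are hypothetical, and raising IndexError is an assumption, since the description only says that out-of-bounds assignment raises an exception.

```python
def memmove_ref(dest, dest_idx, src, src_idx, length=None):
    """Illustrative model of the described memmove() semantics."""
    if not (0 <= dest_idx <= len(dest)) or not (0 <= src_idx <= len(src)):
        raise IndexError("index out of range")  # exception type is an assumption
    dest_room = len(dest) - dest_idx
    src_room = len(src) - src_idx
    if length is None:
        # Default: the smaller of the source and destination regions.
        length = min(dest_room, src_room)
    elif length < 0 or length > dest_room or length > src_room:
        raise IndexError("length out of range")  # exception type is an assumption
    # Snapshot the source first so overlapping regions behave like C memmove.
    chunk = bytes(memoryview(src)[src_idx:src_idx + length])
    # Same-length assignment through a memoryview cannot resize dest.
    memoryview(dest)[dest_idx:dest_idx + length] = chunk


def memset_ref(dest, dest_idx=0, c=0, length=None):
    """Illustrative model of the described memset() semantics."""
    if not (0 <= dest_idx <= len(dest)):
        raise IndexError("index out of range")  # exception type is an assumption
    room = len(dest) - dest_idx
    if length is None:
        length = room
    elif length < 0 or length > room:
        raise IndexError("length out of range")  # exception type is an assumption
    memoryview(dest)[dest_idx:dest_idx + length] = bytes([c]) * length


buf = bytearray(b"abcdefgh")
memmove_ref(buf, 2, buf, 0, 4)  # overlapping copy within one buffer
memset_ref(buf, 6, ord("."))    # fill from index 6 to the end
print(buf)  # bytearray(b'ababcd..')
```

The model allocates a temporary bytes object for every call, which is exactly the kind of overhead a native implementation would avoid.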
Benchmarks - memmove
Comparing memmove to current MicroPython "best practices" (unix port, i5-1248P CPU):
Honestly I found this a little underwhelming! Admittedly, slice_copy-6-memmove.py can do the equivalent of slice_copy-5-lvalue_rvalue_memoryview.py (slices on both sides of the assignment) and it's almost twice as fast, but it's only twice as fast (in a tight loop that does nothing else, working with pretty short buffers).
Maybe the C implementation of memmove() needs some tweaks to streamline the error checking 🤷.
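For readers without the benchmark files to hand, the kind of idiomatic Python copy being compared can be sketched roughly as follows. This is an assumption about the general shape of the idioms (plain slices versus memoryview slices on both sides of the assignment), not the contents of the actual slice_copy-*.py files, and the function names are hypothetical.

```python
def copy_plain_slices(dest, dest_idx, src, src_idx, n):
    # Slicing bytes/bytearray on the right-hand side allocates a temporary copy.
    dest[dest_idx:dest_idx + n] = src[src_idx:src_idx + n]


def copy_memoryview_slices(dest, dest_idx, src, src_idx, n):
    # memoryview slices reference the underlying buffers instead of copying the
    # data, although each slice is still a small short-lived object.
    memoryview(dest)[dest_idx:dest_idx + n] = memoryview(src)[src_idx:src_idx + n]


src = b"0123456789"
a = bytearray(8)
b = bytearray(8)
copy_plain_slices(a, 0, src, 2, 4)
copy_memoryview_slices(b, 0, src, 2, 4)
print(a == b)  # True
```

Both variants produce the same result; they differ only in how much temporary allocation happens per copy.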
When rebased against PR #10160 things get even closer:
Now slice_copy-6-memmove.py is only 1.6x faster than slice_copy-5-lvalue_rvalue_memoryview.py, and no faster than assigning a buffer to an lvalue slice...
Benchmarks - memset
Kind of the same story with memset: writing out a bytes array (which can be frozen to flash) is basically as fast as using the memset() function. The naive versions of this are a lot slower, though!
Disclaimer: The new test file names take some liberties with the meaning of lvalue and rvalue; happy to take suggestions for a more accurate term to use.
This work was funded through GitHub Sponsors.
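As a rough illustration of the memset comparison, the two pure-Python approaches mentioned can be sketched like this. It assumes a fixed 16-byte buffer and uses hypothetical names (FILL, fill_naive, fill_from_constant); the actual benchmark files may differ.

```python
FILL = bytes(16)  # constant fill pattern; a constant like this could be frozen


def fill_naive(dest, c):
    # Naive version: one interpreted store per byte.
    for i in range(len(dest)):
        dest[i] = c


def fill_from_constant(dest):
    # Assumes len(dest) == len(FILL), so the assignment never resizes dest.
    dest[:] = FILL


buf1 = bytearray(b"\xff" * 16)
buf2 = bytearray(b"\xff" * 16)
fill_naive(buf1, 0)
fill_from_constant(buf2)
print(buf1 == buf2)  # True
```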