index: Fix alignment issues in write_disk_entry()#4655
Conversation
pks-t
left a comment
There was a problem hiding this comment.
Thanks for your pull request! I guess the intention is clear and reasonable, so I'm happy to merge this.
Right now, the code is a bit inefficient, though, as we're now writing the same content into the mmapped buffer twice, which doesn't make a lot of sense to me. See in-line comments for some suggestions.
|
|
||
| memset(ondisk, 0x0, disk_size); | ||
| memset(mem, 0x0, disk_size); | ||
| memcpy(&ondisk, mem, sizeof(struct entry_short)); |
There was a problem hiding this comment.
So you first zero out mem and then afterwards copy the zeroed out memory into ondisk. So simply using memset(&ondisk, 0, sizeof(ondisk)) would result in more efficient code here, wouldn't it?
There was a problem hiding this comment.
Yeah, you're right. I was more focusing on the memory-mapped region pointed to by mem and wanted to make sure it's all cleared out. It was a bit late yesterday when I wrote this ;).
There was a problem hiding this comment.
On second thought: ondisk and mem can have different sizes here: sizeof(struct entry_short) vs disk_size. So, if I made the change as suggested, the memory region that would get zeroed would be different.
| ondisk_ext = (struct entry_long *)ondisk; | ||
| ondisk_ext->flags_extended = htons(entry->flags_extended & | ||
| struct entry_long ondisk_ext; | ||
| memcpy(&ondisk_ext, mem, sizeof(struct entry_long)); |
There was a problem hiding this comment.
Huh. So this doesn't make a lot of sense to me. mem was mmapped previously, which is why we assigned that and then just continued writing via the ondisk_ext pointer. But what you're doing now is first write ondisk to disk via the mmapped file buffer, then copy ondisk into ondisk_ext and overwrite what you've previously written to the mmapped buffer with ondisk_ext. It would make sense to avoid that overhead and only write either ondisk or ondisk_ext to disk, but not both.
So essentially, I guess what I'm asking for is to move the above memcpy into these conditionals here:
- prepare
struct entry_short - is it extended? copy
entry_shortintoentry_longandmemcpythat tomem. Otherwise, justmemcpyentry_shortintomem
Does that make sense to you?
There was a problem hiding this comment.
Does that make sense to you?
Sounds reasonable. I'll have another look.
|
Thanks for the quick review. I'll push an updated version later today. |
|
Just pushed an updated version which should address your issues. |
|
@pks-t Have you had a chance yet to review my updated pull request? |
| ondisk_ext.flags_extended = htons(entry->flags_extended & | ||
| GIT_IDXENTRY_EXTENDED_FLAGS); | ||
| path = ondisk_ext->path; | ||
| memcpy(mem, &ondisk_ext, sizeof(struct entry_long)); |
There was a problem hiding this comment.
This is in fact writing ondisk_ext.path into mem, which is not initialized yet. While I think we're fine here as path gets memcpyd to anyway, you could also use offsetof(struct entry_long, path) instead.
There was a problem hiding this comment.
I'm not 100% sure what you mean. Use the offset of path for what? Why shouldn't mem not be initialized? It's zero'd in line 2636, isn't it?
There was a problem hiding this comment.
mem is initialized, yes. But ondisk_ext.path is not initialized, as you only copy sizeof(struct entry_short) bytes into ondisk_ext, leaving additional struct entries (which happen to be only path and the extended flags) uninitialized.
There was a problem hiding this comment.
And sorry for being unclear: use offsetof(struct entry_long, path) as the third argument of memcpy
There was a problem hiding this comment.
Ah, now I understand what you mean. Thanks for the additional details. Yes, you're right, path not initialized yet. I will use your suggested change.
There was a problem hiding this comment.
Ah, and shouldn't it be offsetof(struct entry_short, path) for both cases? We're only filling up the extended flags after the memcpy() aren't we?
There was a problem hiding this comment.
Wait, no. I looked at the wrong line. Never mind ;).
| disk_size -= offsetof(struct entry_long, path); | ||
| } else { | ||
| path = ondisk->path; | ||
| memcpy(mem, &ondisk, sizeof(struct entry_short)); |
There was a problem hiding this comment.
Same here, but obviously with offsetof(struct entry_short, path)
|
Sorry, I didn't find the time yet. There's one small nit with regards to uninitialized |
In order to avoid alignment issues on certain target architectures, it is necessary to use memcpy() when modifying elements of a struct inside a buffer returned by git_filebuf_reserve().
|
Pushed updated and rebased version. |
|
Looks good to me, thanks! I'll wait for a couple of days whether anybody else wants to review |
|
Ok, let me know in case I need to rebase. |
|
Thanks again for your fix! |
In order to avoid alignment issues on certain target architectures, it is necessary to use memcpy() when modifying elements of a struct inside a buffer returned by git_filebuf_reserve().
This fixes the testsuite on Debian mipsel and sparc64 where it crashes with "Bus error".
See: https://buildd.debian.org/status/package.php?p=libgit2&suite=sid