Skip to content

index: Fix alignment issues in write_disk_entry()#4655

Merged
pks-t merged 1 commit into
libgit2:masterfrom
glaubitz:alignment
Jun 7, 2018
Merged

index: Fix alignment issues in write_disk_entry()#4655
pks-t merged 1 commit into
libgit2:masterfrom
glaubitz:alignment

Conversation

@glaubitz
Copy link
Copy Markdown
Contributor

In order to avoid alignment issues on certain target architectures, it is necessary to use memcpy() when modifying elements of a struct inside a buffer returned by git_filebuf_reserve().

This fixes the testsuite on Debian mipsel and sparc64 where it crashes with "Bus error".
See: https://buildd.debian.org/status/package.php?p=libgit2&suite=sid

Copy link
Copy Markdown
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your pull request! I guess the intention is clear and reasonable, so I'm happy to merge this.

Right now, the code is a bit inefficient, though, as we're now writing the same content into the mmapped buffer twice, which doesn't make a lot of sense to me. See in-line comments for some suggestions.

Comment thread src/index.c Outdated

memset(ondisk, 0x0, disk_size);
memset(mem, 0x0, disk_size);
memcpy(&ondisk, mem, sizeof(struct entry_short));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you first zero out mem and then afterwards copy the zeroed out memory into ondisk. So simply using memset(&ondisk, 0, sizeof(ondisk)) would result in more efficient code here, wouldn't it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. I was more focusing on the memory-mapped region pointed to by mem and wanted to make sure it's all cleared out. It was a bit late yesterday when I wrote this ;).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought: ondisk and mem can have different sizes here: sizeof(struct entry_short) vs disk_size. So, if I made the change as suggested, the memory region that would get zeroed would be different.

Comment thread src/index.c Outdated
ondisk_ext = (struct entry_long *)ondisk;
ondisk_ext->flags_extended = htons(entry->flags_extended &
struct entry_long ondisk_ext;
memcpy(&ondisk_ext, mem, sizeof(struct entry_long));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. So this doesn't make a lot of sense to me. mem was mmapped previously, which is why we assigned that and then just continued writing via the ondisk_ext pointer. But what you're doing now is first write ondisk to disk via the mmapped file buffer, then copy ondisk into ondisk_ext and overwrite what you've previously written to the mmapped buffer with ondisk_ext. It would make sense to avoid that overhead and only write either ondisk or ondisk_ext to disk, but not both.

So essentially, I guess what I'm asking for is to move the above memcpy into these conditionals here:

  1. prepare struct entry_short
  2. is it extended? copy entry_short into entry_long and memcpy that to mem. Otherwise, just memcpy entry_short into mem

Does that make sense to you?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that make sense to you?

Sounds reasonable. I'll have another look.

@glaubitz
Copy link
Copy Markdown
Contributor Author

Thanks for the quick review. I'll push an updated version later today.

@glaubitz
Copy link
Copy Markdown
Contributor Author

Just pushed an updated version which should address your issues.

@glaubitz
Copy link
Copy Markdown
Contributor Author

@pks-t Have you had a chance yet to review my updated pull request?

Comment thread src/index.c Outdated
ondisk_ext.flags_extended = htons(entry->flags_extended &
GIT_IDXENTRY_EXTENDED_FLAGS);
path = ondisk_ext->path;
memcpy(mem, &ondisk_ext, sizeof(struct entry_long));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in fact writing ondisk_ext.path into mem, which is not initialized yet. While I think we're fine here as path gets memcpyd to anyway, you could also use offsetof(struct entry_long, path) instead.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure what you mean. Use the offset of path for what? Why shouldn't mem not be initialized? It's zero'd in line 2636, isn't it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mem is initialized, yes. But ondisk_ext.path is not initialized, as you only copy sizeof(struct entry_short) bytes into ondisk_ext, leaving additional struct entries (which happen to be only path and the extended flags) uninitialized.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And sorry for being unclear: use offsetof(struct entry_long, path) as the third argument of memcpy

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I understand what you mean. Thanks for the additional details. Yes, you're right, path not initialized yet. I will use your suggested change.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, and shouldn't it be offsetof(struct entry_short, path) for both cases? We're only filling up the extended flags after the memcpy() aren't we?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, no. I looked at the wrong line. Never mind ;).

Comment thread src/index.c Outdated
disk_size -= offsetof(struct entry_long, path);
} else {
path = ondisk->path;
memcpy(mem, &ondisk, sizeof(struct entry_short));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, but obviously with offsetof(struct entry_short, path)

@pks-t
Copy link
Copy Markdown
Member

pks-t commented Jun 1, 2018

Sorry, I didn't find the time yet. There's one small nit with regards to uninitialized path being copied to the mmap, but otherwise I'm happy

In order to avoid alignment issues on certain target architectures,
it is necessary to use memcpy() when modifying elements of a struct
inside a buffer returned by git_filebuf_reserve().
@glaubitz
Copy link
Copy Markdown
Contributor Author

glaubitz commented Jun 1, 2018

Pushed updated and rebased version.

@pks-t
Copy link
Copy Markdown
Member

pks-t commented Jun 1, 2018

Looks good to me, thanks! I'll wait for a couple of days whether anybody else wants to review

@glaubitz
Copy link
Copy Markdown
Contributor Author

glaubitz commented Jun 4, 2018

Ok, let me know in case I need to rebase.

@pks-t pks-t merged commit 422cd59 into libgit2:master Jun 7, 2018
@pks-t
Copy link
Copy Markdown
Member

pks-t commented Jun 7, 2018

Thanks again for your fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants