commit.go: support multi-line header continuations #10
Merged
Conversation
ttaylorr
commented
Mar 8, 2019
  	for _, hdr := range c.ExtraHeaders {
- 		n3, err := fmt.Fprintf(to, "%s %s\n", hdr.K, hdr.V)
+ 		n3, err := fmt.Fprintf(to, "%s %s\n",
+ 			hdr.K, strings.Replace(hdr.V, "\n", "\n ", -1))
Contributor
Author
Once we build against Go 1.12 or later, this line can be replaced with strings.ReplaceAll(hdr.V, "\n", "\n "). ReplaceAll was introduced in Go 1.12, which we don't build against yet.
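As a minimal sketch of the equivalence described in this comment: the helper names `foldValue` and `foldValueGo112` below are hypothetical and do not appear in gitobj. Passing `-1` as the last argument to `strings.Replace` means "no limit", which is the behavior `strings.ReplaceAll` provides on Go 1.12 and later.

```go
package main

import (
	"fmt"
	"strings"
)

// foldValue (hypothetical name) rewrites every LF in a header value as
// LF followed by a single space, using the pre-Go 1.12 spelling.
func foldValue(v string) string {
	// n = -1 places no limit on the number of replacements.
	return strings.Replace(v, "\n", "\n ", -1)
}

// foldValueGo112 (hypothetical name) performs the same rewrite with
// strings.ReplaceAll, which requires Go 1.12 or later.
func foldValueGo112(v string) string {
	return strings.ReplaceAll(v, "\n", "\n ")
}

func main() {
	v := "-----BEGIN PGP SIGNATURE-----\n<signature>\n-----END PGP SIGNATURE-----"
	// Print the folded value and whether both spellings agree.
	fmt.Printf("%q\n", foldValue(v))
	fmt.Println(foldValue(v) == foldValueGo112(v))
}
```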
bk2204
requested changes
Mar 8, 2019
bk2204
left a comment
Member
Overall, I think this is a great improvement and I'm excited to see it. I'd like to see an additional test, though, to make sure we round-trip things properly and to help us avoid breaking things in the future.
When Git wishes to continue one or more of a commit's extra headers on
more than a single line, it writes out the following:
parent: <SHA-1>
tree: <SHA-1>
gpgsig: -----BEGIN PGP SIGNATURE-----
 <signature>
 -----END PGP SIGNATURE-----
Our current parsing implementation does not handle this correctly, based
on a misunderstanding that one line is equivalent to one extra header,
and vice versa.
In fact, the situation presently is even more dire than not parsing the
'gpgsig' header correctly: we'll split the signature and ending line
into their own "headers" and in doing so trim off the leading
whitespace. In practice, this means that we can corrupt commits when
round-tripping them in many interesting ways [1].
To address the situation, we do two things:
1. Teach gitobj that when we are parsing extra headers for a commit,
_and_ a header line begins with a single whitespace character, we
are in fact continuing the last known header.
2. Likewise, teach gitobj that when encoding a commit which has an
extra header whose value contains a LF character, replace each LF
with a LF followed by a single space, to round trip commits of this
form successfully.
Together, (1) and (2) mean that we parse the 'gpgsig' header in the
above example as a _single_ entry in the commit's 'ExtraHeaders' field,
as expected.
[1]: git-lfs/git-lfs#3530
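The two steps in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the gitobj implementation: the `ExtraHeader` type and the `parseExtraHeaders` and `encodeExtraHeaders` functions are hypothetical stand-ins, and the input is assumed to contain only extra-header lines followed by the blank line that precedes the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// ExtraHeader is a hypothetical stand-in for one entry in a commit's
// ExtraHeaders field: a key and a (possibly multi-line) value.
type ExtraHeader struct {
	K, V string
}

// parseExtraHeaders reads header lines up to the first empty line.
// A line that begins with a single space continues the last known header.
func parseExtraHeaders(raw string) []ExtraHeader {
	var hdrs []ExtraHeader
	for _, line := range strings.Split(raw, "\n") {
		if line == "" {
			break // an empty line ends the header section
		}
		if strings.HasPrefix(line, " ") && len(hdrs) > 0 {
			// Continuation: drop the one leading space and append the
			// rest to the previous header's value on a new line.
			hdrs[len(hdrs)-1].V += "\n" + line[1:]
			continue
		}
		kv := strings.SplitN(line, " ", 2)
		h := ExtraHeader{K: kv[0]}
		if len(kv) == 2 {
			h.V = kv[1]
		}
		hdrs = append(hdrs, h)
	}
	return hdrs
}

// encodeExtraHeaders writes each header on its own line, replacing every
// LF inside a value with LF followed by a single space.
func encodeExtraHeaders(hdrs []ExtraHeader) string {
	var b strings.Builder
	for _, h := range hdrs {
		fmt.Fprintf(&b, "%s %s\n", h.K, strings.Replace(h.V, "\n", "\n ", -1))
	}
	return b.String()
}

func main() {
	raw := "gpgsig -----BEGIN PGP SIGNATURE-----\n" +
		" <signature>\n" +
		" -----END PGP SIGNATURE-----\n"
	hdrs := parseExtraHeaders(raw)
	// Report how many headers were parsed and whether encoding
	// reproduces the original bytes.
	fmt.Println(len(hdrs), encodeExtraHeaders(hdrs) == raw)
}
```

With this shape, the whole signature is held as one header value, and encoding the parsed headers is expected to reproduce the input byte for byte.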
ab42a1c to
930b3ff
Compare
When parsing an extra header that is continued over multiple lines, an earlier check on the length of whitespace-separated fields caused the loop to terminate early, dropping continuation lines that consist only of whitespace. Tweak the logic slightly in order to capture these, and allow us to successfully round-trip commit parsing.
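To illustrate why a check on whitespace-separated fields can drop such lines, here is a small sketch. It does not reproduce the original gitobj loop; `looksBlankByFields` and `isContinuation` are hypothetical helpers. The point is that `strings.Fields` returns no fields for a line holding only a space, so a field-count test cannot tell that line apart from the truly empty line that ends the headers, while a prefix test can.

```go
package main

import (
	"fmt"
	"strings"
)

// looksBlankByFields (hypothetical) reports whether a line has no
// whitespace-separated fields. It is true for "" and also for " ".
func looksBlankByFields(line string) bool {
	return len(strings.Fields(line)) == 0
}

// isContinuation (hypothetical) reports whether a line continues the
// previous header, i.e. whether it begins with a single space.
func isContinuation(line string) bool {
	return strings.HasPrefix(line, " ")
}

func main() {
	// Compare both checks on an empty line, a whitespace-only
	// continuation line, and an ordinary continuation line.
	for _, line := range []string{"", " ", " <signature>"} {
		fmt.Printf("%q blankByFields=%t continuation=%t\n",
			line, looksBlankByFields(line), isContinuation(line))
	}
}
```

Testing for the continuation prefix before any field-count test is one way to keep whitespace-only continuation lines as part of the header value.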
bk2204
approved these changes
Mar 11, 2019
bk2204
left a comment
Member
This looks great! Thanks for adding the new test.
/cc @git-lfs/core, especially @bk2204
/cc git-lfs/git-lfs#3530