parse: Do not initialize the content in context to NULL#4749
Conversation
String operations in libgit2 are supposed to never receive `NULL`, e.g. they are not `NULL`-save. In the case of `git__linenlen()`, invocation with `NULL` leads to undefined behavior. In a `git_parse_ctx` however, the `content` field used in these operations was initialized to `NULL` if the `git_parse_ctx_init()` was called with `NULL` for `content` or `0` for `content_len`. For the latter case, the initialization function even contained some logic for initializing `content` with `NULL`. This commit mitigates triggering undefined behavior by rewriting the logic. Now `content` is always initialized to a non-null buffer. Instead of a null buffer, an empty string is used for denoting an empty buffer.
| } else { | ||
| ctx->content = ""; | ||
| ctx->content_len = 0; | ||
| } |
There was a problem hiding this comment.
Yeah, this fixes the issue. I'm wondering though if we should even make it a bug to pass in NULL to git_parse_ctx_init by adding an assert. That'd require us to review all callers, though
There was a problem hiding this comment.
That was my initial idea. In a (non-published) earlier version, I added an assertion for catching calls to git_parse_ctx_init with NULL for context. However, I quickly ran into places where this fails.
For example, functions for parsing stuff built upon git_parse_ctx would also happily pass NULL for context. I even run into a test case explicitly checking for the ability to parse a patch from an empty and a NULL buffer:
libgit2/tests/apply/fromfile.c
Line 448 in 3be7301
There was a problem hiding this comment.
Huh, so be it then. Thanks for clarifying!
|
Note (to self): from the approaches described in the original PR (#4733), I ended up implementing approach 1 rather than approach 2. |
String operations in libgit2 are supposed to never receive
NULL, e.g. they are notNULL-save. In the case ofgit__linenlen(), invocation withNULLleads to undefined behavior.In a
git_parse_ctxhowever, thecontentfield used in these operations was initialized toNULLif thegit_parse_ctx_init()was called withNULLforcontentor0forcontent_len. For the latter case, the initialization function even contained some logic for initializingcontentwithNULL.This commit mitigates triggering undefined behavior by rewriting the logic. Now
contentis always initialized to a non-null buffer. Instead of a null buffer, an empty string is used for denoting an empty buffer.Supersedes #4733. Resolves #4726.