gitignore with escapes#5097
Conversation
The "ignore.c.bak" file has accidentally been checked in via commit 1916490 (ignore: test that leading whitespace is significant, 2019-05-19) and should obviously not be part of our test suites. Delete it.
We had several occasions where tests for the gitignore had been added to status::ignore instead of the easier-to-handle attr::ignore test suite. This most likely resulted from the fact that the attr::ignore test suite is not easy to discover inside of the attr folder. Furthermore, ignore being part of the attributes code is an implementation detail, only, and thus shouldn't be stressed as much. Improve this by moving both attr::ignore and status::ignore tests into a new ignore test suite.
|
Thanks for 65ad6b5. 😢 |
So this is an interesting question. Is a path prefixed by a You can see this on the shell. If you're in cmd, you should be able to run What I'm not sure is what that means exactly for us. I think that I think that this is what the existing code was trying to communicate - and that it is correct, but only on Windows. |
|
I'm +1 on this otherwise. |
Okay, thanks for clarifying. My first approach was to simply move that code into the Windows-specific |
Ugh. I think so? Importantly, we (try to) do this right in the network code, where But that's a legitimate relative path on (non-macOS) Unix systems, so... yeah, we should probably be treating it as absolute only on Windows systems. |
Also, this is a lie. It's a legitimate path on macOS as well. I thought this was actually a reserved character in the Mac filesystem. (For pre-OS X compatibility, which used But I'm mistaken. Some apps still don't allow you to create files with |
Windows-based systems treat paths starting with '\' as absolute, either referring to the current drive's root (e.g. "\foo" might refer to "C:\foo") or to a network path (e.g. "\\host\foo"). On the other hand, (most?) systems that are not based on Win32 accept backslashes as valid characters that may be part of the filename, and thus we cannot treat them to identify absolute paths. Change the logic to only paths starting with '\' as absolute on the Win32 platform. Add tests to avoid regressions and document behaviour.
953310e to
880816f
Compare
When parsing attributes, we need to search for the first unescaped whitespace character to determine where the pattern is to be cut off. The scan fails to account for the case where the escaping '\' character is itself escaped, though, and thus we would not recognize the cut-off point in patterns like "\\ ". Refactor the scanning loop to remember whether the last character was an escape character. If it was and the next character is a '\', too, then we will reset to non-escaped mode again. Thus, we now handle escaped whitespaces as well as escaped wildcards correctly.
880816f to
3b51735
Compare
When parsing attribute patterns, we will eventually unescape the parsed pattern. This is required because we require custom escapes for whitespace characters, as normally they are used to terminate the current pattern. Thing is, we don't only unescape those whitespace characters, but in fact all escaped sequences. So for example if the pattern was "\*", we unescape that to "*". As this is directly passed to fnmatch(3) later, fnmatch would treat it as a simple glob matching all files where it should instead only match a file with name "*". Fix the issue by unescaping spaces, only. Add a bunch of tests to exercise escape parsing.
When determining the trailing space length, we need to honor whether spaces are escaped or not. Currently, we do not check whether the escape itself is escaped, though, which might generate an off-by-one in that case as we will simply treat the space as escaped. Fix this by checking whether the backslashes preceding the space are themselves escaped.
In our attributes pattern parsing code, we have a comment that states we might have to convert '\' characters to '/' to have proper POSIX paths. But in fact, '\' characters are valid inside the string and act as escape mechanism for various characters, which is why we never want to convert those to POSIX directory separators. Furthermore, gitignore patterns are specified to only treat '/' as directory separators. Remove the comment to avoid future confusion.
|
So I've changed |
|
Sweet, thanks! |
Various fixes to how we handle escapes when parsing gitignore patterns.