-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
object-name: fix resolution of object names containing curly braces #1844
base: master
Are you sure you want to change the base?
Conversation
1308dfe
to
1671b77
Compare
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Elijah Newren via GitGitGadget" <[email protected]> writes:
> From: Elijah Newren <[email protected]>
>
> Given a branch name of 'foo{bar', commands like
>
> git cat-file -p foo{bar:README.md
>
> should succeed (assuming that branch had a README.md file, of course).
> However, the change in cce91a2caef9 (Change 'master@noon' syntax to
> 'master@{noon}'., 2006-05-19) presumed that curly braces would always
> come after an '@' and be paired, causing 'foo{bar:README.md' to
> entirely miss the ':' and assume there's no object being referenced.
> In short, git would report:
>
> fatal: Not a valid object name foo{bar:README.md
>
> Change the parsing to only make the assumption of paired curly braces
> immediately after a '@' character appears.
Interesting. I wonder if this looseness was to ensure that we won't
mistake a colon inside "master^{/title with : a colon}" as a start
of a subpath, instead of asking for a commit with a title that
happens to have a colon in it?
> Add tests for both this and 'foo@@{...}' cases, which an initial version
> of this patch broke.
Thanks for being extra careful here.
> Reported-by: Gabriel Amaral <[email protected]>
> Helped-by: Michael Haggerty <[email protected]>
> Signed-off-by: Elijah Newren <[email protected]>
> ---
> object-name: fix resolution of object names containing curly braces
>
> Maintainer note: this bug dates back to 2006; it is not a regression in
> this cycle.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1844%2Fnewren%2Fobject-name-fix-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1844/newren/object-name-fix-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1844
>
> object-name.c | 8 +++++---
> t/t1006-cat-file.sh | 17 +++++++++++++++++
> 2 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/object-name.c b/object-name.c
> index c892fbe80aa..e92f26b3256 100644
> --- a/object-name.c
> +++ b/object-name.c
> @@ -2087,12 +2087,14 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
> return -1;
> }
> for (cp = name, bracket_depth = 0; *cp; cp++) {
> - if (*cp == '{')
> + if (*cp == '@' && *(cp+1) == '{') {
> + cp++;
> bracket_depth++;
> - else if (bracket_depth && *cp == '}')
> + } else if (bracket_depth && *cp == '}') {
> bracket_depth--;
> - else if (!bracket_depth && *cp == ':')
> + } else if (!bracket_depth && *cp == ':') {
> break;
> + }
> }
> if (*cp == ':') {
> struct object_id tree_oid;
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index d36cd7c0863..252485dac78 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -603,6 +603,23 @@ test_expect_success FUNNYNAMES '--batch-check, -Z with newline in input' '
> test_cmp expect actual
> '
>
> +test_expect_success FUNNYNAMES 'setup with curly braches in input' '
> + git branch "foo{bar" &&
> + git branch "foo@"
> +'
> +
> +test_expect_success FUNNYNAMES 'object reference with curly brace' '
> + git cat-file -p "foo{bar:hello" >actual &&
> + git cat-file -p HEAD:hello >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success FUNNYNAMES 'object reference with at-sign' '
> + git cat-file -p "foo@@{0}:hello" >actual &&
> + git cat-file -p HEAD:hello >expect &&
> + test_cmp expect actual
> +'
> +
> test_expect_success 'setup blobs which are likely to delta' '
> test-tool genrandom foo 10240 >foo &&
> { cat foo && echo plus; } >foo-plus &&
>
> base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f |
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Wed, Jan 01, 2025 at 02:53:09AM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <[email protected]>
>
> Given a branch name of 'foo{bar', commands like
>
> git cat-file -p foo{bar:README.md
>
> should succeed (assuming that branch had a README.md file, of course).
> However, the change in cce91a2caef9 (Change 'master@noon' syntax to
> 'master@{noon}'., 2006-05-19) presumed that curly braces would always
> come after an '@' and be paired, causing 'foo{bar:README.md' to
> entirely miss the ':' and assume there's no object being referenced.
> In short, git would report:
>
> fatal: Not a valid object name foo{bar:README.md
>
> Change the parsing to only make the assumption of paired curly braces
> immediately after a '@' character appears.
>
> Add tests for both this and 'foo@@{...}' cases, which an initial version
> of this patch broke.
Curious. I was kind of surprised to see that it's perfectly legal to
have branch names with curly braces in them in the first place. Even
something like `foo{bar}` is legal, even though it might be confusing
when one knows the above syntax. But sans your finding, this should be
fine given that curly braces are only interpreted specially when
preceded by '@', and the '@{' sequence is indeed disallowed by
`check_refname_compoment()`.
> diff --git a/object-name.c b/object-name.c
> index c892fbe80aa..e92f26b3256 100644
> --- a/object-name.c
> +++ b/object-name.c
> @@ -2087,12 +2087,14 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
> return -1;
> }
> for (cp = name, bracket_depth = 0; *cp; cp++) {
> - if (*cp == '{')
> + if (*cp == '@' && *(cp+1) == '{') {
> + cp++;
> bracket_depth++;
> - else if (bracket_depth && *cp == '}')
> + } else if (bracket_depth && *cp == '}') {
> bracket_depth--;
> - else if (!bracket_depth && *cp == ':')
> + } else if (!bracket_depth && *cp == ':') {
> break;
> + }
> }
> if (*cp == ':') {
> struct object_id tree_oid;
Makes sense. Only the first hunk actually changes anything, the
remaining changes are only required to make us stick to our coding
style.
I wonder though: does this have any impact on '<rev>^{<type>}' and other
syntaxes where we use '^' instead of '@'?
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index d36cd7c0863..252485dac78 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -603,6 +603,23 @@ test_expect_success FUNNYNAMES '--batch-check, -Z with newline in input' '
> test_cmp expect actual
> '
>
> +test_expect_success FUNNYNAMES 'setup with curly braches in input' '
> + git branch "foo{bar" &&
> + git branch "foo@"
> +'
> +
> +test_expect_success FUNNYNAMES 'object reference with curly brace' '
> + git cat-file -p "foo{bar:hello" >actual &&
> + git cat-file -p HEAD:hello >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success FUNNYNAMES 'object reference with at-sign' '
> + git cat-file -p "foo@@{0}:hello" >actual &&
> + git cat-file -p HEAD:hello >expect &&
> + test_cmp expect actual
> +'
Do these really need the FUNNYNAMES prereq? The prereq seems to only be
about embedded quotes, tabs and newlines and is disallowed on MinGW. But
I think both '{' and '@' should work alright there, shouldn't they?
Thanks!
Patrick |
User |
On the Git mailing list, Junio C Hamano wrote (reply to this): Patrick Steinhardt <[email protected]> writes:
> I wonder though: does this have any impact on '<rev>^{<type>}' and other
> syntaxes where we use '^' instead of '@'?
> ...
> Do these really need the FUNNYNAMES prereq? The prereq seems to only be
> about embedded quotes, tabs and newlines and is disallowed on MinGW. But
> I think both '{' and '@' should work alright there, shouldn't they?
Thanks for a review. I am too curious how this change interacts
with syntax with {braces} that do not use "@". |
1671b77
to
f6e6e7a
Compare
There are issues in commit f6e6e7a: |
f6e6e7a
to
4759f9c
Compare
There are issues in commit 4759f9c: |
030d909
to
68bc921
Compare
Given a branch name of 'foo{bar', commands like git cat-file -p foo{bar:README.md should succeed (assuming that branch had a README.md file, of course). However, the change in cce91a2 (Change 'master@noon' syntax to 'master@{noon}'., 2006-05-19) presumed that curly braces would always come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md' to entirely miss the ':' and assume there's no object being referenced. In short, git would report: fatal: Not a valid object name foo{bar:README.md Change the parsing to only make the assumption of paired curly braces immediately after either a '@' or '^' character appears. Add tests for this, as well as for a few other test cases that initial versions of this patch broke: * 'foo@@{...}' * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}' Reported-by: Gabriel Amaral <[email protected]> Helped-by: Michael Haggerty <[email protected]> Signed-off-by: Elijah Newren <[email protected]>
From Documentation/revisions.txt: '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb':: Output from `git describe`; i.e. a closest tag, optionally followed by a dash and a number of commits, followed by a dash, a 'g', and an abbreviated object name. which means that output of the format ${REFNAME}-${INTEGER}-g${HASH} should parse to fully expand ${HASH}. This is fine. However, we currently don't validate any of ${REFNAME}-${INTEGER}, we only parse -g${HASH} and assume the rest is valid. That is problematic, since it breaks things like git cat-file -p branchname:path/to/file/named/i-gaffed which, when commit affed exists, will not return us information about a file we are looking for but will instead tell us about commit affed. Similarly, we should probably not treat refs/tags/invalid/./../...../// ~^:/?*\\&[}/busted.lock-g049e0ef6 as a request for commit 050e0ef either. Tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are present and valid. Reported-by: Gabriel Amaral <[email protected]> Signed-off-by: Elijah Newren <[email protected]>
68bc921
to
31f1c37
Compare
On the Git mailing list, Elijah Newren wrote (reply to this): On Wed, Jan 1, 2025 at 9:01 AM Junio C Hamano <[email protected]> wrote:
>
> "Elijah Newren via GitGitGadget" <[email protected]> writes:
>
> > From: Elijah Newren <[email protected]>
> >
> > Given a branch name of 'foo{bar', commands like
> >
> > git cat-file -p foo{bar:README.md
> >
> > should succeed (assuming that branch had a README.md file, of course).
> > However, the change in cce91a2caef9 (Change 'master@noon' syntax to
> > 'master@{noon}'., 2006-05-19) presumed that curly braces would always
> > come after an '@' and be paired, causing 'foo{bar:README.md' to
> > entirely miss the ':' and assume there's no object being referenced.
> > In short, git would report:
> >
> > fatal: Not a valid object name foo{bar:README.md
> >
> > Change the parsing to only make the assumption of paired curly braces
> > immediately after a '@' character appears.
>
> Interesting. I wonder if this looseness was to ensure that we won't
> mistake a colon inside "master^{/title with : a colon}" as a start
> of a subpath, instead of asking for a commit with a title that
> happens to have a colon in it?
Yeah, good catch, my changes would for example break parsing
master^{/object-name:}:t/t1006-cat-file.sh
I'll fix that and add a testcase. |
User |
On the Git mailing list, Elijah Newren wrote (reply to this): On Fri, Jan 3, 2025 at 12:16 AM Patrick Steinhardt <[email protected]> wrote:
>
> On Wed, Jan 01, 2025 at 02:53:09AM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <[email protected]>
> >
> > Given a branch name of 'foo{bar', commands like
> >
> > git cat-file -p foo{bar:README.md
> >
> > should succeed (assuming that branch had a README.md file, of course).
> > However, the change in cce91a2caef9 (Change 'master@noon' syntax to
> > 'master@{noon}'., 2006-05-19) presumed that curly braces would always
> > come after an '@' and be paired, causing 'foo{bar:README.md' to
> > entirely miss the ':' and assume there's no object being referenced.
> > In short, git would report:
> >
> > fatal: Not a valid object name foo{bar:README.md
> >
> > Change the parsing to only make the assumption of paired curly braces
> > immediately after a '@' character appears.
> >
> > Add tests for both this and 'foo@@{...}' cases, which an initial version
> > of this patch broke.
>
> Curious. I was kind of surprised to see that it's perfectly legal to
> have branch names with curly braces in them in the first place.
I was surprised too, but apparently they are valid and we have real
world repositories where people have used such bad names.
> Even
> something like `foo{bar}` is legal, even though it might be confusing
> when one knows the above syntax. But sans your finding, this should be
> fine given that curly braces are only interpreted specially when
> preceded by '@', and the '@{' sequence is indeed disallowed by
> `check_refname_compoment()`.
>
> > diff --git a/object-name.c b/object-name.c
> > index c892fbe80aa..e92f26b3256 100644
> > --- a/object-name.c
> > +++ b/object-name.c
> > @@ -2087,12 +2087,14 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo,
> > return -1;
> > }
> > for (cp = name, bracket_depth = 0; *cp; cp++) {
> > - if (*cp == '{')
> > + if (*cp == '@' && *(cp+1) == '{') {
> > + cp++;
> > bracket_depth++;
> > - else if (bracket_depth && *cp == '}')
> > + } else if (bracket_depth && *cp == '}') {
> > bracket_depth--;
> > - else if (!bracket_depth && *cp == ':')
> > + } else if (!bracket_depth && *cp == ':') {
> > break;
> > + }
> > }
> > if (*cp == ':') {
> > struct object_id tree_oid;
>
> Makes sense. Only the first hunk actually changes anything, the
> remaining changes are only required to make us stick to our coding
> style.
>
> I wonder though: does this have any impact on '<rev>^{<type>}' and other
> syntaxes where we use '^' instead of '@'?
<type> is pretty limited, so I see no problem there. However
<rev>^{/<search text>} is problematic, as Junio pointed out. I've
fixed up the patch and added a testcase to cover all the '^{...}'
cases.
> > diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> > index d36cd7c0863..252485dac78 100755
> > --- a/t/t1006-cat-file.sh
> > +++ b/t/t1006-cat-file.sh
> > @@ -603,6 +603,23 @@ test_expect_success FUNNYNAMES '--batch-check, -Z with newline in input' '
> > test_cmp expect actual
> > '
> >
> > +test_expect_success FUNNYNAMES 'setup with curly braches in input' '
> > + git branch "foo{bar" &&
> > + git branch "foo@"
> > +'
> > +
> > +test_expect_success FUNNYNAMES 'object reference with curly brace' '
> > + git cat-file -p "foo{bar:hello" >actual &&
> > + git cat-file -p HEAD:hello >expect &&
> > + test_cmp expect actual
> > +'
> > +
> > +test_expect_success FUNNYNAMES 'object reference with at-sign' '
> > + git cat-file -p "foo@@{0}:hello" >actual &&
> > + git cat-file -p HEAD:hello >expect &&
> > + test_cmp expect actual
> > +'
>
> Do these really need the FUNNYNAMES prereq? The prereq seems to only be
> about embedded quotes, tabs and newlines and is disallowed on MinGW. But
> I think both '{' and '@' should work alright there, shouldn't they?
Oh, I misread the failures. It turns out the FUNNYNAMES prereq fixed
things in CI on windows for me because the only commit ever created in
the repository is created by a testcase with a FUNNYNAMES prereq.
Since the setup for my tests relied on HEAD existing (because I run
git branch "foo{bar" HEAD
in a setup test of my own), the tests were failing. I didn't look
closely enough and assumed that command was failing because Windows
didn't like a branch name with a curly brace, but the real reason it
was failing was because HEAD didn't exist.
I'll tweak one of the earlier setup tests to create a commit so HEAD exists.
Thanks for pointing this out. |
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
Maintainer note: these bugs both date back to 2006; neither is a regression in this cycle.
Changes since v1:
The second patch has some backward compatibility concerns. People used to be able to do e.g.
git show ${garbage}-g${hash}
. I tightened it to${valid_refname}-${number}-g${hash}
, but do we want to allow e.g.${valid_refname}-g${hash}
(allowing the count to be omitted) or maybe even allow a subset of invalid refnames?Also for the second patch, while the repository the reporter found the issue in was something else, I found two open source examples:
git cat-file -t master:random/path/major-gaffed
git cat-file -t super-invalid~///\\[email protected]
cc: Patrick Steinhardt [email protected]
cc: Elijah Newren [email protected]