Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix python.lang.correctness.baseclass-attribute-override #3434

Merged
merged 2 commits into from
Jul 18, 2024

Conversation

IagoAbal
Copy link
Contributor

The rule depended on Semgrep unifying two identifiers, corresponding to methods in two different classes, that were not the same but just had the same "string" name.

The rule depended on Semgrep unifying two identifiers, corresponding to
methods in two different classes, that were not the same but just had
the same "string" name.
@IagoAbal IagoAbal requested review from 0xDC0DE and p4p3r July 18, 2024 15:28
Copy link
Contributor

@akuhlens akuhlens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙈

@IagoAbal IagoAbal merged commit cd6cd52 into develop Jul 18, 2024
8 checks passed
@IagoAbal IagoAbal deleted the iago/fix-rule branch July 18, 2024 16:21
semgrep-ci bot pushed a commit to semgrep/semgrep that referenced this pull request Jul 22, 2024
…emgrep/semgrep-proprietary#1877)

Applying name equivalences on a global id forks matching, trying all its
"alternate names", while erasing the id_info. We were applying
equivalences right away, before even knowing if it was required. If a
metavariable matched a global id, instead of being just bound to that
global id, it ended up bound multiple times with each one of the
alternate names (with no id_info).

In #1751 we added the `semgrep-internal-metavariable-name` operator, and
its implementation required some "hacks" partly because of this issue.
Ideally, we analyze the sources and record a set of `Name.t`s of our
interest, then we just check if `...-metavariable-name` binds to one of
those `Name.t`s. But this doesn't work because `...-metavariable-name`
was binding to plain strings without id_info.

This should enable us to clean up #1751.

Closes CODE-7336

As a side-effect, Semgrep was unifying different global ids that just
happened to have the same string name, and I had to fix one rule that
was relying on this behavior: semgrep/semgrep-rules#3434.

For backwards compatibility I added the `unify_ids_strictly` option,
which can be disabled to unify ids ignoring their id_info. The option
could remain undocumented/experimental for now, the best way to write
those rules is probably as we did in semgrep/semgrep-rules#3434, by
using two different metavariables and then comparing them like
`comparison: str($X) == str($Y)`.

test plan:
SEMGREP_LOG_SRCS=semgrep.matching bin/semgrep-core-proprietary
-deep_inter_file \
-lang python -e '$FLASK(__name__)'
OSS/tests/rules//sym_prop_decorator.py -debug
#^ This now produces one match instead of two, and via -debug it can be
checked
#^ that we now bind $FLASK with the id `flask` preserving its id_info.

synced from Pro 8ed835e5a9f692f49c9a227bad500b0f00751e3c
semgrep-ci bot pushed a commit to semgrep/semgrep that referenced this pull request Jul 23, 2024
…emgrep/semgrep-proprietary#1877)

Applying name equivalences on a global id forks matching, trying all its
"alternate names", while erasing the id_info. We were applying
equivalences right away, before even knowing if it was required. If a
metavariable matched a global id, instead of being just bound to that
global id, it ended up bound multiple times with each one of the
alternate names (with no id_info).

In #1751 we added the `semgrep-internal-metavariable-name` operator, and
its implementation required some "hacks" partly because of this issue.
Ideally, we analyze the sources and record a set of `Name.t`s of our
interest, then we just check if `...-metavariable-name` binds to one of
those `Name.t`s. But this doesn't work because `...-metavariable-name`
was binding to plain strings without id_info.

This should enable us to clean up #1751.

Closes CODE-7336

As a side-effect, Semgrep was unifying different global ids that just
happened to have the same string name, and I had to fix one rule that
was relying on this behavior: semgrep/semgrep-rules#3434.

For backwards compatibility I added the `unify_ids_strictly` option,
which can be disabled to unify ids ignoring their id_info. The option
could remain undocumented/experimental for now, the best way to write
those rules is probably as we did in semgrep/semgrep-rules#3434, by
using two different metavariables and then comparing them like
`comparison: str($X) == str($Y)`.

test plan:
SEMGREP_LOG_SRCS=semgrep.matching bin/semgrep-core-proprietary
-deep_inter_file \
-lang python -e '$FLASK(__name__)'
OSS/tests/rules//sym_prop_decorator.py -debug
#^ This now produces one match instead of two, and via -debug it can be
checked
#^ that we now bind $FLASK with the id `flask` preserving its id_info.

synced from Pro 8ed835e5a9f692f49c9a227bad500b0f00751e3c
semgrep-ci bot pushed a commit to semgrep/semgrep that referenced this pull request Jul 24, 2024
…emgrep/semgrep-proprietary#1877)

Applying name equivalences on a global id forks matching, trying all its
"alternate names", while erasing the id_info. We were applying
equivalences right away, before even knowing if it was required. If a
metavariable matched a global id, instead of being just bound to that
global id, it ended up bound multiple times with each one of the
alternate names (with no id_info).

In #1751 we added the `semgrep-internal-metavariable-name` operator, and
its implementation required some "hacks" partly because of this issue.
Ideally, we analyze the sources and record a set of `Name.t`s of our
interest, then we just check if `...-metavariable-name` binds to one of
those `Name.t`s. But this doesn't work because `...-metavariable-name`
was binding to plain strings without id_info.

This should enable us to clean up #1751.

Closes CODE-7336

As a side-effect, Semgrep was unifying different global ids that just
happened to have the same string name, and I had to fix one rule that
was relying on this behavior: semgrep/semgrep-rules#3434.

For backwards compatibility I added the `unify_ids_strictly` option,
which can be disabled to unify ids ignoring their id_info. The option
could remain undocumented/experimental for now, the best way to write
those rules is probably as we did in semgrep/semgrep-rules#3434, by
using two different metavariables and then comparing them like
`comparison: str($X) == str($Y)`.

test plan:
SEMGREP_LOG_SRCS=semgrep.matching bin/semgrep-core-proprietary
-deep_inter_file \
-lang python -e '$FLASK(__name__)'
OSS/tests/rules//sym_prop_decorator.py -debug
#^ This now produces one match instead of two, and via -debug it can be
checked
#^ that we now bind $FLASK with the id `flask` preserving its id_info.

synced from Pro 8ed835e5a9f692f49c9a227bad500b0f00751e3c
semgrep-ci bot pushed a commit to semgrep/semgrep that referenced this pull request Jul 24, 2024
…emgrep/semgrep-proprietary#1877)

Applying name equivalences on a global id forks matching, trying all its
"alternate names", while erasing the id_info. We were applying
equivalences right away, before even knowing if it was required. If a
metavariable matched a global id, instead of being just bound to that
global id, it ended up bound multiple times with each one of the
alternate names (with no id_info).

In #1751 we added the `semgrep-internal-metavariable-name` operator, and
its implementation required some "hacks" partly because of this issue.
Ideally, we analyze the sources and record a set of `Name.t`s of our
interest, then we just check if `...-metavariable-name` binds to one of
those `Name.t`s. But this doesn't work because `...-metavariable-name`
was binding to plain strings without id_info.

This should enable us to clean up #1751.

Closes CODE-7336

As a side-effect, Semgrep was unifying different global ids that just
happened to have the same string name, and I had to fix one rule that
was relying on this behavior: semgrep/semgrep-rules#3434.

For backwards compatibility I added the `unify_ids_strictly` option,
which can be disabled to unify ids ignoring their id_info. The option
could remain undocumented/experimental for now, the best way to write
those rules is probably as we did in semgrep/semgrep-rules#3434, by
using two different metavariables and then comparing them like
`comparison: str($X) == str($Y)`.

test plan:
SEMGREP_LOG_SRCS=semgrep.matching bin/semgrep-core-proprietary
-deep_inter_file \
-lang python -e '$FLASK(__name__)'
OSS/tests/rules//sym_prop_decorator.py -debug
#^ This now produces one match instead of two, and via -debug it can be
checked
#^ that we now bind $FLASK with the id `flask` preserving its id_info.

synced from Pro 8ed835e5a9f692f49c9a227bad500b0f00751e3c
aryx pushed a commit to semgrep/semgrep that referenced this pull request Jul 24, 2024
…emgrep/semgrep-proprietary#1877)

Applying name equivalences on a global id forks matching, trying all its
"alternate names", while erasing the id_info. We were applying
equivalences right away, before even knowing if it was required. If a
metavariable matched a global id, instead of being just bound to that
global id, it ended up bound multiple times with each one of the
alternate names (with no id_info).

In #1751 we added the `semgrep-internal-metavariable-name` operator, and
its implementation required some "hacks" partly because of this issue.
Ideally, we analyze the sources and record a set of `Name.t`s of our
interest, then we just check if `...-metavariable-name` binds to one of
those `Name.t`s. But this doesn't work because `...-metavariable-name`
was binding to plain strings without id_info.

This should enable us to clean up #1751.

Closes CODE-7336

As a side-effect, Semgrep was unifying different global ids that just
happened to have the same string name, and I had to fix one rule that
was relying on this behavior: semgrep/semgrep-rules#3434.

For backwards compatibility I added the `unify_ids_strictly` option,
which can be disabled to unify ids ignoring their id_info. The option
could remain undocumented/experimental for now, the best way to write
those rules is probably as we did in semgrep/semgrep-rules#3434, by
using two different metavariables and then comparing them like
`comparison: str($X) == str($Y)`.

test plan:
SEMGREP_LOG_SRCS=semgrep.matching bin/semgrep-core-proprietary
-deep_inter_file \
-lang python -e '$FLASK(__name__)'
OSS/tests/rules//sym_prop_decorator.py -debug
#^ This now produces one match instead of two, and via -debug it can be
checked
#^ that we now bind $FLASK with the id `flask` preserving its id_info.

synced from Pro 8ed835e5a9f692f49c9a227bad500b0f00751e3c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants