(ASAN report-converter) CodeChecker cmd diff
finds new issues even though the reports are the same
#3632
Hi, this is a real issue to solve in the future. The problem is that the hash which identifies the report type contains the message emitted by the sanitizer. So yes, unfortunately, they will be identified as different reports every time.
We could execute a regex replace on the message, right before generating the hash. Using
Example:
In the CodeChecker Python code, where would be the best place to do this replacement? I can submit a patch if it's not too complicated.
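A minimal sketch of the regex-replace idea, assuming the hash is built from the checker name, the file path and the message. The names `normalize_message` and `message_hash` are hypothetical and not part of CodeChecker, the hash input is simplified for illustration, and the two sample messages are made up to resemble sanitizer output.

```python
import hashlib
import re

# Matches hexadecimal addresses such as 0x602000000010.
HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")


def normalize_message(message: str) -> str:
    """Replace every hex address with a fixed placeholder."""
    return HEX_ADDRESS.sub("0x?", message)


def message_hash(checker: str, file_path: str, message: str) -> str:
    """Hypothetical, simplified hash: NOT CodeChecker's real algorithm."""
    data = "|".join([checker, file_path, normalize_message(message)])
    return hashlib.md5(data.encode("utf-8")).hexdigest()


# Made-up sample messages that differ only in their addresses.
day1 = "heap-use-after-free on address 0x602000000010 at pc 0x0000004f5b2a"
day2 = "heap-use-after-free on address 0x603000001f40 at pc 0x0000004f6c11"

same = message_hash("asan", "test.c", day1) == message_hash("asan", "test.c", day2)
print(same)
```

With the addresses replaced before hashing, both runs produce the same hash, while messages that differ in anything other than an address still hash differently.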
There is a
Isn't the report-hash computation in CodeChecker itself, rather than in report-converter? Such a change might have to be generic.
CodeChecker is using this What command did you use for generating sanitizer reports?
I used My point is that there are 2 places where this can be fixed:
I would vote on the first option, i.e. changing the checker message in the
We should also think about whether there is a way to handle the sanitizer reports that are already stored. I have daily reports printing a ton of addresses each, all with a different report hash. Even though many reports (even in the same run) refer to the same issue, selecting the "unique reports" filter doesn't help. Would all these reports have to be deleted, re-converted and then re-stored?
There are messages that tell you "read of size 8 at Address1" and then print a whole stacktrace and then give more information on that specific address by printing "Address1 is located X bytes inside a 32-byte region freed by thread T0 here" and print another stacktrace. Apparently it helps to see that it is talking about the same address.
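One way to keep the "same address" relationship visible while still removing run-specific values is to give each distinct address a stable label in order of first appearance. This is only a sketch under that assumption; `label_addresses` is a hypothetical helper, not an existing report-converter function, and the sample text is made up.

```python
import re

# Matches hexadecimal addresses such as 0x60200000eff0.
HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")


def label_addresses(text: str) -> str:
    """Replace each distinct address with Address1, Address2, ... in order of first use."""
    labels = {}

    def replace(match):
        addr = match.group(0).lower()
        if addr not in labels:
            labels[addr] = f"Address{len(labels) + 1}"
        return labels[addr]

    return HEX_ADDRESS.sub(replace, text)


# Made-up sample: the same address appears in two lines of one report.
run1 = ("read of size 8 at 0x60200000eff0\n"
        "0x60200000eff0 is located 16 bytes inside a 32-byte region")
run2 = ("read of size 8 at 0x60300001a2b0\n"
        "0x60300001a2b0 is located 16 bytes inside a 32-byte region")

print(label_addresses(run1))
print(label_addresses(run1) == label_addresses(run2))
```

Both occurrences of an address get the same label, so the reader can still tell that the two lines refer to the same memory, and two runs with different raw addresses yield identical text.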
Yeah, I see this need more and more. We don't even need a "text bubble"; we need a clearer way to display a lot of text. The message I mentioned above might print 150 lines of stacktraces (from different threads) and helper messages. report-converter only catches the top stacktrace and ignores the rest, but they are vital to debugging the issue. I'll file a separate ticket for that.
For some time we have also been thinking about the case of the same reports changing their hash. This might happen either because the user chose another hash algorithm at We had a discussion in the team and we thought that we could move hash generation to the report converter. It could be a better solution to this current issue.
I like the idea, that should be the best indeed! (even if a bit complicated to implement) Thank you for discussing it further. Regarding the extra messages and stacktraces, I have opened #3639 to keep track of it, since parsing, storing and presenting all that info is a separate and complicated issue.
I compile the exact same code and the same test with AddressSanitizer enabled on two different days. On both days, AddressSanitizer reports the same issue while running the test, but the memory addresses are different. It seems CodeChecker considers this a significant difference, and the issues get a different report_hash.
I append snippets of
CodeChecker cmd results -o json
for both runs. Notice that the reports are for the same bug, but the report-hash is different.
First day:
Second day:
As a result,
CodeChecker cmd diff
reports new issues found every day, without really having any new issue detected.