[-Wunsafe-buffer-usage] Accept calls to some libc functions with annotated arguments #10088

Merged

Conversation

ziqingluo-90

  • printf, fprintf, and snprintf functions accept __null_terminated
  • snprintf function accepts __counted_by/__sized_by
  • functions consuming a single string pointer like strlen or atoi accept __null_terminated

Generalized isCountAttributedPointerArgumentSafe so that it is shared by interoperation gadgets and the unsafe libc gadget.

(A follow-up change to rdar://138798346)

@ziqingluo-90
Author

CC @dtarditi

@ziqingluo-90
Author

This PR depends on #10060

@@ -747,7 +747,7 @@ const Expr *extractExtentFromSubviewDataCall(ASTContext &Context,
 static bool hasIntegeralConstant(const Expr *E, uint64_t Val, ASTContext &Ctx) {
   Expr::EvalResult ER;

-  if (E->EvaluateAsConstantExpr(ER, Ctx)) {
+  if (E->EvaluateAsInt(ER, Ctx)) {
Author

According to https://github.com/llvm/llvm-project/pull/124022/files, the result of EvaluateAsConstantExpr is not guaranteed to support a later getInt() call.

@@ -23,7 +23,7 @@ namespace std {
T* p;
T *c_str();
T *data();
unsigned size_bytes();
Author

A mistake I made in the test: std::string has no size_bytes method.

It's probably worth checking if basic_string<wchar_t> works as expected.

Author

✅ Added test cases for wide characters.

@@ -366,6 +366,7 @@ isInUnspecifiedUntypedContext(internal::Matcher<Stmt> InnerMatcher) {

namespace {

/* TO_UPSTREAM(BoundsSafetyInterop) ON */
Author

I assume there is no problem in upstreaming the part of the interop in UnsafeBufferUsage.cpp? @patrykstefanski

Yup, this is fine.

@ziqingluo-90
Author

add @jkorous-apple

snwprintf(safe_p, n, "%s", str); // expected-warning{{function 'snwprintf' is unsafe}} expected-note{{string argument is not guaranteed to be null-terminated}}

memcpy(safe_p, safe_p, n); // no warn
strlen(str); // expected-warning{{unsafe assignment to a parameter of '__null_terminated' type; only '__null_terminated' pointers, string literals, and 'std::string::c_str' calls are compatible with '__null_terminated' pointers}}

I'm not getting this warning here.

error: 'expected-warning' diagnostics expected but not seen:
  File ./clang/test/SemaCXX/warn-unsafe-buffer-usage-libc-functions-interop.cpp Line 36: unsafe assignment to a parameter of '__null_terminated' type; only '__null_terminated' pointers, string literals, and 'std::string::c_str' calls are compatible with '__null_terminated' pointers

Author

✅ It was not up to date to the dependent PR. Fixed.

&*ValuesOpt, Context);
return isCountAttributedPointerArgumentSafeImpl(
Context, Arg, CountArg, CAT, CAT->getCountExpr(), CAT->isCountInBytes(),
CAT->isOrNull(), ValuesOpt ? &*ValuesOpt : nullptr);

Nit: IMHO we could be explicit to avoid any confusion: ValuesOpt.has_value() ? &*ValuesOpt : nullptr.

assert(CountedByExpr ||
!DependentValueMap &&
"If the __counted_by information is hardcoded, there is no "
"dependent value map.");

Is the operator precedence in those 2 asserts correct? The bottom one is equivalent to:

CountedByExpr ||
         (!DependentValueMap &&
             "If the __counted_by information is hardcoded, there is no "
             "dependent value map.")

Is this expected?

Author

The string literal is simply true so I think they are equivalent. But let's add parentheses to avoid confusion.
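
For readers following this thread, here is a small standalone sketch (not part of the PR) of the precedence point. It assumes two plain bool inputs standing in for `CountedByExpr` and `DependentValueMap`; the function names are hypothetical. Because `&&` binds tighter than `||` and a string literal converts to `true`, both groupings evaluate to the same value.

```cpp
#include <cassert>
#include <initializer_list>

// How the compiler groups `A || !B && "msg"`: `&&` binds tighter than `||`.
bool asParsed(bool countedByExpr, bool hasMap) {
  return countedByExpr || (!hasMap && "msg");
}

// The grouping that attaches the message to the whole condition.
bool withParentheses(bool countedByExpr, bool hasMap) {
  return (countedByExpr || !hasMap) && "msg";
}

// Checks all four input combinations; returns true when both groupings agree.
bool groupingsAgree() {
  for (bool a : {false, true})
    for (bool b : {false, true})
      if (asParsed(a, b) != withParentheses(a, b))
        return false;
  return true;
}

// Usage: assert(groupingsAgree());
```

The explicit parentheses do not change behavior; they only make the intended grouping visible and avoid precedence warnings.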

auto ValuesOpt = getDependentValuesFromCall(CAT, Call);
if (!ValuesOpt.has_value())
if (!DependentValueMap && CountedByExpr)
// Bail if the map is not available for case (a). Becuase

Becuase -> Because.

return false;

const Expr *ArgCount = nullptr;
// the acutal count of the pointer inferred through patterns below:

acutal -> actual


// expected-note@+2{{consider using a safe container and passing '.data()' to the parameter 'dst' and '.size()' to its dependent parameter 'size' or 'std::span' and passing '.first(...).data()' to the parameter 'dst'}}
// expected-note@+1{{consider using a safe container and passing '.data()' to the parameter 'src' and '.size()' to its dependent parameter 'size' or 'std::span' and passing '.first(...).data()' to the parameter 'src'}}
void memcpy(void * __sized_by(size) dst, const void * __sized_by(size) src, unsigned size);

memcpy returns void * and takes size_t size.

void memcpy(void * __sized_by(size) dst, const void * __sized_by(size) src, unsigned size);
unsigned strlen( const char* __null_terminated str );
// expected-note@+1{{consider using a safe container and passing '.data()' to the parameter 'buffer' and '.size()' to its dependent parameter 'buf_size' or 'std::span' and passing '.first(...).data()' to the parameter 'buffer'}}
int snprintf( char* __counted_by(buf_size) buffer, unsigned buf_size, const char* format, ... );

Nit: Those printf functions take size_t.

// expected-note@+1{{consider using a safe container and passing '.data()' to the parameter 'buffer' and '.size()' to its dependent parameter 'buf_size' or 'std::span' and passing '.first(...).data()' to the parameter 'buffer'}}
int snprintf( char* __counted_by(buf_size) buffer, unsigned buf_size, const char* format, ... );
int snwprintf( char* __counted_by(buf_size) buffer, unsigned buf_size, const char* format, ... );
int vsnprintf( char* __counted_by(buf_size) buffer, unsigned buf_size, const char* format, ... );

Should take va_list vlist instead of ....

Author

hmm... then I need to define a va_list that accepts variant parameters.

Could you elaborate? Is calling vsnprintf expected to be always unsafe? Cannot we declare this function with va_list and then pass an empty va_list on the call site?

Author

ok, that would work.
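
As a rough illustration of the signature difference discussed above (a sketch, not code from the PR): the standard `vsnprintf` takes a `va_list` as its last parameter instead of `...`. The wrapper name `formatInto` below is hypothetical; it collects its variadic arguments into a `va_list` and forwards them.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical wrapper (not from the PR): gathers its variadic arguments
// into a va_list and forwards them, because vsnprintf takes a va_list.
int formatInto(char *buffer, std::size_t bufSize, const char *format, ...) {
  assert(buffer != nullptr && format != nullptr);
  va_list args;
  va_start(args, format);
  int written = std::vsnprintf(buffer, bufSize, format, args);
  va_end(args);
  // Number of characters the full output needs, excluding the terminator.
  return written;
}
```

A test declaration that mirrors this shape can then be called with a `va_list` at the call site, which is what the thread settles on.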

}
return true; // ptr and size are not in safe pattern
return !isHardcodedCountedByPointerArgumentSafe(
Finder->getASTContext(), Buf, Size, FirstParmTy.getTypePtr(), true,

Is true passed to isSizedBy correct here? I think the functions taking wchar_t would be annotated with __counted_by().

Author

that's a good point!
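
To make the count-versus-size distinction concrete, here is a standalone sketch (not from the PR; the helper names are hypothetical). For a `wchar_t` buffer the element count, which is what `__counted_by` would describe, differs from the byte size, which is what `__sized_by` would describe. The standard `swprintf` takes the length in wide characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Number of elements in an array: what a __counted_by annotation describes.
template <typename T, std::size_t N>
constexpr std::size_t elementCount(const T (&)[N]) { return N; }

// Number of bytes in an array: what a __sized_by annotation describes.
template <typename T, std::size_t N>
constexpr std::size_t byteSize(const T (&)[N]) { return N * sizeof(T); }

// std::swprintf takes the buffer length in wide characters, not in bytes.
template <std::size_t N>
int formatWide(wchar_t (&buffer)[N], int value) {
  return std::swprintf(buffer, elementCount(buffer), L"%d", value);
}

// Usage: wchar_t w[8]; assert(formatWide(w, 42) == 2);
```

Passing `byteSize(buffer)` to a wide-character function would overstate the capacity by a factor of `sizeof(wchar_t)`, which is why the flag matters.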

…tated arguments

- `printf`, `fprintf`, and `snprintf` functions accept `__null_terminated`
- `snprintf` function accepts `__counted_by/__sized_by`
- functions consuming a single string pointer like `strlen` or `atoi`
  accept `__null_terminated`

Generalized `isCountAttributedPointerArgumentSafe` so that it is
shared by interoperation gadgets and the unsafe libc gadget.

(A follow-up change to rdar://138798346)
@ziqingluo-90 force-pushed the dev/ziqing/PR-138798346-followup branch from 306ea22 to 2678bd8 on February 26, 2025 00:45
@ziqingluo-90
Author

@patrykstefanski I have addressed your comments. Plz take another look.

@ziqingluo-90
Author

Also added the pattern std::span{p, strlen(p)}.

} else
// When there is no argument representing the count/size, it is safe iff
// the annotation is `__counted_by(1)`.
return !isSizedBy && hasIntegeralConstant(CountedByExpr, 1, Context);

If we have CountArg above, we support form 2 (&var, sizeof(var)) with this check:

      // form 2:
      if (auto TySize = Context.getTypeSizeInCharsIfKnown(DRE->getType()))
        return hasIntegeralConstant(CountArg, TySize->getQuantity(), Context);

In the else branch here, we don't check for the form 2, we only check form 1. Is this expected?

} else
// When there is no argument representing the count/size, it is safe iff
// the annotation is `__counted_by(1)`.
return !isSizedBy && hasIntegeralConstant(CountedByExpr, 1, Context);
return false;
}

Yup, that's fine.
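
A hedged sketch of the two call shapes mentioned in this thread, for readers unfamiliar with them. The review only spells out form 2, `(&var, sizeof(var))`; form 1 is assumed here to be `(&var, 1)`. The function names are hypothetical, and the annotations appear as comments because real code would need `<ptrcheck.h>` and `-fexperimental-bounds-safety-attributes`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Annotation shown as a comment: p would be declared __counted_by(n).
int sum(const int *p /* __counted_by(n) */, std::size_t n) {
  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += p[i];
  return total;
}

// Annotation shown as a comment: p would be declared __sized_by(n).
void zeroFill(void *p /* __sized_by(n) */, std::size_t n) {
  std::memset(p, 0, n);
}

// Assumed form 1: the address of a single object paired with a count of 1.
int formOne() {
  int var = 5;
  return sum(&var, 1);
}

// Form 2 from the review: the address of an object paired with sizeof(object).
int formTwo() {
  int var = 5;
  zeroFill(&var, sizeof(var));
  return var;
}

// Usage: assert(formOne() == 5); assert(formTwo() == 0);
```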

return (FName == "strlen" || FName == "wcslen") &&
AreSameDRE(Arg0, CallArg);
}
}

Is the check for __null_terminated missing?

size_t strlen( const char* __null_terminated str );
size_t wcslen( const wchar_t* __null_terminated str );

void test_span_ctor_non_nt(char *p, const wchar_t *wp) {
  std::span S{p, strlen(p)};
  std::span S2{wp, wcslen(wp)};
}

It looks like we don't warn for this code. Not sure why it doesn't warn for the call strlen(p) though.

Author

hmm... yeah, I'll look into it.

Author

ok, it's because that test only has -Wunsafe-buffer-usage-in-container but no -Wunsafe-buffer-usage. This is expected.

Author

I have added a test for this.

@@ -933,7 +1014,7 @@ bool isSinglePointerArgumentSafe(ASTContext &Context, const Expr *Arg) {
// 6. `std::span<T>{p, n}`, where `p` is a __counted_by(`n`)/__sized_by(`n`)
// pointer OR `std::span<char>{(char*)p, n}`, where `p` is a __sized_by(`n`)
// pointer.

// 7. `std::span<char>{p, strlen(p)}` or `std::span<wchar_t>{p, wcslen(p)}`

I would add '... if p is __null_terminated'.

Author

Actually, whether p is __null_terminated is irrelevant here.
The analysis of std::span constructors only cares whether the two arguments are in the form p and strlen(p).
If p is not null-terminated, it's up to the type checking to complain about it. Just like we do not check that p is of pointer-to-char type here.

T *c_str();
T *data();
unsigned size();
};

Can we match the interface to be the same as the types in std::? You can copy-paste from warn-unsafe-buffer-usage-count-attributed-pointer-argument.cpp.

Author

Done.

}
}
if (auto NumChars =
Finder->getASTContext().getTypeSizeInCharsIfKnown(FirstPteTy)) {

Nit: auto and explicit optional:

  std::optional<CharUnits> NumChars =
      Finder->getASTContext().getTypeSizeInCharsIfKnown(FirstPteTy);
  if (NumChars.has_value()) {

Author

isn't if (std::optional<CharUnits> NumChars = ...) better?
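
The three spellings under discussion behave the same at run time. Below is a standalone sketch, assuming a hypothetical `sizeIfKnown` helper in place of `getTypeSizeInCharsIfKnown`, that puts them side by side.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a lookup that can fail, such as a type-size query.
std::optional<int> sizeIfKnown(bool known) {
  if (known)
    return 4;
  return std::nullopt;
}

// Declaration in the condition, with the type deduced by auto.
int withAuto(bool known) {
  if (auto size = sizeIfKnown(known))
    return *size;
  return -1;
}

// Declaration in the condition, with the optional type spelled out.
int withExplicitType(bool known) {
  if (std::optional<int> size = sizeIfKnown(known))
    return *size;
  return -1;
}

// Separate declaration plus has_value(); the variable outlives the if.
int withHasValue(bool known) {
  std::optional<int> size = sizeIfKnown(known);
  if (size.has_value())
    return *size;
  return -1;
}

// Usage: assert(withExplicitType(true) == withHasValue(true));
```

The difference is readability and scope: the in-condition forms keep the variable local to the `if`, while the explicit type documents that the result is an optional.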

// RUN: %clang_cc1 -std=c++20 -Wno-all -Wunsafe-buffer-usage -Wno-error=bounds-safety-strict-terminated-by-cast\
// RUN: -verify -fexperimental-bounds-safety-attributes %s
#include <ptrcheck.h>
typedef unsigned size_t;

Can't we include stddef.h? unsigned is not the right type on most 64-bit platforms.

Author

sure!
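
A small sketch (not from the PR; names are hypothetical) of why the typedef matters: on platforms where `size_t` is wider than `unsigned`, a round trip through `unsigned` can drop high bits.

```cpp
#include <cassert>
#include <cstddef>

// True when converting n to unsigned and back loses information.
constexpr bool truncatesInUnsigned(std::size_t n) {
  return static_cast<std::size_t>(static_cast<unsigned>(n)) != n;
}

// True on platforms where size_t is wider than unsigned, as on typical
// 64-bit targets.
constexpr bool sizeTIsWider = sizeof(std::size_t) > sizeof(unsigned);

// Usage: assert(!truncatesInUnsigned(10));
```

Including `<stddef.h>` avoids the mismatch entirely because the test then uses the platform's real `size_t`.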

@ziqingluo-90 merged commit cfe50ed into swiftlang:next on Mar 5, 2025
@ziqingluo-90 deleted the dev/ziqing/PR-138798346-followup branch on March 5, 2025 01:55
ziqingluo-90 added a commit that referenced this pull request Mar 6, 2025
…tated arguments (#10088)

* [-Wunsafe-buffer-usage] Accept calls to some libc functions with annotated arguments

- `printf`, `fprintf`, and `snprintf` functions accept `__null_terminated`
- `snprintf` function accepts `__counted_by/__sized_by`
- functions consuming a single string pointer like `strlen` or `atoi`
  accept `__null_terminated`

Generalized `isCountAttributedPointerArgumentSafe` so that it is
shared by interoperation gadgets and the unsafe libc gadget.

(A follow-up change to rdar://138798346)
ziqingluo-90 added a commit that referenced this pull request Mar 7, 2025
…tated arguments & Let libc warnings yield to bounds attributes (#10182)

Cherry-pick two commits:

* [Upstream][C++ Safe Buffers] Let libc warnings yield to bounds attributes

For a call to an unsafe libc function, do not warn about it if its
callee function has any bounds attributes. Because we can verify the
safety of the call using bounds attributes.

(rdar://140138380)

* [-Wunsafe-buffer-usage] Accept calls to some libc functions with annotated arguments (#10088)

- `printf`, `fprintf`, and `snprintf` functions accept `__null_terminated`
- `snprintf` function accepts `__counted_by/__sized_by`
- functions consuming a single string pointer like `strlen` or `atoi`
  accept `__null_terminated`

Generalized `isCountAttributedPointerArgumentSafe` so that it is
shared by interoperation gadgets and the unsafe libc gadget.

(A follow-up change to rdar://138798346)