Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ar uses the same temporary file name for building libs #170

Open
sterpumihai opened this issue Oct 10, 2023 · 5 comments
Open

ar uses the same temporary file name for building libs #170

sterpumihai opened this issue Oct 10, 2023 · 5 comments

Comments

@sterpumihai
Copy link

Hi Brecht,

I'm encountering issues with both gcc 13.2 and 10.5, both with the same root cause.
The problem occurs when invoking parallel instances of ar. Because they all use the same temporary file name (in my case "stP1kAlM", not sure if applies in general), sometimes some ar calls fail with error "ar.exe: could not create temporary file whilst writing archive: Permission denied".

I went down the rabbit hole and traced this error message inside ar's implementation ie. https://github.com/bminor/binutils-gdb/blob/master/binutils/ar.c (unofficial mirror), line 1241, function write_archive. So the error is thrown when make_tempname returns NULL.
make_tempname is in https://github.com/bminor/binutils-gdb/blob/master/binutils/bucomm.c at line 541 and we can see it calls mkstemp - actually, I'm 100% sure it uses mkstemp and not mktemp as the format of a string produced by mktemp has one letter and 5 digits whereas mkstemp returns 6 alphanumeric characters.

Therefore, the problem now moves to glibc, https://github.com/bminor/glibc/blob/master/sysdeps/posix/tempname.c

The implementation of the random 6 alphanumeric characters generator is inside try_tempname_len, line 238.
At line 317 we can see that after a random generation it opens the freshly created random file. If the file already exists, it generates again. I have verified this behaviour locally by using a small application compiled with the gcc 13.2 you provide.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

char template[100];

int main()
{
  strncpy(template, "stXXXXXX", 11);
  mkstemp(template);
  printf("%s\n", template);
  return 0;
}

If you execute this small application like:

gcc main.c
a
del <whatever random file was created> && a
del <whatever random file was created> && a
...

you will notice that it always prints the same "random" string. This suggests that there is no entropy in the randomness provided by glibc's mkstemp.
In order to obtain different results you need to keep the generated random files.

What do you think? Where might the problem be? I somehow suspect glibc. Can you please check and maybe update glibc on the environment you are using for building gcc? I'm quite curious if this is already solved or if it's something new.

@sterpumihai
Copy link
Author

Ok I think I figured it out, the issue is with the current version of gnulib, not glibc. I reported it to the issue list.
Unfortunately, I don't see a workaround at the moment.

@brechtsanders if a fix is eventually done on the master branch of gnulib can you please re-trigger a build for gcc 10.5?

@brechtsanders
Copy link
Owner

brechtsanders commented Oct 11, 2023

What do I need to do to build differently?
I don't explicitely pull in gnulib, and I assume this is replaced with MinGW-w64 on Windows.
See: https://www.gnu.org/software/gnulib/manual/html_node/Native-Windows-Support.html

@brechtsanders
Copy link
Owner

I have an issue, that may or may not be related.

For native winlibs build I usually build everything from scratch (gcc, binutils, etc... + all dependencies).
Then I use that build to build the final release.
This way I know the resulting build can build itself and everything is build with optimizations present in the latest version of the toolchain.
However, recently I get lseek() errors in ar when doing this last stage.
I don't have the issue when building binutils with plugins disabled, but then a lot of stuff won't link.

Maybe the lseek issue is also caused by using the same temporary filename causing a seek beyond EOF?

@brechtsanders
Copy link
Owner

I think maybe mkstemp() from MinGW-w64 returns the same result because there is no random seed.
Maybe ar.exe should perform a random seed using something that's different for each instance (e.g. clock ticks since boot or process ID)?

@sterpumihai
Copy link
Author

Maybe the lseek issue is also caused by using the same temporary filename causing a seek beyond EOF?

Doesn't sound like the problem I'm reporting but we can try out a potential fix and see if this is also eliminated.

I think maybe mkstemp() from MinGW-w64 returns the same result because there is no random seed. Maybe ar.exe should perform a random seed using something that's different for each instance (e.g. clock ticks since boot or process ID)?

Meanwhile I did some more digging. So it turns out there is some randomness with gnulib, even if the main difference between libc and gnulib is that libc initializes the seed with the address of a stack variable. Actually, some people criticize this libc approach as it "leaks" ASLR information indirectly.

gnulib relies on clock() to provide some randomness. However, this has the problem of resolution - at least on Windows the precison is millisecond.
Given each parallel ar instance creates the temporary file immediately after starting, clock() always returns 0 (ie. not even a full ms has passed). This is the reason of the same filename.

Now, to be honest, this is not the immediate cause of the race condition but it creates the premises for it as ar now wants to also remove these files. It's pretty complicated if you study the code.

Let's try a small fix! I'll provide you a patch for binutils based on current master revision (4b41a55fe53ce2da4823ffa3a5b94bd09bf2ab0d). It should patch quite easily as bucomm.c was modified in february this year.
Also, let me know if the patch works ie. if it compiles. I tried to write it as cleanly as possible.

https://drive.google.com/file/d/1UjJXuuR7DA_w4XWn1h1w1ckRiGysCj7f/view?usp=sharing

Can you please therefore apply this patch to binutils and release a new gcc? Can you please also use gcc 10.5 for this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants