Version 5.38: Parallel::ForkManager = rmtree Error #146

Closed
sskopnik opened this issue Oct 20, 2023 · 29 comments
Labels
upstream module: Issue is due to an upstream module


@sskopnik

sskopnik commented Oct 20, 2023

Hi,
when using Parallel::ForkManager, a temporary directory is created for every spawned process. These dirs are later removed with rmtree in the DESTROY method of File::Temp (Temp.pm).
Using Excel::Writer::XLSX in the child processes results in an error when these dirs are deleted. This worked fine under 5.32.

Simple Test script throws:
cannot remove directory for C:/Users/mxxxxx/AppData/Local/Temp/RstWAjVdoy: Directory not empty at C:/Perl64_new/perl/lib/File/Temp.pm line 2643.
cannot remove directory for C:/Users/mxxxxx/AppData/Local/Temp/fvjigK5GMN: Directory not empty at C:/Perl64_new/perl/lib/File/Temp.pm line 2643.

# Test script
use Spreadsheet::WriteExcel;
use Excel::Writer::XLSX;
use Parallel::ForkManager;

sub genXls {
  my $pm = Parallel::ForkManager->new(2);
  
  foreach my $file ('test.xlsx', 'test2.xlsx') {
    my $pid = $pm->start and next;
    srand();
    my $workbook;
    $workbook = Excel::Writer::XLSX->new($file);
    $workbook->close() or die "Error File close";
    $pm->finish; # Terminates the child process
  }
  $pm->wait_all_children; 
}  
genXls();

@genio
Member

genio commented Oct 20, 2023

@sskopnik One issue I've noticed is that some anti-virus programs see the quick creation/deletion of temporary directories on Windows as "sketchy" and will prevent the application from functioning as expected. Can you run that again with any AV programs off during the run?

@sskopnik
Author

No, the error persists even with no AV programs active.
Interesting fact: the generated XLSX files should be empty but valid. They are not! Excel says they are corrupt.
There seems to be a conflict between Excel::Writer::XLSX and a ForkManager environment (parallel generation of Excel files) in combination with Strawberry Perl 5.38.
The same script runs perfectly under Strawberry 5.32 (64-bit). There the Temp dirs are removed without any errors.

@shawnlaffan
Contributor

Excel files are just zip archives so can be opened using 7zip or similar.

When doing that, the generated excel files only contain the [Content_Types].xml member. The temp dirs contain the full contents.
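The same check can be scripted rather than done by hand in 7zip. Below is a minimal sketch using the core IO::Uncompress::Unzip module to list the members of an .xlsx file. The helper name zip_member_names is made up for illustration and is not part of any module mentioned in this thread.

```perl
use strict;
use warnings;
use IO::Uncompress::Unzip qw($UnzipError);

# Hypothetical helper: return the member names stored in a zip archive.
# An .xlsx file is a zip archive, so this works on the generated files.
sub zip_member_names {
    my ($file) = @_;
    my $unzip = IO::Uncompress::Unzip->new($file)
        or die "Cannot open $file: $UnzipError\n";

    my @names;
    my $status;
    for ($status = 1; $status > 0; $status = $unzip->nextStream()) {
        push @names, $unzip->getHeaderInfo()->{Name};

        # Read the member to its end before moving to the next header.
        my $buffer;
        while (($status = $unzip->read($buffer)) > 0) { }
        last if $status < 0;
    }
    die "Error reading $file: $UnzipError\n" if $status < 0;

    $unzip->close();
    return @names;
}

# Usage: perl list_members.pl test.xlsx
if (@ARGV) {
    print "$_\n" for zip_member_names($ARGV[0]);
}
```

A file affected by this issue would list only [Content_Types].xml, while a healthy one should also list the docProps, xl and _rels members seen in the temp dirs.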

This looks like it could be an issue with Excel::Writer::XLSX running under Parallel::ForkManager. It could also be with File::Temp.

The runs below include $File::Temp::DEBUG = 1; in the script. The second disables the Parallel::ForkManager calls.

#  parallel
perl gh_issue146.pl
rmdir C:/Users/user/AppData/Local/Temp/dUnTH1Qilj
cannot remove directory for C:/Users/user/AppData/Local/Temp/dUnTH1Qilj: Directory not empty at C:/perls/5.38.0.1_PDL/perl/lib/File/Temp.pm line 2643.
rmdir C:/Users/user/AppData/Local/Temp/nod2DU9dgt
cannot remove directory for C:/Users/user/AppData/Local/Temp/nod2DU9dgt: Directory not empty at C:/perls/5.38.0.1_PDL/perl/lib/File/Temp.pm line 2643.

#  non-parallel
perl gh_issue146.pl
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\docProps\app.xml
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\docProps\core.xml
rmdir docProps
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\xl\styles.xml
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\xl\theme\theme1.xml
rmdir theme
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\xl\workbook.xml
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\xl\worksheets\sheet1.xml
rmdir worksheets
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\xl\_rels\workbook.xml.rels
rmdir _rels
rmdir xl
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\[Content_Types].xml
unlink C:\Users\user\AppData\Local\Temp\1x2ElZEFHN\_rels\.rels
rmdir _rels
rmdir C:/Users/user/AppData/Local/Temp/1x2ElZEFHN
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\docProps\app.xml
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\docProps\core.xml
rmdir docProps
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\xl\styles.xml
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\xl\theme\theme1.xml
rmdir theme
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\xl\workbook.xml
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\xl\worksheets\sheet1.xml
rmdir worksheets
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\xl\_rels\workbook.xml.rels
rmdir _rels
rmdir xl
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\[Content_Types].xml
unlink C:\Users\user\AppData\Local\Temp\niMBIjJrsP\_rels\.rels
rmdir _rels
rmdir C:/Users/user/AppData/Local/Temp/niMBIjJrsP

FWIW, it is also relatively common that temp dirs cannot be removed under testing. Antivirus software is a common cause but there could be other processes that scan files. OneDrive comes to mind but it hopefully does not trigger when files are created in temp dirs.

@shawnlaffan
Contributor

Possibly related: jmcnamara/excel-writer-xlsx#182

@sskopnik
Author

Thanks for the detailed info, which helped a lot. I think it all boils down to rmtree not being thread-safe. I'm still wondering why the script works flawlessly with 5.32. Will investigate further...

@sskopnik
Author

I tested the parallel version of my test script under 5.32. There the Temp dirs are deleted without any error. The Excel::Writer::XLSX module versions are the same. I still have no explanation!

unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\docProps\app.xml
unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\docProps\core.xml
rmdir docProps
unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\xl\styles.xml
unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\xl\theme\theme1.xml
rmdir theme
unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\xl\workbook.xml
unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\xl\worksheets\sheet1.xml
rmdir worksheets
unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\xl\_rels\workbook.xml.rels
rmdir _rels
rmdir xl
unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\[Content_Types].xml
unlink C:\Users\m701387\AppData\Local\Temp\7j45_O8k9W\_rels\.rels
rmdir _rels
rmdir C:/Users/m701387/AppData/Local/Temp/7j45_O8k9W
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\docProps\app.xml
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\docProps\core.xml
rmdir docProps
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\xl\styles.xml
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\xl\theme\theme1.xml
rmdir theme
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\xl\workbook.xml
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\xl\worksheets\sheet1.xml
rmdir worksheets
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\xl\_rels\workbook.xml.rels
rmdir _rels
rmdir xl
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\[Content_Types].xml
unlink C:\Users\m701387\AppData\Local\Temp\zNfqF48htO\_rels\.rels
rmdir _rels
rmdir C:/Users/m701387/AppData/Local/Temp/zNfqF48htO

@sisyphus

sisyphus commented Oct 24, 2023

@sskopnik, does this (from the File::Path documentation) pertain to the failure message you're seeing:

=item cannot remove directory [dir]: [errmsg]

C<remove_tree> attempted to remove a directory, but failed. This may be because
some objects that were unable to be removed remain in the directory, or
it could be a permissions issue. The directory will be left behind.

It seems to me that there's no fatal error, and the script is actually running fine, apart from there being a couple of directories that aren't being cleaned up.
Is that what you're seeing ?

Cheers,
Rob

@sskopnik
Author

sskopnik commented Oct 24, 2023

No, it is not a cosmetic problem. The generated (in this case empty) XLSX-files are corrupt (1kb instead of 5kb)! They can't be opened via Excel.
Try the same script with Strawberry 5.32 = All fine! The generated files are valid.

use Excel::Writer::XLSX;
use Parallel::ForkManager;

$File::Temp::DEBUG = 1;

sub genXls {
  
  my $pm = Parallel::ForkManager->new(1);
  
  foreach my $file ('test.xlsx', 'test2.xlsx') {
    my $pid = $pm->start and next;
    srand();

    my $workbook;
    $workbook = Excel::Writer::XLSX->new($file);
    $workbook->close() or die "Error File close";

    $pm->finish; # Terminates the child process
  }

  $pm->wait_all_children; # Wait for all children to complete
}  
genXls();

@sskopnik sskopnik changed the title from "Version 5.38: Parallel::ForkManager + Excel::Writer::XLSX = Error deleting Temp Dirs" to "Version 5.38: Parallel::ForkManager = Error rmtree" on Oct 26, 2023
@sskopnik
Author

sskopnik commented Oct 26, 2023

Hi, to take some complexity out of the equation, I removed Excel::Writer::XLSX from my test script. Now I simply create a folder/file structure which I later try to delete in parallel via rmtree:

use Parallel::ForkManager;
use File::Path;

$File::Temp::DEBUG = 1;

sub do_rmtree {
  
  my $pm = Parallel::ForkManager->new(3);
  
  foreach my $file ('TEST/root1', 'TEST/root2', 'TEST/root3') {
    my $pid = $pm->start() and next;

    rmtree($file,1,1);    
    $pm->finish; # Terminates the child process
  }

  $pm->wait_all_children; # Wait until all children have finished
}  

my $existingdir = './TEST';

# rmtree ($existingdir,1,1); 
mkdir $existingdir;
mkdir "$existingdir/root1";
mkdir "$existingdir/root2";
mkdir "$existingdir/root3";

open my $fileHandle, ">>", "$existingdir/root1/filetocreate.txt" or die "Can't open\n";
print $fileHandle "FooBar!\n";
close $fileHandle;

open my $fileHandle, ">>", "$existingdir/root2/filetocreate.txt" or die "Can't open\n";
print $fileHandle "FooBar!\n";
close $fileHandle;

open my $fileHandle, ">>", "$existingdir/root3/filetocreate.txt" or die "Can't open\n";
print $fileHandle "FooBar!\n";
close $fileHandle;

do_rmtree();

Under Strawberry 5.38 I get:

rmdir TEST/root1
rmdir TEST/root2
cannot remove directory for TEST/root1: Directory not empty at rmtree2.pl line 18.
cannot remove directory for TEST/root2: Directory not empty at rmtree2.pl line 18.
rmdir TEST/root3
cannot remove directory for TEST/root3: Directory not empty at rmtree2.pl line 18.

Under Strawberry 5.32.1 I get:

unlink TEST\root1\filetocreate.txt
unlink TEST\root2\filetocreate.txt
unlink TEST\root3\filetocreate.txt
rmdir TEST/root1
rmdir TEST/root3
rmdir TEST/root2

I get the same output as under 5.32.1 if I run the script non-parallel under 5.38.
It seems that the recursion inside the _rmtree function, which unlinks the files inside the folders, doesn't work under 5.38 when using Parallel::ForkManager. But I cannot see any difference in the File::Path module.

Maybe chdir isn't thread-safe in 5.38 any more?

@sskopnik sskopnik changed the title from "Version 5.38: Parallel::ForkManager = Error rmtree" to "Version 5.38: Parallel::ForkManager = rmtree Error" on Oct 26, 2023
@shawnlaffan
Contributor

I wonder if this occurs with any recent perl on Windows. @sisyphus - can you reproduce the issue using your various recent perls?

@sisyphus

@sisyphus - can you reproduce the issue using your various recent perls?

Yes - now that @sskopnik has provided the (excellent) simplified demo, I can see it happening on my own builds of perl-5.38.0 and perl-5.39.3. (Those are the only 2 perls I've tested.)

I'm thinking it might simply be that the unlinking of the files is now not happening when rmtree() is called.
I think rmdir() then behaves as it ought in refusing to remove a non-empty directory.

I'll take a look at that later tonight ... if not beaten to it ;-)

Cheers,
Rob

@sskopnik
Author

sskopnik commented Oct 26, 2023

Thank you! Any support appreciated!!!

I investigated a bit more. I think the problem is the following line in File::Path (sub _rmtree, line 392)

my ( $ldev, $lino, $perm ) = ( lstat $root )[ 0, 1, 2 ]
          or next ROOT_DIR;

When _rmtree recurses to unlink the files inside the directories, the lstat call fails in the parallel environment, so the "next ROOT_DIR" branch is taken. The files therefore never get deleted and the later rmdir fails.

Maybe this helps for further research...
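To illustrate why a failing lstat silently skips an entry in that construct, here is a minimal standalone sketch. It is not File::Path's code. The sub name paths_with_lstat and the PATH label are made up for illustration. A failed lstat returns an empty list, the slice of an empty list is empty, and a list assignment in scalar context evaluates to the number of right-hand elements, which is 0 here.

```perl
use strict;
use warnings;

# Hypothetical helper: return the paths for which lstat succeeded,
# skipping the others with the same "... or next" pattern quoted above.
sub paths_with_lstat {
    my @paths = @_;
    my @ok;
    PATH: for my $path (@paths) {
        # On failure the right-hand side is an empty list, the assignment
        # evaluates to 0 in scalar context, and "next PATH" is taken.
        my ($dev, $ino, $mode) = (lstat $path)[0, 1, 2]
            or next PATH;
        push @ok, $path;
    }
    return @ok;
}

# Usage: entries that cannot be lstat'ed are dropped without any message.
print "$_\n" for paths_with_lstat(@ARGV);
```

No warning or error is produced for a skipped entry, which would be consistent with the files being left behind and only the later rmdir complaining.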

@sisyphus

the lstat call fails in the parallel environment

Are you certain that it happens only in the "parallel environment" ?

Cheers,
Rob

@sskopnik
Author

Yes! "parallel environment" means:

my $pm = Parallel::ForkManager->new(3);

Non-parallel means:

my $pm = Parallel::ForkManager->new(0);

Both are running fine under 5.32, as said

@sisyphus

sisyphus commented Oct 26, 2023

A couple of changes between Strawberry-5.32.1 and Strawberry-5.38.0:

  1. $Config{d_lstat} changes from 'undef' to 'define'
  2. $Config{d_symlink} changes from 'undef' to 'define'.

I think that might be a good part of the problem.
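For anyone wanting to compare their own builds, a small sketch using the core Config module prints these two settings. The sub name lstat_config is made up for illustration.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: report how this perl was configured for lstat and
# symlink support. A key missing from %Config is shown as 'undef'.
sub lstat_config {
    return map { ($_ => ($Config{$_} // 'undef')) } qw(d_lstat d_symlink);
}

my %cfg = lstat_config();
printf "%-10s %s\n", $_, $cfg{$_} for sort keys %cfg;
```

Running this under both perls should show whether a given build differs in the same way as the two Strawberry releases listed above.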

@sskopnik, in 5.38.0 I changed the 'lstat' call in File::Path (line 392) to 'stat', thinking that might fix the issue. After all, when $Config{d_lstat} is 'undef' (as in 5.32.1), 'lstat' reverts to 'stat'.
But when I then ran your demo script in 5.38.0, I got:

unlink TEST\root2\filetocreate.txt
unlink TEST\root1\filetocreate.txt
cannot unlink file for TEST\root1\filetocreate.txt: No such file or directory at try.pl line 13.
cannot unlink file for TEST\root2\filetocreate.txt: No such file or directory at try.pl line 13.
rmdir TEST/root1
rmdir TEST/root2
cannot remove directory for TEST/root1: Directory not empty at try.pl line 13.
unlink TEST\root3\filetocreate.txt
cannot remove directory for TEST/root2: Directory not empty at try.pl line 13.
cannot unlink file for TEST\root3\filetocreate.txt: No such file or directory at try.pl line 13.
rmdir TEST/root3
cannot remove directory for TEST/root3: Directory not empty at try.pl line 13.

At least it tried to unlink the files before doing the rmdir(), but the fact that it claims those files don't exist is rather baffling.

Cheers,
Rob

@sskopnik
Author

After changing lstat to stat in File::Path I can reproduce that behavior.
And I'm really running out of ideas...

@shawnlaffan
Contributor

#149 is possibly related, albeit perhaps tangentially.

There have been changes to how stat is implemented on Windows since 5.32.

@sisyphus

This comment was marked as outdated.

@sskopnik
Author

I just ran my test script against the new version, Perl 5.38.2.2.
As expected, nothing changed: Parallel::ForkManager still seems broken in 5.38.
Unfortunately, this is a showstopper for me: parallel execution speeds up my scripts by a factor of X.
Can't live without it.

@shawnlaffan
Contributor

If the issue is ultimately with File::Path then it would be useful to also report it there.

FWIW, there are two open tickets that might (or might not) be related:
https://rt.cpan.org/Public/Bug/Display.html?id=142055 (does not delete a dir containing a symlink)
https://rt.cpan.org/Public/Bug/Display.html?id=139377 (dir permissions)

@shawnlaffan
Contributor

Looping back to this after some time, it is the lstat call at L392 that is not functioning correctly.

I added this code to File::Path immediately before L392 and reran the code from #146 (comment)

my ( $ldev, $lino, $perm ) = ( lstat $root )[ 0, 1, 2 ];
say STDERR "Checking $root";
say STDERR "  lstat result: $ldev, $lino, $perm";
say STDERR '  Exists: ' . -e $root;

On 5.32 I get this result:

Checking TEST/root1
  lstat result: 2, 0, 16895
  Exists: 1
Checking filetocreate.txt
  lstat result: 2, 0, 33206
  Exists: 1
unlink TEST\root1\filetocreate.txt
rmdir TEST/root1
Checking TEST/root2
  lstat result: 2, 0, 16895
  Exists: 1
Checking filetocreate.txt
  lstat result: 2, 0, 33206
  Exists: 1
unlink TEST\root2\filetocreate.txt
rmdir TEST/root2
Checking TEST/root3
  lstat result: 2, 0, 16895
  Exists: 1
Checking filetocreate.txt
  lstat result: 2, 0, 33206
  Exists: 1
unlink TEST\root3\filetocreate.txt
rmdir TEST/root3
Checking C:\Users\user\AppData\Local\Temp\tiZnHq4tYa
  lstat result: 2, 0, 16895
  Exists: 1

On 5.38.2.2 I get:

Checking TEST/root1
  lstat result: 144443394, 607422999741609049, 16895
  Exists: 1
Checking filetocreate.txt
  lstat result: , ,
  Exists: 1
rmdir TEST/root1
cannot remove directory for TEST/root1: Directory not empty at issue146.pl line 18.
Checking TEST/root2
  lstat result: 144443394, 105271641289805798, 16895
  Exists: 1
Checking filetocreate.txt
  lstat result: , ,
  Exists: 1
rmdir TEST/root2
cannot remove directory for TEST/root2: Directory not empty at issue146.pl line 18.
Checking TEST/root3
  lstat result: 144443394, 61361544922955611, 16895
  Exists: 1
Checking filetocreate.txt
  lstat result: , ,
  Exists: 1
rmdir TEST/root3
cannot remove directory for TEST/root3: Directory not empty at issue146.pl line 18.
Checking C:\Users\user\AppData\Local\Temp\bj2ptTRgCt
  lstat result: 144443394, 74309393851667969, 16895
  Exists: 1

@shawnlaffan
Contributor

lstat was changed for Windows in this PR: Perl/perl5#18306

@sskopnik
Author

sskopnik commented Jul 26, 2024 via email

@shawnlaffan
Contributor

I updated the code to check the error vars. It seems the system is not finding the files when using lstat under Parallel::ForkManager, and perhaps threads more generally.

As before, add the code below to File::Path immediately before L392 and rerun the code from #146 (comment)

It would be useful if anyone knows how to reduce the test case to not need Parallel::ForkManager. If not then I'll report it upstream as-is.

my ( $ldev, $lino, $perm ) = ( lstat $root )[ 0, 1, 2 ];
my @errs = ($@, $!, $^E, $?);
my %err_hash;
    @err_hash{qw /$@ $! $^E $?/} = map {$_ // ''} @errs;
say STDERR "Checking $root";
say STDERR "  lstat result: $ldev, $lino, $perm";
say STDERR '  Exists: ' . -e $root;
say STDERR "  Errors: \n" . join "\n", map {"    $_: $err_hash{$_}"} sort keys %err_hash;
my ( $ldev, $lino, $perm ) = ( stat $root )[ 0, 1, 2 ];
say STDERR "  stat result: $ldev, $lino, $perm";

Checking TEST/root1
  lstat result: 144443394, 24488322974021684, 16895
  Exists: 1
  Errors:
    $!:
    $?: 0
    $@:
    $^E:
  stat result: 144443394, 24488322974021684, 16895
Checking filetocreate.txt
  lstat result: , ,
  Exists: 1
  Errors:
    $!: No such file or directory
    $?: 0
    $@:
    $^E: The system cannot find the file specified
  stat result: 144443394, 18858823439808611, 33206
rmdir TEST/root1
cannot remove directory for TEST/root1: Directory not empty at issue146.pl line 18.
Checking TEST/root2
  lstat result: 144443394, 18858823439808570, 16895
  Exists: 1
  Errors:
    $!:
    $?: -1
    $@:
    $^E:
  stat result: 144443394, 18858823439808570, 16895
Checking filetocreate.txt
  lstat result: , ,
  Exists: 1
  Errors:
    $!: No such file or directory
    $?: -1
    $@:
    $^E: The system cannot find the file specified
  stat result: 144443394, 9288674231646314, 33206
rmdir TEST/root2
cannot remove directory for TEST/root2: Directory not empty at issue146.pl line 18.
Checking TEST/root3
  lstat result: 144443394, 5348024557697122, 16895
  Exists: 1
  Errors:
    $!:
    $?: -1
    $@:
    $^E:
  stat result: 144443394, 5348024557697122, 16895
Checking filetocreate.txt
  lstat result: , ,
  Exists: 1
  Errors:
    $!: No such file or directory
    $?: -1
    $@:
    $^E: The system cannot find the file specified
  stat result: 144443394, 12384898975463545, 33206
rmdir TEST/root3
cannot remove directory for TEST/root3: Directory not empty at issue146.pl line 18.
Checking C:\Users\user\AppData\Local\Temp\MU8rNyP1Il
  lstat result: 144443394, 10696049115470891, 16895
  Exists: 1
  Errors:
    $!: No such file or directory
    $?: 0
    $@:
    $^E:
  stat result: 144443394, 10696049115470891, 16895

@shawnlaffan
Contributor

shawnlaffan commented Aug 1, 2024

On further checking it is an issue with File::Path.

This is already documented at https://metacpan.org/pod/File::Path#MULTITHREADED-APPLICATIONS (see also RT94209)

I don't know why it does not affect perl 5.32 but assume it relates to the change in how lstat is implemented.

File::Path::Tiny should work instead (based on its documentation).

Edit: And this does look like the same issue as jmcnamara/excel-writer-xlsx#182
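As I understand the linked File::Path section, the limitation comes from its internal use of chdir, and fork on Windows is emulated with threads. Below is a minimal sketch of a removal routine that never changes directory and only works with full paths. It is an illustration under those assumptions, not File::Path::Tiny's implementation. The name remove_tree_no_chdir is made up, and it has not been tested on the Windows setup from this issue.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: recursively delete $path without calling chdir.
# Returns true on success, false as soon as anything cannot be removed.
sub remove_tree_no_chdir {
    my ($path) = @_;

    # Recurse only into real directories, never through symlinks.
    if (-d $path && !-l $path) {
        opendir my $dh, $path or return 0;
        my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
        closedir $dh;

        for my $entry (@entries) {
            remove_tree_no_chdir(File::Spec->catfile($path, $entry))
                or return 0;
        }
        return rmdir $path;
    }

    # Plain files and symlinks. Directory symlinks on Windows may need
    # rmdir instead, which this sketch does not handle.
    return unlink $path;
}

# Usage: perl remove_tree.pl TEST/root1 TEST/root2 TEST/root3
for my $dir (@ARGV) {
    remove_tree_no_chdir($dir) or warn "could not remove $dir: $!\n";
}
```

In the reproducer, a routine like this could stand in for the rmtree call inside each child. It does not help with the temp dirs that File::Temp cleans up itself, since those still go through File::Path.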

@shawnlaffan
Contributor

Some further notes.

Excel::Writer::XLSX does not directly use File::Path. It is pulled in via File::Temp.

Both File::Temp and File::Path are core modules. File::Path::Tiny is not.
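The core status can be checked with Module::CoreList, itself a core module. This is a small sketch. The sub name core_status is made up for illustration.

```perl
use strict;
use warnings;
use Module::CoreList;

# Hypothetical helper: say whether a module has ever shipped with perl.
sub core_status {
    my ($module) = @_;
    my $first = Module::CoreList->first_release($module);
    return defined $first ? "core since perl $first" : 'not in core';
}

print "$_: ", core_status($_), "\n"
    for qw(File::Temp File::Path File::Path::Tiny);
```

This matters for the workaround: switching to File::Path::Tiny adds a CPAN dependency, whereas File::Temp and File::Path are always available.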

There is an open PR for the base cause of this issue at jmcnamara/excel-writer-xlsx#261

@shawnlaffan
Contributor

Cross-ref: jmcnamara/excel-writer-xlsx#303

@shawnlaffan
Contributor

I'll close this issue now, given that it is not specific to Strawberry Perl (SP) and there is an upstream PR in place.

@shawnlaffan
Contributor

I should add, thanks to @sskopnik for the report and reproducer.
