Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editable install: do not show any output in verbose mode when there is no work to do #579

Conversation

lesteve
Copy link
Contributor

@lesteve lesteve commented Feb 22, 2024

Why the current behaviour is problematic

The "ninja: no work to do" output can be confusing as noted for example by @ogrisel in scikit-learn/scikit-learn#28040 (comment). @eli-schwartz mentioned that it could potentially be changed in meson-python scikit-learn/scikit-learn#28040 (comment).

Personally, when working on scikit-learn code, I find it reassuring to see some output if the sklearn import takes a while and get the feed-back that some compilation is happening. In the case when there is nothing to do I would rather have nothing printed. One thing I realised recently is that the "no work to do" output can be printed multiple times if you use subprocesses (makes complete sense since sklearn is imported in each subprocess). This can happen easily in scikit-learn for example with this snippet.

# /tmp/test.py
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

model = LogisticRegression()
X, y = make_classification()

print(cross_val_score(model, X, y, n_jobs=4))

Output is:

❯ python /tmp/test.py
+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
ninja: no work to do.
+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
ninja: no work to do.
ninja: no work to do.
ninja: no work to do.
ninja: no work to do.
[1.   0.85 0.9  0.95 0.95]

Current implementation

The simplest thing I could find was to first execute the ninja command in dry-run mode and to check if "no work to do" is in the captured stdout. This may be a bit brittle in case the ninja output changes. Maybe there is a better way to check with ninja that there is no work to do?

There does not seem to be any test right now for the editable verbose behavior, but I can try to add some if you think this is worth it.

Alternative implementation

If you think the overhead of having a ninja dry-run command is too much overhead, one possible alternative implementation would be to capture stdout with p = subprocess.Popen(..., stdout=subprocess.PIPE) and look at the first two lines of output. In my experience, this kind of manipulation are always more complicated than you would hope, but maybe I have been doing it wrong all this time ...

The kind of complications I have in mind:

  • p.stdout.read is blocking so in the case that there is no output for some reason, it will wait for ever. Maybe we know there will always be some output with the ninja command. You can also put you can use fcntl to put stdout in non-blocking mode.
  • displaying the captured stdout live (i.e. without waiting for the command to finish) is also always more complicated than you would hope:
    • decide whether to read line by line, a few lines, a number of bytes
    • decide whether to sleep between two reads (I have seen some cases where not sleeping had a high CPU usage)
    • you also need to remember to redisplay all the remaining stdout once the command finishes otherwise you miss some stdout.

@lesteve lesteve force-pushed the editable-verbose-no-output-when-no-work-to-do branch from 85fb89b to 6bc67f6 Compare February 22, 2024 09:26
@ogrisel
Copy link

ogrisel commented Feb 22, 2024

I think the lines:

+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
ninja: no work to do.

would be ok, if they were prefixed by some user friendlier message:

meson-python: importing sklearn installed in editable mode
+ /home/lesteve/micromamba/envs/scikit-learn-dev/bin/ninja
ninja: no work to do.

But I agree that the repeated message is a bit annoying when using parallelism that spawn Python workers that import sklearn concurrently.

@ogrisel
Copy link

ogrisel commented Feb 22, 2024

But I am also fine with what @lesteve suggests in this PR because it fixes both problems at once.

mesonpy/_editable.py Outdated Show resolved Hide resolved
@rgommers
Copy link
Contributor

Thanks! I agree that this message can be confusing. I hadn't seen it printed multiple times yet, but that makes it worse. So looks like a good idea to suppress it.

@rgommers rgommers added the maintenance Regular code improvements that are not new features nor end-user-visible bugs label Feb 22, 2024
dry_run_build_cmd = self._build_cmd + ['-n']
p = subprocess.run(dry_run_build_cmd, cwd=self._build_path, env=env, capture_output=True)
if b'ninja: no work to do.' not in p.stdout and b'samu: nothing to do' not in p.stdout:
print('+ ' + ' '.join(self._build_cmd))
Copy link

@ogrisel ogrisel Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the build command not very informative because it does not reflect which editable install is being rebuilt, nor does it state that meson-python's editable install is in charge of making this build happen.

So maybe consider:

Suggested change
print('+ ' + ' '.join(self._build_cmd))
module_names = ", ".join(sorted(self._top_level_modules))
print(f"meson-python: building {module_names}")

Here is the resulting user experience:

$ touch sklearn/ensemble/_gradient_boosting.pyx
$ python -c "import sklearn"
meson-python: building sklearn
[3/3] Linking target sklearn/ensemble/_gradient_boosting.cpython-311-darwin.so
$ python -c "import sklearn"  # nothing to rebuild => no further output
$

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions:

  • do we want to keep the ninja build command for troubleshooting or are we fine not showing it?
  • do we want to keep the + at the beginning of the first line? I guess maybe this was a marker to make it stand out in the output?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have an opinion on those points as long as the first message mentions "meson-python" and the name of the package(s) being rebuilt prior to displaying build outputs.

stdout = None
# We do not want any output if there is no work to do. Code is adapted from:
# https://github.com/mesonbuild/meson/blob/a35d4d368a21f4b70afa3195da4d6292a649cb4c/mesonbuild/mtest.py#L1635-L1636
dry_run_build_cmd = self._build_cmd + ['-n']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not work on Windows, where the compilation command is meson compile, which does not accept a -n argument. This demonstrates that this feature needs tests.

tests/test_editable.py Outdated Show resolved Hide resolved
tests/test_editable.py Outdated Show resolved Hide resolved
mesonpy/_editable.py Outdated Show resolved Hide resolved
mesonpy/_editable.py Outdated Show resolved Hide resolved
@lesteve
Copy link
Contributor Author

lesteve commented Feb 23, 2024

It would be great if someone with the necessary rights could enable the Github actions workflow 🙏!

I can reproduce the Alpine failure in a Docker container and the bad news is that ninja -n seems to segfault (ninja alone works fine). The ninja version is 1.9 (released January 2019) and I have seen at least one test in meson-python that is skipped if the ninja version < 1.10, so maybe this could be an option?

My basic gdb skills only allowed me to get this information:

(gdb) r
Starting program: /usr/bin/ninja -C build/cp312 -n
warning: Error disabling address space randomization: Operation not permitted
ninja: entering directory 'build/cp312'

Program received signal SIGSEGV, Segmentation fault.
0x00007b65464a9458 in strlen (s=<optimized out>, 
    s@entry=0x8 <error: Cannot access memory at address 0x8>) at src/string/strlen.c:17
warning: 17	src/string/strlen.c: No such file or directory

Suggestions how to get more info more than welcome ...

@eli-schwartz
Copy link
Member

I can reproduce the Alpine failure in a Docker container and the bad news is that ninja -n seems to segfault (ninja alone works fine). The ninja version is 1.9 (released January 2019) and I have seen at least one test in meson-python that is skipped if the ninja version < 1.10, so maybe this could be an option?

This is:

michaelforney/samurai#66
michaelforney/samurai#81
michaelforney/samurai@fb61f22

Possibly something Alpine should backport...

@lesteve
Copy link
Contributor Author

lesteve commented Feb 23, 2024

I have skipped the test on Alpine for now based on detecting musllinux.

To know whether the test pass on Windows it would be nice that someone enable Github workflows. Full disclosure: locally the Windows tests don't pass and I don't really know why yet.

@rgommers
Copy link
Contributor

To know whether the test pass on Windows it would be nice that someone enable Github workflows.

Done. this can be a pain because the approve button has to be pressed after every push. If you want to submit a trivial PR to side-step this problem, I'm happy to merge that straight away (e.g, in the readme, change to use Meson to using Meson).

if self._build_cmd[0] == 'meson':
# On Windows meson compile is used so to do a dry run you need
# --ninja-args
dry_run_build_cmd = self._build_cmd + ['--ninja-args="-n"']
Copy link
Member

@dnicolodi dnicolodi Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The double quotes around -n are wrong, if the command is not interpreted by a shell (and it is not when run with subprocess.run()). Also, what happens when self._build_cmd already contains a --ninja-args argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch the quoting was the issue indeed, now the test passes locally on a Windows VM.

About multiple --ninja-args, you probably know a lot more than me about this, so any suggestions welcome!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know whether meson compile accepts multiple --ninja-args options. This is why I am asking. If I knew I would not have posed this as a question. You are the proponent of this feature, thus it is on you to verify and to find a solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested this and meson compile does accept multiple --ninja-args arguments where the last argument overrides all the others.

For example, meson compile --ninja-args="-n" --ninja-args="-h" does the same as meson compile --ninja-args="-h".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not acceptable to discard the user supplied argument and keep only the -n.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken, my current plan for this is to fine the last arguments in self._build_cmd that starts with --ninja-args and if there is one to append ,-n to it. This does seem a bit complicated though, let me know if you have a better suggestion!

In order to test this (at least locally), do you have an example in mind on Windows where ninja arguments would be already in the build command (i.e. before adding -n)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My assumption is this would happen when users specify -C... options to the build frontend.

Copy link
Member

@dnicolodi dnicolodi Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @eli-schwartz says, you can pass install with pip install -e . -Ccompile-args=-j1 for example.

As for the implementation, it is ok to check if the component of the command starts with --ninja-args, then you need to parse the value passed to the option and add -n to the list. You can take some inspiration for how to do it from the code in this test

# intercept and filter out test arguments and forward the call
if cmd[:2] == ['meson', 'compile']:
# when using meson compile instead of ninja directly, the
# arguments needs to be unmarshalled from the form used to
# pass them to the --ninja-args option
assert cmd[-1].startswith('--ninja-args=')
cmds.append(cmd[:2])
args.append(ast.literal_eval(cmd[-1].split('=')[1]))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the code to deal with existing and multiple --ninja-args in self._build_cmd but I haven't written a test for it yet.

mesonpy/_editable.py Outdated Show resolved Hide resolved
mesonpy/_editable.py Outdated Show resolved Hide resolved
tests/test_editable.py Outdated Show resolved Hide resolved
tests/test_editable.py Outdated Show resolved Hide resolved
@lesteve lesteve mentioned this pull request Feb 29, 2024
@lesteve
Copy link
Contributor Author

lesteve commented Feb 29, 2024

If you want to submit a trivial PR to side-step this problem, I'm happy to merge that straight away (e.g, in the readme, change to use Meson to using Meson).

@rgommers I have opened #585 with this change.

@rgommers
Copy link
Contributor

@rgommers I have opened #585 with this change.

Sounds good - merged now. Should fix the issue with CI running here.

@dnicolodi
Copy link
Member

This is:

michaelforney/samurai#66 michaelforney/samurai#81 michaelforney/samurai@fb61f22

Possibly something Alpine should backport...

@eli-schwartz Are you implying that Alpine uses samurai but calls it ninja? I wonder if we can pull in a fixed package in our CI image...

@dnicolodi
Copy link
Member

Yes, it seems that installing ninja on Alpine Linux puts the executable in a location outside $PATH and that the samuari package install a symlink that makes ninja actually be samurai. samurai seems to be installed with the build-base package, otherwise I don't understand why we get it in our CI images. Anyhow, installing ninja-is-really-ninja should make the ninja command actually be what is expected.

The bug @eli-schwartz mentions was fixed in samurai back in 2021, but samurai hasn't seen a release since 2020.

@lesteve
Copy link
Contributor Author

lesteve commented Mar 4, 2024

The CI failures for the development meson version are also main and do not seem to be related to this PR.

Edit: the development meson builds seem to have been fixed. I don't really know what the freebsd issue is about but it does not seem related to this PR either, see build log

Failed to start an instance: INVALID_ARGUMENT: Not Found 404 Not Found POST https://compute.googleapis.com:443/compute/v1/projects/cirrus-ci-community/zones/us-central1-c/instances
{ "error": { "code": 404, "message": "The resource 'projects/freebsd-org-cloud-dev/global/images/family/freebsd-13-2' was not found", "errors": [ { "message": "The resource 'projects/freebsd-org-cloud-dev/global/images/family/freebsd-13-2' was not found", "domain": "global", "reason": "notFound" } ] } } 

@@ -116,7 +116,7 @@ jobs:

- name: Run tests
run: >-
python -m pytest --showlocals -vv --cov --cov-report=xml
python -m pytest --showlocals -vv --cov --cov-report=xml -s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be reverted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep I was about to do it 😉, done in my last commit.

@dnicolodi
Copy link
Member

Please don't merge the main branch into your branch the next time: it makes rebasing the commits harder without reason.

@dnicolodi
Copy link
Member

I added the missing tests and fixed up the last issue and submitted as #594

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Regular code improvements that are not new features nor end-user-visible bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants