-
Notifications
You must be signed in to change notification settings - Fork 405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 does not work with nbdiff
(from https://github.com/jupyter/nbdime)
#1256
Comments
Hi @infokiller ! Hm, your docker repro script is awesome but on first try I got an error (below). However -- I don't think you actually need to go to those lengths -- you can just provide the output from In other words: delta's contract is that it renders unified diff, and git output. So the possibilities are:
In cases (1) and (3) we just need the raw text input sent to delta, and a screenshot of delta's output.
|
I think your archlinux image is stale so you need to run |
Back at my desktop, here's the exact output of
|
Thanks! Right, that looks similar to unified diff, but it's missing the hunk header (the line that starts with https://www.gnu.org/software/diffutils/manual/html_node/Example-Unified.html Would you be able to look into why that is? I'm looking around briefly and I see https://nbdime.readthedocs.io/en/latest/cli.html#nbdiff However, that example looks like it should include the |
A separate question is whether delta should have output the lines between the cc @WayneD who is the original creator of unified diff format (and a delta contributor) in case he has any thoughts on this issue and the way that nbdiff are using the format. |
Thanks a lot @dandavison for looking into this! I think it's reasonable to pass through anything that isn't recognized, or alternatively show some warning so that it doesn't look like the diff is empty. I guess it's best to add an option (for example I will check with the nbdiff developers, but from a quick look at the code it seems there is some support for unified diff, but it may not be the default: https://github.com/jupyter/nbdime/blob/master/nbdime/prettyprint.py#L303 |
Right, I noticed that code also. But that code includes |
Well, putting lines after the Looking at the example, it appears that the notebook is a container of multiple files and they decided to use nbdiff c.ipynb b.ipynb
diff c.ipynb/cells/9/outputs/0/data/text/plain b.ipynb/cells/9/outputs/0/data/text/plain
modified
--- c.ipynb/cells/9/outputs/0/data/text/plain
+++ b.ipynb/cells/9/outputs/0/data/text/plain
@@ -1 +1 @@
- <matplotlib.figure.Figure at 0x10ea05940>
+ <matplotlib.figure.Figure at 0x10eb21860>
diff etc
--- c.ipynb/cells/etc
+++ b.ipynb/cells/etc
@@ etc @@ However, not being familiar with what they're trying to convey, that may not be a good match for their use case. If that is true, then I'd think that this should be considered to be a different diff format that delta could consider supporting, which would require it to have special code for understanding |
Thanks @WayneD your response is very useful! For context, a Jupyter Notebook is essentially a list of "cells", each containing code or markdown/text. Code cells can also have outputs, which can be simple textual outputs or things like figures/images. The problem nbdiff solves is that looking at a "raw" diff of notebooks is very unreadable because of the json representation and stuff like cell outputs, which can get very large (for example encoded images), and hide the code changes which are much smaller and often more interesting. |
I think I figured out the discrepancy- nbdiff has different code paths for diff chunks that are multiline (which are properly formatted as in the website example) and single lines ones: https://github.com/jupyter/nbdime/blob/master/nbdime/prettyprint.py#L830-L846 If you modify the reproduce shell command in the first comment to create a multiline diff chunk, it works as expected: docker pull archlinux
img_digest="$(docker image inspect --format='{{index .RepoDigests 0}}' archlinux:latest)"
echo "Digest: ${img_digest}"
build_oci_img () {
docker build --build-arg "img_digest=${img_digest}" "$@" - <<'EOF'
ARG img_digest
FROM archlinux:${img_digest}
RUN pacman -Sy > /dev/null
RUN pacman -S --noconfirm --needed coreutils curl gcc bc git python-pip > /dev/null
ENV CARGO_HOME=/usr/local
RUN curl -fsSL 'https://sh.rustup.rs' | sh -s -- -y --profile minimal
RUN /usr/local/bin/cargo install git-delta
RUN pip install jupytext ipython_genutils nbdime
WORKDIR /repo
EOF
}
build_oci_img
docker run --rm -i "$(build_oci_img -q)" bash <<'EOF'
main() {
git config --global user.email [email protected] && git config --global user.name test
git -c init.defaultBranch=main init
printf '%s\n' 'print(0)' 'print(1)' > file.py
jupytext --to ipynb file.py
cat file.py
cat file.ipynb
git add file.ipynb
git commit -m 'initial commit'
printf '%s\n' 'print(0)' 'print(2)' > file.py
jupytext --to ipynb file.py
echo 'Running git diff:'
git diff
echo
echo 'Running nbdiff:'
nbdiff
echo
echo 'Piping nbdiff to delta:'
nbdiff | delta
}
main
EOF Output:
|
@dandavison should I open a separate issue with a feature request for passing through unknown diff formats? |
I submitted a PR to improve the compatibility of nbdev: jupyter/nbdime#647 |
Great! Can you post the input to delta from nbdiff on your PR branch and a screenshot of what delta does ?
Hey, I'm sorry -- the docker script works now with the Can you just post the raw input to delta, and a screenshot of delta's output, so that we can immediately discuss whether delta is doing the right thing on its input? Here I think the raw input was
whereas previously it was
I don't understand where we are exactly with those inputs -- neither contains the |
Also I don't get any output from delta on that input whereas the docker script produced a line like
|
My vote is to solve it all in this ticket. |
The default output of nbdiff changes to:
(yes, there's a trailing newline there, not sure why but that's unrelated to my PR) And here's how delta renders it:
My bad, I copied the wrong command (edited the original comment to fix it). The raw input to delta for the single line case is (before my PR):
delta: For the multiline change case:
delta: |
@infokiller What's the current status of this problem? Are you any wiser? |
@sanmai-NL IIRC the nbdime PR resolves it, but it wasn't merged yet |
It might be helpful to open an issue in nbdime explaining the problem. I.e. asking them for an option to output valid unified diff format. The PR doesn't contain any explanation or tests etc, so I can understand that it got overlooked. I think we can close this issue though; I don't think nbdime is prominent enough to warrant adding special support for its format to delta's parser unfortunately. |
Hey again!
So I'm not really sure if it's a bug in
nbdiff
(https://github.com/jupyter/nbdime) or indelta
, but they don't seem to work together. Specifically, the diffs generated bynbdiff
are missing from the output.Here's my reproduction script:
The output I'm getting:
The text was updated successfully, but these errors were encountered: