Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add zfs_refcount command. #270

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

PaulZ-98
Copy link
Contributor

@PaulZ-98 PaulZ-98 commented Mar 17, 2021

Pull request checklist

Please check if your PR fulfills the following requirements:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been reviewed and added / updated if needed (for bug fixes / features)
  • Build was run locally and any changes were pushed
  • Lint has passed locally and any fixes were made for failures

Pull request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Issue Number: #192

What is the new behavior?

sdb> spa | member spa_refcount | zfs_refcount
0xffffffffc1089360   dsl_sync_task_common 
0xffff91223d23a000    
0xffff911f9659f000    
0xffff91223f870800    
0xffff91223f870800    
0xffff911f9659c000    
0xffff912249c9d800    
0xffff912249c9d800    
0xffff911f9659c000    
0xffff91223caa6800    
0xffff91223caa6800    
0xffff911f9659c000    
0xffff91223caa4800    

Does this introduce a breaking change?

  • Yes
  • No

Other information

Requires a crash dump with reference tracking enabled, so some of the testing infrastructure was refactored.

Copy link
Contributor

@sdimitro sdimitro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @PaulZ-98, thank you for working on this!

There is a lot of things going on here, so let's break this into multiple commits/PRs.

@@ -10,5 +10,6 @@ sudo apt install bison flex libelf-dev libdw-dev libomp5 libomp-dev
git clone https://github.com/osandov/drgn.git

cd drgn
git checkout -b drgn_february 0a6aaaae5d31ed142448f8220e208d65478ec80d
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this line? If there is a problem with the github actions we fix the actions themselves. If there is a problem with drgn we fix drgn.

Comment on lines 19 to 23
# use the default dir unless alternate dir passed as arg2 to script
DATA_DIR="tests/integration/data"
if [ ! -z "$2" ]; then
DATA_DIR=$2
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of giving the option for which dump to use, we should just go ahead and test both dumps, no option needed. This means a lot of things, so here is a preliminary checklist of the things we need to do:
[1] This script can accept and download more than one dumps, then placed them into their own respective directories under the DATA_DIR (the comment above will also need to be changed along with the code)
[2] The test infrastructure should be made aware that multiple dumps can exist and always run all commands that we have for both crash dumps
[3] The saved regression output should be moved into crash-dump-specific regression output directories
[4] The github actions will need to be adjusted to download both crash dumps
[5] The wiki steps to run the tests manually will need to be updated on the wiki to include both dumps (https://github.com/delphix/sdb/wiki/Integration-Tests)

We should do the above over multiple pull requests:
[1st PR] The first pull request should make the changes to this download script to accept multiple dumps, and the respective changes to the test infrastructure code and regression output
[2nd PR] The second pull request will be adding the second crash dump and its own regression output
[3rd PR] The last pull request will be adding the new zfs_refcount command

Another cool thing would be for the new crash dumps themselves to keep a README file on their compressed archive highlighting why we added it (e.g. old crash dump X did not have refcounts enabled and we want to test that).

Comment on lines 35 to 36
PRIMARY = "primary"
ALTERNATE = "alternate"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of PRIMARY and ALTERNATE let's go ahead and have a list of crash dumps automatically detected by the target DATA_DIR. For example this would currently look like this with the two crash dumps that we have:

DATA_DIR/dump-<date 1>/{dump, mods, vmlinux, regression_output}
DATA_DIR/dump-<date 2>/{dump, mods, vmlinux, regression_output}

Keeping this structure as is makes it easy to add new crash dumps in the future.



@staticmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static methods make sense in class context context. You can move this method in the Infra class.

Comment on lines 30 to 38
@staticmethod
def print_ref(obj: drgn.Object):
ptr = int(obj.ref_holder)
c = sdb.create_object("char *", ptr)
s = c.string_().decode("utf-8")
print(f"{hex(ptr)} {s} ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be overkill, but it seems like this could be a PrettyPrinter for reference_t. Which could also be a Locator (finding reference_t's given a zfs_refcount_t).

from sdb.commands.spl.spl_list import SPLList


class Zfs_Refcount(sdb.Locator, sdb.PrettyPrinter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me what the benefit is of declaring this as a Locator, despite not having any locator-specific methods (i.e. no_input() or @sdb.InputHandler-annotated methods).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to just a pretty-printer.

class Zfs_Refcount(sdb.PrettyPrinter):

Comment on lines 33 to 34
c = sdb.create_object("char *", ptr)
s = c.string_().decode("utf-8")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the example output, it looks like if the pointer is not a string, this results in an empty string? That seems fine, as long as it doesn't raise an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ahrens I played with this a bit and did finally get an exception generated from the decode method. I added a try/catch for it.

@codecov-commenter
Copy link

codecov-commenter commented Apr 16, 2021

Codecov Report

Merging #270 (b1de684) into master (d0cb398) will decrease coverage by 0.21%.
The diff coverage is 67.85%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #270      +/-   ##
==========================================
- Coverage   87.69%   87.48%   -0.22%     
==========================================
  Files          63       64       +1     
  Lines        2577     2605      +28     
==========================================
+ Hits         2260     2279      +19     
- Misses        317      326       +9     
Impacted Files Coverage Δ
sdb/commands/zfs/zfs_refcount.py 67.85% <67.85%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d0cb398...b1de684. Read the comment docs.

@PaulZ-98
Copy link
Contributor Author

I made a version with just the zfs_refcount command and standard test.

@dlpx-tfc-github-manager dlpx-tfc-github-manager bot deleted the branch delphix:master January 5, 2023 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants