Add zfs_refcount command. #270

PaulZ-98 · 2021-03-17T19:34:28Z

Pull request checklist

Please check if your PR fulfills the following requirements:

Tests for the changes have been added (for bug fixes / features)
Docs have been reviewed and added / updated if needed (for bug fixes / features)
Build was run locally and any changes were pushed
Lint has passed locally and any fixes were made for failures

Pull request type

Please check the type of change your PR introduces:

Issue Number: #192

What is the new behavior?

sdb> spa | member spa_refcount | zfs_refcount
0xffffffffc1089360   dsl_sync_task_common 
0xffff91223d23a000    
0xffff911f9659f000    
0xffff91223f870800    
0xffff91223f870800    
0xffff911f9659c000    
0xffff912249c9d800    
0xffff912249c9d800    
0xffff911f9659c000    
0xffff91223caa6800    
0xffff91223caa6800    
0xffff911f9659c000    
0xffff91223caa4800

Does this introduce a breaking change?

Yes
No

Other information

Requires a crash dump with reference tracking enabled, so some of the testing infrastructure was refactored.

sdimitro

Hey @PaulZ-98, thank you for working on this!

There are a lot of things going on here, so let's break this into multiple commits/PRs.

sdimitro · 2021-03-18T15:05:14Z

.github/scripts/install-drgn.sh

@@ -10,5 +10,6 @@ sudo apt install bison flex libelf-dev libdw-dev libomp5 libomp-dev
 git clone https://github.com/osandov/drgn.git

 cd drgn
+git checkout -b drgn_february 0a6aaaae5d31ed142448f8220e208d65478ec80d


Can we remove this line? If there is a problem with the github actions we fix the actions themselves. If there is a problem with drgn we fix drgn.

sdimitro · 2021-03-18T15:16:49Z

.github/scripts/download-dump-from-s3.sh

+# use the default dir unless alternate dir passed as arg2 to script
 DATA_DIR="tests/integration/data"
+if [ ! -z "$2" ]; then
+	DATA_DIR=$2
+fi


I think that instead of offering an option for which dump to use, we should just go ahead and test both dumps, no option needed. This implies several changes, so here is a preliminary checklist of the things we need to do:
[1] This script should accept and download more than one dump, then place each one into its own directory under DATA_DIR (the comment above will also need to be changed along with the code)
[2] The test infrastructure should be made aware that multiple dumps can exist, and should always run every command we have against both crash dumps
[3] The saved regression output should be moved into crash-dump-specific regression output directories
[4] The github actions will need to be adjusted to download both crash dumps
[5] The steps for running the tests manually will need to be updated on the wiki to include both dumps (https://github.com/delphix/sdb/wiki/Integration-Tests)

We should do the above over multiple pull requests:
[1st PR] The first pull request should make the changes to this download script to accept multiple dumps, and the respective changes to the test infrastructure code and regression output
[2nd PR] The second pull request will be adding the second crash dump and its own regression output
[3rd PR] The last pull request will be adding the new zfs_refcount command

Another cool thing would be for each new crash dump to include a README file in its compressed archive explaining why we added it (e.g. old crash dump X did not have refcounts enabled and we want to test that).
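Items [2] and [3] of the checklist above amount to running every command against every dump and comparing each result with a per-dump regression file. A minimal sketch of that pairing follows. The names `DumpCase` and `build_test_matrix` are hypothetical, and the assumption that each expected-output file is named after its command is for illustration only, not taken from the sdb test code.

```python
from itertools import product
from pathlib import Path
from typing import Iterable, List, NamedTuple


class DumpCase(NamedTuple):
    """One (crash dump, command) pair plus where its saved output would live."""
    dump_dir: Path
    command: str
    expected_output: Path


def build_test_matrix(dump_dirs: Iterable[Path],
                      commands: Iterable[str]) -> List[DumpCase]:
    """Pair every command with every dump directory.

    Assumption: regression output is stored per dump, in a
    "regression_output" subdirectory, one file per command.
    """
    return [
        DumpCase(dump_dir, command, dump_dir / "regression_output" / command)
        for dump_dir, command in product(dump_dirs, commands)
    ]
```

A test runner could iterate over the returned list, so adding a third dump later would not require touching the individual tests.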

sdimitro · 2021-03-18T15:22:06Z

tests/integration/infra.py

+PRIMARY = "primary"
+ALTERNATE = "alternate"


Instead of PRIMARY and ALTERNATE, let's go ahead and have a list of crash dumps automatically detected from the target DATA_DIR. For example, with the two crash dumps that we have, this would currently look like:

DATA_DIR/dump-<date 1>/{dump, mods, vmlinux, regression_output}
DATA_DIR/dump-<date 2>/{dump, mods, vmlinux, regression_output}

Keeping this structure as is makes it easy to add new crash dumps in the future.
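The automatic detection suggested here could look roughly like the sketch below. It assumes the `dump-<date>` directory layout described above. The function name `discover_dumps` and the `REQUIRED_ENTRIES` tuple are hypothetical and not part of the existing infra.py.

```python
from pathlib import Path
from typing import List

# Entries each dump directory is assumed to contain, per the layout above.
REQUIRED_ENTRIES = ("dump", "mods", "vmlinux")


def discover_dumps(data_dir: str) -> List[Path]:
    """Return every complete dump-* directory under data_dir, sorted by name."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        candidate for candidate in root.glob("dump-*")
        if candidate.is_dir()
        and all((candidate / name).exists() for name in REQUIRED_ENTRIES)
    )
```

Skipping incomplete directories means a partially downloaded dump is ignored rather than failing later in a less obvious way. Whether that is preferable to raising an error is a design choice left open here.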

sdimitro · 2021-03-18T15:24:06Z

tests/integration/infra.py



+@staticmethod


Static methods make sense in a class context. You can move this method into the Infra class.
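As a small illustration of the point, here is a sketch of a helper that needs no instance state and is declared as a static method inside a class. The `Infra` class below is a hypothetical stand-in, and `dump_name` is an invented helper, not the method under review.

```python
import os


class Infra:
    """Hypothetical stand-in for the test-infrastructure class."""

    def __init__(self, data_dir: str) -> None:
        self.data_dir = data_dir

    @staticmethod
    def dump_name(dump_path: str) -> str:
        """Uses no instance state, so it is a static method on the class."""
        return os.path.basename(os.path.normpath(dump_path))

    def describe(self, dump_path: str) -> str:
        # A static method can be called through the instance or the class.
        return f"{self.dump_name(dump_path)} in {self.data_dir}"
```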

ahrens · 2021-03-18T19:46:16Z

sdb/commands/zfs/zfs_refcount.py

+    @staticmethod
+    def print_ref(obj: drgn.Object):
+        ptr = int(obj.ref_holder)
+        c = sdb.create_object("char *", ptr)
+        s = c.string_().decode("utf-8")
+        print(f"{hex(ptr)}   {s} ")


This might be overkill, but it seems like this could be a PrettyPrinter for reference_t. Which could also be a Locator (finding reference_t's given a zfs_refcount_t).

ahrens · 2021-03-18T19:53:56Z

sdb/commands/zfs/zfs_refcount.py

+from sdb.commands.spl.spl_list import SPLList
+
+
+class Zfs_Refcount(sdb.Locator, sdb.PrettyPrinter):


Can you remind me what the benefit is of declaring this as a Locator, despite not having any locator-specific methods (i.e. no_input() or @sdb.InputHandler-annotated methods)?

I changed it to just a pretty-printer.

class Zfs_Refcount(sdb.PrettyPrinter):

ahrens · 2021-03-18T19:55:45Z

sdb/commands/zfs/zfs_refcount.py

+        c = sdb.create_object("char *", ptr)
+        s = c.string_().decode("utf-8")


From the example output, it looks like if the pointer is not a string, this results in an empty string? That seems fine, as long as it doesn't raise an exception.

Hi @ahrens, I played with this a bit and did finally get an exception raised by the decode method. I added a try/except for it.
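The actual change is not shown in this thread. A minimal sketch of that kind of guard, assuming the holder bytes have already been read from the dump, might look like the following. `decode_holder` is a hypothetical helper name.

```python
def decode_holder(raw: bytes) -> str:
    """Decode bytes read through a holder pointer.

    Returns an empty string when the bytes are not valid UTF-8, so a
    holder that is not a string prints as blank instead of raising.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return ""
```

Returning an empty string matches the example output above, where most holders show only an address.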

codecov-commenter · 2021-04-16T17:59:49Z

Codecov Report

Merging #270 (b1de684) into master (d0cb398) will decrease coverage by 0.21%.
The diff coverage is 67.85%.

@@            Coverage Diff             @@
##           master     #270      +/-   ##
==========================================
- Coverage   87.69%   87.48%   -0.22%     
==========================================
  Files          63       64       +1     
  Lines        2577     2605      +28     
==========================================
+ Hits         2260     2279      +19     
- Misses        317      326       +9

Impacted Files	Coverage Δ
sdb/commands/zfs/zfs_refcount.py	`67.85% <67.85%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

PaulZ-98 · 2021-04-16T18:00:53Z

I made a version with just the zfs_refcount command and standard test.

PaulZ-98 force-pushed the sdb_refcount branch from b7eb85f to 4edde9a, March 17, 2021 19:52

sdimitro suggested changes Mar 18, 2021

ahrens reviewed Mar 18, 2021

PaulZ-98 force-pushed the sdb_refcount branch from 4edde9a to b1de684, April 16, 2021 17:56

Add zfs_refcount command

c01fd44

PaulZ-98 force-pushed the sdb_refcount branch from b1de684 to c01fd44, August 1, 2022 20:42

dlpx-tfc-github-manager bot deleted the branch delphix:master January 5, 2023 22:05
