DAOS-14550 control: Prevent SPDK calls if disable_hugepages true in cfg #13605

Merged: 16 commits, Jan 31, 2024


@tanabarr tanabarr commented Jan 14, 2024

Avoid calling into SPDK if disable_hugepages is set in the server
config file. This change prevents SPDK access in the StorageScan,
StorageFormat, StorageNvmeRebind and StorageNvmeAddDevice server
ControlService handlers and when cleaning hugepages during start-up.

Also:

  • Improve fault information messages
  • Improve skip-related messaging during format
  • Cover missing config in storage command handler unit tests
  • Fix quirks in storage command display output
  • Improve unit test coverage of storage pretty printers

Features: control
Required-githooks: true
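
The guard described above could look roughly like the following Go sketch. It is not the code from this PR: `Config`, `ErrHugepagesDisabled`, `nvmeBackend`, `countingBackend` and `scanNvme` are hypothetical names invented for illustration, and the PCI address is made up. Only the `disable_hugepages` option and the idea of not reaching SPDK when it is set come from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical stand-in for the parsed server config file.
type Config struct {
	DisableHugepages bool // mirrors the disable_hugepages option named in this PR
}

// ErrHugepagesDisabled is a hypothetical sentinel error, not a real DAOS fault.
var ErrHugepagesDisabled = errors.New("hugepages disabled in config, NVMe operation not attempted")

// nvmeBackend is a hypothetical interface standing in for SPDK-backed code.
type nvmeBackend interface {
	Scan() ([]string, error)
}

// countingBackend records how often it is called so the guard can be checked.
type countingBackend struct{ calls int }

func (b *countingBackend) Scan() ([]string, error) {
	b.calls++
	return []string{"0000:81:00.0"}, nil // illustrative PCI address
}

// scanNvme checks the config before touching the backend, so the backend
// is never reached when the config is missing or hugepages are disabled.
func scanNvme(cfg *Config, be nvmeBackend) ([]string, error) {
	if cfg == nil {
		return nil, errors.New("nil config")
	}
	if cfg.DisableHugepages {
		return nil, ErrHugepagesDisabled
	}
	return be.Scan()
}

func main() {
	be := &countingBackend{}
	for _, disabled := range []bool{false, true} {
		devs, err := scanNvme(&Config{DisableHugepages: disabled}, be)
		fmt.Printf("disable_hugepages=%v devices=%v err=%v backend calls=%d\n",
			disabled, devs, err, be.calls)
	}
}
```

Putting the check in front of the backend call, rather than inside it, keeps the backend free of config knowledge and makes the "no SPDK call" property easy to assert in a unit test by counting calls.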


@tanabarr tanabarr requested review from a team as code owners January 14, 2024 21:47
@tanabarr tanabarr requested review from mjmac and kjacque and removed request for a team January 14, 2024 21:47
github-actions bot commented Jan 14, 2024

Bug-tracker data:
Ticket title is 'Remove all SPDK activity calls if hugepages disabled in config'
Status is 'In Review'
Labels: 'triaged'
https://daosio.atlassian.net/browse/DAOS-14550

@daosbuild1

Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/1/execution/node/1180/log

@tanabarr tanabarr requested a review from knard38 January 16, 2024 15:02
@daltonbohning daltonbohning removed the request for review from a team January 16, 2024 17:12
knard38 previously approved these changes Jan 17, 2024

@knard38 knard38 left a comment

LGTM

mjmac previously approved these changes Jan 17, 2024
@tanabarr tanabarr self-assigned this Jan 19, 2024
@tanabarr tanabarr added the control-plane (work on the management infrastructure of the DAOS Control Plane) label Jan 19, 2024
@daosbuild1

Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13605/4/testReport/

Features: control
Required-githooks: true
Signed-off-by: Tom Nabarro <[email protected]>
Base automatically changed from tanabarr/control-add-roles-to-scan-output to master January 23, 2024 10:48
@NiuYawei NiuYawei dismissed stale reviews from mjmac and knard38 January 23, 2024 10:48

The base branch was changed.

@tanabarr tanabarr force-pushed the tanabarr/control-hugepages-disable-no-spdk branch from c9145a6 to 86588da Compare January 23, 2024 11:51
@daosbuild1

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/7/execution/node/366/log

@daosbuild1

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/7/execution/node/306/log

@daosbuild1

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/7/execution/node/327/log

@daosbuild1

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/7/execution/node/299/log

@tanabarr tanabarr requested review from knard38 and mjmac January 23, 2024 12:02
@daosbuild1

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/7/execution/node/310/log

@tanabarr

@mjmac does the most recent commit address your second point sufficiently? I'm assuming you don't want me to address the first point in this PR, as it has already been reviewed, and I will address your third point in another PR (and create a ticket for it). If it looks okay, I will update the unit tests accordingly.

@daosbuild1

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/10/execution/node/289/log

@daosbuild1

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/10/execution/node/357/log

@daosbuild1

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/10/execution/node/283/log

@daosbuild1

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/10/execution/node/342/log

@daosbuild1

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13605/10/execution/node/376/log

@tanabarr tanabarr requested a review from mjmac January 26, 2024 13:56
@tanabarr

That sounds sensible and I have thought the same. As this is a reactive fix, are you okay if I do that in a separate PR? I have to balance this against other priorities for 2.6.

That's fine with me. Please create a ticket so that the work isn't forgotten, though.

https://daosio.atlassian.net/browse/DAOS-15086

@mjmac mjmac left a comment

This looks better to me. Conceptually, it's much easier to understand smaller functions that either succeed at doing the thing or fail to do the thing rather than maybe doing the thing.

Holding off on approval for now, as there have been a number of changes in related areas and it would be prudent to re-test after landing to ensure that logic changes don't result in test failures.
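
The point about smaller functions that either succeed or fail, rather than maybe doing the thing, can be illustrated with a short Go sketch. This is not the code under review: `formatter`, `formatNvme` and `handleFormat` are hypothetical names, the device addresses are made up, and the skip message wording is an assumption. The caller decides whether the operation applies and reports a skip explicitly; the small function always attempts the work.

```go
package main

import (
	"errors"
	"fmt"
)

// formatter is a hypothetical stand-in that records formatted devices.
type formatter struct{ formatted []string }

// formatNvme always attempts the format: it either succeeds or returns an
// error. It never silently decides to do nothing.
func (f *formatter) formatNvme(dev string) error {
	if dev == "" {
		return errors.New("empty device address")
	}
	f.formatted = append(f.formatted, dev)
	return nil
}

// handleFormat makes the skip decision once, in the caller, and says so in
// its result message instead of hiding the decision inside formatNvme.
func handleFormat(f *formatter, hugepagesDisabled bool, devs []string) (string, error) {
	if hugepagesDisabled {
		return "NVMe format skipped: hugepages disabled", nil
	}
	for _, d := range devs {
		if err := f.formatNvme(d); err != nil {
			return "", fmt.Errorf("format %q: %w", d, err)
		}
	}
	return fmt.Sprintf("formatted %d NVMe device(s)", len(devs)), nil
}

func main() {
	f := &formatter{}
	devs := []string{"0000:81:00.0", "0000:82:00.0"} // illustrative addresses
	for _, disabled := range []bool{true, false} {
		msg, err := handleFormat(f, disabled, devs)
		fmt.Println(msg, err)
	}
}
```

With this split, a test of `formatNvme` only needs the success and failure cases, and the skip behaviour is tested once at the handler level.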

@tanabarr tanabarr requested review from mjmac, knard38 and kjacque January 29, 2024 23:32
@tanabarr

GATEKEEPER: please use the PR title and description as commit message when landing, TIA

@daosbuild1

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13605/13/testReport/

@mjmac mjmac left a comment

LGTM

@tanabarr tanabarr requested a review from a team January 31, 2024 16:51
@mjmac mjmac merged commit 8116293 into master Jan 31, 2024
51 checks passed
@mjmac mjmac deleted the tanabarr/control-hugepages-disable-no-spdk branch January 31, 2024 16:53
tanabarr added a commit that referenced this pull request Feb 6, 2024
…fg (#13605)
tanabarr added a commit that referenced this pull request Feb 6, 2024
…fg (#13605)
Labels
control-plane work on the management infrastructure of the DAOS Control Plane
5 participants