Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database BackupManager #1711

Open
wants to merge 37 commits into
base: nightly
Choose a base branch
from
Open

Database BackupManager #1711

wants to merge 37 commits into from

Conversation

jfldde
Copy link
Contributor

@jfldde jfldde commented Jan 15, 2025

Description

  • Introduces new BackupManager to backup all databases required by a citrea node.
  • Introduces two new RPCs:
    • backup_create <path> to create a node backup at path
    • backup_validate <path> to validate a node backup at path
    • backup_info <path> to get info about backup at path
  • Adds new --restore-db <path> arg to restore a node from a backup at path.

Linked Issues

Depends on #1714

Missing

  • Job handling for long running RPC

@jfldde jfldde requested a review from eyusufatik as a code owner January 15, 2025 13:32
@auto-assign auto-assign bot requested a review from kpp January 15, 2025 13:32
@jfldde jfldde marked this pull request as draft January 15, 2025 13:32
@jfldde jfldde requested a review from Chinacolt January 15, 2025 13:37
Copy link

codecov bot commented Jan 15, 2025

Codecov Report

Attention: Patch coverage is 26.87225% with 332 lines in your changes missing coverage. Please review.

Project coverage is 76.7%. Comparing base (2147611) to head (e7ac8f2).

Files with missing lines Patch % Lines
crates/common/src/backup/manager.rs 21.7% 158 Missing ⚠️
crates/common/src/backup/utils.rs 0.0% 70 Missing ⚠️
crates/common/src/backup/rpc.rs 20.4% 66 Missing ⚠️
...overeign-sdk/full-node/db/sov-schema-db/src/lib.rs 0.0% 11 Missing ⚠️
bin/citrea/src/cli.rs 0.0% 7 Missing ⚠️
...dk/full-node/sov-prover-storage-manager/src/lib.rs 60.0% 6 Missing ⚠️
crates/light-client-prover/src/da_block_handler.rs 0.0% 5 Missing ⚠️
crates/light-client-prover/src/services.rs 0.0% 3 Missing ⚠️
...es/sovereign-sdk/full-node/db/sov-db/src/mmr_db.rs 0.0% 3 Missing ⚠️
bin/citrea/src/main.rs 0.0% 2 Missing ⚠️
... and 1 more
Additional details and impacted files
Files with missing lines Coverage Δ
bin/citrea/src/rollup/bitcoin.rs 0.0% <ø> (ø)
bin/citrea/src/rollup/mock.rs 76.3% <100.0%> (+0.7%) ⬆️
crates/batch-prover/src/da_block_handler.rs 75.0% <100.0%> (+0.3%) ⬆️
crates/batch-prover/src/lib.rs 100.0% <100.0%> (ø)
crates/batch-prover/src/runner.rs 84.8% <100.0%> (+0.1%) ⬆️
crates/common/src/config.rs 94.8% <100.0%> (+<0.1%) ⬆️
crates/fullnode/src/da_block_handler.rs 72.9% <100.0%> (+0.9%) ⬆️
crates/fullnode/src/lib.rs 100.0% <100.0%> (ø)
crates/fullnode/src/runner.rs 90.3% <100.0%> (+0.1%) ⬆️
crates/sequencer/src/lib.rs 100.0% <100.0%> (ø)
... and 17 more

... and 3 files with indirect coverage changes

@jfldde jfldde marked this pull request as ready for review January 20, 2025 10:59
@auto-assign auto-assign bot requested a review from rakanalh January 20, 2025 10:59
bin/citrea/src/rollup/mod.rs Outdated Show resolved Hide resolved
@@ -119,6 +119,10 @@ impl SnapshotManager {

Ok(SnapshotManagerIter::new(db_iter, snapshot_iterators))
}

pub fn get_db_handle(&self) -> Arc<sov_schema_db::DB> {
self.db.clone()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to see the consequences of returning a Weak ptr here. Just curious

.or(self.base_path.as_ref())
.context("Missing path and no backup_path found in config.")?;

let l1_lock = self.l1_processing_lock.lock().await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these locks for? If we need a FS lock, we can use https://docs.rs/file-lock/latest/file_lock/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind the two locks is to make sure the three backups are created in a consistent L1 and L2 state between them.
We hold both locks, start the backup process in a consistent state, then release locks. The node can continue producing block while the backups are made from the same L2/L1 height

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, understood. Do we produce l1 blocks in parallel while producing l2 blocks or we are blocked by l1 production? If we are blocked, then there is no need for 2 mutexes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L1 and L2 block productions are independent from each other. The two mutexes are here for the backup manager to hold both locks and have a consistent L1 and L2 height

@jfldde jfldde added the HOLD-MERGE PR is not draft but should not be merged yet label Jan 30, 2025
@jfldde
Copy link
Contributor Author

jfldde commented Jan 30, 2025

Holding merge until devops' feedback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
HOLD-MERGE PR is not draft but should not be merged yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RocksDB backups through RPC
3 participants