chore: adding more docs to the storage
Signed-off-by: Simon Paitrault <[email protected]>
Freyskeyd committed Oct 31, 2023
1 parent 45c75a4 commit d1e7303
Showing 5 changed files with 92 additions and 451 deletions.
2 changes: 1 addition & 1 deletion crates/topos-tce-storage/src/fullnode/mod.rs
@@ -24,7 +24,7 @@ use crate::{

use self::locking::LockGuards;

mod locking;
pub mod locking;

/// Store to manage FullNode data
///
163 changes: 40 additions & 123 deletions crates/topos-tce-storage/src/lib.rs
@@ -19,40 +19,63 @@
//! <img alt="Text changing depending on mode. Light: 'So light!' Dark: 'So dark!'" src="https://github.com/topos-protocol/topos/assets/1394604/e4bd859e-2a6d-40dc-8e84-2a708aa8a2d8">
//!</picture>
//!
//! ## Usage
//! ### Definitions and Responsibilities
//!
//! Each store represents a different set of capabilities, but they all behave similarly and need the same kind
//! of configuration in order to work.
//! As illustrated above, multiple `stores` are exposed in the library using various `tables`.
//!
//! For instance, the [`EpochValidatorsStore`](struct@epoch::EpochValidatorsStore) only needs a [`PathBuf`](struct@std::path::PathBuf)
//! argument to be instantiated where [`FullNodeStore`](struct@fullnode::FullNodeStore) needs a little bit more arguments.
//! The difference between a `store` and a `table` is that the `table` is responsible for storing
//! the data while the `store` is responsible for managing the data and its access and behaviour.
//!
//! The underlying mechanism of how data is stored is fairly simple: it relies heavily on [`rocksdb`] and will
//! be described below.
//! Here's the list of the different stores and their responsibilities:
//!
//! - The [`EpochValidatorsStore`](struct@epoch::EpochValidatorsStore) is responsible for managing the list of validators for each `epoch`.
//! - The [`FullNodeStore`](struct@fullnode::FullNodeStore) is responsible for managing all the persistent data such as delivered [`Certificate`]s and their associated `streams`.
//! - The [`IndexStore`](struct@index::IndexStore) is responsible for managing indexes in order to collect information about the broadcast and the network.
//! - The [`ValidatorStore`](struct@validator::ValidatorStore) is responsible for managing the pending certificates pool and all the transient and volatile data.
//!
//! For more information about a `store`, see the related doc.
//!
//! Next, here is the list of the different tables and their responsibilities:
//!
//! - The [`EpochValidatorsTables`](struct@epoch::EpochValidatorsTables) is responsible for storing the list of validators for each `epoch`.
//! - The [`ValidatorPerpetualTables`](struct@validator::ValidatorPerpetualTables) is responsible for storing the delivered [`Certificate`]s and all the persistent data related to the broadcast.
//! - The [`ValidatorPendingTables`](struct@validator::ValidatorPendingTables) is responsible for storing the pending certificates pool and all the transient and volatile data.
//! - The [`IndexTables`](struct@index::IndexTables) is responsible for storing indexes about the delivery of [`Certificate`] such as `target subnet stream`.
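//!
//! To make the `store`/`table` split concrete, here is a minimal, hypothetical sketch: the
//! constructor names below are assumptions for illustration only, the real signatures live in
//! each store's documentation.
//!
//! ```ignore
//! use std::path::PathBuf;
//!
//! // A `table` owns the on-disk column families (backed by rocksdb), while a `store`
//! // wraps one or more tables and exposes the access and locking logic on top of them.
//! let path = PathBuf::from("./db");
//!
//! // Hypothetical constructor: a store that only needs a path to open its tables.
//! let epoch_validators = EpochValidatorsStore::open(path.clone())?;
//!
//! // Hypothetical constructor: a store that needs more arguments because it composes
//! // several tables and other stores.
//! let fullnode_store = FullNodeStore::open(/* tables, other stores, ... */)?;
//! ```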
//!
//! ## Special Considerations
//!
//! When using the storage layer, you need to be aware of the following:
//! - The storage layer is using [`rocksdb`] as a backend, which means that the data is stored on disk.
//! - The storage layer is using [`Arc`](struct@std::sync::Arc) to share the stores between threads.
//! - The storage layer is using [`async_trait`](https://docs.rs/async-trait/0.1.51/async_trait/) to expose methods that need to manage locks. (see [`WriteStore`](trait@store::WriteStore))
//! - Some functions are using [`DBBatch`](struct@rocks::db_column::DBBatch) to batch multiple writes in one transaction. But not all functions are using it.
//! - The storage layer is using [rocksdb](https://rocksdb.org/) as a backend, which means that this storage doesn't need an external service, as `rocksdb` is an embeddable key-value store.
//! - The storage layer is using [`Arc`](struct@std::sync::Arc) to share the stores between threads. It also means that a `store` is only instantiated once (see the sketch after this list).
//! - Some functions batch multiple writes into one transaction, but not all of them do.
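//!
//! A minimal sketch of the sharing model (the store type and constructor are hypothetical,
//! only the `Arc` usage reflects the actual design):
//!
//! ```ignore
//! use std::sync::Arc;
//!
//! // The store is created once...
//! let store = Arc::new(ValidatorStore::open(path)?);
//!
//! // ...and then cheaply cloned (only the `Arc` is cloned) into every thread
//! // or component that needs access to the data.
//! let for_worker = Arc::clone(&store);
//! std::thread::spawn(move || {
//!     // read or write through the shared store
//!     let _ = for_worker;
//! });
//! ```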
//!
//! ## Design Philosophy
//!
//! The choice of using [`rocksdb`] as a backend was made because it is a well-known and battle-tested database.
//! It is also very fast and efficient when it comes to writing and reading data. However, it is not the best when it comes
//! to composing or filtering data. This is why we have multiple stores that are used for different purposes.
//! The choice of using [rocksdb](https://rocksdb.org/) as a backend was made because it is a well-known and battle-tested database.
//! It is also very fast and efficient when it comes to writing and reading data.
//!
//! Multiple `stores` and `tables` exist in order to allow admins to deal with backups or
//! snapshots as they see fit. You can pick and choose which `tables` you want to backup without having to backup the whole database.
//!
//! By splitting the data into dedicated tables we define a strong separation of concerns
//! directly in our storage.
//!
//! `RocksDB` is, however, not the best fit when it comes to composing or filtering data based on the data
//! itself.
//!
//! For complex queries, another database like [`PostgreSQL`](https://www.postgresql.org/) or [`CockroachDB`](https://www.cockroachlabs.com/) could be used as a Storage for projections.
//! The source of truth would still be [`rocksdb`] but the projections would be stored in a relational database. Allowing for more complex queries.
//! The source of truth would still be [rocksdb](https://rocksdb.org/) but the projections would be stored in a relational database, allowing for more complex queries.
//!
//! As mentioned above, the different stores are using [`Arc`](struct@std::sync::Arc), allowing a single store to be instantiated once
//! and then shared between threads. This is very useful when it comes to the [`FullNodeStore`](struct@fullnode::FullNodeStore) as it is used in various places.
//! and then shared between threads. This is very useful when it comes to the [`FullNodeStore`](struct@fullnode::FullNodeStore) as it is used in various places but needs to provide a single entry point to the data.
//!
//! It also means that the store is immutable, which is a good thing when it comes to concurrency.
//!
//! The burden of managing the locks is handled by the [`async_trait`](https://docs.rs/async-trait/0.1.51/async_trait/) crate when using the [`WriteStore`](trait@store::WriteStore).
//! The rest of the mutation on the data are handled by [`rocksdb`] itself.
//!
//! The locks are responsible for preventing any other query from mutating the data currently being processed. For more information about the locks, see [`locking`](module@fullnode::locking).
//!
//! The rest of the mutations on the data are handled by [rocksdb](https://rocksdb.org/) itself.
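//!
//! The resulting write pattern can be summarised with the following hypothetical sketch
//! (the method name is an assumption, only the `&self` receiver and the internal locking
//! reflect the actual design):
//!
//! ```ignore
//! // Writes go through `&self`: the store stays shareable behind an `Arc`, and the
//! // per-certificate lock is acquired inside the method itself.
//! async fn handle_delivery(store: Arc<FullNodeStore>, certificate: Certificate) {
//!     // Hypothetical write method: acquires the lock for this certificate id, applies the
//!     // rocksdb writes, then releases the lock when the internal guard is dropped.
//!     let _ = store.insert_certificate_delivered(&certificate).await;
//! }
//! ```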
//!
use errors::InternalStorageError;
use rocks::iterator::ColumnIterator;
@@ -72,7 +95,6 @@ pub mod epoch;
pub mod fullnode;
pub mod index;
pub mod types;
/// Everything that is needed to participate in the protocol
pub mod validator;

// v1
@@ -128,108 +150,3 @@ pub struct SourceHead {
/// Position of the Certificate
pub position: Position,
}

/// Define possible status of a certificate
#[derive(Debug, Deserialize, Serialize)]
pub enum CertificateStatus {
Pending,
Delivered,
}

/// The `Storage` trait defines methods to interact with and manage the persistence layer
#[async_trait::async_trait]
pub trait Storage: Sync + Send + 'static {
/// Returns a pending certificate and its [`PendingCertificateId`] for the given [`CertificateId`]
async fn get_pending_certificate(
&self,
certificate_id: CertificateId,
) -> Result<(PendingCertificateId, Certificate), InternalStorageError>;

/// Add a pending certificate to the pool
async fn add_pending_certificate(
&self,
certificate: &Certificate,
) -> Result<PendingCertificateId, InternalStorageError>;

/// Persist the certificate with given status
async fn persist(
&self,
certificate: &Certificate,
pending_certificate_id: Option<PendingCertificateId>,
) -> Result<CertificatePositions, InternalStorageError>;

/// Update the certificate entry with new status
async fn update(
&self,
certificate_id: &CertificateId,
status: CertificateStatus,
) -> Result<(), InternalStorageError>;

/// Returns the source heads of given subnets
async fn get_source_heads(
&self,
subnets: Vec<SubnetId>,
) -> Result<Vec<crate::SourceHead>, InternalStorageError>;

/// Returns the certificate data given their id
async fn get_certificates(
&self,
certificate_ids: Vec<CertificateId>,
) -> Result<Vec<Certificate>, InternalStorageError>;

/// Returns the certificate data given its id
async fn get_certificate(
&self,
certificate_id: CertificateId,
) -> Result<Certificate, InternalStorageError>;

/// Returns the certificates emitted by the given subnet,
/// ranged by position since emitted Certificates are totally ordered
async fn get_certificates_by_source(
&self,
source_subnet_id: SubnetId,
from: Position,
limit: usize,
) -> Result<Vec<CertificateId>, InternalStorageError>;

/// Returns the certificates received by the given subnet,
/// ranged by timestamps since received Certificates are not referable by position
async fn get_certificates_by_target(
&self,
target_subnet_id: SubnetId,
source_subnet_id: SubnetId,
from: Position,
limit: usize,
) -> Result<Vec<CertificateId>, InternalStorageError>;

/// Returns all the known Certificates that are not delivered yet
async fn get_pending_certificates(
&self,
) -> Result<Vec<(PendingCertificateId, Certificate)>, InternalStorageError>;

/// Returns the next Certificate that is not delivered yet
async fn get_next_pending_certificate(
&self,
starting_at: Option<usize>,
) -> Result<Option<(PendingCertificateId, Certificate)>, InternalStorageError>;

/// Remove a certificate from pending pool
async fn remove_pending_certificate(
&self,
index: PendingCertificateId,
) -> Result<(), InternalStorageError>;

async fn get_target_stream_iterator(
&self,
target: SubnetId,
source: SubnetId,
position: Position,
) -> Result<
ColumnIterator<'_, CertificateTargetStreamPosition, CertificateId>,
InternalStorageError,
>;

async fn get_source_list_by_target(
&self,
target: SubnetId,
) -> Result<Vec<SubnetId>, InternalStorageError>;
}
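
A minimal usage sketch of the `Storage` trait above, assuming some implementing type `S`. The helper function, its parameters, and the flow are illustrative assumptions; only the method signatures and `CertificateStatus` come from the code shown above.

```rust
// Hypothetical helper: register a certificate as pending, persist it, then record its
// new status. Every call below matches a signature on the `Storage` trait shown above.
async fn deliver<S: Storage>(
    store: &S,
    certificate: &Certificate,
    certificate_id: &CertificateId,
) -> Result<CertificatePositions, InternalStorageError> {
    // Add the certificate to the pending pool and keep its pending id.
    let pending_id = store.add_pending_certificate(certificate).await?;

    // Persist the certificate, referencing the pending entry created above.
    let positions = store.persist(certificate, Some(pending_id)).await?;

    // Record the new status for the certificate.
    store.update(certificate_id, CertificateStatus::Delivered).await?;

    Ok(positions)
}
```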