Skip to content
This repository has been archived by the owner on Jan 3, 2024. It is now read-only.

rgw/sfs: dbconn: add connection pool #233

Merged
merged 10 commits into from
Nov 6, 2023

Conversation

tserong
Copy link
Member

@tserong tserong commented Oct 19, 2023

This is the culmination of the work in #209 but I've cleaned up the commit history to get rid of old experiments and also added some tests. I didn't want to squash the commits in the other PR because some of that is still interesting (to me at least) as a historical record.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@github-actions
Copy link

📦 quay.io/s3gw/s3gw:pr-39134ba44abcb24c6fa7f0b40ac5c8e49737199f-6572130889-1 https://quay.io/repository/s3gw/s3gw?tab=tags&tag=pr-39134ba44abcb24c6fa7f0b40ac5c8e49737199f-6572130889-1

@tserong
Copy link
Member Author

tserong commented Oct 19, 2023

(tests are failing because of https://github.com/aquarist-labs/s3gw/issues/756)

@tserong tserong force-pushed the wip-connection-pool branch from 3abc71f to dd4c90e Compare October 19, 2023 09:52
@github-actions github-actions bot added the needs-rebase Changes need a rebase label Oct 19, 2023
@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@tserong tserong changed the base branch from s3gw.2023-10-19-old to s3gw October 19, 2023 09:53
@tserong tserong removed the ci/build-s3gw-image Build and push a container image label Oct 19, 2023
@tserong tserong force-pushed the wip-connection-pool branch from dd4c90e to 2f1c295 Compare October 19, 2023 09:55
@github-actions github-actions bot removed the needs-rebase Changes need a rebase label Oct 19, 2023
@tserong tserong added needs-rebase Changes need a rebase ci/build-s3gw-image Build and push a container image and removed needs-rebase Changes need a rebase labels Oct 19, 2023
@jecluis jecluis added this to the v0.23.0 milestone Oct 22, 2023
@jecluis jecluis added area/rgw-sfs RGW & SFS related kind/feature New functionality or support for something labels Oct 22, 2023
sqlite_orm v1.8.2 has fnc12/sqlite_orm#1054,
which we want to ensure thread safety.  Unfortunately it's missing
fnc12/sqlite_orm#1169, without which
`storage.sync_schema()` breaks (or at least gives us nasty errors)
due to incorrect quotes.  In order to pick up the latter, I've
forked sqlite_orm into the aquarist-labs org, and made a branch
"v1.8.2-s3gw", which is upstream v1.8.2 plus a cherry pick of that
additional fix.

Signed-off-by: Tim Serong <[email protected]>
If we're alrady using an SFStore, don't create a new DBConnRef, just use
the one that's already part of the SFStore.

Signed-off-by: Tim Serong <[email protected]>
Using `transaction_guard()` instead of `begin_transaction()` ensures the
transaction will rollback if anything in this block throws an exception.

Signed-off-by: Tim Serong <[email protected]>
@tserong tserong force-pushed the wip-connection-pool branch from 2f1c295 to aa374c0 Compare October 25, 2023 03:37
@tserong
Copy link
Member Author

tserong commented Oct 25, 2023

Rebased

@jecluis
Copy link
Member

jecluis commented Oct 25, 2023

Is this failed test related with the chats we've been having about our CI self-hosted runners? Only thing I see is that the log was abruptly interrupted.

@tserong
Copy link
Member Author

tserong commented Oct 25, 2023

Nope, my bad - when I rebased, I missed changing a couple of instances of storage. to storage-> in some new code. Hang on...

This ensures we're not making multiple copies of the Storage object.
In this commit, I renamed Storage to StorageImpl, so I could do a build
and check for compiler errors to ensure I'd caught all the previous uses
of Storage throughout the codebase.

Signed-off-by: Tim Serong <[email protected]>
@tserong tserong force-pushed the wip-connection-pool branch from aa374c0 to cfa1f10 Compare October 25, 2023 05:03
@tserong
Copy link
Member Author

tserong commented Oct 25, 2023

...let's see how that goes now.

@tserong tserong added ci/build-s3gw-image Build and push a container image and removed ci/build-s3gw-image Build and push a container image labels Oct 25, 2023
@tserong
Copy link
Member Author

tserong commented Oct 25, 2023

LOL! now it's failing the WAL explosion test, because the WAL didn't explode as much as usual...

@github-actions
Copy link

📦 quay.io/s3gw/s3gw:pr-4976c7504f08b89aa74790a73d2b15f40bf92421-6636123673-1 https://quay.io/repository/s3gw/s3gw?tab=tags&tag=pr-4976c7504f08b89aa74790a73d2b15f40bf92421-6636123673-1

@tserong tserong force-pushed the wip-connection-pool branch from cfa1f10 to 8cb8807 Compare October 25, 2023 06:01
@jecluis
Copy link
Member

jecluis commented Oct 25, 2023

LOL! now it's failing the WAL explosion test, because the WAL didn't explode as much as usual...

How dare! 🤣

src/rgw/driver/sfs/sqlite/dbconn.h Outdated Show resolved Hide resolved
src/rgw/driver/sfs/sqlite/dbconn.h Outdated Show resolved Hide resolved

public:
sqlite3* first_sqlite_conn;
std::unordered_map<std::thread::id, sqlite3*> all_sqlite_conns;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this private? It has a non trivial access story (pull mutex?)

Do we actually need the thread id here? I can't find any references. Maybe a vector is enough. Still needs synchronized access though.

Copy link
Member Author

@tserong tserong Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second question first, because it's the easy one :-) The thread id is there because get_sqlite_conn() uses it to get the sqlite3* specifically associated with the calling thread. Same mechanism as get_storage(), just a different map. I found keeping two maps made some of the code easier to read rather than one map with a std::pair<Storage, sqlite*> or a struct, but either approach would work.

As to can we make it private... The current code inside DBConn actually only uses the main thread's sqlite* for a couple of things during startup to do with updating metadata. The other use of all_sqlite_conns is in SFSStatusPage::render(), where we use it to get db status for all threads. Having it public made that really easy and lightweight, but you're right, it would then need synchronized access, dammit.

I guess the safest thing to do under the circumstances is make that map private (or fold it into the storage_pool map), and add a method that returns a vector created as necessary from the contents of the map. Something like this (untested):

std::vector<sqlite*> all_sqlite_conns() const {
  std::vector<sqlite*> conns;
  std::shared_lock lock(storage_pool_mutex);
  for (auto p : all_sqlite_conns) {
    conns.emplace(p.second)
  }
  return conns;
}

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all_sqlite_conns sounds good. Right now this is only called by a human via the status page so this doesn't need to e high performance - at the end this is roughly 4k (64 bit * 512 threads) data :).

An alternative would be to store the sqlite* pointers in a vector and the special first one separately (or define it as vector[0]). We'd loose the thread association, but the copy is cheaper. I'd rather not have the status page under the pool mutex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with the alternative :-)

Currently, `DBConn` keeps an instance of `Storage`, which is created by
`sqlite_orm::make_storage()`.  That first instance of `Storage` is long-lived
(the `DBConn` c'tor calls `storage->open_forever()`) so the sqlite database
is open for the entire life of the program, but this first `Storage` object
and its associated sqlite database connection pointer are largely not used
for anything much after initialization. The exception is the SFS status page,
which also uses this connection to report some sqlite statistics.

All the other threads (the workers, the garbage collector, ...) call
`DBConn::get_storage()`, which returns a copy of the first `Storage` object.
These copies don't have `open_forever()` called on them, which means every
time they're used for queries we get a pair of calls to `sqlite3_open()` and
`sqlite3_close()` at the start end end of the query.  These calls don't open
the main database file again (it's already open) but they do open and close
the WAL.

There's a couple of problems with this.  One is that the SFS status page
only sees the main thread (which is largely boring), and can't track any
of the worker threads.  The other problem is that something about not keeping
the connection open on the worker threads is relatively expensive.  If we
keep connections open rather than opening and closing with every query, we
can get something like a 20x speed increase on read queries, and at least 2x
on writes.

This new implementation gives one `Storage` object per thread, created on
demand as a copy of the first `Storage` object created in the `DBConn`
constructor.

Fixes: https://github.com/aquarist-labs/s3gw/issues/727
Signed-off-by: Tim Serong <[email protected]>
This adds a separate std::vector<sqlite3*> of all DB connections,
and a method to get a copy of it so that the SFS stats page can
iterate though it easily to query DB statistics.

Signed-off-by: Tim Serong <[email protected]>
Since adding storage_pool, the WAL doesn't explode quite as much as
it used to with 10 threads and 1000 objects each (previously with
multiple unpooled connections it'd reliably go over 500MB, but I've
seen one unit test where it "only" got to ~450MB, and another where
it barely got over 70MB).  To try to get it to explode nicely I'm
now using std::thread::hardware_concurrency() as the number of
threads, and creating 2000 objects.  I've also dropped the test
value to 300MB to give some more wiggle room.

Signed-off-by: Tim Serong <[email protected]>
@tserong tserong force-pushed the wip-connection-pool branch from 8cb8807 to 6c6bde6 Compare October 31, 2023 11:41
Copy link
Member

@jecluis jecluis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jecluis jecluis requested a review from irq0 November 2, 2023 07:18
@jecluis
Copy link
Member

jecluis commented Nov 2, 2023

@irq0 when you have a chance, please do another pass on this so we can get it merged soon :)

@tserong tserong merged commit 53d47be into aquarist-labs:s3gw Nov 6, 2023
5 checks passed
@tserong tserong deleted the wip-connection-pool branch November 6, 2023 00:04
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/rgw-sfs RGW & SFS related ci/build-s3gw-image Build and push a container image kind/feature New functionality or support for something
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants