-
Notifications
You must be signed in to change notification settings - Fork 115
Closes #660 Store management #731
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emily, I have mentioned a few things -- largely style changes that I want you to incorporate globally (some EDN preferences and using full sentences for comments throughout).
But then there are more substantive things that we'll need to work through together.
- I'd like to see a fixed-up version of the first commit land (splitting
Store
out ofConn
), but I'd like to see it split intostore.rs
. I have done this locally and may just land it myself, since it's a good refactoring. You can see my tree at
https://github.com/ncalexan/mentat/tree/fluffyemily/store-management/
-
You can't use
canonicalize
in the way that you are, because it goes to the filesystem and fails if the file doesn't exist. (That's why your tests are failing in Travis.) You can work around this by canonicalizing after the first successful open, but it's all kinds of complicated. Do we really care about this? -
The split between named memory stores and files is awkward as written, and it doesn't do what a consumer familiar with SQLite thinks. That is, two named memory stores do not correspond to the same SQLite in-memory store, since unless you take steps to unify them, SQLite treats in-memory stores as distinct. I'd really prefer to standardize on SQLite's URI notation for stores (and potentially build an API for named in-memory stores that uses that URI notation). Canonicalizing URIs is much simpler than canonicalizing file paths, since there's really only
..
to worry about and not symlinks.
Finally, on the meat of the problem: I'm just not sure this expression is going in a good direction. At first I though you wouldn't be able Arc::make_mut
your way forward, because make_mut
updates or clones the underlying data. But I see you have this Arc<Mutex<...>>
thing in place, so in theory the clone should do the right thing. But my test across threads (see my branch) shows that it doesn't work, at least not as written,. I can't convince myself if that's a small bug or a fundamental problem (yet). I will say that it's very hard to reason about the nested Arc
and Mutex
layers, and the Weak
is another layer of indirection that I haven't really understood yet. So I thought I'd give an initial review, asking for some obvious fixes and clean-ups, and point at what I think are some non-obvious problems, so that we can iterate on the final solution. Thanks!
src/errors.rs
Outdated
@@ -109,5 +110,20 @@ error_chain! { | |||
description("provided value doesn't match value type") | |||
display("provided value of type {} doesn't match attribute value type {}", provided, expected) | |||
} | |||
|
|||
StoreNotFound(path: String) { | |||
description("the Store provided does not exist or is not yet open.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-existence (on disk) is pretty different from not being opened. Is there a reason this is one error?
src/stores.rs
Outdated
pub fn open_store<T>(path: T) -> Result<Store> where T: AsRef<Path> { | ||
let path_ref = path.as_ref(); | ||
let path_string = path_ref.to_string_lossy(); | ||
let (name, cannonical) = if path_string.len() > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "canonical" throughout this method.
src/stores.rs
Outdated
} else { | ||
(path_string.into(), path_ref.to_path_buf()) | ||
}; | ||
Stores::singleton().write().unwrap().open(&name, cannonical) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we be .write()?
throughout? This is just a (special) Result
so we don't want to panic on failure.
src/stores.rs
Outdated
assert_eq!(1, Arc::strong_count(store1.conn())); | ||
let mut in_progress = store1.begin_transaction().expect("begun"); | ||
in_progress.transact(r#"[ | ||
{ :db/ident :foo/bar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's usual to tighten EDN, so no spaces before or after curlies (like {:db/ident ...}
), and it's unusual to include commas (which are EDN whitespace).
src/stores.rs
Outdated
let mut store2 = Stores::connect_store(path).expect("expected a new store"); | ||
let mut in_progress = store2.begin_transaction().expect("begun"); | ||
in_progress.transact(r#"[ | ||
{:foo/bar 15, :foo/baz false, :foo/x [1, 2, 3]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: no commas, please. (But the tight curlies look more natural here.)
src/stores.rs
Outdated
} | ||
|
||
{ | ||
// forking an open store leads to a ref count of 2 on the shared conn. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: (throughout) we use full sentences with punctuation for comments. Everywhere, please.
} | ||
} | ||
|
||
pub fn singleton() -> &'static RwLock<Stores> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's hide the RwLock<Stores>
in something opaque, say:
pub struct StoreManager(RwLock<Stores>)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was reading this code and getting confused until I realized that there's a static implementation and a non-static implementation of basically the same thing. What's the point of exposing the singleton and the static implementation? I'd rather have a non-static implementation and a singleton; it makes testing easier ('cuz you can test against an instance that isn't the singleton).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Access to the singleton should definitely not be public. (Right?)
|
||
#[test] | ||
fn test_stores_get_closed_store() { | ||
match Stores::get_named_in_memory_store("test_stores_get_closed_store").expect("Expected a store to be fetched") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can assert!(blah.is_none())
or similar, if you like.
a164969
to
e291159
Compare
@ncalexan This is the version without connection pooling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking much closer to what I expect, but I think there's still work to do around the StoreConnection
functionality:
- either there's work to radically simplify that functionality
- or there's work to explain to me why this layer of complexity is needed
But overall, this is looking much better! Have you tried racing lots of threads opening and closing the same stores simultaneously? That's a good test to implement.
@@ -582,7 +582,6 @@ impl Conn { | |||
/// Prepare the provided SQLite handle for use as a Mentat store. Creates tables but | |||
/// _does not_ write the bootstrap schema. This constructor should only be used by | |||
/// consumers that expect to populate raw transaction data themselves. | |||
|
|||
pub(crate) fn empty(sqlite: &mut rusqlite::Connection) -> Result<Conn> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the first commit, rebase the changes so as to not lose the original commit message.
@@ -86,7 +86,7 @@ fn escape_string_for_pragma(s: &str) -> String { | |||
s.replace("'", "''") | |||
} | |||
|
|||
fn make_connection(uri: &Path, maybe_encryption_key: Option<&str>) -> rusqlite::Result<rusqlite::Connection> { | |||
pub fn make_connection(uri: &Path, maybe_encryption_key: Option<&str>) -> rusqlite::Result<rusqlite::Connection> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thom didn't make this public for a reason. We should be feature-checking sqlcipher
in Store
and Stores
to handle this, I think.
src/errors.rs
Outdated
@@ -80,6 +80,15 @@ pub enum MentatError { | |||
#[fail(display = "provided value of type {} doesn't match attribute value type {}", _0, _1)] | |||
ValueTypeMismatch(ValueType, ValueType), | |||
|
|||
#[fail(display = "Cannot open store {} at path {:?} as it does not match previous store location {:?}", _0, _1, _2)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we differentiating "names" from "paths" at all? If you give the same path two different names, it's a catastrophe, and this error is all about preventing you giving the same name to two different paths.
Let's just let folks give an in-memory URI if they want, and then make it easy for them to create those URIs (which are a little bit of a SQLite detail). Right now we have a split in the API based on whether the store is named that seems odd.
src/errors.rs
Outdated
#[fail(display = "The Store at {} does not exist or is not yet open.", _0)] | ||
StoreNotFound(String), | ||
|
||
#[fail(display = "The Store at {:?} has active connections and cannot be closed.", _0)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The display and the actual name seem to have diverged. I don't think we should leak something about lock poisoning if we can help it; it's a weird case to expose to high-level consumers.
src/errors.rs
Outdated
#[fail(display = "Cannot open store {} at path {:?} as it does not match previous store location {:?}", _0, _1, _2)] | ||
StorePathMismatch(String, PathBuf, PathBuf), | ||
|
||
#[fail(display = "The Store at {} does not exist or is not yet open.", _0)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's not yet open, when would I want to get a "not found" error rather than opening it?
|
||
impl Stores { | ||
// Returns true if there exists an entry for the provided name in the connections map. | ||
// This does not guarentee that the weak reference we hold to the Conn is still valid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This business about the weak reference is outdated now. (Here and below.)
// Creates a new store on an existing connection with a new encrypted rusqlite connection. | ||
// Equivalent to forking an existing store. | ||
#[cfg(feature = "sqlcipher")] | ||
pub fn connect_with_key(&mut self, name: &str, encryption_key: &str) -> Result<Store> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when you try to connect to a store that is encrypted with and without a key?
Can we, in some way, namespace stores, perhaps by asserting that encrypted stores always need to be suffixed .encrypted.mentatdb
, and rejecting unencrypted stores with that suffix? Basically, don't be accepting -- be rigid up front to make this problem easier; we can always loosen our restrictions.
sdks/android/Mentat/build.gradle
Outdated
@@ -18,7 +18,7 @@ buildscript { | |||
google() | |||
} | |||
dependencies { | |||
classpath 'com.android.tools.build:gradle:3.1.2' | |||
classpath 'com.android.tools.build:gradle:3.1.3' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent commit comment. This is... distressing. But whatever moves us forward, I guess.
@@ -10,6 +10,7 @@ | |||
# Specifies the JVM arguments used for the daemon process. | |||
# The setting is particularly useful for tweaking memory settings. | |||
org.gradle.jvmargs=-Xmx1536m | |||
org.gradle.configureondemand=false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really necessary? This sucks, because always configuring all projects is very expensive as sub-projects multiply.
@@ -36,7 +36,7 @@ public Mentat(String dbPath) { | |||
} | |||
|
|||
/** | |||
* Open a connection to an in-memory Store. | |||
* Open a connection to an anonymous in-memory Store. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very misleading, and is really the problem here -- your store isn't anonymous, it's the "empty string", hence it always open a new store. Why not make this function actually anonymous by randomizing the string and drop all the hand-coded names?
Enable ability to create named in memory stores and in memory stores with shared caches. Include ability to create encrypted connections. Update `Store` to take an `Arc<Conn>` so references can be shared. Update FFI to use `Stores` instead of `Store`. Add `store_open_named_in_memory_store` to open a named in-memory store over FFI (useful for tests).
… Utilize named in memory store in tests to ensure isolation when running.
e291159
to
5ead1d3
Compare
Add Store manager to ensure that we only have one open handle to each database.
This implementation has gone through several iterations. The considerations that formed the current design are as follows:
rusqlite::Connection
isSend
but notSync
and therefore cannot be stored in a static var. This is why we only store theConn
insideStores
.Conn
is stored as aWeak
reference because otherwise every time we return aStore
we increase the reference count from one to two, forcing every mutable operation performed on theConn
to clone rather than return a mutable reference, dramatically increasing the number of Conn references proliferating the system. By storingConn
as aWeak
, for as long as there is only one Store we can perform mutable operations without increasing the reference count onConn
.Conn
as aWeak
reference, if we cease to hold any strong references to theConn
, by not having any activeStore
's, theWeak
will no longer be upgradable, but the entry will remain in the map of openConns
. We therefore must recreate theConn
if a request for a previously opened store is received. We do this in the background without consumers being aware.thread_local
store of openRc<rusqlite::Connection>
's mapped by the same key asConn
's. However, due torusqlite::Connection
not beingClone
, this meant thatRc::make_mut
could not be used, which resulted in the code being unable to get a mutable reference to therusqlite::Connection
if there was more than one reference. As the map retained a strong reference, that ensured that every returned Store had a reference count of at least 2 on it'srusqlite::Connection
which resulted in all mutable operations on theStore
failing. I therefore made the decision to create a newrusqlite::Connection
for every Store returned. This contravenes the original spec, but given the constraints something had to give and I decided on this one.Store
lead to a problem where the sameConn
was shared between tests and the assertions around reference counts on storedConn
's were unreliable. In order to get around this I introduced the concept of thenamed_in_memory_store
which assigns a name to an in memory store ensuring that the sameConn
was not shared between tests. This is the requirement that lead to the decision to store keys asString
's rather thanPathBuf
.The thing I don't like about this is the number of
rusqlite::Connection
s that we create. After chatting with @rnewman I will look into providing athread_local
connection pool and getStores
to take a reference to a connection that it will use to createStores
s.