RUST-735 Remove Document as default generic type on Collection and Cursor #323

patrickfreed · 2021-04-08T18:34:43Z

This PR removes Document as the default generic type for Collection and Cursor and updates Database::collection and Database::collection_with_options to return Collection<T>. As part of this, the existing typed helpers became redundant and were removed. These changes were made to ease and promote the use of serde models with the driver instead of regular Documents.

Note: I didn't update the README examples to avoid confusing users of the 1.2.x or 2.0.0-alpha versions. I filed RUST-736 to cover the work for updating that, which we can do right before the release.

patrickfreed

I have a few non-as-yet-discussed comments/questions about the public API, so tagging in @kmahar as well.

patrickfreed · 2021-04-08T18:35:42Z

src/db/mod.rs

    /// Gets a handle to a collection with type `T` specified by `name` of the database. The
    /// `Collection` options (e.g. read preference and write concern) will default to those of the
    /// `Database`.
    ///
    /// This method does not send or receive anything across the wire to the database, so it can be
    /// used repeatedly without incurring any costs from I/O.
-    pub fn collection_with_type<T>(&self, name: &str) -> Collection<T>
+    pub fn collection<T>(&self, name: &str) -> Collection<T>


I know we discussed the possibility of introducing a collection_with_document helper (or something like that) as an ergonomic way to retrieve a Collection<Document>, but after writing a few of these conversions, it seemed simpler / easier to just use the turbofish (::<>) operator on the existing helpers.

e.g.

let collection: Collection<Document> = db.collection("foo"); // explicit type way let collection = db.collection::<Document>("foo"); // turbofish let collection = db.collection_with_document("foo"); // helper

To me, the helper didn't really make it any easier to type nor did it actually make it clearer what was happening, so it seemed fine to me to just not have it. Thoughts?

from my outsider perspective it doesn't seem to make a significant difference, but I'd defer to you two on whether collection_with_document is in any way more idiomatic

This is fine by me. The explicit type/turbofish ways both make it clearer that the value returned is generic over Document. I also think a collection_with_document method might be confusing when we have both a document and raw document type.

patrickfreed · 2021-04-08T18:37:23Z

src/db/mod.rs

@@ -213,7 +194,7 @@ impl Database {
        &self,
        filter: impl Into<Option<Document>>,
        options: impl Into<Option<ListCollectionsOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


In Swift, we have this return a MongoCursor<CollectionSpecification>, and I think it would be a nice improvement here as well for 2.0. The versioned API will ensure that we won't have to worry about deserialization errors for the foreseeable future.

sgtm (though that's probably not surprising since I already agreed to it once...)

if so, would be nice to use a DatabaseSpecification type for list_databases too (assuming listDatabases is also in API version 1?)

Good call, I filed RUST-740 to cover the both of them.

I'm not super familiar with the swift library, would CollectionSpecification/DatabaseSpecification be structs that model the info returned from the list methods? If so that sounds good to me

Yep, exactly.

patrickfreed · 2021-04-08T18:40:55Z

src/sync/coll.rs

@@ -153,7 +153,7 @@ where
        &self,
        pipeline: impl IntoIterator<Item = Document>,
        options: impl Into<Option<AggregateOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


We could consider returning a Cursor<O> here (where O is a generic parameter to the aggregate method, not the T from the Collection). Or should we have a separate aggregate_with_output_type or something? I think with aggregation it's more likely that a user might just want to retrieve a document than with a find, since they may mutate the output along the way, so it's a little riskier to have this one be pure generic. Thoughts?

per slack conversation: seems we're all on the same page the T from the collection should not be used since people often do transformations here.

as for whether to require a type be specified in this method (which I was not thinking about the option of in the slack discussion), or just having a separate aggregate_with_type I don't have particularly strong feelings. I agree with your comment that it does seem a bit at odds with the choice to make requiring a type the default for db.collection. OTOH maybe it's less likely that people will bother writing custom Deserialize types for the output types of single methods?

Yeah, my original inclination was to just leave this as Document by default and introduce the new separate method, since I imagine it'll be pretty common to use Document at least while prototyping the pipeline. I'm starting to flip back to wanting to maintain consistency though, given how easy it is to just append a ::<Document> to the end of the aggregate call. Especially in the face of complex pipelines, handling the generic type would be a piece of cake.

However

This got me trying it out and I noticed an interesting / frustrating technical challenge to doing it the Cursor<O> way: turbofish cannot be used in conjunction with impl Trait in the argument position (yet, see rust-lang/rust#83701).

e.g.

fn foo<T>(bar: impl ToString) { ... } foo::<usize>("blah"); // compile error!

yields the following compiler error:

error[E0632]: cannot provide explicit generic arguments when `impl Trait` is used in argument position --> src/main.rs:21:11 | 21 | foo::<usize>("hello"); | ^^^^^ explicit generic argument not allowed

So you'd have to do it the explicit type parameter way:

let x: Cursor<Document> = coll.aggregate(vec![], None).await?

Which could be pretty frustrating.

This will make it tough to implement the one that takes in a generic argument, since we use impl Trait in all our CRUD methods for our options types (e.g. to be able to say options instead of having to say Some(options)). This makes me think we may want to do the builder approach here and have the method be on Cursor actually, something like:

let cursor = coll.aggregate(vec![ ... ], None).await?.with_type::<MyDeserializeType>();

This is pretty ugly admittedly, but I think it's preferable to breaking consistency with how options are provided to the other CRUD methods or requiring users to use explicit type annotations instead of turbofish.

I agree changing the way we pass options either here or everywhere doesn't seem like a great path forward.

is turbofish common/used rather than type annotations that people would be running into this a lot?

adding the method to cursor seems reasonable though it does look clunky.... it also seems consistent with the clone_with_type behavior that we currently have on Collection to change the generic type. and maybe it would be useful in other use cases as well - e.g. if you provide a projection via FindOptions that changes the shape of your documents as well, and you could call this on the resulting cursor.

Yeah none of these options is very ideal to me; with_type seems like it would be more ergonomic for chaining method calls together, since providing a type annotation would require breaking out the call to aggregate into a separate line. Updating our CRUD API to use the builder pattern would get rid of this issue, right?

In any case since we're opting to go the aggregate_with_type route, I think we can defer coming to a conclusion here til we implement that (and make a decision regarding the builder pattern).

I think even with the builder pattern for options/sessions we'd still have the same issue unfortunately, since we use impl IntoIterator for the pipeline.

Also, to clarify, I don't think aggregate_with_type method will really be possible, since we'd have to ditch the impl Trait arguments, so I think we kind of have to do the Cursor::with_type approach. That being said, we don't need to implement that functionality now.

Ah right, I forgot we use impl for the pipeline.

kmahar · 2021-04-08T21:12:40Z

src/cursor/mod.rs

@@ -139,7 +139,7 @@ where
        let coll = self
            .client
            .database(ns.db.as_str())
-            .collection(ns.coll.as_str());
+            .collection::<Document>(ns.coll.as_str());


this is honestly not really relevant to the PR, but I'm just curious for my own knowledge and future reference for Swift- aren't some cursors not actually tied to particular collections e.g. one resulting from db-level aggregate? how does this work in that case?

I'm not actually sure either. From testing it out, the namespace appears to be "blah.$cmd.aggregate":

{ "cursor" : { "firstBatch" : [ { "_id" : { "id" : UUID("8b95ed9d-7d24-4880-bc4e-483510bd5046"), "uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=") }, "lastUse" : ISODate("2021-04-08T22:53:20.885Z") } ], "id" : NumberLong(0), "ns" : "blah.$cmd.aggregate" }, "ok" : 1 }

We should probably add some test coverage of this case to ensure it's working properly. Assuming that's a valid namespace to send a killCursors to, it seems like it should be okay though.

kmahar · 2021-04-08T21:17:15Z

src/db/mod.rs

@@ -213,7 +194,7 @@ impl Database {
        &self,
        filter: impl Into<Option<Document>>,
        options: impl Into<Option<ListCollectionsOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


sgtm (though that's probably not surprising since I already agreed to it once...)

if so, would be nice to use a DatabaseSpecification type for list_databases too (assuming listDatabases is also in API version 1?)

kmahar · 2021-04-08T21:37:46Z

src/db/mod.rs

    /// Gets a handle to a collection with type `T` specified by `name` of the database. The
    /// `Collection` options (e.g. read preference and write concern) will default to those of the
    /// `Database`.
    ///
    /// This method does not send or receive anything across the wire to the database, so it can be
    /// used repeatedly without incurring any costs from I/O.
-    pub fn collection_with_type<T>(&self, name: &str) -> Collection<T>
+    pub fn collection<T>(&self, name: &str) -> Collection<T>


from my outsider perspective it doesn't seem to make a significant difference, but I'd defer to you two on whether collection_with_document is in any way more idiomatic

kmahar · 2021-04-08T21:55:07Z

src/sync/coll.rs

@@ -153,7 +153,7 @@ where
        &self,
        pipeline: impl IntoIterator<Item = Document>,
        options: impl Into<Option<AggregateOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


per slack conversation: seems we're all on the same page the T from the collection should not be used since people often do transformations here.

as for whether to require a type be specified in this method (which I was not thinking about the option of in the slack discussion), or just having a separate aggregate_with_type I don't have particularly strong feelings. I agree with your comment that it does seem a bit at odds with the choice to make requiring a type the default for db.collection. OTOH maybe it's less likely that people will bother writing custom Deserialize types for the output types of single methods?

isabelatkinson

Code looks good, just a few questions. Also wanted to double check that the examples in the README are still good, I believe they should be unaffected but would be worth taking a glance to sanity check if you haven't yet. (It would also probably be a good idea to put those in a test somewhere to make sure they don't become outdated.)

isabelatkinson · 2021-04-08T21:43:38Z

src/db/mod.rs

    /// Gets a handle to a collection with type `T` specified by `name` of the database. The
    /// `Collection` options (e.g. read preference and write concern) will default to those of the
    /// `Database`.
    ///
    /// This method does not send or receive anything across the wire to the database, so it can be
    /// used repeatedly without incurring any costs from I/O.
-    pub fn collection_with_type<T>(&self, name: &str) -> Collection<T>
+    pub fn collection<T>(&self, name: &str) -> Collection<T>


This is fine by me. The explicit type/turbofish ways both make it clearer that the value returned is generic over Document. I also think a collection_with_document method might be confusing when we have both a document and raw document type.

isabelatkinson · 2021-04-09T16:30:24Z

src/db/mod.rs

@@ -213,7 +194,7 @@ impl Database {
        &self,
        filter: impl Into<Option<Document>>,
        options: impl Into<Option<ListCollectionsOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


I'm not super familiar with the swift library, would CollectionSpecification/DatabaseSpecification be structs that model the info returned from the list methods? If so that sounds good to me

isabelatkinson · 2021-04-09T20:08:15Z

src/sync/coll.rs

@@ -153,7 +153,7 @@ where
        &self,
        pipeline: impl IntoIterator<Item = Document>,
        options: impl Into<Option<AggregateOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


Yeah none of these options is very ideal to me; with_type seems like it would be more ergonomic for chaining method calls together, since providing a type annotation would require breaking out the call to aggregate into a separate line. Updating our CRUD API to use the builder pattern would get rid of this issue, right?

In any case since we're opting to go the aggregate_with_type route, I think we can defer coming to a conclusion here til we implement that (and make a decision regarding the builder pattern).

patrickfreed

I was originally planning on holding off on updating the README until close to the release (RUST-736), but now that we're planning on doing a second alpha I went ahead and updated them. We can still use RUST-736 to track the work for updating the examples to be use serde more prominently. Also, good point on having tests for them--I added a file to our tests folder since the examples only use the public API to ensure the examples compile.

patrickfreed · 2021-04-09T20:59:42Z

src/db/mod.rs

@@ -213,7 +194,7 @@ impl Database {
        &self,
        filter: impl Into<Option<Document>>,
        options: impl Into<Option<ListCollectionsOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


Yep, exactly.

patrickfreed · 2021-04-09T21:09:37Z

src/sync/coll.rs

@@ -153,7 +153,7 @@ where
        &self,
        pipeline: impl IntoIterator<Item = Document>,
        options: impl Into<Option<AggregateOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


I think even with the builder pattern for options/sessions we'd still have the same issue unfortunately, since we use impl IntoIterator for the pipeline.

Also, to clarify, I don't think aggregate_with_type method will really be possible, since we'd have to ditch the impl Trait arguments, so I think we kind of have to do the Cursor::with_type approach. That being said, we don't need to implement that functionality now.

isabelatkinson · 2021-04-09T21:31:53Z

src/sync/coll.rs

@@ -153,7 +153,7 @@ where
        &self,
        pipeline: impl IntoIterator<Item = Document>,
        options: impl Into<Option<AggregateOptions>>,
-    ) -> Result<Cursor> {
+    ) -> Result<Cursor<Document>> {


Ah right, I forgot we use impl for the pipeline.

isabelatkinson · 2021-04-09T21:34:32Z

whoops, just noticed there's a clippy failure in one of the README examples tests (lgtm otherwise though, no need for another round of review on that)

edit: looks like there's also somewhere in the AWS test that needs a type annotation

seanpianka · 2021-04-18T17:24:05Z

After updating from alpha to alpha.1, I'm now seeing this run-time error with the same code:

failed to update collection: Error { kind: ArgumentError { message: \"update document must have first key starting with \\\'$\" }, labels: [] }" }

Upserting is failing due to this missing key, causing the update_document_check call to fail. However, this was never an issue previously... is there some automatic key insertion for Collection that is no longer being performed?

patrickfreed · 2021-04-20T15:06:36Z

The update_ family of methods are meant to be used with update operator expressions rather than replacement documents, which the replace family of methods are mean to be used with, but we only started validating this in alpha.1 (RUST-718). To fix your issue, update any usages of update_one that use a replacement document to use replace_one instead.

e.g.

coll.update_one(doc! { "x": 2 }, doc! { "x": 3 }, None).await?; // error
coll.replace_one(doc! { "x": 2 }, doc! { "$inc": { "x": 1 } }, None).await?; // error

coll.update_one(doc! { "x": 2 }, doc! { "$inc": { "x": 1 } }, None).await?; // correct
coll.replace_one(doc! { "x": 2 }, doc! { "x": 3 }, None).await?; // correct

If you have any more questions or concerns, please open a ticket on our Jira project rather than continuing the discussion here or on any other PR, since it can be difficult for the team or other users to find the discussion in the future otherwise.

seanpianka · 2021-04-20T19:15:23Z

Gotcha, I appreciate your detailed reply, thank you @patrickfreed! In the future, I'll open a ticket in the Jira project.

I managed to fix the update_one call via doc! { "$set": replacement_doc }, but replace_one more closely captures my intent. I'll out that, thanks again!

patrickfreed · 2021-04-22T17:35:45Z

No problem, glad you were able to get it working!

patrickfreed added 2 commits April 8, 2021 14:07

remove Document as default generic

9700c84

update doctests

7e49bd8

patrickfreed commented Apr 8, 2021

View reviewed changes

fix sync doctests

546ef0f

patrickfreed requested review from isabelatkinson and kmahar April 8, 2021 19:36

patrickfreed marked this pull request as ready for review April 8, 2021 19:36

kmahar reviewed Apr 8, 2021

View reviewed changes

isabelatkinson reviewed Apr 9, 2021

View reviewed changes

update readme examples, add tests that ensure they compile

861fb50

patrickfreed commented Apr 9, 2021

View reviewed changes

isabelatkinson approved these changes Apr 9, 2021

View reviewed changes

patrickfreed added 2 commits April 9, 2021 17:47

fix aws-ecs-test

af9f755

fix clippy

7196276

kmahar approved these changes Apr 9, 2021

View reviewed changes

patrickfreed merged commit 3e68edf into mongodb:master Apr 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RUST-735 Remove Document as default generic type on Collection and Cursor #323

RUST-735 Remove Document as default generic type on Collection and Cursor #323

patrickfreed commented Apr 8, 2021

patrickfreed left a comment

patrickfreed Apr 8, 2021

kmahar Apr 8, 2021

isabelatkinson Apr 8, 2021

patrickfreed Apr 8, 2021

kmahar Apr 8, 2021

patrickfreed Apr 8, 2021

isabelatkinson Apr 9, 2021

patrickfreed Apr 9, 2021

patrickfreed Apr 8, 2021

kmahar Apr 8, 2021

patrickfreed Apr 8, 2021

kmahar Apr 9, 2021

isabelatkinson Apr 9, 2021

patrickfreed Apr 9, 2021

isabelatkinson Apr 9, 2021

kmahar Apr 8, 2021

patrickfreed Apr 8, 2021

kmahar Apr 8, 2021

kmahar Apr 8, 2021

kmahar Apr 8, 2021

isabelatkinson left a comment

isabelatkinson Apr 8, 2021

isabelatkinson Apr 9, 2021

isabelatkinson Apr 9, 2021

patrickfreed left a comment

patrickfreed Apr 9, 2021

patrickfreed Apr 9, 2021

isabelatkinson Apr 9, 2021

isabelatkinson commented Apr 9, 2021 •

edited

Loading

seanpianka commented Apr 18, 2021 •

edited

Loading

patrickfreed commented Apr 20, 2021

seanpianka commented Apr 20, 2021

patrickfreed commented Apr 22, 2021

RUST-735 Remove Document as default generic type on Collection and Cursor #323

RUST-735 Remove Document as default generic type on Collection and Cursor #323

Conversation

patrickfreed commented Apr 8, 2021

patrickfreed left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

isabelatkinson left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

patrickfreed left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

isabelatkinson commented Apr 9, 2021 • edited Loading

seanpianka commented Apr 18, 2021 • edited Loading

patrickfreed commented Apr 20, 2021

seanpianka commented Apr 20, 2021

patrickfreed commented Apr 22, 2021

isabelatkinson commented Apr 9, 2021 •

edited

Loading

seanpianka commented Apr 18, 2021 •

edited

Loading