RUST-735 Remove Document as default generic type on Collection and Cursor #323

Merged
4 changes: 2 additions & 2 deletions .evergreen/aws-ecs-test/src/main.rs
@@ -1,4 +1,4 @@
use mongodb::Client;
use mongodb::{bson::Document, Client};

#[tokio::main]
async fn main() {
@@ -7,7 +7,7 @@ async fn main() {

client
.database("aws")
.collection("somecoll")
.collection::<Document>("somecoll")
.find_one(None, None)
.await
.unwrap();
8 changes: 4 additions & 4 deletions README.md
@@ -97,11 +97,11 @@ for collection_name in db.list_collection_names(None).await? {
```
#### Inserting documents into a collection
```rust
use mongodb::bson::doc;
use mongodb::bson::{doc, Document};
```
```rust
// Get a handle to a collection in the database.
let collection = db.collection("books");
let collection = db.collection::<Document>("books");

let docs = vec![
doc! { "title": "1984", "author": "George Orwell" },
@@ -147,14 +147,14 @@ The driver also provides a blocking sync API. See the [Installation](#enabling-t
The various sync-specific types are found in the `mongodb::sync` submodule rather than in the crate's top level like in the async API. The sync API calls through to the async API internally though, so it looks and behaves similarly to it.
```rust
use mongodb::{
bson::{doc, Bson},
bson::{doc, Bson, Document},
sync::Client,
};
```
```rust
let client = Client::with_uri_str("mongodb://localhost:27017")?;
let database = client.database("mydb");
let collection = database.collection("books");
let collection = database.collection::<Document>("books");

let docs = vec![
doc! { "title": "1984", "author": "George Orwell" },
4 changes: 2 additions & 2 deletions src/client/mod.rs
@@ -42,7 +42,7 @@ const DEFAULT_SERVER_SELECTION_TIMEOUT: Duration = Duration::from_secs(30);
///
/// ```rust
/// # #[cfg(not(feature = "sync"))]
/// # use mongodb::{Client, error::Result};
/// # use mongodb::{bson::Document, Client, error::Result};
/// # #[cfg(feature = "async-std-runtime")]
/// # use async_std::task;
/// # #[cfg(feature = "tokio-runtime")]
@@ -56,7 +56,7 @@ const DEFAULT_SERVER_SELECTION_TIMEOUT: Duration = Duration::from_secs(30);
/// let client_ref = client.clone();
///
/// task::spawn(async move {
/// let collection = client_ref.database("items").collection(&format!("coll{}", i));
/// let collection = client_ref.database("items").collection::<Document>(&format!("coll{}", i));
///
/// // Do something with the collection
/// });
11 changes: 7 additions & 4 deletions src/client/session/test.rs
@@ -1,5 +1,6 @@
use std::{future::Future, time::Duration};

use bson::Document;
use futures::stream::StreamExt;
use tokio::sync::RwLockReadGuard;

@@ -38,7 +39,9 @@ macro_rules! db_op {
macro_rules! collection_op {
($test_name:expr, $coll:ident, $body:expr) => {
|client| async move {
let $coll = client.database($test_name).collection($test_name);
let $coll = client
.database($test_name)
.collection::<bson::Document>($test_name);
$body.await.unwrap();
}
};
@@ -285,7 +288,7 @@ async fn cluster_time_in_commands() {
cluster_time_test("aggregate", |client| async move {
client
.database(function_name!())
.collection(function_name!())
.collection::<Document>(function_name!())
.aggregate(vec![doc! { "$match": { "x": 1 } }], None)
.await
})
@@ -294,7 +297,7 @@ async fn cluster_time_in_commands() {
cluster_time_test("find", |client| async move {
client
.database(function_name!())
.collection(function_name!())
.collection::<Document>(function_name!())
.find(doc! {}, None)
.await
})
@@ -303,7 +306,7 @@ async fn cluster_time_in_commands() {
cluster_time_test("insert", |client| async move {
client
.database(function_name!())
.collection(function_name!())
.collection::<Document>(function_name!())
.insert_one(doc! {}, None)
.await
})
4 changes: 2 additions & 2 deletions src/coll/mod.rs
@@ -215,7 +215,7 @@ where
&self,
pipeline: impl IntoIterator<Item = Document>,
options: impl Into<Option<AggregateOptions>>,
) -> Result<Cursor> {
) -> Result<Cursor<Document>> {
let mut options = options.into();
resolve_options!(
self,
@@ -240,7 +240,7 @@ where
pipeline: impl IntoIterator<Item = Document>,
options: impl Into<Option<AggregateOptions>>,
session: &mut ClientSession,
) -> Result<SessionCursor> {
) -> Result<SessionCursor<Document>> {
let mut options = options.into();
resolve_options!(
self,
8 changes: 4 additions & 4 deletions src/cursor/mod.rs
@@ -43,11 +43,11 @@ use common::{GenericCursor, GetMoreProvider, GetMoreProviderResult};
///
/// ```rust
/// # use futures::stream::StreamExt;
/// # use mongodb::{Client, error::Result};
/// # use mongodb::{bson::Document, Client, error::Result};
/// #
/// # async fn do_stuff() -> Result<()> {
/// # let client = Client::with_uri_str("mongodb://example.com").await?;
/// # let coll = client.database("foo").collection("bar");
/// # let coll = client.database("foo").collection::<Document>("bar");
/// # let mut cursor = coll.find(None, None).await?;
/// #
/// while let Some(doc) = cursor.next().await {
@@ -81,7 +81,7 @@ use common::{GenericCursor, GetMoreProvider, GetMoreProviderResult};
/// # }
/// ```
#[derive(Debug)]
pub struct Cursor<T = Document>
pub struct Cursor<T>
where
T: DeserializeOwned + Unpin,
{
@@ -139,7 +139,7 @@ where
let coll = self
.client
.database(ns.db.as_str())
.collection(ns.coll.as_str());
.collection::<Document>(ns.coll.as_str());
Contributor

This is honestly not really relevant to the PR, but I'm curious for my own knowledge and for future reference for Swift: aren't some cursors not actually tied to a particular collection, e.g. one resulting from a db-level aggregate? How does this work in that case?

Contributor Author

I'm not actually sure either. From testing it out, the namespace appears to be "blah.$cmd.aggregate":

{
	"cursor" : {
		"firstBatch" : [
			{
				"_id" : {
					"id" : UUID("8b95ed9d-7d24-4880-bc4e-483510bd5046"),
					"uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
				},
				"lastUse" : ISODate("2021-04-08T22:53:20.885Z")
			}
		],
		"id" : NumberLong(0),
		"ns" : "blah.$cmd.aggregate"
	},
	"ok" : 1
}

We should probably add some test coverage of this case to ensure it's working properly. Assuming that's a valid namespace to send a killCursors to, it seems like it should be okay though.
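As a rough illustration of why "blah.$cmd.aggregate" can still be addressed as a database plus a collection name, here is a minimal sketch that splits a namespace string at its first dot. The `Namespace` struct and `parse_namespace` function are hypothetical stand-ins written for this example, not the driver's actual types; the only thing borrowed from the code above is the `db` / `coll` field naming.

```rust
/// Hypothetical stand-in for a parsed namespace (not the driver's type).
#[derive(Debug, PartialEq)]
struct Namespace {
    db: String,
    coll: String,
}

/// Splits a `db.collection` string at the FIRST dot, so any later dots stay
/// part of the collection name. Returns `None` if either half is empty or
/// if there is no dot at all.
fn parse_namespace(ns: &str) -> Option<Namespace> {
    let (db, coll) = ns.split_once('.')?;
    if db.is_empty() || coll.is_empty() {
        return None;
    }
    Some(Namespace {
        db: db.to_string(),
        coll: coll.to_string(),
    })
}
```

Under this assumption the namespace from the server reply above would yield the database "blah" and the pseudo-collection "$cmd.aggregate", which is the name a `killCursors` would then be sent against. Whether the server accepts that is exactly what the proposed test coverage would need to confirm.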

let cursor_id = self.wrapped_cursor.id();
RUNTIME.execute(async move { coll.kill_cursor(cursor_id).await });
}
8 changes: 4 additions & 4 deletions src/cursor/session.rs
@@ -24,12 +24,12 @@ use crate::{
///
/// ```rust
/// # use futures::stream::StreamExt;
/// # use mongodb::{Client, error::Result, ClientSession, SessionCursor};
/// # use mongodb::{bson::Document, Client, error::Result, ClientSession, SessionCursor};
/// #
/// # async fn do_stuff() -> Result<()> {
/// # let client = Client::with_uri_str("mongodb://example.com").await?;
/// # let mut session = client.start_session(None).await?;
/// # let coll = client.database("foo").collection("bar");
/// # let coll = client.database("foo").collection::<Document>("bar");
/// # let mut cursor = coll.find_with_session(None, None, &mut session).await?;
/// #
/// while let Some(doc) = cursor.with_session(&mut session).next().await {
@@ -40,7 +40,7 @@ use crate::{
/// # }
/// ```
#[derive(Debug)]
pub struct SessionCursor<T = Document>
pub struct SessionCursor<T>
where
T: DeserializeOwned + Unpin,
{
@@ -105,7 +105,7 @@ where
let coll = self
.client
.database(ns.db.as_str())
.collection(ns.coll.as_str());
.collection::<Document>(ns.coll.as_str());
let cursor_id = self.info.id;
RUNTIME.execute(async move { coll.kill_cursor(cursor_id).await });
}
35 changes: 8 additions & 27 deletions src/db/mod.rs
@@ -39,7 +39,7 @@ use crate::{
/// ```rust
///
/// # #[cfg(not(feature = "sync"))]
/// # use mongodb::{Client, error::Result};
/// # use mongodb::{bson::Document, Client, error::Result};
/// # #[cfg(feature = "async-std-runtime")]
/// # use async_std::task;
/// # #[cfg(feature = "tokio-runtime")]
@@ -55,7 +55,7 @@ use crate::{
/// let db_ref = db.clone();
///
/// task::spawn(async move {
/// let collection = db_ref.collection(&format!("coll{}", i));
/// let collection = db_ref.collection::<Document>(&format!("coll{}", i));
///
/// // Do something with the collection
/// });
@@ -129,45 +129,26 @@ impl Database {
self.inner.write_concern.as_ref()
}

/// Gets a handle to a collection specified by `name` of the database. The `Collection` options
/// (e.g. read preference and write concern) will default to those of the `Database`.
///
/// This method does not send or receive anything across the wire to the database, so it can be
/// used repeatedly without incurring any costs from I/O.
pub fn collection(&self, name: &str) -> Collection {
Collection::new(self.clone(), name, None)
}

/// Gets a handle to a collection with type `T` specified by `name` of the database. The
/// `Collection` options (e.g. read preference and write concern) will default to those of the
/// `Database`.
///
/// This method does not send or receive anything across the wire to the database, so it can be
/// used repeatedly without incurring any costs from I/O.
pub fn collection_with_type<T>(&self, name: &str) -> Collection<T>
pub fn collection<T>(&self, name: &str) -> Collection<T>
Contributor Author

I know we discussed the possibility of introducing a collection_with_document helper (or something like that) as an ergonomic way to retrieve a Collection<Document>, but after writing a few of these conversions, it seemed simpler / easier to just use the turbofish (::<>) operator on the existing helpers.

e.g.

let collection: Collection<Document> = db.collection("foo"); // explicit type way
let collection = db.collection::<Document>("foo"); // turbofish
let collection = db.collection_with_document("foo"); // helper

To me, the helper didn't really make it any easier to type nor did it actually make it clearer what was happening, so it seemed fine to me to just not have it. Thoughts?
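For readers unfamiliar with the two spellings, here is a self-contained sketch of the explicit-annotation and turbofish forms against stand-in types. `Document`, `Collection`, and `Database` below are simplified placeholders defined only for this example; they are not the driver's types, and only the shape of the generic `collection<T>` method mirrors the diff.

```rust
use std::marker::PhantomData;

/// Placeholder for `bson::Document` (assumption for this sketch).
struct Document;

/// Placeholder collection handle that is generic over its document type.
struct Collection<T> {
    name: String,
    _marker: PhantomData<T>,
}

/// Placeholder database handle.
struct Database;

impl Database {
    /// Generic over `T` with no default, so the caller must pin `T` down.
    fn collection<T>(&self, name: &str) -> Collection<T> {
        Collection {
            name: name.to_string(),
            _marker: PhantomData,
        }
    }
}

/// Builds the same handle twice, once per spelling, and returns both names.
fn both_forms() -> (String, String) {
    let db = Database;
    // Explicit type annotation on the binding.
    let annotated: Collection<Document> = db.collection("foo");
    // Turbofish on the method call.
    let turbofish = db.collection::<Document>("foo");
    (annotated.name, turbofish.name)
}
```

Both calls resolve `T` to the same type; the difference is purely where the type is written, which is why a dedicated helper adds little.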

Contributor

from my outsider perspective it doesn't seem to make a significant difference, but I'd defer to you two on whether collection_with_document is in any way more idiomatic

Contributor

This is fine by me. The explicit type/turbofish ways both make it clearer that the value returned is generic over Document. I also think a collection_with_document method might be confusing when we have both a document and raw document type.

where
T: Serialize + DeserializeOwned + Unpin + Debug,
{
Collection::new(self.clone(), name, None)
}

/// Gets a handle to a collection specified by `name` in the cluster the `Client` is connected
/// to. Operations done with this `Collection` will use the options specified by `options` by
/// default and will otherwise default to those of the `Database`.
///
/// This method does not send or receive anything across the wire to the database, so it can be
/// used repeatedly without incurring any costs from I/O.
pub fn collection_with_options(&self, name: &str, options: CollectionOptions) -> Collection {
Collection::new(self.clone(), name, Some(options))
}

/// Gets a handle to a collection with type `T` specified by `name` in the cluster the `Client`
/// is connected to. Operations done with this `Collection` will use the options specified by
/// `options` by default and will otherwise default to those of the `Database`.
///
/// This method does not send or receive anything across the wire to the database, so it can be
/// used repeatedly without incurring any costs from I/O.
pub fn collection_with_type_and_options<T>(
pub fn collection_with_options<T>(
&self,
name: &str,
options: CollectionOptions,
@@ -213,7 +194,7 @@ impl Database {
&self,
filter: impl Into<Option<Document>>,
options: impl Into<Option<ListCollectionsOptions>>,
) -> Result<Cursor> {
) -> Result<Cursor<Document>> {
Contributor Author

In Swift, we have this return a MongoCursor<CollectionSpecification>, and I think it would be a nice improvement here as well for 2.0. The versioned API will ensure that we won't have to worry about deserialization errors for the foreseeable future.

Contributor

sgtm (though that's probably not surprising since I already agreed to it once...)

if so, would be nice to use a DatabaseSpecification type for list_databases too (assuming listDatabases is also in API version 1?)

Contributor Author

Good call, I filed RUST-740 to cover both of them.

Contributor

I'm not super familiar with the Swift library. Would CollectionSpecification/DatabaseSpecification be structs that model the info returned from the list methods? If so, that sounds good to me.

Contributor Author

Yep, exactly.

let list_collections = ListCollections::new(
self.name().to_string(),
filter.into(),
@@ -234,7 +215,7 @@ impl Database {
filter: impl Into<Option<Document>>,
options: impl Into<Option<ListCollectionsOptions>>,
session: &mut ClientSession,
) -> Result<SessionCursor> {
) -> Result<SessionCursor<Document>> {
let list_collections = ListCollections::new(
self.name().to_string(),
filter.into(),
@@ -392,7 +373,7 @@ impl Database {
&self,
pipeline: impl IntoIterator<Item = Document>,
options: impl Into<Option<AggregateOptions>>,
) -> Result<Cursor> {
) -> Result<Cursor<Document>> {
let mut options = options.into();
resolve_options!(
self,
@@ -417,7 +398,7 @@ impl Database {
pipeline: impl IntoIterator<Item = Document>,
options: impl Into<Option<AggregateOptions>>,
session: &mut ClientSession,
) -> Result<SessionCursor> {
) -> Result<SessionCursor<Document>> {
let mut options = options.into();
resolve_options!(
self,
@@ -1,6 +1,7 @@
use std::{collections::HashMap, sync::Arc, time::Duration};

use approx::abs_diff_eq;
use bson::Document;
use semver::VersionReq;
use serde::Deserialize;
use tokio::sync::RwLockWriteGuard;
@@ -157,7 +158,7 @@ async fn load_balancing_test() {
for _ in 0..10 {
let collection = client
.database("load_balancing_test")
.collection("load_balancing_test");
.collection::<Document>("load_balancing_test");
handles.push(
RUNTIME
.spawn(async move {
4 changes: 2 additions & 2 deletions src/sync/client.rs
@@ -26,7 +26,7 @@ use crate::{
/// so it can safely be shared across threads. For example:
///
/// ```rust
/// # use mongodb::{sync::Client, error::Result};
/// # use mongodb::{bson::Document, sync::Client, error::Result};
/// #
/// # fn start_workers() -> Result<()> {
/// let client = Client::with_uri_str("mongodb://example.com")?;
@@ -35,7 +35,7 @@ use crate::{
/// let client_ref = client.clone();
///
/// std::thread::spawn(move || {
/// let collection = client_ref.database("items").collection(&format!("coll{}", i));
/// let collection = client_ref.database("items").collection::<Document>(&format!("coll{}", i));
///
/// // Do something with the collection
/// });
6 changes: 3 additions & 3 deletions src/sync/coll.rs
@@ -74,7 +74,7 @@ use crate::{
/// ```

#[derive(Clone, Debug)]
pub struct Collection<T = Document>
pub struct Collection<T>
where
T: Serialize + DeserializeOwned + Unpin + Debug + Send + Sync,
{
@@ -153,7 +153,7 @@ where
&self,
pipeline: impl IntoIterator<Item = Document>,
options: impl Into<Option<AggregateOptions>>,
) -> Result<Cursor> {
) -> Result<Cursor<Document>> {
Contributor Author

We could consider returning a Cursor<O> here (where O is a generic parameter to the aggregate method, not the T from the Collection). Or should we have a separate aggregate_with_output_type or something? I think with aggregation it's more likely that a user might just want to retrieve a document than with a find, since they may mutate the output along the way, so it's a little riskier to have this one be pure generic. Thoughts?

Contributor

Per the Slack conversation: it seems we're all on the same page that the T from the collection should not be used, since people often do transformations here.

As for whether to require that a type be specified in this method (an option I was not considering in the Slack discussion) or to just have a separate aggregate_with_type, I don't have particularly strong feelings. I agree with your comment that it does seem a bit at odds with the choice to make requiring a type the default for db.collection. OTOH, maybe it's less likely that people will bother writing custom Deserialize types for the output of single methods?

Contributor Author

Yeah, my original inclination was to just leave this as Document by default and introduce the new separate method, since I imagine it'll be pretty common to use Document at least while prototyping the pipeline. I'm starting to flip back to wanting to maintain consistency though, given how easy it is to just append a ::<Document> to the end of the aggregate call. Especially in the face of complex pipelines, handling the generic type would be a piece of cake.

However, trying this out, I noticed an interesting and frustrating technical challenge with the Cursor<O> approach: turbofish cannot be used in conjunction with impl Trait in argument position (yet, see rust-lang/rust#83701).

e.g.

fn foo<T>(bar: impl ToString) { ... }
foo::<usize>("hello"); // compile error!

yields the following compiler error:

error[E0632]: cannot provide explicit generic arguments when `impl Trait` is used in argument position
  --> src/main.rs:21:11
   |
21 |     foo::<usize>("hello");
   |           ^^^^^ explicit generic argument not allowed

So you'd have to do it the explicit type parameter way:

let x: Cursor<Document> = coll.aggregate(vec![], None).await?;

Which could be pretty frustrating.
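To make the annotation workaround concrete, here is a small self-contained sketch. `parse_all` is a hypothetical function invented for this example; it only mimics the shape under discussion (an `impl IntoIterator` argument plus an output type `O` that appears solely in the return type), and it is not part of the driver.

```rust
use std::str::FromStr;

/// Hypothetical analogue of an `aggregate` that is generic over its output:
/// the argument uses `impl Trait`, and `O` appears only in the return type.
/// Items that fail to parse as `O` are skipped.
fn parse_all<O: FromStr>(items: impl IntoIterator<Item = &'static str>) -> Vec<O> {
    items.into_iter().filter_map(|s| s.parse().ok()).collect()
}

/// Chooses `O` through a type annotation on the binding instead of turbofish,
/// which sidesteps the E0632 restriction quoted above.
fn annotated_call() -> Vec<u32> {
    let parsed: Vec<u32> = parse_all(vec!["1", "2", "x"]);
    parsed
}
```

The cost is the one already noted: the annotation forces the call onto its own `let` binding, so it cannot be chained inline.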

This will make it tough to implement the one that takes in a generic argument, since we use impl Trait in all our CRUD methods for our options types (e.g. to be able to say options instead of having to say Some(options)). This makes me think we may want to do the builder approach here and have the method be on Cursor actually, something like:

let cursor = coll.aggregate(vec![ ... ], None).await?.with_type::<MyDeserializeType>();

This is pretty ugly admittedly, but I think it's preferable to breaking consistency with how options are provided to the other CRUD methods or requiring users to use explicit type annotations instead of turbofish.
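A minimal sketch of how such a `with_type` conversion could look, assuming a cursor that buffers raw results and only interprets them when read. Everything here (`Cursor`, `collect_all`, the string-based `aggregate`) is a simplified stand-in for illustration; it is not the driver's implementation, and `FromStr` stands in for `DeserializeOwned`.

```rust
use std::marker::PhantomData;
use std::str::FromStr;

/// Stand-in cursor: holds undecoded items and a marker for the output type.
struct Cursor<T> {
    raw: Vec<String>,
    _marker: PhantomData<T>,
}

impl<T> Cursor<T> {
    fn new(raw: Vec<String>) -> Self {
        Cursor { raw, _marker: PhantomData }
    }

    /// Re-labels the cursor's output type without touching the buffered data.
    /// No `impl Trait` arguments here, so turbofish is allowed on this call.
    fn with_type<D>(self) -> Cursor<D> {
        Cursor { raw: self.raw, _marker: PhantomData }
    }
}

impl<T: FromStr> Cursor<T> {
    /// Decodes every buffered item as `T`; failures become `None`.
    fn collect_all(self) -> Vec<Option<T>> {
        self.raw.iter().map(|s| s.parse().ok()).collect()
    }
}

/// Stand-in `aggregate` that keeps its `impl Trait` argument and always
/// returns the untyped default (here `String`).
fn aggregate(pipeline: impl IntoIterator<Item = &'static str>) -> Cursor<String> {
    Cursor::new(pipeline.into_iter().map(String::from).collect())
}

/// Chains the conversion inline, as in the proposed usage.
fn retyped() -> Vec<Option<i64>> {
    aggregate(vec!["10", "oops", "30"]).with_type::<i64>().collect_all()
}
```

The design point is that the generic parameter moves from the method with `impl Trait` arguments onto a follow-up method without any, so the existing options-passing style stays untouched.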

Contributor

I agree changing the way we pass options either here or everywhere doesn't seem like a great path forward.

Is turbofish used commonly enough, compared to type annotations, that people would run into this a lot?

Adding the method to the cursor seems reasonable, though it does look clunky. It also seems consistent with the clone_with_type behavior that we currently have on Collection to change the generic type. Maybe it would be useful in other cases as well, e.g. if you provide a projection via FindOptions that changes the shape of your documents, you could call this on the resulting cursor.

Contributor

Yeah, none of these options seems ideal to me; with_type seems like it would be more ergonomic for chaining method calls together, since providing a type annotation would require breaking out the call to aggregate into a separate line. Updating our CRUD API to use the builder pattern would get rid of this issue, right?

In any case, since we're opting to go the aggregate_with_type route, I think we can defer coming to a conclusion here until we implement that (and make a decision regarding the builder pattern).

Contributor Author

I think even with the builder pattern for options/sessions we'd still have the same issue unfortunately, since we use impl IntoIterator for the pipeline.

Also, to clarify, I don't think an aggregate_with_type method will really be possible, since we'd have to ditch the impl Trait arguments, so I think we kind of have to take the Cursor::with_type approach. That being said, we don't need to implement that functionality now.

Contributor

Ah right, I forgot we use impl for the pipeline.

let pipeline: Vec<Document> = pipeline.into_iter().collect();
RUNTIME
.block_on(self.async_collection.aggregate(pipeline, options.into()))
@@ -169,7 +169,7 @@ where
pipeline: impl IntoIterator<Item = Document>,
options: impl Into<Option<AggregateOptions>>,
session: &mut ClientSession,
) -> Result<SessionCursor> {
) -> Result<SessionCursor<Document>> {
let pipeline: Vec<Document> = pipeline.into_iter().collect();
RUNTIME
.block_on(self.async_collection.aggregate_with_session(