Single Table Design #528

Open
jpmtrabbold opened this issue Sep 23, 2021 · 6 comments
Labels
enhancement New feature or request

Comments

@jpmtrabbold

Any ideas on how to use dyngoose with a single table design approach?

@benhutchins
Owner

I love the concept. There have been some tickets about this same topic in the past (but a quick search didn't help me pull them up). There are plenty of use-cases where a single-table design with DynamoDB is a great solution.

It's certainly possible support could be added within Dyngoose. Conceptually I think I know how I would implement the "functionality"; the issue I have is around all the typing definitions. I really focused on ensuring Dyngoose provides intelligent typing as much as possible, to catch as many issues as possible during compilation. A single table design would end up with an array of possible sub types. It would be simple enough to allow Dyngoose to manage the primary keys, and possibly add some hidden attributes which Dyngoose could use to make sure it understands the "document-subtype". It would likely also need to become more of a classic ORM: on global tables it would have to prevent the use of any global secondary index where the projection isn't set to ALL, so we can be sure we always have the complete record available.

For the typing definitions, I wouldn't touch the existing Dyngoose.Table class, I think it would be smarter to implement a new Dyngoose.SingleTable or similar. The two classes might be refactored to both extend from some kind of parent class for common methods, but there wouldn't be a ton… instead, the SingleTable would then allow you to design the document sub types and indexes for the table, then when reading it, you'd end up with either a flat array of mixed types or a mapped array of document sub type to an array of documents.
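
A minimal sketch of the "mapped array of document sub type to an array of documents" idea, written in plain TypeScript with no Dyngoose imports. The `_t` attribute, the interfaces, and `groupBySubType` are all hypothetical names chosen for illustration; they only show how a hidden type attribute could let the compiler tell the sub types apart.

```typescript
// Hypothetical document shapes; `_t` plays the role of the hidden sub type attribute.
interface CustomerDoc { _t: 'customer'; id: string; sk: string; name: string }
interface OrderDoc { _t: 'order'; id: string; sk: string; product: string; price: number }
type Doc = CustomerDoc | OrderDoc

// Mapped type: each sub type name points at an array of exactly that sub type.
type Grouped = { [K in Doc['_t']]: Extract<Doc, { _t: K }>[] }

function groupBySubType(docs: Doc[]): Grouped {
  const grouped: Grouped = { customer: [], order: [] }
  for (const doc of docs) {
    // Checking the discriminant narrows `doc`, so each push is type-checked.
    if (doc._t === 'customer') grouped.customer.push(doc)
    else grouped.order.push(doc)
  }
  return grouped
}

// A flat array of mixed types, as a single-table read might return it.
const mixed: Doc[] = [
  { _t: 'customer', id: 'c1', sk: '#', name: 'ben' },
  { _t: 'order', id: 'c1', sk: 'ORD#1', product: 'toast with jam', price: 5 },
]
const grouped = groupBySubType(mixed)
```

Here `grouped.order` is typed as `OrderDoc[]`, so accessing `grouped.order[0].product` compiles while `grouped.order[0].name` would not.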

I'd likely have to experiment to see the best developer experience. I've tried to focus on how Dyngoose is used, focusing on the API first and then designing the library to meet how I'd like the API to work. I believe it has led to a very user-friendly and intuitive library (my opinion is biased, I am aware). I'd like to continue that approach if single table design support is implemented to make it extremely easy to utilize every aspect of the experience.

As it stands though, I am using Dyngoose in a production environment, but my application would not benefit substantially from single-table design support so I have not implemented it. I am open to contributions if you were thinking of adding the functionality. If you are not thinking of taking on the task, I'd like to keep this issue open to allow others to show interest in the functionality. If Dyngoose continues to grow and support for this feature grows, I will certainly take on the task.

@jpmtrabbold
Author

That's awesome, Ben. I totally agree that Dyngoose is indeed very intuitive - I would even say the most intuitive so far! (and I tested lots of other dynamodb orms before arriving at Dyngoose) - so I commend your intention of keeping it on the same path.

The reason I asked is that I'm building a new SaaS for my company and I'd like to benefit from the partition key access when querying related entities. I'm also considering using typedorm as it was developed with single-table design in mind. As much as I would love to take on that task, my stakeholders want to go to market quickly so I can't really afford that!

You made a beautiful piece of software and it's so great already the way it is. If at some point it aligns with your interests to bring it to support single-table design, wow it would be unstoppable! The idea that you posed on how to do it makes sense to me.

All the best!

@benhutchins
Owner

@jpmtrabbold I've started toying with this a bit, designing the API implementation and creating some tests. I'll plan to open a branch to share the ongoing development soon. I am not sure what the timeline is for being able to release it, but I wanted to get an idea of what it'll take to add the functionality. I'll share another update soon.

@jpmtrabbold
Author

@benhutchins that's exciting!

@benhutchins
Owner

So after playing with a few ways this could be built, I've settled on how the API will operate.

I played with an inheritance model a bit: you'd have a single/master table class and then the various entities would extend it. This works well for inheriting primary key attributes and a few other global attributes you might share across the various entities, but it makes querying a mess and I didn't like how it would work.

I ended up finding an intelligent way to make this work, I believe. The implementation will look like this:

@Dyngoose.$Table({
  name: 'master-table',

  // Dyngoose will store the type of each entity as a special attribute, this will allow it to know
  // which type of entity a document is and load it into the right class. I know this could be done
  // by parsing out some prefix from the sort key, but I think having an explicit field has more advantages
  entityTypeAttributeName: 'type', // defaults to something like '_t'
})
class MasterTable extends Dyngoose.Table {
  @Dyngoose.$PrimaryKey('id', 'sk')
  public static readonly primaryKey: Dyngoose.Query.PrimaryKey<MasterTable, string, string>

  // you can also define additional indexes like normal

  // ID will be a unique ID for the customer in this example
  @Dyngoose.Attribute.String()
  public id: string

  @Dyngoose.Attribute.String()
  public sk: string

  // you can define the entity type attribute optionally, in case you want to query by it or something
  @Dyngoose.Attribute.String()
  public type: 'customer' | 'order'
}

@Dyngoose.$Entity({
  name: 'customer',
  table: MasterTable,
})
class Customer extends Dyngoose.Entity<MasterTable> {
  // you'll have to duplicate the attributes from the parent table here,
  // they'll be automatically set for you so no need to repeat the @Dyngoose.Attribute decorators
  public id: string

  // but you can override the attributes here
  @Dyngoose.Attribute.String({
    // this is set so the customer will come before any of the orders when sorting
    default: () => '#',
  })
  public sk: string

  @Dyngoose.Attribute.String()
  public name: string

  @Dyngoose.Attribute.Date()
  public dob: Date
}

@Dyngoose.$Entity({
  name: 'order',
  table: MasterTable,
})
class Order extends Dyngoose.Entity<MasterTable> {
  public id: string

  // a new composite attribute type can help build sort key values
  // this can be used on tables and entities, it can be useful for some GSIs
  @Dyngoose.Attribute.Composite({
    // you can use prefix of text: or attr: … I haven't found a smarter way to handle this
    // this is set so the customer will load before any of the customer's orders
    join: ['text:CUS', 'attr:createdAt'],
  })
  public sk: string

  @Dyngoose.Attribute.String()
  public product: string

  @Dyngoose.Attribute.Number()
  public price: number

  @Dyngoose.Attribute.Date()
  public createdAt: Date
}

// to create new records:
Customer.new({
  id: 'ben was here',
  name: 'ben',
})

Order.new({
  id: 'ben was here',
  product: 'toast with jam',
  price: 5.00,
  createdAt: new Date(),
})

Order.new({
  id: 'ben was here',
  product: 'cold brew coffee, lg',
  price: 5.00,
  createdAt: new Date(), // note: this would cause a conflict based on my set up, this is for demonstration purposes only
})

// you now have your tables and entities
const records = await MasterTable.primaryKey.query({
  id: 'ben was here',
})

// records comes back as a QueryOutput… in this case it will be a list of MasterTable records with the only known attributes being id and sk
// but additional methods will exist on QueryOutput

// toMap returns a map, using the entity class as the key
const map: Map<typeof Customer | typeof Order, (Customer | Order)[]> = records.toMap<Customer, Order>()
const customer = map.get(Customer)![0]
const orders = map.get(Order)

// toObject returns an object using the entity name as a key
const object: { customer: Customer[], order: Order[] } = records.toObject<Customer, Order>()
const customerFromObject = object.customer[0]
const ordersFromObject = object.order

// toArray returns an array of entities, converting the array of MasterTable to entities
const array: (Customer | Order)[] = records.toArray<Customer | Order>()

// these methods are available on all query outputs, so you can also use a search:
const onlyOrders = (await MasterTable.primaryKey.search({
  id: 'ben was here',
  type: 'order',
})).toArray<Order>()
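
As a rough illustration of what `toArray`, `toMap`, and `toObject` describe above, here is a self-contained sketch that does not use Dyngoose at all. The `RawItem` shape, the plain `Customer` and `Order` classes, and the `hydrate` helper are assumptions made for the example; the real `QueryOutput` methods may work quite differently.

```typescript
// Plain stand-ins for the entity classes (no decorators, no Dyngoose).
class Customer {
  constructor(public id: string, public sk: string, public name: string) {}
}
class Order {
  constructor(public id: string, public sk: string, public product: string, public price: number) {}
}

// Hypothetical raw item: the `type` attribute says which entity it belongs to.
type RawItem =
  | { type: 'customer'; id: string; sk: string; name: string }
  | { type: 'order'; id: string; sk: string; product: string; price: number }

// Load a raw item into the matching class based on its type attribute.
function hydrate(item: RawItem): Customer | Order {
  return item.type === 'customer'
    ? new Customer(item.id, item.sk, item.name)
    : new Order(item.id, item.sk, item.product, item.price)
}

// Flat array of mixed entities.
function toArray(items: RawItem[]): (Customer | Order)[] {
  return items.map(hydrate)
}

// Map keyed by the entity class itself (note the `typeof` in the key type).
function toMap(items: RawItem[]): Map<typeof Customer | typeof Order, (Customer | Order)[]> {
  const map = new Map<typeof Customer | typeof Order, (Customer | Order)[]>()
  for (const entity of toArray(items)) {
    const key = entity instanceof Customer ? Customer : Order
    const bucket = map.get(key) ?? []
    bucket.push(entity)
    map.set(key, bucket)
  }
  return map
}

// Object keyed by the entity name.
function toObject(items: RawItem[]): { customer: Customer[]; order: Order[] } {
  const result: { customer: Customer[]; order: Order[] } = { customer: [], order: [] }
  for (const entity of toArray(items)) {
    if (entity instanceof Customer) result.customer.push(entity)
    else result.order.push(entity)
  }
  return result
}

const rawItems: RawItem[] = [
  { type: 'customer', id: 'ben was here', sk: '#', name: 'ben' },
  { type: 'order', id: 'ben was here', sk: 'CUS#1', product: 'toast with jam', price: 5 },
  { type: 'order', id: 'ben was here', sk: 'CUS#2', product: 'cold brew coffee, lg', price: 5 },
]
const entityMap = toMap(rawItems)
const entityObject = toObject(rawItems)
```

The map is keyed by constructor, so its key type has to be `typeof Customer | typeof Order` rather than the instance types.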

My main issue is that I still don't like the querying. The issue is that the master table would know about the entities, but I don't have a way to get TypeScript definitions to know about the entities. I tried a few approaches, and it causes issues when I try to reference the entities as a type in the master table class, since the master table class needs to be defined before the entities are defined. I could possibly resolve this by creating a query class, something like:

@Dyngoose.$TableQuery({
  table: MasterTable,
})
export class MasterTableQuery extends Dyngoose.TableQuery<MasterTable, Customer, Order> {
}

You could then use the get, query, search, and other methods from MasterTableQuery which would all have the proper type definitions. I believe it'd make sense to offer this class, which would make development easier when you have the proper types working. I'd likely come up with a better name for this concept than TableQuery.
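
To illustrate why a separate query class sidesteps the ordering problem, here is a small sketch in plain TypeScript. `TypedQuery`, `queryOnly`, and the `isOrder` guard are hypothetical names, and an in-memory array stands in for DynamoDB; the point is only that a class declared after the entities can take them as a type parameter, so the table never has to reference them.

```typescript
// Attributes every record in the table shares.
interface Entity { id: string; sk: string; type: string }
interface Customer extends Entity { type: 'customer'; name: string }
interface Order extends Entity { type: 'order'; product: string; price: number }

// Hypothetical typed query wrapper, declared after the entities exist.
class TypedQuery<E extends Entity> {
  constructor(private readonly items: E[]) {}

  // Every record sharing the partition key, typed as the union of entities.
  query(id: string): E[] {
    return this.items.filter((item) => item.id === id)
  }

  // Same query, narrowed to one entity type by a user-supplied type guard.
  queryOnly<S extends E>(id: string, guard: (item: E) => item is S): S[] {
    return this.query(id).filter(guard)
  }
}

// Type guard based on the entity type attribute.
const isOrder = (item: Customer | Order): item is Order => item.type === 'order'

const masterTableQuery = new TypedQuery<Customer | Order>([
  { id: 'ben was here', sk: '#', type: 'customer', name: 'ben' },
  { id: 'ben was here', sk: 'CUS#1', type: 'order', product: 'toast with jam', price: 5 },
  { id: 'someone else', sk: 'CUS#1', type: 'order', product: 'tea', price: 3 },
])

const everything: (Customer | Order)[] = masterTableQuery.query('ben was here')
const onlyOrders: Order[] = masterTableQuery.queryOnly('ben was here', isOrder)
```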

It could probably support querying by entity too, something like:

This idea would mean that Entity and Table would both extend from a document model class, which handles all the regular property and attribute set/get/remove methods and other utilities, but the table class would still handle all the work of talking to DynamoDB. If you call .save on an entity, it would use the table's Dyngoose.DocumentClient to put the entity, so there is no need to recreate everything there. There wouldn't be any querying methods on the Entity classes, at least not in the first implementation; you'd instead have to query from the master table, and you can optionally filter by entity type.
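
A rough sketch of that class layout, assuming nothing about Dyngoose internals: `DocumentModel`, `MasterTableStub`, and `CustomerEntity` are hypothetical names, and a static in-memory array stands in for the table's document client.

```typescript
// Hypothetical shared base class: attribute get/set/remove lives here.
class DocumentModel {
  private readonly attributes = new Map<string, unknown>()

  set(name: string, value: unknown): this {
    this.attributes.set(name, value)
    return this
  }

  get(name: string): unknown {
    return this.attributes.get(name)
  }

  remove(name: string): this {
    this.attributes.delete(name)
    return this
  }

  toJSON(): Record<string, unknown> {
    const out: Record<string, unknown> = {}
    this.attributes.forEach((value, key) => { out[key] = value })
    return out
  }
}

// Stand-in for the table: the only class that "talks to the database".
class MasterTableStub extends DocumentModel {
  static readonly items: Record<string, unknown>[] = []

  static put(item: Record<string, unknown>): void {
    MasterTableStub.items.push(item)
  }

  save(): void {
    MasterTableStub.put(this.toJSON())
  }
}

// The entity has no persistence logic of its own; save delegates to the table.
class CustomerEntity extends DocumentModel {
  static readonly entityName = 'customer'

  save(): void {
    // Stamp the entity type attribute, then let the table do the write.
    MasterTableStub.put({ ...this.toJSON(), type: CustomerEntity.entityName })
  }
}

new CustomerEntity().set('id', 'ben was here').set('sk', '#').set('name', 'ben').save()
```

Keeping the write path in one place means an entity never needs its own client configuration; it only contributes its attributes and its type name.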

I've played around a bit with this and built the composite attribute type functionality, which is useful on its own, so I'll likely release that separately to make it available sooner; I just need to add some test coverage for it.
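
For readers wondering how a `join` list such as `['text:CUS', 'attr:createdAt']` might be resolved into a sort key, here is a minimal sketch. The `buildComposite` function, the `#` separator, and the ISO formatting of dates are all assumptions for illustration; the thread does not say how the released composite attribute joins or formats its parts.

```typescript
// Hypothetical resolver for a composite attribute's `join` list.
// `text:` parts are copied literally, `attr:` parts are read from the record.
function buildComposite(
  parts: string[],
  record: Record<string, unknown>,
  separator = '#', // assumed separator, not taken from the thread
): string {
  return parts
    .map((part) => {
      if (part.startsWith('text:')) {
        return part.slice('text:'.length)
      }
      if (part.startsWith('attr:')) {
        const name = part.slice('attr:'.length)
        const value = record[name]
        if (value === undefined) {
          throw new Error(`missing attribute "${name}" for composite key`)
        }
        // Dates become ISO strings so the keys sort chronologically.
        return value instanceof Date ? value.toISOString() : String(value)
      }
      throw new Error(`unknown composite part prefix: ${part}`)
    })
    .join(separator)
}

const sortKey = buildComposite(
  ['text:CUS', 'attr:createdAt'],
  { createdAt: new Date('2021-09-23T00:00:00.000Z') },
)
// sortKey is 'CUS#2021-09-23T00:00:00.000Z'
```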

What do you think about this implementation approach @jpmtrabbold? It's not quite first-class support, but I think it would work out well.

@jpmtrabbold
Author

Hey man! Sorry for the delay - I see that you want to make an API that is similar to and compatible with the multi-table approach, so impressive effort on that. But if we think of the functionality first, I think it could be a bit safer. If you have a look here, you can make safer queries, and the PK and SK are more declarative and obvious (which is a big deal in single table design). I like your annotations system better though. Hard eh? Appreciate the effort - I think you are onto something here.

@benhutchins added the enhancement (New feature or request) label on Apr 8, 2023