Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial draft of spec #62

Merged
merged 33 commits into from
Mar 16, 2023
Merged

Add initial draft of spec #62

merged 33 commits into from
Mar 16, 2023

Conversation

evanstade
Copy link
Collaborator

@evanstade evanstade commented Jan 27, 2023

Latest: Preview | Diff


Preview | Diff

evanstade and others added 13 commits December 22, 2022 15:48
* Updates around user visibility of buckets.

* Add self as author.

* Clarify wording.

* Add bucket durability and IndexedDB text.

* algorithm tag

Co-authored-by: Evan Stade <[email protected]>
* Spec for creating a bucket
* fix english
* Address comments
* Add caches and getDirectory text. (#47)

* Updates around user visibility of buckets.

* Add self as author.

* Clarify wording.

* Add bucket durability and IndexedDB text.

* algorithm tag

* caches and getDirectory

* remove abbreviation

Co-authored-by: Evan Stade <[email protected]>

* Revert "Add caches and getDirectory text. (#47)" (#49)

This reverts commit a2fa5bc.

* Algorithm for keys()

* fix append

Co-authored-by: Evan Stade <[email protected]>
Co-authored-by: Evan Stade <[email protected]>
* Add caches and getDirectory text. (#47)

* Updates around user visibility of buckets.

* Add self as author.

* Clarify wording.

* Add bucket durability and IndexedDB text.

* algorithm tag

* caches and getDirectory

* remove abbreviation

Co-authored-by: Evan Stade <[email protected]>

* Revert "Add caches and getDirectory text. (#47)" (#49)

This reverts commit a2fa5bc.

Co-authored-by: Evan Stade <[email protected]>
Co-authored-by: Evan Stade <[email protected]>
* Enhance IDB, CacheStorage algorithms.

Also distinguish between the conceptual bottle and the StorageBottle object in the
open() definition.

* algo tags

Co-authored-by: Evan Stade <[email protected]>
* Add more algorithms for StorageBucket attributes.

* reference StorageEstimate object

Co-authored-by: Evan Stade <[email protected]>
* Enhancements.

1. Specify when expired buckets are removed.

2. Improve some text around durability, quota.

* fix some link errors

* or equal to

* Review updates

Co-authored-by: Evan Stade <[email protected]>
* Write persist/persisted algos

* no backticks on true

Co-authored-by: Evan Stade <[email protected]>
* Write persist/persisted algos

* no backticks on true

* Fill in some error types.

* Indentation fixes

* Fix redundant step.

* Streamline delete bucket.

Co-authored-by: Evan Stade <[email protected]>
* Add Clear Site Data.

Most of this text needs to be worked into various parts of the CSD spec.

* Better annotations

Co-authored-by: Evan Stade <[email protected]>
Copy link

@wanderview wanderview left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good start!

In addition to the comments on the PR, I had one comment on the existing text as well:

  • The storagebucket getter steps say to return "this's StorageBucketManager". But this refers to a Navigator or WorkerNavigator and the environment settings object owns the StorageBucketManager. So I think this should be "this's relevant settings object's StorageBucketManager".

index.bs Outdated

1. Let |storageKey| be the result of running [=obtain a storage key=] given |environment|.

1. If |storageKey| is failure, then [=exception/throw=] a "{{SecurityError}}" {{DOMException}} and abort these steps.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why SecurityError DOMException here instead of just TypeError?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it, a TypeError is for when there's an invalid parameter. I believe this text was pretty much lifted from IDB, e.g. https://www.w3.org/TR/IndexedDB/#dom-idbfactory-deletedatabase

However I agree it is a bit odd because the steps to obtain a storage key don't seem to mention anything about a secure context, just opaque origins and I can't tell exactly what those are. Implementation wise, you can see that navigator.storageBuckets is restricted to SecureContexts and there's an opaque origin check as well which throws this security error (seems to be true across all storage APIs).

@ayuishii

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specifically has been updated to retrieve |shelf| instead, and updated to return UnknownError... but following the guidance given here about error mappings [1], which mentions that it recommends specific errors depending on the scenario. Although this is probably not followed consistently in practice, curious about your thoughts and if there is more I can learn about the TypeError recommendation?

[1] whatwg/fs#57 (comment)

index.bs Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated

1. If |permission| is "{{PermissionState/granted}}", then set |persisted| to true.

1. Let |bucket| be the [=/storage bucket=] named |name| in |storageKey| or null otherwise.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very informal. I think we should probably define a data structure to more strictly define the relationships. For example, see how service workers defines and uses "registration map":

https://w3c.github.io/ServiceWorker/#dfn-scope-to-registration-map

You could have something similar keyed by StorageKey containing a BucketMap or something.

Or better yet, it looks like a lot of this data structure might already be defined in the storage spec. Maybe you just need to reference those data structures (possible with monkey patches if necessary):

https://storage.spec.whatwg.org/#ref-for-storage-bucket%E2%91%A2

I see you reference some of these types below, like [=/storage bucket=]. So maybe we just need to work in the references to the bucket map, etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried addressing this by updating to use [=bucket map=] from the local storage shelf. But let me know if it doesn't look like what you expected. Thanks!

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated

<div algorithm>

A {{StorageBucket}} has an {{IDBFactory}} object. The <dfn attribute for=StorageBucket>indexedDB</dfn> getter steps are:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like every time someone accesses the getter we will be creating a new IDBFactory, CacheStorage, etc. We probably want to create these when the bucket is created and just return them instead? Or create them once, cache, and then then return the cached values later? I think most of these getters have SameObject on them, so we can't return new ones each time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, done for IDB and caches. I think those are the only two that have this issue.


1. Let |bucket name| be the [=code-unit-substring-by-positions|code unit substring=] from 8 to |end| of |type|.

1. If the result of [=validate a bucket name=] with |bucket name| is failure, then abort these steps.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a clear use case to clear specific buckets via clear-site-data header? Do we have folks interested in this willing to help us test it? This seems more complicated than I expected for a first pass.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is not particularly complicated, FWIW. Given the examples here it seems useful --- e.g. in the case of a user logging out, some user-specific bucket should be cleared, but shared buckets should not be cleared. Having separate buckets for each logged-in user in a multi-login use case is one example use case we often cite for storage buckets.

I don't know of any partners who have asked for it and don't know the history of this idea @ayuishii

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back at historical discussions, looks like this was a request from @asakusuma from LinkedIn back in 2019. Not sure if they would be active during the Origin Trial for testing though. Haven't heard of other partners that have shown interest otherwise.

Context: https://docs.google.com/document/d/1eBWhY91nUfdT2mys3GaNKX4fKPei79Wk-KlWHYffbAA/edit?usp=sharing

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Seems fair to keep. Thanks for the context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wanderview we're not in a position right this moment to do an origin trial, but we are definitely still wanting to be able to use clear-site-data to remotely clear things like just a service worker without collateral damage to other storage items.

evanstade and others added 11 commits February 12, 2023 18:46
* Address review feedback

* additional review comment

---------

Co-authored-by: Evan Stade <[email protected]>
* Address review feedback

* additional review comment

---------

Co-authored-by: Evan Stade <[email protected]>
* Address review feedback

* additional review comment

---------

Co-authored-by: Evan Stade <[email protected]>
Co-authored-by: Evan Stade <[email protected]>
* Use TypeError

* Update index.bs

* null or undefined

* s/and/, then/

* backticks instead of <code>
* durability value

* quota value

* expiration value
* quota value in bytes

* code point

* Operate open bucket on storage bucket map
* Expand usage of bucket map

* remove the

* Use bucket map for clear site data

* short-circuit if bucket is already persistent
* Update time

* Update time 2

* check for null bucket expiration

* parse duration string

* Fix link to html spec
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved

1. Let |bucket name| be the [=code-unit-substring-by-positions|code unit substring=] from 8 to |end| of |type|.

1. If the result of [=validate a bucket name=] with |bucket name| is failure, then abort these steps.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Seems fair to keep. Thanks for the context.

* Update to use TypeError instead of UnknownError

* Remove obtaining |storageKey|

* Fix parallel steps

* Fix dfn for storage usage

* Fix parallel steps for keys() & delete()

* Fix delete()

* Handle undefined quota

* Use wall time

* UA bucket clearing wording

* Add bucket to bucket map on creation

* Fix indent
* Add algorithm for bucket removal

* Update index.bs

* update algorithms with remove

* Fix wording

* Fix wording #2
Copy link

@wanderview wanderview left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is really close. I still just have some questions about the expiration time stuff. Sorry if I'm confused.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved

<div algorithm>

The <dfn method for="StorageBucket">setExpires(|expires|)</dfn> method steps are:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit confused between durations and moments before. I thought this was taking an absolute time moment, but I see in the text above there was some attempt to use durations.

Should we have setExpires() using absolute time moment and then also setExpiresByDuration() using a duration value to compute the expires time?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I added to the confusion because I was also confused when writing this draft 😅 . I think I'm leaning toward updating everything to moments, and having setExpires() also use absolute time moment. WDYT?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense. We just have to do some conversion from the value the js passes in for "expires" into moments. I think if we treat that value as a duration that is added to the unix epoch then we will have a moment we can use for comparisons.

cc @noamr as the expert who I believe wrote the time spec infrastructure.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually @jysasskin wrote that (big) bit.
Not sure why you need to convert it to a moment here. A DOMHighResTimestamp that represents the duration from the unix epoch should suffice for any comparison.

Moments are useful when you get some internal value and need to ensure it's coarsened etc. That's why there is no duration->moment conversion, only moment->duration. @jysasskin can correct/confirm :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my confusion. @jyasskin, can you help here?

I thought we needed a conversion because [=current wall time=] returns the result from [=coarsen time=] which is a moment. Is there something that defines how timestamps compare against moments? AFAIK this spec PR does not say anywhere that the timestamp is relative to the unix epoch, so it seems ambiguous to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest using [=epoch-relative timestamp=] instead of [=current wall time=]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since epoch-relative timestamps are just untyped numbers that get confused between their interpretations as epoch-relative, page-load-relative, and durations between other points in time, if a spec is trying to represent a point in time, I prefer to use moments inside of specifications. Good wording for that, given an expires initialized by Date.now() + duration_ms, would probably be Let |expiration| be |expires| milliseconds after the [=Unix epoch=].

I have less of a strong preference for how to express the Javascript interface, at least until Temporal is ready. I'd lean toward a setMaxAge(duration_ms) so readers don't have to wonder if it's a duration or a timestamp, and then Let |expiration| be |duration_ms| milliseconds after [=this=]'s [=relevant settings object=]'s [=environment settings object/current wall time=].

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the discussion and examples! I updated the wording to match |expires| milliseconds after the [=Unix epoch=].

I think my personal preference is to use timestamp (over duration), but with no strong reasoning. This might be nice to pull out in a separate issue though to easily be able to reference later. I've created an issue here (#79) to continue the discussion it this works for folks.

* Address comments

* Fix bucket removal step
* Add Service Worker deletion to issue
* Remove string parsing step for expires

* Update duration to moment

* get or expire a bucket

* Allow UA bucket deletion on expiration

* Nit
index.bs Outdated Show resolved Hide resolved
Copy link

@wanderview wanderview left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! I think we just need to figure out how to convert the expires input value into a moment correctly in order to compare against the wall clock's current time moment.

index.bs Outdated Show resolved Hide resolved

<div algorithm>

The <dfn method for="StorageBucket">setExpires(|expires|)</dfn> method steps are:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense. We just have to do some conversion from the value the js passes in for "expires" into moments. I think if we treat that value as a duration that is added to the unix epoch then we will have a moment we can use for comparisons.

cc @noamr as the expert who I believe wrote the time spec infrastructure.

Copy link
Member

@jyasskin jyasskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on the use of time:

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved

<div algorithm>

The <dfn method for="StorageBucket">setExpires(|expires|)</dfn> method steps are:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since epoch-relative timestamps are just untyped numbers that get confused between their interpretations as epoch-relative, page-load-relative, and durations between other points in time, if a spec is trying to represent a point in time, I prefer to use moments inside of specifications. Good wording for that, given an expires initialized by Date.now() + duration_ms, would probably be Let |expiration| be |expires| milliseconds after the [=Unix epoch=].

I have less of a strong preference for how to express the Javascript interface, at least until Temporal is ready. I'd lean toward a setMaxAge(duration_ms) so readers don't have to wonder if it's a duration or a timestamp, and then Let |expiration| be |duration_ms| milliseconds after [=this=]'s [=relevant settings object=]'s [=environment settings object/current wall time=].

index.bs Outdated Show resolved Hide resolved
ayuishii and others added 2 commits March 13, 2023 17:49
Co-authored-by: Jeffrey Yasskin <[email protected]>
* expiration time

* Specify moment on the wall clock

* Use [=Unix epoch=]

* Mark -> set

* Fix expiration comparison
Copy link
Member

@jyasskin jyasskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The time bits look good. I didn't look hard at the rest of the spec.

@@ -11,6 +11,24 @@ Editor: Ayu Ishii, Google https://www.google.com/, [email protected]
Former Editor: Victor Costan
!Participate: <a href="https://github.com/WICG/storage-buckets">GitHub WICG/storage-buckets</a> (<a href="https://github.com/WICG/storage-buckets/issues/new">new issue</a>, <a href="https://github.com/WICG/storage-buckets/issues?state=open">open issues</a>)
Abstract: The Storage Buckets API provides a way for sites to organize locally stored data into groupings called "storage buckets". This allows the user agent or sites to manage and delete buckets independently rather than applying the same treatment to all the data from a single origin.
Markup Shorthands: css no, markdown yes
</pre>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend using Assume Explicit For: yes so that you can't accidentally use [=current wall time=] without noticing that it's [=environment settings object/current wall time=] and needs a settings object, and also so that other specs don't introduce errors into this one by defining new fields with an existing name but a unique for. That'll probably require some fixup, so it's best to do it in a different PR, and I also don't insist..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! I'll try doing this is a follow-up.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated

1. If |bucket|'s [=storage bucket/removed=] flag is true, [=reject=] |p| with an {{InvalidStateError}}.

1. Otherwise, set |bucket|'s [=StorageBucket/expiration time=] to |expires| milliseconds after the [=Unix epoch=] and [=/resolve=] |p|.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's surprising that this method returns a Promise but never goes in parallel. If this is to accommodate UAs that might want to send IPCs to change bucket attributes, you should spec it to resolve the promise from a parallel context, which is an observable change for callers, because it involves queueing a new task. If there's no reason to use a Promise, it's simpler not to.

Copy link

@wanderview wanderview Mar 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommended that they do this. They had parallel steps, but accessing context stuff that can't be used on a parallel task. And in spec world all the other stuff is sync, but in implementation world there will be async IPCs. Do we just queue a task to resolve the promise and do nothing else? This seems unnecessary since resolving a promise will already result in an async microtask.

Also, it does seem reasonable to make an API shape support async with a promise for future proofing even if some implementations can by synchronous today.

Is there a recommended best practice here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's queue a task, with a note about the implementation detail that requires it. It's possible to write JavaScript that cares whether it's just a microtask or a whole task, so we don't want to write a spec that guarantees an immediate resolution if our implementation actually lets the event loop spin.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On future proofing, I think Hyrum's Law makes us do more than just returning a promise to actually give future-us extra flexibility. Alex Russell has argued that Promises are enough to let us add permission prompts, and I think that's wrong for the same reason. Exactly what more we need to do will depend on the situation, and the fact that our implementation is actually async is enough here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think there might be a number of methods that need to be updated for this guidance. FYI @ayuishii.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the discussion! I've updated relevant methods to use queue a task, hopefully I did this correctly. Please let me know if you spot any issues. Thanks!

ayuishii and others added 2 commits March 15, 2023 12:45
Use "before" instead of "less than or equal"

Co-authored-by: Jeffrey Yasskin <[email protected]>
* Use [=queue a task=]

* Additional changes
index.bs Outdated Show resolved Hide resolved
index.bs Outdated

1. Otherwise, [=/resolve=] |p| with false.
1. Otherwise, [=queue a task=] to [=/resolve=] |p| with false.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there are two steps that both resolve the promise like this I could see moving them into a parallel steps section. But this is probably fine too. I don't want to make you go back and forth too much.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDB seemed to follow this pattern so decided on this route, although not sure if it was the best example. Happy to update in a follow-up though if this isn't preferred.

index.bs Show resolved Hide resolved
Copy link

@wanderview wanderview left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few remaining comments, but i think the spec is looking quite good. Thanks for all your work on this! I'm happy for this to land and we can tweak minor things later if necessary.

* Fix persist() to not reference [=this=] in parallel steps

* Note on promises
@ayuishii
Copy link
Collaborator

Thanks @wanderview @jyasskin for all the in-depth reviews! I expect there's still some more tweaks to come, but I appreciate all your attention and patience while iterating on this initial draft. 🙏

@ayuishii ayuishii merged commit 5126125 into gh-pages Mar 16, 2023
@ayuishii ayuishii deleted the spec-draft branch March 16, 2023 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants