Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Network module #204

Merged
merged 2 commits into from
Mar 30, 2023
Merged

Add Network module #204

merged 2 commits into from
Mar 30, 2023

Conversation

jgraham
Copy link
Member

@jgraham jgraham commented May 9, 2022

This is designed to support logging network requests. Important use
cases include:

  • Determing if a specific network request occured
  • Monitoring request performance
  • Generating a HAR file for offline analysis of requests

This does not currently attempt to support network request
interception (i.e. mutable access to requests), but the intent is that
the same lifecycle events could be used in a blocking way.

It also currently only supports HTTP-type requests and not
e.g. WebSockets. Some support for integration with service workers is
also missing.

The typical order of events is

network.beforeRequestSent - before request is sent, would later be the
right point to change the request headers or body or prevent the
request entirely.

network.responseStarted - after response headers are received, but
before body. Would later be a point to override the response headers
or body.

network.responseCompleted - after the request is fully complete.

network.fetchError - After any error that will prevent the request
from completing.

Compared to CDP this is missing an event for data being
received. Compared to WebExtensions, this is missing an event after
the headers are sent but before the data is sent.


Preview | Diff

@jgraham
Copy link
Member Author

jgraham commented May 9, 2022

This only has CDDL so far; no prose or integration with the fetch spec.

The main question at this stage is about the API shape; whether it seems reasonable and what's missing to support the use cases we want to support.

(CCing some people who might be interested in giving initial feedback @sadym-chromium @bwalderman @shs96c @mathiasbynens )

index.bs Outdated
[=Remote end definition=] and [=local end definition=]

<pre class="cddl local-cddl">
NetworkFetchTimingInfo = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in parts similar to https://w3c.github.io/navigation-timing/#the-performancetiming-interface, is the intention that the definitions be taken from that spec?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! So in my ideal world this would be exactly like performance timing, since that will be easy to specify and implement. But I think performance timing is a bit different from what's in CDP (https://chromedevtools.github.io/devtools-protocol/tot/Network/#type-ResourceTiming) so some input on what's actually required here would be very helpful. @janodvarko might also have some input on what's required here e.g. for HAR export.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance timing spec is missing "requestEnd" attribute that could be used to calculate time needed to send a request: requestEnd - requestStart == wait in HAR spec. (this might be especially useful for requests sending POST data).

CDP has both sendStart and sendEnd, so from this perspective CDP timing is closer to HAR (i.e. it has all HAR needs)

HAR also needs onLoad and domContentLoaded timings. Those are defined by "Navigation Timing interface" https://w3c.github.io/navigation-timing/#dom-performancenavigationtiming

Does CDP has an interface for "NavigationTiming"?

HAR spec request timings: http://www.softwareishard.com/blog/har-12-spec/#timings
HAR spec page timings: http://www.softwareishard.com/blog/har-12-spec/#pageTimings

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CDP has some events in the Page domain that allow getting timestamps for load-related events e.g. https://chromedevtools.github.io/devtools-protocol/tot/Page/#event-domContentEventFired . I think we could add timestamps to similar events in browsingContext for that part of the use case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just filed #271 to add timestamps to browsingContext related load events.

@css-meeting-bot
Copy link
Member

The Browser Testing and Tools Working Group just discussed Network Module.

The full IRC log of that discussion <jgraham> Topic: Network Module
<brwalder> jgraham: Possible compromise. Allow strong reference on root object. If strong ref is not requested, objectIds are used in the serialization for cycles only, not for identification.
<jgraham> GitHub: https://github.com//pull/204
<simonstewart> q+
<brwalder> jgraham: PR draft is up for logging and HAR file scenario. Pretty close to what's in CDP, also considering what extensions do. Please take a look, add comments. Once we have consensus, we can update the Fetch spec and proceed with the PR.
<jgraham> ack sadym
<brwalder> sadym: Took a look and noticed some differences in how redirection is handled vs how CDP handles it. Also some possible issues with cookie handling.
<brwalder> jgraham: Redirect handling may need another look. There is also an issue in the WebDriver spec for cookie and post body encoding. The proposal for cookies follows what WebExtensions does, where there is either a utf8 value field or a binary field.
<jgraham> ack simonstewart
<brwalder> simonstewart: **Confirmed that this will be the future basis for interception and we won't need to add a new module like CDP.
<brwalder> simonstewart: How are responses handled? Provide by default or opt in? Also how are chunks handled? Do you get multiple events?
<brwalder> jgraham: Not all edge cases are considered yet. These are good questions for the PR.
<brwalder> jgraham: Plan for interception is to add an "isBlocked" field that the client can check and act on. Unclear whether CDPs two separate APIs are a historical accident or a design choice.
<foolip> q?

@jgraham
Copy link
Member Author

jgraham commented May 23, 2022

On redirects: it seems like what CDP does is just sends the network.responseReceivedExtraInfo event, and not the network.responseReceived event when a response will be a redirect. Presumably the advantage of this is that the client that doesn't care about the individual redirects can just listen for network.responseReceived and ignore the intermediate netowrk.responseReceivedExtraInfo events. You also get a network.requestWillBeSent[ExtraInfo] pair per request, with the redirectResponse property of the event set to the network.Response data you would have got from a network.ResponseReceived event if there had been one.

@jgraham
Copy link
Member Author

jgraham commented Jun 22, 2022

So, for redirects I have two possible suggestions: we either have a property on events so that they're identified as a redirect, or we have different event names for response/request pairs caused by redirects.

In the former case, clients that just want to know when a request is complete and don't care about following the full chain have to listen for all the events, but can filter out any with (e.g.) an isRedirect flag set.

In the latter case we end up with redirectResponseStarted and beforeSendRedirectRequest events (or some similar names) and it has the advantage that clients can just not subscribe to those events if they don't care about redirects, but the disadvantage that we're making things more complex for clients that do care about following the redirect chain.

@jgraham
Copy link
Member Author

jgraham commented Jun 23, 2022

For chunked responses, it seems like CDP does "nothing". You get the same events, no indication of where the chunks were, and no access to trailing headers (tested with http://wpt.live/xhr/resources/chunked.py) I guess this is not a very common case that devtools need to support.

@whimboo whimboo added needs-discussion Issues to be discussed by the working group module-network Network module labels Jun 23, 2022
@css-meeting-bot
Copy link
Member

The Browser Testing and Tools Working Group just discussed Network request logging.

The full IRC log of that discussion <jgraham> topic: Network request logging
<jgraham> github: https://github.com//pull/204
<simonstewart> jgraham: updates on the previous discussion: draft PR up with a rough sketch of the API for logging network events. Is a prelude for network interception
<simonstewart> jgraham: last time, the two main questions were about redirects, and what to do with chunked responses. Research has been done!
<simonstewart> jgraham: Chunked responses are not supported in any meaningful way anywhere. In particular, trailing headers are inaccessible in CDP and Firefox devtools
<simonstewart> jgraham: Redirects do need to be supported. CDP has two separate events which it sends for normal requests and responses containing extra info. If you get a redirect, you only get the extra info event
<simonstewart> jgraham: Similarly, the request for a redirect only gets the extra-info event.
<simonstewart> jgraham: option 1 is to do what CDP does here. When we think about this, there are a few use cases.
<simonstewart> jgraham: 1/ People who don't care about redirects, and just want to read the response body. 2/ when you do care about redirects because you want to debug network connections
<simonstewart> jgraham: if we copy the CDP model, case 2 would register for everything, and case 1 for only one event
<simonstewart> jgraham: Two options in the PR. 1/ if it's a redirect, there's a flag in the body. 2/ we have two completely separate sets of events: one for a user initiated request, and another for response/request pairs which are part of a redirect
<simonstewart> jgraham: so if you use case 2, you'd register for both events.
<jgraham> q?
<patrickangle> q+
<jgraham> ack patrickangle
<simonstewart> patrickangle: looking at the PR, it seems like it would be better for clients and people using the APIs if all network events were treated the same way.
<simonstewart> patrickangle: people want to know that a network event has occurred, and having a single set of events makes this clearer.
<simonstewart> jgraham: I'm not sure. My concern with a single set of events is whether we want to support use-case 1.
<simonstewart> q+
<jgraham> ack simonstewart
<jgraham> simonstewart: For a user having fewer events to subscribe to makes a simpler model. In our experience people are interested in redirects, even if they don't really want to log that.
<jgraham> q?
<simonstewart> jgraham: sounds like people want one event with a flag on it, rather than multiple different events.
<simonstewart> jgraham: would be interesting to know how CDP ended up with two events....
<jgraham> q?
<simonstewart> jgraham: if anyone knows, please update the PR

index.bs Outdated Show resolved Hide resolved
@jgraham jgraham force-pushed the network_logging branch 2 times, most recently from ae26fff to b1e5ffa Compare July 27, 2022 12:24
Copy link
Contributor

@juliandescottes juliandescottes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some additional fly-by comments based on a prototype building HAR files using the current events.

There's also the cache entry which does not correspond to anything currently defined in the events, but I think all fields are marked as optional (https://w3c.github.io/web-performance/specs/HAR/Overview.html#cache).

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated
Comment on lines 3825 to 3826
navigation: Navigation / null,
context: BrowsingContext / null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For HAR generation, we normally have to map each request to a specific "page" (https://w3c.github.io/web-performance/specs/HAR/Overview.html#pages)

I am wondering if navigation + context will be enough information to match a request with a specific page. Imagine a tab navigates from pageA to pageB. For a late (non-navigation related) request triggered in pageA, we would not get a navigation information, only context. And since the context id remains identical for a top-level browsing context across navigations, we might have to "guess" the page based on when we process this event. This might work but feels racy and I know it's already a complaint users have for HAR creation in Firefox.

Exposing an id unique to the window global could help here, although that's something we would need to also expose in browsing context events?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, this is a good point. I wonder if we should defer this to after we have support for returning Window objects from scripts; I think that will end up with some kind of per-Window id along with the context id, and maybe we could reuse that here. Or we could use the id of the realm in which the request originated (although arguably that wouldn't make sense for a non-scripted request orginating in a document).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can leave this for later as well. In practice we should be able to infer the page from the time of the request for now (or we can leave the page information out in a first iteration as well)

@foolip
Copy link
Member

foolip commented Aug 24, 2022

@jgraham this PR was mentioned (by @whimboo) in Berlin. It's a draft, which is mentioned in comments above, but would you like review to get it out of draft, or do you want to do some more work first?

@jgraham jgraham marked this pull request as ready for review September 2, 2022 12:17
@jgraham
Copy link
Member Author

jgraham commented Sep 2, 2022

OK, I've marked this as ready for review, although it depends on an integration in the fetch spec that currently doesn't exist. In the meantime I'd appreciate review on the API design and featureset, in the hope that we can have a pretty solid idea of what we need in Fetch before getting review there.

Copy link
Contributor

@sadym-chromium sadym-chromium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the fact WebDriver Classic can be referenced for the action specification.
Wrong PR, sorry

@foolip
Copy link
Member

foolip commented Sep 12, 2022

@whimboo is the needs-discussion label still needed?

@whimboo
Copy link
Contributor

whimboo commented Sep 12, 2022

@whimboo is the needs-discussion label still needed?

Yes, this PR still needs to be discussed and we put it onto the agenda for TPAC.

@css-meeting-bot
Copy link
Member

The Browser Testing and Tools Working Group just discussed Network logging.

The full IRC log of that discussion <AutomatedTester> topic: Network logging
<AutomatedTester> git
<AutomatedTester> github: https://github.com//pull/204
<AutomatedTester> jgraham (IRC): we have a PR up for introducing a network module that gives you a series of events to get netwrok activity
<AutomatedTester> ... it's limited to HTTP like connections only
<AutomatedTester> ... the use cases we are targeting are
<AutomatedTester> ... check whether a specific resource was requested
<AutomatedTester> ... and generating HAR files for perf monitoring
<AutomatedTester> ... we have discussed this in previous meetings which has shaped this
<AutomatedTester> ... we have added 1 set of events unlike the way CDP handles this
<AutomatedTester> ... and if we get a redirect we would add a flag to show that
<AutomatedTester> ... the API and how it works is still to be written
<AutomatedTester> ... and at this point what i would like to do is make sure that we don't have issues with the initial design before we start on the fetch integration
<AutomatedTester> ... otherwise it will be a lot of work to go back and redo it
<AutomatedTester> q?
<AutomatedTester> Devin Rousso: i like that this is simpler, it's nice
<AutomatedTester> automatedtester: Where is this in your priority list?
<AutomatedTester> jgraham (IRC): this is pretty high in my list.
<AutomatedTester> ... we would liek to start prototyping this
<sadym> q+
<AutomatedTester> ... from our point it is not blocked on the fetch work that needs to happen but it would be good to get agreement on the api etc
<AutomatedTester> ack next
<AutomatedTester> sadym (IRC): how would the interception of redirects work since there is only 1 event/
<AutomatedTester> jgraham (IRC): for logging we discussed in a previous group meeting
<AutomatedTester> ... the rough concensus then was to have 1 event and flags on it
<AutomatedTester> ... instead of events for all of the redirect chain
<AutomatedTester> ... assuming we are happy with that design
<AutomatedTester> ... [describes fields that could be added to the event]
<AutomatedTester> Devin Rousso: would the request ID be maintained the whole time?
<AutomatedTester> jgraham (IRC): I hadn't thought about that
<AutomatedTester> Devin Rousso: [describes how it works in web inspector]
<AutomatedTester> Devin Rousso: so as long as we use the same request iD for the network journey. For redirects we want to have the same ID or have a field saying it's a redirect
<AutomatedTester> ... i prefer the former but can make either work
<AutomatedTester> q?
<AutomatedTester> jgraham (IRC): if you look through the current PR. there are TODO items in there that need answers. We can discuss
<sadym> @foolip is about to review the PR
<AutomatedTester> ... or people could answer on the PR
<AutomatedTester> Devin Rousso: generally speaking we like this

@whimboo whimboo removed the needs-discussion Issues to be discussed by the working group label Sep 23, 2022
@whimboo whimboo requested a review from foolip September 23, 2022 07:32
@whimboo
Copy link
Contributor

whimboo commented Sep 23, 2022

As per action items from the W3C meeting @foolip wanted to review the PR.

@jgraham mind resolving the conflicts that meanwhile exist? Thanks.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
@jgraham jgraham force-pushed the network_logging branch 2 times, most recently from 1eb7520 to e3953da Compare October 25, 2022 12:54
@jgraham
Copy link
Member Author

jgraham commented Nov 7, 2022

I updated the naming to match #319 so I've resolved all the issues relating to that, assuming that PR will land before this one.

index.bs Outdated Show resolved Hide resolved
index.bs Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
jgraham added 2 commits March 29, 2023 16:44
This is designed to support logging network requests. Important use
cases include:

* Determing if a specific network request occured
* Monitoring request performance
* Generating a HAR file for offline analysis of requests

This does not currently attempt to support network request
interception (i.e. mutable access to requests), but the intent is that
the same lifecycle events could be used in a blocking way.

It also currently only supports HTTP-type requests and not
e.g. WebSockets. Some support for integration with service workers is
also missing.

The typical order of events is

network.beforeRequestSent - before request is sent, would later be the
right point to change the request headers or body or prevent the
request entirely.

network.responseStarted - after response headers are received, but
before body. Would later be a point to override the response headers
or body.

network.responseCompleted - after the request is fully complete.

network.fetchError - After any error that will prevent the request
from completing.

Compared to CDP this is missing an event for data being
received. Compared to WebExtensions, this is missing an event after
the headers are sent but before the data is sent.
@jgraham jgraham merged commit e97c0af into master Mar 30, 2023
@jgraham jgraham deleted the network_logging branch March 30, 2023 10:27
index.bs Show resolved Hide resolved
thiagowfx pushed a commit to web-platform-tests/wpt that referenced this pull request May 25, 2023
tlsEnd never made it to the spec:
https://w3c.github.io/webdriver-bidi/#type-network-FetchTimingInfo

c.f. w3c/webdriver-bidi#204:

> tlsEnd: float this should be the same as connectEnd

As it is a redundant param anyway, remove it from WPT.
@jgraham
Copy link
Member Author

jgraham commented May 25, 2023

@juliandescottes

thiagowfx pushed a commit to web-platform-tests/wpt that referenced this pull request May 25, 2023
tlsEnd never made it to the spec:
https://w3c.github.io/webdriver-bidi/#type-network-FetchTimingInfo

c.f. w3c/webdriver-bidi#204:

> tlsEnd: float this should be the same as connectEnd

As it is a redundant param anyway, remove it from WPT.
@thiagowfx
Copy link
Member

Another follow-up here: network.responseStarted does not map to CDP (c.f. https://chromedevtools.github.io/devtools-protocol/tot/Network/). There's no straightforward way to do it in CDP.

The closest event is Network.responseReceivedExtraInfo, however:

  • (i) it's experimental
  • (ii) it only includes headers - all other needed fields are not available
  • (iii) it's not guaranteed to be emitted, and even if it is, there's no guarantee it comes before Network.responseReceived:

Fired when additional information about a responseReceived event is available from the network stack. Not every responseReceived event will have an additional responseReceivedExtraInfo for it, and responseReceivedExtraInfo may be fired before or after responseReceived.

What is the motivation to include this in the spec? Can we revisit if it is absolutely necessary?

(cc @sadym-chromium as we discussed)

@thiagowfx
Copy link
Member

Also cc @OrKoN to confirm the above statement from the CDP side.

@OrKoN
Copy link
Contributor

OrKoN commented May 31, 2023

To the extend of my knowledge, this is right: it does not look like there is an equivalent in CDP (responseReceivedExtraInfo is not exactly that and cannot be used for interception if that is the plan for the event).

@whimboo
Copy link
Contributor

whimboo commented May 31, 2023

Could we please discuss that on a new issue? Using a PR which has been closed already is not a good place. Thanks.

@thiagowfx
Copy link
Member

Could we please discuss that on a new issue? Using a PR which has been closed already is not a good place. Thanks.

Sure! I'll just open a new issue and link to the existing PR then, sorry.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Jun 13, 2023
…mingInfo, a=testonly

Automatic update from web-platform-tests
bidi: remove tlsEnd from network.FetchTimingInfo (#40223)

tlsEnd never made it to the spec:
https://w3c.github.io/webdriver-bidi/#type-network-FetchTimingInfo

c.f. w3c/webdriver-bidi#204:

> tlsEnd: float this should be the same as connectEnd

As it is a redundant param anyway, remove it from WPT.
--

wpt-commits: 1d684466a3f028910b5b479c41093310622a9b66
wpt-pr: 40223
ErichDonGubler pushed a commit to erichdongubler-mozilla/firefox that referenced this pull request Jun 14, 2023
…mingInfo, a=testonly

Automatic update from web-platform-tests
bidi: remove tlsEnd from network.FetchTimingInfo (#40223)

tlsEnd never made it to the spec:
https://w3c.github.io/webdriver-bidi/#type-network-FetchTimingInfo

c.f. w3c/webdriver-bidi#204:

> tlsEnd: float this should be the same as connectEnd

As it is a redundant param anyway, remove it from WPT.
--

wpt-commits: 1d684466a3f028910b5b479c41093310622a9b66
wpt-pr: 40223
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Jun 16, 2023
…mingInfo, a=testonly

Automatic update from web-platform-tests
bidi: remove tlsEnd from network.FetchTimingInfo (#40223)

tlsEnd never made it to the spec:
https://w3c.github.io/webdriver-bidi/#type-network-FetchTimingInfo

c.f. w3c/webdriver-bidi#204:

> tlsEnd: float this should be the same as connectEnd

As it is a redundant param anyway, remove it from WPT.
--

wpt-commits: 1d684466a3f028910b5b479c41093310622a9b66
wpt-pr: 40223

UltraBlame original commit: c8e9e4cd8d2ee59811381fa6df6a45750e2cbb1e
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Jun 16, 2023
…mingInfo, a=testonly

Automatic update from web-platform-tests
bidi: remove tlsEnd from network.FetchTimingInfo (#40223)

tlsEnd never made it to the spec:
https://w3c.github.io/webdriver-bidi/#type-network-FetchTimingInfo

c.f. w3c/webdriver-bidi#204:

> tlsEnd: float this should be the same as connectEnd

As it is a redundant param anyway, remove it from WPT.
--

wpt-commits: 1d684466a3f028910b5b479c41093310622a9b66
wpt-pr: 40223

UltraBlame original commit: c8e9e4cd8d2ee59811381fa6df6a45750e2cbb1e
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Jun 16, 2023
…mingInfo, a=testonly

Automatic update from web-platform-tests
bidi: remove tlsEnd from network.FetchTimingInfo (#40223)

tlsEnd never made it to the spec:
https://w3c.github.io/webdriver-bidi/#type-network-FetchTimingInfo

c.f. w3c/webdriver-bidi#204:

> tlsEnd: float this should be the same as connectEnd

As it is a redundant param anyway, remove it from WPT.
--

wpt-commits: 1d684466a3f028910b5b479c41093310622a9b66
wpt-pr: 40223

UltraBlame original commit: c8e9e4cd8d2ee59811381fa6df6a45750e2cbb1e
@christian-bromann
Copy link
Member

It also currently only supports HTTP-type requests and not
e.g. WebSockets. Some support for integration with service workers is
also missing.

@jgraham do we have an issue for this? Should we start tracking this?

@whimboo
Copy link
Contributor

whimboo commented Jun 26, 2024

It also currently only supports HTTP-type requests and not
e.g. WebSockets. Some support for integration with service workers is
also missing.

@jgraham do we have an issue for this? Should we start tracking this?

@christian-bromann I would suggest that you file new issues. I cannot find any existing ones covering those topics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module-network Network module
Projects
None yet
Development

Successfully merging this pull request may close these issues.