<pre class="metadata">
Title: Self-Review Questionnaire: Security and Privacy
Status: ED
TR: https://www.w3.org/TR/security-privacy-questionnaire/
ED: https://w3ctag.github.io/security-questionnaire/
Shortname: security-privacy-questionnaire
Repository: w3ctag/security-questionnaire
Level: None
Editor: Theresa O’Connor, w3cid 40614, Apple Inc. https://apple.com, [email protected]
Editor: Pete Snyder, w3cid 109401, Brave https://brave.com, [email protected]
Former Editor: Jason Novak, Apple Inc., https://apple.com
Former Editor: Lukasz Olejnik, Independent researcher, https://lukaszolejnik.com
Former Editor: Mike West, Google Inc., [email protected]
Group: tag
Markup Shorthands: css no, markdown yes
Local Boilerplate: status yes
Abstract: This document contains a set of questions to be used when
evaluating the security and privacy implications of web platform
technologies.
</pre>
<h2 id="intro">Introduction</h2>
When designing new features for the Web platform,
we must always consider the security and privacy implications of our work.
New Web features should always
maintain or enhance
the overall security and privacy of the Web.
This document contains a set of questions
intended to help <abbr title="specification">spec</abbr> authors
as they think through
the security and privacy implications
of their work.
It also documents mitigation strategies
that spec authors can use to address
security and privacy concerns they encounter as they work on their spec.
This document is itself a work in progress,
and there may be security or privacy concerns
which this document does not (yet) cover.
Please [let us know](https://github.com/w3ctag/security-questionnaire/issues/new)
if you identify a security or privacy concern
this questionnaire should ask about.
<h3 id="howtouse">How To Use The Questionnaire</h3>
Spec authors should work through these questions
early on in the design process,
when things are easier to change.
When privacy and security issues are only found later,
after the feature has shipped,
it's much harder to change the design.
If security or privacy issues are found late,
user agents may need to adopt breaking changes
to protect their users' privacy and security.
These questions should be kept in mind throughout work on any specification.
Spec authors should periodically revisit this questionnaire
to continue to consider the privacy and security implications of
their features, as their design changes over time.
<h3 id=reviews>TAG and PING reviews and this questionnaire</h3>
When authors request
a [review](https://github.com/w3ctag/design-reviews)
from the [Technical Architecture Group (TAG)](https://www.w3.org/2001/tag/),
the TAG asks that authors provide answers
to the questions in this document.
The [Privacy Interest Group (PING)](https://www.w3.org/Privacy/IG/)
also considers answers to these questions
while conducting
[privacy reviews](https://github.com/w3cping/privacy-reviews/issues).
The TAG and PING use this document
to record security and privacy questions
which come up during our reviews.
Working through these questions can save
both spec authors and the people performing design reviews
a lot of time.
To make it easier for anyone requesting a review
to provide their answers to these questions to the reviewers,
we've prepared [a list of these questions in Markdown](https://raw.githubusercontent.com/w3ctag/security-questionnaire/master/questionnaire.markdown).
<h2 id="questions">Questions to Consider</h2>
<h3 class=question id="purpose">
What information might this feature expose to Web sites or other parties,
and for what purposes is that exposure necessary?
</h3>
Just because information can be exposed to the web doesn’t mean that it
should be. How does exposing this information to an origin benefit a user?
Is the benefit outweighed by the potential risks? If so, how?
In answering this question, it often helps to ensure that the use cases your
feature and specification enable are made clear in the specification
itself, so that TAG and PING understand the feature-privacy tradeoffs
being made.
<h3 class=question id="minimum-data">
Is this specification exposing the minimum amount of information necessary
to power the feature?
</h3>
Regardless of what data is being exposed, is the specification exposing the
bare minimum necessary to achieve the desired use cases? If not, why not and
why expose the additional information?
<h3 class=question id="personal-data">
How does this specification deal with personal information,
personally-identifiable information, or information derived from them?
</h3>
Personal information is data about a user (for example, a home address) or
information that could be used to identify a user (for example, an alias or
email address). This is distinct from personally identifiable information
(PII), as the exact definition of what’s considered PII varies from
jurisdiction to jurisdiction.
If the specification under consideration exposes personal information or PII
or their derivatives that could still identify an individual to the web, it’s
important to consider ways to mitigate the obvious impacts. For instance:
* A feature which uses biometric data (fingerprints or retina scans)
should refuse to expose the raw data to the web, instead using the raw
data only to unlock some origin-specific and ephemeral secret and
transmitting that secret instead.
* Including a factor of user mediation should be considered, in order to
ensure that no data is exposed without a user’s explicit choice (and
hopefully understanding). One way to achieve this may be the use of the
Permissions API [[PERMISSIONS]], or additional dialogs like those in the
Payment Request API [[PAYMENT-REQUEST]].
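The first mitigation above can be sketched as follows. This is only an illustration under stated assumptions: `unlock_origin_secret`, `device_key`, and `session_nonce` are hypothetical names, and HMAC-SHA-256 is just one possible way to derive an origin-specific value; nothing referenced in this document mandates it. The point is that the raw biometric data is used only for a local match, and what reaches the web is a secret bound to one origin and one session.

```python
import hashlib
import hmac
from typing import Optional


def unlock_origin_secret(
    biometric_matched: bool,
    device_key: bytes,
    origin: str,
    session_nonce: bytes,
) -> Optional[bytes]:
    """Return an origin-specific, per-session secret, or None if the local match failed.

    `biometric_matched` is the result of a comparison done entirely on the
    device; the raw fingerprint or retina data is never passed in or returned.
    """
    if not biometric_matched:
        return None
    # Bind the secret to the origin and to a fresh nonce so that it differs
    # between origins and between sessions (hypothetical construction).
    message = origin.encode("utf-8") + b"\x00" + session_nonce
    return hmac.new(device_key, message, hashlib.sha256).digest()
```

Because the origin is part of the derivation input, two origins that both ask for a secret receive unrelated values and cannot use them to correlate the user.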
<h3 class=question id="sensitive-data">
How does this specification deal with sensitive information?
</h3>
Just because data is not personal information or PII, that does not mean
that it is not sensitive information; moreover, whether any given information
is sensitive may vary from user to user. Data that may be sensitive
includes financial data, credentials, health information, and location.
When this data is exposed to the web, steps should be taken to
mitigate the risk of exposing it.
<p class=example>
Credential Management [[CREDENTIAL-MANAGEMENT-1]] allows sites to request
a user's credentials from a user agent's password manager in order to
sign the user in quickly and easily. This opens the door for abuse, as
a single XSS vulnerability could expose user data trivially to
JavaScript. The Credential Management API mitigates
the risk by offering the username and password as only an opaque
{{FormData}} object which cannot be directly read by JavaScript
and strongly suggests that authors use Content Security Policy [[CSP]]
with reasonable [=connect-src=] and [=form-action=]
values to further mitigate the risk of exfiltration.
</p>
<p class=example>
Geolocation information can serve many use cases at a much less granular
precision than the user agent can offer. For instance, a restaurant
recommendation can be generated by asking for a user’s city-level
location rather than a position accurate to the centimeter.
</p>
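The precision reduction in the geolocation example can be sketched as below. The helper name `coarsen_position` and the default of one decimal place (roughly 11 km of latitude) are illustrative assumptions, not values taken from any specification.

```python
from typing import Tuple


def coarsen_position(
    latitude: float, longitude: float, decimals: int = 1
) -> Tuple[float, float]:
    """Round coordinates so that they identify only a coarse area."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude out of range")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude out of range")
    # Fewer decimal places means a larger area and less identifying detail.
    return (round(latitude, decimals), round(longitude, decimals))
```

A restaurant recommender that receives `coarsen_position(48.858370, 2.294481)` still learns which city the user is in, but not which building.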
<p class=example>
A Geofencing proposal [[GEOFENCING-EXPLAINED]] ties itself to service workers and
therefore to encrypted and authenticated origins.
</p>
<h3 class=question id="persistent-origin-specific-state">
Does this specification introduce new state for an origin that persists
across browsing sessions?
</h3>
Allowing an origin to persist data on a user’s device across browsing
sessions introduces the risk that this state may be used to track a user
without their knowledge or control, in either first-party or third-party
contexts. New state persistence mechanisms should not be introduced without
mitigations to prevent them from being used to track users across domains,
or without giving users control over clearing this state. Are there specific
caches that a user agent should specially consider?
<p class=example>
Service Workers [[SERVICE-WORKERS]] intercept all requests made by an
origin, allowing sites to function perfectly even when offline. A
maliciously-injected service worker, however, would be devastating (as
documented in [[SERVICE-WORKERS#security-considerations]]).
The specification mitigates the risks that an [=active network attacker=] or
[=XSS=] vulnerability presents by requiring an encrypted and authenticated
connection in order to register a service worker.
</p>
<p class=example>
Platform-specific DRM implementations might expose origin-specific
information in order to help identify users and determine whether they
ought to be granted access to a specific piece of media. These kinds of
identifiers should be carefully evaluated to determine how abuse can be
mitigated; identifiers which a user cannot easily change are very
valuable from a tracking perspective, and protecting the identifiers from
an active network attacker is an important concern.
</p>
<p class=example>
Cookies, `ETag`, `Last-Modified`, {{localStorage}}, {{indexedDB}}, etc. all
allow an origin to store information about a user, and retrieve it later,
directly or indirectly. User agents mitigate the risk that these kinds of
storage mechanisms will form a persistent identifier by offering users the
ability to wipe out the data contained in these types of storage.
</p>
<h3 class=question id="underlying-platform-data">
What information from the underlying platform, e.g. configuration data, is
exposed by this specification to an origin?
</h3>
Is the information exposed from the underlying platform consistent
across origins? This includes but is not limited to information relating to
the user configuration, system information including sensors, and
communication methods.
When a specification exposes specific information about a host to an origin,
if that information changes rarely and is not variable across origins, then
it can be used to uniquely identify a user across two origins, either
directly because any given piece of information is unique or because the
combination of disparate pieces of information is unique and can be used to
form a fingerprint [[FINGERPRINTING-GUIDANCE]]. Specifications and user agents
should treat the risk of fingerprinting by carefully considering the surface
of available information, and the relative differences between software and
hardware stacks. Sometimes reducing fingerprintability may be as simple as
ensuring consistency, e.g. ordering the list of fonts, but sometimes it may be
more complex.
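Two ideas from the paragraph above can be sketched in a few lines. Both function names are hypothetical. `canonical_font_list` shows the "ensure consistency" mitigation; `surprisal_bits` shows why combining attributes is dangerous, under the simplifying assumption that the attributes are statistically independent (real attributes often are not).

```python
import math
from typing import Iterable, List


def canonical_font_list(installed_fonts: Iterable[str]) -> List[str]:
    """Return fonts deduplicated and sorted, so installation order is not exposed."""
    return sorted(set(installed_fonts))


def surprisal_bits(share_of_users: float) -> float:
    """Bits of identifying information revealed by a value that this share of users has."""
    if not 0.0 < share_of_users <= 1.0:
        raise ValueError("share_of_users must be in (0, 1]")
    return -math.log2(share_of_users)
```

For independent attributes the bits add up: a value shared by half of all users (1 bit) combined with one shared by a quarter (2 bits) narrows the user down to one in eight (3 bits).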
Such information should not be revealed to an origin without a user’s
knowledge and consent, barring mitigations in the specification that prevent
the information from being uniquely identifying or from being used to
exfiltrate data unexpectedly.
<p class=example>
The `RENDERER` string exposed by some WebGL implementations
improves performance in some kinds of applications, but does so at the
cost of adding persistent state to a user's fingerprint. These kinds of
device-level details should be carefully weighed to ensure that the costs
are outweighed by the benefits.
</p>
<p class=example>
The {{NavigatorPlugins}} list exposed via the DOM practically never
changes for most users. Some user agents have taken steps to reduce the
entropy this introduces by [disallowing direct enumeration of the plugin list](https://bugzilla.mozilla.org/show_bug.cgi?id=757726).
</p>
<h3 class=question id="sensor-data">
Does this specification allow an origin access to sensors on a user’s
device?
</h3>
If so, what kind of sensors and information derived from those sensors does
this standard expose to origins?
Information from sensors may serve as a fingerprinting vector across origins.
In addition, a sensor may reveal something about the user's device or
environment, and that fact itself might be what is sensitive. Moreover, as
technology advances, mitigations in place at the time a specification is
written may have to be reconsidered as the threat landscape changes.
Sensor data might even become a cross-origin identifier when the sensor reading
is relatively stable, for example for short time periods (seconds, minutes, even days), and
is consistent across origins. In fact, if two user agents expose the same
sensor data the same way, it may become a cross-browser, possibly even a cross-device identifier.
<p class=example>
For example, as gyroscopes advanced, their sampling rate had to be lowered
to prevent them from being used as a microphone
[[GYROSPEECHRECOGNITION]].
</p>
<p class=example>
Ambient light sensors (ALS) could allow an attacker to exfiltrate whether or
not a user had visited given links [[OLEJNIK-ALS]].
</p>
<p class=example>
Even relatively short-lived data, like the battery status, may be able to
serve as an identifier if misused or abused [[OLEJNIK-BATTERY]].
</p>
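The examples above share a common pattern: the less frequent and the less precise a reading is, the less it can leak. The sketch below illustrates that pattern only; `coarsen_readings` and its parameters are hypothetical, and the interval and step values a real specification should pick depend on its own threat analysis.

```python
from typing import Iterable, List, Tuple


def coarsen_readings(
    readings: Iterable[Tuple[int, int]],
    min_interval_ms: int,
    step: int,
) -> List[Tuple[int, int]]:
    """Rate-limit and quantize (timestamp_ms, value) sensor readings.

    Readings that arrive less than `min_interval_ms` after the last reading
    kept are dropped, and each kept value is rounded down to a multiple of
    `step`.
    """
    result: List[Tuple[int, int]] = []
    last_time = None
    for timestamp_ms, value in readings:
        if last_time is not None and timestamp_ms - last_time < min_interval_ms:
            continue  # too soon after the previous reading: lower the sampling rate
        result.append((timestamp_ms, (value // step) * step))
        last_time = timestamp_ms
    return result
```

For a battery level reported as an integer percentage, a `step` of 10 leaves only eleven possible values, which makes the reading far less useful as a short-lived identifier.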
<h3 class=question id="other-data">
What data does this specification expose to an origin? Please also
document what data is identical to data exposed by other features, in the
same or different contexts.
</h3>
As noted in [[#sop-violations]], the [=same-origin policy=] is an
important security barrier that new features need to carefully consider.
If a specification exposes details about another origin's state, or allows
POST or GET requests to be made to another origin, the consequences can be
severe.
<p class=example>
Content Security Policy [[CSP]] unintentionally exposed redirect targets
cross-origin by allowing one origin to infer details about another origin
through violation reports (see [[HOMAKOV]]). The working group eventually
mitigated the risk by reducing a policy's granularity after a redirect.
</p>
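The mitigation in the CSP example can be sketched as follows. This is a deliberately simplified model, not the CSP matching algorithm itself (which also handles wildcards, scheme-only sources, and more); `url_matches_source` and its `after_redirect` flag are hypothetical names. It only shows the idea of matching on origin alone once a redirect has happened, so that a violation report cannot reveal the path another origin redirected to.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def _origin_parts(url: str):
    """Return (scheme, host, port), filling in the default port for http(s)."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def url_matches_source(url: str, source: str, after_redirect: bool) -> bool:
    """Simplified source match: origin always, path only before a redirect."""
    if _origin_parts(url) != _origin_parts(source):
        return False
    if after_redirect:
        # Reduced granularity: the path is ignored after a redirect.
        return True
    source_path = urlsplit(source).path
    url_path = urlsplit(url).path
    if source_path in ("", "/"):
        return True
    if source_path.endswith("/"):
        return url_path.startswith(source_path)
    return url_path == source_path
```

With a policy source of `https://example.com/app/`, a direct request to `https://example.com/private/x.js` does not match, but the same URL reached through a redirect does, because only the origin is compared.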
<p class=example>
Beacon [[BEACON]] allows an origin to send POST requests to an endpoint
on another origin. Its authors decided that this feature didn't add any new
attack surface above and beyond what normal form submission entails, so
no extra mitigation was necessary.
</p>
<h3 class=question id="string-to-script">
Does this specification enable new script execution/loading mechanisms?
</h3>
* HTML Imports [[HTML-IMPORTS]] create a new script-loading mechanism, using
<{link}> rather than <{script}>, which might be easy to overlook when
evaluating an application's attack surface. The working group noted this
risk, and ensured that HTML Imports required reasonable interactions with
Content Security Policy's [=script-src=] directive.
* New string-to-script mechanism? (e.g. {{eval()}} or {{setTimeout()}})
* What about style?
<h3 class=question id="remote-device">
Does this specification allow an origin to access other devices?
</h3>
If so, what devices does this specification allow an origin to access?
Accessing other devices, both via network connections and via
direct connection to the user's machine (e.g. via Bluetooth,
NFC, or USB), could expose vulnerabilities: some of
these devices were not created with web connectivity in mind and may be inadequately
hardened against malicious input or against use on the web.
Exposing other devices on a user’s local network also carries significant privacy
risk:
* If two user agents have the same devices on their local network, an
attacker may infer that the two user agents are running on the same host
or are being used by two separate users who are in the same physical
location.
* Enumerating the devices on a user’s local network provides significant
entropy that an attacker may use to fingerprint the user agent.
* If the specification exposes persistent or long lived identifiers of
local network devices, that provides attackers with a way to track a user
over time even if a user takes steps to prevent such tracking (e.g.
clearing cookies and other stateful tracking mechanisms).
* Direct connections might also be used to bypass security checks that
other APIs would provide. For example, attackers used the WebUSB API to
access other sites' credentials on a hardware security key, bypassing
same-origin checks in an early U2F API. [[YUBIKEY-ATTACK]]
<p class=example>
The Network Service Discovery API [[DISCOVERY-API]] recommended CORS
preflights before granting access to a device, and required user agents to
involve the user with a permission request of some kind.
</p>
<p class=example>
Likewise, the Web Bluetooth specification [[WEB-BLUETOOTH]] has an extensive discussion of
such issues in [[WEB-BLUETOOTH#security-and-privacy]], which is worth
reading as an example for similar work.
</p>
<p class=example>
[[WEBUSB]] addresses these risks through a combination of user mediation /
prompting, secure origins, and feature policy.
See [[WEBUSB#security-and-privacy]] for more.
</p>
<h3 class=question id="native-ui">
Does this specification allow an origin some measure of control over a user
agent's native UI?
</h3>
Features that allow for control over a user agent’s UI (e.g. full screen
mode) or changes to the underlying system (e.g. installing an ‘app’ on a
smartphone home screen) may surprise users or obscure security / privacy
controls. To the extent that your feature does allow for the changing of a
user agent’s UI, can it affect security / privacy controls? What analysis
confirmed this conclusion?
<h3 class=question id="temporary-id">
What temporary identifiers might this specification create or expose
to the web?
</h3>
If a standard exposes a temporary identifier to the web, the identifier
should be short-lived and should rotate at some regular interval to mitigate
the risk of this identifier being used to track a user over time. When a
user clears state in their user agent, these temporary identifiers should be
cleared to prevent re-correlation of state using a temporary identifier.
If this specification does create or expose a temporary identifier to the
web, how is it exposed, when, to what entities, and how frequently is it
rotated?
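The rotation and clearing behavior described above can be sketched as a small class. `RotatingIdentifier` and `lifetime_seconds` are hypothetical names, and the caller supplies the current time so that the rotation logic is easy to reason about; the sketch does not prescribe any particular lifetime.

```python
import secrets
from typing import Optional


class RotatingIdentifier:
    """A random identifier that is replaced after a fixed lifetime or on clear()."""

    def __init__(self, lifetime_seconds: float) -> None:
        self._lifetime = lifetime_seconds
        self._value: Optional[str] = None
        self._issued_at: Optional[float] = None

    def get(self, now: float) -> str:
        """Return the current identifier, rotating it first if it has expired."""
        expired = self._issued_at is None or now - self._issued_at >= self._lifetime
        if expired:
            # A fresh random value is unrelated to the previous one,
            # so old and new identifiers cannot be linked by their content.
            self._value = secrets.token_hex(16)
            self._issued_at = now
        assert self._value is not None
        return self._value

    def clear(self) -> None:
        """Forget the identifier, e.g. when the user clears browsing state."""
        self._value = None
        self._issued_at = None
```

Calling `clear()` whenever the user clears other state ensures that the identifier handed out afterwards cannot be used to re-correlate the old and new state.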
Example temporary identifiers include TLS Channel ID, Session Tickets, and
IPv6 addresses.
The index attribute in the Gamepad API [[GAMEPAD]] — an integer that starts
at zero, increments, and is reset — is a good example of a privacy friendly
temporary identifier.
<h3 class=question id="first-third-party">
How does this specification distinguish between behavior in first-party and
third-party contexts?
</h3>
The behavior of a feature should be considered not just in the context of its
being used by a first-party origin that a user is visiting but also the
implications of its being used by an arbitrary third party that the first
party includes. When developing your specification, consider the implications
of its use by third-party resources on a page, and consider whether support for
use by third-party resources should be optional to conform to the
specification. If supporting use by third-party resources is mandatory for
conformance, please explain why and what privacy mitigations are in place.
This is particularly important as user agents may take steps to reduce the
availability or functionality of certain features to third parties if the
third parties are found to be abusing the functionality.
<h3 class=question id="private-browsing">
How does this specification work in the context of a user agent’s Private
Browsing or "incognito" mode?
</h3>
Each major user agent implements a private browsing / incognito mode feature
with significant variation across user agents in threat models,
functionality, and descriptions to users regarding the protections afforded
[[WU-PRIVATE-BROWSING]].
One typical commonality across user agents' private browsing / incognito
modes is that they have a different set of state than the user agents have in
their ‘normal’ modes.
Does the specification provide information that would allow for the
correlation of a single user's activity across normal and private browsing /
incognito modes? Does the specification result in information being written
to a user’s host that would persist following a private browsing / incognito
mode session ending?
There has been research into both:
* Detecting whether a user agent is in private browsing mode [[RIVERA]]
using non-standardized methods such as <code>[window.requestFileSystem()](https://developer.mozilla.org/en-US/docs/Web/API/Window/requestFileSystem)</code>.
* Using features to fingerprint a browser and correlate private and
non-private mode sessions for a given user. [[OLEJNIK-PAYMENTS]]
<h3 class=question id="considerations">
Does this specification have a "Security Considerations" and "Privacy
Considerations" section?
</h3>
Documenting the various concerns and potential abuses in "Security
Considerations" and "Privacy Considerations" sections of a document is a good
way to help implementers and web developers understand the risks that a
feature presents, and to ensure that adequate mitigations are in place.
Simply adding a section to your specification with yes/no responses to the
questions in this document is insufficient.
If it seems like a feature does not have security or privacy impacts,
then say so inline in the spec section for that feature:
> There are no known security or privacy impacts of this feature.
Saying so explicitly in the specification serves several purposes:
1. Shows that a spec author/editor has explicitly considered security and
privacy when designing a feature.
1. Provides some sense of confidence that there might be no such impacts.
1. Challenges security and privacy minded individuals to think of and find
even the potential for such impacts.
1. Demonstrates the spec author/editor's receptivity to feedback about such
impacts.
1. Demonstrates a desire that the specification should not be introducing
security and privacy issues.
[[RFC3552]] provides general advice as to writing Security Considerations
sections. Generally, there should be a clear description of the kinds of
privacy risks the new specification introduces for users of the web
platform. Below is a set of considerations, informed by that RFC, for
writing a privacy considerations section.
Authors must describe:
1. What privacy attacks have been considered?
1. What privacy attacks have been deemed out of scope (and why)?
1. What privacy mitigations have been implemented?
1. What privacy mitigations have been considered and not implemented (and why)?
In addition, attacks considered must include:
1. Fingerprinting risk;
1. Unexpected exfiltration of data through abuse of sensors;
1. Unexpected usage of the specification / feature by third parties;
1. If the specification includes identifiers, the authors must document what
rotation period was selected for the identifiers and why.
1. If the specification introduces new state to the user agent, the authors
must document what guidance regarding clearing said storage was given and
why.
1. There should be a clear description of the residual risk to the user
after the privacy mitigations have been implemented.
The crucial aspect is to actually consider security and privacy. All new
specifications must have security and privacy considerations sections to be
considered for wide review. Interesting features added to the web platform
generally have security and/or privacy impacts.
<h3 class=question id="relaxed-sop">
Does this specification allow downgrading default security characteristics?
</h3>
Does this feature allow for a site to opt out of security settings to
accomplish some piece of functionality? If so, in what situations does your
specification allow such security setting downgrading and what mitigations
are in place to make sure optional downgrading doesn't dramatically increase
risks?
* {{Document/domain|document.domain}}
* [[CORS]]
* [[WEBMESSAGING]]
* [[REFERRER-POLICY]]'s <a>"unsafe-url"</a>
<h3 class=question id="missing-questions">
What should this questionnaire have asked?
</h3>
This questionnaire is not exhaustive.
After completing a privacy review,
it may be that
there are privacy aspects of your specification
that a strict reading of, and response to, this questionnaire
would not have revealed.
If this is the case,
please convey those privacy concerns,
and indicate if you can think of improved or new questions
that would have covered this aspect.
Please consider [filing an issue](https://github.com/w3ctag/security-questionnaire/issues/new)
to let us know what the questionnaire should have asked.
<h2 id="threats">Threat Models</h2>
To consider security and privacy it is convenient to think in terms of threat
models, a way to illuminate the possible risks.
There are some concrete privacy concerns that should be considered when
developing a feature for the web platform [[RFC6973]]:
* Surveillance: Surveillance is the observation or monitoring of an
individual's communications or activities.
* Stored Data Compromise: End systems that do not take adequate measures to
secure stored data from unauthorized or inappropriate access expose
individuals to potential harm.
* Intrusion: Intrusion consists of invasive acts that disturb or interrupt
one's life or activities.
* Misattribution: Misattribution occurs when data or communications related
to one individual are attributed to another.
* Correlation: Correlation is the combination of various pieces of
information related to an individual or that obtain that characteristic
when combined.
* Identification: Identification is the linking of information to a
particular individual to infer an individual's identity or to allow the
inference of an individual's identity.
* Secondary Use: Secondary use is the use of collected information about an
individual without the individual's consent for a purpose different from
that for which the information was collected.
* Disclosure: Disclosure is the revelation of information about an
individual that affects the way others judge the individual.
* Exclusion: Exclusion is the failure to allow individuals to know about
the data that others have about them and to participate in its handling
and use.
In the mitigations section, this document outlines a number of techniques
that can be applied to mitigate these risks.
Enumerated below are some broad classes of threats that should be
considered when developing a web feature.
<h3 id="passive-network">
Passive Network Attackers
</h3>
A <dfn>passive network attacker</dfn> has read-access to the bits going over
the wire between users and the servers they're communicating with. She can't
*modify* the bytes, but she can collect and analyze them.
Due to the decentralized nature of the internet, and the general level of
interest in user activity, it's reasonable to assume that practically every
unencrypted bit that's bouncing around the network of proxies, routers, and
servers you're using right now is being read by someone. It's equally likely
that some of these attackers are doing their best to understand the encrypted
bits as well, including storing encrypted communications for later
cryptanalysis (though that requires significantly more effort).
* The IETF's "Pervasive Monitoring Is an Attack" document [[RFC7258]] is
useful reading, outlining some of the impacts on privacy that this
assumption entails.
* Governments aren't the only concern; your local coffee shop is likely to
be gathering information on its customers, and your ISP at home is likely to
be doing the same.
<h3 id="active-network">
Active Network Attackers
</h3>
An <dfn>active network attacker</dfn> has both read- and write-access to the
bits going over the wire between users and the servers they're communicating
with. She can collect and analyze data, but also modify it in-flight,
injecting and manipulating JavaScript, HTML, and other content at will.
This is more common than you might expect, for both benign and malicious
purposes:
* ISPs and caching proxies regularly cache and compress images before
delivering them to users in an effort to reduce data usage. This can be
especially useful for users on low-bandwidth, high-latency devices like
phones.
* ISPs also regularly inject JavaScript [[COMCAST]] and other identifiers
[[VERIZON]] for less benign purposes.
* If your ISP is willing to modify substantial amounts of traffic flowing
through it for profit, it's difficult to believe that state-level
attackers will remain passive.
<h3 id="sop-violations">
Same-Origin Policy Violations
</h3>
The <dfn>same-origin policy</dfn> is the cornerstone of security on the web;
one origin should not have direct access to another origin's data (the policy
is more formally defined in Section 3 of [[RFC6454]]). A corollary to this
policy is that an origin should not have direct access to data that isn't
associated with *any* origin: the contents of a user's hard drive,
for instance. Various kinds of attacks bypass this protection in one way or
another. For example:
* <dfn local-lt="XSS">Cross-site scripting attacks</dfn> involve an
attacker tricking an origin into executing attacker-controlled code in
the context of a target origin.
* Cross-site request forgery attacks trick user agents into exerting a
user's ambient authority on sites where they've logged in by submitting
requests on their behalf.
* Data leakage occurs when bits of information are inadvertently made
available cross-origin, either explicitly via CORS headers [[CORS]],
or implicitly, via side-channel attacks like [[TIMING]].
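As a rough illustration of the comparison that underlies the policy, the
sketch below treats two URLs as same-origin only when their scheme, host, and
port all match. `isSameOrigin` is a hypothetical helper written for this
example, not something defined by [[RFC6454]], and it glosses over details
that real user agents handle.

```typescript
// Minimal sketch: compare the serialized origins of two URLs. The URL parser
// folds default ports (443 for https, 80 for http) into the serialized
// origin, so comparing the `origin` strings covers scheme, host, and port.
function isSameOrigin(a: string, b: string): boolean {
  const originA = new URL(a).origin;
  const originB = new URL(b).origin;
  // Opaque origins (e.g. data: URLs) serialize to "null" and never match.
  if (originA === "null" || originB === "null") return false;
  return originA === originB;
}
```

Under this sketch, `https://example.com/a` and `https://example.com:443/b`
compare as same-origin, while changing only the scheme or only the host
yields a different origin.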
<h3 id="third-party-tracking">
Third-Party Tracking
</h3>
Part of the power of the web is the ability of a page to pull in content
from third parties, from images to JavaScript, to enhance the content
and/or a user's experience of the site. However, when a page pulls in
content from third parties, it inherently leaks some information to those
third parties: referrer information and other information that may be used
to track and profile a user. This includes the fact that cookies go back to
the domain that initially stored them, allowing for cross-origin tracking.
Moreover, third parties can gain execution power when a webpage includes
third-party JavaScript. While pages can take steps to
mitigate the risks of third-party content, and browsers may differentiate
how they treat first- and third-party content from a given page, the risk of
new functionality being executed by third parties rather than the first-party
site should be considered in the feature development process.
The simplest example is injecting a link to a site that behaves differently
under specific conditions, for example depending on whether the user is
logged in to the site. This may reveal that the user has an account on a site.
<h3 id="legitimate-misuse">
Legitimate Misuse
</h3>
Even when powerful features are made available to developers, it does not
mean that every use is a good idea, or justified; in fact,
data privacy regulations around the world may even put limits on certain uses
of data. In a first-party context, a legitimate website is potentially
able to use powerful features to learn about the user's behavior or
habits. For example:
* Tracking the user while browsing the website via mechanisms such as mouse
move tracking
* Behavioral profiling of the user based on usage patterns
* Accessing powerful features that make it possible to reason about the
user, the user's system, or the user's surroundings, such as a webcam, Web
Bluetooth, or sensors
This point is admittedly different from the others, and underlines that even if
something is possible, it does not mean it should always be done. It may
call for a privacy impact assessment or even an
ethical assessment. When designing a specification with security and privacy
in mind, both use and misuse cases should be in scope.
<h2 id="mitigations">
Mitigation Strategies
</h2>
To mitigate the security and privacy risks you’ve identified in your
specification as you’ve filled out the questionnaire,
you may want to apply one or more of the mitigations described below to your
feature.
<h3 id="data-minimization">
Data Minimization
</h3>
Minimization is a strategy that involves exposing as little information to
other communication partners as is required for a given operation to
complete. More specifically, it requires not providing access to more
information than was apparent in the user-mediated access or allowing the
user some control over which information exactly is provided.
For example, if the user has provided access to a given file, the object
representing that file should not make it possible to obtain information about
the file's parent directory and its contents, as that is clearly not what the
user expects.
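One way to picture this is a handle that carries only what the user chose to
share. The sketch below is illustrative: `PickedFile` and `toPickedFile` are
hypothetical names, and the assumption is that the user agent keeps the full
path on its own side and hands the page only a base name and the contents.

```typescript
// What a page receives after the user picks a file (hypothetical shape).
interface PickedFile {
  readonly name: string;      // base name only, e.g. "report.pdf"
  readonly bytes: Uint8Array; // contents of the chosen file
}

// Hypothetical user-agent-side helper: the full path stays inside the user
// agent, and only the base name and contents cross over to the page.
function toPickedFile(absolutePath: string, bytes: Uint8Array): PickedFile {
  // Split on both "/" and "\" so POSIX and Windows style paths are handled.
  const name = absolutePath.split(/[\\/]/).pop() ?? "";
  return Object.freeze({ name, bytes });
}
```

Because the returned object has no path or directory member, the page has
nothing to walk upward from.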
In the context of data minimization, it is natural to ask what data is passed
around between the different parties, how persistent the data items and
identifiers are, and whether there are correlation possibilities between
different protocol runs.
For example, the W3C Device APIs Working Group has defined a number of
requirements in their Privacy Requirements document. [[DAP-PRIVACY-REQS]]
Data minimization is applicable to specification authors and implementers, as
well as to those deploying the final service.
As an example, consider mouse events. When a page is loaded, the application
has no way of knowing whether a mouse is attached, what type of mouse it is
(e.g., make and model), what kind of capabilities it exposes, how many are
attached, and so on. Only when the user decides to use the mouse — presumably
because it is required for interaction — does some of this information become
available. And even then, only a minimum of information is exposed: you could
not know whether it is a trackpad for instance, and the fact that it may have
a right button is only exposed if it is used. Similarly, the Gamepad API
makes use of this data minimization approach. It is impossible for a Web game
to know if the user agent has access to gamepads, how many there are, what
their capabilities are, etc. It is simply assumed that if the user wishes to
interact with the game through the gamepad then she will know when to use
it, and using it will provide the application with all the information
that it needs to operate (but no more than that).
The way in which the functionality is supported for the mouse is simply by
only providing information on the mouse's behaviour when certain events take
place. The approach is therefore to expose event handling (e.g., triggering
on click, move, button press) as the sole interface to the device.
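The event-only approach can be sketched as an interface with no way to
enumerate or describe devices. Everything here is hypothetical
(`EventOnlyPointer`, `MinimalPointerEvent`, and the `dispatch` hook standing
in for the platform); it is meant only to show that the page learns about a
button when the user presses it, and not before.

```typescript
type PointerEventKind = "move" | "click" | "buttonpress";

interface MinimalPointerEvent {
  readonly kind: PointerEventKind;
  readonly button?: number; // present only for the button actually used
}

type PointerListener = (event: MinimalPointerEvent) => void;

// Hypothetical device wrapper. It deliberately has no members such as
// `deviceCount`, `model`, or `buttonCount`: listening is the sole interface.
class EventOnlyPointer {
  private readonly listeners: PointerListener[] = [];

  addListener(listener: PointerListener): void {
    this.listeners.push(listener);
  }

  // Called by the (hypothetical) platform layer when the user acts.
  dispatch(event: MinimalPointerEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

// Page-side view: nothing is known until the user interacts.
const pagePointer = new EventOnlyPointer();
const observedButtons: number[] = [];
pagePointer.addListener((event) => {
  if (event.button !== undefined) observedButtons.push(event.button);
});
```

Until `dispatch` is called with a button press, `observedButtons` stays
empty, which mirrors the point that a right button is only revealed by use.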
Two features that have minimized the data they make available are:
* [[BATTERY-STATUS]] <q>The user agent should not expose high precision readouts</q>
* [[GENERIC-SENSOR]] <q>Limit maximum sampling frequency</q>,
<q>Reduce accuracy</q>
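The two techniques quoted above, coarser readouts and a capped sampling
frequency, can be sketched as follows. The 5% bucket size and the names
`coarsenLevel` and `SamplingLimiter` are assumptions made for this example;
neither specification prescribes these exact values or shapes.

```typescript
// Report a level in [0, 1] in coarse buckets (5% steps by default) instead
// of a high-precision value. The bucket count is an illustrative choice.
function coarsenLevel(level: number, buckets: number = 20): number {
  const clamped = Math.min(1, Math.max(0, level));
  return Math.round(clamped * buckets) / buckets;
}

// Drop sensor readings that arrive faster than a maximum sampling frequency.
class SamplingLimiter {
  private readonly minIntervalMs: number;
  private lastAcceptedMs: number = Number.NEGATIVE_INFINITY;

  constructor(maxHz: number) {
    this.minIntervalMs = 1000 / maxHz;
  }

  // Returns true when the reading taken at `timestampMs` may be delivered.
  accept(timestampMs: number): boolean {
    if (timestampMs - this.lastAcceptedMs < this.minIntervalMs) return false;
    this.lastAcceptedMs = timestampMs;
    return true;
  }
}
```

With these defaults, readings of `0.4371` and `0.4499` both surface as
`0.45`, and a limiter built with `new SamplingLimiter(10)` delivers at most
one reading per 100 ms.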
<h3 id="privacy-friendly-defaults">
Default Privacy Settings
</h3>
Users often do not change defaults; as a result, it is important that the
default mode of a specification minimizes the amount, identifiability, and
persistence of the data and identifiers exposed. This is particularly true
if a protocol comes with flexible options so that it can be tailored to
specific environments.
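A small sketch of what conservative defaults might look like for a feature
with flexible options. The option names (`frequencyHz`, `highPrecision`,
`persistIdentifier`) and the default values are hypothetical; the point is
only that a caller who passes nothing gets the least revealing behavior.

```typescript
// Hypothetical options for a sensor-like feature. Every field is optional.
interface ReadingOptions {
  frequencyHz?: number;        // requested sampling rate
  highPrecision?: boolean;     // request full-resolution values
  persistIdentifier?: boolean; // keep a stable identifier across sessions
}

// Defaults chosen to minimize amount, identifiability, and persistence.
const PRIVACY_PRESERVING_DEFAULTS: Required<ReadingOptions> = {
  frequencyHz: 1,
  highPrecision: false,
  persistIdentifier: false,
};

// Callers get the conservative behavior unless they explicitly opt out,
// one field at a time.
function resolveOptions(requested: ReadingOptions = {}): Required<ReadingOptions> {
  return {
    frequencyHz: requested.frequencyHz ?? PRIVACY_PRESERVING_DEFAULTS.frequencyHz,
    highPrecision: requested.highPrecision ?? PRIVACY_PRESERVING_DEFAULTS.highPrecision,
    persistIdentifier:
      requested.persistIdentifier ?? PRIVACY_PRESERVING_DEFAULTS.persistIdentifier,
  };
}
```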
<h3 id="user-mediation">
Explicit user mediation
</h3>
If the security or privacy risk of a feature cannot otherwise be mitigated in
a specification, optionally allowing an implementer to prompt a user may
be the best mitigation possible, with the understanding that it does not
entirely remove the privacy risk. If the specification does not allow for the
implementer to prompt, it may result in divergent implementations by different
user agents as some user agents choose to implement a more privacy-friendly version.
It is possible that the risk of a feature cannot be mitigated because the
risk is endemic to the feature itself. For instance, [[GEOLOCATION-API]]
reveals a user’s location intentionally; user agents generally gate access to
the feature on a permission prompt which the user may choose to accept. This
risk is also present and should be accounted for in features that expose
personal data or identifiers.
Designing such prompts is difficult, as is determining how long the
permission should last.
Often, the best prompt is one that is clearly tied to a user action, like the
file picker, where in response to a user action, the file picker is brought
up and a user gives access to a specific file to an individual site.
Generally speaking, the duration and timing of the prompt should be inversely
proportional to the risk posed by the data exposed. In addition, the prompt
should consider issues such as:
* How should permission requests be scoped? Especially when requested by an
embedded third party iframe?
* Should persistence be based on the pair of top-level/embedded origins or a
different scope?
* How can it be ensured that the prompt occurs in the context of requiring the
data, and at a time when it is clear to the user why the prompt is occurring?
* Explaining the implications of permission before prompting the user, in a
way that is accessible and localized -- _who_ is asking, _what_ are they
asking for, _why_ do they need it?
* What happens if the user rejects the request at the time of the prompt or
if the user later changes their mind and revokes access?
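To make the scoping, duration, and revocation questions concrete, here is one
possible answer sketched as a user-agent-side store. It is an assumption of
this example, not a recommendation, that decisions are keyed on the pair of
top-level and embedded origins plus the feature name; `ScopedPermissionStore`
and its methods are hypothetical.

```typescript
type StoredDecision = "granted" | "denied";
type QueryResult = StoredDecision | "prompt";

interface StoredGrant {
  decision: StoredDecision;
  expiresAtMs: number;
}

// Decisions are keyed on (top-level origin, embedded origin, feature), so a
// grant made while a widget is embedded on one site does not carry over to
// the same widget embedded on another site.
class ScopedPermissionStore {
  private readonly grants = new Map<string, StoredGrant>();

  private keyFor(topLevel: string, embedded: string, feature: string): string {
    return JSON.stringify([topLevel, embedded, feature]);
  }

  record(topLevel: string, embedded: string, feature: string,
         decision: StoredDecision, nowMs: number, lifetimeMs: number): void {
    this.grants.set(this.keyFor(topLevel, embedded, feature),
                    { decision, expiresAtMs: nowMs + lifetimeMs });
  }

  // Returns the stored decision, or "prompt" if none exists or it expired.
  query(topLevel: string, embedded: string, feature: string, nowMs: number): QueryResult {
    const key = this.keyFor(topLevel, embedded, feature);
    const grant = this.grants.get(key);
    if (grant === undefined) return "prompt";
    if (nowMs >= grant.expiresAtMs) {
      this.grants.delete(key);
      return "prompt";
    }
    return grant.decision;
  }

  // The user changed their mind: forget the decision entirely.
  revoke(topLevel: string, embedded: string, feature: string): void {
    this.grants.delete(this.keyFor(topLevel, embedded, feature));
  }
}
```

A shorter `lifetimeMs` for riskier data is one way to express the inverse
relationship between duration and risk described above.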
These prompts should also include considerations for what, if any, control a
user has over their data after it has been shared with other parties. For
example, are users able to determine what information was shared with other
parties?
<h3 id="restrict-to-first-party">
Explicitly restrict the feature to first party origins
</h3>
As described in the "Third-Party Tracking" section, a significant feature of
the web is the mixing of first- and third-party content in a single page, but
this introduces risk where the third-party content can use the same set of web
features as the first-party content.
Authors should explicitly specify the feature's scope of availability:
* When a feature should be made available to embedded third parties -- and
often first parties should be able to explicitly control that (using
iframe attributes or feature policy)
* Whether a feature should be available in the background or only in the
top-most, visible tab.
* Whether a feature should be available to offline service workers.
* Whether events will be fired simultaneously
Supporting third-party access to a feature should be optional for
conformance.
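The availability decisions listed above could be expressed as a single check
along the following lines. This is a sketch under assumptions: the
`EmbeddingContext` shape and `mayUseFeature` are hypothetical, delegation is
modeled as a plain list of feature names, and the feature is assumed to be
foreground-only.

```typescript
interface EmbeddingContext {
  topLevelOrigin: string;
  frameOrigin: string;
  // Features the top-level page has explicitly delegated to this frame.
  delegatedFeatures: readonly string[];
  // Whether the top-level page is in the foreground, visible tab.
  isVisible: boolean;
}

// First-party content may use the feature; embedded third parties may use it
// only when the embedder opted in; background use is refused in this sketch.
function mayUseFeature(feature: string, context: EmbeddingContext): boolean {
  if (!context.isVisible) return false;
  const isFirstParty = context.frameOrigin === context.topLevelOrigin;
  return isFirstParty || context.delegatedFeatures.indexOf(feature) !== -1;
}
```

Defaulting to "first party only" keeps the decision about embedded content
with the site that chose to embed it.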
<h3 id="secure-contexts">
Secure Contexts
</h3>
If the primary risk that you’ve identified in your specification is the
threat posed by [=active network attackers=], offering a feature to an
insecure origin is the same as offering that feature to every origin because
the attacker can inject frames and code at will. Requiring an encrypted and
authenticated connection in order to use a feature can mitigate this kind of
risk.
Secure contexts also protect against [=passive network attackers=]. For
example, if a page uses the Geolocation API and sends the sensor-provided
latitude and longitude back to the server over an insecure connection, then
any passive network attacker can learn the user's location, without any
feasible path to detection by the user or others.
However, requiring a secure context is not sufficient to mitigate many
privacy risks, or even security risks from threat actors other than active
network attackers.
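Gating a feature on an encrypted and authenticated connection can be sketched
as below. This is a deliberately simplified approximation: `isLikelySecure`
and `readSensitiveValue` are hypothetical names, the check looks only at a
single URL, and it ignores rules that the real secure-context definition
applies (such as examining ancestor frames).

```typescript
// Simplified approximation of a "potentially trustworthy" check.
function isLikelySecure(url: string): boolean {
  const parsed = new URL(url);
  if (parsed.protocol === "https:" || parsed.protocol === "wss:") return true;
  // Loopback hosts are commonly treated as trustworthy for development.
  const loopbackHosts = ["localhost", "127.0.0.1", "[::1]"];
  return loopbackHosts.indexOf(parsed.hostname) !== -1;
}

// Hypothetical feature entry point that refuses to run on insecure pages.
function readSensitiveValue(pageUrl: string, read: () => number): number {
  if (!isLikelySecure(pageUrl)) {
    throw new Error("feature requires a secure context");
  }
  return read();
}
```

Refusing outright on insecure pages, and not silently degrading, makes the
requirement visible to developers during testing.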
<h3 id="drop-feature">
Drop the feature
</h3>
Possibly the simplest way
to mitigate potential negative security or privacy impacts of a feature
is to drop the feature,
though you should keep in mind that some security or privacy risks
may be removed or mitigated
by adding features to the platform.
Every feature in a spec
should be seen as
potentially adding security and/or privacy risk
until proven otherwise.
Discussing dropping the feature
as a mitigation for security or privacy impacts
is a helpful exercise
as it helps illuminate the tradeoffs
between the feature,
whether it is exposing the minimum amount of data necessary,
and other possible mitigations.
Consider also the cumulative effect
of feature addition
to the overall impression that users have
that [it is safe to visit a web page](https://w3ctag.github.io/design-principles/#safe-to-browse).
Doing things that complicate users' understanding
that it is safe to visit websites,
or that complicate what users need to understand
about the safety of the web
(e.g., adding features that are less safe)
reduces the ability of users
to act based on that understanding of safety,
or to act in ways that correctly reflect the safety that exists.
Every specification should seek to be as small as possible, even if only
for the reasons of reducing and minimizing security/privacy attack surface(s).
By doing so we can reduce the overall security and privacy attack surface
of not only a particular feature, but of a module (related set of
features), a specification, and the overall web platform.
Examples
* [Mozilla](https://bugzilla.mozilla.org/show_bug.cgi?id=1313580) and
[WebKit](https://bugs.webkit.org/show_bug.cgi?id=164213)
dropped the Battery Status API
* [Mozilla dropped](https://bugzilla.mozilla.org/show_bug.cgi?id=1359076)
devicelight, deviceproximity and userproximity events
<h3 id="privacy-impact-assessment">
Making a privacy impact assessment
</h3>
Some features potentially supply very sensitive data, and it is
the responsibility of the end-developer, system owner, or manager to realize
this and act accordingly in the design of their system. Some uses may
warrant conducting a privacy impact assessment, especially when data
relating to individuals may be processed.
Specifications that expose such sensitive data should include a
recommendation that websites and applications adopting the API — but not
necessarily the implementing user agent — conduct a privacy impact assessment
of the data that they collect.
A feature that recommends this is:
* [[GENERIC-SENSOR]] advises considering a privacy impact
assessment
Documenting these impacts is important for organizations although it should
be noted that there are limitations to putting this onus on organizations.
Research has shown that sites often do not comply with security/privacy
requirements in specifications. For example, in [[DOTY-GEOLOCATION]], it was
found that none of the studied websites informed users of their privacy
practices before the site prompted for location.
<pre class="anchors">
urlPrefix: https://tc39.github.io/ecma262/; spec: ECMASCRIPT
text: eval(); url: #sec-eval-x; type: method
</pre>
<pre class="link-defaults">
spec:html; type:element; text:script
spec:html; type:element; text:link
</pre>
<pre class="biblio">
{
"COMCAST": {
"href": "http://arstechnica.com/tech-policy/2014/09/why-comcasts-javascript-ad-injections-threaten-security-net-neutrality/",
"title": "Comcast Wi-Fi serving self-promotional ads via JavaScript injection",
"publisher": "Ars Technica",
"authors": [ "David Kravets" ]
},
"DOTY-GEOLOCATION": {
"href": "https://escholarship.org/uc/item/0rp834wf",
"title": "Privacy Issues of the W3C Geolocation API",
"authors": [ "Nick Doty", "Deirdre K. Mulligan", "Erik Wilde" ],
"publisher": "UC Berkeley School of Information"
},
"GEOFENCING-EXPLAINED": {
"href": "https://github.com/slightlyoff/Geofencing/blob/master/explainer.md",
"title": "Geofencing Explained",
"authors": [ "Alex Russell" ]
},
"GYROSPEECHRECOGNITION": {
"href": "https://www.usenix.org/system/files/conference/usenixsecurity14/sec14-paper-michalevsky.pdf",
"title": "Gyrophone: Recognizing Speech from Gyroscope Signals",
"publisher": "Proceedings of the 23rd USENIX Security Symposium",
"authors": [ "Yan Michalevsky", "Dan Boneh", "Gabi Nakibly"]
},
"HOMAKOV": {
"href": "http://homakov.blogspot.de/2014/01/using-content-security-policy-for-evil.html",
"title": "Using Content-Security-Policy for Evil",
"authors": [ "Egor Homakov" ]
},
"OLEJNIK-ALS": {
"href": "https://blog.lukaszolejnik.com/privacy-of-ambient-light-sensors/",
"title": "Privacy analysis of Ambient Light Sensors",
"publisher": "Lukasz Olejnik",
"authors": [ "Lukasz Olejnik" ]
},
"OLEJNIK-BATTERY": {
"href": "https://eprint.iacr.org/2015/616",
"title": "The leaking battery: A privacy analysis of the HTML5 Battery Status API",
"publisher": "Cryptology ePrint Archive, Report 2015/616",
"authors": [ "Lukasz Olejnik", "Gunes Acar", "Claude Castelluccia", "Claudia Diaz"]
},
"OLEJNIK-PAYMENTS": {
"href": "https://blog.lukaszolejnik.com/privacy-of-web-request-api/",
"title": "Privacy of Web Request API",
"authors": [ "Lukasz Olejnik" ],
"publisher": "Lukasz Olejnik"
},
"RIVERA": {
"href": "https://gist.github.com/jherax/a81c8c132d09cc354a0e2cb911841ff1",
"title": "Detect if a browser is in Private Browsing mode",
"authors": [ "David Rivera" ],