-
Notifications
You must be signed in to change notification settings - Fork 21
/
index.src.html
784 lines (609 loc) · 33 KB
/
index.src.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
<h1>Clear Site Data</h1>
<pre class="metadata">
Status: ED
ED: https://w3c.github.io/webappsec-clear-site-data/
TR: http://www.w3.org/TR/clear-site-data/
Shortname: clear-site-data
Editor: Mike West 56384, Google Inc., [email protected]
Previous Version: http://www.w3.org/TR/2016/WD-clear-site-data-20160720/
Group: webappsec
Abstract:
This document defines an imperative mechanism which allows web developers to
instruct a user agent to clear a site's locally stored data related to a
host.
Indent: 2
Level: None
Version History: https://github.com/w3c/webappsec-clear-site-data/commits/main/index.src.html
Boilerplate: omit conformance, omit feedback-header
Markup Shorthands: css off, markdown on
!Participate: <a href="https://github.com/w3c/webappsec-clear-site-data/issues/new">File an issue</a> (<a href="https://github.com/w3c/webappsec-clear-site-data/issues">open issues</a>)
</pre>
<pre class="link-defaults">
spec:html; type:dfn; for:/; text:browsing context
spec:fetch; type:dfn; for:/; text:response
spec:fetch; type:dfn; for:/; text:http-network fetch
<!-- Needs to be exported -->
spec:html; type:dfn; for:/; text:active sandboxing flag set
spec:html; type:dfn; for:/; text:application cache group
spec:html; type:dfn; for:/; text:reload-triggered navigation
spec:indexeddb; type:dfn; for:/; text:database
spec:indexeddb; type:dfn; for:/; text:delete a database
</pre>
<pre class="anchors">
spec: RFC5890; urlPrefix: https://tools.ietf.org/html/rfc5890
type: dfn
text: label; url: section-2.2
spec: RFC6265; urlPrefix: https://tools.ietf.org/html/rfc6265
type: dfn
text: canonicalized; url: section-5.1.2
text: cookie store; url: section-5.3
text: domain-match; url: section-5.1.3
spec: RFC6454; urlPrefix: https://tools.ietf.org/html/rfc6454
type: dfn
text: globally unique identifier; url: section-2.3
text: origin; url: section-3.2
text: the same; url: section-5
spec: RFC7234; urlPrefix: https://tools.ietf.org/html/rfc7234
type: dfn
text: network cache; url: section-2
spec: RFC7230; urlPrefix: https://tools.ietf.org/html/rfc7230
type: grammar
text: #rule; url: section-7
text: quoted-string; url: https://tools.ietf.org/html/rfc7230#section-3.2.6
spec: PSL; urlPrefix: https://publicsuffix.org/list/
type: dfn
text: registered domain; url: #
</pre>
<pre class="biblio">
{
"CHANNELID": {
"authors": [ "Dirk Balfanz", "Ryan Hamilton" ],
"href": "https://tools.ietf.org/html/draft-balfanz-tls-channelid",
"title": "Transport Layer Security (TLS) Channel IDs",
"publisher": "IETF"
},
"TOKBIND": {
"authors": [ "Andrei Popov", "Magnus Nystroem", "Dirk Balfanz", "Adam Langley", "Jeff Hodges" ],
"href": "https://tools.ietf.org/html/draft-ietf-tokbind-protocol",
"title": "The Token Binding Protocol Version 1.0",
"publisher": "IETF"
}
}
</pre>
<!--
████ ██ ██ ████████ ████████ ███████
██ ███ ██ ██ ██ ██ ██ ██
██ ████ ██ ██ ██ ██ ██ ██
██ ██ ██ ██ ██ ████████ ██ ██
██ ██ ████ ██ ██ ██ ██ ██
██ ██ ███ ██ ██ ██ ██ ██
████ ██ ██ ██ ██ ██ ███████
-->
<section>
<h2 id="intro">Introduction</h2>
<em>This section is not normative.</em>
Web applications store data locally on a user's computer in order to provide
functionality while the user is offline, and to increase performance when the
user is online. These local caches have significant advantages for both users
and developers, but present risks as well.
A user's data is both sensitive and valuable; web developers ought to take
reasonable steps to protect it. One such step would be to encrypt data before
storing it. Another would be to remove data from the user's machine when it is
no longer necessary (for example, when the user signs out of the application,
or deletes their account).
Site authors can remove data from a number of storage mechanisms via
JavaScript, but others are difficult to deal with reliably. Consider cookies,
for instance, which can be partially cleared via JavaScript access to
`document.cookie`. `HttpOnly` cookies, however, can only
be removed via a number of `Set-Cookie` headers in an HTTP
response. This, of course, requires exhaustive knowledge of all the cookies
set for a host, which can be complicated to ascertain. Cache is still harder;
no imperative interface to a browser's network cache exists, period.
This document defines a new mechanism to deal with removing data from these
and other types of local storage, giving web developers the ability to clear
out a user's local cache of data via the <a http-header>Clear-Site-Data</a> HTTP response
header.
<h3 id="examples">Examples</h3>
<h4 id="example-signout">Signing Out</h4>
<div class="example">
A user signs out of Super Secret Social Network via a CSRF-protected POST to
`https://supersecretsocialnetwork.example.com/logout`, and the
site author wishes to ensure that locally stored data is removed as a
result.
They can do so by sending the following HTTP header in the response:
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>cache</a>", "<a grammar>cookies</a>", "<a grammar>storage</a>", "<a grammar>executionContexts</a>"
</pre>
</div>
<h4 id="example-targeted">Targeted Clearing</h4>
<div class="example">
A user signs out of Megacorp Inc.'s site via a CSRF-protected POST to
`https://megacorp.example.com/logout`. Megacorp has a large
number of services available as subdomains, so many that it's not entirely
clear which of them would be safe to clear as a response to a logout action.
One option would be to simply clear everything, and deal with the fallout.
Megacorp's CEO, however, once lost hours and hours of progress in "Irate
Ibexes" due to inadvertent site-data clearing, and so refuses to allow such
a sweeping impact to the site's users.
The developers know, however, that the "Minus" application is certainly safe
to clear out. They can target this specific subdomain by including a request
to that subdomain as part of the logout landing page (ideally as a
CORS-enabled, CSRF-protected POST):
<pre>
fetch("https://minus.megacorp.example.com/clear-site-data",
{
method: "POST",
mode: "cors",
headers: new Headers({
"CSRF": "[<em>insert sekrit token here</em>]"
})
});
</pre>
That endpoint would return proper CORS headers in response to that request's
preflight, and would return the following header for the actual request:
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>cache</a>", "<a grammar>cookies</a>", "<a grammar>storage</a>", "<a grammar>executionContexts</a>"
</pre>
</div>
<h4 id="example-keepcookies">Keep Critical Cookies</h4>
<div class="example">
A user opts-out of interest-based advertising via a CSRF-protected POST to
`https://ads-are-awesome.example.com/optout`. The site author
wishes to remove DOM-accessible data which might contain tracking
information, but needs to ensure that the opt-out cookie which the user has
just received isn't wiped along with it.
They can do so by sending the following HTTP header in the response, which
includes all the types except for "<a grammar>cookies</a>":
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>cache</a>", "<a grammar>storage</a>", "<a grammar>executionContexts</a>"
</pre>
</div>
<h4 id="example-killswitch">Kill Switch</h4>
<div class="example">
Super Secret Social Network's developers learn that the site was vulnerable
to cross-site scripting attacks which allowed malicious parties to inject
arbitrary code into its origin. They fixed the site, and added a strong
Content Security Policy [[CSP]] to mitigate the risk going forward, but
they can't be entirely sure that clients are really back to a trustworthy
state. Perhaps the attackers found a clever persistence mechanism?
They can reduce the risk of a persistent client-side XSS by sending the
following HTTP header in a response to wipe out local sources of data:
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>cache</a>", "<a grammar>cookies</a>", "<a grammar>storage</a>", "<a grammar>executionContexts</a>"
</pre>
Note: Installing a Service Worker guarantees that a request will go out to
a server every ~24 hours. That update ping would be a wonderful time to send
a header like this one in case of catastrophe. [[SERVICE-WORKERS]]
</div>
<h3 id="goals">Goals</h3>
Generally, the goal is to allow web developers more control over the data
stored locally by a user agent for their origins. In particular, developers
should be able to reliably ensure the following:
1. Data stored in an origin's client-side storage mechanisms like [[INDEXEDDB]],
WebSQL, Filesystem, {{localStorage}}, and {{sessionStorage}} is cleared.
2. Cookies for an origin's host are removed [[!RFC6265]].
3. Web Workers (dedicated and shared) running for an origin are terminated.
4. Service Workers registered for an origin are terminated and deregistered.
5. Resources from an origin are removed from the user agent's local cache.
6. The [=Accept-CH cache=] for an origin is purged.
7. None of the above can be bypassed by a maliciously active document that
retains interesting data in memory, and rewrites it if it's cleared.
</section>
<section>
<h2 id="infra">Infrastructure</h2>
This document uses ABNF grammar to specify syntax, as defined in [[!RFC5234]] and updated in
[[!RFC7405]], along with the `#rule` extension defined in
<a href="https://tools.ietf.org/html/rfc7230#section-7">Section 7</a> of [[!RFC7230]], and the
`quoted-string` rule defined in
<a href="https://tools.ietf.org/html/rfc7230#section-3.2.6">Section 3.2.6</a> of the same
document.
This document depends on the Infra Standard for a number of foundational concepts used in its
algorithms and prose [[!INFRA]].
</section>
<section>
<h2 id="clearing">Clearing Site Data</h2>
Developers may instruct a user agent to clear various types of relevant data
by delivering a `Clear-Site-Data` HTTP response header in response to a request.
<h3 id="header">
The `Clear-Site-Data` HTTP Response Header Field
</h3>
The <dfn http-header>`Clear-Site-Data`</dfn> HTTP response header field sends a signal to the
user agent that it ought to remove all data of a certain set of types. The header is
represented by the following grammar:
<pre class="abnf" link-type="grammar" dfn-type="grammar">
Clear-Site-Data = 1#( <a>quoted-string</a> )
; <a>#rule</a> is defined in Section 7 of RFC 7230.
</pre>
[[#fetch-integration]] and [[#parsing]] describe how the
<a http-header>Clear-Site-Data</a> header is processed.
This document defines an initial set of known data types which can be cleared using this
mechanism. See their descriptions below. Future versions of the header can support
additional datatypes, which MUST comply with the <a grammar>quoted-string</a> grammar. User agents
MUST ignore unknown types when parsing the header.
: "<dfn grammar>`cache`</dfn>"
:: The "`cache`" type indicates that the server wishes to remove locally
cached data associated with the <a>origin</a> of a particular
<a>response</a>'s [=response/url=]. This includes the <a>network
cache</a>, of course, but will also remove data from various other caches
which a user agent implements (prerendered pages, script caches, shader
caches, [=Accept-CH cache=], etc.).
Implementation details are in [[#clear-cache]].
<div class="example">
When delivered with a response from `https://example.com/clear`,
the following header will cause caches associated with the origin
`https://example.com`: to be cleared:
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>cache</a>"
</pre>
</div>
Note: Caches are typically not organized by origin, but rather by URL and
timestamp. This means that in practice, clearing cache by origin might
have to be implemented using a linear scan. For large caches, this makes
it a prohibitively expensive operation.
: "<dfn grammar>`cookies`</dfn>"
:: The "`cookies`" type indicates that the server wishes to remove cookies
associated with the <a>origin</a> of a particular <a>response</a>'s
[=response/url=]. Along with cookies, HTTP authentication credentials
[[!RFC7235]], and origin-bound tokens such as those defined by Channel
ID [[!CHANNELID]] and Token Binding [[!TOKBIND]] are also cleared.
Implementation details are in [[#clear-cookies]].
<div class="example">
When delivered with a response from `https://example.com/clear`,
the following header will cause cookies associated with the origin
`https://example.com` to be cleared, as well as cookies on any origin in
the same registered domain (e.g. `https://www.example.com/` and
`https://more.subdomains.example.com/`).
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>cookies</a>"
</pre>
</div>
Note: Clearing cookies should also clear the [=Accept-CH cache=] for
<a>origin</a>. This is because the cache is also cleared if the user
manually clears cookies.
: "<dfn grammar>`storage`</dfn>"
:: The "`storage`" type indicates that the server wishes to remove
locally stored data associated with the <a>origin</a> of a
particular <a>response</a>'s [=response/url=]. This includes storage
mechanisms such as ({{localStorage}}, {{sessionStorage}}, [[INDEXEDDB]],
[[WEBDATABASE]], etc), as well as tangentially related mechanism such as
<a>service worker registrations</a>.
Implementation details are in [[#clear-dom]].
<div class="example">
When delivered with a response from `https://example.com/clear`,
the following header will cause DOM-accessible storage for the origin
`https://example.com` to be cleared:
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>storage</a>"
</pre>
</div>
: "<dfn grammar>`executionContexts`</dfn>"
:: The "`executionContexts`" type indicates that the server wishes to neuter
and reload execution contexts currently rendering the <a>origin</a> of a
particular <a>response</a>'s [=response/url=].
<div class="example">
When delivered with a response from `https://example.com/clear`,
the following header will cause execution contexts displaying the origin
`https://example.com` to be neutered and reloaded:
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>executionContexts</a>"
</pre>
</div>
: "<dfn grammar>`clientHints`</dfn>"
:: The "`clientHints`" type indicates that the server wishes clear the [=Accept-CH cache=]
for the <a>origin</a> of a particular <a>response</a>'s [=response/url=].
<div class="example">
When delivered with a response from `https://example.com/clear`,
the following header will cause the [=Accept-CH cache=] for origin
`https://example.com` to be cleared:
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>clientHints</a>"
</pre>
</div>
Note: The [=Accept-CH cache=] is also cleared for the <a grammar>cache</a> and
<a grammar>cookies</a> options, so it should be used only when neither of the
other options (or <a grammar>*</a>) are applied.
: "<dfn grammar>`*`</dfn>"
:: The "`*`" (wildcard) pseudotype indicates that the server has the same
effect as specifying all types.
<div class="example">
When delivered with a response from `https://example.com/clear`,
the following header will cause all cookies, caches, and DOM-accessible
storage associated with the origin `https://example.com` to be cleared,
as well as execution contexts for the same origin to be neutered and
reloaded:
<pre>
<a http-header>Clear-Site-Data</a>: "<a grammar>*</a>"
</pre>
</div>
<div class="note">
<p>
Note: The wildcard is forward-compatible in the sense that if more
datatypes are added in future versions of this header, they will also
be covered by it.
</p>
<p>
DO use the wildcard if the intention is to perform a broad cleanup,
i.e. clear all data associated with an origin that the header knows
about.
</p>
<p>
DO NOT use the wildcard as a shorthand for the four types listed
above, as this meaning might change.
</p>
</div>
Note: The syntax defined here is compatible with future extensions to this document which might
add more granular filtering mechanisms to the types we've defined. For example, it's likely that
"<a grammar>`cookies`</a>" will need to grow a mechanism to prevent deletion of specific cookie
values. Wrapping all of the type names in double-quotes means that we can easily shift from
simple splitting-strings-on-commas processing to something more complicated (like processing the
header value as JSON) without losing backwards compatibility.
<h3 id="fetch-integration">Fetch Integration</h3>
ISSUE: Monkey patching! Talk with Anne.
If the <a http-header>`Clear-Site-Data`</a> header is present in an HTTP <a>response</a>
received from the network, then data MUST be cleared before rendering the
response to the user. That is, after step #14 in the current
<a>HTTP-network fetch</a> algorithm, execute the following step:
15. If <var ignore>credentials flag</var> is set, and |response|'s
[=response/header list=] contains a header named <a http-header>`Clear-Site-Data`</a>, then
execute [[#clear-response]] on |response|.
Note: This happens <em>after</em> `Set-Cookie` headers are
processed. If we clear cookies, we clear all of them. This is intentional, as
removing only certain cookies might leave an application in an indeterminate
and vulnerable state. Removing specific cookies is best done via expiration
using the `Set-Cookie` header.
Note: If we clear the [=Accept-CH cache=] via the `Clear-Site-Data` header then any
`Accept-CH` and `Critical-CH` headers in the same request must be ignored.
Note: While the fetch `credentials flag` is intended to restrict the
modification of cookies, <a http-header>`Clear-Site-Data`</a> applies the same restriction
to all types for the sake of consistency.
<section>
<section>
<h2 id="algorithms">Algorithms</h2>
<h3 id="parsing" algorithm>Parsing</h3>
Given a [=response=], the user agent can <dfn abstract-op local-lt="parse response">parse
|response|'s `Clear-Site-Data` header</dfn>, returning a list of types, as follows:
<ol class="algorithm">
1. Let |types| be an empty list.
2. Let |header| be the result of [=extracting header list values=] given `Clear-Site-Data` and
|response|'s [=response/header list=].
3. If |header| is `null` or failure, return an empty list.
4. For each |type| in |header|, execute the first matching statement, if any, switching on
|type|:
: \``"cache"`\`
:: Append "<a grammar>cache</a>" and "<a grammar>clientHints</a>" to |types|.
: \``"cookies"`\`
:: Append "<a grammar>cookies</a>" and "<a grammar>clientHints</a>" to |types|.
: \``"storage"`\`
:: Append "<a grammar>storage</a>" to |types|.
: \``"executionContexts"`\`
:: Append "<a grammar>executionContexts</a>" to |types|.
: \``"clientHints"`\`
:: Append "<a grammar>clientHints</a>" to |types|.
: \``"*"`\`
:: Append "<a grammar>cache</a>", "<a grammar>cookies</a>", "<a grammar>storage</a>",
"<a grammar>clientHints</a>", and "<a grammar>executionContexts</a>" to |types|.
5. Return |types|.
</ol>
Note: All of the existing values can be handled with the simple switch above. If and when more
complex type definitions are created, the parser will likely shift over to JSON entirely.
<h3 id="clear-response" algorithm>
Clear data for |response|
</h3>
Given a [=response=] (|response|), the user agent can <dfn abstract-op>clear site data for
|response|</dfn> as follows:
<ol class="algorithm">
1. If |response|'s [=response/url=] is not an <a><i lang="la">a priori</i> authenticated
URL</a>, then [=iteration/break=].
2. Let |types| be the result of <a lt="parse response" abstract-op>parsing |response|'s
`Clear-Site-Data` header</a>.
3. Let |origin| be |response|'s [=response/url=]'s [=url/origin=].
4. Let |browsing contexts| be the result of <a lt="prepare" abstract-op>preparing to clear
data</a> for |origin| and |types|.
5. For each |type| in |types|:
1. Execute the first matching statement, if any, switching on |type|:
: "<a grammar>`cache`</a>"
:: <a lt="clear cache" abstract-op>Clear cache for |origin|</a>.
: "<a grammar>`cookies`</a>"
:: <a lt="clear cookies" abstract-op>Clear cookies for |origin|</a>.
: "<a grammar>`storage`</a>"
:: <a lt="clear storage" abstract-op>Clear DOM-accessible storage for |origin|</a>.
: "<a grammar>`clientHints`</a>"
:: [=list/Empty=] [=Accept-CH cache=][|origin|].
6. If |types| contains "<a grammar>executionContexts</a>", then
<a lt="reload" abstract-op>Reload |browsing contexts|</a>.
</ol>
Note: User agents are are encouraged to give web developers some mechanism by which the clearing
operation can be debugged. This might take the form of a console message or timeline entry
indicating success.
<h4 id="prepare" algorithm>
Prepare to clear |origin|'s data
</h4>
Given an [=origin=] (|origin|) and a list of types (|types|), the user agent can
<dfn local-lt="prepare" abstract-op>prepare to clear |origin|'s data</dfn> by executing
the following steps. The algorithm returns a list of [=browsing contexts=] which have been
sandboxed in order to prevent them from recreating cleared data from in-memory JavaScript
variables.
<ol class="algorithm">
1. Let |sandboxed| be an empty list.
2. If |types| does not [=list/contain=] "<a grammar>`executionContexts`</a>", return
|sandboxed|.
3. For each |context| in the user agent's set of [=browsing contexts=]:
1. Let |document| be |context|'s [=active document=].
2. If |document|'s [=relevant settings object=]'s [=environment settings object/origin=]
is not |origin|, [=iteration/continue=].
3. [=Parse a sandboxing directive=] using the empty string as the input, and |document|'s
[=active sandboxing flag set=] as the output.
4. [=list/append|Append=] |context| to |sandboxed|.
4. Return |sandboxed|.
</ol>
<h4 id="reload-contexts">
Reload |browsing contexts|
</h4>
Given a list of [=browsing contexts=] (|contexts|), the user agent can
<dfn local-lt="reload" abstract-op>reload browsing contexts</dfn> as follows:
<ol class="algorithm">
1. For each |context| in |contexts|:
1. Execute |context|'s [=active document=]'s [=relevant settings object=]'s [=environment
settings object/global object=]'s {{Location}} object's {{Location/reload()}}.
ISSUE: This is the simplest thing, but it's probably reaching a little too far into the
documents and mucking with their context. I probably just need to break down and
copy/paste the relevant bits from HTML.
</ol>
<h4 id="clear-cache">
Clear cache for |origin|
</h4>
Given an <a>origin</a> (|origin|), the user agent can
<dfn local-lt="clear cache" abstract-op>clear cache for |origin|</dfn> as follows:
<ol class="algorithm">
1. Let |host| be |origin|'s [=origin/host=].
2. Let |cache list| be the set of entries from the <a>network cache</a> whose `target URI`
[=url/host=] is identical to |host|.
3. For each |entry| in |cache list|:
1. Remove |entry| from the <a>network cache</a>.
4. If a user agent implements caches beyond a pure <a>network cache</a>, it MUST remove all
entries from those caches which match |origin|.
ISSUE: We're dealing with the network cache here, as defined in [[!RFC7234]], but that's not
nearly everything a user agent caches. How hand-wavey with the vendor-specific section can
we be? For instance, Chrome clears out prerendered pages, script caches, WebGL shader
caches, WebRTC bits and pieces, address bar suggestion caches, various networking bits that
aren't representations (HSTS/HPKP, SCDH, etc.). Perhaps [[STORAGE]] will make this clearer?
</ol>
<h4 id="clear-cookies">
Clear cookies for |origin|
</h4>
Given an <a>origin</a> (|origin|), the user agent can
<dfn local-lt="clear cookies" abstract-op>clear cookies for |origin|</dfn> as follows:
Note: We remove all the cookies for an entire <a>registered domain</a>, as cookies ignore the
same-origin policy, and there's a distinct risk that we'd leave applications in an ill-defined
state if we only cleared cookies for a particular subdomain. Consider `accounts.google.com` vs
`mail.google.com`, for instance, both of which have cookies that signal a user's signed-in status.
Note: This algorithm assumes that the user agent has implemented a <a>cookie store</a> (as
discussed in Section 5.3 of [[!RFC6265]]), which offers the ability to retrieve a list of cookies
by host, and to remove individual cookies.
<ol class="algorithm">
1. Let |registered| be the <a>registered domain</a> of |origin|'s [=origin/host=].
2. Let |cookie list| be the set of cookies from the <a>cookie
store</a> whose `domain` attribute is a <a>domain-match</a> with
|registered|.
3. For each |cookie| in |cookie list|:
1. Remove |cookie| from the <a>cookie store</a>.
4. If the user agent supports other forms of cookie-like storage, these MUST also be cleared
for [=origins=] whose [=origin/host=]'s <a>registered domain</a> is |registered|.
Note: For example, if the user agent supports Flash, its local stored objects will be
cleared via <a href="https://wiki.mozilla.org/NPAPI:ClearSiteData">NPP_ClearSiteData</a>.
5. Clear any Channel IDs [[!CHANNELID]] and bound tokens [[!TOKBIND]] associated with
[=origins=] whose [=origin/host=]'s <a>registered domain</a> is |registered|.
6. Clear [=authentication entries=] and [=proxy-authentication entries=] associated with
[=origins=] whose [=origin/host=]'s <a>registered domain</a> is |registered|.
</ol>
ISSUE(w3c/webappsec-clear-site-data#2): The process of clearing both bound
tokens/IDs and HTTP authentication is super hand-wavey.
<h4 id="clear-dom">
Clear DOM-accessible storage for |origin|
</h4>
Given an <a>origin</a> (|origin|), the user agent can
<dfn local-lt="clear storage" abstract-op>clear DOM-accessible storage for |origin|</dfn> as
follows:
<ol class="algorithm">
1. For each |area| in the user agent's set of <a lt="localStorage" attribute>local storage
areas</a> [[!HTML]]:
1. If |area|'s <a>origin</a> is |origin|:
1. Execute {{Storage/clear()}} on the {{Storage}} object associated
with |area|.
2. For each |area| in the user agent's set of <a lt="sessionStorage" attribute>session storage
areas</a> [[!HTML]]:
1. If |area|'s <a>origin</a> is |origin|:
1. Execute {{Storage/clear()}} on the {{Storage}} object associated
with |area|.
3. For each |database| in the set of [=databases=] for |origin| [[!INDEXEDDB]]:
1. [=delete a database|Delete=] |database|.
4. For each |registration| in the user agent's set of <a>service worker registrations</a>:
1. If |registration|'s <a>scope URL</a>'s [=url/origin=] is |origin|:
1. Execute {{ServiceWorkerRegistration/unregister()}} on |registration|.
5. For each |appcache| in the user agent's set of [=application caches=]:
1. If |appcache|'s [=application cache group=] is identified by an URL whose [=url/origin=]
is |origin|:
1. Discard |appcache|.
6. For any other script-accessible storage mechanism, the user agent MUST
delete any data associated with this origin. This includes (but is not
limited to) the following:
1. An origin's WebSQL databases [[!WEBDATABASE]].
2. An origin's filesystems [[!file-system-api]]
3. Plugin data (e.g. Flash via <a href="https://wiki.mozilla.org/NPAPI:ClearSiteData">NPP_ClearSiteData</a>),
5. ISSUE: Moar?
</ol>
</section>
<section>
<h2 id="security">Security Considerations</h2>
<h3 id="incomplete">Incomplete Clearing</h3>
It is possible that an application could be put into an indeterminate state
by clearing only one type of storage. We mitigate that to some extent by
clearing all storage options as a block, and by requiring that the header be
delivered over a secure connection.
<h3 id="service-workers">Service workers</h3>
It is imperative that the Clear-Site-Data header is only respected on
responses fetched over network, and not those served by a service worker.
This is because service workers can return arbitrary responses for resource
requests in their scope, including third-party requests. Thus, supporting
Clear-Site-Data would give them the ability to clear data for any origin.
Note that if a request is sent to a service worker, not handled by it, then
restarted with a [=request/service-workers mode=] of "`none`" and sent to the network, the
corresponding response is a network response and can be handled. The previous attempt at obtaining
the response from a service worker is irrelevant.
Note also that a service worker update is a network response, and is therefore
not affected by this restriction. This is important in order to support the
use case in [[#example-killswitch]].
<h2 id="privacy">Privacy Considerations</h2>
<h3 id="user-vs-author">Web developers control the timing.</h3>
If triggered at appropriate times, <a http-header>`Clear-Site-Data`</a> can
increase a user's privacy and security by clearing sensitive data from their
user agent. However, note that the web developer (and <em>not</em> the user)
is in control of when the clearing event is triggered. Even assuming a
non-malicious site author, users can't rely on data being cleared at any
particular point, nor are users in control of what data types are cleared.
If a user wishes to ensure that site data is indeed cleared at some specific
point, they ought to rely on the data-clearing functionality offered by their
user agent.
At a bare minimum, user agents OUGHT TO (in the [[RFC6919]] sense of the
words) offer the same functionality to users that they offer to web
developers. Ideally, they will offer significantly more than we can offer at
a platform level (clearing browsing history, for example).
<h3 id="remnants">Remnants of data on disk.</h3>
While <a http-header>`Clear-Site-Data`</a> triggers a clearing event in a
user's agent, it is difficult to make promises about the state of a user's
disk after a clearing event takes place. In particular, note that it is up
to the user agent to ensure that all traces of a site's date is actually
removed from disk, which can be a herculean task (consider virtual memory,
as a good example of a larger issue).
In short, most user agents implement data clearing as "best effort", but
can't promise an exhaustive wipe.
If a user wishes to ensure that site data does not remain on disk, the best
way to do so is to use a browsing mode that promises not to intentionally
write data to disk (Chrome's "Incognito", Internet Explorer's "InPrivate",
etc). These modes will do a better job of keeping data off disk, but are
still subject to a number of limitations at the edges.
</section>
<section>
<h2 id="iana-considerations">IANA Considerations</h2>
The permanent message header field registry should be updated
with the following registration: [[!RFC3864]]
<h3 id="iana-clear-site-data">
Clear-Site-Data
</h3>
<dl>
<dt>Header field name</dt>
<dd>Clear-Site-Data</dd>
<dt>Applicable protocol</dt>
<dd>http</dd>
<dt>Status</dt>
<dd>standard</dd>
<dt>Author/Change controller</dt>
<dd>W3C</dd>
<dt>Specification document</dt>
<dd>This specification (See [[#header]])</dd>
</dl>
<section>
<h2 id="acknowledgements">Acknowledgements</h2>
Michal Zalewski proposed a variant of this concept, and Mark Knichel helped
refine the details.
</section>