#57627 closed defect (bug) (fixed)
The Cache-Control header for logged-in pages should include `private`
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | 6.3 | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | Administration | Keywords: | has-patch has-unit-tests |
| Focuses: | privacy | Cc: | |
Description
I believe WordPress returns the following Cache-Control header for pages that are rendered for logged-in users:
Cache-Control: no-cache, must-revalidate, max-age=0
I think the relevant code is here and here.
For pages for logged-in users I believe this header should be modified to include the private directive to indicate that the response should not be cached by intermediary shared cache servers.
The change should not be made everywhere nocache_headers() is used--only for responses that vary based on the logged-in user. And maybe also for users who have recently left a comment (#16612 is related), though it seems like this is hard for the server to know reliably. You could key off the presence of one of the comment_author_* cookies but those aren't always set.
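To make the proposal concrete, here is a minimal sketch in Python (not WordPress's actual PHP) of the decision described above. The function names `is_personalized` and `cache_control_for` are hypothetical, and the cookie-name prefixes are assumptions about how a request might be recognized as personalized. The presence of a cookie is only a proxy for being logged in, and as noted the comment cookies aren't always set, so this check is best-effort.

```python
# Base directives taken from the header quoted in this ticket.
NOCACHE_DIRECTIVES = ["no-cache", "must-revalidate", "max-age=0"]

# Assumed cookie-name prefixes; illustrative only.
LOGGED_IN_PREFIX = "wordpress_logged_in_"
COMMENT_AUTHOR_PREFIX = "comment_author_"


def is_personalized(cookie_names, include_commenters=False):
    """Best-effort guess at whether the response varies by user."""
    for name in cookie_names:
        if name.startswith(LOGGED_IN_PREFIX):
            return True
        if include_commenters and name.startswith(COMMENT_AUTHOR_PREFIX):
            return True
    return False


def cache_control_for(cookie_names, include_commenters=False):
    """Build the Cache-Control value, adding `private` only when personalized."""
    directives = list(NOCACHE_DIRECTIVES)
    if is_personalized(cookie_names, include_commenters):
        directives.append("private")
    return ", ".join(directives)


print(cache_control_for(["wordpress_logged_in_abc123"]))
print(cache_control_for([]))
```

Keeping the commenter check behind a flag mirrors the uncertainty above: the logged-in case is the core of the proposal, while the comment-cookie case is optional.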
The Meanings of no-cache and private
You might think that no-cache would be sufficient to accomplish this, but it's not. It's a bit confusing but no-cache means "this response may be stored in a cache but it must be revalidated before it is used." And so I believe that shared cache servers are allowed to cache pages rendered for logged-in users.
I've found MDN's caching guide to be helpful while trying to understand the meaning of the various directives. The Private Caches section says, "If a response contains personalized content and you want to store the response only in the private cache, you must specify a private directive." It's reiterated in the "Do Not Share With Others" section under "Don't Cache." And MDN's Cache-Control header reference contains a similar statement.
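The distinction described in this section can be sketched as a toy model of a shared cache's decision, written in Python. This is a simplification under stated assumptions, not the logic of any real cache: the function names are hypothetical, only the directive names are inspected, and quoted directive arguments and other storage conditions are ignored.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a set of lowercase directive names."""
    names = set()
    for part in value.split(","):
        # Drop any "=argument" so "max-age=0" becomes "max-age".
        name = part.strip().split("=", 1)[0].lower()
        if name:
            names.add(name)
    return names


def shared_cache_may_store(value):
    """Simplified: only `private` and `no-store` stop a shared cache from storing."""
    directives = parse_cache_control(value)
    return not ({"private", "no-store"} & directives)


def must_revalidate_before_reuse(value):
    """`no-cache` permits storage but requires revalidation before reuse."""
    return "no-cache" in parse_cache_control(value)


current = "no-cache, must-revalidate, max-age=0"
proposed = current + ", private"
print(shared_cache_may_store(current), must_revalidate_before_reuse(current))
print(shared_cache_may_store(proposed), must_revalidate_before_reuse(proposed))
```

In this model the current header is storable by a shared cache (subject to revalidation), while the proposed header is not.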
What's the Harm?
And of course the risk isn't just that the page is cached on a shared server, but that it's served to a user other than the logged-in user. Thankfully I think the risk is minimal for a few reasons:
- no-cache means the cache will attempt to revalidate the page before using it. I believe revalidation is not possible by default because WordPress does not set the ETag or Last-Modified header for these responses. Though this isn't a guarantee: Someone could configure their web server or a caching reverse proxy server to set the headers and return HTTP 304 if appropriate. Or a plugin could do these things. The WP Super Cache plugin even has options for "304 Browser caching" and "Enable caching for all visitors" (even logged-in visitors), though I couldn't get it to serve a logged-in page to a non-logged-in user so it looks like it's clever enough to use different cached data based on the user's cookie (I see that Cookie is added to the Vary header), so that's great.
- When used as caching reverse proxies Nginx and Varnish appear to not cache responses if the Cache-Control header includes no-cache, so they won't cache pages for logged-in users. For Nginx I think it's this logic. For Varnish I think it's this logic. I think they're allowed to cache these responses and it seems possible that they will in the future, but they don't currently. And as a counter example I believe Squid is willing to cache these responses (this FAQ is related but not super clear).
- I suspect shared cache servers are uncommon (though I've made no attempt to find data about it).
- The number of HTTPS sites has increased greatly over time and shared cache servers can't cache objects served over HTTPS (unless they decrypt and re-encrypt the data, which is mostly only possible on company-managed computers where the company is able to add their own signing certificate to the browser trust store).
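The mitigating factors in this list can be combined into a rough Python check of whether a stored response could plausibly be reused for a different user. It is a toy model with a hypothetical function name, and it assumes the response carries no-cache (as in the header quoted in this ticket), so reuse depends on a successful revalidation. It does not describe how Nginx, Varnish, Squid, or any plugin actually behaves.

```python
def _tokens(value):
    """Lowercased comma-separated tokens, with any '=argument' dropped."""
    return {t.strip().split("=", 1)[0].lower() for t in value.split(",") if t.strip()}


def could_be_reused_for_another_user(headers):
    """Toy model for a no-cache response; headers maps header name -> value."""
    h = {k.lower(): v for k, v in headers.items()}
    cc = _tokens(h.get("cache-control", ""))
    vary = _tokens(h.get("vary", ""))

    # A shared cache may store the response unless private/no-store is present.
    storable = not ({"private", "no-store"} & cc)
    # Without a validator, a conditional request cannot produce a 304.
    revalidatable = "etag" in h or "last-modified" in h
    # Vary: Cookie (or *) keeps one user's copy from matching another's request.
    keyed_by_cookie = "cookie" in vary or "*" in vary
    return storable and revalidatable and not keyed_by_cookie


base = {"Cache-Control": "no-cache, must-revalidate, max-age=0"}
print(could_be_reused_for_another_user(base))
print(could_be_reused_for_another_user({**base, "ETag": '"abc"'}))
```

Under these assumptions the default response is safe because it has no validator, but adding an ETag (for example via server or plugin configuration) removes that protection unless private or Vary: Cookie is also present.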
So Why Should We Change It?
While I think it's rare that the lack of private will cause harm, WordPress is widely used and there are many ways to configure cache-related headers. I'd guess there is a non-zero chance that this problem has surfaced at some point in time and so I feel that it's worth changing. The risk from adding the header feels low to me.
I'll caveat this ticket by saying that I'm not intimately familiar with caching behavior. I've just been looking at it a lot over the last few days. It's entirely possible that I'm wrong about all of this.
Related Tickets
- #16612 proposes using nocache_headers() for requests with comment cookies. That seems appropriate to me, and also using private.
- #21938 proposes adding no-store to the nocache_headers() list. This is a separate consideration from the issue I'm raising above. I don't know whether it's a good proposal. There's a lot to think about there.