HTTP headers, part 2
And a few more useful headers. The Keep-Alive header tells the server to leave the connection open: the server will not close the connection immediately after sending the response. As a result, the next request from the same client to the server completes faster, because no new connection has to be set up.
But if every client demands a persistent connection, the server is the one that runs into trouble: it will either become unavailable or start closing connections at its own discretion.
7.2 The Cache-Control header
The Cache-Control header controls how content is cached. Well-configured caching speeds up access to content; badly configured caching creates problems out of nothing.
To disable caching, send this header:
Cache-Control: no-cache, no-store, must-revalidate
Nothing should be stored in the cache, neither for client requests nor for server responses. The request is always sent to the server, and the response is always downloaded in full.
You can also enable the simplest and most reliable kind of caching:
Cache-Control: no-cache
Before serving a stored copy, the cache asks the origin server whether the resource is still up to date.
You can specify how long a resource may be cached, in seconds. Such a header looks like this:
Cache-Control: max-age=31536000
This header sets the maximum time the content may be kept in the cache; 31536000 seconds is 365 days.
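The directives above can be read programmatically. The following is a minimal sketch, not a full HTTP cache implementation: the class name CacheControlSketch and its helper methods are hypothetical and exist only to illustrate how a Cache-Control value breaks down into directives.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class CacheControlSketch {

    // Splits a Cache-Control value into directives.
    // Flag directives such as no-store map to an empty string.
    static Map<String, String> parse(String headerValue) {
        Map<String, String> directives = new LinkedHashMap<>();
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int eq = token.indexOf('=');
            if (eq < 0) {
                directives.put(token.toLowerCase(), "");
            } else {
                directives.put(token.substring(0, eq).trim().toLowerCase(),
                               token.substring(eq + 1).trim());
            }
        }
        return directives;
    }

    // True if a cache is allowed to store the response at all.
    static boolean mayStore(Map<String, String> directives) {
        return !directives.containsKey("no-store");
    }

    // True if the cache has to ask the origin server before reusing a copy.
    static boolean mustRevalidateFirst(Map<String, String> directives) {
        return directives.containsKey("no-cache");
    }

    // Returns max-age in seconds, or -1 when the directive is absent.
    static long maxAgeSeconds(Map<String, String> directives) {
        String value = directives.get("max-age");
        return value == null ? -1 : Long.parseLong(value);
    }

    public static void main(String[] args) {
        Map<String, String> d = parse("max-age=31536000");
        System.out.println("may store: " + mayStore(d));
        System.out.println("max-age, seconds: " + maxAgeSeconds(d));
    }
}
```

A real cache also has to take other headers and the full set of directives into account; the sketch only covers the three cases discussed here.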
7.3 Cookies
The server can store data on the client side. Such data is called cookies. The client can save cookies too, for that matter. They can be very useful to both sides.
For example, you open a site and you are already logged in. That is, when you logged in last time, the server told the browser to save information about the successful login of a particular user.
Here is what cookies look like in a request:
Cookie: name=value; name2=value2; nameN=valueN
Cookies are usually stored by the browser and are tied to a specific domain. When you visit the same domain again, the cookies are automatically attached to the HTTP request, and the server can set or update them in the HTTP response. A server/domain cannot read cookies that another server/domain has stored in the browser.
Each cookie has 4 main parameters: a name, a value, an expiration time, and the domain it is tied to.
Cookies are stored and transmitted as text, so both the name and the value are strings. If a cookie's lifetime is not specified, it is destroyed when the browser is closed.
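To make these parameters concrete, here is a small sketch that reads them with the JDK's java.net.HttpCookie class. The cookie names, values, and the domain example.com are made up for illustration, and the wrapper class CookieSketch is hypothetical.

```java
import java.net.HttpCookie;
import java.util.List;

// Hypothetical helper, for illustration only.
class CookieSketch {

    // Parses the value of one Set-Cookie response header
    // and returns the first cookie it describes.
    static HttpCookie parseSetCookie(String headerValue) {
        List<HttpCookie> cookies = HttpCookie.parse(headerValue);
        return cookies.get(0);
    }

    // A cookie without an explicit lifetime reports a negative max age;
    // such a cookie lives only until the browser is closed.
    static boolean expiresWithBrowser(HttpCookie cookie) {
        return cookie.getMaxAge() < 0;
    }

    public static void main(String[] args) {
        HttpCookie c = parseSetCookie("sessionToken=abc123; Max-Age=3600; Domain=example.com");
        System.out.println("name:    " + c.getName());
        System.out.println("value:   " + c.getValue());
        System.out.println("max age: " + c.getMaxAge() + " s");
        System.out.println("domain:  " + c.getDomain());
        System.out.println("expires with browser: " + expiresWithBrowser(c));
    }
}
```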
7.4 Session
After a user logs in to a site, a session is said to have been established between the browser and the server.
The server creates a special object, HttpSession, where it keeps all the information needed to work with the authenticated client. The unique identifier of this object is stored in the browser as a cookie.
Java web servers usually use the name JSESSIONID for the cookie that stores the session identifier. It looks roughly like this:
Cookie: JSESSIONID=<session-id>
On the server side you can set the session lifetime, as well as whether the session is closed automatically when the browser is closed.
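The idea of a server-side session object looked up by an identifier from a cookie can be sketched with a plain in-memory map. This is not how a servlet container implements HttpSession; the class SessionStoreSketch and its methods are hypothetical, and session expiry by time is left out.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session store, for illustration only.
class SessionStoreSketch {

    // Session ID -> attributes kept on the server for that client.
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Creates a session and returns its ID, the value the server
    // would hand to the browser in the JSESSIONID cookie.
    String create() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // Finds the session for an ID taken from the request cookie;
    // returns null if the ID is missing or the session no longer exists.
    Map<String, Object> find(String id) {
        return id == null ? null : sessions.get(id);
    }

    // Ends the session, for example on logout.
    void invalidate(String id) {
        if (id != null) {
            sessions.remove(id);
        }
    }

    public static void main(String[] args) {
        SessionStoreSketch store = new SessionStoreSketch();
        String id = store.create();
        store.find(id).put("user", "alice");
        System.out.println("JSESSIONID=" + id + " -> " + store.find(id));
        store.invalidate(id);
        System.out.println("after logout: " + store.find(id));
    }
}
```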
Persistent Connections
HTTP persistent connections, also called HTTP keep-alive or HTTP connection reuse, use the same TCP connection to send and receive multiple HTTP requests/responses, as opposed to opening a new connection for every single request/response pair. Using persistent connections is very important for improving HTTP performance.
There are several advantages of using persistent connections, including:
- Network friendly. Less network traffic, because fewer TCP connections are set up and torn down.
- Reduced latency on subsequent requests, because the initial TCP handshake is avoided.
- Long-lasting connections give TCP sufficient time to determine the congestion state of the network and react appropriately.
The advantages are even more obvious with HTTPS (HTTP over SSL/TLS). There, persistent connections may reduce the number of costly SSL/TLS handshakes needed to establish security associations, in addition to the initial TCP connection setup.
In HTTP/1.1, persistent connections are the default behavior of any connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server. However, the protocol provides means for a client and a server to signal the closing of a TCP connection.
What makes a connection reusable?
Since TCP is by nature a stream-based protocol, reusing an existing connection requires the HTTP protocol to have a way to indicate the end of the previous response and the beginning of the next one. Thus, all messages on the connection MUST have a self-defined message length (i.e., one not defined by closure of the connection). Self-demarcation is achieved either by setting the Content-Length header or, for a body sent with chunked transfer encoding, by starting each chunk with its size and ending the response body with a special last chunk.
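The chunked case can be sketched as a small decoder that stops exactly at the last chunk and leaves whatever follows on the stream untouched, which is what makes the connection reusable. This is a simplified illustration, not the JDK's own decoder: the class ChunkedBodySketch is hypothetical, chunk extensions and trailer fields are skipped rather than interpreted, and malformed input is not handled gracefully.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical decoder, for illustration only.
class ChunkedBodySketch {

    // Reads one line terminated by CRLF and returns it without the line ending.
    static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                continue;
            }
            if (b == '\n') {
                break;
            }
            line.append((char) b);
        }
        return line.toString();
    }

    // Decodes a chunked body and returns as soon as the last chunk is consumed.
    // Bytes that follow (the next response) stay unread in the stream.
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semicolon = sizeLine.indexOf(';');   // drop chunk extensions, if any
            if (semicolon >= 0) {
                sizeLine = sizeLine.substring(0, semicolon);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16);
            if (size == 0) {
                // Last chunk: skip optional trailer fields up to the empty line.
                while (!readLine(in).isEmpty()) {
                    // nothing to do, the line is discarded
                }
                return body.toByteArray();
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n < 0) {
                    throw new IOException("stream ended inside a chunk");
                }
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in);                            // consume the CRLF after the chunk data
        }
    }

    // Convenience wrapper for experiments: decodes an in-memory wire image.
    static String decodeToString(byte[] wire) {
        try {
            return new String(decode(new java.io.ByteArrayInputStream(wire)),
                              java.nio.charset.StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new java.io.UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
                .getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(decodeToString(wire));
    }
}
```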
What happens if there are proxy servers in between?
Since a persistent connection applies to only one transport link, it is important that proxy servers correctly signal persistent or non-persistent connections separately with their clients and with the origin servers (or other proxy servers). From an HTTP client's or server's perspective, as far as connection persistence is concerned, the presence or absence of proxy servers is transparent.
What does the current JDK do for Keep-Alive?
The JDK supports both HTTP/1.1 and HTTP/1.0 persistent connections.
When the application finishes reading the response body, or when the application calls close() on the InputStream returned by URLConnection.getInputStream(), the JDK's HTTP protocol handler will try to clean up the connection and, if successful, put the connection into a connection cache for reuse by future HTTP requests.
The support for HTTP keep-alive is done transparently. However, it can be controlled by the system properties http.keepAlive and http.maxConnections, as well as by HTTP/1.1 specified request and response headers.
The system properties that control the behavior of Keep-Alive are:
http.keepAlive=<boolean>
default: true
Indicates if keep-alive (persistent) connections should be supported.
http.maxConnections=<int>
default: 5
Indicates the maximum number of connections per destination to be kept alive at any given time.
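These properties can be passed on the command line with -D or set from code. The sketch below sets them programmatically; the class KeepAliveConfigSketch and the value 10 are only examples, and it is an assumption that the properties need to be set before the application issues its first HTTP request in order to take effect.

```java
// Hypothetical helper, for illustration only.
class KeepAliveConfigSketch {

    // Sets the two keep-alive properties described above.
    // Roughly equivalent to: java -Dhttp.keepAlive=true -Dhttp.maxConnections=10 ...
    static void configure(boolean keepAlive, int maxConnections) {
        System.setProperty("http.keepAlive", Boolean.toString(keepAlive));
        System.setProperty("http.maxConnections", Integer.toString(maxConnections));
    }

    public static void main(String[] args) {
        // Assumption: call this early, before any HTTP request is made.
        configure(true, 10);
        System.out.println("http.keepAlive=" + System.getProperty("http.keepAlive"));
        System.out.println("http.maxConnections=" + System.getProperty("http.maxConnections"));
    }
}
```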
The HTTP header that influences connection persistence is:
Connection: close
If the "Connection" header is specified with the value "close" in either the request or the response header fields, it indicates that the connection should not be considered persistent after the current request/response is complete.
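On the client side, a request can carry this header explicitly. The following sketch prepares such a request with HttpURLConnection without sending it; the class ConnectionCloseSketch and the URL http://example.com/ are placeholders chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, for illustration only.
class ConnectionCloseSketch {

    // Prepares (but does not send) a request that asks for the connection
    // to be closed once the response has been delivered.
    static HttpURLConnection openNonPersistent(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestProperty("Connection", "close");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = openNonPersistent("http://example.com/");
        System.out.println("Connection: " + conn.getRequestProperty("Connection"));
    }
}
```

No network traffic happens here: openConnection() only creates the connection object, and the request is sent later, for example when getInputStream() is called.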
The current implementation doesn't buffer the response body, which means that the application has to finish reading the response body or call close() to abandon the rest of it in order for that connection to be reused. Furthermore, the current implementation will not attempt a blocking read when cleaning up the connection, meaning that if the whole response body is not available, the connection will not be reused.
What's new in JDK 5?
When the application encounters an HTTP 400 or 500 response, it may ignore the IOException and then issue another HTTP request. In this case, the underlying TCP connection won't be kept alive, because the response body is still there to be consumed, so the socket connection is not cleared and therefore not available for reuse. What the application needs to do is call HttpURLConnection.getErrorStream() after catching the IOException, read the response body, then close the stream. However, some existing applications do not do this and, as a result, do not benefit from persistent connections. To address this problem, a workaround was introduced.
The workaround involves buffering the response body if the response code is >=400, up to a certain amount and within a time limit, thus freeing up the underlying socket connection for reuse. The rationale is that when the server responds with a >=400 error (a client error or a server error, for example "404: File Not Found"), it usually sends a small response body to explain whom to contact and what to do to recover.
Several new Oracle JDK implementation-specific properties were introduced to help clean up connections after an error response from the server.
sun.net.http.errorstream.enableBuffering=<boolean>
default: false
With the above system property set to true (the default is false), when the response code is >=400, the HTTP handler will try to buffer the response body, thus freeing up the underlying socket connection for reuse. So even if the application doesn't call getErrorStream(), read the response body, and then call close(), the underlying socket connection may still be kept alive and reused.
The following two system properties provide further control over the error stream buffering behavior:
sun.net.http.errorstream.timeout=<int> in milliseconds
default: 300 milliseconds
sun.net.http.errorstream.bufferSize=<int> in bytes
default: 4096 bytes
What can you do to help with Keep-Alive?
Do not abandon a connection by ignoring the response body. Doing so may result in idle TCP connections, which need to be garbage collected when they are no longer referenced.
If getInputStream() successfully returns, read the entire response body.
When calling getInputStream() from HttpURLConnection, if an IOException occurs, catch the exception and call getErrorStream() to get the response body (if there is any).
Reading the response body cleans up the connection even if you are not interested in the response content itself. If the response body is long and you are not interested in the rest of it after seeing the beginning, you can close the InputStream. Be aware, however, that more data could be on its way, so the connection may not be cleared for reuse.
Here's a code example that complies with the above recommendations:
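URLConnection conn = null;
byte[] buf = new byte[4096];
try {
    URL a = new URL(args[0]);
    conn = a.openConnection();
    InputStream is = conn.getInputStream();
    int ret = 0;
    // read the entire response body
    while ((ret = is.read(buf)) > 0) {
        processBuf(buf);
    }
    // close the input stream
    is.close();
} catch (IOException e) {
    try {
        // conn is null if the URL itself was malformed
        if (conn instanceof HttpURLConnection) {
            int respCode = ((HttpURLConnection) conn).getResponseCode();
            InputStream es = ((HttpURLConnection) conn).getErrorStream();
            // getErrorStream() returns null when there is no error body
            if (es != null) {
                int ret = 0;
                // read the response body
                while ((ret = es.read(buf)) > 0) {
                    processBuf(buf);
                }
                // close the error stream
                es.close();
            }
        }
    } catch (IOException ex) {
        // deal with the exception
    }
}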
If you know ahead of time that you won't be interested in the response body, you should issue a HEAD request instead of a GET request, for example when you are only interested in the meta info of the web resource or when testing its validity, accessibility, and recent modification. Here's a code snippet:
URL a = new URL(args[0]);
URLConnection urlc = a.openConnection();
HttpURLConnection httpc = (HttpURLConnection) urlc;
// only interested in the length of the resource
httpc.setRequestMethod("HEAD");
int len = httpc.getContentLength();
Changes in JDK 6
Prior to JDK 6, if an application closed an HTTP InputStream when more than a small amount of data remained to be read, the connection had to be closed rather than cached. In JDK 6, the behavior is to read up to 512 Kbytes off the connection in a background thread, thus allowing the connection to be reused. The exact amount of data that may be read is configurable through the http.KeepAlive.remainingData system property.