深海游弋的鱼 – 默默的点滴

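When sending data over HTTP, the length of the body sometimes cannot be determined in advance, so the sender has no value for Content-Length and cannot include that header. The immediate consequence is that the receiver cannot learn the body length from Content-Length, so how does it know when the sender has finished? HTTP/1.1 introduced the Transfer-Encoding header for this case: when its value is chunked, the message body is transferred using the chunked coding.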

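HTTP/1.1 has two header fields directly related to encoding: Content-Encoding and Transfer-Encoding.
Content-Encoding first. This header names the codings that have already been applied to the entity. It is a property of the entity identified by the request URL. For example, a request for http://host/image.png.gz might return Content-Encoding: gzip. Content-Encoding values are case-insensitive; the codings defined by the HTTP/1.1 standard include gzip, compress, deflate, and identity (meaning no encoding).
The matching request header is Accept-Encoding, which tells the server which content codings the user agent (usually a browser) can accept. If the request carries no Accept-Encoding header, the server may assume the user agent accepts any content coding.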
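
The negotiation between Accept-Encoding and Content-Encoding can be sketched in Python as follows. This is only an illustration: `parse_accept_encoding` and `encode_body` are hypothetical helper names, the parser handles only the simple `coding;q=value` form, and, as a deliberate simplification, the sketch sends the body unencoded when the header is absent even though the server would be allowed to pick any coding.

```python
import gzip


def parse_accept_encoding(header_value):
    """Return {coding: q} from an Accept-Encoding value (simplified parser)."""
    codings = {}
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a coding listed without a q-value has weight 1
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unparsable weight as "not acceptable"
        codings[name.strip().lower()] = q  # coding names are case-insensitive
    return codings


def encode_body(body, accept_encoding):
    """Gzip the body if the client accepts gzip; return (headers, payload)."""
    accepted = parse_accept_encoding(accept_encoding or "")
    q = accepted.get("gzip", accepted.get("*", 0.0))
    if q > 0:
        payload = gzip.compress(body)
        headers = {"Content-Encoding": "gzip",
                   "Content-Length": str(len(payload))}
        return headers, payload
    # Fall back to the unencoded entity.
    return {"Content-Length": str(len(body))}, body
```

Note that Content-Length here is the length of the gzip-compressed payload, because the content coding is part of the entity that is actually sent.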

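Now for the main topic, Transfer-Encoding. This header names a coding applied to the entity for purposes such as safe transport through the network or data compression. It differs from Content-Encoding in two ways:
1, Transfer-Encoding exists only during transfer; it is not a property of the entity identified by the request URL.
2, Transfer-Encoding is a hop-by-hop header, whereas Content-Encoding is an end-to-end header.
As an example, for a request to http://host/abc.txt the server may decide, when sending the data, that the file can be gzip-compressed to save bandwidth. A receiver that sees gzip in Transfer-Encoding must first decode the message body before it obtains the requested entity.
Several codings may be applied to the same entity at once, so the order of the codings in the Transfer-Encoding header matters: they are listed in the order they were applied, and the receiver undoes them in reverse. Transfer-Encoding values are also case-insensitive. RFC 2616 defined chunked, gzip, compress, deflate, and identity; RFC 7230 later removed identity as a transfer coding.
One transfer coding is special: chunked. It sends the entity in chunks, each prefixed with its length in hexadecimal, and a chunk of length 0 marks the end of the transfer. This is especially useful when the entity length is not known in advance (for example, data generated dynamically from a database). The HTTP/1.1 standard requires that whenever a transfer coding is applied, chunked must be included and must be the last coding applied, unless the message is terminated by closing the connection. Every HTTP/1.1 application must be able to handle the chunked coding.
The request header that corresponds to Transfer-Encoding is TE, which states which transfer codings the requester is willing to accept in the response. If TE is empty or absent, the only acceptable transfer coding is chunked.
Another header related to Transfer-Encoding is Trailer, which is used together with the chunked coding; it is not covered in detail here.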
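
To make the chunked format concrete, here is a minimal Python sketch of an encoder and a decoder. The function names `encode_chunked` and `decode_chunked` are made up for this illustration. The decoder assumes the whole body is already in memory, ignores chunk extensions, and skips trailer fields rather than returning them; a real implementation would read incrementally from the connection and validate the size field more strictly.

```python
def encode_chunked(pieces):
    """Encode an iterable of bytes objects as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would end the body early
        out += b"%X\r\n" % len(piece)  # chunk size in hexadecimal
        out += piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, followed by an empty trailer section
    return bytes(out)


def decode_chunked(data):
    """Decode a chunked body; return (entity_body, bytes_consumed)."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)  # ValueError if the size line is cut off
        size_field = data[pos:eol].split(b";", 1)[0].strip()  # drop extensions
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:
            break  # the zero-length chunk marks the end of the data
        body += data[pos:pos + size]
        if data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")
        pos += size + 2
    # Skip optional trailer fields up to the terminating empty line.
    while True:
        eol = data.index(b"\r\n", pos)
        line = data[pos:eol]
        pos = eol + 2
        if not line:
            break
    return bytes(body), pos
```

For example, `encode_chunked([b"Wiki", b"pedia"])` produces `b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"`, and decoding that value gives back `b"Wikipedia"` together with the number of bytes consumed, which tells the receiver where the next message on the connection starts.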

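As the name suggests, Content-Length gives the length of the transferred entity in bytes (in a response to a HEAD request it gives the length that would have been sent, although no body is actually sent). Content-Length is strongly affected by Transfer-Encoding: whenever a Transfer-Encoding other than identity is present, the actual transfer length is determined by the chunked coding, and Content-Length is ignored even if it is present.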
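
Because Content-Length counts bytes rather than characters, it has to be computed from the encoded body. The short Python sketch below shows this; `build_response` is a hypothetical helper written for this example, and the fixed status line and Content-Type are assumptions.

```python
def build_response(text):
    """Build a minimal HTTP/1.1 response whose length is known up front."""
    body = text.encode("utf-8")
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"  # byte count, not character count
        "\r\n"
    ).encode("ascii")
    return head + body
```

The string `"你好, HTTP"` has 8 characters but encodes to 12 bytes in UTF-8, so the header must read `Content-Length: 12`.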

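On the length of the HTTP message body
HTTP distinguishes between the message body and the entity body. Put simply, when no Transfer-Encoding is applied, the message body is the entity body; once a Transfer-Encoding is applied, the message body is the encoded entity body:

    Message body = Transfer-Encoding encode(Entity body)

How is the length of the message body determined? The HTTP/1.1 standard gives the following rules, in order of precedence:
    1, If the response status is 1xx, 204, or 304, or the request method is HEAD, the message body length is 0.
    2, If a Transfer-Encoding other than "identity" is used, the message body length is determined by the "chunked" coding, unless the message is terminated by closing the connection.
    3, If a "Content-Length" header is present, the message body length is that value.
    4, If the message signals the end of its body by closing the connection, the length is the number of bytes received before the close. This rule does not apply to the body of an HTTP request.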

The full text from RFC 7230 reads:

3.3.3.  Message Body Length

   The length of a message body is determined by one of the following
   (in order of precedence):

   1.  Any response to a HEAD request and any response with a 1xx
       (Informational), 204 (No Content), or 304 (Not Modified) status
       code is always terminated by the first empty line after the
       header fields, regardless of the header fields present in the
       message, and thus cannot contain a message body.

   2.  Any 2xx (Successful) response to a CONNECT request implies that
       the connection will become a tunnel immediately after the empty
       line that concludes the header fields.  A client MUST ignore any
       Content-Length or Transfer-Encoding header fields received in
       such a message.

   3.  If a Transfer-Encoding header field is present and the chunked
       transfer coding (Section 4.1) is the final encoding, the message
       body length is determined by reading and decoding the chunked
       data until the transfer coding indicates the data is complete.

       If a Transfer-Encoding header field is present in a response and
       the chunked transfer coding is not the final encoding, the
       message body length is determined by reading the connection until
       it is closed by the server.  If a Transfer-Encoding header field
       is present in a request and the chunked transfer coding is not
       the final encoding, the message body length cannot be determined
       reliably; the server MUST respond with the 400 (Bad Request)
       status code and then close the connection.

       If a message is received with both a Transfer-Encoding and a
       Content-Length header field, the Transfer-Encoding overrides the
       Content-Length.  Such a message might indicate an attempt to
       perform request smuggling (Section 9.5) or response splitting
       (Section 9.4) and ought to be handled as an error.  A sender MUST
       remove the received Content-Length field prior to forwarding such
       a message downstream.

   4.  If a message is received without Transfer-Encoding and with
       either multiple Content-Length header fields having differing
       field-values or a single Content-Length header field having an
       invalid value, then the message framing is invalid and the
       recipient MUST treat it as an unrecoverable error.  If this is a
       request message, the server MUST respond with a 400 (Bad Request)
       status code and then close the connection.  If this is a response
       message received by a proxy, the proxy MUST close the connection
       to the server, discard the received response, and send a 502 (Bad
       Gateway) response to the client.  If this is a response message
       received by a user agent, the user agent MUST close the
       connection to the server and discard the received response.

   5.  If a valid Content-Length header field is present without
       Transfer-Encoding, its decimal value defines the expected message
       body length in octets.  If the sender closes the connection or
       the recipient times out before the indicated number of octets are
       received, the recipient MUST consider the message to be
       incomplete and close the connection.

   6.  If this is a request message and none of the above are true, then
       the message body length is zero (no message body is present).

   7.  Otherwise, this is a response message without a declared message
       body length, so the message body length is determined by the
       number of octets received prior to the server closing the
       connection.
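
The precedence rules quoted above can be sketched as a small Python function. This is a simplified reading of the section, not a complete implementation: `body_framing` and its return labels are names invented for this example, headers are assumed to arrive as a list of (name, value) pairs, and every error case in the RFC text is collapsed into a `ValueError` that the caller would map to a 400 response, a 502 response, or discarding the response. When both Transfer-Encoding and Content-Length are present, the sketch simply lets Transfer-Encoding win, although the RFC says such a message ought to be handled as an error.

```python
def body_framing(headers, *, is_response, status=None, request_method=None):
    """Return (kind, length) with kind in
    "none" | "tunnel" | "chunked" | "length" | "until-close".

    For responses, pass status and the method of the request being answered.
    """
    def values(name):
        return [v.strip() for n, v in headers if n.lower() == name]

    if is_response:
        # Rule 1: these responses never carry a body.
        if request_method == "HEAD" or 100 <= status < 200 or status in (204, 304):
            return ("none", 0)
        # Rule 2: a successful CONNECT turns the connection into a tunnel.
        if request_method == "CONNECT" and 200 <= status < 300:
            return ("tunnel", None)

    # Rule 3: Transfer-Encoding overrides any Content-Length.
    te = values("transfer-encoding")
    if te:
        codings = [c.strip().lower() for c in ",".join(te).split(",") if c.strip()]
        if codings and codings[-1] == "chunked":
            return ("chunked", None)
        if is_response:
            return ("until-close", None)
        raise ValueError("request Transfer-Encoding does not end in chunked")

    # Rules 4 and 5: Content-Length must be a single valid decimal value.
    cl = values("content-length")
    if cl:
        lengths = {part.strip() for v in cl for part in v.split(",")}
        value = lengths.pop() if len(lengths) == 1 else ""
        if not (value.isascii() and value.isdigit()):
            raise ValueError("invalid or conflicting Content-Length")
        return ("length", int(value))

    # Rule 6 (request: no body) and rule 7 (response: read until close).
    return ("until-close", None) if is_response else ("none", 0)
```

A caller would read the header section first, call this function, and then read the body accordingly: nothing at all, a chunked stream, exactly `length` bytes, or everything up to the connection close.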

Posted by 默默