HTTP

From Conservapedia
This is an old revision of this page, as edited by JlHawkwell (Talk | contribs) at 20:56, August 2, 2010. It may differ significantly from current revision.

HTTP (Hypertext Transfer Protocol) is a text-based data transfer protocol built on a request-and-response model: each transaction consists of one request and one response. It follows the client-server model of communication. A client is software that makes requests; a typical web browser is one example. Servers accept connections and requests from clients and return the results, often after intermediate processing that determines or modifies the content to return. Software such as MediaWiki, used by Conservapedia, runs on the server and performs this intermediate processing. Software on the server may also act as a client, for example as a reverse proxy for load balancing, or to obtain content from other servers, as is commonly done in the fight against internet spam.

Requests

Clients make requests by sending a request line followed by a series of headers after connecting to a server. The headers describe the request in detail and allow the client to specify exactly what it needs from the server. The first line of the request always includes the type of request, the location of the resource on the server, and the HTTP version requested. Servers are free to respond using HTTP/1.0 when 1.1 is requested, or HTTP/1.1 when 1.0 is requested. All other headers may appear in any order. Both requests and responses consist of two parts: the header and the body. A blank line separates the two, and the body may be empty.
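The structure described above can be illustrated with a minimal Python sketch that splits a raw request into its first line, its headers, and its body. The function name parse_request and the sample request to www.example.com are made up for illustration, and the sketch assumes a well-formed message with CRLF line endings; a real server handles many more cases.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into first line, headers, and body."""
    # The header section ends at the first blank line; the rest is the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # First line: request type, resource, and HTTP version, space-separated.
    method, target, version = lines[0].split(" ", 2)

    # Remaining lines are "Name: value" headers. Header names are
    # case-insensitive, so they are lowercased here for lookup.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


# Hypothetical request used only to exercise the function.
raw_request = (
    "POST /submit HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "hello"
)
method, target, version, headers, body = parse_request(raw_request)
print(method, target, version)
print(headers["host"], repr(body))
```

Because the headers are stored in a dictionary, their order in the request does not matter, which matches the rule that only the first line has a fixed position.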

Request Headers

  • First Line - Also called the request line. It states the nature of the request (GET, POST, PUT, and HEAD are the most common), a resource on the server, and the HTTP version. All headers must appear after this first line.
  • Host - If the server provides content for multiple domain names, this header specifies which domain name the request is made against. Clients cannot know in advance whether the server needs it, and HTTP/1.1 requires it in every request, so clients always send this header.
  • Accept - List of content types the client is willing to accept.
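  • User-Agent - A string that identifies the client. Web browsers usually use it to report their name, version, rendering engine, and operating system, which is often more detail than a server needs.
  • Referer - If the client software is a web browser, this header contains the URL of the page the request came from. It is typically provided only when the user followed a link or the browser loaded a resource embedded in a page. The header name is a misspelling of "referrer" that was kept in the specification.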
  • Cookie - Cookies are small bits of text that are meant to carry information across requests. The most common uses of this data include authentication information (when you log into a website) and tracking information (such as a unique client ID used by advertisers for demographics collection).
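As a small illustration of the Cookie header, the following Python sketch uses the standard library's http.cookies.SimpleCookie class to turn a header value into name/value pairs. The helper name parse_cookie_header and the cookie names session_id and theme are hypothetical, chosen only for the example.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(value: str) -> dict:
    """Return the name/value pairs carried in a Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(value)  # parses "name=value; name2=value2"
    return {name: morsel.value for name, morsel in cookie.items()}


# Hypothetical header value, as a browser might send after a login.
cookies = parse_cookie_header("session_id=abc123; theme=dark")
print(cookies["session_id"], cookies["theme"])
```

A server would typically look up a value such as session_id in its own session store to decide which logged-in user the request belongs to.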

Response Headers

  • First Line - Similar to the first line in the request header, this one provides the HTTP version used by the server, the status code and the status text.
  • Cache-Control - Instructions on how the response body may be stored for later use to reduce the number of requests made against the server in question.
  • Content-Encoding - Only used when the content has been encoded, such as when the response body has been compressed.
  • Content-Type - Specifies the type of data encapsulated in the response body. This may be text/html for web pages, or image/png for Portable Network Graphics images.
  • Content-Language - When the content requested is available in multiple languages, this header may be used to specify the language of the intended audience. It refers to human language, not a programming or scripting language, and may be applied to any content type, not only text/ types.
  • Date - Tells the client what the server's clock thinks the current date and time are. Servers with a working clock are expected to send it with nearly every response, and clients and caches use it when working out how old a response is.
  • Expires - This is used by the server to set a specific expiration date for the content provided. This may be set sometime in the past to prevent caching, or sometime in the future to encourage caching.
  • Last-Modified - Used to inform the client when the content was last updated, according to the server's clock.
  • Server - Tells clients about the server software in question. Additionally, a header named X-Powered-By tells the client what technology was used in processing the request. These are sometimes disabled by administrators for security purposes.
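To make the first line and the Content-Type header concrete, here is a minimal Python sketch that parses both. The function names parse_status_line and parse_content_type are made up for this example, and the parsing is deliberately simplified; it is not a complete implementation of the header grammar.

```python
def parse_status_line(line: str):
    """Split a response's first line into version, status code, and text."""
    parts = line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    # The status text may contain spaces ("Not Found") or be missing.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


def parse_content_type(value: str):
    """Return (media type, charset or None) from a Content-Type value."""
    media_type, _, params = value.partition(";")
    charset = None
    for param in params.split(";"):
        key, _, val = param.partition("=")
        if key.strip().lower() == "charset":
            charset = val.strip().strip('"')
    return media_type.strip().lower(), charset


print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_content_type("text/html; charset=UTF-8"))
print(parse_content_type("image/png"))
```

Clients generally act on the numeric status code; the status text is informational and meant for people reading the exchange.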

Example Request and Response

Request:

GET /Conservapedia HTTP/1.1
Host: www.conservapedia.com
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Referer: http://www.conservapedia.com/Main_Page
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4

Here we can see a request for www.conservapedia.com/Conservapedia, made after the user clicked a link to this resource on the front page of Conservapedia. We can also see that the user is on an Intel-based Mac running Mac OS X 10.6.4 and that the browser is Chrome, a WebKit-based browser. The Accept header shows the browser is willing to accept any type of content (*/*), but its q values rank XML, XHTML, and PNG above HTML, plain text, and everything else.
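The preference order in the Accept header can be worked out with a short Python sketch. The function name parse_accept is hypothetical, and the sketch makes two simplifying assumptions: an entry without a q parameter counts as q=1.0, and entries are ranked by q alone, ignoring the finer tie-breaking rules a real server may apply.

```python
def parse_accept(value: str):
    """Return (media type, q) pairs from an Accept header, best first."""
    entries = []
    for item in value.split(","):
        media_type, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # assumed default when no q parameter is given
        for param in params:
            key, _, val = param.partition("=")
            if key.strip() == "q":
                q = float(val)
        entries.append((media_type, q))
    # sorted() is stable, so types with equal q keep their original order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


accept = ("application/xml,application/xhtml+xml,text/html;q=0.9,"
          "text/plain;q=0.8,image/png,*/*;q=0.5")
for media_type, q in parse_accept(accept):
    print(media_type, q)
```

Applied to the request above, the wildcard */* comes out last with q=0.5, which is why the browser accepts anything but prefers the types it names explicitly.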

Response:

HTTP/1.1 200 OK
Cache-Control: private, must-revalidate, max-age=0
Connection: close
Content-Encoding: gzip
Content-Type: text/html; charset=UTF-8
Content-language: en
Date: Mon, 02 Aug 2010 19:38:27 GMT
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Last-Modified: Mon, 02 Aug 2010 19:10:52 GMT
Server: Apache/1.3.41 (Unix)
Transfer-Encoding: chunked
Vary: Accept-Encoding,Cookie
X-Powered-By: PHP/5.2.5
X-Vary-Options: Accept-Encoding;list-contains=gzip,Cookie;string-contains=cpwiki_mediaToken;string-contains=cpwiki_mediaLoggedOut;string-contains=cpwiki_media_session

The server accepted the request and reports success with the status 200 OK. The Cache-Control header shows that shared (public) caches may not store the response, and that even the client's local cache must check back with the server before reusing it. The server will close the connection after the content is sent, and the content is compressed with gzip. The Expires date lies far in the past, which further discourages clients from caching the content, even though the content was recently modified. The server also tells us it is Apache, with a request processor called PHP.
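The caching headers in this response can be read programmatically. The Python sketch below is a simplified illustration: the function names parse_cache_control and already_expired are made up, and the directive parsing assumes simple comma-separated values without quoted strings. The date handling relies on email.utils.parsedate_to_datetime from the standard library.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value: str) -> dict:
    """Map each Cache-Control directive to its argument (or None)."""
    directives = {}
    for item in value.split(","):
        name, _, arg = item.strip().partition("=")
        if name:
            directives[name.lower()] = arg or None
    return directives


def already_expired(date_header: str, expires_header: str) -> bool:
    """True if Expires is not later than the server's own Date."""
    expires = parsedate_to_datetime(expires_header)
    return expires <= parsedate_to_datetime(date_header)


directives = parse_cache_control("private, must-revalidate, max-age=0")
print(directives)
print(already_expired("Mon, 02 Aug 2010 19:38:27 GMT",
                      "Thu, 01 Jan 1970 00:00:00 GMT"))
```

Comparing Expires against the Date header, rather than the client's clock, avoids errors when the two machines disagree about the current time.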

History

HTTP was developed by Tim Berners-Lee in 1990 [1]. This original version later came to be known as HTTP/0.9. It was updated and extended many times by many people, yet was not formally documented in an RFC until it was published in RFC 1945 [2] as HTTP/1.0. HTTP was designed to transfer HTML, also developed by Tim Berners-Lee. Modern servers and clients typically use HTTP/1.1, which was defined as an update to HTTP/1.0 in RFC 2616 [3].

References