HTTP State & Session Management - JohnHau/mis GitHub Wiki
Introduction

HTTP is a stateless (or non-persistent) protocol. Each request is treated on its own; a request does not know what was done in previous requests. The protocol is designed to be stateless for simplicity. However, some Internet applications, such as an e-commerce shopping cart, require state information to be passed from one request to the next. Since the protocol is stateless, it is the responsibility of the application to maintain that state information.
A few techniques can be used to maintain state information across multiple HTTP requests, namely:

- Cookies.
- Hidden fields of the HTML form.
- URL rewriting.

Cookies

HTTP is a stateless protocol. That is, the server does not hold any information on previous requests sent by the client. Client-side cookies were introduced by Netscape to maintain state: client-specific information is stored on the client's machine and later retrieved to recover the state. Lou Montulli, who created the cookie at Netscape, chose the term "cookie" because "a cookie is a well-known computer science term that is used when describing an opaque piece of data held by an intermediary". The term opaque here implies that the content is of interest and relevance only to the server and not the client.
A cookie is a small piece of information that a server sends to a browser and that the browser stores. The browser automatically includes the cookie in all its subsequent requests to the originating host of the cookie. Take note that the browser sends a cookie back only to its originating host and not to any other host. In this way, the server can uniquely identify a client (or a browser).
A cookie has a name and a value, and other attributes such as domain and path, expiration date, version number, and comments. Domain and path specify which server (and which paths on it) the cookie is returned to. A cookie can be persistent (with a future expiration date) or last only for that particular browser session (i.e., it is removed when you close the browser). A web browser is only required to accept up to 20 cookies per server and 300 cookies in total. Each cookie is limited to 4 KB.
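To make the domain and path rule concrete, here is a minimal Java sketch of how a browser might choose which stored cookies to return with a request. All names in it (`CookieMatchSketch`, `StoredCookie`, `domainMatches`, `pathMatches`, `select`) are illustrative assumptions rather than part of any specification or library. It assumes Netscape-style matching: the cookie's domain is compared against the tail of the host name, and the cookie's path must be a prefix of the request path. Requiring the domain to match on a dot boundary is an extra assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Sketch (not a real browser implementation) of selecting cookies for a request. */
final class CookieMatchSketch {

    /** A stored cookie; only the attributes needed for matching are modelled. */
    static final class StoredCookie {
        final String name, value, domain, path;

        StoredCookie(String name, String value, String domain, String path) {
            this.name = name;
            this.value = value;
            this.domain = domain;
            this.path = path;
        }
    }

    /** Tail matching: the host equals the cookie domain or ends with "." + domain. */
    static boolean domainMatches(String cookieDomain, String host) {
        String d = cookieDomain.toLowerCase(Locale.ROOT);
        String h = host.toLowerCase(Locale.ROOT);
        if (d.startsWith(".")) {
            d = d.substring(1);          // treat ".test.com" like "test.com"
        }
        return h.equals(d) || h.endsWith("." + d);
    }

    /** Path matching: the cookie path must be a prefix of the request path. */
    static boolean pathMatches(String cookiePath, String requestPath) {
        return requestPath.startsWith(cookiePath);
    }

    /** Returns the cookies that would accompany a request to host + requestPath. */
    static List<StoredCookie> select(List<StoredCookie> jar, String host, String requestPath) {
        List<StoredCookie> selected = new ArrayList<>();
        for (StoredCookie c : jar) {
            // The domain is checked first, then the path.
            if (domainMatches(c.domain, host) && pathMatches(c.path, requestPath)) {
                selected.add(c);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<StoredCookie> jar = List.of(
            new StoredCookie("sessionID", "abc123", "test.com", "/"),
            new StoredCookie("cart", "42", "shop.test.com", "/cart"));
        for (StoredCookie c : select(jar, "my.test.com", "/index.html")) {
            System.out.println(c.name + "=" + c.value);   // prints sessionID=abc123
        }
    }
}
```

With this jar, a request to `my.test.com` carries only the `sessionID` cookie, because the `cart` cookie belongs to a different host and path.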
A server can set the cookie's value to uniquely identify a client. Hence, cookies are commonly used for session and user management. Cookies can also be used to hold personalized information, to support on-line sales and services (e.g., a shopping cart), to track popular links, or for demographic purposes.
The main limitation of using cookies for tracking is that users may disable cookies in their browser.
There are two versions of the cookie specification. Version 0 is specified by Netscape, while version 1 is specified in RFC 2109 and RFC 2965, both titled "HTTP State Management Mechanism".

"Set-Cookie" Response Header

A cookie is typically created when a client visits a site for the first time. A server-side program sends a response message containing a "Set-Cookie" response header. The header contains the name/value pair of the cookie plus some additional information.
Cookie Version 0 "Set-Cookie" Header (Netscape)

    Set-Cookie: cookie-name=cookie-value; expires=date; path=path-name; domain=domain-name; secure

- cookie-name=cookie-value: Required. The name and value of the cookie to be set.
- expires=date: Specifies the expiry date of the cookie in the form "Day, DD-Mon-YYYY HH:MM:SS GMT". If not specified, the cookie expires when the current user's session ends (i.e., a non-persistent cookie).
- domain=domain-name: Specifies the domain for which this cookie is valid. "Tail matching" is performed, e.g., "test.com" matches the hostnames "my.test.com" and "a.b.test.com". The default value is the hostname of the server that generates the cookie.
- path=path-name: Specifies the subset of URLs in the domain for which the cookie is valid. The cookie must first pass the domain matching before the path matching is performed. If the path is not specified, it defaults to the path of the current page.
- secure: If a cookie is marked secure, it is only transmitted over a secure link, e.g., SSL.

Multiple Set-Cookie headers can be used in the same message. You can delete a cookie by setting "expires" to a value in the past, with the same domain and path. For example:
    Set-Cookie: cookie1=111; Expires=Sat, 21-Feb-2004 05:04:54 GMT

Cookie Version 1 "Set-Cookie" Header (RFC2109/RFC2965)

    Set-Cookie: cookie-name=cookie-value; Comment=text; Domain=domain-name; Path=path-name; Max-Age=seconds; Version=1; Secure

- Max-Age: The maximum age of the cookie in seconds. A positive value indicates that the cookie will expire after that many seconds have passed. Note that the value is the maximum age at which the cookie expires, not the cookie's current age. A negative value (as accepted by APIs such as the Java Servlet API's Cookie.setMaxAge()) means that the cookie is not stored persistently and will be deleted when the web browser exits. A zero value causes the cookie to be deleted. Max-Age is used in version 1 in place of Expires.
- Version: Sets the version of the cookie protocol that this cookie complies with. A value of 0 complies with the original Netscape cookie specification. A value of 1 complies with RFC 2965/RFC 2109.

RFC 2965 defines a new header, "Set-Cookie2", which might not be widely supported by browsers, as follows:
    Set-Cookie2: cookie-name=cookie-value; Comment=text; CommentURL=url; Domain=domain-name; Path=path-name; Max-Age=seconds; Port=port-list; Version=1; Secure

"Cookie" Request Header

The client returns the cookie(s) to the matching domain and path in subsequent requests, using a "Cookie" request header:
Cookie: cookie-name-1=cookie-value-1; cookie-name-2=cookie-value-2; ...
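The two header formats above can be exercised with a small Java sketch: one method assembles the value of a Version 0 "Set-Cookie" header, the other splits the value of a "Cookie" request header back into name/value pairs. The class and method names (`CookieHeaderSketch`, `buildSetCookie`, `parseCookieHeader`) are hypothetical, and the code assumes that names and values contain no semicolons and need no quoting, which a real implementation would have to handle.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch of writing a Version 0 Set-Cookie value and reading a Cookie value. */
final class CookieHeaderSketch {

    // Netscape-style date, e.g. "Sat, 21-Feb-2004 05:04:54 GMT"
    private static final DateTimeFormatter EXPIRES_FORMAT =
        DateTimeFormatter.ofPattern("EEE, dd-MMM-yyyy HH:mm:ss 'GMT'", Locale.US);

    /** Builds a Set-Cookie value; pass null for expires, path or domain to omit it. */
    static String buildSetCookie(String name, String value, ZonedDateTime expires,
                                 String path, String domain, boolean secure) {
        StringBuilder sb = new StringBuilder(name).append('=').append(value);
        if (expires != null) {
            // Convert to UTC first so the literal "GMT" suffix is truthful.
            sb.append("; expires=")
              .append(EXPIRES_FORMAT.format(expires.withZoneSameInstant(ZoneOffset.UTC)));
        }
        if (path != null) {
            sb.append("; path=").append(path);
        }
        if (domain != null) {
            sb.append("; domain=").append(domain);
        }
        if (secure) {
            sb.append("; secure");
        }
        return sb.toString();
    }

    /** Parses a Cookie request header value into name/value pairs, in header order. */
    static Map<String, String> parseCookieHeader(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        if (headerValue == null) {
            return cookies;
        }
        for (String pair : headerValue.split(";")) {
            String trimmed = pair.trim();
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                continue;                           // skip empty or malformed pairs
            }
            String name = trimmed.substring(0, eq).trim();
            String value = trimmed.substring(eq + 1).trim();
            cookies.putIfAbsent(name, value);       // keep the first occurrence of a name
        }
        return cookies;
    }

    public static void main(String[] args) {
        ZonedDateTime when = ZonedDateTime.of(2004, 2, 21, 5, 4, 54, 0, ZoneOffset.UTC);
        // prints Set-Cookie: cookie1=111; expires=Sat, 21-Feb-2004 05:04:54 GMT; path=/
        System.out.println("Set-Cookie: " + buildSetCookie("cookie1", "111", when, "/", null, false));
        // prints {cookie-name-1=cookie-value-1, cookie-name-2=cookie-value-2}
        System.out.println(parseCookieHeader("cookie-name-1=cookie-value-1; cookie-name-2=cookie-value-2"));
    }
}
```

Calling `buildSetCookie` with an `expires` date in the past, and the same path and domain as the original cookie, produces the header that deletes the cookie.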
Security and Privacy Issues

- A cookie is not a program: A cookie is just a simple piece of text. It is not a program (like JavaScript or Java applets), a macro, or a plug-in. It cannot be used as a virus. It cannot access your hard disk, read or write your files, or format your hard disk.
- A cookie cannot fill up your hard disk: The number of cookies a particular host can send to you is 20, each cookie is limited to 4 KB, and the total number of cookies (from all hosts) is 300. Hence, cookies cannot fill up your hard disk.
- A cookie can only be sent back to its originating host and not to a third-party host.
- By default, a cookie is sent and stored in clear text and is not encrypted. It is therefore susceptible to eavesdropping and malicious tampering in transit. A cookie SHOULD NOT hold confidential information such as a password or credit card number.
- Cookies could pose some privacy concerns: A cookie cannot snoop on your hard disk and find out who you are or what your income is. The only way such information could end up in a cookie is if you provide it (e.g., by filling in a form) to a site that creates the cookie. Furthermore, it is important to note that you only get a cookie from a host that your browser has contacted. Many users wonder why they have a cookie from "ad.doubleclick.net" although they have never explicitly visited that host. This is because many sites on the Internet do not keep their advertisements locally; instead they subscribe to a media service that places the ads for them through a hyperlink. That hyperlink returns an ad image together with a cookie, and the ad server also checks any cookie it saved previously, for demographic purposes.

HTTP Session Management

The following techniques can be used for HTTP session management:
- Cookies.
- URL rewriting.
- Hidden fields in the HTML form.

Cookie

Most often, a cookie is used to store a session ID. Session management is then done on the server side, instead of the client side.
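For example, the following Java servlet fragment uses a cookie to manage the session ID:

    import java.util.Hashtable;
    import javax.servlet.http.Cookie;   // jakarta.servlet.http.Cookie on newer containers

    // Server-side table: session ID -> data of that session
    Hashtable<String, Hashtable<String, Object>> allSessions = new Hashtable<>();
    ...
    String sessionID = getUniqueSessionID();   // application-supplied helper
    Hashtable<String, Object> sessionData = new Hashtable<>();
    allSessions.put(sessionID, sessionData);
    // Only the ID travels to the browser; the data stays on the server
    Cookie sessionCookie = new Cookie("sessionID", sessionID);
    sessionCookie.setPath("/");
    response.addCookie(sessionCookie);

The problem with using cookies is that some users disable them because of the real and perceived privacy concerns over cookies.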
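The servlet fragment above leaves the session table and the `getUniqueSessionID()` helper unspecified. As a rough, servlet-free sketch of what they might look like, the class below keeps sessions in a map and derives IDs from `SecureRandom`. The names `SessionStoreSketch`, `newSessionId`, `createSession` and `find` are assumptions made for illustration, and the 128-bit hex ID is just one reasonable choice.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of a server-side session table keyed by an unguessable session ID. */
final class SessionStoreSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    // session ID -> attributes of that session
    private final Map<String, Map<String, Object>> allSessions = new ConcurrentHashMap<>();

    /** Hypothetical stand-in for getUniqueSessionID(): 128 random bits as 32 hex digits. */
    static String newSessionId() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(32);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Creates an empty session and returns the ID to place in the cookie. */
    String createSession() {
        String id = newSessionId();
        allSessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    /** Looks up the session for an ID received from the client; null if unknown. */
    Map<String, Object> find(String sessionId) {
        return sessionId == null ? null : allSessions.get(sessionId);
    }

    public static void main(String[] args) {
        SessionStoreSketch store = new SessionStoreSketch();
        String id = store.createSession();          // first visit: value for the cookie
        store.find(id).put("cartItems", 3);         // later request: same ID, same data
        System.out.println(store.find(id).get("cartItems"));   // prints 3
        System.out.println(store.find("unknown-id"));          // prints null
    }
}
```

An unknown or missing ID simply yields no session, at which point the server would normally start a new one. A production store would also expire idle sessions, which this sketch omits.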
Hidden Field in the HTML Form

The principle is to include an HTML form with a hidden field containing a session ID in every HTML page sent to the client. This hidden field is returned to the server in the next request, so no cookie is needed. The disadvantage of this approach is that it requires careful and tedious programming effort, as all the pages have to be dynamically generated to include this hidden field. The advantage is that all browsers support HTML forms.
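As an illustration of the hidden-field approach, the following Java sketch generates such a form on the server. The class name `HiddenFieldSketch`, the field name `sessionID`, the `post` method and the "Continue" button are assumptions chosen for the example; only the idea of a hidden input carrying the session ID comes from the text above.

```java
/** Sketch of generating an HTML form that carries the session ID in a hidden field. */
final class HiddenFieldSketch {

    /** Escapes the characters that are significant inside an HTML attribute value. */
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Renders a form whose hidden field returns the session ID on submission. */
    static String renderForm(String action, String sessionId) {
        return "<form method=\"post\" action=\"" + escapeHtml(action) + "\">\n"
             + "  <input type=\"hidden\" name=\"sessionID\" value=\"" + escapeHtml(sessionId) + "\">\n"
             + "  <input type=\"submit\" value=\"Continue\">\n"
             + "</form>";
    }

    public static void main(String[] args) {
        System.out.println(renderForm("/checkout", "abc123"));
    }
}
```

Every dynamically generated page would call something like `renderForm` so that the ID survives from one request to the next. Escaping the value guards against a malformed ID breaking the markup.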
...

URL rewriting

The principle is to include a unique session ID, which identifies the session, in all the URLs issued by the client. For example,

    http://host:port/shopping.html;sessionid=value

To accomplish this, you must rewrite all the URLs in all the HTML files that are sent to the client so that they carry this unique session ID (such as ... etc.). Again, careful and tedious programming effort is required. The advantage is that all browsers support URL rewriting: it works even if the browser does not support cookies or the user disables cookies.

Examples

Java Servlet: Read ....
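The rewriting step itself is small. The sketch below appends the `;sessionid=value` parameter shown in the example URL to an arbitrary link, placing it before any query string or fragment. `UrlRewriteSketch` and `rewrite` are hypothetical names, and the code assumes the session ID contains only URL-safe characters.

```java
/** Sketch of URL rewriting: embed the session ID as a path parameter in a link. */
final class UrlRewriteSketch {

    /** Inserts ";sessionid=<id>" after the path, before any "?query" or "#fragment". */
    static String rewrite(String url, String sessionId) {
        int cut = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) {
            cut = Math.min(cut, query);
        }
        if (fragment >= 0) {
            cut = Math.min(cut, fragment);
        }
        // sessionId is assumed to be URL-safe (for example, hex digits only).
        return url.substring(0, cut) + ";sessionid=" + sessionId + url.substring(cut);
    }

    public static void main(String[] args) {
        // prints http://host:port/shopping.html;sessionid=value
        System.out.println(rewrite("http://host:port/shopping.html", "value"));
        // prints /cart;sessionid=abc?item=1
        System.out.println(rewrite("/cart?item=1", "abc"));
    }
}
```

Every link and form target written into a page would pass through such a function. In the Java Servlet API, `HttpServletResponse.encodeURL()` plays this role.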
REFERENCES & RESOURCES

- RFC 2964: "Use of HTTP State Management", 2000.
- RFC 2109: "HTTP State Management Mechanism", 1997 (obsoleted by RFC 2965).
- RFC 2965: "HTTP State Management Mechanism", 2000.
- "Persistent Client State: HTTP Cookie (Cookie Version 0 Specification)", Netscape.
- RFC 1945: "Hypertext Transfer Protocol HTTP/1.0", 1996.
- RFC 2616: "Hypertext Transfer Protocol HTTP/1.1", 1999.