Wednesday, November 15, 2006

Refreshing a page in the browser: HTTP headers

While working on a task, I became interested in how the browser cache works, and in particular which HTTP headers the browser sends (or omits) when the user refreshes a page.

So, here is my experiment. I have three browsers: Firefox 2.0, IE 6.0, and Opera 9.0. The Opera version may be old, because I don't use it. I don't know a better browser than Firefox, sorry, Opera fans.

How can we see the HTTP headers sent by the browser? There is more than one way. We can use a sniffer such as Ethereal to watch the HTTP traffic, a browser add-on or extension (for example Live HTTP Headers for Firefox or HttpWatch for IE), or even a local proxy server. But it is also possible to capture this data on the server side. Let's take Tomcat 5.0 and write a little code: a filter that listens to all incoming requests to our web application and prints their HTTP headers. So it is a kind of sniffer on the server side :)

The code listing of the filter:

package com.mycoolcompany.filters;

import java.io.IOException;
import java.util.Enumeration;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

/** Prints the request line and all headers of every incoming request to the server console. */
public class RequestDumpFilter implements Filter
{
    public RequestDumpFilter() {}

    public void destroy() {}

    public void doFilter(
            ServletRequest request,
            ServletResponse response,
            FilterChain chain) throws IOException, ServletException {

        HttpServletRequest req = (HttpServletRequest) request;

        // Request line; getProtocol() reports the version the client actually used
        // instead of a hard-coded "HTTP/1.1".
        System.out.println(req.getMethod() + " " + req.getRequestURI() + " " + req.getProtocol());

        // One "name : value" line per request header.
        Enumeration enumer = req.getHeaderNames();
        while (enumer.hasMoreElements()) {
            String headerName = (String) enumer.nextElement();
            System.out.println(headerName + " : " + req.getHeader(headerName));
        }

        // Hand the request on to the rest of the chain unchanged.
        chain.doFilter(request, response);
    }

    public void init(FilterConfig filterConfig) throws ServletException {}
}

The filter is then registered in web.xml. The url-pattern '/*' in its filter mapping is the key: it means the filter listens to all requests to our web app.
Compile, deploy and enjoy :)
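
If no servlet container is at hand, the same kind of server-side dump can be sketched with the HTTP server built into the JDK. This is only an illustration, not the setup used in this post: the class name HeaderDumpServer and its methods are hypothetical, it assumes Java 8 or later, and the root context "/" plays the role of the '/*' mapping.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-alone counterpart of RequestDumpFilter; not part of the original setup.
class HeaderDumpServer {

    // Renders the request line and headers in the filter's "name : value" style.
    static String dump(String method, String uri, String protocol,
                       Map<String, List<String>> headers) {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(uri).append(' ').append(protocol).append('\n');
        for (Map.Entry<String, List<String>> header : headers.entrySet()) {
            for (String value : header.getValue()) {
                sb.append(header.getKey().toLowerCase(Locale.ROOT))
                  .append(" : ").append(value).append('\n');
            }
        }
        return sb.toString();
    }

    // Starts a server on the given port that prints every request and answers "ok".
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            System.out.print(dump(exchange.getRequestMethod(),
                    exchange.getRequestURI().toString(),
                    exchange.getProtocol(),
                    exchange.getRequestHeaders()));
            byte[] body = "ok".getBytes("UTF-8");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080); // port 8080 mirrors the Tomcat URL used in this post
    }
}
```

Running the class and pointing a browser at http://localhost:8080/ should print dumps in the same shape as the ones in this post, although header order and capitalisation depend on the server.
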
So, let's start digging.

Point IE to http://localhost:8080/RequestDumper/
You'll get something like this in the server console:

GET /RequestDumper/ HTTP/1.1
accept : image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
accept-language : ru
accept-encoding : gzip, deflate
user-agent : Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)
host : localhost:8080
connection : Keep-Alive

When you refresh the page repeatedly, you may notice that something has changed:

GET /RequestDumper/ HTTP/1.1
accept : */*
accept-language : ru
accept-encoding : gzip, deflate
user-agent : Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)
host : localhost:8080
connection : Keep-Alive

The Accept header became a little shorter ;)

If you press Ctrl-F5, the browser sends the following:

GET /RequestDumper/ HTTP/1.1
accept : */*
accept-language : ru
accept-encoding : gzip, deflate
user-agent : Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)
host : localhost:8080
connection : Keep-Alive
cache-control : no-cache

'cache-control : no-cache' - that's the difference!

Now let's take Firefox. The first request looks like this:

GET /RequestDumper/ HTTP/1.1
host : localhost:8080
user-agent : Mozilla/5.0 (Windows; U; Windows NT 5.2; ru; rv:1.8.1) Gecko/20061010 Firefox/2.0
accept : text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
accept-language : ru-ru,ru;q=0.8,en-us;q=0.5,en;q=0.3
accept-encoding : gzip,deflate
accept-charset : windows-1251,utf-8;q=0.7,*;q=0.7
keep-alive : 300
connection : keep-alive

Refreshing the page repeatedly changes the headers:

GET /RequestDumper/ HTTP/1.1
host : localhost:8080
user-agent : Mozilla/5.0 (Windows; U; Windows NT 5.2; ru; rv:1.8.1) Gecko/20061010 Firefox/2.0
accept : text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
accept-language : ru-ru,ru;q=0.8,en-us;q=0.5,en;q=0.3
accept-encoding : gzip,deflate
accept-charset : windows-1251,utf-8;q=0.7,*;q=0.7
keep-alive : 300
connection : keep-alive
cache-control : max-age=0

'cache-control : max-age=0' is the difference.
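
In HTTP/1.1, a request directive max-age=0 tells caches that the client will not accept a response older than zero seconds, which in practice forces revalidation with the server. To compare such values across browsers it can help to split the header into its directives. The following is a minimal sketch with a hypothetical class name, CacheControlParser; it assumes simple comma-separated directives and does not handle quoted arguments that contain commas.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of the filter shown in this post.
class CacheControlParser {

    // Splits a value such as "max-age=0" or "no-cache, no-store" into
    // lower-cased directive names mapped to their argument ("" when there is none).
    static Map<String, String> parse(String headerValue) {
        Map<String, String> directives = new LinkedHashMap<String, String>();
        if (headerValue == null) {
            return directives; // header was not sent at all
        }
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int eq = token.indexOf('=');
            String name = (eq < 0 ? token : token.substring(0, eq))
                    .trim().toLowerCase(Locale.ROOT);
            String arg = eq < 0 ? "" : token.substring(eq + 1).trim();
            directives.put(name, arg);
        }
        return directives;
    }
}
```

Inside the filter this could be called as CacheControlParser.parse(req.getHeader("cache-control")), which yields an empty map for the first Firefox request and a single max-age entry for the refresh.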

Pressing Ctrl-F5 gives:

GET /RequestDumper/ HTTP/1.1
host : localhost:8080
user-agent : Mozilla/5.0 (Windows; U; Windows NT 5.2; ru; rv:1.8.1) Gecko/20061010 Firefox/2.0
accept : text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
accept-language : ru-ru,ru;q=0.8,en-us;q=0.5,en;q=0.3
accept-encoding : gzip,deflate
accept-charset : windows-1251,utf-8;q=0.7,*;q=0.7
keep-alive : 300
connection : keep-alive
pragma : no-cache
cache-control : no-cache


There are two changes: 'pragma : no-cache' and 'cache-control : no-cache'.
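
To replay such a forced reload against the filter without a browser, the two headers can be set by hand. The sketch below uses the JDK's HttpURLConnection; the class name ForcedReloadClient is hypothetical, and the client adds its own User-Agent and Accept headers, so the resulting dump will not match the Firefox one line for line.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical test client for illustration; not part of the original experiment.
class ForcedReloadClient {

    // Prepares (but does not yet send) a GET carrying the two headers
    // that Firefox added on Ctrl-F5.
    static HttpURLConnection prepare(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Pragma", "no-cache");
        conn.setRequestProperty("Cache-Control", "no-cache");
        return conn;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = prepare("http://localhost:8080/RequestDumper/");
        // Asking for the status code is what actually sends the request.
        System.out.println("Status: " + conn.getResponseCode());
        conn.disconnect();
    }
}
```
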
And finally, Opera.
The first request:

GET /RequestDumper/ HTTP/1.1
user-agent : Opera/9.00 (Windows NT 5.2; U; en)
host : localhost:8080
accept : text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
accept-language : en
accept-encoding : deflate, gzip, x-gzip, identity, *;q=0
connection : Keep-Alive, TE
te : deflate, gzip, chunked, identity, trailers

Refreshing the page repeatedly changes the headers:

GET /RequestDumper/ HTTP/1.1
user-agent : Opera/9.00 (Windows NT 5.2; U; en)
host : localhost:8080
accept : text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
accept-language : en
accept-encoding : deflate, gzip, x-gzip, identity, *;q=0
cache-control : no-cache
connection : Keep-Alive, TE
te : deflate, gzip, chunked, identity, trailers

'cache-control : no-cache' is new.

Pressing Ctrl-F5 gives the same headers as a plain refresh:

GET /RequestDumper/ HTTP/1.1
user-agent : Opera/9.00 (Windows NT 5.2; U; en)
host : localhost:8080
accept : text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
accept-language : en
accept-encoding : deflate, gzip, x-gzip, identity, *;q=0
cache-control : no-cache
connection : Keep-Alive, TE
te : deflate, gzip, chunked, identity, trailers
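
Taken together, the dumps suggest that a server can only roughly guess what the user did from the request headers. Below is a minimal sketch of such a guess, based solely on the observations in this post. The class ReloadHint and its three categories are illustrative assumptions, and it matches directives by plain substring search.

```java
import java.util.Locale;

// Hypothetical helper; the names and the three categories are illustrative only.
class ReloadHint {

    enum Kind { NONE, REVALIDATE, BYPASS_CACHE }

    // Takes the raw Cache-Control and Pragma request header values (null when absent).
    static Kind classify(String cacheControl, String pragma) {
        String cc = cacheControl == null ? "" : cacheControl.toLowerCase(Locale.ROOT);
        String pg = pragma == null ? "" : pragma.toLowerCase(Locale.ROOT);
        if (cc.contains("no-cache") || pg.contains("no-cache")) {
            return Kind.BYPASS_CACHE; // IE and Firefox on Ctrl-F5, Opera on any refresh
        }
        if (cc.contains("max-age=0")) {
            return Kind.REVALIDATE;   // Firefox on a plain refresh
        }
        return Kind.NONE;             // first visit, and IE on a plain refresh
    }
}
```

In the filter it could be called as ReloadHint.classify(req.getHeader("cache-control"), req.getHeader("pragma")). Note that NONE does not prove the page was not refreshed: IE 6.0 sent no cache header at all on a plain refresh.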

For now, I need to meditate a little on the HTTP RFC ;)

2 comments:

JArik said...

Why not in Russian? :)

Stas Ostapenko said...

Well, I have to read a lot in English, but not write in it. So I'm practicing :)