Wednesday, July 14, 2004

Googlemania

My interest in Google continues.

Zeitgeist


Google has a wonderful site called Zeitgeist that collects interesting patterns in the queries it gets from around the world. Since Google is no longer something only the "geeks" use, the diversity of the data is more credible. Depending on what you like to believe, it throws up some surprising and some not-so-surprising data!

Take, for example, the fact that 87% of the operating systems hitting Google are Windows, and that MSIE still rules in spite of my ardent wish that we all used Mozilla Firefox, which really is a brilliant browser.

Zeitgeist can probably do better, though. Perhaps the top 10 hits for queries on technology? Does anyone have any better ideas?

Labs@Google


Google has even more interesting stuff. To get a sneak preview of technologies that might appear in the future, try labs.google.com. For example, the keyboard shortcuts they have in Gmail were first previewed at the labs. While you are at it, give Google Sets a try. If you were wondering how they do all this, then try this.

HTTP Headers


If you stop to think about it for a second, it's an amazing location to do an automatic survey. And it all happens because of something called HTTP headers.
As the initiated amongst you have already guessed, the rest of this blog is going to be me bragging about how much I know about HTTP headers, so you can give it a clean miss!

When you instruct your browser to go to a particular web resource, it asks the server at that location for the resource. As a part of the communication [which is not visible to the user], every browser sends information about itself, the operating system it is running on and which site it came from. It also sends back a cookie, a small piece of information that the site has stored on your computer. This is a substantial amount of information per request, and with 200 million hits a day it adds up to a wealth of information about usage. All of it is sent through HTTP headers.
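To make that exchange a little more concrete, here is a minimal Python sketch of the text a browser might put on the wire. The helper name build_request and the two sample headers are my own choices for illustration; a real browser sends more headers and does all of this internally.

```python
def build_request(host, path, headers):
    """Compose the text of an HTTP/1.1 GET request (illustrative helper)."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    # Every header is a single "Name: value" line.
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line tells the server that the headers are finished.
    return "\r\n".join(lines) + "\r\n\r\n"


request_text = build_request(
    "www.google.com",
    "/search?q=geek",
    {
        "User-Agent": (
            "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) "
            "Gecko/20040206 Firefox/0.8"
        ),
        "Accept-Language": "en-us,en;q=0.5",
    },
)
print(request_text)
```

Nothing here touches the network; it only shows that the "hidden" part of the conversation is plain text that the server can read line by line.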

Here's an example of the HTTP headers sent by the browser [Mozilla Firefox] when told to search Google for "geek". Similar headers will be sent by other browsers like Internet Explorer.

GET /search?q=geek&sourceid=mozilla-search&start=0&start=0&ie=utf-8&oe=utf-8 HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) Gecko/20040206 Firefox/0.8
Accept: text/xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: en_IN; PREF=ID=52eef62d74670ae2:FF=4:LD=en:NR=10:CR=1:TM=1048623851:LM=1069183668:S=c2RqGKqGOQXoZV8g


Note the User-Agent and Cookie headers. The User-Agent header specifies the client (browser) you used to access Google and the operating system you are running. The Cookie header carries a piece of information that Google has put on your computer to identify you the next time you access Google. [This example does not have a Referer header.]
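Here is a rough Python sketch of how a server could turn those headers into survey numbers. I have no idea how Google actually does it; the function names parse_headers and classify, the crude substring checks, and the two extra User-Agent strings are all my own invention for illustration.

```python
from collections import Counter

# A shortened copy of the request shown above.
RAW_REQUEST = (
    "GET /search?q=geek HTTP/1.1\r\n"
    "Host: www.google.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) "
    "Gecko/20040206 Firefox/0.8\r\n"
    "Accept-Language: en-us,en;q=0.5\r\n"
    "\r\n"
)


def parse_headers(raw):
    """Return the request headers as a dict keyed by lower-cased name."""
    head = raw.split("\r\n\r\n", 1)[0]
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the "GET ..." request line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def classify(user_agent):
    """Make a very rough (os, browser) guess from a User-Agent string."""
    if "Windows" in user_agent:
        os_name = "Windows"
    elif "Mac" in user_agent:
        os_name = "Mac"
    elif "Linux" in user_agent:
        os_name = "Linux"
    else:
        os_name = "Other"
    if "Firefox" in user_agent:
        browser = "Firefox"
    elif "MSIE" in user_agent:
        browser = "MSIE"
    else:
        browser = "Other"
    return os_name, browser


headers = parse_headers(RAW_REQUEST)

# One real User-Agent from the request above plus two made-up sample hits.
sample_agents = [
    headers["user-agent"],
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040207 Firefox/0.8",
]

os_counts = Counter(classify(ua)[0] for ua in sample_agents)
browser_counts = Counter(classify(ua)[1] for ua in sample_agents)
print(os_counts)
print(browser_counts)
```

Run the same tally over a day's worth of hits instead of three and you get the kind of operating system and browser percentages that Zeitgeist reports.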

By the way, in case you are interested, there are some interesting sites that discuss privacy issues regarding Google. Whether you would like to believe them or not is up to you :-)
