Saturday, August 26, 2006

Part 2: Jabbering with Google Talk over XMPP

This is Part 2 of the posts that describe my attempts to do something interesting (eventually) with Google Talk using XMPP. Part 1 is here.

So finally I had my test Google Talk account successfully send a test message to my regular Google Talk account. The XML for that is

<message from="XXXXXXXX@gmail.com/D922F673" to="YYYY@gmail.com" type='chat' xml:lang="en"><body>TESTING!!!</body></message>

Note that the from value is the full JID (including the resource part after the slash) that the Gmail server returned when I bound a resource.
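To make the shape of that stanza concrete, here is a minimal sketch of building it in Java. The class and method names (ChatMessage, build, escape) are my own placeholders for illustration, not part of any library or of the framework described in this post. The one thing it adds over plain string concatenation is escaping, so that a body containing < or & does not break the stream.

```java
// Hypothetical helper (names are illustrative only): builds a chat <message/> stanza.
final class ChatMessage {

    // Escape the five XML special characters so arbitrary text is safe
    // inside both attribute values and element content.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // fromJid is the full JID handed back by the server at resource binding;
    // toJid is the recipient's bare JID.
    static String build(String fromJid, String toJid, String body) {
        return "<message from=\"" + escape(fromJid) + "\" to=\"" + escape(toJid)
                + "\" type=\"chat\" xml:lang=\"en\"><body>" + escape(body)
                + "</body></message>";
    }
}
```

Calling ChatMessage.build("XXXXXXXX@gmail.com/D922F673", "YYYY@gmail.com", "TESTING!!!") should give the same stanza as above, apart from the quote style.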


I am still quite far from where I want this to go. However, I am glad I am making progress. Last time (in Part 1) my quickly-put-together test code had reached the point where it could no longer be used, because I needed to save conversation state such as the JID and it had no way of doing that.

Now, with a design and a framework in place I hope I can move on to newer things with Google Talk. Here's a brief class design diagram. I am not very formal with UML diagrams - I pick and choose what I like from UML so if it is not classic UML I apologize to those who may get irked by it. Let me know what you think of the design. I haven't seen what the Jive XMPP library design looks like yet.


XML Parsing Challenges
An interesting implementation challenge was XML pull parsing. I tried (briefly) the StAX parser in J2SE but found it a little inconvenient with its event-ID-based mechanism. I might be quite wrong though, because I am sure I didn't spend as much time exploring its fit into my solution as I should have. I found myself thinking in terms of iterating over available pieces of XML as required.

The XMPP response stream is like an XML document where each response is another child of the document element. And you don't get the next child until you have sent a request. (I don't know how one receives messages yet so there might be some twist to this story later).

So, what I wanted was an XMLIterator which allowed me to get the details of the current element and then waited until I asked it to move to the next one, and this open source project does pretty much that. Thank you very much, Mark.

What I also wanted was an API that returned DOM nodes (preferably) as it saw them in the stream. Of course, the document element node will remain incomplete until the end is reached, but that's just a technicality because all that is required to "complete" it is the end-tag, which carries no real information. I was looking for something like this -

XmlIterator iter = new XmlIterator(source);
iter.advance();
// we are now on the document element node
XmlIterator children = iter.children();
while (children.advance()) {
    Node node = children.next();
}

I have a simple XML Node structure built over XmlIterator which works for me - maybe that's a project for later.
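For comparison, here is a rough sketch of how the same idea (one DOM node per child of the document element, handed out only when asked for) might be put together on top of the standard StAX and DOM APIs. This is not the XmlIterator project mentioned above and not my actual code; StanzaIterator and its methods are made-up names. One caveat I am assuming rather than have verified: on a live socket the underlying parser may block or read ahead while waiting for more input.

```java
import java.io.Reader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: returns each child of the document element as a DOM Element.
final class StanzaIterator {
    private final XMLStreamReader reader;
    private final Document doc;     // only used as a factory for DOM nodes
    private final String rootName;  // local name of the document element

    StanzaIterator(Reader source) {
        this.reader = openReader(source);
        this.doc = newDocument();
        this.rootName = advanceToRoot(reader);
    }

    String rootName() { return rootName; }

    // Next child of the document element, or null once its end tag is reached.
    Element nextChild() {
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) return readElement();
                // Children consume their own end tags, so this one closes the root.
                if (event == XMLStreamConstants.END_ELEMENT) return null;
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reader is on a START_ELEMENT: copy the whole subtree into a DOM Element.
    // Namespace declarations are not copied as xmlns attributes; the element's
    // namespace URI is still set.
    private Element readElement() throws XMLStreamException {
        Element el = doc.createElementNS(emptyToNull(reader.getNamespaceURI()),
                qualified(reader.getPrefix(), reader.getLocalName()));
        for (int i = 0; i < reader.getAttributeCount(); i++) {
            el.setAttributeNS(emptyToNull(reader.getAttributeNamespace(i)),
                    qualified(reader.getAttributePrefix(i), reader.getAttributeLocalName(i)),
                    reader.getAttributeValue(i));
        }
        while (true) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                el.appendChild(readElement());
            } else if (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.CDATA) {
                el.appendChild(doc.createTextNode(reader.getText()));
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                return el;
            } // comments and processing instructions are ignored
        }
    }

    private static String qualified(String prefix, String local) {
        return (prefix == null || prefix.isEmpty()) ? local : prefix + ":" + local;
    }

    private static String emptyToNull(String s) {
        return (s == null || s.isEmpty()) ? null : s;
    }

    private static XMLStreamReader openReader(Reader source) {
        try {
            return XMLInputFactory.newInstance().createXMLStreamReader(source);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    private static Document newDocument() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Skip the prolog and stop on the document element's start tag.
    private static String advanceToRoot(XMLStreamReader r) {
        try {
            while (r.next() != XMLStreamConstants.START_ELEMENT) { /* skip */ }
            return r.getLocalName();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The document element itself is never returned as a node here, only its name, which sidesteps the "incomplete until the end-tag" technicality.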



Next post I aim to be able to maintain a conversation with a Google Talk client and hope to have enhanced the framework to be able to do more things than just chat.

Monday, August 07, 2006

Jabbering with Google Talk over XMPP

I am writing an XMPP client that can talk to Google and maybe implement some cool things on top of it. So this weekend I began exploring the specs, and I intend to maintain an account of how it is going. XMPP is widely discussed and implemented and has many, many clients, so it is no research topic. Why did I choose to do this? Just for fun, to try out something new (for me). I know I can get the XMPP library from Jive Software but that's no fun. Sometimes re-inventing the wheel has its own pleasures :-).

The XMPP specs are an open standard on which Jabber and Google Talk are based. There are a number of extensions which are under consideration and most of the cool things have already been thought of such as RPC over XMPP.

So far, as a prototype, I have managed to connect to talk.google.com, perform starttls and authenticate myself, followed by resource binding and initiating a session. Now my prototype code is hitting its limitations and I will have to spend some time fixing it to be able to do some serious talking with XMPP. I have also managed to get my account successfully blocked for authenticating incorrectly - but got out of that mess. :-)

I was hoping to sniff out the conversation that the Google Talk client has with the server using Ethereal, as per this blog that describes X-GOOGLE-TOKEN, a Google mechanism for single sign-on. According to that blog, the Google Talk client does not do starttls but performs XMPP authentication over an unsecured socket using a Google-generated token (the actual authentication with the Google token server is over https), so all the communication can be sniffed. However, my client seems to be doing a starttls and I can't sniff any details out after the proceed response. Too bad - it appears that Google Talk has changed since the blog was written.

Here's the sequence of communication with the Google Talk server (formatted for readability). In each exchange below, the text I sent comes first, followed by the server's response, with my comments in between.



<stream:stream
to='gmail.com'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
<?xml version="1.0" encoding="UTF-8"?>
<stream:stream from="gmail.com" id="X0B367FC8A9597BA4" version="1.0" xmlns:stream="http://etherx.jabber.org/streams" xmlns="jabber:client">
<stream:features>
<starttls xmlns="urn:ietf:params:xml:ns:xmpp-tls"/>
<mechanisms xmlns="urn:ietf:params:xml:ns:xmpp-sasl">
<mechanism>X-GOOGLE-TOKEN</mechanism>
</mechanisms>
</stream:features>

<starttls xmlns="urn:ietf:params:xml:ns:xmpp-tls" /> <--- Start TLS - basically the rest of the communication is over SSL
<proceed xmlns="urn:ietf:params:xml:ns:xmpp-tls"/>

TLS Succeeded - we are good to go...
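For anyone wondering what the starttls step looks like on the client side, here is a minimal sketch in Java, assuming the plain socket is already connected and the server has just answered with proceed. XmppTransport and its members are names I made up for illustration; the only library call that matters is SSLSocketFactory.createSocket(Socket, String, int, boolean), which layers TLS over an existing socket. After the handshake the stream header has to be sent again, which is why the same stream:stream opening appears twice in this transcript.

```java
import java.io.IOException;
import java.net.Socket;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper (names are illustrative only).
final class XmppTransport {

    static final String STARTTLS = "<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>";

    // The opening stream header; sent once on the plain socket and again
    // after TLS (and once more after authentication).
    static String streamHeader(String domain) {
        return "<stream:stream to='" + domain + "' xmlns='jabber:client' "
                + "xmlns:stream='http://etherx.jabber.org/streams' version='1.0'>";
    }

    // Call only after the server has replied with <proceed/>.
    // Wraps the already-connected plain socket in TLS using the JVM's default
    // trust settings; autoClose=true closes the plain socket with the TLS one.
    static SSLSocket upgradeToTls(Socket plain, String host, int port) throws IOException {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        SSLSocket tls = (SSLSocket) factory.createSocket(plain, host, port, true);
        tls.setUseClientMode(true);
        tls.startHandshake(); // throws if the handshake or certificate check fails
        return tls;
    }
}
```

From here on, all reads and writes go through the streams of the returned SSLSocket rather than the original socket.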

<stream:stream
to='gmail.com'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
<?xml version="1.0" encoding="UTF-8"?>
<stream:stream from="gmail.com" id="X1A565C1E8E3FD7CA" version="1.0" xmlns:stream="http://etherx.jabber.org/streams" xmlns="jabber:client">
<stream:features>
<mechanisms xmlns="urn:ietf:params:xml:ns:xmpp-sasl">
<mechanism>PLAIN</mechanism>
<mechanism>X-GOOGLE-TOKEN</mechanism>
</mechanisms>
</stream:features>

Now we get the PLAIN auth mechanism, which is basically a base64-encoded string of the form \u0000username\u0000password (each \u0000 is a single NUL character). I have blacked out the value here.
<auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='PLAIN'>XXXXXXXXXXXXXXXXXX</auth>
<success xmlns="urn:ietf:params:xml:ns:xmpp-sasl"/> <--- authenticated
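The blacked-out value in the auth element can be produced with a few lines of Java. This is a sketch under a couple of assumptions: SaslPlain is a made-up class name, java.util.Base64 needs Java 8 or later, and the authorization identity (the part before the first NUL) is left empty, as in the description above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (names are illustrative only).
final class SaslPlain {

    // base64( NUL + username + NUL + password ), encoded as UTF-8.
    static String initialResponse(String username, String password) {
        String raw = "\0" + username + "\0" + password;
        return Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // The complete <auth/> element to write to the (TLS-protected) stream.
    static String authStanza(String username, String password) {
        return "<auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='PLAIN'>"
                + initialResponse(username, password) + "</auth>";
    }
}
```

For example, SaslPlain.initialResponse("user", "pass") yields AHVzZXIAcGFzcw==. Since base64 is trivially reversible, this is only acceptable because the exchange happens after starttls.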

<stream:stream
to='gmail.com'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
<?xml version="1.0" encoding="UTF-8"?>
<stream:stream from="gmail.com" id="X77D6827CD0B365BA" version="1.0" xmlns:stream="http://etherx.jabber.org/streams" xmlns="jabber:client">
<stream:features>
<bind xmlns="urn:ietf:params:xml:ns:xmpp-bind"/><session xmlns="urn:ietf:params:xml:ns:xmpp-session"/>
</stream:features>

<iq type='set' id='bind_1'>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'/>
</iq>
<iq id="bind_1" type="result">
<bind xmlns="urn:ietf:params:xml:ns:xmpp-bind">
<jid>XXXXXXXX@gmail.com/D922F673</jid>
</bind>
</iq>
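The JID in that bind result is what later goes into the from attribute of outgoing messages, so it is worth pulling out and keeping. Here is a small sketch using the standard DOM parser; BindResult and its methods are hypothetical names, and it assumes the iq element has already been isolated as a string.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper (names are illustrative only).
final class BindResult {
    static final String BIND_NS = "urn:ietf:params:xml:ns:xmpp-bind";

    // Returns the text of the first <jid/> in the bind namespace.
    static String extractJid(String iqXml) {
        Document doc;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(iqXml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        NodeList jids = doc.getElementsByTagNameNS(BIND_NS, "jid");
        if (jids.getLength() == 0) {
            throw new IllegalStateException("no <jid/> in bind result");
        }
        return jids.item(0).getTextContent().trim();
    }

    // "user@domain/resource" -> "user@domain"
    static String bareJid(String fullJid) {
        int slash = fullJid.indexOf('/');
        return slash < 0 ? fullJid : fullJid.substring(0, slash);
    }

    // "user@domain/resource" -> "resource", or null if there is no resource part
    static String resource(String fullJid) {
        int slash = fullJid.indexOf('/');
        return slash < 0 ? null : fullJid.substring(slash + 1);
    }
}
```

With the response above, extractJid would return XXXXXXXX@gmail.com/D922F673, where D922F673 is the resource the server picked because my bind request did not ask for a specific one.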


<iq to='gmail.com' type='set' id='sess_1'><session xmlns='urn:ietf:params:xml:ns:xmpp-session'/></iq>
<iq from="gmail.com" type="result" id="sess_1"/>

Authenticated, Resource bound and Session created. Now, I need to send a message!

</stream:stream>



Next article on this, I hope to have successfully sent a message.