Monday, August 02, 2004

Experiences In Java Performance Engineering

I have been spending copious amounts of time trying to speed up our Web server. The results have been heartening and the experience typical of performance engineering. It's one of the most amazing activities in software development: incredibly frustrating when you cannot speed things up, but an adrenaline rush when you can. The first person I bother when it works is poor Rajiv.

I am trying to log the various performance enhancements - some are just tips and some are things I discovered or used. Some you may know already, some you may not.


  • Socket Writes : The fewer the better. Buffer your socket writes so you never inadvertently write multiple times to the socket. Copying to a buffer is much less expensive than a socket write, especially for small data.
  • Socket Reads : Similarly, read as much as you can from the socket in one shot. Many small reads hurt performance. The same rules apply to file streams.
  • Socket Connections : Again, the fewer the better. Creating a connection is extremely expensive. You can serve several tens of times more requests per second on a kept-alive connection than if you have to create a new connection for each request.
  • Socket properties : Setting socket properties is expensive. Avoid setting them per request; see if you can set them once per connection instead [a kept-alive connection carries multiple requests].
  • Data Copies

  • String Operations : If you are writing extremely performance-sensitive code such as a highly benchmarked web server :-) then keep Strings to a minimum. I cannot stress this enough. If you have a large character stream that you need to tokenize, parse, etc., use references into character arrays to reduce String operations. Create a class that takes a reference to this character array, a start offset and a length; you can then create lots of instances of this class that point to pieces of the character stream, instead of String objects which make copies of the characters. String concatenations can be quite expensive too.
  • Avoid data copies : This is obvious, isn't it? However, there are so many methods that try to be safe and make data copies that it happens without one even knowing it. Examples are the String constructors and ByteArrayOutputStream.toByteArray(). Don't get me wrong - there's nothing wrong with these classes; it's just that sometimes one doesn't realize that these methods make data copies that affect performance.
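
A minimal sketch of the "references into character arrays" idea from the String Operations tip. The class name CharSlice and its methods are hypothetical, chosen for illustration; this is not the class used in the web server discussed here, and the generics need a newer JDK than 1.3.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: a view onto part of a shared char[]; no characters are copied. */
final class CharSlice {
    private final char[] buf;
    private final int start;
    private final int length;

    CharSlice(char[] buf, int start, int length) {
        if (start < 0 || length < 0 || start + length > buf.length) {
            throw new IndexOutOfBoundsException();
        }
        this.buf = buf;
        this.start = start;
        this.length = length;
    }

    int length() { return length; }

    char charAt(int i) { return buf[start + i]; }

    /** Compares the slice with a String without allocating anything. */
    boolean contentEquals(String s) {
        if (s.length() != length) return false;
        for (int i = 0; i < length; i++) {
            if (buf[start + i] != s.charAt(i)) return false;
        }
        return true;
    }

    /** Copies the characters; meant for display or debugging only. */
    @Override
    public String toString() { return new String(buf, start, length); }

    /** Splits buf[0, len) on delim into slices that all share buf; empty tokens are skipped. */
    static List<CharSlice> split(char[] buf, int len, char delim) {
        List<CharSlice> out = new ArrayList<>();
        int tokenStart = 0;
        for (int i = 0; i <= len; i++) {
            if (i == len || buf[i] == delim) {
                if (i > tokenStart) {
                    out.add(new CharSlice(buf, tokenStart, i - tokenStart));
                }
                tokenStart = i + 1;
            }
        }
        return out;
    }
}
```

Splitting the request line "GET /index.html HTTP/1.1" on spaces this way yields three slices over the same array. The trade-off is that a slice is only valid while the underlying buffer is not overwritten, so it has to be consumed before the buffer is reused.
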
  • Data Structures

  • Set instead of List : A common programming mistake is the wrong choice of data structure for a contains() check. List.contains() is an O(n) operation, whereas contains() on a HashSet is O(1) on average, given a reasonable hashCode(). So, if order does not matter to you, use Set.
  • Object Pooling : Some classes are expensive to create, e.g. large arrays. Use pools of these objects so they can be reused, thereby avoiding GC lags for them and, primarily, their creation costs. Interestingly, PrintWriter, if pooled, shows considerable performance improvement. The creation of the object is expensive because of a call to get the line separator in its constructor.
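
A small sketch of the Set-versus-List point. The header names and the class MembershipDemo are made up for illustration. With only six entries the difference is negligible; it grows with the size of the collection, because the list is scanned element by element while the HashSet is looked up by hash.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MembershipDemo {
    // Hypothetical data: header names a server might treat specially.
    static final List<String> KNOWN_LIST = Arrays.asList(
            "host", "connection", "content-length",
            "content-type", "user-agent", "accept");

    // Same elements, but contains() is a hash lookup instead of a linear scan.
    static final Set<String> KNOWN_SET = new HashSet<>(KNOWN_LIST);

    /** Counts how many of the given names are in 'known'; the cost per name depends on the collection type. */
    static int countKnown(Collection<String> known, String[] names) {
        int hits = 0;
        for (String name : names) {
            if (known.contains(name)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] incoming = {"host", "x-custom", "accept", "cookie"};
        // Both calls give the same answer; only the lookup cost differs.
        System.out.println(countKnown(KNOWN_LIST, incoming));
        System.out.println(countKnown(KNOWN_SET, incoming));
    }
}
```
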
  • Miscellaneous

  • You can buffer at unexpected places. e.g. You may have a logger thread that asynchronously logs certain frequently running activities (such as access logs); notifying the writing thread for every entry is expensive. It can help considerably to collect a few entries and then notify the thread.
  • Integer.toString() is much much faster than integer + "".
  • try-finallies result in interesting performance degradation in Sun JVMs. Read about it here and Rajiv's excellent follow-up on it here.
  • Lazy instantiation : Don't create something until you need it.
  • The usual optimizations always apply - loop optimizations, moving loop-invariant code out of the loop, unused variables, repetitive processing, etc.
  • Perceived performance : [I would love to collect my thoughts on this sometime - we have had some very interesting experiences with perceived performance enhancements over the years] The sooner you respond so that the browser starts to refresh [if memory serves me right, a 50-millisecond response is perceived as instantaneous by a human being], the more responsive the web server looks, even though the total time for the response might be the same as if all the data were returned in a single write.
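
To make the first tip on socket writes concrete, here is a minimal sketch. CountingOutputStream is a hypothetical stand-in for a socket's output stream (with a real socket you would wrap socket.getOutputStream() instead), and the response text and the 8192-byte buffer size are illustrative assumptions.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a socket stream that counts how often it is written to. */
final class CountingOutputStream extends OutputStream {
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    int writeCalls = 0;

    @Override
    public void write(int b) {
        writeCalls++;
        sink.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) {
        writeCalls++;
        sink.write(b, off, len);
    }
}

final class BufferedWriteDemo {
    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    /** Writes a tiny response in four pieces, the way handler code often ends up doing. */
    static void writeResponse(OutputStream out) throws IOException {
        out.write(ascii("HTTP/1.1 200 OK\r\n"));
        out.write(ascii("Content-Length: 2\r\n"));
        out.write(ascii("\r\n"));
        out.write(ascii("ok"));
        out.flush();
    }

    /** Returns how many writes reached the underlying stream. */
    static int underlyingWrites(boolean buffered) {
        CountingOutputStream raw = new CountingOutputStream();
        OutputStream out = buffered ? new BufferedOutputStream(raw, 8192) : raw;
        try {
            writeResponse(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return raw.writeCalls;
    }
}
```

Unbuffered, each of the four pieces reaches the underlying stream separately; with the BufferedOutputStream in between they are copied into the buffer and passed on as one write when flush() is called.
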
Oh well, those are all I can think of right now. If I think of any I missed out, I'll post an addendum.


Gabriel Mihalache said...

First off, all suck-up aside, it shows that you work for pramati. Secondly, your simple advice on Set versus List is pure gold!

Regarding human perception... the human eye can do about 28 "frames per second", but you have no guarantee that they'll overlap with the refreshes of the CRT. 35-40 milliseconds should be more than enough.

Anonymous said...

Well, ordering (or lack thereof) isn't the sole reason to use Set instead of List. Recall that Set stipulates that no duplicates are allowed whereas List allows duplicates. You should ONLY use a Set where your data is, in fact, a Set and does not contain duplicates.

Anonymous said...

Doesn't the StringBuffer (starting with JDK1.2 or JDK1.3) use copy-on-write? So, given

String s1 = "This is a test";
String s2 = s1.substring(3,5);

s2 will share its character array with s1. In essence, the String object is already doing what you are recommending to implement with custom string reference objects.

Ramesh L. said...

Cool stuff Sachin!

Anonymous said...

StringBuffer's substring() method delegates to String(int,int,char[]) which does, as you say, just refer to the private char[] of the original String. (Incidentally, is this a potential memory leak if you have long-lived references to small substrings of large Strings?) I don't doubt Sachin, but I'm curious what causes the performance problems.

Anonymous said...

Hmm, I thought object creation wasn't that much of a problem with the new GC.

Anonymous said...

It is possible to write a server without any object creation, except for the objects Java creates for security checking. How? No matter what your server does, every request uses the same set of objects. Preallocate them, then reuse them. If there are some variations, preallocate the objects required for that too. If there are shared resources (files, db, other tiers) pool them. A server's tasks are very repetitive: there are never surprises, so you never need to use "new".

This is a server - why do you need Strings? Strings are for humans. Pool char or byte arrays.

Are your arrays not exactly the right size? Doesn't matter: pool big ones, and only use the number of bytes you need. Computers have lots of memory nowadays.


Sachin said...

Firstly, thanks to Dion for linking to my article here. My primary intention was to chronicle my findings while involved in a performance tuning exercise. The effort was focused on brute throughput improvements and therefore involved more low-level tweaking. I would have put more effort into the article if I had known I was addressing an audience of this magnitude :-). So, being a victim of unexpected attention, I think it would be prudent to cover a couple of findings in more detail since quite a few people have mentioned them.

I would very strongly agree with people who have mentioned that 'clean' code is much better than 'clever' code. Some of the tunings are potential cases where encapsulation gets broken and code maintainability suffers. Also, if performance is satisfactory, I would prefer not to go overboard tuning it. Sometimes though, when you are competing with other very similar [standards-based] products, you cannot explain to a potential customer that your code is slower than XYZ's because it is much 'cleaner'.

A lot of people have mentioned String.substring* making a reference to the original char array. Yes, it does that. There are, however, some useful cases for using the character array and pointers to pieces of data inside it [let's call it a CharBuffer] without using String. Let's take the case of the Web server again. HTTP request headers are read from the socket's input stream into a buffer. There is no way of creating a String from this buffer without a data copy. Subsequently, there is no way to retrieve the character array back from the String without a data copy.
There are cases where you need to convert the String to a char[], e.g. when the request line needs to be written to the access logs. CharBuffer saves data copies in these situations. Another situation arises because of the case-insensitive nature of HTTP header names. It's cheaper to convert them to lower case or upper case in place and then process the request - again CharBuffer helps.
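
A rough sketch of the in-place case conversion described above. HeaderCase and its method names are hypothetical and are not the CharBuffer class from the server; the sketch also assumes ASCII header names.

```java
final class HeaderCase {
    /** Lower-cases ASCII letters in buf[start, start+len) in place; no new array or String is created. */
    static void lowerAsciiInPlace(char[] buf, int start, int len) {
        for (int i = start; i < start + len; i++) {
            char c = buf[i];
            if (c >= 'A' && c <= 'Z') {
                buf[i] = (char) (c + ('a' - 'A'));
            }
        }
    }

    /** Returns the index of the first ch in buf[start, start+len), or -1 if it is absent. */
    static int indexOf(char[] buf, int start, int len, char ch) {
        for (int i = start; i < start + len; i++) {
            if (buf[i] == ch) {
                return i;
            }
        }
        return -1;
    }

    /** Compares a region of the buffer with an expected name without allocating. */
    static boolean regionEquals(char[] buf, int start, int len, String expected) {
        if (len != expected.length()) return false;
        for (int i = 0; i < len; i++) {
            if (buf[start + i] != expected.charAt(i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // In a server this array would be the buffer the request was read into.
        char[] line = "Content-Length: 42".toCharArray();
        int colon = indexOf(line, 0, line.length, ':');
        lowerAsciiInPlace(line, 0, colon);   // only the header name is touched
        System.out.println(regionEquals(line, 0, colon, "content-length"));
    }
}
```

Because the conversion overwrites the buffer, the original casing of the header name is lost; that is fine for lookups but matters if the raw header has to be logged or forwarded verbatim.
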

Object pooling does help considerably, although it could again be considered excessive optimization for most software. Although GC performance has improved significantly version after version, if you have long-lasting allocations such as buffers, pooling does help. PrintWriter, as I mentioned in the article, is a different case: the cost of executing its constructor makes it a good candidate for pooling.
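
A minimal sketch of pooling large buffers. BufferPool, its size limits and its synchronization strategy are assumptions made for illustration, not the pool used in the server; it also relies on collections newer than JDK 1.3.

```java
import java.util.ArrayDeque;

/** Hypothetical pool of equally sized byte buffers. */
final class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private final int maxPooled;

    BufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Hands out a pooled buffer if one is free, otherwise allocates a new one. */
    synchronized byte[] acquire() {
        byte[] b = free.pollFirst();
        return (b != null) ? b : new byte[bufferSize];
    }

    /** Returns a buffer to the pool; wrong-sized or surplus buffers are left to the GC. */
    synchronized void release(byte[] b) {
        if (b != null && b.length == bufferSize && free.size() < maxPooled) {
            free.addFirst(b);
        }
    }

    /** Number of buffers currently waiting to be reused. */
    synchronized int available() {
        return free.size();
    }
}
```

A request handler would call acquire() at the start and release() in a finally block. Reused buffers still hold the previous request's bytes, so callers have to track how many bytes are valid rather than relying on the array being zeroed.
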

* For business reasons, our code compiles with JDK 1.3, so I cannot make use of JDK 1.4 features. If some point is redundant as a result of newer classes in JDK 1.4, please bear with me :-)

Adrian said...

Make me a happy person. Change your blogger settings to automatically ping weblogs. Then I'll know when you've updated. I look forward to more posts.

Ashish said...

Knock Knock.