[Building Sakai] better solution to chat performance problem

Charles Hedrick hedrick at rutgers.edu
Thu Nov 3 15:45:45 PDT 2011


I think I have found a more serious problem that was generating the others.

I've checked a new diff into https://jira.sakaiproject.org/browse/SAK-21353. You'll probably want both diffs, although the first one will no longer have much significance.

The code is doing premature optimization. 

In ChatManagerImpl:update, a message is distributed to all observers. However what is sent is the message ID

                        observer.receivedMessage(ref.getContainer(), ref.getId());

for each message. In receivedMessage, a ChatDelivery is generated for that address. The ChatDelivery has an argument that can be either the ID or the message itself. It sounds very clever to send the ID around, and only expand it to the actual message when needed. The problem is that a separate ChatDelivery is created for every window that is listening. By sending the ID, each window gets its own Delivery object, with the ID, and thus has to create its own actual message object by fetching it from the database. 

I believe the right approach is to fetch the message object once, in ChatManagerImpl:update, and pass the actual object to receivedMessage. Then everyone would share a single message object, avoiding the multiple fetches from the database.
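To make the difference concrete, here is a minimal sketch of the two fan-out strategies. It is not the actual Sakai code: Message, MessageStore, Window and the method names are hypothetical stand-ins, and the "database" is just a counter, so it only illustrates how many fetches and how many message objects each approach produces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ChatFanOutSketch {

    /** Hypothetical stand-in for a persisted chat message. */
    static final class Message {
        final String id;
        final String body;
        Message(String id, String body) { this.id = id; this.body = body; }
    }

    /** Hypothetical stand-in for the database; counts how often a message is loaded. */
    static final class MessageStore {
        final AtomicInteger fetches = new AtomicInteger();
        Message load(String id) {
            fetches.incrementAndGet();
            // A fresh object per load, as an uncached (or copy-on-read) fetch would give.
            return new Message(id, "body of " + id);
        }
    }

    /** Hypothetical stand-in for one listening window (one delivery target). */
    static final class Window {
        final List<Message> delivered = new ArrayList<>();

        // ID-based delivery: every window expands the ID into a message itself.
        void receivedMessageId(String id, MessageStore store) {
            delivered.add(store.load(id));
        }

        // Object-based delivery: the window just keeps the shared instance.
        void receivedMessage(Message message) {
            delivered.add(message);
        }
    }

    /** Pass only the ID around; returns the number of fetches performed. */
    static int fanOutById(List<Window> windows, MessageStore store, String id) {
        for (Window w : windows) {
            w.receivedMessageId(id, store);
        }
        return store.fetches.get();
    }

    /** Fetch once up front and share the object; returns the number of fetches. */
    static int fanOutByObject(List<Window> windows, MessageStore store, String id) {
        Message message = store.load(id); // the single fetch, done in the update step
        for (Window w : windows) {
            w.receivedMessage(message);
        }
        return store.fetches.get();
    }

    static List<Window> windows(int n) {
        List<Window> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            result.add(new Window());
        }
        return result;
    }

    public static void main(String[] args) {
        int byId = fanOutById(windows(5), new MessageStore(), "m1");
        int byObject = fanOutByObject(windows(5), new MessageStore(), "m1");
        System.out.println("fetches by ID: " + byId + ", by object: " + byObject);
    }
}
```

In this sketch the ID-based path does one load per window, while the object-based path does one load per message and every window ends up holding the same instance.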

One of the issues we saw was high GC rates. I don't think the Hibernate cache avoids creating a new object each time; it just keeps the objects from having to come from the database. Hence the Hibernate cache alone won't help with the excessive GC. It also won't eliminate the fairly significant overhead of calling Hibernate.

The problem with doing what I suggest is that the message object will have to be fetched in update, which is called inside the event delivery thread. However it's now clear that the problem before wasn't just that a DB query was being done in that thread, but that it was being done once per address. If we do it once per message, we can probably survive. 

Unfortunately this has to be done inside the event delivery thread, since on systems other than the originating one, that's the highest level of code that sees the message at all.

I think the cost of a single DB fetch is low enough that it's worth avoiding two objects being generated per delivery window, even out of the Hibernate cache.

I'm going to leave the cache in place, because it does still avoid some queries, such as the fairly complex fetch of the channel information.


