[Building Sakai] Java 6 seems to be a problem for us

Charles Hedrick hedrick at rutgers.edu
Wed Oct 14 04:58:28 PDT 2009


I can't run any more experiments to answer this, since my user
population has run out of patience with testing this semester. The
current settings are running well. Yes, I have spent hours watching  
java with jstat -gc and catalina.out, so I do know how the various  
settings behave.

I didn't let Java choose the garbage collector because the  
documentation suggested that it would always choose parallel. I don't  
agree that the priority is throughput. We have plenty of CPU, and  
adding more is cheap. I don't care about pauses of a second or two,  
but longer ones cause a bad user experience, and pauses of a minute are
unacceptable. Users think the system is down (actually the load  
balancer thinks the system is down and switches people, thus requiring  
a new login). I saw long pauses with the parallel GC. I haven't seen  
any so far with CMS with this tuning.
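
A rough way to spot pauses like the ones described above is to watch how much the cumulative GC time reported by jstat -gc grows between samples. The sketch below assumes the output starts with a header row that contains a GCT column (total GC time in seconds). The function name flag_gc_jumps and the 2-second default are illustrative choices, not part of any tool. Several collections can fall inside one sampling interval, so the growth is only an upper bound on any single pause.

```shell
# Reads jstat -gc style output on stdin and reports every sample where the
# cumulative GC time (GCT column) grew by more than LIMIT seconds.
# Usage (hypothetical pid, 5000 ms interval): jstat -gc 1234 5000 | flag_gc_jumps 2
flag_gc_jumps() {
  awk -v limit="${1:-2}" '
    # Header row: remember which column holds cumulative GC time.
    col == 0 { for (i = 1; i <= NF; i++) if ($i == "GCT") col = i; next }
    # Skip any repeated header rows.
    $col == "GCT" { next }
    seen && ($col - prev) > limit {
      printf "sample %d: GC time grew %.2fs\n", n + 1, $col - prev
    }
    { prev = $col; seen = 1; n++ }
  '
}
```

Locating the column by its header name rather than by position keeps the sketch independent of the exact column layout, which differs between Java versions.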

Having chosen that, I then have to specify survivor space, as you
note. However this seems not to be true in 6: in 5 you have to say
something or the survivor spaces aren't used, but in 6 the sizes
chosen are reasonable; it's just that it uses only 1 or 2 survivor
generations, which may not be optimal.
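
To make the survivor-space point concrete, here is a small sketch of how the young generation is divided for a given -XX:SurvivorRatio, which is the ratio of eden to one survivor space. It uses the -Xmn3g and -XX:SurvivorRatio=8 values from the Java 5 settings quoted below. The function name young_layout_mb is made up for illustration, and the ratio of 1024 in the second call is only a hypothetical "very high" value, not a documented default.

```shell
# young_layout_mb NEW_MB RATIO
# Splits the young generation into eden plus two equal survivor spaces,
# where RATIO = eden / one survivor space (integer MB, rounded down).
young_layout_mb() {
  local new_mb=$1 ratio=$2
  local survivor=$(( new_mb / (ratio + 2) ))
  local eden=$(( new_mb - 2 * survivor ))
  echo "eden=${eden} survivor=${survivor}"
}

young_layout_mb 3072 8      # -Xmn3g with SurvivorRatio=8
young_layout_mb 3072 1024   # hypothetical very high ratio: survivors almost vanish
```

With a ratio of 8 each survivor space is roughly a tenth of the young generation. With a very large ratio the survivor spaces shrink to a few MB, which is the "effectively no survivor spaces" situation.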

Possibly the use of a large new space is out of date. When I was doing the
primary tuning, some of the code handled uploaded files by reading  
them into a byte array. So it needed to allocate an array the size of  
the file being loaded. We want to support 1 GB files. Until we used a  
large new, we'd get problems when someone uploaded a large file. I  
can't find the URL at the moment, but Sun published a white paper  
giving recommendations for 5 based on experience. They also  
recommended large new. Indeed my settings were taken almost directly  
from that paper.

It is certainly possible that old is too large. I've been slowly  
lowering it. But I've watched the usage after full GC's, and haven't  
been comfortable with much lower. I want a reasonable safety margin.  
With these settings, old is about 9 GB. Since the default starting  
point for CMS in Java 5 is 68%, that's really a maximum size of 6 GB.  
I've seen the heap after a full GC be 4 GB, so I don't think a smaller
heap makes sense.
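
The arithmetic behind those numbers can be sketched as follows, assuming the -Xmx12000m and -Xmn3g values from the Java 5 settings quoted below and treating the old generation as simply heap minus new. The function names are made up for illustration.

```shell
# old_gen_mb HEAP_MB NEW_MB: old generation size, assuming old = heap - new.
old_gen_mb() { echo $(( $1 - $2 )); }

# cms_trigger_mb OLD_MB PERCENT: old-generation occupancy (MB) at which a
# CMS cycle starts for a given initiating occupancy percentage.
cms_trigger_mb() { echo $(( $1 * $2 / 100 )); }

old=$(old_gen_mb 12000 3072)   # 8928 MB, i.e. "about 9 GB"
cms_trigger_mb "$old" 68       # 6071 MB, the ~6 GB effective maximum
cms_trigger_mb "$old" 75       # 6696 MB with CMSInitiatingOccupancyFraction=75
```

Against a post-full-GC footprint of about 4 GB, that leaves roughly 2 GB of headroom before a CMS cycle starts at 68%.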

Java considers 64-bit Solaris systems to be server-class, so -server
isn't needed for us.
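
One way to confirm which VM was actually selected is to look at the version banner, which names the VM in use. This is a minimal sketch; is_server_vm is a made-up helper name, and it only checks for the string "Server VM" in whatever text it is given.

```shell
# Succeeds if the text on stdin mentions the server VM.
# Usage: java -version 2>&1 | is_server_vm && echo "server VM in use"
# (java -version writes its banner to stderr, hence the 2>&1.)
is_server_vm() { grep -q 'Server VM'; }
```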



On Oct 14, 2009, at 5:42:56 AM, Aaron Zeckoski wrote:

>> JAVA_OPTS=" -d64 -Xmx12000m -Xms12000m  -XX:+UseConcMarkSweepGC
>> -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=75
>> -XX:MaxPermSize=512m -XX:PermSize=512m  -XX:+DisableExplicitGC"
>>
>> For 6 I decided to let it tune as much as possible, so I started with
>>
>> JAVA_OPTS=" -d64 -Xmx12000m -Xms12000m  -XX:+UseConcMarkSweepGC
>> -XX:+UseParNewGC -XX:MaxPermSize=512m -XX:PermSize=64m "
>
> If you run with these settings you will effectively have no survivor
> spaces since the default survivor space ratio when using
> UseConcMarkSweepGC is a very high number. When I say allowing the JVM
> to autotune (via ergonomics) I mean something more like this (as a
> starting point):
> JAVA_OPTS="-server -d64 -Xmx12000m -XX:MaxPermSize=512m
> -XX:GCTimeRatio=49"
>
> You may want to allocate the complete memory block by setting the Xms
> in production but you will want to know how much you truly need before
> you set a massive size like 12 GB. It takes a long time to garbage
> collect so much memory and if you can run with less you definitely
> will want to. You would also want to play with the ergonomics
> settings.
> http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html
>
> I like this article:
> http://kirk.blog-city.com/advice_on_jvm_heap_tuning_dont_touch_that_dial.htm
>
> I would not leave out the -server option, it should be the first one
> according to Sun. I would also not use the Xmn setting and instead
> adjust ratios. Finally, I would use the parallel collector
> (UseParallelGC) and parallel compactor (UseParallelOldGC) rather than
> the low pause collector (UseConcMarkSweepGC) since in web apps the top
> priority is throughput and not pause times. That's where I would start
> anyway and then begin collecting numbers to see what settings are
> working in my environment.
>
> Did you get any charts of the memory spaces over time when you were
> running with the trimmed down settings? Do you have any for the
> current settings (over maybe 24 hours)?
> -AZ
>
>
> On Wed, Oct 14, 2009 at 5:16 AM, Charles Hedrick  
> <hedrick at rutgers.edu> wrote:
>> On Oct 13, 2009, at 11:18:37 AM, Aaron Zeckoski wrote:
>>
>>>> Have you tried running without the tuning params and seeing where the
>>>> auto tuning settles? You are currently forcing the JVM to run a series
>>>> of params that might be brutally inefficient and it can be eye-opening
>>>> to see how the JVM tunes itself. You may have already done this and
>>>> this might be how you settled on these numbers but I thought I would
>>>> ask anyway.
>>>
>>
>>
>>
>> With 5 I started with recommendations from a Sun tuning paper. In 5 more
>> tuning is needed than in 6. Indeed 6 tends to default to something close to
>> the recommendations we followed for tuning 5.
>>
>> For 5:
>>
>> -d64 -Xmx12000m -Xms12000m  is necessary.
>>
>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC  is the basic choice of GCs, which
>> won't default in 5.
>>
>> -Xmn3g  was done fairly early, to give us a large enough eden to survive
>> uploading large files. I believe we allow 1 GB uploads. We had problems
>> without this.
>>
>> -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=15
>>  is from a Sun tuning paper.  In 6 we tried without it, and in retrospect
>> I'm not sure it's needed in 6. However in 6, it tends to default to one or
>> two generations, which a number of people think is too small.
>>
>> -XX:MaxPermSize=512m -XX:PermSize=512m    max is certainly needed. I've
>> watched with jstat and seen both the full gc's caused by starting with a
>> small permsize, and the pauses caused by those GCs.
>>
>> -XX:+UseMembar -XX:-UseThreadPriorities     is probably unnecessary, as I've
>> said. However I'd tried it both ways and not seen any performance problem.
>> I'd probably start without this, particularly in 6.
>>
>> -XX:+DisableExplicitGC   is recommended commonly. We really don't want audio
>> uploads in Samigo to trigger a full GC (this is fixed in trunk).
>>
>>
>> For 6, if we get the pauses fixed so we can go back I think the following
>> would do:
>>
>> JAVA_OPTS=" -d64 -Xmx12000m -Xms12000m  -XX:+UseConcMarkSweepGC
>> -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=75
>> -XX:MaxPermSize=512m -XX:PermSize=512m  -XX:+DisableExplicitGC"
>>
>> For 6 I decided to let it tune as much as possible, so I started with
>>
>> JAVA_OPTS=" -d64 -Xmx12000m -Xms12000m  -XX:+UseConcMarkSweepGC
>> -XX:+UseParNewGC -XX:MaxPermSize=512m -XX:PermSize=64m "
>>
>> I can tell you that none of the changes caused a problem. Some are probably
>> just unnecessary. The things I feel strongly about in 6 based on our
>> experience are the CMSInitiatingOccupancyFraction and starting permsize
>> larger. I think you'll find a number of experts that think the default
>> initiating point is way too high in 6, and that too few survivor generations
>> are used by default. At any rate, with the default
>> CMSInitiatingOccupancyFraction we saw slow full GCs. I suspected memory
>> fragmentation, but it's possible that whatever problem is resulting in our
>> slow minor GCs was actually at fault, and when that is fixed we won't need
>> this either.
>>
>>
>
>
>
> -- 
> Aaron Zeckoski (azeckoski (at) vt.edu)
> Senior Research Engineer - CARET - University of Cambridge
> https://twitter.com/azeckoski - http://www.linkedin.com/in/azeckoski
> http://aaronz-sakai.blogspot.com/ - http://tinyurl.com/azprofile
