[Deploying Sakai] Sakai crashing

Charles Hedrick hedrick at rutgers.edu
Tue Sep 14 12:22:57 PDT 2010


I doubt that there's much experience in the community with that kind of setup. I believe most people are using the concurrent GC. A 768m perm gen is probably more than you need; we use 512m. For reference, here are our settings:

JAVA_OPTS=" -d64 -Dsun.lang.ClassLoader.allowArraySyntax=true -Xmx10500m -Xms10500m -Xmn2500m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=80 -XX:MaxPermSize=512m -XX:PermSize=512m -XX:+DisableExplicitGC -XX:+DoEscapeAnalysis -Dhttp.agent=Sakai "
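If it helps, one common place to put a string like that is Tomcat's bin/setenv.sh, which catalina.sh sources at startup if it exists. A minimal sketch, with the same numbers as above (the memory figures suit one JVM on a 16 GB box and are not a recommendation for your hardware):

# $CATALINA_HOME/bin/setenv.sh -- sourced by catalina.sh at startup if present
# Xms is kept equal to Xmx so the heap is committed up front; perm gen is sized explicitly.
JAVA_OPTS="-d64 -Xms10500m -Xmx10500m -Xmn2500m \
  -XX:PermSize=512m -XX:MaxPermSize=512m \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=80 \
  -XX:+DisableExplicitGC -Dsun.lang.ClassLoader.allowArraySyntax=true -Dhttp.agent=Sakai"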

Given a machine with 16 GB (which is what we have), I wouldn't try 4 copies of Tomcat. I'd actually run just one, but I'm unusual.

We have 7000 simultaneous users. We split that across 3 to 4 systems, each a single JVM in 16 GB. Processors are old Opterons, slow by today's standards. We have 400+ threads active per machine.


On Sep 8, 2010, at 7:36 PM, Mike De Simone wrote:

> What is your maxThreads set to in server.xml?  We normally use 500.  That, however, is not the likely culprit from what I can tell.
> 
> If you are setting max & min heap to 4g and you have 4 Tomcats on a server with 16g of RAM, that seems too high: once you add in the perm gen space and room for the OS and for Apache, you could be oversubscribing your RAM, hence the OOM error one of the Tomcats saw.
> 
> Maybe change Xms to 1g to allow the heap to fluctuate, or perhaps reduce Xmx to 3g.
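> Back-of-the-envelope, using the settings you posted (the per-JVM native overhead below is a rough guess, not a measurement):
> 
>   heap=4096; perm=650; native=200; jvms=4    # MB per JVM; native/thread-stack overhead is a guess
>   echo "$(( jvms * (heap + perm + native) )) MB committed vs 16384 MB physical"
>   # prints: 19784 MB committed vs 16384 MB physical -- before Apache and the OS get anything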
> 
> Otherwise, when I see these seg faults, I usually wind up installing a clean Tomcat, which more often than not fixes the problem.  Not sure why; that's just what I've seen in the past.
>  
> 
> Thanks,
> 
> -------------------------------
> Mike DeSimone
> Sr. Technical Consultant
> rSmart | 602-490-0473
> 
> 
> On Wed, Sep 8, 2010 at 16:27, Benito J. Gonzalez <bgonzalez2 at ucmerced.edu> wrote:
> We are having problems keeping Sakai running in production.  We have
> seen a crash today and yesterday.  About to dig into the dumps.
> 
> Anyone else seen issues with
> 
> I was using JConsole at the time of the crashes.  Memory usage was
> between 800m and 1300m for both Tomcats that failed.  Thread counts were
> bouncing around 380, which is normal for us.  CPU usage did not spike.
> 
> Our environment:
> 2x Solaris: SunOS 5.10 Generic_125101-05, 16G phys mem, i86 CPU (not
> sure of details), JDK 1.5.0_20
> One box is running Apache 2.0, 4x Tomcats with the following memory
> settings:
> -d64 -server -XX:MaxNewSize=500m -XX:MaxPermSize=650m -XX:+UseParallelGC
> -Djava.awt.headless=true -Dhttp.agent=Sakai-News-Tool
> -Xms4096m -Xmx4096m
> 
> 3 of the 4 Tomcats crashed (see log messages below).  The one that stayed up is
> an admin instance that is not part of the load balancer.
> 
> #
> # An unexpected error has been detected by HotSpot Virtual Machine:
> #
> #  SIGBUS (0xa) at pc=0xfffffd7ff81813b0, pid=15481, tid=2335
> #
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (1.5.0_20-b02 mixed mode)
> # Problematic frame:
> # J  java.util.Hashtable.get(Ljava/lang/Object;)Ljava/lang/Object;
> #
> # An error report file with more information is saved as hs_err_pid15481.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> 
> and
> 
> #
> # An unexpected error has been detected by HotSpot Virtual Machine:
> #
> #  SIGBUS (0xa) at pc=0xfffffd7ff8328d00, pid=15220, tid=2242
> #
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (1.5.0_20-b02 mixed mode)
> # Problematic frame:
> # J  sun.reflect.Reflection.getCallerClass(I)Ljava/lang/Class;
> #
> # An error report file with more information is saved as hs_err_pid15220.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> 
> and
> 
> #
> # An unexpected error has been detected by HotSpot Virtual Machine:
> #
> #  SIGBUS (0xa) at pc=0xfffffd7ff800acca, pid=168, tid=69
> #
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (1.5.0_20-b02 mixed mode)
> # Problematic frame:
> # j  java.lang.OutOfMemoryError.<init>(Ljava/lang/String;)V+0
> #
> # An error report file with more information is saved as hs_err_pid168.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> 
> 
> Second box is running one Tomcat with:
> -d64 -server -Xms2048m -Xmx2048m -XX:MaxNewSize=500m
> -XX:MaxPermSize=768m -XX:+UseParallelGC -Djava.awt.headless=true
> -Dhttp.agent=Sakai-News-Tool
> 
> The second box did not crash and seemed to handle all the load.
> 
> Can anyone shed some light on this?
> 
> Thanks,
> 
> --
> Benito J. Gonzalez
> Manager, Enterprise Web Application Development
> Information Technology Department
> University of California, Merced
> Desk: 209.228.2974
> Cell: 209.201.5052
> Email: bgonzalez2 at ucmerced.edu
> 
> _______________________________________________
> production mailing list
> production at collab.sakaiproject.org
> http://collab.sakaiproject.org/mailman/listinfo/production
> 
> TO UNSUBSCRIBE: send email to production-unsubscribe at collab.sakaiproject.org with a subject of "unsubscribe"
