[Building Sakai] Sakai Memory Issues

David McIntyre david.mcintyre at stanford.edu
Wed Oct 9 14:28:28 PDT 2013


John,

I'm working on installing AppDynamics right now, and I'd like to see your
configuration for assigning users to a business request. Even better would
be tool identification.

Thanks,
  David


On Wed, Oct 9, 2013 at 1:52 PM, John Bush <jbush at anisakai.com> wrote:

> I think I just found the root cause.  I went back and looked at the
> requests right before heap was exhausted, and every single time I've
> been able to trace it back to loading the announcements tool or synoptic.
> I'm seeing sites that have 1,500 announcements, and two that have
> 150,000, all created at the same time (presumably from a site dupe or
> import or something).  So these requests obviously take forever and
> cause memory to fill up, a big GC pause comes in which backs up
> requests, and a death spiral ensues.
>
> I'm not sure exactly what defect/user activity is at play here, but
> something is going on to have created so many announcements.  I'd
> guess site dupe or import, or the announcement merge feature or
> something like that. If anyone else is still having trouble, maybe
> mine your database and see if you have any sites with a ridiculous
> number of announcements.
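
A minimal sketch of that database check, in Java over plain JDBC. The table and column names (ANNOUNCEMENT_MESSAGE, CHANNEL_ID), the channel ID layout (/announcement/channel/<siteId>/main), and the class and method names are assumptions that do not come from this thread; check them against your own schema before running anything.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: table, column, and channel ID layout are assumptions.
final class AnnouncementAudit {

    // Assumed schema: one row per announcement, keyed by its channel.
    static final String COUNT_QUERY =
        "SELECT CHANNEL_ID, COUNT(*) AS N FROM ANNOUNCEMENT_MESSAGE "
      + "GROUP BY CHANNEL_ID HAVING COUNT(*) > ? ORDER BY N DESC";

    // Assumed channel ID layout: /announcement/channel/<siteId>/main
    static String siteIdOf(String channelId) {
        String[] parts = channelId.split("/");
        // Fall back to the raw channel ID if the layout is not as assumed.
        return parts.length >= 4 ? parts[3] : channelId;
    }

    // Returns siteId -> announcement count for channels above the threshold.
    static Map<String, Long> oversizedSites(Connection db, int threshold)
            throws SQLException {
        Map<String, Long> result = new LinkedHashMap<>();
        try (PreparedStatement ps = db.prepareStatement(COUNT_QUERY)) {
            ps.setInt(1, threshold);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Sum per site in case a site has more than one channel.
                    result.merge(siteIdOf(rs.getString(1)), rs.getLong(2), Long::sum);
                }
            }
        }
        return result;
    }
}
```

Calling oversizedSites(connection, 1000) against a read replica would list the sites worth a closer look; the threshold of 1000 is arbitrary.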
>
> I have a data collector in AppDynamics that gives me the user and site
> data for business requests.  If anyone else is using AppDynamics I can
> tell you how I did that; it's pretty handy.
>
> On Wed, Oct 9, 2013 at 7:27 AM, John Bush <jbush at anisakai.com> wrote:
> > Yes, we could take it early on.  We have an alert in place to be able
> > to do that next time, but we haven't tried it yet because the problem
> > hasn't occurred again since Oct 4th; GC has been keeping up ok.
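
One way such an early-warning alert could look, sketched in Java with the standard java.lang.management API. This is not the alert described above, whose implementation the thread does not show; the class name, method names, and the 0.9 threshold in the usage note are illustrative assumptions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Sketch only: names and thresholds are illustrative, not from the thread.
final class HeapAlert {

    // True when a pool is filled to at least the given fraction of its maximum.
    static boolean overThreshold(long usedBytes, long maxBytes, double fraction) {
        // getMax() is -1 when the pool has no defined maximum; never alert then.
        return maxBytes > 0 && usedBytes >= (long) (maxBytes * fraction);
    }

    // Checks every heap pool using its usage measured after the last GC,
    // so the result reflects what collection failed to free.
    static boolean anyHeapPoolOver(double fraction) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            // May be null if the pool does not support collection usage.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null
                    && overThreshold(afterGc.getUsed(), afterGc.getMax(), fraction)) {
                return true;
            }
        }
        return false;
    }
}
```

Polling anyHeapPoolOver(0.9) from a scheduled task and pulling the node out of rotation when it returns true would leave time to take a heap dump before the node is too far gone.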
> >
> > In this particular instance the server was just too far gone; it
> > couldn't recover even by bleeding users.  We are testing IU's courier
> > fix with my jforum fix and pushing that out soon.  I'm not convinced
> > either of these fixes addresses the root of our problem, but it's the
> > best we've got until we get more info.
> >
> > On Wed, Oct 9, 2013 at 6:24 AM, David Haines <dlhaines at umich.edu> wrote:
> >> John,
> >>
> >>    Would your setup allow getting a heap dump early on, before the
> >> node gets overloaded?  The server would be unavailable during the
> >> dump, which may take many minutes.  At Michigan we've gotten heap
> >> dumps by taking a server out of the active cluster after running for
> >> a while, and then getting a heap dump the next day when people with
> >> active sessions on that server have drifted away.
> >>
> >> - Dave
> >>
> >>
> >> On Tue, Oct 8, 2013 at 12:10 PM, John Bush <jbush at anisakai.com> wrote:
> >>>
> >>> Ok this is interesting.  Jeremy, we tried to get a heap dump but our
> >>> node was so overloaded at that point it was impossible.  The patch
> >>> from IU is for a ConcurrentHashMap filling up.  I would have expected
> >>> appdynamics to find that.
> >>>
> >>> I was going to try to use yourkit to look at the heap, I think it can
> >>> do it, although I haven't used it for that purpose before.  Jeremy,
> >>> it might be helpful if you could review this patch and dig in your
> >>> heap a bit and see if it looks like the same thing.
> >>>
> >>>
> >>> On Tue, Oct 8, 2013 at 8:51 AM, Thomas, Gregory J <gjthomas at iu.edu> wrote:
> >>> > Indiana recently had a memory leak issue with the delivery objects in
> >>> > courier and the chat tool.  We have a patch here:
> >>> > https://jira.sakaiproject.org/browse/SAK-21398  It hasn't been
> >>> > committed to trunk yet, as there were some changes to its
> >>> > implementation that were preferred to be done first.  However, we
> >>> > have had the current patch in our production code since Thursday of
> >>> > last week.  It passed our load testing, and it passed our dreaded
> >>> > Sunday evening of high server load/usage.
> >>> >
> >>> > Not sure if it's related to the issues brought up here, but thought
> >>> > I'd throw it out there as something we've definitely run into.
> >>> >
> >>> > Thanks,
> >>> > Greg
> >>> >
> >>> >
> >>> >
> >>> > On 10/8/13 10:57 AM, "Kusnetz, Jeremy" <JKusnetz at APUS.EDU> wrote:
> >>> >
> >>> >>We have been seeing similar memory problems since upgrading to 2.9.3
> >>> >>also.  The oldgen space just continually grows until it's out of
> >>> >>space.  GC just doesn't seem to clean everything up.  We have tried
> >>> >>lots of different combinations of GC settings; the best we can do is
> >>> >>slow down how quickly we run out of memory, so we are forced to
> >>> >>restart Sakai multiple times a week.
> >>> >>
> >>> >>Interestingly, we are running two instances of Sakai, both from the
> >>> >>same code base.  One instance has the memory issue while the other
> >>> >>does not.
> >>> >>
> >>> >>The one that does NOT have the memory issues is primarily using
> >>> >>jforums, assignment1, and gradebook1.  The one that IS having memory
> >>> >>issues is primarily using msgcntr, assignment2, and gradebook2.
> >>> >>There is a small amount of jforum usage in the one with memory
> >>> >>issues, but it's only in a couple of sites.
> >>> >>
> >>> >>There are a few other differences.  The one with memory issues has 35
> >>> >>sakai nodes, while the one without memory issues has only 2 sakai
> >>> >>nodes.  But on average the one with 2 nodes has more sessions on each
> >>> >>of those nodes than the one with 35 nodes.  The one without memory
> >>> >>issues is using LDAP, while the one without memory issues is not
> >>> >>using LDAP; all users are local, but we do use the soap axis
> >>> >>SakaiPortalLogin for most of its session creation.
> >>> >>
> >>> >>We are also using appdynamics to try to find the memory leaks, but
> >>> >>its automatic leak detection only tracks Map and Collection
> >>> >>libraries.  There is another tool to map any custom object, but we
> >>> >>would need to know what object to look for.  The only things
> >>> >>appdynamics' automatic leak detection seems to think are memory
> >>> >>leaks are the various ehcaches.
> >>> >>
> >>> >>We did try dumping the heap and opening it in the Eclipse Memory
> >>> >>Analyzer.  I don't have much experience with it.  Even though the
> >>> >>heap dump was close to 7GB, it only identified about 2.5GB; the rest
> >>> >>were unreferenced objects.  I'm not sure why those objects weren't
> >>> >>GCed.
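
A hedged sketch of taking a heap dump from inside the JVM, in Java, using the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean. One possible explanation for the gap described above is that a full dump includes unreachable objects that simply had not been collected yet; passing live=true asks the JVM to dump only reachable objects. The class name, method names, and file naming scheme are illustrative assumptions, and this only works on HotSpot-based JVMs.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Sketch only: names and file layout are illustrative, not from the thread.
final class HeapDumper {

    // Builds a dump file name; newer JDKs require the ".hprof" suffix.
    static String dumpFileName(String dir, String node, long epochMillis) {
        return dir + "/" + node + "-" + epochMillis + ".hprof";
    }

    // Writes a heap dump to the given file, which must not already exist.
    // With liveOnly=true only objects reachable from GC roots are written,
    // so garbage awaiting collection does not inflate the dump.
    static void dump(String file, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file, liveOnly);
    }
}
```

The application is paused while the dump is written, so on a large heap this is best done on a node that has already been taken out of the cluster.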
> >>> >>
> >>> >>
> >>> >>
> >>> >>-----Original Message-----
> >>> >>From: sakai-dev-bounces at collab.sakaiproject.org
> >>> >>[mailto:sakai-dev-bounces at collab.sakaiproject.org] On Behalf Of John Bush
> >>> >>Sent: Tuesday, October 08, 2013 9:54 AM
> >>> >>To: Mike Jennings
> >>> >>Cc: sakai-dev Developers
> >>> >>Subject: Re: [Building Sakai] Sakai Memory Issues
> >>> >>
> >>> >>we are on the 2.9 tag.   I don't have a smokin' gun yet to really say
> >>> >>if that's the root cause or not.  We are using appdynamics to look at
> >>> >>things in production.  What we are seeing is a single request run
> >>> >>this type of query literally hundreds of times.
> >>> >>
> >>> >>[SELECT special_access_id, forum_id, topic_id, start_date,
> >>> >>hide_until_open, end_date, allow_until_date, override_start_date,
> >>> >>override_hide_until_open, override_end_date,
> >>> >>override_allow_until_date, password, lock_end_date, users
> >>> >>FROM jforum_special_access
> >>> >>WHERE forum_id = '116964' AND topic_id > 0]
> >>> >>
> >>> >>Each one is fast and our database has no issue.  Eventually we see
> >>> >>getting connections take a while and threads get blocked, but we
> >>> >>aren't running into any type of bottleneck, load- or connection-wise,
> >>> >>on the db.  I think the appservers are simply running stupid amounts
> >>> >>of queries, eating up connections, and then blocking and keeping
> >>> >>things in memory.  It's just a hunch right now.  It's the best we've
> >>> >>got so far.
> >>> >>
> >>> >>I wrote some code to cache this stuff yesterday; we are QA'ing today.
> >>> >>Again, I'm not sure it's jforum, but jforum certainly isn't helping.
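
The caching code itself is not shown in this thread. As a rough illustration of the idea only, the Java sketch below memoizes a per-forum lookup so that hundreds of identical reads within a request hit the database once. The class name, the loader function, and the use of ConcurrentHashMap.computeIfAbsent (Java 8 or later) are all assumptions, not the actual fix.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Sketch only: a hypothetical per-forum cache, not the code from the thread.
final class PerForumCache<V> {

    private final ConcurrentMap<Integer, V> byForumId = new ConcurrentHashMap<>();
    private final Function<Integer, V> loader;

    // The loader stands in for the real database read for one forum ID.
    PerForumCache(Function<Integer, V> loader) {
        this.loader = loader;
    }

    // Runs the loader at most once per forum ID until it is invalidated.
    V get(int forumId) {
        return byForumId.computeIfAbsent(forumId, loader);
    }

    // Must be called whenever the forum's special-access rows change.
    void invalidate(int forumId) {
        byForumId.remove(forumId);
    }
}
```

An unbounded map like this can itself become a memory leak, so a real version would need a size limit or expiry, or would be scoped to a single request.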
> >>> >>
> >>> >>On Tue, Oct 8, 2013 at 6:22 AM, Mike Jennings <mike_jennings at unc.edu> wrote:
> >>> >>> John,
> >>> >>>
> >>> >>> We are using JForum and Sakai 2.9.2 and are not seeing any memory
> >>> >>> issues to date.  In which version of JForum do you suspect this
> >>> >>> memory leak?
> >>> >>>
> >>> >>> Mike Jennings
> >>> >>>
> >>> >>>
> >>> >>> On 10/07/2013 05:39 PM, John Bush wrote:
> >>> >>>>
> >>> >>>> We've been having an issue like this with one client, we think it
> >>> >>>> might be related to jforum.  Do you use jforum ?
> >>> >>>>
> >>> >>>> On Mon, Oct 7, 2013 at 1:11 PM, William Karavites
> >>> >>>> <willkara at oit.rutgers.edu> wrote:
> >>> >>>>>
> >>> >>>>> Hello Community,
> >>> >>>>>
> >>> >>>>> I was wondering if there were any outstanding or recently fixed
> >>> >>>>> memory usage issues for Sakai. We recently came across one that
> >>> >>>>> took Sakai down twice in a night for 30 minutes each. We found the
> >>> >>>>> culprit for that one and want to try and be proactive for any
> >>> >>>>> others in the future. We are currently running Sakai 2.9.1 and
> >>> >>>>> Kernel 1.3.1.
> >>> >>>>>
> >>> >>>>>
> >>> >>>>> So if there are any improved memory usage patches in Sakai,
> >>> >>>>> please let me know.
> >>> >>>>>
> >>> >>>>>
> >>> >>>>> Thank you,
> >>> >>>>> William Karavites
> >>> >>>>>
> >>> >>>>> ------------------------------------
> >>> >>>>> William Karavites
> >>> >>>>> Application Programmer
> >>> >>>>> OIT/OIRT- Rutgers University
> >>> >>>>> Office: 732-445-8726
> >>> >>>>> Cell: 732-822-9405
> >>> >>>>> willkara at rutgers.edu
> >>> >>>>> ------------------------------------
> >>> >>>>>
> >>> >>>>>
> >>> >>>>> _______________________________________________
> >>> >>>>> sakai-dev mailing list
> >>> >>>>> sakai-dev at collab.sakaiproject.org
> >>> >>>>> http://collab.sakaiproject.org/mailman/listinfo/sakai-dev
> >>> >>>>>
> >>> >>>>> TO UNSUBSCRIBE: send email to
> >>> >>>>> sakai-dev-unsubscribe at collab.sakaiproject.org
> >>> >>>>> with a subject of "unsubscribe"
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>
> >>> >>> --
> >>> >>> =========================================
> >>> >>> Mike Jennings
> >>> >>> Teaching and Learning Developer
> >>> >>> T: 919.843.5013
> >>> >>> E: mike_jennings at unc.edu
> >>> >>
> >>> >>
> >>> >>
> >>> >>--
> >>> >>John Bush
> >>> >>602-490-0470
> >>> >>
> >>> >>** This message is neither private nor confidential in fact the US
> >>> >>government is storing it in a warehouse located in Utah for future
> data
> >>> >>mining use cases should they arise. **
> >>> >>This message is private and confidential. If you have received it in
> >>> >>error, please notify the sender and remove it from your system.
> >>> >
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
> >
>
>
>
>



-- 
David McIntyre
Library Technology
CourseWork QA Lead
(650) 498-7209