[Building Sakai] Sakai Memory Issues

John Bush jbush at anisakai.com
Wed Oct 9 07:27:08 PDT 2013


Yes, we could take it early on.  We have an alert in place so we can do
that next time, but it hasn't occurred again since we set that up.
Nothing since Oct 4th, and GC has been keeping up OK.

In this particular instance the server was just too far gone; it
couldn't recover even by bleeding off users.  We are testing IU's
courier fix along with my jforum fix and will push that out soon.  I'm
not convinced either of these fixes addresses the root of our problem,
but it's the best we've got until we get more info.

On Wed, Oct 9, 2013 at 6:24 AM, David Haines <dlhaines at umich.edu> wrote:
> John,
>
>    Would your setup allow getting a heap dump early on, before the node gets
> overloaded?  The server would be unavailable during the dump, which may take
> many minutes.  At Michigan we've gotten heap dumps by taking a server out of
> the active cluster after running for a while, and then getting a heap dump
> the next day when people with active sessions on that server have drifted
> away.
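>
> (For reference, on a HotSpot JVM a live heap dump can usually be grabbed
> with something along the lines of
>
>     jmap -dump:live,format=b,file=/tmp/sakai-heap.hprof <tomcat pid>
>
> where the file path and pid are placeholders.  The "live" option forces a
> full GC first, so the node will pause for the duration.)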
>
> - Dave
>
>
> On Tue, Oct 8, 2013 at 12:10 PM, John Bush <jbush at anisakai.com> wrote:
>>
>> OK, this is interesting.  Jeremy, we tried to get a heap dump, but our
>> node was so overloaded at that point that it was impossible.  The patch
>> from IU is for a ConcurrentHashMap filling up.  I would have expected
>> appdynamics to find that.
>>
>> I was going to try using yourkit to look at the heap; I think it can
>> do that, although I haven't used it for that purpose before.  Jeremy,
>> it might be helpful if you could review this patch, dig into your
>> heap a bit, and see if it looks like the same thing.
>>
>>
>> On Tue, Oct 8, 2013 at 8:51 AM, Thomas, Gregory J <gjthomas at iu.edu> wrote:
>> > Indiana recently had a memory leak issue with the delivery objects in
>> > courier and the chat tool.  We have a patch here:
>> > https://jira.sakaiproject.org/browse/SAK-21398  It hasn't been committed
>> > to trunk yet, as some changes to its implementation were requested first.
>> > However, we have the current patch in our production code as of Thursday
>> > last week.  It passed our load testing, and it passed our dreaded Sunday
>> > evening of high server load/usage.
>> >
>> > Not sure if it's related to the issues brought up here, but I thought I'd
>> > throw it out there as something we've definitely run into.
>> >
>> > Thanks,
>> > Greg
>> >
>> >
>> >
>> > On 10/8/13 10:57 AM, "Kusnetz, Jeremy" <JKusnetz at APUS.EDU> wrote:
>> >
>> >>We have been seeing similar memory problems since upgrading to 2.9.3
>> >>also.  The oldgen space just continually grows until it's out of space.
>> >>GC just doesn't seem to clean everything up.  We have tried lots of
>> >>different combinations of GC settings; the best we can do is slow down
>> >>how quickly we run out of memory, so we are forced to restart Sakai
>> >>multiple times a week.
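>> >>
>> >>To give a rough idea of the kind of tuning we mean, a CMS-based setup for
>> >>a JVM of this generation might look something like this (illustrative
>> >>flags only, not our exact production settings; heap sizes and paths are
>> >>placeholders):
>> >>
>> >>    -Xms6g -Xmx6g -XX:MaxPermSize=512m
>> >>    -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> >>    -XX:+CMSClassUnloadingEnabled
>> >>    -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp
>> >>    -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/gc.log
>> >>
>> >>Whatever the collector, the outcome is the same; the choice only changes
>> >>how fast oldgen fills up.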
>> >>
>> >>Interestingly, we are running two instances of Sakai, both running from
>> >>the same code base.  One instance has the memory issue while the other
>> >>does not.
>> >>
>> >>The one that does NOT have the memory issues is primarily using jforums,
>> >>assignment1, and gradebook1.  The one that IS having memory issues is
>> >>primarily using msgcntr, assignment2, and gradebook2.  There is a small
>> >>amount of jforum usage in the one with memory issues, but it's only in a
>> >>couple of sites.
>> >>
>> >>There are a few other differences.  The one with memory issues has 35
>> >>Sakai nodes, while the one without memory issues has only 2 Sakai nodes.
>> >>But on average the one with 2 nodes has more sessions on each of those
>> >>nodes than the one with 35 nodes.  The one without memory issues is using
>> >>LDAP, while the one with memory issues is not using LDAP; all its users
>> >>are local, but we do use the SOAP Axis SakaiPortalLogin for most of its
>> >>session creation.
>> >>
>> >>We are also using appdynamics to try to find the memory leaks, but its
>> >>automatic leak detection only tracks Map and Collection classes.  There
>> >>is another tool to track any custom object, but we would need to know
>> >>which object to look for.  The only things appdynamics' automatic leak
>> >>detection flags as memory leaks are the various ehcaches.
>> >>
>> >>We did try dumping the heap and opening it in the Eclipse Memory
>> >>Analyzer.  I don't have much experience with it.  Even though the heap
>> >>dump was close to 7GB, it only identified about 2.5GB; the rest was
>> >>unreferenced objects.  I'm not sure why those objects weren't GCed.
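>> >>
>> >>(If anyone wants to dig into the same dump: MAT drops unreachable objects
>> >>while parsing by default, which I believe is why most of the 7GB
>> >>disappeared.  Parsing with the keep-unreachable-objects option, e.g.
>> >>
>> >>    ./ParseHeapDump.sh sakai-heap.hprof -keep_unreachable_objects
>> >>
>> >>should keep them visible so you can see what was only awaiting collection.
>> >>The file name is a placeholder, and I'm going from memory on the exact
>> >>flag.)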
>> >>
>> >>
>> >>
>> >>-----Original Message-----
>> >>From: sakai-dev-bounces at collab.sakaiproject.org
>> >>[mailto:sakai-dev-bounces at collab.sakaiproject.org] On Behalf Of John
>> >> Bush
>> >>Sent: Tuesday, October 08, 2013 9:54 AM
>> >>To: Mike Jennings
>> >>Cc: sakai-dev Developers
>> >>Subject: Re: [Building Sakai] Sakai Memory Issues
>> >>
>> >>We are on the 2.9 tag.  I don't have a smoking gun yet to really say
>> >>whether that's the root cause or not.  We are using appdynamics to look
>> >>at things in production.  What we are seeing is a single request running
>> >>this type of query literally hundreds of times:
>> >>
>> >>[SELECT special_access_id, forum_id, topic_id, start_date,
>> >>hide_until_open, end_date, allow_until_date, override_start_date,
>> >>override_hide_until_open, override_end_date, override_allow_until_date,
>> >>password, lock_end_date, users FROM jforum_special_access WHERE forum_id
>> >>= '116964'  AND topic_id > 0]
>> >>
>> >>Each one is fast and our database has no issue.  Eventually we see
>> >>getting connections taking a while and threads getting blocked, but we
>> >>aren't running into any kind of bottleneck, load- or connection-wise, on
>> >>the db.  I think the appservers are simply running stupid amounts of
>> >>queries, eating up connections, and then blocking and keeping things in
>> >>memory.  It's just a hunch right now.  It's the best we've got so far.
>> >>
>> >>I wrote some code to cache this stuff yesterday; we are QA'ing it today.
>> >>Again, I'm not sure it's jforum, but jforum certainly isn't helping.
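>> >>
>> >>For what it's worth, the caching code is roughly the following shape (a
>> >>simplified sketch; the class and method names are made up here, not the
>> >>actual jforum classes):
>> >>
>> >>    import java.util.List;
>> >>    import java.util.Map;
>> >>    import java.util.concurrent.ConcurrentHashMap;
>> >>
>> >>    // Sketch only -- hypothetical names, not the real jforum code.
>> >>    public class SpecialAccessCache<T> {
>> >>
>> >>        /** Loader that runs the real SELECT against jforum_special_access. */
>> >>        public interface Loader<T> {
>> >>            List<T> loadByForumId(int forumId);
>> >>        }
>> >>
>> >>        private final Map<Integer, List<T>> byForum =
>> >>                new ConcurrentHashMap<Integer, List<T>>();
>> >>        private final Loader<T> loader;
>> >>
>> >>        public SpecialAccessCache(Loader<T> loader) {
>> >>            this.loader = loader;
>> >>        }
>> >>
>> >>        /** Return the rows for a forum, hitting the db at most once per forum. */
>> >>        public List<T> getForForum(int forumId) {
>> >>            List<T> rows = byForum.get(forumId);
>> >>            if (rows == null) {
>> >>                rows = loader.loadByForumId(forumId);  // one query, not hundreds
>> >>                byForum.put(forumId, rows);
>> >>            }
>> >>            return rows;
>> >>        }
>> >>
>> >>        /** Drop the cached entry whenever special access for a forum changes. */
>> >>        public void invalidate(int forumId) {
>> >>            byForum.remove(forumId);
>> >>        }
>> >>    }
>> >>
>> >>The same idea could sit behind ehcache or Sakai's MemoryService; a plain
>> >>ConcurrentHashMap keyed by forum id is just the simplest way to collapse
>> >>the repeated queries.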
>> >>
>> >>On Tue, Oct 8, 2013 at 6:22 AM, Mike Jennings <mike_jennings at unc.edu>
>> >>wrote:
>> >>> John,
>> >>>
>> >>> We are using JForum and Sakai 2.9.2 and are not seeing any memory
>> >>> issues to date.  What version of JForum do you suspect this memory
>> >>> leak is in?
>> >>>
>> >>> Mike Jennings
>> >>>
>> >>>
>> >>> On 10/07/2013 05:39 PM, John Bush wrote:
>> >>>>
>> >>>> We've been having an issue like this with one client; we think it
>> >>>> might be related to jforum.  Do you use jforum?
>> >>>>
>> >>>> On Mon, Oct 7, 2013 at 1:11 PM, William Karavites
>> >>>> <willkara at oit.rutgers.edu> wrote:
>> >>>>>
>> >>>>> Hello Community,
>> >>>>>
>> >>>>> I was wondering if there were any outstanding or recently fixed
>> >>>>> memory usage issues for Sakai. We recently came across one that took
>> >>>>> Sakai down twice in one night for 30 minutes each time. We found the
>> >>>>> culprit for that one and want to try to be proactive about any others
>> >>>>> in the future. We are currently running Sakai 2.9.1 and Kernel
>> >>>>> 1.3.1.
>> >>>>>
>> >>>>>
>> >>>>> So if there are any patches out there that improve memory usage in
>> >>>>> Sakai, please let me know.
>> >>>>>
>> >>>>>
>> >>>>> Thank you,
>> >>>>> William Karavites
>> >>>>>
>> >>>>> ------------------------------------
>> >>>>> William Karavites
>> >>>>> Application Programmer
>> >>>>> OIT/OIRT- Rutgers University
>> >>>>> Office: 732-445-8726
>> >>>>> Cell: 732-822-9405
>> >>>>> willkara at rutgers.edu
>> >>>>> ------------------------------------
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>> --
>> >>> =========================================
>> >>> Mike Jennings
>> >>> Teaching and Learning Developer
>> >>> T: 919.843.5013
>> >>> E: mike_jennings at unc.edu
>> >>
>> >>
>> >>
>> >>--
>> >>John Bush
>> >>602-490-0470
>> >>
>> >>
>> >
>>
>>
>>
>> --
>> John Bush
>> 602-490-0470
>>
>
>



-- 
John Bush
602-490-0470

** This message is neither private nor confidential; in fact, the US
government is storing it in a warehouse located in Utah for future
data mining use cases, should they arise. **

