[Building Sakai] An odd behavior when adding participants under a high load

Jean-Francois Leveque jean-francois.leveque at upmc.fr
Mon Oct 8 02:47:36 PDT 2012


Hi guys,

At UPMC, we have seen at least twice with 2.8.1 that groups managed in 
Site Info were out of sync between nodes. This may be a similar cache 
issue, in our case involving multiple nodes.
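
To make the suspected mechanism concrete, here is a toy model of how a per-node cache can keep serving a stale membership list until an entry expires. It is only a sketch: NodeCache, add_participant, FakeClock and the 300-second TTL are hypothetical names and values for illustration, not Sakai's actual cache code or settings.

```python
import time


class FakeClock:
    """Manually advanced clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class NodeCache:
    """Hypothetical per-node cache: user id -> site ids, with a TTL."""

    def __init__(self, store, ttl_seconds, clock=time.monotonic):
        self.store = store          # shared "database": user -> set of sites
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}           # user -> (expires_at, frozenset of sites)

    def sites_for(self, user):
        now = self.clock()
        hit = self.entries.get(user)
        if hit is not None and hit[0] > now:
            return hit[1]           # still fresh: served from cache
        sites = frozenset(self.store.get(user, ()))
        self.entries[user] = (now + self.ttl, sites)
        return sites

    def invalidate(self, user):
        self.entries.pop(user, None)


def add_participant(store, user, site, caches_to_notify):
    """Write the membership, then invalidate only the caches that get the event."""
    store.setdefault(user, set()).add(site)
    for cache in caches_to_notify:
        cache.invalidate(user)


def demo():
    clock = FakeClock()
    store = {}
    node_a = NodeCache(store, ttl_seconds=300, clock=clock)
    node_b = NodeCache(store, ttl_seconds=300, clock=clock)
    node_a.sites_for("student1")    # both nodes cache "no sites yet"
    node_b.sites_for("student1")
    # The update event reaches node A only.
    add_participant(store, "student1", "course-101", [node_a])
    print("node A:", sorted(node_a.sites_for("student1")))
    print("node B:", sorted(node_b.sites_for("student1")))
    clock.advance(301)              # node B's entry expires
    print("node B after expiry:", sorted(node_b.sites_for("student1")))


if __name__ == "__main__":
    demo()
```

In this model the node that misses the invalidation keeps showing the old site list until its entry times out, which would look like a delay of minutes from the user's side.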

How long have you been using your fix for KNL-600 in production, David?

Are there other cache issues that have been fixed in official trunk/2.9 
and used successfully in production with earlier versions?

Cheers,
J-F

On 05/10/2012 16:19, Shoji Kajita wrote:
> Hi Matt,
>
> Our server is also on a single node. I don't see any odd behavior under normal load.
>
> Best regards,
> Shoji
>
> At Fri, 5 Oct 2012 10:13:45 -0400,
> Matthew Jones wrote:
>>
>> Changing group membership fires SECURE_UPDATE_AUTHZ_GROUP, which was caught
>> by the original patch. I can see how load could slow it down a little, but
>> not by minutes. Testing locally works for me, but I'm only on a single node.
>> Is this also a single server? Do you see this under normal load?
>>
>> On Fri, Oct 5, 2012 at 9:42 AM, David Horwitz <david.horwitz at uct.ac.za> wrote:
>>
>>> Hi Shoji,
>>>
>>> What version of the Kernel are you running? This sounds like a caching
>>> issue - perhaps related to KNL-600. I know there has been some work on
>>> fixes in this code recently ...
>>>
>>> D
>>> On 10/05/2012 03:37 PM, Shoji Kajita wrote:
>>>> Hi Sakai Dev,
>>>>
>>>> We have been using our Sakai 2.9.x Pilot Server in real classes
>>>> since October 1st for the 2012 second semester at Kyoto University, but
>>>> we have been struggling with some odd behavior.
>>>>
>>>> During a high load period (18:30-18:55 last evening, as shown in the
>>>> jconsole summary screen shot http://db.tt/y8uNXzWM) produced by almost
>>>> 100 concurrent accesses from two physical classrooms, we saw the
>>>> following odd behavior when two instructors independently tried to add
>>>> students to the same course worksite:
>>>>
>>>>    1. The first instructor added about 50 students by using "Add
>>>>       Participants" in Site Info, but none of the students could see
>>>>       the course worksite on their screens. Refreshing the page in the
>>>>       Web browser or logging in again did not help. After several
>>>>       minutes (maybe seven minutes or so), most of the students could
>>>>       suddenly see the course worksite.
>>>>
>>>>       While the worksite was not showing, I (using the admin account)
>>>>       sudo-ed as a student who should have had access to the course
>>>>       worksite, and I could see the course worksite properly.
>>>>
>>>>    2. After 10 minutes, the second instructor did almost the same thing,
>>>>       and no students could see the worksite for five minutes or so.
>>>>
>>>> I have learned that the second instructor saw the same odd behavior
>>>> again in his class this evening.
>>>>
>>>> I would really appreciate any suggestions or pointers on this odd
>>>> behavior.
>>>>
>>>> Our Sakai 2.9.x Pilot Server is running under the following environment:
>>>>
>>>>     a. RedHat 6.2 on VMWare (16GB main memory, 2 CPUs)
>>>>
>>>>     b. Tomcat 7.0.29 with the following JAVA_OPTS:
>>>>
>>>>        #  12GB in total
>>>>        #    Eden size (3GB) = New size (4GB) - 2 x Survivor space size (512MB)
>>>>        #    Survivor space size (512MB) = New size (4GB) / survivor ratio (8)
>>>>
>>>>        export JAVA_OPTS='-server -Xms12288m -Xmx12288m -XX:PermSize=4096m -XX:MaxPermSize=4096m
>>>>        -XX:NewSize=4096m -XX:MaxNewSize=4096m -Djava.awt.headless=true -Dsun.lang.ClassLoader.allowArraySyntax=true
>>>>        -Dcom.sun.management.jmxremote -Dorg.apache.jasper.compiler.Parser.STRICT_WHITESPACE=false
>>>>        -Dorg.apache.jasper.compiler.Parser.STRICT_QUOTE_ESCAPING=false -Dhttp.agent=Sakai
>>>>        -Duser.language=ja -Duser.region=JP -Dfile.encoding=UTF-8 -XX:+DisableExplicitGC -XX:+PrintGCTimeStamps
>>>>        -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled
>>>>        -XX:+UseParNewGC -verbose:gc -XX:+PrintClassHistogram -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintReferenceGC
>>>>        -Xloggc:$CH/logs/loggc.out'
>>>>
>>>>     c. Sakai 2.9.x (rev 113440) but kernel (rev 113905 with KNL-972.patch) and site-manage (rev 113912)
>>>>
>>>>     d. Oracle 11g R1 on RedHat 6.2/VMWare
>>>>
>>>> Best regards,
>>>> Shoji Kajita
>>>> ---
>>>> Shoji Kajita
>>>> IT Planning Office, IIMC, Kyoto University, Japan
>>>> http://about.me/shojikajita
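
As a side note on the young-generation arithmetic in the JAVA_OPTS comments quoted above, the sizes can be checked with a short sketch. It assumes HotSpot's documented meaning of -XX:SurvivorRatio, namely the ratio of eden to one survivor space, so each survivor space is New size / (ratio + 2) rather than New size / ratio. The function name young_gen_layout is hypothetical, and the ratio of 8 is taken from the quoted comment; no -XX:SurvivorRatio flag appears in the quoted options, so the effective value depends on the JVM version and collector defaults.

```python
def young_gen_layout(new_size_mb, survivor_ratio):
    """Split the young generation into eden and two equal survivor spaces.

    Assumes survivor_ratio = eden / one survivor space, so that
    new_size = eden + 2 * survivor = (survivor_ratio + 2) * survivor.
    """
    survivor = new_size_mb / (survivor_ratio + 2)
    eden = new_size_mb - 2 * survivor
    return eden, survivor


if __name__ == "__main__":
    # Values from the quoted configuration: NewSize=4096m, ratio 8 (from the comment).
    eden, survivor = young_gen_layout(4096, 8)
    print(f"eden ~ {eden:.0f} MB, each survivor space ~ {survivor:.0f} MB")
```

Under that assumption the layout differs slightly from the 3GB / 512MB figures in the comment, which the -XX:+PrintGCDetails output already enabled in the options would confirm or refute.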

