[Building Sakai] Sakai scalability monitoring and testing

Ray Davis ray at media.berkeley.edu
Mon Mar 9 12:14:48 PDT 2009


On 3/9/2009 11:39 AM, Steven Githens wrote:
> Neat. It's cool to see that you've found a good-enough open source 
> library ( I totally sold out the scene and started using YourKit ).  I 
> know you're not actively adding features, but do you know how hard it 
> would be to start and stop JAMon at run time (from a JAMon API 
> perspective)? So you could always have it there and occasionally turn it 
> on for just a bit to test something and then toggle it off.  Then it 
> could always be available on your pre-prod machines without always 
> slowing them down.

Super-easy so far as JAMon itself is concerned. For Sakai purposes, I 
guess we'd probably twiddle the JAMon calls via dynamic Sakai 
properties. The current code doesn't bother with that, though.
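
Roughly, the toggle would look like the sketch below. To be clear, this 
isn't what's checked in: the class, the method names, and the 
"gradebook.recalculate" label are invented, and the dynamic-property 
lookup itself is left out; only the MonitorFactory calls are plain JAMon.

import com.jamonapi.Monitor;
import com.jamonapi.MonitorFactory;

public class MonitoringToggle {

    /** Call this whenever the (hypothetical) dynamic property changes. */
    public static void applyProperty(boolean monitoringWanted) {
        if (monitoringWanted) {
            MonitorFactory.enable();
        } else {
            // While disabled, MonitorFactory.start() hands back cheap
            // no-op monitors, so instrumented call sites can stay put.
            MonitorFactory.disable();
        }
    }

    /** A typical instrumented call site. */
    public static void timedWork(Runnable work) {
        Monitor mon = MonitorFactory.start("gradebook.recalculate");
        try {
            work.run();
        } finally {
            mon.stop();
        }
    }
}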

> I tried to run the DataLoader.groovy (which looks pretty self-contained) 
> in Sash too, but it threw a few errors, so I'll have to track that down a 
> bit more later.

One issue might be that it depends on Groovy 1.6 (which just came out).

> How did you determine what your production load should be like?   I've 
> been sort of meditating on this problem ever since I started working on 
> some Grinder tests and read this blog entry** last month, and realized 
> that ever since I've been working on Sakai, none of the load tests I've 
> ever done were really very well described; they were most likely flawed 
> beyond usefulness and probably mathematically bankrupt. ( Also realizing 
> at the same time that I had never actually used the tools I learned in 
> all my statistics coursework ).

The initial job was a lot easier for us because we had a specific urgent 
problem to focus on: insane timeouts on our largest gradebooks.

Step 1: Get a very rough idea of the number of active users and the 
response times when things seemed at their worst.

Step 2: Do some very low impact analysis of the production site (CPU and 
memory stats, basically).

Step 3: Throw a multi-threaded stress-test against a test server and a 
copy of the production DB to see whether we can get similarly awful 
results. This is where we got lucky (for certain values of "luck"), 
because it turned out not to be hard to get similarly awful results.
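
Stripped of the JWebUnit / HtmlUnit layer, the login handling, and the 
configuration, the idea is just a pile of threads hammering the same 
pages and timing the responses. A bare-bones sketch (not our actual 
stresser code); the URL, thread count, and request count below are 
placeholders:

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CrudeStress {
    public static void main(String[] args) throws Exception {
        final String target = "http://test-server:8080/portal"; // placeholder
        final int threads = 50;
        final int requestsPerThread = 100;

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(new Callable<Void>() {
                public Void call() throws Exception {
                    for (int i = 0; i < requestsPerThread; i++) {
                        long start = System.currentTimeMillis();
                        HttpURLConnection conn = (HttpURLConnection)
                                new URL(target).openConnection();
                        int status = conn.getResponseCode(); // fires the request
                        conn.disconnect();
                        System.out.println(status + " in "
                                + (System.currentTimeMillis() - start) + " ms");
                    }
                    return null;
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}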

Step 4: Turn on all the scalability monitors on the test server, and 
start analyzing / changing stuff / re-running.... The couple of fixes 
we've found and since put into production have actually improved 
production results, so apparently we're not yet *destructively* 
mathematically bankrupt.  :)

Future steps include stuff like selectively turning on more monitoring 
on production so that we can improve the honesty of the stress tests, 
and getting scalability regression tests into a continuous integration 
system so that we can maximize return on investment.
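
Nothing along those lines exists yet, but the shape I have in mind for a 
scalability regression test is a plain JUnit test that drives a known 
scenario against the known test data and fails the build when the JAMon 
numbers creep past a budget. A hypothetical sketch (class name, monitor 
label, and threshold are all invented):

import com.jamonapi.Monitor;
import com.jamonapi.MonitorFactory;
import junit.framework.TestCase;

public class GradebookScalabilityRegressionTest extends TestCase {

    private static final double MAX_AVG_MS = 2000.0; // invented budget

    public void testLargeGradebookStaysUnderBudget() throws Exception {
        MonitorFactory.enable();
        Monitor mon = MonitorFactory.start("regression.largeGradebook");
        try {
            runLargeGradebookScenario(); // stand-in for a real scenario
        } finally {
            mon.stop();
        }
        assertTrue("average ms crept past budget: " + mon.getAvg(),
                mon.getAvg() <= MAX_AVG_MS);
    }

    private void runLargeGradebookScenario() {
        // placeholder: drive the stress-test scenario against the
        // known starting data
    }
}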

Best,
Ray

> Sustainable cheers,
> Steve
> 
> 
> ** It's pretty entertaining, but incredibly vulgar ( fair warning if 
> you're easily offended ).
> http://www.zedshaw.com/essays/programmer_stats.html
> 
> Ray Davis wrote:
>> Lately at UC Berkeley we've been trying to better understand (and 
>> hopefully reduce) our ongoing performance and scalability woes. Some 
>> sakai-dev readers might be interested in the code we've recently 
>> checked into the Subversion repository, since it's already uncovered a 
>> couple of bugs and cleared up some lingering questions. None of the 
>> logic is particularly Berkeley-specific, and if anyone's interested in 
>> adapting it or further generalizing it, the task shouldn't be very 
>> difficult. But we're not at this moment volunteering to support the 
>> world. :)
>>
>> All of scalability is divided into three parts:
>>
>> A. Collecting statistics (from production and QA environments for 
>> diagnosis; from test environments for continuous integration testing).
>> B. Mimicking production activity (so that we can repeat tests and we 
>> aren't completely dependent on production systems for data).
>> C. Mimicking production data (so that we can begin tests from a known 
>> starting point).
>>
>> For statistics, we're currently using a mix of standard Tomcat garbage 
>> collection logs (analyzable by gcviewer), JMX MBean statistics 
>> (periodically written out to the Tomcat log), and the JAMon open 
>> source monitoring library. JAMon's a bit stodgy, but the other options 
>> I tried weren't free, weren't flexible enough, or couldn't deal with 
>> Sakai's bizarre classloader environment. The bulk of the JMX and JAMon 
>> work is checked in at:
>>
>> https://source.sakaiproject.org/svn/msub/berkeley.edu/bspace/jamon/sakai_2-5-x 
>>
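
(An aside on the JMX part, since it's less exotic than it might sound: 
stripped down, the idea is along the lines of the sketch below, except 
that the checked-in code writes to the Tomcat log rather than stdout. 
The choice of beans and the one-minute interval here are just 
illustrative.)

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MBeanStatLogger {
    public static void start() {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                MemoryUsage heap = ManagementFactory
                        .getMemoryMXBean().getHeapMemoryUsage();
                int threads = ManagementFactory
                        .getThreadMXBean().getThreadCount();
                int peak = ManagementFactory
                        .getThreadMXBean().getPeakThreadCount();
                System.out.println("heapUsedMB="
                        + heap.getUsed() / (1024 * 1024)
                        + " heapMaxMB=" + heap.getMax() / (1024 * 1024)
                        + " threads=" + threads + " peakThreads=" + peak);
            }
        }, 0, 60, TimeUnit.SECONDS); // every minute; interval is arbitrary
    }
}
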
>> For mimicking activity (AKA stress testing or load testing), we're 
>> currently using an easily configurable layer on JWebUnit / HtmlUnit. 
>> It doesn't do anything terribly sophisticated, but we don't need to do 
>> anything terribly sophisticated to bring our servers to their knees.
>>
>> For mimicking production data (AKA data loading), we're using a Groovy 
>> script, and I'm mostly running it from the command line via the 
>> test-harness's component manager emulator.
>>
>> (And here let me just inject an unsolicited enthusiastic plug for 
>> Thomas Amsler's fine, fine Sakai Groovy Shell: 
>> <http://confluence.sakaiproject.org/confluence/display/SGS>. Code a 
>> couple lines; run 'em; see what happened; edit 'em; repeat.... It's 
>> scarily like rapid development!)
>>
>> Since cleanly repeatable stress tests need to be matched to known test 
>> data, both of these aspects are currently checked into the same 
>> top-level project:
>>
>> https://source.sakaiproject.org/svn/msub/berkeley.edu/bspace/stress-test/sakai_2-5-x 
>>
>> The stress testers are at:
>>
>> https://source.sakaiproject.org/svn/msub/berkeley.edu/stress-test/sakai_2-5-x/stresser 
>>
>> The data loader is at:
>>
>> https://source.sakaiproject.org/svn/msub/berkeley.edu/stress-test/sakai_2-5-x/dataloader 
>>
>> Best,
>> Ray


