[Building Sakai] Building scalable Sakai tools

Adams, David da1 at vt.edu
Wed Mar 20 02:58:26 PDT 2013


It will be harder to bridge multi-select JOINs across disparate components (e.g. user service to Mneme or Assignments), but I'm going to guess that within each component you can make a lot of improvements. I've not worked with Mneme, but I'm assuming it uses Hibernate. We log all our SQL statements, and in pretty much every Hibernate tool we see the same pattern: each individual object gets loaded separately, one by one.

To take a perhaps-not-exact but realistic example of the pattern we see: to load a quiz for a single user, the tool might load the quiz, get a list of part IDs, and load each part one by one; from the parts get lists of question IDs and load each question with an individual SQL statement; check for attachments to each question, again one statement per question; then load a list of answer IDs and load the answers one by one. After all that, it picks the three or four questions that actually need to be displayed on the current page and displays only those. So for a quiz with 50 questions you might see 200-300 individual SQL queries on each page load, and that's only if nothing is loaded twice, once to count and again to display, as the analysis you attached describes. When the instructor grades the quiz this might all happen again, multiplied by the number of students in the class. We've seen page loads with tens or hundreds of thousands of single-record queries that could have been compressed to two or three well-designed SQL queries.
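The pattern above can be reproduced in miniature. The following is only a sketch: it uses Python's built-in sqlite3 rather than Hibernate, and the part/question tables, the `run` helper, and the 5 x 10 quiz layout are all made up for illustration, not taken from Mneme.

```python
import sqlite3

# Hypothetical schema for illustration only; these are not Mneme's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part     (id INTEGER PRIMARY KEY, quiz_id INTEGER);
    CREATE TABLE question (id INTEGER PRIMARY KEY, part_id INTEGER, text TEXT);
""")
# One quiz (id 1) with 5 parts of 10 questions each.
for p in range(1, 6):
    conn.execute("INSERT INTO part VALUES (?, 1)", (p,))
    for q in range(10):
        conn.execute(
            "INSERT INTO question (part_id, text) VALUES (?, ?)",
            (p, f"Q{p}.{q}"),
        )
conn.commit()

query_count = 0


def run(sql, params=()):
    """Execute one SELECT and count it, standing in for an SQL log."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def load_one_by_one(quiz_id):
    """Fetch lists of IDs, then load every row with its own statement."""
    questions = []
    part_ids = run("SELECT id FROM part WHERE quiz_id = ? ORDER BY id", (quiz_id,))
    for (part_id,) in part_ids:
        run("SELECT id, quiz_id FROM part WHERE id = ?", (part_id,))
        question_ids = run(
            "SELECT id FROM question WHERE part_id = ? ORDER BY id", (part_id,)
        )
        for (question_id,) in question_ids:
            questions += run(
                "SELECT id, text FROM question WHERE id = ?", (question_id,)
            )
    return questions


def load_joined(quiz_id):
    """Fetch the same questions with a single joined statement."""
    return run(
        "SELECT q.id, q.text FROM question q "
        "JOIN part p ON p.id = q.part_id "
        "WHERE p.quiz_id = ? ORDER BY q.id",
        (quiz_id,),
    )


query_count = 0
slow = load_one_by_one(1)
slow_queries = query_count

query_count = 0
fast = load_joined(1)
fast_queries = query_count

print(slow_queries, fast_queries)  # 61 queries versus 1 for the same 50 rows
```

Attachments and answers are left out here; each additional one-by-one level adds roughly another statement per question.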

I think the process to fix this would be to take each common activity in a tool (e.g. taking a quiz, grading a quiz, viewing a grade), run it on a test instance while logging all database activity, and then analyze the relationship between the number of queries and the number of users in the site, the size and complexity of the quiz, and so on. Identify the areas where the number of queries grows much faster than those inputs do, then go in and replace the SQL-blind Java code that's letting Hibernate make all the decisions about SQL with some carefully crafted and properly cached explicit joined statements.
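One way to turn logged query counts into a verdict is to fit a growth exponent on a log-log scale. This is a rough sketch under stated assumptions: the function names, the 0.25 tolerance, and the sample counts are all invented here, and the counts are generated from formulas rather than measured on any Sakai instance.

```python
import math


def growth_exponent(samples):
    """Least-squares slope of log(queries) against log(size).

    A slope near 0 suggests a constant query count, near 1 linear growth,
    near 2 quadratic growth.
    """
    xs = [math.log(size) for size, _ in samples]
    ys = [math.log(queries) for _, queries in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def classify(samples, tolerance=0.25):
    """Label a series of (size, query_count) pairs by its fitted exponent."""
    exponent = growth_exponent(samples)
    if exponent < tolerance:
        return "constant"
    if exponent < 1 + tolerance:
        return "linear"
    return "superlinear"


sizes = [10, 100, 1000]
# Synthetic counts generated from formulas, NOT measurements from a real tool.
activities = {
    "joined query": [(n, 3) for n in sizes],
    "one query per object": [(n, 2 * n + 1) for n in sizes],
    "per object per student": [(n, n * n) for n in sizes],
}

for name, samples in activities.items():
    print(name, classify(samples))
```

In practice the (size, query count) pairs would come from running the same activity against test sites of different sizes and counting statements in the SQL log.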

To avoid this in other Hibernate tools, the only solution is for developers not to let themselves be entranced by Hibernate's ability to hide the database activity from them. The analysis of what queries are being generated needs to be done no matter how the logic was written in the first place. If you've got an O(n^2) relationship between your number of objects and your number of queries, you're doing something wrong.

As for Assignments, what needs to be fixed there is to move the data out of the XML blob and into plain columns in the assignment tables. Until then, performance will be terrible, as most of the things you want to search or sort on are hidden from the database.
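To illustrate why the blob hurts, here is a small sketch comparing the two layouts. The table names, XML attributes, and sample rows are hypothetical and do not reflect the real Assignments schema; sqlite3 stands in for the production database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical tables for illustration; not the real Assignments schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assignment_blob (id TEXT PRIMARY KEY, xml TEXT);
    CREATE TABLE assignment_flat (id TEXT PRIMARY KEY, title TEXT, due_date TEXT);
""")
rows = [
    ("a1", "Essay", "2013-04-02"),
    ("a2", "Lab report", "2013-03-25"),
    ("a3", "Quiz prep", "2013-05-10"),
]
for id_, title, due in rows:
    elem = ET.Element("assignment", {"title": title, "duedate": due})
    conn.execute(
        "INSERT INTO assignment_blob VALUES (?, ?)",
        (id_, ET.tostring(elem, encoding="unicode")),
    )
    conn.execute("INSERT INTO assignment_flat VALUES (?, ?, ?)", (id_, title, due))
conn.commit()


def due_before_from_blob(cutoff):
    """Blob layout: fetch and parse every row, then filter and sort in memory."""
    parsed = []
    for id_, xml in conn.execute("SELECT id, xml FROM assignment_blob"):
        parsed.append((ET.fromstring(xml).get("duedate"), id_))
    return [id_ for due, id_ in sorted(parsed) if due < cutoff]


def due_before_from_columns(cutoff):
    """Plain columns: the database filters and sorts, and could use an index."""
    cursor = conn.execute(
        "SELECT id FROM assignment_flat WHERE due_date < ? ORDER BY due_date",
        (cutoff,),
    )
    return [id_ for (id_,) in cursor]


print(due_before_from_blob("2013-05-01"))
print(due_before_from_columns("2013-05-01"))
```

Both functions return the same IDs, but the blob version has to read and parse the whole table to answer a question about a few rows.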

-dave

________________________________________
From: sakai-dev-bounces at collab.sakaiproject.org [sakai-dev-bounces at collab.sakaiproject.org] On Behalf Of Mark Breuker [mbreuker at loi.nl]
Sent: Wednesday, March 20, 2013 5:11 AM
To: sakai-dev at collab.sakaiproject.org
Cc: Berg, Alan
Subject: [Building Sakai] Building scalable Sakai tools

Hi all,

We are experiencing performance issues with Mneme in a worksite that has around 2500 students. When an instructor wants to open the list of submissions per quiz / assignment the page takes around 1 minute to load :( We asked Edia to investigate the issue for us (see analysis attached) and found some parts in the code that can be improved.

We are also seeing similar performance issues in other tools. Assignments also performs very badly. Assignment 2 is a lot better but still takes around 6 seconds to load a similar page with the same number of users/submissions. I know Alan Berg has also documented slow performance in a number of other tools here: https://confluence.sakaiproject.org/display/WGMOOC/MOOC+Scalabilty

In order to move forward and fix the issue in Mneme (and other tools) I would like to know if there are common design patterns that can (and should) be used when doing things like loading a list of all users in a site combined with the date they submitted an assignment. Arguably the best way would be to perform a SQL JOIN query (that joins the site member info with the submission info) on the database, but that would break the service-oriented design of Sakai.
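As a sketch of the two options, the direct JOIN and a batched lookup that stays behind a service boundary might look like the following. Everything here is an assumption for illustration: the tables, the function names, and the idea of a bulk service method are hypothetical and are not existing Sakai APIs; sqlite3 stands in for the real database.

```python
import sqlite3

# Hypothetical tables for illustration; not Sakai's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site_member (site_id TEXT, user_id TEXT);
    CREATE TABLE submission  (assignment_id TEXT, user_id TEXT, submitted_at TEXT);
""")
conn.executemany(
    "INSERT INTO site_member VALUES (?, ?)",
    [("site1", "alice"), ("site1", "bob"), ("site1", "carol")],
)
conn.executemany(
    "INSERT INTO submission VALUES (?, ?, ?)",
    [("asn1", "alice", "2013-03-18"), ("asn1", "carol", "2013-03-19")],
)
conn.commit()


def members_with_submission_dates(site_id, assignment_id):
    """Option 1: one LEFT JOIN, so members without a submission still appear."""
    return conn.execute(
        """
        SELECT m.user_id, s.submitted_at
        FROM site_member m
        LEFT JOIN submission s
               ON s.user_id = m.user_id AND s.assignment_id = ?
        WHERE m.site_id = ?
        ORDER BY m.user_id
        """,
        (assignment_id, site_id),
    ).fetchall()


def submission_dates_for(user_ids, assignment_id):
    """Option 2: a bulk lookup a submission service could expose.

    One query per call instead of one per user; the caller gets the member
    list from the site service and merges the two results in memory.
    """
    if not user_ids:
        return {}
    placeholders = ",".join("?" * len(user_ids))
    rows = conn.execute(
        "SELECT user_id, submitted_at FROM submission "
        f"WHERE assignment_id = ? AND user_id IN ({placeholders})",
        [assignment_id, *user_ids],
    ).fetchall()
    return dict(rows)


print(members_with_submission_dates("site1", "asn1"))
print(submission_dates_for(["alice", "bob", "carol"], "asn1"))
```

The second option costs two queries instead of one but keeps each table behind its own service. For a site with thousands of members the ID list would likely need to be sent in chunks, since databases limit the number of bound parameters per statement.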

Bottom line: I'm looking for some input to document design patterns for highly scalable Sakai tools. I've started a page on Confluence here: https://confluence.sakaiproject.org/x/owPzB

Cheers,

Mark

Mark Breuker
Product Owner
Tel.: +31 71 5451 203

Leidse Onderwijsinstellingen bv
Leidsedreef 2
2352 BA Leiderdorp
www.loi.nl



