[Contrib: Evaluation System] issue with hierarchy initialization in a cluster

Jim Eng jimeng at umich.edu
Thu Jun 10 09:38:11 PDT 2010


Hi Jonathan,

Have you created a JIRA ticket to track this issue?

Thanks.

Jim


On Jun 9, 2010, at 8:38 PM, Cook, Jonathan wrote:

> Hello everyone,
> 
> We deployed EvalSys into production as a pilot on June 3 at Indiana University.  We had an issue with initialization of the hierarchy that I want to share with you.  Here are the specific tag and version of hierarchy and evaluation that we deployed to production.  
> 
> svn co https://source.sakaiproject.org/contrib/caret/hierarchy/tags/hierarchy-1.2.4/ hierarchy
> svn co -r66422 https://source.sakaiproject.org/contrib/evaluation/trunk evaluation
> 
> We deployed into a cluster of 9 application servers.  The issue was that 6 of the 9 app servers aborted startup and their JVMs terminated.  I think the more servers you have in your cluster, the more likely this issue is to occur.  Here is what I found to be the cause:
> 
> In ExternalHierarchyLogicImpl, the hierarchy is initialized by creating a root node if one does not exist.  This happens the first time it is deployed, but not subsequently.  As all 9 app servers were starting, 3 of them reached ExternalHierarchyLogicImpl at the same time, and each detected that no root node existed.  There is no unique constraint at the DB level, so all 3 app servers inserted a root node.  These 3 app servers continued startup and succeeded.  The remaining 6 app servers, upon doing the root node check, found that multiple root nodes existed.  An IllegalStateException was thrown and startup was aborted, leaving only 3 available servers in the cluster.
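The check-then-insert race described above can be reproduced in isolation. The following is a minimal sketch, not the actual ExternalHierarchyLogicImpl code: the class and method names are hypothetical, in-memory collections stand in for the hierarchy table, threads stand in for app servers, and a CyclicBarrier is used only to force the unlucky interleaving in which every server finishes its check before any server inserts. The second method assumes a store that rejects duplicates atomically, which is roughly what a unique constraint at the DB level would provide.

```java
import java.util.Collection;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

public class RootNodeRaceSketch {

    /** Check-then-insert against a table with no unique constraint (hypothetical stand-in). */
    static int naiveCheckThenInsert(int servers, String hierarchyId) {
        Collection<String> table = new CopyOnWriteArrayList<>(); // duplicates allowed
        CyclicBarrier allChecked = new CyclicBarrier(servers);
        runOnAllServers(servers, () -> {
            boolean rootMissing = !table.contains(hierarchyId); // the "does a root exist?" check
            await(allChecked);                                  // every server has checked by now
            if (rootMissing) {
                table.add(hierarchyId);                         // every server inserts a root
            }
        });
        return table.size();
    }

    /** Same startup, but the store itself refuses a second root (assumed unique constraint). */
    static int atomicInsert(int servers, String hierarchyId) {
        Collection<String> table = ConcurrentHashMap.newKeySet(); // set semantics: one row per id
        CyclicBarrier allReady = new CyclicBarrier(servers);
        runOnAllServers(servers, () -> {
            await(allReady);
            table.add(hierarchyId); // atomic; only the first add has any effect
        });
        return table.size();
    }

    /** Mirrors the startup check that produced the error quoted in this message. */
    static void checkSingleRoot(int rootCount, String hierarchyId) {
        if (rootCount > 1) {
            throw new IllegalStateException(
                "Invalid hierarchy state: more than one root node for hierarchyId: " + hierarchyId);
        }
    }

    private static void runOnAllServers(int servers, Runnable startup) {
        Thread[] threads = new Thread[servers];
        for (int i = 0; i < servers; i++) {
            threads[i] = new Thread(startup);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    private static void await(CyclicBarrier barrier) {
        try {
            barrier.await();
        } catch (Exception e) { // InterruptedException or BrokenBarrierException
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("naive roots:  " + naiveCheckThenInsert(3, "evaluationHierarchyId"));
        System.out.println("atomic roots: " + atomicInsert(3, "evaluationHierarchyId"));
    }
}
```

With 3 simulated servers the naive path ends with 3 root rows, matching what happened in production, while the atomic path ends with 1. In a real deployment the barrier would not exist; the same interleaving simply becomes more likely as more servers start at once.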
> 
> Our emergency solution was simply to delete the extraneous rows in the hierarchy table, leaving just the one required row that represents the root node.  After that, the servers started successfully.
> 
> I was able to recreate this in our test environment (9 app servers) by repeatedly restarting the cluster.  On the 3rd restart with zero rows in the hierarchy tables, 2 of the app servers inserted root nodes.  The other app servers failed to start.  I then did exactly what we did in production: I deleted the extraneous root nodes.  With 1 root node in place, several restarts have been successful, with the entire cluster coming up.
> 
> If you are deploying EvalSys for the first time in a clustered environment, and until there is some kind of synchronization fix, the preemptive workaround is to add the required rows to the hierarchy tables in your production database before deployment.
> 
> Here is the actual error during startup:
> java.lang.IllegalStateException: Invalid hierarchy state: more than one root node for hierarchyId: evaluationHierarchyId
> 
> Thanks,
> Jonathan Cook
> Oncourse Team
> Indiana University
> 
