[Contrib: Evaluation System] issue with hierarchy initialization in a cluster

Cook, Jonathan jonrcook at indiana.edu
Thu Jul 1 13:10:54 PDT 2010


Sorry for the delayed response.  I've created a Jira.

http://jira.sakaiproject.org/browse/EVALSYS-957

-Jonathan


On Jun 28, 2010, at 8:48 PM, Sean DeMonner wrote:

I didn't see a response to Jim's question. Did this get Jira'd by someone at Indiana?

SMD.




On Jun 10, 2010, at 12:38 PM, Jim Eng wrote:

Hi Jonathan,

Have you created a JIRA ticket to track this issue?

Thanks.

Jim


On Jun 9, 2010, at 8:38 PM, Cook, Jonathan wrote:

Hello everyone,

We deployed EvalSys into production as a pilot on June 3 at Indiana University.  We had an issue with initialization of the hierarchy that I want to share with you.  Here are the specific tag and version of hierarchy and evaluation that we deployed to production.

svn co https://source.sakaiproject.org/contrib/caret/hierarchy/tags/hierarchy-1.2.4/ hierarchy
svn co -r66422 https://source.sakaiproject.org/contrib/evaluation/trunk evaluation

We deployed into a cluster of 9 application servers.  The issue was that 6 of the 9 app servers aborted startup and their JVMs terminated.  I think the more servers you have in your cluster, the more likely this issue is to occur.  Here is what I found to be the cause:

In the ExternalHierarchyLogicImpl, initialization of the hierarchy occurs by creating a root node if one does not exist.  This will occur the first time it is deployed, but not subsequently.  As all 9 app servers were starting, 3 of them reached ExternalHierarchyLogicImpl at the same time, all detecting that no root node existed.  There is not a unique constraint at the db level, so all 3 app servers inserted a root node.  These 3 app servers continued startup and succeeded.  The remaining app servers, upon doing the root node check, found that multiple root nodes exist.  IllegalStateException was thrown and startup was aborted, leaving only 3 available servers in the cluster.
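The sequence described above is a check-then-insert race, and it can be reproduced in isolation. What follows is a minimal sketch, not the actual ExternalHierarchyLogicImpl code: the class name, the in-memory list standing in for the hierarchy table, and the barrier that forces the checks to overlap are all assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

class RootNodeRaceSketch {
    // Hypothetical stand-in for the hierarchy table: nothing prevents duplicate root rows.
    static final List<String> rootNodes = new CopyOnWriteArrayList<>();

    // Each "server" checks for a root node, waits until every other server
    // has also checked, then inserts if its own check found nothing.
    static int startConcurrently(int servers) {
        rootNodes.clear(); // first deployment: the table is empty
        CyclicBarrier allChecked = new CyclicBarrier(servers);
        Thread[] threads = new Thread[servers];
        for (int i = 0; i < servers; i++) {
            final String name = "app" + (i + 1);
            threads[i] = new Thread(() -> {
                boolean rootMissing = rootNodes.isEmpty();      // check
                try {
                    allChecked.await();                         // force the overlap
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
                if (rootMissing) {
                    rootNodes.add("root inserted by " + name);  // insert
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return rootNodes.size();
    }

    // A server that arrives later refuses to start when it finds duplicates.
    static void startLate() {
        if (rootNodes.size() > 1) {
            throw new IllegalStateException(
                "Invalid hierarchy state: more than one root node");
        }
    }

    public static void main(String[] args) {
        System.out.println("root nodes after overlapping startups: " + startConcurrently(3));
        try {
            startLate();
        } catch (IllegalStateException e) {
            System.out.println("late server aborted: " + e.getMessage());
        }
    }
}
```

In a real cluster the overlap depends on timing rather than a barrier, which is why the failure only shows up on some restarts.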

Our emergency solution was simply to delete the extraneous rows in the hierarchy table, leaving just the one required row which represents the root node.  After that, the servers started successfully.

I was able to recreate this in our test environment (9 app servers) by repeatedly restarting the cluster.  On the 3rd restart with zero rows in the hierarchy tables, 2 of the app servers inserted root nodes.  The other app servers failed to start.  I then did exactly what we did in production and deleted the extraneous root nodes.  With 1 root node in place, several restarts have been successful with the entire cluster coming up.

If you are deploying EvalSys for the first time and have a cluster environment, in lieu of some kind of synchronization fix, the preemptive solution would be to just add the required rows to the hierarchy tables in your production database before deployment.
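For comparison, here is one shape a synchronization fix could take: make the insert itself atomic, so that only one server can create the root for a given hierarchyId. This is a sketch under assumptions, not a patch for the hierarchy service. The class and method names are hypothetical, and a ConcurrentHashMap stands in for a database table with a unique constraint.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class UniqueRootSketch {
    // Hypothetical stand-in for a table with a unique constraint:
    // at most one root entry per hierarchyId.
    static final ConcurrentHashMap<String, String> rootByHierarchyId =
        new ConcurrentHashMap<>();

    // Atomic insert-if-absent; returns true only for the server whose insert won.
    static boolean ensureRoot(String hierarchyId, String server) {
        return rootByHierarchyId.putIfAbsent(hierarchyId, "root inserted by " + server) == null;
    }

    // Starts all servers at once and counts how many actually inserted a root.
    static int countInserts(int servers) {
        rootByHierarchyId.clear(); // first deployment: no root yet
        CyclicBarrier startTogether = new CyclicBarrier(servers);
        AtomicInteger inserts = new AtomicInteger();
        Thread[] threads = new Thread[servers];
        for (int i = 0; i < servers; i++) {
            final String name = "app" + (i + 1);
            threads[i] = new Thread(() -> {
                try {
                    startTogether.await(); // all servers reach initialization together
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
                if (ensureRoot("evaluationHierarchyId", name)) {
                    inserts.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return inserts.get();
    }

    public static void main(String[] args) {
        System.out.println("servers that inserted a root: " + countInserts(9));
    }
}
```

Pre-populating the root row, as suggested above, reaches the same end state without a code change: every server finds exactly one root at startup.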

Here is the actual error during startup:
java.lang.IllegalStateException: Invalid hierarchy state: more than one root node for hierarchyId: evaluationHierarchyId

Thanks,
Jonathan Cook
Oncourse Team
Indiana University

_______________________________________________
evaluation mailing list
evaluation at collab.sakaiproject.org
http://collab.sakaiproject.org/mailman/listinfo/evaluation

TO UNSUBSCRIBE: send email to evaluation-unsubscribe at collab.sakaiproject.org with a subject of "unsubscribe"



SMD.


==========================================================
Sean DeMonner, IT Senior Project Manager, CTools Implementation Group
Digital Media Commons @ The Duderstadt Center, U-M      (734) 615-9765







