[cle-release-team] Fwd: Cannot start tomcat in QA1

Karen Tsao ktsao at stanford.edu
Thu Nov 1 11:37:45 PDT 2012


Hi Rey,

Here are the instructions from Julian.

Thanks,
Karen

---------- Forwarded message ----------
From: Julian M. Morley <jmorley at stanford.edu>
Date: Tue, Oct 30, 2012 at 11:44 AM
Subject: Re: Cannot start tomcat in QA1
To: Karen Tsao <ktsao at stanford.edu>
Cc: Lydia Li <lydial at stanford.edu>


OK, -qa2 is now running the Stan2.6.x_2012_Sprint_20_Q2 tag.

Here's how I did it:

Went to https://cw-jenkins.stanford.edu

Ran 'Lumberjack' to cut a tag from the current QA branch. I named the
tag Stan2.6.x_2012_Sprint_20_Q2, and picked the
Stan2.6.x_2012_Sprint_20 branch for the source. I left the revision at
HEAD.

Once Lumberjack had completed I ran cw5_26_qa to make a build using
that tag. Since this build was going onto qa2 I picked 'qa2' for the
CONFIG_TAG.
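
For anyone who would rather script this step than click through the UI, here is a rough sketch of building the trigger URL for a parameterized job, using Jenkins' standard buildWithParameters endpoint. The job name and CONFIG_TAG come from the steps above; SOURCE_TAG is a hypothetical parameter name (the message does not say what the job calls its tag parameter), and whether this Jenkins instance allows remote triggering at all is an assumption.

```python
from urllib.parse import quote, urlencode

JENKINS_URL = "https://cw-jenkins.stanford.edu"  # from the message above


def build_trigger_url(job, params, base=JENKINS_URL):
    """Return the buildWithParameters URL for a parameterized Jenkins job."""
    query = urlencode(params)
    return f"{base.rstrip('/')}/job/{quote(job, safe='')}/buildWithParameters?{query}"


if __name__ == "__main__":
    # CONFIG_TAG is named in the message; SOURCE_TAG is a hypothetical
    # parameter name standing in for however the job selects the tag.
    print(build_trigger_url(
        "cw5_26_qa",
        {"CONFIG_TAG": "qa2", "SOURCE_TAG": "Stan2.6.x_2012_Sprint_20_Q2"},
    ))
```

The URL would still need to be sent as an authenticated POST; that part depends on how the server is secured and is left out here.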

Normally the cw5_26_qa job would have moved the build to
/afs/ir/dev/sakai/builds and created the NEXT_qa2 symlink, but our afs
dir was out of space so I had to do that manually.
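
The manual step might look roughly like the sketch below: move the finished build under the builds directory, then repoint the NEXT_qa2 symlink at it. The function name and the relative-link layout are assumptions for illustration, not what the cw5_26_qa job actually does, and the rename-over-symlink trick is only known to be atomic on a local POSIX filesystem (AFS behaviour not checked).

```python
import os
import shutil


def stage_build(build_dir, builds_root, link_name="NEXT_qa2"):
    """Move build_dir under builds_root and repoint link_name at it.

    Hypothetical helper: the real job's layout and naming may differ.
    """
    name = os.path.basename(os.path.normpath(build_dir))
    dest = os.path.join(builds_root, name)
    shutil.move(build_dir, dest)

    link = os.path.join(builds_root, link_name)
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)  # clear a leftover from an interrupted run
    os.symlink(name, tmp)  # relative target, resolved inside builds_root
    os.replace(tmp, link)  # rename over the old link in one step
    return dest
```

Building the new link under a temporary name and renaming it over the old one means a reader never sees a missing NEXT_qa2, which matters if a deploy happens to run at the same moment.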

Then I went to -qa2 and ran 'cwcntrl deploy branch=qa2', followed by
'cwcntrl tomcat start'.  It worked. :-)
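
If it helps, the last two commands can be wrapped so that tomcat is only started when the deploy succeeds. This is a minimal sketch: the two command lines are exactly the ones quoted above, but the wrapper itself is hypothetical, and it assumes cwcntrl exits non-zero on failure, which has not been verified.

```python
import subprocess


def deploy_and_start(branch, run=subprocess.run):
    """Run the two cwcntrl commands in order; stop if the deploy fails."""
    commands = [
        ["cwcntrl", "deploy", f"branch={branch}"],
        ["cwcntrl", "tomcat", "start"],
    ]
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so tomcat is never started after a failed deploy.
        run(cmd, check=True)
    return commands
```

On the QA host this would be called as deploy_and_start("qa2"). The run argument exists only so the sequence can be exercised without cwcntrl installed.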


On Tue, Oct 30, 2012 at 10:17 AM, Julian M. Morley <jmorley at stanford.edu>
wrote:
> aaaaaaagh.
>
> ...yes. ;) Caught it just in time.
>
> On Tue, Oct 30, 2012 at 10:16 AM, Karen Tsao <ktsao at stanford.edu> wrote:
>> Hi Julian,
>>
>> Can you deploy to coursework-qa2 only? Rey wants to verify his fix and I
>> will tell him to use qa1 for now.
>>
>> Thanks,
>> Karen
>>
>>
>> On Tue, Oct 30, 2012 at 9:23 AM, Lydia Li <lydial at stanford.edu> wrote:
>>>
>>>
>>>
>>> ----- Original Message -----
>>> | From: "Julian M. Morley" <jmorley at stanford.edu>
>>> | To: "Lydia Li" <lydial at stanford.edu>
>>> | Cc: "Karen Tsao" <ktsao at stanford.edu>
>>> | Sent: Tuesday, October 30, 2012 9:16:46 AM
>>> | Subject: Re: Cannot start tomcat in QA1
>>> |
>>> | On Tue, Oct 30, 2012 at 9:12 AM, Lydia Li <lydial at stanford.edu>
>>> | wrote:
>>> | >
>>> | >
>>> | > ----- Original Message -----
>>> | > | From: "Julian M. Morley" <jmorley at stanford.edu>
>>> | > | To: "Lydia Li" <lydial at stanford.edu>
>>> | > | Cc: "Karen Tsao" <ktsao at stanford.edu>
>>> | > | Sent: Tuesday, October 30, 2012 8:18:23 AM
>>> | > | Subject: Re: Cannot start tomcat in QA1
>>> | > |
>>> | > | On Tue, Oct 30, 2012 at 12:04 AM, Lydia Li <lydial at stanford.edu>
>>> | > | wrote:
>>> | > | > Karen/Julian,
>>> | > | >
>>> | > | >   I saw this error in catalina.out :
>>> | > | >
>>> | > | > ERROR: Application exception overridden by rollback exception
>>> | > | > (2012-10-29 21:46:02,664
>>> | > | > main_org.springframework.transaction.interceptor.TransactionInterceptor)
>>> | > | > org.springframework.jdbc.UncategorizedSQLException: Hibernate
>>> | > | > operation: could not execute query; uncategorized SQLException
>>> | > | > for
>>> | > | > SQL [select this_.id as id104_0_, this_.channel as
>>> | > | > channel104_0_,
>>> | > | > this_.ref as ref104_0_, this_.status as status104_0_,
>>> | > | > this_.creator_id as creator5_104_0_, this_.actor_id as
>>> | > | > actor6_104_0_, this_.fail_count as fail7_104_0_,
>>> | > | > this_.dateCreated
>>> | > | > as dateCrea8_104_0_, this_.dateLastAcq as dateLast9_104_0_,
>>> | > | > this_.dateLastSync as dateLas10_104_0_, this_.dateLastFail as
>>> | > | > dateLas11_104_0_, this_.dateStartWait as dateSta12_104_0_,
>>> | > | > this_.notes as notes104_0_ from stan_events this_]; SQL state
>>> | > | > [null]; error code [17011]; Exhausted Resultset; nested
>>> | > | > exception
>>> | > | > is java.sql.SQLException: Exhausted Resultset
>>> | > | >
>>> | > | >
>>> | > | >
>>> | > | >   I thought it might be caused by the large # of records in
>>> | > | >   stan_events so I deleted the records in that table, and I was
>>> | > | >   able to restart tomcat successfully after that (qa1 has
>>> | > | >   Karen's
>>> | > | >   new sprint20 build.  qa2 still has sprint19 build).
>>> | > | >
>>> | > | > COUNT(*)
>>> | > | > --------
>>> | > | >   425313
>>> | > | >
>>> | > | >
>>> | > | > 425,313 rows deleted.
>>> | > | > COUNT(*)
>>> | > | > --------
>>> | > | >        0
>>> | > | >
>>> | > | >
>>> | > | >
>>> | > | >  I'm not sure if that's the right fix though.  We used to not
>>> | > | >  archive that table and I remember tomcat would take a long
>>> | > | >  time
>>> | > | >  to start up but it always started up eventually.   I checked
>>> | > | >  in
>>> | > | >  prod db, and it has more than 500k rows. Maybe the db settings
>>> | > | >  are different in prod?
>>> | > | >
>>> | > |
>>> | > | 500K is nothing for stan_events (last month we had 60 million
>>> | > | rows in
>>> | > | production and tomcat still started); whilst I'm glad that
>>> | > | deleting
>>> | > | all the rows fixed the issue, it's not the fix-fix. It looks like
>>> | > | there's some new code that has an issue with large resultsets?
>>> | >
>>> | >
>>> | > Just to clarify, this is stan_events table, not sakai_events table.
>>> | >
>>> | > Anyways, I also don't think it's the cause of tomcat not starting
>>> | > up because we used to have a lot more rows in stan_events too.  I
>>> | > thought maybe the c3p0 settings are different in QA. We've not
>>> | > changed code around stan_events since 2008.
>>> | >
>>> |
>>> | Yeah, as far as I know there's no difference between prod and QA
>>> | there
>>> | - at least, there shouldn't be!
>>> |
>>> | >
>>> | > |
>>> | > | >   Anyways, on a separate topic, I remember Julian wanted to
>>> | > | >   test
>>> | > | >   out his new QA process on Jenkins, right?  Should we give
>>> | > | >   that
>>> | > | >   a try?
>>> | > |
>>> | > | I'd love to! :-) Is there a particular revision I should shoot
>>> | > | for?
>>> | >
>>> | >
>>> | > The latest tag is Stan2.6.x_2012_Sprint_20_Q1
>>> | >
>>> |
>>> | I'll see if I can build something with that, then. And I'll document
>>> | it. :)
>>> |
>>>
>>>
>>> The release branch is Stan2.6/branches/Stan2.6.x_2012_Sprint_20/
>>>
>>> So, if you also want to try cutting tags in Jenkins (I forgot how you set
>>> up Jenkins, does it include cutting the tag, or can it work with an already
>>> cut tag?), you can try cutting Stan2.6.x_2012_Sprint_20_Q2, and we can
>>> verify that Q1 and Q2 are identical.
>>>
>>>
>>> | >
>>> | >
>>> | > |
>>> | > | >   BTW, Julian, thanks for offering to give the developers a
>>> | > | >   talk on
>>> | > | >   Jenkins.  I will schedule something after Hugh arrives.
>>> | > |
>>> | > | OK!
>>> | > |
>>> | > | >
>>> | > | >
>>> | > | > thanks,
>>> | > | > Lydia
>>> | > | >
>>> | > | > ----- Original Message -----
>>> | > | > | From: "Karen Tsao" <ktsao at stanford.edu>
>>> | > | > | To: "Julian M. Morley" <jmorley at stanford.edu>
>>> | > | > | Cc: "Lydia Li" <lydial at stanford.edu>
>>> | > | > | Sent: Monday, October 29, 2012 10:07:13 PM
>>> | > | > | Subject: Cannot start tomcat in QA1
>>> | > | > |
>>> | > | > | Hi Julian,
>>> | > | > |
>>> | > | > | After I deployed a new build to coursework-qa1, I got some db
>>> | > | > | error
>>> | > | > | and
>>> | > | > | tomcat didn't get started. Because I thought this might be a
>>> | > | > | build
>>> | > | > | error, I switched back to the previous build, but same error
>>> | > | > | still
>>> | > | > | occurred. Here is part of the error:
>>> | > | > |
>>> | > | > | INFO: A checked-out resource is overdue, and will be destroyed:
>>> | > | > | com.mchange.v2.c3p0.impl.NewPooledConnection@e2942da
>>> | > | > | (2012-10-29 21:18:54,373
>>> | > | > | Timer-0_com.mchange.v2.resourcepool.BasicResourcePool)
>>> | > | > | INFO: Logging the stack trace by which the overdue resource was
>>> | > | > | checked-out. (2012-10-29 21:18:54,375
>>> | > | > | Timer-0_com.mchange.v2.resourcepool.BasicResourcePool)
>>> | > | > | java.lang.Exception: DEBUG ONLY: Overdue resource check-out stack trace.
>>> | > | > | at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:506)
>>> | > | > | at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:525)
>>> | > | > | at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:128)
>>> | > | > | at ...
>>> | > | > |
>>> | > | > | You can go to qa1 to see the full stack trace as there are a lot
>>> | > | > | more. Do you have any idea what might have happened?
>>> | > | > |
>>> | > | > | By the way, because of this error, I left qa2 alone as it still
>>> | > | > | works fine.
>>> | > | > |
>>> | > | > | Thanks,
>>> | > | > | Karen
>>> | > | > |
>>> | > | > |
>>> | > | > |
>>> | > |
>>> | > |
>>> | > |
>>> | > | --
>>> | > | Julian M. Morley
>>> | > | CourseWork Systems Administrator
>>> | > | Stanford University
>>> | > |
>>> |
>>> |
>>> |
>>> | --
>>> | Julian M. Morley
>>> | CourseWork Systems Administrator
>>> | Stanford University
>>> |
>>
>>
>
>
>
> --
> Julian M. Morley
> CourseWork Systems Administrator
> Stanford University



--
Julian M. Morley
CourseWork Systems Administrator
Stanford University