[Building Sakai] quartz scheduler problem

Tue Jan 29 13:52:38 PST 2013

That's a good suggestion.  It does require applying SAK-13776 which never
made it into trunk.  This turns out not to solve our particular problem as
the ScheduledInvocationManagerImpl still tries to update the job even if
the quartz scheduler isn't running.  I'm thinking about patching that too.
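
A guard of that kind might look roughly like the sketch below. It is plain
Java and deliberately not the real Sakai or Quartz code: SharedJobStore,
InMemoryJobStore and the schedulerEnabled flag are hypothetical stand-ins for
the QRTZ tables and the startScheduler setting, used only to show the idea of
skipping registration on nodes where the scheduler is off.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types are Sakai or Quartz classes.
class GuardedRunnerRegistration {

    // Trigger name taken from the log output quoted in this thread.
    public static final String RUNNER =
        "org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl.runner";

    // Assumed stand-in for the shared QRTZ tables.
    public interface SharedJobStore {
        void unschedule(String triggerName);
        void schedule(String triggerName, long intervalMillis);
    }

    // In-memory store, present only so the sketch can run on its own.
    public static class InMemoryJobStore implements SharedJobStore {
        private final Map<String, Long> triggers = new HashMap<>();
        private int writes = 0;

        @Override
        public void unschedule(String triggerName) {
            writes++;
            triggers.remove(triggerName);
        }

        @Override
        public void schedule(String triggerName, long intervalMillis) {
            writes++;
            triggers.put(triggerName, intervalMillis);
        }

        public int writeCount() { return writes; }

        public boolean hasTrigger(String triggerName) {
            return triggers.containsKey(triggerName);
        }
    }

    private final boolean schedulerEnabled; // assumed to mirror the startScheduler setting
    private final SharedJobStore store;

    public GuardedRunnerRegistration(boolean schedulerEnabled, SharedJobStore store) {
        this.schedulerEnabled = schedulerEnabled;
        this.store = store;
    }

    // Returns true if this node touched the store, false if it skipped registration.
    public boolean registerRunner(long intervalMillis) {
        if (!schedulerEnabled) {
            return false; // scheduler is off on this node: leave the shared tables alone
        }
        store.unschedule(RUNNER);
        store.schedule(RUNNER, intervalMillis);
        return true;
    }
}
```

With the flag false, registerRunner returns without any write to the store, so
only the one node that runs the scheduler would unschedule and reschedule the
runner at startup.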

- Dave

On Thu, Jan 24, 2013 at 12:42 PM, Mike De Simone <
michael.desimone at rsmart.com> wrote:

> We usually only run quartz on one server in a cluster.
>
> i.e., set startScheduler@org.sakaiproject.api.app.scheduler.SchedulerManager=false
> on all but one server, so that effectively only one Tomcat is touching
> the QRTZ tables.
>
> If that's not feasible for you to implement, then I suggest staggering
> restarts by at least 2 minutes (ideally 5) in order to serialize this kind
> of activity on the database.
>
> Absent a distributed transaction type of solution, these are the only 2
> things I can think of to help here.
>
>
>
> Thanks,
>
> -------------------------------
> Mike DeSimone
> Application Operations Manager
> rSmart | 602-490-0473
>
>
> On Mon, Jan 14, 2013 at 9:38 AM, Zhen Qian <zqian at umich.edu> wrote:
>
>>  Hi, Dev team:
>>
>> We have found problems with our 2.9 cluster quartz job scheduler
>> deployment:
>>
>> 1. During server startup, ScheduledInvocationManagerImpl init logs the
>> following error:
>>
>> 2013-01-14 05:01:34,645 [localhost-startStop-1] INFO
>>  org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl -
>> init()
>> 2013-01-14 05:01:34,667 [localhost-startStop-1] ERROR
>> org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl -
>> failed to schedule ScheduledInvocationRunner job
>> org.quartz.ObjectAlreadyExistsException: Unable to store Job with name:
>> 'org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl.runner'
>> and group:
>>  'org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl',
>> because one already exists with this identification.
>>         at
>> org.quartz.impl.jdbcjobstore.JobStoreSupport.storeJob(JobStoreSupport.java:1098)
>>         at
>> org.quartz.impl.jdbcjobstore.JobStoreSupport$3.execute(JobStoreSupport.java:1047)
>>         .....
>>
>> On the other hand, the quartz job is indeed defined in the database:
>>
>> select job_name, job_group, description, job_class_name, is_durable,
>> is_volatile, is_stateful, requests_recovery, job_data from qrtz_job_details
>> where
>> job_name='org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl.runner'
>>
>>
>> org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl.runner
>> org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl
>> org.sakaiproject.component.app.scheduler.jobs.ScheduledInvocationRunner 0
>> 0 1 0 (BLOB)
>>
>> But there is no trigger defined for this job:
>>
>> select * from qrtz_triggers where
>> job_name='org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl.runner'
>>
>> returns empty set.
>>
>> 2. The
>> "org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl.runner"
>> job had been running successfully every 10 minutes for a couple of days.
>> It stopped running only after one of our recent app server restarts.
>> Since then, tasks have been queuing up, as can be seen in the
>> scheduler_delayed_invocation table in the database:
>>
>> select count(*) from scheduler_delayed_invocation
>> 1011
>>
>>
>> =======================================================================================
>>
>> So far, I cannot recreate the problems with a single-server deployment.
>>
>> The following function is called from ScheduledInvocationManagerImpl
>> init:
>>
>> protected void registerScheduledInvocationRunner() throws SchedulerException {
>>
>>     // trigger will not start immediately, wait until interval has passed before 1st run
>>     long startTime = System.currentTimeMillis() + getScheduledInvocationRunnerInterval();
>>
>>     JobDetail detail = new JobDetail(
>>         "org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl.runner",
>>         "org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl",
>>         ScheduledInvocationRunner.class);
>>
>>     Trigger trigger = new SimpleTrigger(
>>         "org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl.runner",
>>         "org.sakaiproject.component.app.scheduler.ScheduledInvocationManagerImpl",
>>         new Date(startTime), null, SimpleTrigger.REPEAT_INDEFINITELY,
>>         getScheduledInvocationRunnerInterval());
>>
>>     m_schedulerManager.getScheduler().unscheduleJob(trigger.getName(), trigger.getGroup());
>>
>>     m_schedulerManager.getScheduler().scheduleJob(detail, trigger);
>> }
>>
>> Notice that it calls unscheduleJob first and then adds the job back. Will
>> that be a problem in a cluster environment, when all the servers are
>> restarting at about the same time and each server tries to unschedule and
>> reschedule the same job?
>>
>> Any suggestions? We will create a JIRA ticket for the jobscheduler
>> problems found.
>>
>> Thanks,
>>
>> - Zhen
>>
>> _______________________________________________
>> sakai-dev mailing list
>> sakai-dev at collab.sakaiproject.org
>> http://collab.sakaiproject.org/mailman/listinfo/sakai-dev
>>
>> TO UNSUBSCRIBE: send email to
>> sakai-dev-unsubscribe at collab.sakaiproject.org with a subject of
>> "unsubscribe"
>>
>