[Building Sakai] [Deploying Sakai] Sakora-csv incremental upload?

Yi Zhu zhuy at wfu.edu
Thu May 23 09:46:10 PDT 2013


Aaron, 

Thanks a lot for those great suggestions! We were just too focused on making the post work and completely ignored the manual run. My apologies. I think we have enough information to help us move along at this point; we will keep you posted on the progress for sure. 

Thanks again for your time and guidance! 

YI

-----Original Message-----
From: azeckoski at gmail.com [mailto:azeckoski at gmail.com] On Behalf Of Aaron Zeckoski
Sent: Thursday, May 23, 2013 11:52 AM
To: Yi Zhu
Cc: sakai-dev at collab.sakaiproject.org
Subject: Re: [Deploying Sakai] Sakora-csv incremental upload?

On Thu, May 23, 2013 at 11:27 AM, Yi Zhu <zhuy at wfu.edu> wrote:
> 1. You are exactly right. The idea is to remove from Sakai only the things contained in the files. We are thinking about using sakora-csv to load the initial data, then to add/update (with the ignore flags), and to delete (delete mode plus the ignore flags). This diagram may better explain what we are trying to accomplish.
> https://www.lucidchart.com/documents/view/435b-1550-519a592f-ac39-04ec0a00913e

Sounds good. I think that would be a nice feature. It might be good to make sure it is really difficult to run it that way accidentally, though. Perhaps just making sure there is lots of very noisy logging would be good enough (and, of course, default things to work in the "normal" mode).
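
To make the "hard to run by accident" idea concrete, here is a minimal, self-contained Java sketch. It is purely illustrative: the delete-mode flag, the class name, and the log wording are all hypothetical (no such option exists in sakora-csv as of this thread); the point is only that the mode defaults to normal processing and logs very loudly when it is switched on.

    import java.util.logging.Logger;

    /**
     * Hypothetical sketch only: a guard around a proposed "delete mode" for the
     * CSV processing. It defaults to normal (add/update) behaviour and logs
     * loudly whenever delete mode is enabled, so it is hard to run by accident.
     */
    public class DeleteModeGuard {

        private static final Logger LOG = Logger.getLogger(DeleteModeGuard.class.getName());

        /** Delete mode stays off unless the exact value "delete" is requested. */
        public static boolean resolveDeleteMode(String requestedMode) {
            boolean deleteMode = "delete".equalsIgnoreCase(requestedMode);
            if (deleteMode) {
                LOG.warning("*** SAKORA CSV DELETE MODE ENABLED: rows in the uploaded files"
                        + " will be REMOVED from Sakai instead of added/updated ***");
            } else {
                LOG.info("Sakora CSV running in normal (add/update) mode");
            }
            return deleteMode;
        }

        public static void main(String[] args) {
            resolveDeleteMode(null);      // nothing requested: normal mode
            resolveDeleteMode("delete");  // explicit request: very noisy warning
        }
    }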


> 2. This problem is not in the jobOverrides map but is due to the fact that the JobDetail object is reused across posts, so the overrides are retained in the same JobDataMap (though they do get cleared when there is no override in the request). Because of that, the state of the running job could potentially be changed by a new post as well. Are we expected to post new jobs only when no CSV Loader job is running? I can see this getting tricky when we use different parameters between different loads.

In general, trying to post while another job is running is asking for trouble. That said, the settings are only checked at instantiation and are then stored in the service objects. I suppose we could clear the settings back to defaults every time, but the tricky part is that someone could then no longer set the settings and simply start the job with a run command.
Maybe a specific option that forces everything back to defaults first? That way people could add that option when they want to be sure the job only picks up their new settings.
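
A minimal sketch of what such a "force defaults first" option could look like, assuming Quartz is on the classpath. Only the constant name OVERRIDE_IGNORE_MISSING_SESSIONS comes from the patch quoted below; its value, the resetToDefaults helper, and any other key names are placeholders, not real sakora-csv code.

    import org.quartz.JobDataMap;

    /**
     * Hypothetical sketch only: clearing every stored override so that a new
     * post starts from the job's defaults and only picks up its own settings.
     */
    public class OverrideReset {

        // Constant name taken from the quoted patch; the value is a placeholder.
        static final String OVERRIDE_IGNORE_MISSING_SESSIONS = "overrideIgnoreMissingSessions";

        /** Remove the given override keys so the job falls back to its defaults. */
        static void resetToDefaults(JobDataMap jobDataMap, String... overrideKeys) {
            for (String key : overrideKeys) {
                jobDataMap.remove(key);
            }
        }

        public static void main(String[] args) {
            JobDataMap jobDataMap = new JobDataMap(); // stands in for jd.getJobDataMap()
            jobDataMap.put(OVERRIDE_IGNORE_MISSING_SESSIONS, "true"); // left over from an earlier post

            // A post carrying a hypothetical "reset" flag would do this first,
            // then apply its own parameters on top of the defaults.
            resetToDefaults(jobDataMap, OVERRIDE_IGNORE_MISSING_SESSIONS /* , ...other override keys */);

            System.out.println(jobDataMap.containsKey(OVERRIDE_IGNORE_MISSING_SESSIONS)); // false
        }
    }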

-AZ



> -----Original Message-----
> From: azeckoski at gmail.com [mailto:azeckoski at gmail.com] On Behalf Of 
> Aaron Zeckoski
> Sent: Wednesday, May 22, 2013 5:08 PM
> To: Yi Zhu
> Cc: sakai-dev at collab.sakaiproject.org
> Subject: Re: [Deploying Sakai] Sakora-csv incremental upload?
>
>> 1. Our understanding is that sakora-csv is not designed to handle
>> deletion in batch mode, at least not yet. But we believe it would be
>> a great feature, especially to avoid a time-consuming full load just for deletion purposes.
>> After doing some analysis, we think it is possible to do deletion
>> with the ignore flags in place and, of course, some code modifications.
>> Here is our
>
> Is the idea here to basically put the entire processor into a "delete"
> mode so that things are removed instead of being added?
>
> If not, I am not sure how this is different from the usual processing that happens without the flags to disable deletions.
>
> Can you explain your goal here in more detail?
>
>
>> 2. Not sure if this is a bug or just because we were not supplying
>> the parameters in the right way. Say, for example, we run a curl command
>> with [-F "ignoreMissingSessions=true" -F
>> "ignoreMembershipRemovals=true"], then a following command with only the
>> parameter [-F "userRemovalMode=ignore"]. In this scenario,
>> ignoreMissingSessions=true and ignoreMembershipRemovals=true would be
>> carried over to the second request. To prevent this from happening,
>> we think the existing parameters should be cleared before checking the new ones, so we modified CsvUploadServlet.java (lines 235-244) as follows:
>>     // remove any override
>>     jd.getJobDataMap().remove(OVERRIDE_IGNORE_MISSING_SESSIONS);
>>     ..........
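
For anyone following along, a standalone illustration of the carry-over behaviour the patch above addresses, assuming Quartz is on the classpath; the key's value here is a placeholder, and the JobDataMap simply stands in for the one held by the reused JobDetail.

    import org.quartz.JobDataMap;

    /**
     * Illustration only: because the same JobDetail (and therefore the same
     * JobDataMap) is reused across posts, an override set by one request is
     * still present when the next request arrives unless it is removed first.
     */
    public class OverrideCarryoverDemo {

        // Constant name taken from the patch above; the value is a placeholder.
        static final String OVERRIDE_IGNORE_MISSING_SESSIONS = "overrideIgnoreMissingSessions";

        public static void main(String[] args) {
            JobDataMap reusedMap = new JobDataMap(); // stands in for jd.getJobDataMap()

            // First post: -F "ignoreMissingSessions=true"
            reusedMap.put(OVERRIDE_IGNORE_MISSING_SESSIONS, "true");

            // Second post supplies only userRemovalMode and never touches this key...
            System.out.println(reusedMap.getString(OVERRIDE_IGNORE_MISSING_SESSIONS)); // still "true"

            // ...which is why the patch removes the override keys up front.
            reusedMap.remove(OVERRIDE_IGNORE_MISSING_SESSIONS);
            System.out.println(reusedMap.containsKey(OVERRIDE_IGNORE_MISSING_SESSIONS)); // false
        }
    }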
>
>
> Are you saying the state of those values is maintained across requests? Each post should reset the state so that should not be happening.
>
> Those values come from the jobOverrides map which is reset on each request so the state should be cleared each time a post is sent.
>
> If you are running the originally posted job over and over, then the state will be maintained for all of those future runs. I don't think that is a bug. If the state is being maintained across posts, then something is very strange.
>
> -AZ
>
>
> --
> Aaron Zeckoski - Software Architect - http://tinyurl.com/azprofile
>



--
Aaron Zeckoski - Software Architect - http://tinyurl.com/azprofile


