[sakai2-tcc] Site caching

csev csev at umich.edu
Thu Feb 17 19:57:57 PST 2011


If the proposed set of patches has a week of production experience at UM (it would be nice to have other sites test it in production), I think the answer is that we *should* include it in 2.8.  Without production testing, the risk of including it is greater than the risk of leaving it out, so the more production testing we get, the better we can feel.

Even after these patches go in, we need to be quite vigilant over the next months for anything weird happening in production.  So even if sites don't run it this week, they should get it in as quickly as possible so that a newly exposed problem can be fixed in 2.8.1 or 2.8.2.  Regular, random crashing is more embarrassing than a memory leak.

Let's move the conversation about the fix to a new thread.

/Chuck

On Feb 17, 2011, at 3:24 PM, David Haines wrote:

> UMichigan is running a lightly patched 2.7.1, so these comments are about that version.  I've not looked to see what other versions are affected, but I assume that this applies to 2.6, 2.8 and trunk as well.  Some of these are fixed in trunk already.
> 
> The immediate Jira issues are:
> 
> KNL-293 - Can't configure ehcache from Sakai.
> KNL-652 - Site caching doesn't clean up its secondary page / tool / group caches when a site is removed from cache.
> KNL-654 - Time service uses a HashMap for caching.
> KNL-664 - LocalTZFormat caching broken in basic time server.
> 
> All but the last have patches available, if not fully tested.  A patch for the last is under construction.
> 
> I agree with Chuck that caching should be revisited as a whole, but I don't think it is possible to wait until some indefinite future time.
> 
> Aaron has a good point about replication being much harder to get right.  I'm also not clear that the case has been made that it would make much practical difference in Sakai.
> 
> - Dave
> 
> David Haines
> CTools Developer
> Digital Media Commons
> University of Michigan 
> dlhaines at umich.edu
> 
> 
> 
> On Feb 17, 2011, at 7:02 AM, Steve Swinsburg wrote:
> 
>> Can we get more specific about the issues here, and then we can flesh out how the caching could be reimplemented? I don't know much about the Sakai caching, but I do know about caching in general, and implementing a high-level Site cache that has built-in invalidation based on certain events shouldn't be too difficult.
>> 
>> I assume replication is a major issue here too?
>> 
>> regards,
>> Steve
>> 
>> 
>> On 17/02/2011, at 10:48 PM, David Haines wrote:
>> 
>>> Fragile is right, we just found a third serious caching bug in 2.7.  But waiting isn't much of an option.  Anyone running the current code with many users is going to run into problems that they will only find in production.  Personally I don't think 2.8 could be released in good conscience with these bugs and people should not run 2.7 until at least the obvious problems are fixed.
>>> 
>>> - Dave
>>> 
>>> David Haines
>>> CTools Developer
>>> Digital Media Commons
>>> University of Michigan 
>>> dlhaines at umich.edu
>>> 
>>> 
>>> 
>>> 
>>> On Feb 17, 2011, at 2:36 AM, csev wrote:
>>> 
>>>> This code is extremely fragile.  My recommendation is a complete rewrite - I even wrote myself a JIRA many years ago to that effect:
>>>> 
>>>> https://jira.sakaiproject.org/browse/KNL-89
>>>> 
>>>> If you recall, over the years, every time someone touches this code, it breaks badly - it would seem as though SAK-11440 is similar to all other attempts to tune Site Caching.
>>>> 
>>>> My plan in KNL-89 was to first remove *all* caching from Site and then test it thoroughly - and then make a nice, clean layer that should cache it nicely, touching the code at exactly one cut point and making very sure that invalidation was 100% perfect - and now of course - memory cleanup is perfect as well.
>>>> 
>>>> As more and more tools use properties for their primary storage - properties will become an increasing problem if not dealt with properly.
>>>> 
>>>> I fear that this is not something easily slid into 2.8 as a "quick fix" - the last thing we need is 2.8 shipping completely broken because of an unknown problem.     I would suggest that this needs a month of a person 100% dedicated to the problem - and we neither have the month nor the person.   So I also suggest that we *do not touch it at this time* and make this part of 2.9.
>>>> 
>>>> /Chuck
>>>> 
>>>> On Feb 16, 2011, at 8:10 PM, Matthew Jones wrote:
>>>> 
>>>>> This has probably been a problem for Sakai since ehcache was implemented in 2.5 (SAK-11440). This is not something someone would notice unless they had a high usage site, didn't restart often AND were looking at the heap dumps in a memory analyzer.
>>>>> 
>>>>> On Wed, Feb 16, 2011 at 2:04 PM, David Haines <dlhaines at umich.edu> wrote:
>>>>> It is not new to 2.8, we found it in 2.7.1.
>>>>> 
>>>>> - Dave
>>>>> 
>>>>> David Haines
>>>>> CTools Developer
>>>>> Digital Media Commons
>>>>> University of Michigan 
>>>>> dlhaines at umich.edu
>>>>> 
>>>> 
>>>> _______________________________________________
>>>> sakai2-tcc mailing list
>>>> sakai2-tcc at collab.sakaiproject.org
>>>> http://collab.sakaiproject.org/mailman/listinfo/sakai2-tcc
>>> 
>> 
> 
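
For reference, the KNL-652 problem quoted above (secondary page / tool / group caches not being cleaned up when a site leaves the cache) can be illustrated with a minimal Java sketch. This is not the Sakai kernel code and not any of the patches under discussion. The names SiteCacheSketch, CachedSite, findByPage and secondaryEntryCount are hypothetical, and the sketch assumes that every page, tool and group id belongs to exactly one site. The idea is that every path that removes a site, whether explicit invalidation or LRU eviction, goes through a single cleanup method.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; none of these names come from the Sakai kernel. */
class SiteCacheSketch {

    /** Minimal stand-in for a cached site plus the ids of its pages, tools and groups. */
    static final class CachedSite {
        final String id;
        final List<String> pageIds;
        final List<String> toolIds;
        final List<String> groupIds;

        CachedSite(String id, List<String> pageIds, List<String> toolIds, List<String> groupIds) {
            this.id = id;
            this.pageIds = pageIds;
            this.toolIds = toolIds;
            this.groupIds = groupIds;
        }
    }

    private final int maxSites;

    // Secondary indexes: child id -> id of the owning site.
    private final Map<String, String> pageToSite = new HashMap<>();
    private final Map<String, String> toolToSite = new HashMap<>();
    private final Map<String, String> groupToSite = new HashMap<>();

    // Primary cache, kept in access order so the eldest entry is the least recently used.
    private final LinkedHashMap<String, CachedSite> sites;

    SiteCacheSketch(int maxSites) {
        this.maxSites = maxSites;
        this.sites = new LinkedHashMap<String, CachedSite>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, CachedSite> eldest) {
                if (size() > SiteCacheSketch.this.maxSites) {
                    // Eviction must clean the secondary indexes too, or they grow without bound.
                    dropSecondary(eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    synchronized void put(CachedSite site) {
        // Replacing a site: forget the old child ids before indexing the new ones.
        CachedSite previous = sites.remove(site.id);
        if (previous != null) {
            dropSecondary(previous);
        }
        for (String pageId : site.pageIds) pageToSite.put(pageId, site.id);
        for (String toolId : site.toolIds) toolToSite.put(toolId, site.id);
        for (String groupId : site.groupIds) groupToSite.put(groupId, site.id);
        sites.put(site.id, site); // may evict the least recently used site
    }

    /** Explicit invalidation, e.g. after a site is edited or deleted. */
    synchronized void invalidate(String siteId) {
        CachedSite removed = sites.remove(siteId);
        if (removed != null) {
            dropSecondary(removed);
        }
    }

    /** Returns the cached site that owns the page, or null if it is not cached. */
    synchronized CachedSite findByPage(String pageId) {
        String siteId = pageToSite.get(pageId);
        return siteId == null ? null : sites.get(siteId);
    }

    /** Total number of entries across the three secondary indexes. */
    synchronized int secondaryEntryCount() {
        return pageToSite.size() + toolToSite.size() + groupToSite.size();
    }

    // The single place where secondary entries are removed.
    private void dropSecondary(CachedSite site) {
        for (String pageId : site.pageIds) pageToSite.remove(pageId);
        for (String toolId : site.toolIds) toolToSite.remove(toolId);
        for (String groupId : site.groupIds) groupToSite.remove(groupId);
    }
}
```

Because both invalidate() and the eviction hook call dropSecondary(), the secondary indexes cannot outlive the site they point to. A real fix would also have to handle replication and event-driven invalidation, which this sketch ignores.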
