[Building Sakai] Elastic Search (SRCH-111)

John Bush john.bush at rsmart.com
Wed Mar 6 14:41:51 PST 2013


So I just committed a few weeks of work after doing some serious data
load testing on 4 nodes, indexing about 100,000 docs (60GB of
textual data).  There are too many fixes to mention, but it's pretty
solid now and performing well.  This change includes the security
filtering that hides content you don't have access to, following the
method discussed in my last post (quoted below).
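
For illustration, here is a minimal sketch of that post-search filter
as described in the quoted message below.  It is not the committed
code: the class CensoringFilterSketch, the ContentProducer interface,
the SearchResult fields, and the filter method are hypothetical
stand-ins, and it assumes a single content producer for all hits.
Only the canRead check and the CensoredSearchResult name come from
the thread.

```java
import java.util.ArrayList;
import java.util.List;

public class CensoringFilterSketch {

    /** Hypothetical stand-in for an entity content producer (ECP). */
    public interface ContentProducer {
        boolean canRead(String reference);
    }

    /** Hypothetical minimal search hit: an entity reference plus its content. */
    public static class SearchResult {
        public final String reference;
        public final String content;

        public SearchResult(String reference, String content) {
            this.reference = reference;
            this.content = content;
        }
    }

    /** Placeholder hit that carries no content, only an access message. */
    public static class CensoredSearchResult extends SearchResult {
        public CensoredSearchResult() {
            super("", "You do not have access to this item.");
        }
    }

    /**
     * Applies the access check after the search has run. Unreadable hits are
     * replaced rather than dropped, so the number of results stays the same.
     */
    public static List<SearchResult> filter(List<SearchResult> hits, ContentProducer ecp) {
        List<SearchResult> filtered = new ArrayList<>(hits.size());
        for (SearchResult hit : hits) {
            if (ecp == null || !ecp.canRead(hit.reference)) {
                filtered.add(new CensoredSearchResult());
            } else {
                filtered.add(hit);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        List<SearchResult> hits = new ArrayList<>();
        hits.add(new SearchResult("/content/group/site1/public.txt", "public notes"));
        hits.add(new SearchResult("/content/group/site1/hidden.txt", "restricted notes"));

        // Assumed rule for the demo only: anything ending in hidden.txt is unreadable.
        ContentProducer ecp = ref -> !ref.endsWith("hidden.txt");

        for (SearchResult r : filter(hits, ecp)) {
            System.out.println(r.content);
        }
    }
}
```

Because the decision is delegated to canRead, each producer can apply
whatever rules it likes (timed release, its own permission model)
without search needing to know about them.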

For implementation and configuration, I added a bunch of documentation
in Confluence:
https://confluence.sakaiproject.org/display/~jbush/Elasticsearch

In addition, I put together some scripts for creating sites, users,
enrollments, and repository content using the Arquillian framework
I've been experimenting with.  That stuff is in git, if anyone is
interested.  You can just set up how many sites, users, and users per
site you want, and how big you want your repo to be, and it will
generate everything.

That stuff is here:

https://github.com/johntbush/cle-integration-suite


On Thu, Feb 28, 2013 at 9:30 AM, John Bush <john.bush at rsmart.com> wrote:
> For right now, I'm going with the way Ian was doing it in the legacy
> search, in the interest of time and making sure it is solid.  What he
> had was a filter he applied to the results after searching.  It
> basically delegated to the entity content producer (ECP) by calling
> the canRead method:
>
> if (ecp == null || !ecp.canRead(reference))
> {
>     result = new CensoredSearchResult();
> }
>
> Then the CensoredSearchResult would remove all the content and just put
> a message that says you don't have access.  So at least the result
> numbers don't get screwed up.  It's a bit of a kludge, but it's
> consistent with how things used to work.  I prefer to get things rock
> solid before introducing any more complexity; otherwise this thing will
> never get out of the gate.  Once we are confident in its performance, I
> think we can take this up again.  I think the method Ian has here that
> I'm using will address Matthew's concern, because it's really delegating
> back to the ECP, which can enforce its own rules, which may or may not
> be based on authz alone.
>
> On Thu, Feb 28, 2013 at 7:24 AM, Matthew Buckett
> <matthew.buckett at it.ox.ac.uk> wrote:
>> On Thu, Feb 28, 2013 at 3:04 AM, Steve Swinsburg
>> <steve.swinsburg at gmail.com> wrote:
>>> I think realms might change quite often, especially if people limit content
>>> based on groups and people move in and out of those groups.
>>
>> Services in Sakai also use more than plain authz groups. One example
>> is ContentHostingService's timed release:
>>
>> org.sakaiproject.content.impl.BaseContentService.availabilityCheck(String)
>>
>> and then you have forums, which don't use Sakai's authz service. So
>> trying to embed the authz in search isn't going to be easy.
>>
>> --
>>   Matthew Buckett, VLE Developer, IT Services, University of Oxford
>
>
>
> --
> John Bush
> 602-490-0470



--
John Bush
602-490-0470

