[Using Sakai] Sakai Error on one of our two nodes

Matthew Jones matthew at longsight.com
Wed Sep 17 10:50:52 PDT 2014


I personally wouldn't worry too much about those open files, especially on
2.9 if you're running search, since that seems to be what is causing it. Sakai
2.9 and 10 are near the 1024 file limit just after startup because of all the
tools included by default, so it's expected that you'd hit that error.
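If you want to see how close a node actually is, something like this is a
quick check (just a sketch; adjust the pgrep pattern to however your JVM is
launched):

# Find the Tomcat/Sakai JVM pid, assuming it runs as the sakai user
PID=$(pgrep -u sakai -f java | head -n 1)

# Per-process limits in effect (soft and hard "Max open files")
grep "Max open files" /proc/$PID/limits

# Descriptors the process actually has open right now
ls /proc/$PID/fd | wc -l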

And the search implementation prior to the switch to elasticsearch (in 10)
had a number of issues and is completely removed from Sakai as of 10.1. I
wouldn't be surprised if it has to open ~2000 files while it's indexing;
ideally it would close them again afterwards. Really, that file limit is meant
to protect the developer and warn you if you have an actual file or socket
leak in the code, which we have had in the past.
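If you suspect a leak, breaking the lsof output down by descriptor type is a
quick way to see whether it's regular files or sockets that are piling up (a
rough sketch; column positions can differ slightly between lsof versions):

# Count of open entries per TYPE (REG, DIR, IPv4, sock, ...) for the sakai user
lsof -u sakai | awk '{print $5}' | sort | uniq -c | sort -rn | head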

This same conversation came up in 2011, and the conclusion then was that the
indexer simply needs a lot of open files to complete the index.
http://collab.sakaiproject.org/pipermail/production/2011-November/001658.html

I'd really look at upgrading to at least 10.1 if you want a more reliable
search tool.

On Wed, Sep 17, 2014 at 9:57 AM, Anders Nordkvist <anders.nordqvist at his.se>
wrote:

>  Hi,
>
>
>
> I managed to set the limit to 10000. It worked after I added this line
>
>
>
> session required pam_limits.so
>
>
>
> to /etc/pam.d/common-session
>
>
>
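> A quick way to check that this actually takes effect for a fresh login
> session is something like (run as root):
>
> # pam_limits.so makes PAM read /etc/security/limits.conf when a session
> # is opened, so a new login shell for the sakai user should now report
> # the raised limit
> su - sakai -c 'ulimit -n'
>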
> What's strange in this case though is that the nodes are so different in
> open files:
>
>
>
> Node 2
>
> lsof -u sakai |grep -i index | wc -l
>
> 2134
>
>
>
> Node 1
>
> lsof | grep index | grep sakai | wc -l
>
> 94
>
>
>
> And the second node is increasing a lot faster.
>
>
>
>
>
> Regards
>
> Anders Nordkvist
>
> System administrator
>
> University Of Skövde
>
> Sweden
>
>
>
>
>
>
>
>
>
> *From:* Matthew Jones [mailto:matthew at longsight.com]
> *Sent:* den 16 september 2014 14:56
> *To:* Anders Nordkvist
> *Cc:* Stephen Marquard; Sam Ottenhoff; steve.swinsburg at gmail.com;
> sakai-user at collab.sakaiproject.org
>
> *Subject:* Re: [Using Sakai] Sakai Error on one of our two nodes
>
>
>
> You'd want to set both the hard and soft limits to make it easier. The soft
> limit is the one that can be changed later, and you change it with the -S
> option. Without raising the soft limit, nothing changes.
>
>
> http://askubuntu.com/questions/162229/how-do-i-increase-the-open-files-limit-for-a-non-root-user
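>
> In shell terms that roughly means (a minimal sketch):
>
> # Show the soft and hard limits separately
> ulimit -Sn    # soft limit -- the one the process actually hits
> ulimit -Hn    # hard limit -- the ceiling the soft limit can be raised to
>
> # Raise the soft limit for the current shell, up to the hard limit
> ulimit -S -n 10000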
>
>
>
> In the file, is sakai capitalized? The username has to match the account
> name exactly, so that would also be a problem, but it probably isn't the case.
>
>
>
> I'm on Ubuntu 14.04 and my /etc/security/limits.conf file has this at the
> end, and it works:
>
>
>
> # End of file
>
>
>
> sakai hard nofile 65535
>
> sakai soft nofile 65535
>
>
>
> $ ulimit -n
>
> 65535
>
>
>
> On Tue, Sep 16, 2014 at 8:17 AM, Anders Nordkvist <anders.nordqvist at his.se>
> wrote:
>
>  Hi again,
>
>
>
> I can't seem to get the "ulimit -n 10000" to work. I just get a permission
> error for the sakai user:
>
>
>
> -su: ulimit: open files: cannot modify limit: Operation not permitted
>
>
>
> I have set the permission in “/etc/security/limit.conf” and rebooted.
>
>
>
> Sakai hard nofile 10000
>
>
>
> And I've set the "ulimit" in "tomcat/bin/setenv":
>
>
>
> Ulimit –n 10000
>
>
>
> Am I doing it wrong? It feels like I've read hundreds of pages from the net
> but can't get it right anyhow :(
>
> I'm using Ubuntu 12.04.4 LTS
>
>
>
> Regards Anders
>
>
>
> *From:* Stephen Marquard [mailto:stephen.marquard at uct.ac.za]
> *Sent:* den 16 september 2014 09:06
> *To:* Anders Nordkvist; Sam Ottenhoff; steve.swinsburg at gmail.com
> *Cc:* sakai-user at collab.sakaiproject.org
> *Subject:* RE: [Using Sakai] Sakai Error on one of our two nodes
>
>
>
> If your search indexes are somehow corrupt, then you should either disable
> search entirely (search.enable = false in sakai.properties), or delete all
> your search indexes, truncate the search tables, and do a full index
> rebuild.
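>
> For the first option, something like this in sakai.properties should do it
> (a minimal sketch):
>
> # Turn off the search service entirely
> search.enable = false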
>
>
>
> Regardless of that, I’d still suggest setting the open files limit in your
> Sakai startup script to at least 10000.
>
>
>
> Regards
>
> Stephen
>
>
>
> ---
> Stephen Marquard, Learning Technologies Co-ordinator,
> Centre for Innovation in Learning and Teaching (CILT)
> University of Cape Town
> http://www.cilt.uct.ac.za
> stephen.marquard at uct.ac.za
> Phone: +27-21-650-5037 Cell: +27-83-500-5290
>
>
>
> *From:* Anders Nordkvist [mailto:anders.nordqvist at his.se
> <anders.nordqvist at his.se>]
> *Sent:* 16 September 2014 08:54 AM
> *To:* Sam Ottenhoff; steve.swinsburg at gmail.com; Stephen Marquard
> *Cc:* sakai-user at collab.sakaiproject.org
> *Subject:* RE: [Using Sakai] Sakai Error on one of our two nodes
>
>
>
> Hi,
>
>
>
> If I delete or move the indexwork files in the sakai dir on node two, would
> that be a solution to my problems, or do you think I have to start over on
> node two with a clean tomcat? I don't think the problems will go away just by
> increasing the open file limit, because it seems like the index open files
> just keep on increasing. I got the "too many open files" again this morning
> with a:
>
>
>
> lsof -u sakai | grep -i indexwork | wc -l
>
>
>
> of 4300 files.
>
>
>
>
>
> Regards
>
> Anders Nordkvist
>
> System administrator
>
> University Of Skövde
>
> Sweden
>
>
>
>
>
>
>
> *From:* sakai-user-bounces at collab.sakaiproject.org [
> mailto:sakai-user-bounces at collab.sakaiproject.org
> <sakai-user-bounces at collab.sakaiproject.org>] *On Behalf Of *Anders
> Nordkvist
> *Sent:* den 15 september 2014 15:46
> *To:* Sam Ottenhoff
> *Cc:* sakai-user at collab.sakaiproject.org
> *Subject:* Re: [Using Sakai] Sakai Error on one of our two nodes
>
>
>
> Unfortunately it seems like my index files are the ones going up without
> decreasing, so the index might be corrupted as Steve suggests:
>
>
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1585
>
> sakai at scio2:~$ lsof -u sakai | wc -l
>
> 3260
>
> sakai at scio2:~$ lsof -u sakai | wc -l
>
> 3261
>
> sakai at scio2:~$ lsof -u sakai | wc -l
>
> 3262
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1594
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1594
>
> sakai at scio2:~$ lsof -u sakai | wc -l
>
> 3242
>
> sakai at scio2:~$ lsof -u sakai | wc -l
>
> 3235
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1594
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1594
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1639
>
> sakai at scio2:~$ lsof -u sakai | wc -l
>
> 3315
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1648
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1657
>
> sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
>
> 1666
>
>
>
> Regards Anders
>
>
>
> *From:* Sam Ottenhoff [mailto:ottenhoff at longsight.com
> <ottenhoff at longsight.com>]
> *Sent:* den 15 september 2014 15:14
> *To:* Anders Nordkvist
> *Cc:* Steve Swinsburg; Stephen Marquard;
> sakai-user at collab.sakaiproject.org
> *Subject:* Re: [Using Sakai] Sakai Error on one of our two nodes
>
>
>
> The ulimit of 1024 is a per-process limit and your lsof output shows
> several different processes.
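>
> To see how each individual process is doing against that limit, something
> like this gives a per-process count (a rough sketch; /proc has to be read
> as the same user or as root):
>
> for pid in $(pgrep -u sakai java); do
>   echo "$pid: $(ls /proc/$pid/fd 2>/dev/null | wc -l) open descriptors"
> done
>
> Also note that lsof lists memory-mapped jars, current directories and the
> like as well, so its count is usually higher than what actually counts
> against the ulimit.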
>
>
>
> On Mon, Sep 15, 2014 at 8:21 AM, Anders Nordkvist <anders.nordqvist at his.se>
> wrote:
>
>  Ok thanks,
>
>
>
> But isn't it strange that I have a 1024 limit when I check with "ulimit -a",
> and when I run "lsof -u sakai | wc -l" I now get 3067, which is over the
> limit?
>
>
>
> Regards Anders
>
>
>
> *From:* Steve Swinsburg [mailto:steve.swinsburg at gmail.com]
> *Sent:* den 15 september 2014 13:26
> *To:* Stephen Marquard
> *Cc:* Anders Nordkvist; sakai-user at collab.sakaiproject.org
> *Subject:* Re: [Using Sakai] Sakai Error on one of our two nodes
>
>
>
> This is pretty much a standard step now that Sakai is so large. It's likely
> the OS update and subsequent restart have reset this to a lower level.
> Increase it as much as you like - 10000 should get you out of trouble.
>
> The search error is directly related to this error as it cannot get
> another file descriptor open to write search indexes. Hopefully it has not
> corrupted the index.
>
> regards,
> Steve
>
>
>
> On Mon, Sep 15, 2014 at 9:03 PM, Stephen Marquard <
> stephen.marquard at uct.ac.za> wrote:
>
>  If you have more than one java process running, then that would be a
> factor. Are your 2 nodes on one server, or one node on each of two servers?
>
>
>
> I’d suggest you take a look at:
>
>
>
> lsof -u tomcat | grep -v jar
>
>
>
> and see if there’s anything unusual, and also add
>
>
>
> ulimit -n 5000
>
>
>
> to your Sakai startup script to see if that helps.
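>
> For example, near the top of the startup script, before the JVM is
> launched (a hypothetical sketch; the tomcat path will differ on your
> install):
>
> #!/bin/sh
> # Raise the per-process open file limit before starting Tomcat/Sakai
> ulimit -n 5000
> /opt/sakai/tomcat/bin/startup.sh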
>
>
> Cheers
>
> Stephen
>
>
>
>
>
> ---
> Stephen Marquard, Learning Technologies Co-ordinator,
> Centre for Innovation in Learning and Teaching (CILT)
> University of Cape Town
> http://www.cilt.uct.ac.za
> stephen.marquard at uct.ac.za
> Phone: +27-21-650-5037 Cell: +27-83-500-5290
>
>
>
> *From:* Anders Nordkvist [mailto:anders.nordqvist at his.se]
> *Sent:* 15 September 2014 12:58 PM
> *To:* Stephen Marquard; sakai-user at collab.sakaiproject.org
>
>
> *Subject:* RE: Sakai Error on one of our two nodes
>
>
>
> Hi Stephen,
>
>
>
> Thanks for the tips. I get this when I run the commands:
>
>
>
> core file size          (blocks, -c) 0
>
> data seg size           (kbytes, -d) unlimited
>
> scheduling priority             (-e) 0
>
> file size               (blocks, -f) unlimited
>
> pending signals                 (-i) 63739
>
> max locked memory       (kbytes, -l) 64
>
> max memory size         (kbytes, -m) unlimited
>
> open files                      (-n) 1024
>
> pipe size            (512 bytes, -p) 8
>
> POSIX message queues     (bytes, -q) 819200
>
> real-time priority              (-r) 0
>
> stack size              (kbytes, -s) 8192
>
> cpu time               (seconds, -t) unlimited
>
> max user processes              (-u) 63739
>
> virtual memory          (kbytes, -v) unlimited
>
> file locks                      (-x) unlimited
>
> sakai at scio2:~$ lsof -u sakai | wc -l
>
> 2769
>
>
>
> If I understand this right, we have a max of 1024 open files per process,
> but the actual number of open files is 2769. Is this because there are more
> processes running?
>
>
>
> Regards Anders
>
>
>
> *From:* Stephen Marquard [mailto:stephen.marquard at uct.ac.za
> <stephen.marquard at uct.ac.za>]
> *Sent:* den 15 september 2014 12:26
> *To:* Anders Nordkvist; sakai-user at collab.sakaiproject.org
> *Subject:* RE: Sakai Error on one of our two nodes
>
>
>
> Hi Anders
>
>
>
> You have 2 different problems; one from “Too many open files” and the
> other from the search service.
>
>
>
> For the “too many open files” issue, you should see how many are being
> used and what the OS limit is on your app server. For example if your Sakai
> process runs as the tomcat user, you can run:
>
>
>
> # lsof -u tomcat | wc -l
>
> 3821
>
>
>
> and run “ulimit -a” to see the per-process OS limits. You can change these
> in your Sakai startup script, e.g. we have:
>
>
>
> # Increase max open files
>
> ulimit -n 100000
>
>
>
> which is probably totally unnecessarily large, but we definitely had to
> increase it past the default 1024 in the early days. 5000 is perhaps
> reasonable.
>
>
>
> It’s possible the “too many open files” is a symptom of another problem
> rather than just an underlying limit that you’ve run into, in which case
> you need to see what those open files are (which could include socket
> connections) and why they are getting opened and not closed.
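>
> Grouping the output is a quick way to spot what is accumulating, for
> example (a sketch; the NAME column is $9 in most lsof builds):
>
> # Most frequently open files and sockets for the tomcat user
> lsof -u tomcat | awk '{print $9}' | sort | uniq -c | sort -rn | head -20
>
> # How many of the open descriptors are network sockets (-a ANDs the selections)
> lsof -a -u tomcat -i | wc -l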
>
>
>
> Regards
>
> Stephen
>
>
>
> ---
> Stephen Marquard, Learning Technologies Co-ordinator,
> Centre for Innovation in Learning and Teaching (CILT)
> University of Cape Town
> http://www.cilt.uct.ac.za
> stephen.marquard at uct.ac.za
> Phone: +27-21-650-5037 Cell: +27-83-500-5290
>
>
>
> *From:* sakai-user-bounces at collab.sakaiproject.org [
> mailto:sakai-user-bounces at collab.sakaiproject.org
> <sakai-user-bounces at collab.sakaiproject.org>] *On Behalf Of *Anders
> Nordkvist
> *Sent:* 15 September 2014 12:07 PM
> *To:* sakai-user at collab.sakaiproject.org
> *Subject:* [Using Sakai] Sakai Error on one of our two nodes
>
>
>
> Hi,
>
>
>
> We have had problems with Sakai at the University of Skövde, Sweden, after
> an OS update and restart of our systems last Friday. We run 2.9.x on two
> Sakai nodes, with a NetScaler distributing the load in front of them and a
> MySQL server behind them. The Sakai nodes collect user information via LDAP
> from our Microsoft AD. The problem occurred several hours after the OS update
> and restart of the machines (about 11 hours). During this time you only have
> a 50/50 chance to log in, because the NetScaler is not working properly and
> is not directing traffic to the working node. Can you guys please take a look
> at this and see if you can figure it out? This is the log from the beginning:
>
>
>
> 2014-09-12 22:08:07,941  WARN http-bio-8080-exec-121
> org.apache.myfaces.shared_impl.renderkit.html.HtmlImageRendererBase - ALT
> attribute is missing for : _idJsp64
>
> 2014-09-12 22:14:00,421  WARN http-bio-8080-exec-108
> com.sun.faces.renderkit.html_basic.HtmlBasicRenderer - Unable to find
> component with ID 'df_compose_title' in view.
>
> 2014-09-12 22:14:00,422  WARN http-bio-8080-exec-108
> com.sun.faces.renderkit.html_basic.HtmlBasicRenderer - Unable to find
> component with ID 'df_compose_body' in view.
>
> Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor
> run
>
> SEVERE: Socket accept failed
>
> java.net.SocketException: Too many open files
>
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>
>         at
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>
>         at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>
>         at java.net.ServerSocket.accept(ServerSocket.java:498)
>
>         at
> org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
>
>         at
> org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
>
>         at java.lang.Thread.run(Thread.java:745)
>
>
>
> Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor
> run
>
> SEVERE: Socket accept failed
>
> java.net.SocketException: Too many open files
>
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>
>         at
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>
>         at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>
>         at java.net.ServerSocket.accept(ServerSocket.java:498)
>
>         at
> org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
>
>         at
> org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
>
>         at java.lang.Thread.run(Thread.java:745)
>
>
>
> Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor
> run
>
> SEVERE: Socket accept failed
>
> java.net.SocketException: Too many open files
>
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>
>         at
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>
>         at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>
>         at java.net.ServerSocket.accept(ServerSocket.java:498)
>
>         at
> org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
>
>         at
> org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
>
>         at java.lang.Thread.run(Thread.java:745)
>
>
>
> Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor
> run
>
> SEVERE: Socket accept failed
>
> java.net.SocketException: Too many open files
>
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>
>         at
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>
>         at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>
>         at java.net.ServerSocket.accept(ServerSocket.java:498)
>
>         at
> org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
>
>         at
> org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
>
>         at java.lang.Thread.run(Thread.java:745)
>
>
>
> Sep 12, 2014 10:17:31 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor
> run
>
> SEVERE: Socket accept failed
>
> java.net.SocketException: Too many open files
>
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>
>         at
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>
>         at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>
>         at java.net.ServerSocket.accept(ServerSocket.java:498)
>
>         at
> org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
>
>         at
> org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
>
>         at java.lang.Thread.run(Thread.java:745)
>
>
>
>
>
> Later on I can see this:
>
>
>
>
>
> 2014-09-12 23:56:45,667 ERROR http-bio-8080-exec-121
> edu.amc.sakai.user.JLDAPDirectoryProvider - getUser() failed [eid: b14verca]
>
> LDAPException: Unable to connect to server hsdc1.hs.local:636 (91) Connect
> Error
>
> java.net.SocketException: Too many open files
>
>         at com.novell.ldap.Connection.connect(Unknown Source)
>
>         at com.novell.ldap.Connection.connect(Unknown Source)
>
>         at com.novell.ldap.LDAPConnection.connect(Unknown Source)
>
>         at
> edu.amc.sakai.user.SimpleLdapConnectionManager.connect(SimpleLdapConnectionManager.java:244)
>
>         at
> edu.amc.sakai.user.SimpleLdapConnectionManager.getConnection(SimpleLdapConnectionManager.java:65)
>
>         at
> edu.amc.sakai.user.JLDAPDirectoryProvider.searchDirectory(JLDAPDirectoryProvider.java:954)
>
>         at
> edu.amc.sakai.user.JLDAPDirectoryProvider.searchDirectoryForSingleEntry(JLDAPDirectoryProvider.java:902)
>
>         at
> edu.amc.sakai.user.JLDAPDirectoryProvider.getUserByEid(JLDAPDirectoryProvider.java:824)
>
>         at
> edu.amc.sakai.user.JLDAPDirectoryProvider.getUserByEid(JLDAPDirectoryProvider.java:778)
>
>         at
> edu.amc.sakai.user.JLDAPDirectoryProvider.getUser(JLDAPDirectoryProvider.java:603)
>
> ...
>
> [Message clipped]