[Using Sakai] Sakai Error on one of our two nodes

Anders Nordkvist anders.nordqvist at his.se
Mon Sep 15 23:54:02 PDT 2014


Hi,

If I delete or move the indexwork files in the sakai dir on node two, would that solve my problems, or do you think I have to start over on node two with a clean Tomcat? I don’t think the problems will go away just by increasing the open-file limit, because the number of open index files just keeps increasing. I got the “too many open files” error again this morning, with:

lsof -u sakai | grep -i indexwork | wc -l

returning 4300 files.
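If you do decide to move the files aside rather than rebuild the whole node, here is a minimal sketch (the indexwork path comes from the logs below; the shutdown/startup scripts stand in for whatever you use to stop and start Sakai, and this assumes the node can rebuild its local working index afterwards):

# stop Tomcat first so nothing holds the index files open
/opt/tomcat-7.0.42/bin/shutdown.sh

# move the working index aside rather than deleting it outright
mv /opt/tomcat-7.0.42/sakai/indexwork /opt/tomcat-7.0.42/sakai/indexwork.bak-$(date +%Y%m%d)

# on startup the search service should recreate indexwork
/opt/tomcat-7.0.42/bin/startup.sh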


Regards
Anders Nordkvist
System administrator
University of Skövde
Sweden



From: sakai-user-bounces at collab.sakaiproject.org [mailto:sakai-user-bounces at collab.sakaiproject.org] On Behalf Of Anders Nordkvist
Sent: 15 September 2014 15:46
To: Sam Ottenhoff
Cc: sakai-user at collab.sakaiproject.org
Subject: Re: [Using Sakai] Sakai Error on one of our two nodes

Unfortunately, it seems my index files are the ones that keep going up without decreasing, so the index might be corrupted, as Steve writes:

sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1585
sakai at scio2:~$ lsof -u sakai | wc -l
3260
sakai at scio2:~$ lsof -u sakai | wc -l
3261
sakai at scio2:~$ lsof -u sakai | wc -l
3262
sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1594
sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1594
sakai at scio2:~$ lsof -u sakai | wc -l
3242
sakai at scio2:~$ lsof -u sakai | wc -l
3235
sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1594
sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1594
sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1639
sakai at scio2:~$ lsof -u sakai | wc -l
3315
sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1648
sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1657
sakai at scio2:~$ lsof -u sakai | grep -i indexwork | wc -l
1666

Regards Anders

From: Sam Ottenhoff [mailto:ottenhoff at longsight.com]
Sent: 15 September 2014 15:14
To: Anders Nordkvist
Cc: Steve Swinsburg; Stephen Marquard; sakai-user at collab.sakaiproject.org
Subject: Re: [Using Sakai] Sakai Error on one of our two nodes

The ulimit of 1024 is a per-process limit and your lsof output shows several different processes.
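A minimal sketch for checking the per-process counts that the 1024 limit actually applies to (assuming the JVMs run as the sakai user; /proc is Linux-specific):

for pid in $(pgrep -u sakai java); do
    # open descriptors for this process, and its own soft limit
    printf '%s: %s open fds (soft limit %s)\n' "$pid" \
        "$(ls /proc/$pid/fd 2>/dev/null | wc -l)" \
        "$(awk '/^Max open files/ {print $4}' /proc/$pid/limits)"
done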

On Mon, Sep 15, 2014 at 8:21 AM, Anders Nordkvist <anders.nordqvist at his.se> wrote:
Ok thanks,

But isn’t it strange that I have a limit of 1024 when I check with “ulimit -a”, yet when I run “lsof -u sakai | wc -l” I now get 3067, which is over the limit?

Regards Anders

From: Steve Swinsburg [mailto:steve.swinsburg at gmail.com]
Sent: 15 September 2014 13:26
To: Stephen Marquard
Cc: Anders Nordkvist; sakai-user at collab.sakaiproject.org
Subject: Re: [Using Sakai] Sakai Error on one of our two nodes

Increasing this limit is pretty much a standard step now that Sakai is so large. It's likely the OS update and subsequent restart reset it down to a lower level. Increase it as much as you like - 10000 should get you out of trouble.
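One way to make the higher limit stick across reboots is pam_limits; a sketch, assuming Tomcat runs as the sakai user and is started through a PAM session (if it is launched from an init script instead, set ulimit in the startup script, as Stephen shows further down the thread):

# /etc/security/limits.conf
sakai  soft  nofile  10000
sakai  hard  nofile  10000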

The search error is directly related to this one, as the indexer cannot open another file descriptor to write the search indexes. Hopefully it has not corrupted the index.
Regards,
Steve

On Mon, Sep 15, 2014 at 9:03 PM, Stephen Marquard <stephen.marquard at uct.ac.za> wrote:
If you have more than one Java process running, then that would be a factor. Are your two nodes on one server, or one node on each of two servers?

I’d suggest you take a look at:

lsof -u tomcat | grep -v jar

and see if there’s anything unusual, and also add

ulimit -n 5000

to your Sakai startup script to see if that helps.
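For example, a hypothetical wrapper script (the Tomcat path is taken from the logs in this thread; the script name is illustrative):

#!/bin/sh
# start-sakai.sh - raise the fd limit in the shell that launches Tomcat,
# so the java process it spawns inherits the higher limit
ulimit -n 5000
exec /opt/tomcat-7.0.42/bin/startup.sh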

Cheers
Stephen


---
Stephen Marquard, Learning Technologies Co-ordinator,
Centre for Innovation in Learning and Teaching (CILT)
University of Cape Town
http://www.cilt.uct.ac.za
stephen.marquard at uct.ac.za
Phone: +27-21-650-5037 Cell: +27-83-500-5290

From: Anders Nordkvist [mailto:anders.nordqvist at his.se]
Sent: 15 September 2014 12:58 PM
To: Stephen Marquard; sakai-user at collab.sakaiproject.org

Subject: RE: Sakai Error on one of our two nodes

Hi Stephen,

Thanks for the tips. I get this when I run the commands:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63739
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63739
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
sakai at scio2:~$ lsof -u sakai | wc -l
2769

If I understand this right, we have a maximum of 1024 open files per process, but the actual number of open files is 2769. Is this because there are multiple processes running?

Regards Anders

From: Stephen Marquard [mailto:stephen.marquard at uct.ac.za]
Sent: 15 September 2014 12:26
To: Anders Nordkvist; sakai-user at collab.sakaiproject.org
Subject: RE: Sakai Error on one of our two nodes

Hi Anders

You have two different problems: one from “Too many open files” and the other from the search service.

For the “too many open files” issue, you should see how many files are being used and what the OS limit is on your app server. For example, if your Sakai process runs as the tomcat user, you can run:

# lsof -u tomcat | wc -l
3821

and run “ulimit -a” to see the per-process OS limits. You can change these in your Sakai startup script, e.g. we have:

# Increase max open files
ulimit -n 100000

which is probably totally unnecessarily large, but we definitely had to increase it past the default 1024 in the early days. 5000 is perhaps reasonable.

It’s possible the “too many open files” is a symptom of another problem rather than just an underlying limit that you’ve run into, in which case you need to see what those open files are (which could include socket connections) and why they are getting opened and not closed.
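A couple of one-liner sketches for that kind of triage (adjust the user to whatever your JVM runs as):

# group open entries by type (REG, IPv4, unix, FIFO, ...) - column 5 of lsof output
lsof -u sakai | awk '{print $5}' | sort | uniq -c | sort -rn

# group by name/path to spot a directory that keeps accumulating descriptors
lsof -u sakai | awk '{print $NF}' | sort | uniq -c | sort -rn | head -20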

Regards
Stephen

---
Stephen Marquard, Learning Technologies Co-ordinator,
Centre for Innovation in Learning and Teaching (CILT)
University of Cape Town
http://www.cilt.uct.ac.za
stephen.marquard at uct.ac.za
Phone: +27-21-650-5037 Cell: +27-83-500-5290

From: sakai-user-bounces at collab.sakaiproject.org [mailto:sakai-user-bounces at collab.sakaiproject.org] On Behalf Of Anders Nordkvist
Sent: 15 September 2014 12:07 PM
To: sakai-user at collab.sakaiproject.org
Subject: [Using Sakai] Sakai Error on one of our two nodes

Hi,

We have had problems with Sakai at the University of Skövde, Sweden, after an OS update and restart of our systems last Friday. We run 2.9.x on two Sakai nodes, with a NetScaler distributing the load in front of them and a MySQL server behind them. The Sakai nodes fetch user information via LDAP from our Microsoft AD. The problem occurred several hours (about 11) after the OS update and restart of the machines. While it lasts, you only have a 50% chance of logging in, because the NetScaler is not working properly and is not directing traffic only to the working node. Can you please take a look at this and see if you can figure it out? This is the log from the beginning:

2014-09-12 22:08:07,941  WARN http-bio-8080-exec-121 org.apache.myfaces.shared_impl.renderkit.html.HtmlImageRendererBase - ALT attribute is missing for : _idJsp64
2014-09-12 22:14:00,421  WARN http-bio-8080-exec-108 com.sun.faces.renderkit.html_basic.HtmlBasicRenderer - Unable to find component with ID 'df_compose_title' in view.
2014-09-12 22:14:00,422  WARN http-bio-8080-exec-108 com.sun.faces.renderkit.html_basic.HtmlBasicRenderer - Unable to find component with ID 'df_compose_body' in view.
Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 12, 2014 10:17:31 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)


Later on I can see this:


2014-09-12 23:56:45,667 ERROR http-bio-8080-exec-121 edu.amc.sakai.user.JLDAPDirectoryProvider - getUser() failed [eid: b14verca]
LDAPException: Unable to connect to server hsdc1.hs.local:636 (91) Connect Error
java.net.SocketException: Too many open files
        at com.novell.ldap.Connection.connect(Unknown Source)
        at com.novell.ldap.Connection.connect(Unknown Source)
        at com.novell.ldap.LDAPConnection.connect(Unknown Source)
        at edu.amc.sakai.user.SimpleLdapConnectionManager.connect(SimpleLdapConnectionManager.java:244)
        at edu.amc.sakai.user.SimpleLdapConnectionManager.getConnection(SimpleLdapConnectionManager.java:65)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.searchDirectory(JLDAPDirectoryProvider.java:954)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.searchDirectoryForSingleEntry(JLDAPDirectoryProvider.java:902)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.getUserByEid(JLDAPDirectoryProvider.java:824)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.getUserByEid(JLDAPDirectoryProvider.java:778)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.getUser(JLDAPDirectoryProvider.java:603)
        at org.sakaiproject.user.impl.BaseUserDirectoryService.getProvidedUserByEid(BaseUserDirectoryService.java:656)
        at org.sakaiproject.user.impl.BaseUserDirectoryService.getUser(BaseUserDirectoryService.java:722)
        at org.sakaiproject.user.impl.BaseUserDirectoryService.getCurrentUser(BaseUserDirectoryService.java:890)
        at org.sakaiproject.authz.impl.SakaiSecurity.unlock(SakaiSecurity.java:222)
        at org.sakaiproject.authz.cover.SecurityService.unlock(SecurityService.java:91)
        at org.sakaiproject.portal.charon.site.PortalSiteHelperImpl.pageListToMap(PortalSiteHelperImpl.java:583)
        at org.sakaiproject.portal.charon.handlers.WorksiteHandler.includeWorksite(WorksiteHandler.java:195)
        at org.sakaiproject.portal.charon.handlers.WorksiteHandler.doWorksite(WorksiteHandler.java:165)
        at org.sakaiproject.portal.charon.SkinnableCharonPortal.doError(SkinnableCharonPortal.java:270)
        at org.sakaiproject.portal.charon.handlers.PresenceHandler.doPresence(PresenceHandler.java:117)
        at org.sakaiproject.portal.charon.handlers.PresenceHandler.doGet(PresenceHandler.java:70)
        at org.sakaiproject.portal.charon.SkinnableCharonPortal.doGet(SkinnableCharonPortal.java:894)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.sakaiproject.util.RequestFilter.doFilter(RequestFilter.java:695)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
        at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Too many open files

I can also see this, and I don’t know if it’s related:

00:00:13,955 ERROR IndexManager org.sakaiproject.search.indexer.impl.TransactionalIndexWorker - Failed to Add Documents
org.sakaiproject.search.transaction.api.IndexTransactionException: Cant Create Transaction Index working space
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getInternalIndexWriter(IndexUpdateTransactionImpl.java:205)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getIndexWriter(IndexUpdateTransactionImpl.java:168)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getIndexReader(IndexUpdateTransactionImpl.java:338)
        at org.sakaiproject.search.indexer.impl.TransactionalIndexWorker.processTransaction(TransactionalIndexWorker.java:229)
        at org.sakaiproject.search.indexer.impl.TransactionalIndexWorker.process(TransactionalIndexWorker.java:132)
        at org.sakaiproject.search.indexer.impl.ConcurrentSearchIndexBuilderWorkerImpl.runOnce(ConcurrentSearchIndexBuilderWorkerImpl.java:273)
        at org.sakaiproject.search.journal.impl.IndexManagementTimerTask.run(IndexManagementTimerTask.java:137)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock: java.io.FileNotFoundException: /opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock (Too many open files)
        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1562)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1090)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getInternalIndexWriter(IndexUpdateTransactionImpl.java:194)
        ... 8 more
Caused by: java.io.FileNotFoundException: /opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:183)
        at org.apache.lucene.store.Lock.obtain(Lock.java:99)
        ... 11 more
Sep 14, 2014 12:00:15 AM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)
00:00:13,955 ERROR IndexManager org.sakaiproject.search.indexer.impl.TransactionalIndexWorker - Failed to Add Documents
org.sakaiproject.search.transaction.api.IndexTransactionException: Cant Create Transaction Index working space
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getInternalIndexWriter(IndexUpdateTransactionImpl.java:205)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getIndexWriter(IndexUpdateTransactionImpl.java:168)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getIndexReader(IndexUpdateTransactionImpl.java:338)
        at org.sakaiproject.search.indexer.impl.TransactionalIndexWorker.processTransaction(TransactionalIndexWorker.java:229)
        at org.sakaiproject.search.indexer.impl.TransactionalIndexWorker.process(TransactionalIndexWorker.java:132)
        at org.sakaiproject.search.indexer.impl.ConcurrentSearchIndexBuilderWorkerImpl.runOnce(ConcurrentSearchIndexBuilderWorkerImpl.java:273)
        at org.sakaiproject.search.journal.impl.IndexManagementTimerTask.run(IndexManagementTimerTask.java:137)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock: java.io.FileNotFoundException: /opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock (Too many open files)
        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1562)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1090)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getInternalIndexWriter(IndexUpdateTransactionImpl.java:194)
        ... 8 more
Caused by: java.io.FileNotFoundException: /opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:183)
        at org.apache.lucene.store.Lock.obtain(Lock.java:99)
        ... 11 more
Sep 14, 2014 12:00:15 AM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)


We restarted the server on Saturday at about 10:05, and then at about 20:33 we got it again:


2014-09-14 20:32:14,408  WARN http-bio-8080-exec-204 org.apache.myfaces.shared_impl.renderkit.html.HtmlImageRendererBase - ALT attribute is missing for : _idJsp64
2014-09-14 20:33:06,840  WARN http-bio-8080-exec-186 org.apache.myfaces.shared_impl.renderkit.html.HtmlImageRendererBase - ALT attribute is missing for : _idJsp64
Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)




I restarted again and it is working so far, but I’m afraid it will go down again in the evening when nobody is working.

PS: On the first node, which has worked the whole time, we got this error:

2014-09-14 00:02:33,230 ERROR IndexManager org.sakaiproject.search.optimize.shared.impl.DbJournalOptimizationManager - This node already merging shared segments, index writer scio1:1395755228716      This node is currently optimizing the shared segments,  This is an error as only one copy of this node should be        Active in the clustersee http://jira.sakaiproject.org/browse/SRCH-38
2014-09-14 00:02:43,230 ERROR IndexManager org.sakaiproject.search.optimize.shared.impl.DbJournalOptimizationManager - This node already merging shared segments, index writer scio1:1395755228716      This node is currently optimizing the shared segments,  This is an error as only one copy of this node should be        Active in the clustersee http://jira.sakaiproject.org/browse/SRCH-38
2014-09-14 00:02:53,230 ERROR IndexManager org.sakaiproject.search.optimize.shared.impl.DbJournalOptimizationManager - This node already merging shared segments, index writer scio1:1395755228716      This node is currently optimizing the shared segments,  This is an error as only one copy of this node should be        Active in the clustersee http://jira.sakaiproject.org/browse/SRCH-38
2014-09-14 00:03:03,230 ERROR IndexManager org.sakaiproject.search.optimize.shared.impl.DbJournalOptimizationManager - This node already merging shared segments, index writer scio1:1395755228716      This node is currently optimizing the shared segments,  This is an error as only one copy of this node should be        Active in the clustersee http://jira.sakaiproject.org/browse/SRCH-38

I updated the database on this node according to the Jira issue (marking all rows in database_journal as committed). Now the error is gone.
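For reference, a sketch of that kind of update (the table name is as given above, but the column name, the exact value, and the database/user names are assumptions on my part - verify against your schema and the SRCH-38 notes before running anything):

# the 'status' column, its value, and the db/user names are assumptions, not from the thread
mysql -u sakai -p sakai -e "UPDATE database_journal SET status = 'committed';"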


Regards
Anders Nordkvist
System administrator
University of Skövde
Sweden


_______________________________________________
sakai-user mailing list
sakai-user at collab.sakaiproject.org
http://collab.sakaiproject.org/mailman/listinfo/sakai-user

TO UNSUBSCRIBE: send email to sakai-user-unsubscribe at collab.sakaiproject.org with a subject of "unsubscribe"


