[Using Sakai] Sakai Error on one of our two nodes

Anders Nordkvist anders.nordqvist at his.se
Mon Sep 15 03:07:01 PDT 2014


Hi,

We have had problems with Sakai at the University of Skövde Sweden after an OS update and restart of systems last friday. We have 2.9.x and have two Sakai nodes and on top of that we have a netscaler distributing the load and behind a mysql server. The Sakai nodes collect information via LDAP from our Microsoft AD. The problem occurred several hours after the update of OS and restart of machines (about 11hours). During this time you only have a 50/50 % chance to login because the netscaler is not working properly and is not directing traffic to the working node. Can you guys please take a look at this and see if you can figure it out? This is the log from the beginning:

2014-09-12 22:08:07,941  WARN http-bio-8080-exec-121 org.apache.myfaces.shared_impl.renderkit.html.HtmlImageRendererBase - ALT attribute is missing for : _idJsp64
2014-09-12 22:14:00,421  WARN http-bio-8080-exec-108 com.sun.faces.renderkit.html_basic.HtmlBasicRenderer - Unable to find component with ID 'df_compose_title' in view.
2014-09-12 22:14:00,422  WARN http-bio-8080-exec-108 com.sun.faces.renderkit.html_basic.HtmlBasicRenderer - Unable to find component with ID 'df_compose_body' in view.
Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 12, 2014 10:17:00 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 12, 2014 10:17:31 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)


Later on I can see this:


2014-09-12 23:56:45,667 ERROR http-bio-8080-exec-121 edu.amc.sakai.user.JLDAPDirectoryProvider - getUser() failed [eid: b14verca]
LDAPException: Unable to connect to server hsdc1.hs.local:636 (91) Connect Error
java.net.SocketException: Too many open files
        at com.novell.ldap.Connection.connect(Unknown Source)
        at com.novell.ldap.Connection.connect(Unknown Source)
        at com.novell.ldap.LDAPConnection.connect(Unknown Source)
        at edu.amc.sakai.user.SimpleLdapConnectionManager.connect(SimpleLdapConnectionManager.java:244)
        at edu.amc.sakai.user.SimpleLdapConnectionManager.getConnection(SimpleLdapConnectionManager.java:65)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.searchDirectory(JLDAPDirectoryProvider.java:954)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.searchDirectoryForSingleEntry(JLDAPDirectoryProvider.java:902)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.getUserByEid(JLDAPDirectoryProvider.java:824)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.getUserByEid(JLDAPDirectoryProvider.java:778)
        at edu.amc.sakai.user.JLDAPDirectoryProvider.getUser(JLDAPDirectoryProvider.java:603)
        at org.sakaiproject.user.impl.BaseUserDirectoryService.getProvidedUserByEid(BaseUserDirectoryService.java:656)
        at org.sakaiproject.user.impl.BaseUserDirectoryService.getUser(BaseUserDirectoryService.java:722)
        at org.sakaiproject.user.impl.BaseUserDirectoryService.getCurrentUser(BaseUserDirectoryService.java:890)
        at org.sakaiproject.authz.impl.SakaiSecurity.unlock(SakaiSecurity.java:222)
        at org.sakaiproject.authz.cover.SecurityService.unlock(SecurityService.java:91)
        at org.sakaiproject.portal.charon.site.PortalSiteHelperImpl.pageListToMap(PortalSiteHelperImpl.java:583)
        at org.sakaiproject.portal.charon.handlers.WorksiteHandler.includeWorksite(WorksiteHandler.java:195)
        at org.sakaiproject.portal.charon.handlers.WorksiteHandler.doWorksite(WorksiteHandler.java:165)
        at org.sakaiproject.portal.charon.SkinnableCharonPortal.doError(SkinnableCharonPortal.java:270)
        at org.sakaiproject.portal.charon.handlers.PresenceHandler.doPresence(PresenceHandler.java:117)
        at org.sakaiproject.portal.charon.handlers.PresenceHandler.doGet(PresenceHandler.java:70)
        at org.sakaiproject.portal.charon.SkinnableCharonPortal.doGet(SkinnableCharonPortal.java:894)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.sakaiproject.util.RequestFilter.doFilter(RequestFilter.java:695)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
        at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Too many open files

I can also see this and I don't know if its related:

00:00:13,955 ERROR IndexManager org.sakaiproject.search.indexer.impl.TransactionalIndexWorker - Failed to Add Documents
org.sakaiproject.search.transaction.api.IndexTransactionException: Cant Create Transaction Index working space
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getInternalIndexWriter(IndexUpdateTransactionImpl.java:205)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getIndexWriter(IndexUpdateTransactionImpl.java:168)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getIndexReader(IndexUpdateTransactionImpl.java:338)
        at org.sakaiproject.search.indexer.impl.TransactionalIndexWorker.processTransaction(TransactionalIndexWorker.java:229)
        at org.sakaiproject.search.indexer.impl.TransactionalIndexWorker.process(TransactionalIndexWorker.java:132)
        at org.sakaiproject.search.indexer.impl.ConcurrentSearchIndexBuilderWorkerImpl.runOnce(ConcurrentSearchIndexBuilderWorkerImpl.java:273)
        at org.sakaiproject.search.journal.impl.IndexManagementTimerTask.run(IndexManagementTimerTask.java:137)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock: java.io.FileNotFoundException: /opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock (Too many open files)
        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1562)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1090)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getInternalIndexWriter(IndexUpdateTransactionImpl.java:194)
        ... 8 more
Caused by: java.io.FileNotFoundException: /opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:183)
        at org.apache.lucene.store.Lock.obtain(Lock.java:99)
        ... 11 more
Sep 14, 2014 12:00:15 AM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)00:00:13,955 ERROR IndexManager org.sakaiproject.search.indexer.impl.TransactionalIndexWorker - Failed to Add Documents
org.sakaiproject.search.transaction.api.IndexTransactionException: Cant Create Transaction Index working space
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getInternalIndexWriter(IndexUpdateTransactionImpl.java:205)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getIndexWriter(IndexUpdateTransactionImpl.java:168)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getIndexReader(IndexUpdateTransactionImpl.java:338)
        at org.sakaiproject.search.indexer.impl.TransactionalIndexWorker.processTransaction(TransactionalIndexWorker.java:229)
        at org.sakaiproject.search.indexer.impl.TransactionalIndexWorker.process(TransactionalIndexWorker.java:132)
        at org.sakaiproject.search.indexer.impl.ConcurrentSearchIndexBuilderWorkerImpl.runOnce(ConcurrentSearchIndexBuilderWorkerImpl.java:273)
        at org.sakaiproject.search.journal.impl.IndexManagementTimerTask.run(IndexManagementTimerTask.java:137)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock: java.io.FileNotFoundException: /opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock (Too many open files)
        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1562)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1090)
        at org.sakaiproject.search.indexer.impl.IndexUpdateTransactionImpl.getInternalIndexWriter(IndexUpdateTransactionImpl.java:194)
        ... 8 more
Caused by: java.io.FileNotFoundException: /opt/tomcat-7.0.42/sakai/indexwork/indexer-work/indextx-1410515455900/write.lock (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:183)
        at org.apache.lucene.store.Lock.obtain(Lock.java:99)
        ... 11 more
Sep 14, 2014 12:00:15 AM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)


We restarted the server on Saturday about 10:05 and then about 20:33 we get it again:


2014-09-14 20:32:14,408  WARN http-bio-8080-exec-204 org.apache.myfaces.shared_impl.renderkit.html.HtmlImageRendererBase - ALT attribute is missing for : _idJsp64
2014-09-14 20:33:06,840  WARN http-bio-8080-exec-186 org.apache.myfaces.shared_impl.renderkit.html.HtmlImageRendererBase - ALT attribute is missing for : _idJsp64
Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)
        at java.lang.Thread.run(Thread.java:745)

Sep 14, 2014 8:33:32 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
SEVERE: Socket accept failed
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:60)
        at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:216)




I restarted again and now its working so far but Im afraid it will go down again in the evening when nobody is working.

PS. On the firs node that have worked all the time we got this error:

2014-09-14 00:02:33,230 ERROR IndexManager org.sakaiproject.search.optimize.shared.impl.DbJournalOptimizationManager - This node already merging shared segments, index writer scio1:1395755228716      This node is currently optimizing the shared segments,  This is an error as only one copy of this node should be        Active in the clustersee http://jira.sakaiproject.org/browse/SRCH-38
2014-09-14 00:02:43,230 ERROR IndexManager org.sakaiproject.search.optimize.shared.impl.DbJournalOptimizationManager - This node already merging shared segments, index writer scio1:1395755228716      This node is currently optimizing the shared segments,  This is an error as only one copy of this node should be        Active in the clustersee http://jira.sakaiproject.org/browse/SRCH-38
2014-09-14 00:02:53,230 ERROR IndexManager org.sakaiproject.search.optimize.shared.impl.DbJournalOptimizationManager - This node already merging shared segments, index writer scio1:1395755228716      This node is currently optimizing the shared segments,  This is an error as only one copy of this node should be        Active in the clustersee http://jira.sakaiproject.org/browse/SRCH-38
2014-09-14 00:03:03,230 ERROR IndexManager org.sakaiproject.search.optimize.shared.impl.DbJournalOptimizationManager - This node already merging shared segments, index writer scio1:1395755228716      This node is currently optimizing the shared segments,  This is an error as only one copy of this node should be        Active in the clustersee http://jira.sakaiproject.org/browse/SRCH-38

I updated the database on this one according the Jira with (committed on all rows on database_journal). Now The error is gone.


Regards
Anders Nordkvist
System administrator
University Of Skövde
Sweden

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://collab.sakaiproject.org/pipermail/sakai-user/attachments/20140915/2030af8b/attachment-0001.html 


More information about the sakai-user mailing list