Indexing is not forgiving enough about bad data in database, should report what's wrong #2815

Closed
pdurbin opened this issue Dec 10, 2015 · 3 comments

Comments

@pdurbin
Member

pdurbin commented Dec 10, 2015

In production we get a 500 error when visiting a certain dataset via either of these links:

[Screenshot: 500 Internal Server Error, Harvard Dataverse, 2015-12-10]

This probably indicates bad data in the database, which should be cleaned up, but I'm opening this issue to see what we can do on the indexing side to report what's wrong with the data.

Here's the stacktrace from indexing, with line numbers from ba15284:

[2015-12-10T01:39:17.350-0500] [glassfish 4.1] [INFO] [] [edu.harvard.iq.dataverse.search.IndexAllServiceBean] [tid: _ThreadID=277 _ThreadName=__ejb-thread-pool10] [timeMillis: 1449729557350] [levelValue: 800] [[
  indexing dataset 64895 of 64902 (id=2735375, persistentId=doi:10.7910/DVN/E2ULRS)]]

[2015-12-10T01:39:17.353-0500] [glassfish 4.1] [WARNING] [AS-EJB-00056] [javax.enterprise.ejb.container] [tid: _ThreadID=277 _ThreadName=__ejb-thread-pool10] [timeMillis: 1449729557353] [levelValue: 900] [[
  A system exception occurred during an invocation on EJB IndexServiceBean, method: public java.util.concurrent.Future edu.harvard.iq.dataverse.search.IndexServiceBean.indexDatasetInNewTransaction(edu.harvard.iq.dataverse.Dataset)]]

[2015-12-10T01:39:17.353-0500] [glassfish 4.1] [WARNING] [] [javax.enterprise.ejb.container] [tid: _ThreadID=277 _ThreadName=__ejb-thread-pool10] [timeMillis: 1449729557353] [levelValue: 900] [[

javax.ejb.EJBException
        at com.sun.ejb.containers.EJBContainerTransactionManager.processSystemException(EJBContainerTransactionManager.java:748)
        at com.sun.ejb.containers.EJBContainerTransactionManager.completeNewTx(EJBContainerTransactionManager.java:698)
        at com.sun.ejb.containers.EJBContainerTransactionManager.postInvokeTx(EJBContainerTransactionManager.java:515)
        at com.sun.ejb.containers.BaseContainer.postInvokeTx(BaseContainer.java:4566)
        at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2074)
        at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2044)
        at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:220)
        at com.sun.ejb.containers.EJBLocalObjectInvocationHandlerDelegate.invoke(EJBLocalObjectInvocationHandlerDelegate.java:88)
        at com.sun.proxy.$Proxy275.indexDatasetInNewTransaction(Unknown Source)
        at edu.harvard.iq.dataverse.search.__EJB31_Generated__IndexServiceBean__Intf____Bean__.indexDatasetInNewTransaction(Unknown Source)
        at edu.harvard.iq.dataverse.search.IndexAllServiceBean.indexAllOrSubset(IndexAllServiceBean.java:121)
        at edu.harvard.iq.dataverse.search.IndexAllServiceBean.indexAllOrSubset(IndexAllServiceBean.java:46)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.glassfish.ejb.security.application.EJBSecurityManager.runMethod(EJBSecurityManager.java:1081)
        at org.glassfish.ejb.security.application.EJBSecurityManager.invoke(EJBSecurityManager.java:1153)
        at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(BaseContainer.java:4786)
        at com.sun.ejb.EjbInvocation.invokeBeanMethod(EjbInvocation.java:656)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
        at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73)
        at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:52)
        at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCall(SystemInterceptorProxy.java:163)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.aroundInvoke(SystemInterceptorProxy.java:140)
        at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:369)
        at com.sun.ejb.containers.BaseContainer.__intercept(BaseContainer.java:4758)
        at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:4746)
        at com.sun.ejb.containers.EjbAsyncTask.call(EjbAsyncTask.java:101)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 0
        at java.util.Vector.get(Vector.java:744)
        at org.eclipse.persistence.indirection.IndirectList.get(IndirectList.java:410)
        at edu.harvard.iq.dataverse.Dataset.getLatestVersion(Dataset.java:198)
        at edu.harvard.iq.dataverse.search.IndexServiceBean.indexDataset(IndexServiceBean.java:328)
        at edu.harvard.iq.dataverse.search.IndexServiceBean.indexDatasetInNewTransaction(IndexServiceBean.java:243)
        at sun.reflect.GeneratedMethodAccessor1086.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.glassfish.ejb.security.application.EJBSecurityManager.runMethod(EJBSecurityManager.java:1081)
        at org.glassfish.ejb.security.application.EJBSecurityManager.invoke(EJBSecurityManager.java:1153)
        at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(BaseContainer.java:4786)
        at com.sun.ejb.EjbInvocation.invokeBeanMethod(EjbInvocation.java:656)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
        at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:64)
        at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:52)
        at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCall(SystemInterceptorProxy.java:163)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.aroundInvoke(SystemInterceptorProxy.java:140)
        at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:369)
        at com.sun.ejb.containers.BaseContainer.__intercept(BaseContainer.java:4758)
        at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:4746)
        at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:212)
        ... 38 more
]]

[2015-12-10T01:39:17.355-0500] [glassfish 4.1] [WARNING] [AS-EJB-00056] [javax.enterprise.ejb.container] [tid: _ThreadID=277 _ThreadName=__ejb-thread-pool10] [timeMillis: 1449729557355] [levelValue: 900] [[
  A system exception occurred during an invocation on EJB IndexAllServiceBean, method: public java.util.concurrent.Future edu.harvard.iq.dataverse.search.IndexAllServiceBean.indexAllOrSubset(long,long,boolean,boolean)]]

[2015-12-10T01:39:17.355-0500] [glassfish 4.1] [WARNING] [] [javax.enterprise.ejb.container] [tid: _ThreadID=277 _ThreadName=__ejb-thread-pool10] [timeMillis: 1449729557355] [levelValue: 900] [[

javax.ejb.EJBException
        at com.sun.ejb.containers.EJBContainerTransactionManager.processSystemException(EJBContainerTransactionManager.java:748)
        at com.sun.ejb.containers.EJBContainerTransactionManager.completeNewTx(EJBContainerTransactionManager.java:698)
        at com.sun.ejb.containers.EJBContainerTransactionManager.postInvokeTx(EJBContainerTransactionManager.java:515)
        at com.sun.ejb.containers.BaseContainer.postInvokeTx(BaseContainer.java:4566)
        at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2074)
        at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2044)
        at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:220)
        at com.sun.ejb.containers.EJBLocalObjectInvocationHandlerDelegate.invoke(EJBLocalObjectInvocationHandlerDelegate.java:88)
        at com.sun.proxy.$Proxy275.indexDatasetInNewTransaction(Unknown Source)
        at edu.harvard.iq.dataverse.search.__EJB31_Generated__IndexServiceBean__Intf____Bean__.indexDatasetInNewTransaction(Unknown Source)
        at edu.harvard.iq.dataverse.search.IndexAllServiceBean.indexAllOrSubset(IndexAllServiceBean.java:121)
        at edu.harvard.iq.dataverse.search.IndexAllServiceBean.indexAllOrSubset(IndexAllServiceBean.java:46)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.glassfish.ejb.security.application.EJBSecurityManager.runMethod(EJBSecurityManager.java:1081)
        at org.glassfish.ejb.security.application.EJBSecurityManager.invoke(EJBSecurityManager.java:1153)
        at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(BaseContainer.java:4786)
        at com.sun.ejb.EjbInvocation.invokeBeanMethod(EjbInvocation.java:656)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
        at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73)
        at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:52)
        at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCall(SystemInterceptorProxy.java:163)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.aroundInvoke(SystemInterceptorProxy.java:140)
        at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:369)
        at com.sun.ejb.containers.BaseContainer.__intercept(BaseContainer.java:4758)
        at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:4746)
        at com.sun.ejb.containers.EjbAsyncTask.call(EjbAsyncTask.java:101)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 0
        at java.util.Vector.get(Vector.java:744)
        at org.eclipse.persistence.indirection.IndirectList.get(IndirectList.java:410)
        at edu.harvard.iq.dataverse.Dataset.getLatestVersion(Dataset.java:198)
        at edu.harvard.iq.dataverse.search.IndexServiceBean.indexDataset(IndexServiceBean.java:328)
        at edu.harvard.iq.dataverse.search.IndexServiceBean.indexDatasetInNewTransaction(IndexServiceBean.java:243)
        at sun.reflect.GeneratedMethodAccessor1086.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.glassfish.ejb.security.application.EJBSecurityManager.runMethod(EJBSecurityManager.java:1081)
        at org.glassfish.ejb.security.application.EJBSecurityManager.invoke(EJBSecurityManager.java:1153)
        at com.sun.ejb.containers.BaseContainer.invokeBeanMethod(BaseContainer.java:4786)
        at com.sun.ejb.EjbInvocation.invokeBeanMethod(EjbInvocation.java:656)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
        at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:64)
        at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:52)
        at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.EjbInvocation.proceed(EjbInvocation.java:608)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCall(SystemInterceptorProxy.java:163)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.aroundInvoke(SystemInterceptorProxy.java:140)
        at sun.reflect.GeneratedMethodAccessor113.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.sun.ejb.containers.interceptors.AroundInvokeInterceptor.intercept(InterceptorManager.java:883)
        at com.sun.ejb.containers.interceptors.AroundInvokeChainImpl.invokeNext(InterceptorManager.java:822)
        at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:369)
        at com.sun.ejb.containers.BaseContainer.__intercept(BaseContainer.java:4758)
        at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:4746)
        at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:212)
        ... 38 more
]]

[2015-12-10T01:41:48.178-0500] [glassfish 4.1] [INFO] [] [edu.harvard.iq.dataverse.DatasetPage] [tid: _ThreadID=309 _ThreadName=jk-connector(11)] [timeMillis: 1449729708178] [levelValue: 800] [[
  retreived version: id: 77045, state: DEACCESSIONED]]
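
The `Caused by` frames show `Dataset.getLatestVersion` asking for index 0 of an empty versions list. A minimal sketch of a check that names the offending dataset instead of surfacing a bare `ArrayIndexOutOfBoundsException`; `DatasetSketch` and its fields are hypothetical stand-ins, not the actual Dataverse entity:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a dataset and its versions; not the real Dataverse class.
final class DatasetSketch {
    final long id;
    final String persistentId;
    final List<String> versions = new ArrayList<>();

    DatasetSketch(long id, String persistentId) {
        this.id = id;
        this.persistentId = persistentId;
    }

    // Returns the first version in the list, or fails with a message that
    // identifies the dataset when the list is empty.
    String getLatestVersion() {
        if (versions.isEmpty()) {
            throw new IllegalStateException(
                "Dataset id=" + id + " (" + persistentId + ") has no versions");
        }
        return versions.get(0);
    }
}
```

With a message like this in the log, the bad row could be found from the id alone rather than by correlating the preceding INFO line.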
@pdurbin pdurbin self-assigned this Dec 10, 2015
@pdurbin pdurbin added this to the 4.2.2 milestone Dec 10, 2015
@pdurbin
Member Author

pdurbin commented Dec 10, 2015

I spoke with @kcondon and he indicated that this issue can be moved out of the 4.2.2 milestone so I'll do that now. #2816 is related.

@pdurbin pdurbin removed this from the 4.2.2 milestone Dec 10, 2015
@pdurbin
Member Author

pdurbin commented Dec 16, 2015

I haven't verified this yet, but the theory is that "index all" builds a huge list of dataset ids at the start. Indexing is slow in production (per #50), so by the time it completes someone may have deleted a dataset completely (an unpublished draft, deleted without a trace). Indexing then fails because the dataset no longer exists.
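
If that theory holds, one option would be for the "index all" loop to record each failure and keep going rather than let a single dataset abort the run. A rough sketch under that assumption; `IndexAllSketch`, `indexAll`, and the `indexOne` callback are hypothetical names, not the existing `IndexAllServiceBean` API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

// Hypothetical sketch of a tolerant "index all" loop; not the real service bean.
final class IndexAllSketch {

    // Calls indexOne for every id. A failure on one id is recorded with the
    // id and the exception, and the loop continues with the next id.
    static List<String> indexAll(List<Long> datasetIds, LongConsumer indexOne) {
        List<String> failures = new ArrayList<>();
        for (long id : datasetIds) {
            try {
                indexOne.accept(id);
            } catch (RuntimeException e) {
                failures.add("dataset id=" + id + ": " + e);
            }
        }
        return failures;
    }
}
```

The returned list could be logged at the end of the run, which would give a report of which datasets were skipped and why.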

@pdurbin
Member Author

pdurbin commented Jun 28, 2017

In practice, what I do is clear out Solr and reindex: http://guides.dataverse.org/en/4.6/installation/administration.html#clear-and-reindex. Closing.

@pdurbin pdurbin closed this as completed Jun 28, 2017