Error when saving the pipeline as described in help #2

Open
ghost opened this issue Nov 10, 2020 · 3 comments

ghost commented Nov 10, 2020

I'm trying to create my first pipeline in Baleen.

I've followed the tutorial in the help to create my first pipeline (REST API as source, Email and Print as processors).

When I go to save the pipeline I get the following error:

ERROR: http://localhost:6413/api/v3/pipelines/Empty%20Pipeline2 404

I get the following in the console running Baleen:

2020-11-10 17:45:07.542 ERROR 21269 --- [   scheduling-1] u.g.d.baleen.services.PipelineService    : Unable to create pipeline Empty Pipeline2 from file pipelines/182bf282-ab13-4a67-addd-3079c4e38dd2.json

java.lang.NullPointerException: null
	at java.base/java.io.FileInputStream.<init>(FileInputStream.java:147) ~[na:na]
	at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:182) ~[annot8-components-opennlp-1.0.0-plugin.jar:na]
	at opennlp.tools.namefind.TokenNameFinderModel.<init>(TokenNameFinderModel.java:108) ~[annot8-components-opennlp-1.0.0-plugin.jar:na]
	at io.annot8.components.opennlp.processors.NER$Processor.<init>(NER.java:58) ~[annot8-components-opennlp-1.0.0-plugin.jar:na]
	at io.annot8.components.opennlp.processors.NER.createComponent(NER.java:38) ~[annot8-components-opennlp-1.0.0-plugin.jar:na]
	at io.annot8.components.opennlp.processors.NER.createComponent(NER.java:30) ~[annot8-components-opennlp-1.0.0-plugin.jar:na]
	at io.annot8.common.components.AbstractComponentDescriptor.create(AbstractComponentDescriptor.java:38) ~[annot8-components-base-text-1.0.0-plugin.jar:na]
	at io.annot8.common.components.AbstractComponentDescriptor.create(AbstractComponentDescriptor.java:10) ~[annot8-components-base-text-1.0.0-plugin.jar:na]
	at io.annot8.implementations.pipeline.SimplePipeline$Builder.lambda$build$1(SimplePipeline.java:319) ~[annot8-pipeline-implementation-1.0.1.jar!/:na]
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[na:na]
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[na:na]
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[na:na]
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na]
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[na:na]
	at io.annot8.implementations.pipeline.SimplePipeline$Builder.build(SimplePipeline.java:322) ~[annot8-pipeline-implementation-1.0.1.jar!/:na]
	at io.annot8.implementations.pipeline.InMemoryPipelineRunner.<init>(InMemoryPipelineRunner.java:76) ~[annot8-pipeline-implementation-1.0.1.jar!/:na]
	at uk.gov.dstl.baleen.services.PipelineService.createPipeline(PipelineService.java:263) ~[classes!/:3.0.1]
	at uk.gov.dstl.baleen.services.PipelineService.createPipelineFromFile(PipelineService.java:163) ~[classes!/:3.0.1]
	at uk.gov.dstl.baleen.services.PipelineService.detectChanges(PipelineService.java:202) ~[classes!/:3.0.1]
	at jdk.internal.reflect.GeneratedMethodAccessor87.invoke(Unknown Source) ~[na:na]
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
	at java.base/java.lang.reflect.Method.invoke(Method.java:567) ~[na:na]
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) ~[spring-context-5.2.9.RELEASE.jar!/:5.2.9.RELEASE]
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-5.2.9.RELEASE.jar!/:5.2.9.RELEASE]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[na:na]
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[na:na]
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
	at java.base/java.lang.Thread.run(Thread.java:835) ~[na:na]

I'm excited to try this tool - any help would be appreciated!

@jbaker-dstl
Collaborator

This looks like you've also added the OpenNLP NER processor to your pipeline (presumably in addition to the ones you mention from the tutorial)? Is that the case?

If so, it looks like an issue with the configuration and/or the OpenNLP model you're using (this should be caught better and a useful error message displayed - we can add that in to a future release of the component). What configuration are you using?
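
For illustration, the kind of up-front check that would turn this NullPointerException into a readable message might look like the sketch below. The stack trace suggests that a null file reached `FileInputStream`, which would be consistent with the model setting being left empty, but that is an inference, not something confirmed in this thread. `ModelPathCheck` and `requireModelFile` are hypothetical names and are not part of annot8 or OpenNLP.

```java
import java.io.File;

// Hypothetical helper, not part of annot8 or OpenNLP: validates a configured
// model path before anything tries to open it.
class ModelPathCheck {

    // Returns the model file, or throws with a message that names the problem.
    static File requireModelFile(String configuredPath) {
        if (configuredPath == null || configuredPath.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "No OpenNLP model file is configured for the NER processor");
        }
        File model = new File(configuredPath);
        if (!model.isFile()) {
            throw new IllegalArgumentException(
                "OpenNLP model file not found: " + model.getAbsolutePath());
        }
        return model;
    }

    public static void main(String[] args) {
        try {
            // Simulates a pipeline saved without a model path.
            requireModelFile(null);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing early like this keeps the cause (a missing or wrong path) in the message instead of leaving it buried in a constructor stack trace.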

@ghost
Author

ghost commented Nov 11, 2020

Thanks for the super quick feedback. I've removed the OpenNLP NER processor and have managed to get things working, but there's an interesting quirk.

When saving or replacing a pipeline I still get the error message:

ERROR: http://localhost:6413/api/v3/pipelines/Empty%20Pipeline2 404

But only for ~5-10 seconds. After that the page refreshes and it starts to show the view of the pipeline running. This period while the error message persists corresponds to the period in the log below between:

Pipeline Empty Pipeline persisted to pipelines/24873bd0-f154-41bc-a883-8b96b80c9552.json
and
ENTRY_MODIFY event detected on path 24873bd0-f154-41bc-a883-8b96b80c9552.json

Having played with it a few times, it seems like the error will display for the period before the ENTRY_MODIFY event is detected, which seems to be 5-10 seconds on my system (MacBook Pro on macOS 10.15 with Java 12.0.2).

2020-11-11 09:06:20.800  INFO 40807 --- [           main] uk.gov.dstl.baleen.Baleen                : Started Baleen in 4.928 seconds (JVM running for 5.868)
2020-11-11 09:06:33.025  INFO 40807 --- [nio-6413-exec-1] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring DispatcherServlet 'dispatcherServlet'
2020-11-11 09:06:33.026  INFO 40807 --- [nio-6413-exec-1] o.s.web.servlet.DispatcherServlet        : Initializing Servlet 'dispatcherServlet'
2020-11-11 09:06:33.033  INFO 40807 --- [nio-6413-exec-1] o.s.web.servlet.DispatcherServlet        : Completed initialization in 7 ms
2020-11-11 09:07:06.203  INFO 40807 --- [nio-6413-exec-3] u.g.d.baleen.services.PipelineService    : Persisting pipeline Empty Pipeline
2020-11-11 09:07:06.205  INFO 40807 --- [nio-6413-exec-3] u.g.d.baleen.services.PipelineService    : Pipeline Empty Pipeline persisted to pipelines/24873bd0-f154-41bc-a883-8b96b80c9552.json
2020-11-11 09:07:10.815  INFO 40807 --- [   scheduling-1] u.g.d.baleen.services.PipelineService    : ENTRY_MODIFY event detected on path 24873bd0-f154-41bc-a883-8b96b80c9552.json
2020-11-11 09:07:10.816  INFO 40807 --- [   scheduling-1] u.g.d.baleen.services.PipelineService    : Creating pipeline for file 24873bd0-f154-41bc-a883-8b96b80c9552.json
2020-11-11 09:07:10.820  INFO 40807 --- [   scheduling-1] u.g.d.baleen.services.PipelineService    : Creating pipeline Empty Pipeline
2020-11-11 09:07:10.855  INFO 40807 --- [   scheduling-1] u.g.d.baleen.services.PipelineService    : Pipeline Empty Pipeline created on thread Thread-2
2020-11-11 09:07:10.855  INFO 40807 --- [       Thread-2] i.a.i.pipeline.InMemoryPipelineRunner    : Pipeline Empty Pipeline started
2020-11-11 09:08:34.080  INFO 40807 --- [nio-6413-exec-5] u.g.d.baleen.services.PipelineService    : Deleting pipeline Empty Pipeline

Thanks for your help!
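
As a quick check on the timing, the gap between the "persisted" and "ENTRY_MODIFY" log lines above can be computed directly from their timestamps. This is only a small sketch for reading the log; `LogGap` and `millisBetween` are hypothetical names, and the timestamp pattern is assumed from the log lines quoted in this comment.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for measuring the gap between two log lines.
class LogGap {

    // Timestamp layout as it appears in the log output quoted above.
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    // Milliseconds elapsed from the first timestamp to the second.
    static long millisBetween(String first, String second) {
        LocalDateTime start = LocalDateTime.parse(first, LOG_TS);
        LocalDateTime end = LocalDateTime.parse(second, LOG_TS);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        long gap = millisBetween(
            "2020-11-11 09:07:06.205",   // "persisted to pipelines/..." line
            "2020-11-11 09:07:10.815");  // "ENTRY_MODIFY event detected" line
        System.out.println(gap + " ms"); // prints "4610 ms"
    }
}
```

For this particular run the gap is about 4.6 seconds, slightly under the 5-10 seconds estimated by eye.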

@jbaker-dstl
Collaborator

Yes, this is a known bug that we haven't got round to fixing yet. Basically, the browser is too quick for the pipeline and tries to display the page before the pipeline has finished initialising. We ought to add a loading page for this period, I'll add that as a separate issue to be worked on.
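
Until a loading page exists, a client could work around the race by retrying while the pipeline endpoint still returns 404. The sketch below shows only that retry logic, written in Java for consistency with the rest of this thread; the real UI runs in the browser, and `PipelinePoller` and `waitUntilAvailable` are hypothetical names, not part of Baleen. The status check is passed in as a function so the example runs without a server.

```java
import java.util.function.IntSupplier;

// Hypothetical retry helper: keeps checking a status code until it stops being 404.
class PipelinePoller {

    // Returns true once the check yields a 2xx status, false if it yields any
    // other non-404 status, runs out of attempts, or the thread is interrupted.
    static boolean waitUntilAvailable(IntSupplier statusCheck, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int status = statusCheck.getAsInt();
            if (status != 404) {
                return status >= 200 && status < 300;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Fake endpoint: 404 for the first two checks, then 200.
        int[] calls = {0};
        IntSupplier fakeStatus = () -> ++calls[0] < 3 ? 404 : 200;
        System.out.println(waitUntilAvailable(fakeStatus, 10, 100L));
    }
}
```

In a real client the `IntSupplier` would wrap a GET request to the pipeline URL shown in the error message, with a delay and attempt limit chosen to cover the few seconds it takes for the pipeline to be created.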
