NullPointerException in natural_join #2897

Closed
chipkent opened this issue Sep 22, 2022 · 5 comments · Fixed by #2906
Labels: bug, query engine

@chipkent
Member

The query looks like:

    rst = rst \
        .join(dates_downsampled, joins="date") \
        .natural_join(raw, on=["date", "Sym1=symbol"], joins=f"Change1={col_ret}") \
        .natural_join(raw, on=["date", "Sym2=symbol"], joins=f"Change2={col_ret}")

Deephaven Version: 0.16.1

Error log (2022-09-22 07:44:59 MDT):

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/deephaven/table.py", line 757, in natural_join
    table.j_table, ",".join(on), ",".join(joins)
RuntimeError: java.lang.NullPointerException
	at io.deephaven.engine.table.impl.sources.AbstractLongArraySource.set(AbstractLongArraySource.java:57)
	at io.deephaven.engine.table.impl.by.typed.staticopen.gen.cm22093324970514590v55_0.StaticNaturalJoinHasherObjectObject.decorateLeftSide(StaticNaturalJoinHasherObjectObject.java:124)
	at io.deephaven.engine.table.impl.naturaljoin.StaticNaturalJoinStateManagerTypedBase$LeftProbeHandler.doProbe(StaticNaturalJoinStateManagerTypedBase.java:117)
	at io.deephaven.engine.table.impl.naturaljoin.StaticNaturalJoinStateManagerTypedBase.probeTable(StaticNaturalJoinStateManagerTypedBase.java:219)
	at io.deephaven.engine.table.impl.naturaljoin.StaticNaturalJoinStateManagerTypedBase.decorateLeftSide(StaticNaturalJoinStateManagerTypedBase.java:153)
	at io.deephaven.engine.table.impl.NaturalJoinHelper.naturalJoinInternal(NaturalJoinHelper.java:266)
	at io.deephaven.engine.table.impl.NaturalJoinHelper.naturalJoin(NaturalJoinHelper.java:44)
	at io.deephaven.engine.table.impl.NaturalJoinHelper.naturalJoin(NaturalJoinHelper.java:37)
	at io.deephaven.engine.table.impl.QueryTable.naturalJoinInternal(QueryTable.java:1767)
	at io.deephaven.engine.table.impl.QueryTable.lambda$naturalJoin$45(QueryTable.java:1758)
	at io.deephaven.engine.table.impl.perf.QueryPerformanceRecorder.withNugget(QueryPerformanceRecorder.java:445)
	at io.deephaven.engine.table.impl.QueryTable.naturalJoin(QueryTable.java:1756)
	at io.deephaven.engine.table.impl.TableDefaults.naturalJoin(TableDefaults.java:433)
	at io.deephaven.engine.table.impl.TableDefaults.naturalJoin(TableDefaults.java:34)
	at io.deephaven.api.TableOperationsDefaults.naturalJoin(TableOperationsDefaults.java:133)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/scripts/downloader.py", line 54, in <module>
    data_source_pairs=data_source_pairs)
  File "/usr/local/lib/python3.7/site-packages/cecropia/pairs/selection.py", line 106, in pairs_correlations
    .natural_join(raw, on=["date", "Sym1=symbol"], joins=f"Change1={col_ret}") \
  File "/usr/local/lib/python3.7/site-packages/deephaven/table.py", line 765, in natural_join
    raise DHError(e, "table natural_join operation failed.") from e
deephaven.dherror.DHError: table natural_join operation failed. : RuntimeError: java.lang.NullPointerException
chipkent added the bug and triage labels Sep 22, 2022
chipkent added this to the Sept 2022 milestone Sep 22, 2022
@rcaudy
Member

rcaudy commented Sep 22, 2022

This happens while decorating from the left after building the hash table from the right, which is the path taken with a static RHS and an LHS that is either ticking or bigger than the RHS.
We just fixed a bug here wherein we were sizing the hash table improperly. Maybe the prior fix exposed a new and exciting ensure-capacity bug?
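
For readers less familiar with the two phases named above, here is a minimal sketch of "build from the right, then decorate the left". It is a plain-dict illustration only, not Deephaven's typed open-addressed hasher; the function name `natural_join_row_map` and the sample keys are hypothetical.

```python
from typing import Hashable, List, Optional, Sequence


def natural_join_row_map(
    left_keys: Sequence[Hashable], right_keys: Sequence[Hashable]
) -> List[Optional[int]]:
    """For each left row, return the index of the matching right row, or None."""
    # Build phase: hash every right-hand key to its row index.
    right_index = {}
    for row, key in enumerate(right_keys):
        if key in right_index:
            # A natural join allows at most one right-hand match per key.
            raise ValueError(f"duplicate right-hand key {key!r}")
        right_index[key] = row

    # Decorate phase: probe the hash table once per left-hand row and
    # record the matching right row (None when there is no match).
    return [right_index.get(key) for key in left_keys]


left = [("2022-09-21", "AAPL"), ("2022-09-21", "MSFT"), ("2022-09-22", "AAPL")]
right = [("2022-09-21", "AAPL"), ("2022-09-22", "AAPL")]
row_map = natural_join_row_map(left, right)
print(row_map)
```

The decorate phase writes one entry per left-hand row, so its output is as large as the LHS no matter how small the RHS hash table is.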

@rcaudy
Member

rcaudy commented Sep 22, 2022

It looks to me like io.deephaven.engine.table.impl.naturaljoin.StaticNaturalJoinStateManagerTypedBase.LeftProbeHandler#doProbe handles this correctly. Still digging.

@rcaudy
Member

rcaudy commented Sep 22, 2022

@chipkent I'm going to need more help from you to reproduce this:

  1. Were both sides static, or just the RHS?
  2. How big are the tables?
  3. How many unique key values were there?
  4. What were the types of the key columns? It looks like 2 Object columns to me, from the stack trace.
  5. Was there a pattern to how the keys were distributed?

Ideally, you could give me a script that reproduces this.

@chipkent
Member Author

chipkent commented Sep 23, 2022

@rcaudy After a few hours of messing with it, I have created a compact reproducer. The input files are large and proprietary. Talk to me tomorrow, and we can figure out how to get them to you.

    from deephaven import parquet

    # Name of the return column joined in from `raw`.
    diff_days = 4
    col_ret = f"return_split_div_adj_{diff_days}d"

    # Proprietary input files (not attached to the issue).
    raw_filename = "/nfs/data/quandl/2022-09-22/DEBUG_RAW.parquet"
    dbg1_filename = "/nfs/data/quandl/2022-09-22/DEBUG_DBG1.parquet"

    raw = parquet.read(path=raw_filename)
    dbg1 = parquet.read(path=dbg1_filename)

    # Observed row counts:
    # dbg1.size = 3787803476
    # raw.size = 40610683

    # Joining the large left-hand table against `raw` triggers the NullPointerException.
    rst = dbg1 \
        .natural_join(raw, on=["date", "Sym1=symbol"], joins=f"Change1={col_ret}") \
        .natural_join(raw, on=["date", "Sym2=symbol"], joins=f"Change2={col_ret}")

error message:

r-Scheduler-Serial-1 | i.d.s.s.SessionState      | Internal Error '377a7cc5-eddc-49ac-bb76-88e0fec1f08c' java.lang.RuntimeException: Error in Python interpreter:
Type: <class 'deephaven.dherror.DHError'>
Value: table natural_join operation failed. : RuntimeError: java.lang.NullPointerException
Traceback (most recent call last):
  File "/opt/deephaven-venv/lib/python3.7/site-packages/deephaven/table.py", line 757, in natural_join
    table.j_table, ",".join(on), ",".join(joins)
RuntimeError: java.lang.NullPointerException
	at io.deephaven.engine.table.impl.sources.AbstractLongArraySource.set(AbstractLongArraySource.java:57)
	at io.deephaven.engine.table.impl.by.typed.staticopen.gen.cm22093324970514590v55_0.StaticNaturalJoinHasherObjectObject.decorateLeftSide(StaticNaturalJoinHasherObjectObject.java:124)
	at io.deephaven.engine.table.impl.naturaljoin.StaticNaturalJoinStateManagerTypedBase$LeftProbeHandler.doProbe(StaticNaturalJoinStateManagerTypedBase.java:117)
	at io.deephaven.engine.table.impl.naturaljoin.StaticNaturalJoinStateManagerTypedBase.probeTable(StaticNaturalJoinStateManagerTypedBase.java:219)
	at io.deephaven.engine.table.impl.naturaljoin.StaticNaturalJoinStateManagerTypedBase.decorateLeftSide(StaticNaturalJoinStateManagerTypedBase.java:153)
	at io.deephaven.engine.table.impl.NaturalJoinHelper.naturalJoinInternal(NaturalJoinHelper.java:266)
	at io.deephaven.engine.table.impl.NaturalJoinHelper.naturalJoin(NaturalJoinHelper.java:44)
	at io.deephaven.engine.table.impl.NaturalJoinHelper.naturalJoin(NaturalJoinHelper.java:37)
	at io.deephaven.engine.table.impl.QueryTable.naturalJoinInternal(QueryTable.java:1767)
	at io.deephaven.engine.table.impl.QueryTable.lambda$naturalJoin$45(QueryTable.java:1758)
	at io.deephaven.engine.table.impl.perf.QueryPerformanceRecorder.withNugget(QueryPerformanceRecorder.java:445)
	at io.deephaven.engine.table.impl.QueryTable.naturalJoin(QueryTable.java:1756)
	at io.deephaven.engine.table.impl.UncoalescedTable.naturalJoin(UncoalescedTable.java:337)
	at io.deephaven.engine.table.impl.TableDefaults.naturalJoin(TableDefaults.java:433)
	at io.deephaven.engine.table.impl.TableDefaults.naturalJoin(TableDefaults.java:34)
	at io.deephaven.api.TableOperationsDefaults.naturalJoin(TableOperationsDefaults.java:133)
	at org.jpy.PyLib.executeCode(Native Method)
	at org.jpy.PyObject.executeCode(PyObject.java:138)
	at io.deephaven.engine.util.PythonEvaluatorJpy.evalScript(PythonEvaluatorJpy.java:73)
	at io.deephaven.integrations.python.PythonDeephavenSession.lambda$evaluate$1(PythonDeephavenSession.java:185)
	at io.deephaven.util.locks.FunctionalLock.doLockedInterruptibly(FunctionalLock.java:49)
	at io.deephaven.integrations.python.PythonDeephavenSession.evaluate(PythonDeephavenSession.java:184)
	at io.deephaven.engine.util.AbstractScriptSession.lambda$evaluateScript$1(AbstractScriptSession.java:146)
	at io.deephaven.engine.context.ExecutionContext.lambda$apply$0(ExecutionContext.java:117)
	at io.deephaven.engine.context.ExecutionContext.apply(ExecutionContext.java:128)
	at io.deephaven.engine.context.ExecutionContext.apply(ExecutionContext.java:116)
	at io.deephaven.engine.util.AbstractScriptSession.evaluateScript(AbstractScriptSession.java:146)
	at io.deephaven.engine.util.DelegatingScriptSession.evaluateScript(DelegatingScriptSession.java:87)
	at io.deephaven.engine.util.ScriptSession.evaluateScript(ScriptSession.java:113)
	at io.deephaven.server.console.ConsoleServiceGrpcImpl.lambda$executeCommand$8(ConsoleServiceGrpcImpl.java:170)
	at io.deephaven.server.session.SessionState$ExportBuilder.lambda$submit$2(SessionState.java:1308)
	at io.deephaven.server.session.SessionState$ExportObject.doExport(SessionState.java:856)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at io.deephaven.server.runner.DeephavenApiServerModule$ThreadFactory.lambda$newThread$0(DeephavenApiServerModule.java:156)
	at java.base/java.lang.Thread.run(Thread.java:829)


Line: 765
Namespace: natural_join
File: /opt/deephaven-venv/lib/python3.7/site-packages/deephaven/table.py
Traceback (most recent call last):
  File "<string>", line 16, in <module>
  File "/opt/deephaven-venv/lib/python3.7/site-packages/deephaven/table.py", line 765, in natural_join

        at org.jpy.PyLib.executeCode(PyLib.java:-2)
        at org.jpy.PyObject.executeCode(PyObject.java:138)
        at io.deephaven.engine.util.PythonEvaluatorJpy.evalScript(PythonEvaluatorJpy.java:73)
        at io.deephaven.integrations.python.PythonDeephavenSession.lambda$evaluate$1(PythonDeephavenSession.java:185)
        at io.deephaven.util.locks.FunctionalLock.doLockedInterruptibly(FunctionalLock.java:49)
        at io.deephaven.integrations.python.PythonDeephavenSession.evaluate(PythonDeephavenSession.java:184)
        at io.deephaven.engine.util.AbstractScriptSession.lambda$evaluateScript$1(AbstractScriptSession.java:146)
        at io.deephaven.engine.context.ExecutionContext.lambda$apply$0(ExecutionContext.java:117)
        at io.deephaven.engine.context.ExecutionContext.apply(ExecutionContext.java:128)
        at io.deephaven.engine.context.ExecutionContext.apply(ExecutionContext.java:116)
        at io.deephaven.engine.util.AbstractScriptSession.evaluateScript(AbstractScriptSession.java:146)
        at io.deephaven.engine.util.DelegatingScriptSession.evaluateScript(DelegatingScriptSession.java:87)
        at io.deephaven.engine.util.ScriptSession.evaluateScript(ScriptSession.java:113)
        at io.deephaven.server.console.ConsoleServiceGrpcImpl.lambda$executeCommand$8(ConsoleServiceGrpcImpl.java:170)
        at io.deephaven.server.session.SessionState$ExportBuilder.lambda$submit$2(SessionState.java:1308)
        at io.deephaven.server.session.SessionState$ExportObject.doExport(SessionState.java:856)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at io.deephaven.server.runner.DeephavenApiServerModule$ThreadFactory.lambda$newThread$0(DeephavenApiServerModule.java:156)
        at java.lang.Thread.run(Thread.java:829)

chipkent changed the title from "NullPointerException in `natural_join" to "NullPointerException in natural_join" Sep 23, 2022
@rcaudy
Member

rcaudy commented Sep 23, 2022

This is an int overflow issue when dealing with static RHS + large LHS.
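
To make the diagnosis concrete: the row counts from the reproducer show why a 32-bit Java `int` cannot hold the left-hand size. The thread does not say where in the engine the overflow occurs, so this sketch only mimics Java's narrowing `(int)` cast in Python; `to_java_int` is a hypothetical helper, not a Deephaven function.

```python
INT_MAX = 2**31 - 1  # Java's Integer.MAX_VALUE


def to_java_int(value: int) -> int:
    """Mimic a Java narrowing cast to int: keep the low 32 bits as two's complement."""
    return (value + 2**31) % 2**32 - 2**31


lhs_rows = 3_787_803_476  # dbg1.size from the reproducer
rhs_rows = 40_610_683     # raw.size from the reproducer

print(lhs_rows > INT_MAX)     # True: the LHS row count does not fit in an int
print(to_java_int(lhs_rows))  # -507163820: wraps to a negative value
print(to_java_int(rhs_rows))  # 40610683: the RHS fits, so it is unaffected
```

Any size or position derived from the left-hand row count and held in an `int` would wrap like this, which would fit a failure that only shows up with a very large LHS.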
