Figure out scaling limits of current prototype #310

Open
RickMoynihan opened this issue Dec 5, 2023 · 2 comments
@RickMoynihan
Member

Create a synthetic dataset of a fixed schema and width W, with a large number of rows R.

Suggested initial sizes:

  • W = 15 columns wide (14 dims, 1 measure)
  • R = 10 million (can later adjust this by orders of magnitude)

The dataset should still be a valid cube, i.e. every combination of dimension values (dimvals) should be unique, with exactly one measure value for each combination.
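
One way to guarantee the uniqueness constraint is to derive each row's dimension values from its row index, treating the index as a mixed-radix number. The sketch below is only an illustration: the function name `cube_rows`, the choice of 4 values per dimension, and the placeholder measure are all assumptions, not part of the prototype.

```python
from math import prod


def cube_rows(cardinalities, n_rows):
    """Yield n_rows tuples of (dim_0, ..., dim_k, measure).

    Each row index is decoded as a mixed-radix number, one digit per
    dimension, so no two rows share the same dimension values.
    """
    if n_rows > prod(cardinalities):
        raise ValueError("not enough distinct dimension combinations")
    for i in range(n_rows):
        row, rest = [], i
        for card in cardinalities:
            rest, digit = divmod(rest, card)
            row.append(digit)
        row.append(float(i))  # placeholder measure value (assumption)
        yield tuple(row)


# Assumption: 14 dimensions with 4 possible values each.
# 4**14 = 268,435,456 combinations, enough for R = 10m or 100m.
CARDINALITIES = [4] * 14
```

Because the rows are generated lazily, they can be streamed to a CSV writer without holding the whole dataset in memory on the generating side.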

We can then use this as a basis for a number of tests:

  1. Can we load that dataset in a single commit, and get back out without error?
  2. Can we make that dataset an order of magnitude bigger (i.e. 100m) and still load it?
  3. If so, can we take that dataset of 100m rows, split it into 10 append commits of 10m rows each, and load it?
  4. Taking the largest size that works (e.g. 10m), can we make it fall over by adding 10m rows, deleting every row, and adding them all back, repeated a number of times?
  5. How many commits (just appends) does it take to fall over? e.g. do we fall over at 100k append commits with 10 rows in each commit?
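
For tests 3 and 5, the row ranges for each append commit can be planned up front. This is a minimal sketch; `append_batches` is a hypothetical helper, and how each slice is actually posted to the prototype is left out.

```python
def append_batches(total_rows, n_commits):
    """Split range(total_rows) into n_commits contiguous (start, stop) slices.

    Any remainder is spread over the first few batches, so batch sizes
    differ by at most one row.
    """
    base, extra = divmod(total_rows, n_commits)
    batches, start = [], 0
    for k in range(n_commits):
        stop = start + base + (1 if k < extra else 0)
        batches.append((start, stop))
        start = stop
    return batches


# Test 3: 100m rows as 10 appends of 10m rows each.
test_3 = append_batches(100_000_000, 10)

# Test 5: 100k appends of 10 rows each is only 1m rows in total,
# so it stresses commit count rather than data volume.
test_5 = append_batches(1_000_000, 100_000)
```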

We can run the tests above under a profiler such as jvisualvm to see whether failures are caused by bugs or are simply limitations of our in-memory approach.
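
As a rough baseline before profiling, the raw cell storage can be estimated from the row and column counts. The bytes-per-cell figures below are assumptions (4 for int-encoded values, 8 for longs or doubles), and the result is a lower bound: it ignores object overhead, indexes, and any temporary copies made while loading.

```python
def raw_cell_bytes(rows, columns, bytes_per_cell):
    """Lower-bound size of the cell data alone, in bytes."""
    return rows * columns * bytes_per_cell


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / 2**20


# R = 10m rows, W = 15 columns, 4 bytes per cell (assumption).
baseline = raw_cell_bytes(10_000_000, 15, 4)      # 600,000,000 bytes
# The same shape at 100m rows.
ten_times = raw_cell_bytes(100_000_000, 15, 4)    # 6,000,000,000 bytes
```

Comparing these figures with the configured JVM heap (`-Xmx`) gives a quick sense of whether a given size could fit at all, independent of any bugs.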

This is a precursor task to choosing a database for the table store.

@xdrcft8000
Contributor

xdrcft8000 commented Dec 14, 2023

Error 1
This is the stack trace for the HTTP 500 error returned when posting a dataset that is too large:
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOf(Arrays.java:3585)
at ham_fisted.ArrayLists$IntArrayList.ensureCapacity(ArrayLists.java:966)
at ham_fisted.ArrayLists$IntArrayList.addLong(ArrayLists.java:973)
at ham_fisted.LongMutList$2.invokePrim(LongMutList.java:72)
at ham_fisted.IFnDef$OLO.invoke(IFnDef.java:565)
at clojure.core.protocols$naive_seq_reduce.invokeStatic(protocols.clj:62)
at clojure.core.protocols$interface_or_naive_reduce.invokeStatic(protocols.clj:72)
at clojure.core.protocols$fn__8249.invokeStatic(protocols.clj:169)
at clojure.core.protocols$fn__8249.invoke(protocols.clj:124)
at clojure.core.protocols$fn__8204$G__8199__8213.invoke(protocols.clj:19)
at clojure.core.protocols$seq_reduce.invokeStatic(protocols.clj:31)
at clojure.core.protocols$fn__8236.invokeStatic(protocols.clj:75)
at clojure.core.protocols$fn__8236.invoke(protocols.clj:75)
at clojure.core.protocols$fn__8178$G__8173__8191.invoke(protocols.clj:13)
at ham_fisted.Reductions.serialReduction(Reductions.java:84)
at ham_fisted.LongMutList.addAllReducible(LongMutList.java:70)
at ham_fisted.ArrayLists$IntArrayList.addAllReducible(ArrayLists.java:999)
at tech.v3.datatype.array_buffer$array_sub_list.invokeStatic(array_buffer.clj:652)
at tech.v3.datatype.array_buffer$array_sub_list.invoke(array_buffer.clj:628)
at tech.v3.datatype.copy_make_container$eval45276$fn__45277.invoke(copy_make_container.clj:38)
at clojure.lang.MultiFn.invoke(MultiFn.java:244)
at tech.v3.datatype.copy_make_container$make_container.invokeStatic(copy_make_container.clj:105)
at tech.v3.datatype.copy_make_container$make_container.invoke(copy_make_container.clj:96)
at tech.v3.datatype.copy_make_container$__GT_array.invokeStatic(copy_make_container.clj:176)
at tech.v3.datatype.copy_make_container$__GT_array.invoke(copy_make_container.clj:158)
at tech.v3.datatype.copy_make_container$__GT_int_array.invokeStatic(copy_make_container.clj:212)
at tech.v3.datatype.copy_make_container$__GT_int_array.invoke(copy_make_contai
