
Identify and document scalability benchmarks #74

Open
mortonjt opened this issue Oct 11, 2018 · 4 comments
@mortonjt (Collaborator)

Empress needs to be run against a huge tree (> 1 million tips)
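Before a real million-tip tree is on hand, a synthetic tree can serve as a benchmarking stand-in. A minimal sketch in pure Python (the `balanced_newick` helper and the depth values are illustrative, not part of Empress or skbio):

```python
from itertools import count

def balanced_newick(depth, labels=None):
    """Build a Newick string for a balanced binary tree with 2**depth tips."""
    if labels is None:
        labels = count(1)
    if depth == 0:
        return "t%d" % next(labels)
    left = balanced_newick(depth - 1, labels)
    right = balanced_newick(depth - 1, labels)
    return "(%s,%s)" % (left, right)

def count_tips(newick):
    # a binary Newick tree with n tips contains exactly n - 1 commas
    return newick.count(",") + 1

nwk = balanced_newick(10) + ";"   # 2**10 = 1024 tips; depth 20 gives ~1M tips
print(count_tips(nwk))            # 1024
# skbio could then parse it, e.g. TreeNode.read(io.StringIO(nwk))
```

Scaling the depth parameter up gives tip counts past the >1M target without waiting on a real fragment-insertion run.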

@antgonza (Collaborator) commented Feb 3, 2020

Just wondering if there are any updates on this issue; thank you.

@antgonza (Collaborator)

I installed the latest version of Empress within a QIIME 2 2020.2 conda environment and ran it on one of the large trees generated in Qiita, using the mapping file, feature table, and taxonomies from the Moving Pictures dataset (only one dataset).

Note that this tree was created over a year ago (we could generate an even larger one today); it is the 100 bp fragment-insertion tree and has ~8.8M tips:

```python
In [1]: from skbio import TreeNode
In [2]: tree = TreeNode.read('../insertion_tree.relabelled.tre')
In [3]: print(tree.count(tips=True))
8830174
```

I generated the Empress .qzv's with no taxonomy, with Greengenes, and with SILVA taxonomy to test; each takes ~3 hrs to generate, and generation works just fine (no error messages). However, when I try to open them in https://view.qiime2.org/, the browser fails with:
[screenshot: qiime2-view-error]
and if I unzip the .qzv and try to open index.html or empress.html directly, I get:
[screenshot: error-opening-directly]
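Since a .qzv is just a zip archive, the bundled HTML can be located without a browser at all, which helps separate "generation failed" from "rendering failed". A minimal sketch using only the standard library (the `huge-tree.qzv` filename is illustrative):

```python
import zipfile

def list_html_assets(qzv):
    """Return the paths of HTML files bundled inside a QIIME 2 .qzv archive.

    Accepts a filesystem path or a file-like object, as zipfile does.
    """
    with zipfile.ZipFile(qzv) as zf:
        return [name for name in zf.namelist() if name.endswith(".html")]

# e.g. list_html_assets("huge-tree.qzv")
```

If the HTML assets are present and intact, the failure is on the rendering side (browser memory/parsing), not in qzv generation.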

Anyway, here are the testing files.

cc: @ElDeveloper

@kwcantrell (Collaborator)

@antgonza I'm looking into this

@fedarko (Collaborator) commented Aug 10, 2020

Once we identify upper bounds for what sorts of data sizes Empress can comfortably visualize, we should document this clearly in the README so that e.g. users with billion-tip trees know that they probably want to consult another tool and/or a priest ._.

@fedarko fedarko changed the title Scalability benchmarks Identify and document scalability benchmarks Aug 10, 2020