Render server resource recommendation #118
Hi Martin,

Glad to hear you plan to set up dedicated resources for render. As you might guess, everything depends upon your expected usage. Do you have any idea at this point how much data you plan to manage and how many concurrent users/clients you expect to have accessing the data in render? For data sizing, what matters is how many tile specs you plan to support actively at one time (not the source image sizes).

Janelia's peak data level to date is: I recently archived a bunch of older data we no longer need to be actively available in preparation for our next big reconstruction effort. We currently have:

At Janelia, there are really only a few (<5) folks working with render directly at any time. Once the reconstruction alignments are done by specialists, the aligned data is materialized to disk (png, n5, ...) and that data is imported into downstream systems that often support large numbers of users (e.g. tracers). The alignment specialists do, however, run batch jobs on our cluster that can issue many millions of requests to the render web services. So, I guess the key would be to get an estimate of how many concurrent alignments (of the "average" stack with ? tile specs) you plan to support.

Unless your expected data and concurrent usage is very small, you will want a dedicated box for the mongodb database and then additional server(s) - physical and/or virtual machines - to host the render web services.

Re: Do many CPUs help in handling the requests?

Yes, more CPUs will help manage more concurrent requests. Our setup at Janelia has a beefy centralized database server with lots of CPUs and lots of memory, one not-quite-so-large centralized web services host for pure data requests (not rendering), and then several small and medium sized VMs for server-side rendering and other things.

The large scale cluster jobs use clients that do the heavy lifting (e.g. rendering) client-side. This is the key to scalability: relatively small metadata is fetched from the web services and then the distributed clients read source pixels from network storage and do the hard work locally. Although the web services support server-side rendering for small use cases, that won't scale with a centralized host.

So - I've put a lot of info here without giving you a specific answer. Let's start with some data and usage estimates so we can work towards an appropriately sized system.
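[Editor's note: a minimal sketch of the "small metadata from the web service, heavy lifting on the client" pattern described above. The endpoint path and the tile-spec field names are assumptions for illustration, not a verified render-ws contract.]

```python
"""
Minimal sketch of the client-side pattern: fetch small JSON tile specs from
the render web services, then read the actual pixels from network storage
and do the heavy work locally.  Host, endpoint path, and tile-spec field
names are assumptions, not the exact render-ws API.
"""
import json
from urllib.request import urlopen

BASE_URL = "http://render-host:8080/render-ws/v1"   # assumed host/port
OWNER, PROJECT, STACK, Z = "demo", "demo_project", "v1_acquire", 1.0

# 1) small metadata request to the web service (cheap, scales well)
specs_url = f"{BASE_URL}/owner/{OWNER}/project/{PROJECT}/stack/{STACK}/z/{Z}/tile-specs"
with urlopen(specs_url) as response:
    tile_specs = json.load(response)

# 2) heavy lifting stays on the client: read pixels straight from storage
for spec in tile_specs:
    image_url = spec["mipmapLevels"]["0"]["imageUrl"]   # assumed field layout
    if image_url.startswith("file://"):
        local_path = image_url[len("file://"):]
        # ... open local_path with your imaging library and render/transform here
        print(f"{spec['tileId']}: would render {local_path} locally")
```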
Hi Eric,

Thanks for the detailed explanations. Our use case would be stacks ranging from 10k to 1M tiles each, so already significantly smaller than yours. Parallel alignment tasks will also be at most 2-3. I expect one alignment and export run to last a working day or less, which reduces parallel load. We will do the client-side tasks on our cluster or on dedicated machines. Looking at your numbers and the difference in dataset sizes, I think I might start with a single machine running both render-ws and the DB. I see that your DB machine has rather large memory. Would it hold the whole DB there?

Also, one goal of my work would be to provide a comprehensive entry package for using the Render environment for aligning fairly generic volume EM data of different kinds. Ideally this could be more or less easily deployed to other institutes/infrastructures that would like to use it as well. To lower the barrier to entry, I would suggest keeping the basic standard setup rather simple.

What are the differences between the "renderer-data" large VMs and the smaller ones? They all run separate instances of the render web services and connect to the single database, correct? Do you connect to different instances for different client scripts, or are they dedicated to taking over different tasks during a single client call? Also, I don't quite understand yet what exactly is done server-side.

Thanks!
Hi Martin,

re: database hardware architecture ...

Yes, the DB machine needs enough disk storage to hold everything in render and enough memory to hold the data (stacks and match collections) being actively accessed. Theoretically, you can reduce database disk storage requirements by creating bson dumps for inactive stack data, moving the dumps to cheaper "offline" storage, and removing the inactive data from the database. This is a bit of a hassle though, so I've only done it a couple of times, once I knew we were really done with a large project.

Although it is tempting to co-host the database and the web server, I would not recommend this long-term unless you are managing a fairly small amount of data (in total). I suppose you could start with everything together and separate components later when things grow and break down.

How long do you think you'll need to hold onto the 10k - 1M tile stacks? If most/all of your projects are small and independent, then it might make sense to have lots of small (co-located) database + web service deployments where each deployment is dedicated to a project.
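[Editor's note: the "dump inactive data, move it offline, then remove it" workflow mentioned above could look roughly like the sketch below. The host, database, and collection names are placeholders and may not match render's actual MongoDB schema; mongodump and mongosh must be on the PATH.]

```python
"""
Rough sketch of archiving an inactive collection: dump it compressed to
cheaper storage, then drop it to reclaim database disk.  Names below are
placeholders, not render's actual schema.
"""
import subprocess
from pathlib import Path

MONGO_HOST = "render-db-host:27017"                     # assumed
DATABASE = "render"                                     # assumed database name
INACTIVE_COLLECTION = "old_project__v1_acquire__tile"   # placeholder collection
ARCHIVE_DIR = Path("/nearline/render-archives")         # cheaper offline storage

ARCHIVE_DIR.mkdir(parents=True, exist_ok=True)
archive_file = ARCHIVE_DIR / f"{INACTIVE_COLLECTION}.archive.gz"

# 1) dump the inactive collection to a compressed archive
subprocess.run(
    ["mongodump", "--host", MONGO_HOST, "--db", DATABASE,
     "--collection", INACTIVE_COLLECTION,
     f"--archive={archive_file}", "--gzip"],
    check=True,
)

# 2) only after the dump succeeds, drop the collection to reclaim disk
subprocess.run(
    ["mongosh", f"{MONGO_HOST}/{DATABASE}", "--eval",
     f"db.getCollection('{INACTIVE_COLLECTION}').drop()"],
    check=True,
)
```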
re: comprehensive entry package

I like this idea and think a Docker containerized deployment is the best way to set this up. Maybe we can revisit render's current all-in-one Docker example to make it more user friendly and suitable for small scale use.

To determine whether a particular setup is performing well, the easiest thing to do is run some typical jobs (import, match generation, materialize to disk) and then look at the web services access log (jetty_base/logs/-/-access.log). The log will show response times for each request, which will typically be in the 0 to 5 ms range (20-50 ms for larger requests) if the system is running well.
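[Editor's note: a small sketch for turning that access log into a quick health check. It assumes the latency in milliseconds is the last whitespace-separated field on each line, which may not match your jetty log format exactly.]

```python
"""
Summarize response times from a Jetty access log as a quick health check.
Assumes the last field on each line is the request latency in ms; adjust
the parsing to your actual log format.
"""
import statistics
import sys

def summarize(log_path: str) -> None:
    latencies_ms = []
    with open(log_path) as log:
        for line in log:
            fields = line.split()
            if fields and fields[-1].isdigit():
                latencies_ms.append(int(fields[-1]))
    if not latencies_ms:
        print("no parsable request latencies found")
        return
    latencies_ms.sort()
    p95 = latencies_ms[int(0.95 * (len(latencies_ms) - 1))]
    print(f"requests: {len(latencies_ms)}")
    print(f"median:   {statistics.median(latencies_ms)} ms")
    print(f"95th pct: {p95} ms   (a healthy system mostly stays in the 0-50 ms range)")

if __name__ == "__main__":
    summarize(sys.argv[1])
```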
re: different sized web services VMs and server-side rendering

Yes, each VM hosts a separate render web services instance and they all connect to one centralized database. I separated the web servers to handle different types of tasks. The small VMs are dedicated to simple data retrieval - no server-side rendering. The larger VMs are used for server-side rendering. Finally, there is a large web services instance on real hardware that provides reliable large scale data access with no (or only very occasional, small) server-side rendering.

The main idea is to separate resource-intensive server-side rendering from pure data retrieval. The data retrieval services need to be always available, while the server-side rendering is "extra credit". Separating them onto different hosts means that too many server-side rendering requests won't cause trouble for concurrently running cluster jobs that only need data. Technically, almost everything can be done server-side (because the same libraries are used), but practically it works best when rendering is done client-side.
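[Editor's note: on the client side, this separation can be as simple as pointing different kinds of requests at different hosts. The hostnames and endpoint paths below are purely illustrative assumptions.]

```python
"""
Illustration of the host-separation idea: pure data retrieval and server-side
rendering go to different web-service instances (both backed by the same
central MongoDB), so heavy rendering requests cannot starve the data services.
Hostnames and endpoint paths are illustrative only.
"""
from urllib.request import urlopen

DATA_BASE_URL = "http://render-data.example.org:8080/render-ws/v1"        # assumed
RENDER_BASE_URL = "http://render-render.example.org:8080/render-ws/v1"    # assumed

def fetch(url: str) -> bytes:
    with urlopen(url) as response:
        return response.read()

# cheap metadata lookups go to the data-only instance ...
stack_json = fetch(f"{DATA_BASE_URL}/owner/demo/project/demo_project/stack/v1_acquire")

# ... while occasional server-side rendering goes to the "extra credit" instance
box_png = fetch(
    f"{RENDER_BASE_URL}/owner/demo/project/demo_project/stack/v1_acquire"
    "/z/1/box/0,0,1024,1024,0.5/png-image"
)
```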
Hi, I have a related question regarding the N5 export through hotknife.
Hi Martin,

The SparkConvertRenderStackToN5 client in the hotknife repository only pulls JSON tile specs from the render web service, and the transformed data is calculated client-side. This is why the system scales up well for large data sets. Each task running on a worker node uses the JSON tile specs to read 2D pixel data (typically from network storage), transform it in memory, and write it back out to disk in 3D voxel blocks. Technically, the pixel data could be read from another web service (e.g. s3 buckets) if the tile spec source URLs reference services instead of 'file://' paths.
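[Editor's note: a conceptual sketch of the per-worker task described above, using zarr as a stand-in for N5 and generic imaging libraries. The tile-spec fields and paths are simplified placeholders, the real transforms are omitted, and this is not the hotknife implementation.]

```python
"""
Conceptual per-worker task: use tile-spec metadata to read 2D source images,
transform them in memory, and write into 3D voxel blocks.  zarr stands in
for N5; fields and paths are placeholders; NOT the hotknife implementation.
"""
import imageio.v3 as iio
import zarr

# assumed output geometry and block size (Janelia-style 128x128x64 blocks)
volume = zarr.open(
    "export_sketch.zarr", mode="w",
    shape=(64, 8192, 8192),          # (z, y, x)
    chunks=(64, 128, 128),
    dtype="uint8",
)

# a worker task would receive a batch of tile specs (already fetched as JSON);
# the fields used here are simplified placeholders
tile_specs = [
    {"imageUrl": "file:///nrs/data/tile_0_0.png", "z": 0, "minX": 0, "minY": 0},
]

for spec in tile_specs:
    pixels = iio.imread(spec["imageUrl"].replace("file://", ""))  # 2D read from storage
    # ... apply the tile's transforms in memory here (omitted in this sketch) ...
    z, y0, x0 = spec["z"], spec["minY"], spec["minX"]
    volume[z, y0:y0 + pixels.shape[0], x0:x0 + pixels.shape[1]] = pixels
```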
That makes sense.
You have to look at each specific use case, but for 3D voxel rendering the key constraint is that you usually have to (repeatedly) load many larger xy 2D tiles to render a smaller 3D block. So it is likely that the process is IO bound and not CPU bound. Saalfeld included a tileSize parameter to mitigate the mismatch between source data slices and output blocks.

At Janelia, we typically use fairly small block sizes (128x128x64: 800K-ish each compressed?) for the n5 volumes, mostly because it makes viewing fast. No one worries too much about the compute cost to generate the volumes. The guy who designs and maintains our network file systems hates that we create zillions of tiny block files.
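[Editor's note: a quick back-of-the-envelope calculation for the block size quoted above, assuming 8-bit voxels (that dtype is an assumption; actual dtype and compression vary).]

```python
"""
Back-of-the-envelope numbers for 128x128x64 blocks, assuming uint8 voxels.
"""
block_shape = (128, 128, 64)
bytes_per_voxel = 1                      # uint8 assumption
raw_block_bytes = block_shape[0] * block_shape[1] * block_shape[2] * bytes_per_voxel
print(f"raw block size: {raw_block_bytes / 1024:.0f} KiB")   # 1024 KiB raw
# with lossless compression this lands roughly in the ~800 KiB range quoted above

# and the "zillions of tiny files" problem, for a hypothetical 8192 x 8192 x 6400 volume:
volume_shape = (8192, 8192, 6400)
n_blocks = 1
for dim, block in zip(volume_shape, block_shape):
    n_blocks *= -(-dim // block)         # ceiling division per axis
print(f"block files for that volume: {n_blocks:,}")          # 409,600 files
```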
Can I trust the memory display in the spark web UI (still the N5 export)?
You can trust it, but it doesn't mean what you think it does. From https://spark.apache.org/docs/latest/tuning.html#memory-management-overview:
The n5 generation spark jobs (and most of the render spark jobs) don't use spark for its traditional map-reduce features where data gets cached and shuffled around the cluster. The n5 and render jobs mostly just use spark for basic parallelization and fault tolerance (task retries), so things like storage memory aren't relevant.
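[Editor's note: a configuration sketch consistent with that usage pattern. Since nothing is cached or shuffled, executor memory and retry settings matter far more than storage-memory tuning. The values are illustrative assumptions, not recommendations.]

```python
"""
Sketch of Spark settings for jobs that use Spark only for parallelization and
task retries.  Values are illustrative, not recommendations.
"""
from pyspark import SparkConf, SparkContext

conf = (
    SparkConf()
    .setAppName("render-n5-export-sketch")
    .setMaster("local[4]")                     # normally supplied by spark-submit on the cluster
    .set("spark.executor.memory", "10g")       # heap for per-task image work
    .set("spark.executor.cores", "4")
    .set("spark.task.maxFailures", "8")        # fault tolerance via task retries
    # storage memory is largely irrelevant here: nothing is cached/persisted,
    # so the "Storage Memory" column in the web UI will look mostly unused
)
sc = SparkContext(conf=conf)
```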
Hi,
we are about to deploy a dedicated machine now for render.
What would be the recommended resources in terms of CPU/memory for reasonable performance? Do many CPUs help in handling the requests?
I think the most crucial factor will be the network connectivity to both the storage and compute systems, correct?