convert platybrowser data to zarr #1

tischi · 2020-10-16T08:16:24Z

All of those XML files point to n5s3 datasets: https://github.com/mobie/platybrowser-datasets/tree/master/data/1.0.1/images/remote
Within each XML file you can see all the information needed to access the object in the bucket.

It would be cool to have these converted to zarr:

- em raw
- em cell segmentation
  - corresponding table
- myosin expression

joshmoore · 2020-10-22T14:29:06Z

@tischi : want to add me to this repo so I can assign myself?

joshmoore · 2020-10-22T15:40:55Z

Do you have any info on the sizes of all these volumes? I'm running a recursive s3 ls now but it's taking ages.

tischi · 2020-10-22T16:08:27Z

> I'm running a recursive s3 ls now but it's taking ages.

That's the issue with these millions of small files...doing anything with them but lazy loading chunks is not much fun.
I will try to check....

tischi · 2020-10-22T16:12:02Z

...running now du -sh sbem-6dpf-1-whole-raw.n5 but that also takes time........................

tischi · 2020-10-22T16:17:36Z

2.0T sbem-6dpf-1-whole-raw.n5
18G sbem-6dpf-1-whole-segmented-cells.n5

and the other one is much smaller.
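
For future reference, a minimal sketch of getting both the file count and the total size in one pass over a local n5 container. It is only an illustration: `summarize_tree` is a hypothetical helper name, and it sums apparent file sizes, so the result can differ somewhat from what `du -sh` reports (which counts disk blocks).

```python
import os

def summarize_tree(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    count = 0
    total = 0
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Descend later; an explicit stack avoids deep recursion.
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    count += 1
                    total += entry.stat(follow_symlinks=False).st_size
    return count, total
```

Called as `summarize_tree("sbem-6dpf-1-whole-raw.n5")`, it would still have to touch every chunk file, so it is not expected to be any faster than `du`.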

joshmoore · 2020-10-22T16:48:05Z

Doh. My command was done after my 🏃 :

$ aws-embl-public s3 ls --summarize --human-readable --recursive s3://platybrowser/rawdata/sbem-6dpf-1-whole-raw.n5/
...
Total Objects: 4049381
   Total Size: 2.0 TiB

tischi · 2020-10-22T17:13:16Z

I am curious how long it will take to copy and zarrify this. It might be interesting to time it for future reference.

joshmoore · 2020-10-25T09:31:52Z

The transfer is progressing extremely slowly. Do you have a small example dataset I could use to write a script and then you could transform locally?

Edit: actually, I'm now getting permission denied when I try to access s3.embl.de!

tischi · 2020-10-26T06:54:06Z

The myosin data set in the list above is small.

> Edit: actually, I'm now getting permission denied when I try to access s3.embl.de!

Interesting, this may be related to this:
mobie/mobie#18

joshmoore · 2020-10-26T07:53:16Z

No luck:

$ aws --endpoint-url=https://s3.embl.de --no-sign-request s3 ls s3://platybrowser/

[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)

$ aws --no-verify-ssl --endpoint-url=https://s3.embl.de --no-sign-request s3 ls s3://platybrowser/
/usr/lib/fence-agents/bundled/botocore/vendored/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)

An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied.

tischi · 2020-10-26T08:02:34Z

I will write to IT...

tischi · 2020-10-26T14:31:30Z

We also have the exact same data on a file system.
Should I zip it and then provide you a download link for this?
The alternative is that we zarrify it at EMBL, but then we would still need to zip it and send it to you, I think...

joshmoore · 2020-10-26T15:16:48Z

Yeah, if you can provide me a small- to mid-size download, I'll get started on a script and/or docker you can run.

tischi · 2020-11-11T11:51:50Z

Just to keep track of what I do in case I have to repeat it:

install aws: https://gist.github.com/stevenwaskey/3d7f0136051b437286608d6b8e2c87f2
aws: /home/tischer/.local/lib/aws/bin/aws
interactive job: srun --pty -c 2 -t 60:00 --mem 16000 bash -l
cd /g/cba/tischer/software/
ln -s /home/tischer/.local/lib/aws/bin/aws aws

bash-4.2$ ./aws configure --profile tischi
AWS Access Key ID [None]: tischi
AWS Secret Access Key [None]: xyz
Default region name [None]: 
Default output format [None]:

bash-4.2$ ./aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 ls s3://idr-upload/tischi/
2020-11-10 16:39:33         84 README.txt

 ./aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 cp --recursive /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/0.6.3/images/local/prospr-6dpf-1-whole-mhcl4.n5 s3://idr-upload/tischi/prospr-6dpf-1-whole-mhcl4.n5

Note: it is important to add the root folder to the upload destination.

@joshmoore
I uploaded one file: prospr-6dpf-1-whole-mhcl4.n5
Can you read it?

tischi · 2020-11-11T21:07:44Z

sbem-6dpf-1-whole-segmented-cells.n5

sbatch -c 2 -t 10:00:00 --mem 16000 -e /g/cba/tischer/tmp/err.txt -o /g/cba/tischer/tmp/out.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync --quiet /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/1.0.1/images/local/sbem-6dpf-1-whole-segmented-cells.n5 s3://idr-upload/tischi/sbem-6dpf-1-whole-segmented-cells.n5

started...

sacct --format="JobID,State,CPUTime,MaxRSS"

I think it finished:

-bash-4.2$ sacct --format="JobID,State,CPUTime,MaxRSS"
       JobID      State    CPUTime     MaxRSS 
------------ ---------- ---------- ---------- 
6315838       COMPLETED   06:23:16            
6315838.bat+  COMPLETED   06:23:16    119908K 
6315838.ext+  COMPLETED   06:23:24       352K

Took 6.5 hours, seems to have arrived.

-bash-4.2$ /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 ls s3://idr-upload/tischi/
                           PRE prospr-6dpf-1-whole-mhcl4.n5/
                           PRE sbem-6dpf-1-whole-segmented-cells.n5/
                           PRE test/
                           PRE test2/
                           PRE test3/
                           PRE test5/
2020-11-10 16:39:33         84 README.txt
2020-11-11 12:47:47         22 attributes.json

TODO:

sbatch -c 2 -t ???:00:00 --mem 16000 -e /g/cba/tischer/tmp/err.txt -o /g/cba/tischer/tmp/out.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync --quiet /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5

tischi · 2020-11-12T09:33:07Z

@joshmoore

Based on the above experiment, if I extrapolate how long it would take to upload the 3D volume EM raw data using aws sync, I get:

2000 GB / 18 GB * 7 hours / 24 hours = 32 days
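
The same extrapolation as a small sketch, assuming upload time scales linearly with the number of bytes. That is a simplifying assumption: with millions of small chunk files, per-object overhead may dominate, in which case the object count would be the better predictor. `estimate_days` is a hypothetical helper name.

```python
def estimate_days(total_gb, sample_gb, sample_hours):
    """Linearly extrapolate a measured upload to a larger volume, in days."""
    hours = total_gb / sample_gb * sample_hours
    return hours / 24

# Numbers from the measurement above: 18 GB took about 7 hours.
days = estimate_days(total_gb=2000, sample_gb=18, sample_hours=7)
print(f"{days:.1f} days")  # prints "32.4 days"
```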

Any thoughts?

tischi · 2020-11-12T09:41:07Z

@constantinpape
Do you know how long it took to copy /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5 onto our local S3 storage?

@martinschorb
Do you know any tricks to speed up copying to an S3 object store? I think you looked into this a bit, didn't you?

tischi · 2020-11-12T09:47:32Z

One idea could be to start several copy processes, e.g., parallelising over the resolution layers:

-bash-4.2$ ls /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/
attributes.json  s0  s1  s2  s3  s4  s5  s6  s7  s8  s9

I would think both our local file system and Josh's receiving S3 storage should handle 10 parallel processes.
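
A rough sketch of what such a per-level parallel upload could look like when driven from Python rather than from separate `sbatch` jobs. The paths, profile and endpoint are the ones used earlier in this thread; `build_sync_command` and `sync_levels` are hypothetical helper names, and whether 10 concurrent `aws s3 sync` processes are actually tolerated by either side is an untested assumption.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

AWS = "/g/cba/tischer/software/aws"
PROFILE = "tischi"
ENDPOINT = "https://idr-ftp.openmicroscopy.org"
SRC_ROOT = ("/g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/"
            "sbem-6dpf-1-whole-raw.n5/setup0/timepoint0")
DST_ROOT = "s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0"

def build_sync_command(level):
    """Build the `aws s3 sync` argument list for one resolution level, e.g. "s3"."""
    return [AWS, "--profile", PROFILE, f"--endpoint-url={ENDPOINT}",
            "s3", "sync", "--quiet",
            f"{SRC_ROOT}/{level}", f"{DST_ROOT}/{level}"]

def sync_levels(levels, max_workers=10, runner=subprocess.run):
    """Run one sync per level concurrently; results come back in input order."""
    commands = [build_sync_command(level) for level in levels]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Threads are enough here: each one only waits on an external process.
        return list(pool.map(lambda cmd: runner(cmd, check=False), commands))
```

For example, `sync_levels([f"s{i}" for i in range(10)])` would start one sync per level `s0` to `s9`. Since `s0` holds most of the data, splitting by level alone would probably not balance the load well.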

martinschorb · 2020-11-12T09:53:45Z

I found that it was much faster from a 3dcloud VM than from the cluster. But that could be specific to the network connectivity to the s3 machines.

constantinpape · 2020-11-12T09:59:06Z

> Do you know how long it took to copy /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5 onto our local S3 storage?

I think about a day. I used a cluster node (gpu6 or 7 probably).

constantinpape · 2020-11-12T10:03:27Z

@joshmoore
@tischi

I am not sure if this is helpful, but I could also convert the data to zarr on the EMBL side.

tischi · 2020-11-12T10:26:20Z

> I am not sure if this is helpful, but I could also convert the data to zarr on the EMBL side.

I think this is very interesting indeed, but @joshmoore should comment, because I don't know whether he needs some specific zarr flavour.

joshmoore · 2020-11-12T10:27:40Z

> Any thoughts?

Not immediately, unless you want to also try tar'ing it up.

> I am not sure if this is helpful, but I could also convert the data to zarr on the EMBL side.

@constantinpape : if you want to kick off a n5-copy from n5 to zarr then I'll send you a script for the rest of the conversion. That being said, it would still be good to have the files on our servers for testing.

tischi · 2020-11-12T10:30:07Z

I think I'll just start it, resolution layer by resolution layer...

constantinpape · 2020-11-12T10:36:03Z

Edit: Sorry, I wrote this before Tischi's last comment. If you want to do it, Tischi, go ahead.

> @constantinpape : if you want to kick off a n5-copy from n5 to zarr then I'll send you a script for the rest of the conversion. That being said, it would still be good to have the files on our servers for testing.

I would probably use a python script I have set up for this.
If we want to do it on the EMBL side, it would be best to test on one of the smaller volumes first: I do the conversion, then run the script from @joshmoore, and we see if the result matches.

~~Let's start with myosin, I will convert it later.~~

tischi · 2020-11-12T10:40:23Z

s9

sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s9.txt -o /g/cba/tischer/tmp/out_s9.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync --quiet /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9

This finished instantly...
@constantinpape
Could it be that this level is empty?

-bash-4.2$ ls /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9/0/0/0
/g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9/0/0/0

joshmoore · 2020-11-12T10:49:18Z

> I would probably use a python script I have set up for this.

👍 for however it happens but the equivalent, yeah. 👍

constantinpape · 2020-11-12T10:55:07Z

> > I would probably use a python script I have set up for this.
>
> +1 for however it happens but the equivalent, yeah. +1

@joshmoore ok, let's see if we can get Tischi's conversion to run first and then have this as a fallback.

> This finished instantly...
> @constantinpape
> Could it be that this level is empty?

I can't log into VPN right now, will check later.
(But I can tell you already that the data is probably very small at s9 ;))

constantinpape · 2020-11-12T11:54:35Z

@tischi s9 has exactly one chunk, which is 41kb, so I would expect it to copy almost immediately:

pape@gpu7:/g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s9$ ls -lh 0/0/0 
-rw-r--r-- 1 pape kreshuk 41K 12. Feb 2020  0/0/0

tischi · 2020-11-12T12:23:01Z

sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s3.txt -o /g/cba/tischer/tmp/out_s3.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync --quiet /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s3 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s3
...level s3 with sync took 1h 48min
// All levels above are done with sync, now proceeding with cp --recursive; maybe it's faster since it does not have to check? We can then add missing chunks with sync later, I guess.
sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s2.txt -o /g/cba/tischer/tmp/out_s2.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 cp --quiet --recursive /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s2 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s2
sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s1.txt -o /g/cba/tischer/tmp/out_s1.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 cp --quiet --recursive /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1
sbatch -c 8 -t 100:00:00 --mem 16000 -e /g/cba/tischer/tmp/err_s0.txt -o /g/cba/tischer/tmp/out_s0.txt /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 cp --quiet --recursive /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0

joshmoore · 2020-11-12T12:24:37Z

Here's a quick script which looks to be working locally. I'm unsure if setups are always channels for this data and if there's ever more than one channel and/or setup.

$ ./convert.py prospr-6dpf-1-whole-nachr.zarr prospr-6dpf-1-whole-nachr.ome.zarr
$ ome_zarr info prospr-6dpf-1-whole-nachr.ome.zarr/
/opt/data/tischi/prospr-6dpf-1-whole-nachr.ome.zarr [zgroup]
 - metadata
   - Multiscales
 - data
   - (1, 1, 519, 471, 500)
   - (1, 1, 260, 236, 250)
   - (1, 1, 130, 118, 125)
   - (1, 1, 65, 59, 63)

#!/usr/bin/env python

# This assumes that n5-copy has already been used

import argparse
import zarr

parser = argparse.ArgumentParser()
parser.add_argument("input")
parser.add_argument("output")
ns = parser.parse_args()

zin = zarr.open(ns.input)

sizes = []

def groups(z):
    rv = sorted(list(z.groups()))
    assert rv
    assert not list(z.arrays())
    return rv

def arrays(z):
    rv = sorted(list(z.arrays()))
    assert rv
    assert not list(z.groups())
    return rv

setups = groups(zin)
assert len(setups) == 1  # TODO: multiple channels?
for sname, setup in setups:
    timepoints = groups(setup)
    for tname, timepoint in timepoints:
        resolutions = arrays(timepoint)
        for idx, rtuple in enumerate(resolutions):
            rname, resolution = rtuple
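            try:
                expected = sizes[idx]
            except IndexError:
                # First time this resolution level is seen: record its layout.
                sizes.append((rname,
                              resolution.shape,
                              resolution.chunks,
                              resolution.dtype))
            else:
                # Any further setup/timepoint must match the recorded layout.
                assert expected[0] == rname
                assert expected[1] == resolution.shape
                assert expected[2] == resolution.chunks
                assert expected[3] == resolution.dtype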


datasets = []
out = zarr.open(ns.output, mode="w")

for idx, size in enumerate(sizes):
    name, shape, chunks, dtype = size
    shape = tuple([len(timepoints), len(setups)] + list(shape))
    chunks = tuple([1, 1] + list(chunks))
    a = out.create_dataset(name, shape=shape, chunks=chunks, dtype=dtype)
    datasets.append({"path": name})
    for sidx, stuple in enumerate(groups(zin)):
        for tidx, ttuple in enumerate(groups(stuple[1])):
            resolutions = arrays(ttuple[1])
            a[tidx, sidx, :, :, :] = resolutions[idx][1]
out.attrs["multiscales"] = [
    {
        "version": "0.1",
        "datasets": datasets,
    }
]
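
The two bookkeeping steps of the script above, the 3D to 5D layout and the `multiscales` attribute, can be checked in isolation without touching any data. This is only a sketch: `to_5d` and `multiscales_attrs` are hypothetical helper names, and the chunk size used in the example is made up, since the real chunking is not shown in this thread.

```python
def to_5d(shape, chunks, n_timepoints=1, n_setups=1):
    """Prepend (timepoint, setup) axes to a 3D shape and its chunking."""
    full_shape = (n_timepoints, n_setups) + tuple(shape)
    # One chunk never spans more than a single timepoint and setup.
    full_chunks = (1, 1) + tuple(chunks)
    return full_shape, full_chunks

def multiscales_attrs(level_names):
    """Build the value written to out.attrs["multiscales"] in the script."""
    return [{"version": "0.1",
             "datasets": [{"path": name} for name in level_names]}]

# Full-resolution shape from the ome_zarr info output; chunks are hypothetical.
shape, chunks = to_5d((519, 471, 500), (64, 64, 64))
print(shape)   # (1, 1, 519, 471, 500)
print(chunks)  # (1, 1, 64, 64, 64)
```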

constantinpape · 2020-11-12T12:59:35Z

> I think I'll just start it, resolution layer by resolution layer...

> I'm unsure if setups are always channels for this data and if there's ever more than one channel and/or setup.

For now we always have a single setup, corresponding to a single channel.

joshmoore · 2020-11-12T17:31:13Z

I don't know if this is a problem in zarr.n5.N5Store or in the prospr-6dpf-1-whole-nachr.n5 data I've been looking at, but not having "n5": "2.0.0" in the intermediate groups leads to the exception:

In [43]: list(zarr.hierarchy.Group(store=zarr.n5.N5Store("/opt/data/tischi/prospr-6dpf-1-whole-nachr.n5")).groups())
...
ValueError: group not found at path 'setup0'

whereas if I edit the file I get:

In [44]: list(zarr.hierarchy.Group(store=zarr.n5.N5Store("/opt/data/tischi/prospr-6dpf-1-whole-nachr.n5")).groups())
Out[44]: [('setup0', <zarr.hierarchy.Group '/setup0'>)]

constantinpape · 2020-11-12T19:37:52Z

I have written these files with z5py, which for n5 only writes the version attribute to the root, as specified in
https://github.com/saalfeldlab/n5#file-system-specification point 3.

Did this maybe change recently to be more in line with the zarr group metadata? (It shouldn't without changing major version because I think this would be a breaking change.)

Or is it just a bug in the zarr.n5store?

Anyway, for now we can work around it by adding the attributes while we look for the underlying issue.
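
A minimal sketch of that workaround, assuming it is acceptable to modify the container in place (safer on a copy). It relies on datasets being recognisable by a `dimensions` key in their `attributes.json`, which the n5 specification requires, and it also creates `attributes.json` for groups that have none, which is an assumption on my side. `add_group_versions` is a hypothetical helper name.

```python
import json
import os

def add_group_versions(root, version="2.0.0"):
    """Add an "n5" version attribute to every group under root that lacks one.

    Returns the list of attributes.json files that were written.
    """
    patched = []
    stack = [root]
    while stack:
        group = stack.pop()
        attrs_path = os.path.join(group, "attributes.json")
        attrs = {}
        if os.path.exists(attrs_path):
            with open(attrs_path) as f:
                attrs = json.load(f)
        if "dimensions" in attrs:
            # A dataset: everything below it is chunk data, so stop here.
            continue
        if "n5" not in attrs:
            attrs["n5"] = version
            with open(attrs_path, "w") as f:
                json.dump(attrs, f)
            patched.append(attrs_path)
        with os.scandir(group) as entries:
            stack.extend(e.path for e in entries if e.is_dir())
    return patched
```

Running `add_group_versions("prospr-6dpf-1-whole-nachr.n5")` would then patch `setup0` and `timepoint0` but leave the resolution datasets and their chunk directories untouched.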

constantinpape · 2020-11-12T21:13:51Z

@joshmoore
I had a closer look at the script you posted now, and I think we can do the same thing directly from our n5s and with much less copying around of the data. I implemented a script that should do this here:
https://github.com/constantinpape/i2k-2020-s3-zarr-workshop/blob/main/data-conversion/to_ome_zarr.py

Note that I am using z5py to read the n5 datasets because of the issue with the group level attributes, otherwise one could also use zarr.
Also, I am storing the zarr array with a NestedDirectoryStore. I would really prefer it if we could do that, because otherwise the large datasets really overwhelm the FS. But if it's not supported yet, we could also switch to the standard flat hierarchy for the chunks.
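
To illustrate the difference between the two layouts, here is a small sketch: in the flat layout every chunk of an array is a file named like `0.0.0` in one directory, while the nested layout stores the same chunk at `0/0/0`. The helper names and the chunk grid in the usage note are hypothetical.

```python
from math import prod

def chunk_key(index, nested):
    """Return the storage key of a chunk, e.g. "3/1/4" (nested) or "3.1.4" (flat)."""
    separator = "/" if nested else "."
    return separator.join(str(i) for i in index)

def max_directory_entries(chunk_grid, nested):
    """Largest number of chunk entries any single directory has to hold."""
    if nested:
        # Each directory level only fans out along one axis of the grid.
        return max(chunk_grid)
    # Flat: all chunks of the array are siblings in one directory.
    return prod(chunk_grid)
```

For a made-up chunk grid of `(100, 120, 140)`, `max_directory_entries` gives 1,680,000 entries in a single directory for the flat layout versus at most 140 per directory for the nested one.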

joshmoore · 2020-11-12T21:16:48Z

> Or is it just a bug in the zarr.n5store?

Yes. zarr-developers/zarr-python#651

> I implemented a script that should do this here:

👍 I'll look more tomorrow.

> But if it's not supported yet we could also switch to the standard with flat hierarchy for the chunks.

It previously wasn't on the zarr side, so in the ome-zarr spec it's prevented. I agree! I'd very much like to move to nested storage in the next version bump.

constantinpape · 2020-11-12T21:27:22Z

Ok, I updated it to support the flat chunk hierarchy.

tischi · 2020-11-17T09:09:34Z

@joshmoore
Feels like the storage is very slow for some reason. Maybe because you copy from it?
This only returns with a timeout for me:

aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 ls s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0/

Does it work for you?

joshmoore · 2020-11-17T09:15:10Z

Ah, possibly. I've canceled my mirror command. Let me know if it looks to be faster.

...
...point0/s0/120/159/1:  159.27 GiB / 159.27 GiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 1.83 MiB/s 24h44m10s
real    1484m16.708s
user    23m28.571s
sys     23m16.119s
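
As a sanity check on the numbers above, a short sketch that recomputes the average rate from the transferred volume and the wall-clock time, and extrapolates it to the full 2.0 TiB. The extrapolation assumes the rate stays constant, which may well not hold; the helper names are hypothetical.

```python
def throughput_mib_per_s(gib, hours=0, minutes=0, seconds=0.0):
    """Average transfer rate in MiB/s for `gib` GiB moved in the given time."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    return gib * 1024 / elapsed

def days_for(total_gib, rate_mib_per_s):
    """Days needed to move `total_gib` GiB at a constant rate."""
    return total_gib * 1024 / rate_mib_per_s / 86400

# 159.27 GiB in "real 1484m16.708s", as reported above.
rate = throughput_mib_per_s(159.27, minutes=1484, seconds=16.708)
print(f"{rate:.2f} MiB/s")
print(f"{days_for(2 * 1024, rate):.1f} days for 2.0 TiB at that rate")
```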

tischi · 2020-11-17T09:58:37Z

@joshmoore
Still slow (see my mail for a theory) for me.
Is it faster for you?

joshmoore · 2020-11-17T13:38:58Z

I think you are right that the lower paths are struggling under the number of subelements. Certainly listing the top .n5 works (--> setup0). Listing it on the server returns fine (50 elements).

tischi · 2020-11-17T17:03:03Z

@joshmoore
If you can, could you please let me know the result of an ls for these three folders?

sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0/
sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/
sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s2/

From the results I hope to deduce what has been copied already such that I do not start the sync in more subfolders than necessary.

joshmoore · 2020-11-17T17:05:07Z

@tischi Sure!

output

[jamoore@idrftp-ftp ~]$ mc ls idr-upload/idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s0/
[2020-11-17 17:04:15 UTC]      0B 0/
[2020-11-17 17:04:15 UTC]      0B 1/
[2020-11-17 17:04:15 UTC]      0B 10/
[2020-11-17 17:04:15 UTC]      0B 100/
[2020-11-17 17:04:15 UTC]      0B 101/
[2020-11-17 17:04:15 UTC]      0B 102/
[2020-11-17 17:04:15 UTC]      0B 103/
[2020-11-17 17:04:15 UTC]      0B 104/
[2020-11-17 17:04:15 UTC]      0B 105/
[2020-11-17 17:04:15 UTC]      0B 106/
[2020-11-17 17:04:15 UTC]      0B 107/
[2020-11-17 17:04:15 UTC]      0B 108/
[2020-11-17 17:04:15 UTC]      0B 109/
[2020-11-17 17:04:15 UTC]      0B 11/
[2020-11-17 17:04:15 UTC]      0B 110/
[2020-11-17 17:04:15 UTC]      0B 111/
[2020-11-17 17:04:15 UTC]      0B 112/
[2020-11-17 17:04:15 UTC]      0B 113/
[2020-11-17 17:04:15 UTC]      0B 114/
[2020-11-17 17:04:15 UTC]      0B 115/
[2020-11-17 17:04:15 UTC]      0B 116/
[2020-11-17 17:04:15 UTC]      0B 117/
[2020-11-17 17:04:15 UTC]      0B 118/
[2020-11-17 17:04:15 UTC]      0B 119/
[2020-11-17 17:04:15 UTC]      0B 12/
[2020-11-17 17:04:15 UTC]      0B 120/
[2020-11-17 17:04:15 UTC]      0B 121/
[2020-11-17 17:04:15 UTC]      0B 122/
[2020-11-17 17:04:15 UTC]      0B 123/
[2020-11-17 17:04:15 UTC]      0B 124/
[2020-11-17 17:04:15 UTC]      0B 125/
[2020-11-17 17:04:15 UTC]      0B 126/
[2020-11-17 17:04:15 UTC]      0B 127/
[2020-11-17 17:04:15 UTC]      0B 128/
[2020-11-17 17:04:15 UTC]      0B 129/
[2020-11-17 17:04:15 UTC]      0B 13/
[2020-11-17 17:04:15 UTC]      0B 130/
[2020-11-17 17:04:15 UTC]      0B 131/
[2020-11-17 17:04:15 UTC]      0B 132/
[2020-11-17 17:04:15 UTC]      0B 133/
[2020-11-17 17:04:15 UTC]      0B 134/
[2020-11-17 17:04:15 UTC]      0B 135/
[2020-11-17 17:04:15 UTC]      0B 136/
[2020-11-17 17:04:15 UTC]      0B 137/
[2020-11-17 17:04:15 UTC]      0B 138/
[2020-11-17 17:04:15 UTC]      0B 139/
[2020-11-17 17:04:15 UTC]      0B 14/
[2020-11-17 17:04:15 UTC]      0B 140/
[2020-11-17 17:04:15 UTC]      0B 141/
[2020-11-17 17:04:15 UTC]      0B 142/
[jamoore@idrftp-ftp ~]$ mc ls idr-upload/idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/
[2020-11-17 17:04:29 UTC]      0B 0/
[2020-11-17 17:04:29 UTC]      0B 1/
[2020-11-17 17:04:29 UTC]      0B 10/
[2020-11-17 17:04:29 UTC]      0B 100/
[2020-11-17 17:04:29 UTC]      0B 101/
[2020-11-17 17:04:29 UTC]      0B 102/
[2020-11-17 17:04:29 UTC]      0B 103/
[2020-11-17 17:04:29 UTC]      0B 104/
[2020-11-17 17:04:29 UTC]      0B 105/
[2020-11-17 17:04:29 UTC]      0B 106/
[2020-11-17 17:04:29 UTC]      0B 107/
[2020-11-17 17:04:29 UTC]      0B 108/
[2020-11-17 17:04:29 UTC]      0B 109/
[2020-11-17 17:04:29 UTC]      0B 11/
[2020-11-17 17:04:29 UTC]      0B 110/
[2020-11-17 17:04:29 UTC]      0B 111/
[2020-11-17 17:04:29 UTC]      0B 112/
[2020-11-17 17:04:29 UTC]      0B 113/
[2020-11-17 17:04:29 UTC]      0B 114/
[2020-11-17 17:04:29 UTC]      0B 115/
[2020-11-17 17:04:29 UTC]      0B 116/
[2020-11-17 17:04:29 UTC]      0B 117/
[2020-11-17 17:04:29 UTC]      0B 118/
[2020-11-17 17:04:29 UTC]      0B 119/
[2020-11-17 17:04:29 UTC]      0B 12/
[2020-11-17 17:04:29 UTC]      0B 120/
[2020-11-17 17:04:29 UTC]      0B 121/
[2020-11-17 17:04:29 UTC]      0B 122/
[2020-11-17 17:04:29 UTC]      0B 123/
[2020-11-17 17:04:29 UTC]      0B 124/
[2020-11-17 17:04:29 UTC]      0B 125/
[2020-11-17 17:04:29 UTC]      0B 126/
[2020-11-17 17:04:29 UTC]      0B 127/
[2020-11-17 17:04:29 UTC]      0B 128/
[2020-11-17 17:04:29 UTC]      0B 129/
[2020-11-17 17:04:29 UTC]      0B 13/
[2020-11-17 17:04:29 UTC]      0B 130/
[2020-11-17 17:04:29 UTC]      0B 131/
[2020-11-17 17:04:29 UTC]      0B 132/
[2020-11-17 17:04:29 UTC]      0B 133/
[2020-11-17 17:04:29 UTC]      0B 134/
[2020-11-17 17:04:29 UTC]      0B 135/
[2020-11-17 17:04:29 UTC]      0B 136/
[2020-11-17 17:04:29 UTC]      0B 137/
[2020-11-17 17:04:29 UTC]      0B 138/
[2020-11-17 17:04:29 UTC]      0B 139/
[2020-11-17 17:04:29 UTC]      0B 14/
[2020-11-17 17:04:29 UTC]      0B 140/
[2020-11-17 17:04:29 UTC]      0B 141/
[2020-11-17 17:04:29 UTC]      0B 142/
[2020-11-17 17:04:29 UTC]      0B 143/
[2020-11-17 17:04:29 UTC]      0B 15/
[2020-11-17 17:04:29 UTC]      0B 16/
[2020-11-17 17:04:29 UTC]      0B 17/
[2020-11-17 17:04:29 UTC]      0B 18/
[2020-11-17 17:04:29 UTC]      0B 19/
[2020-11-17 17:04:29 UTC]      0B 2/
[2020-11-17 17:04:29 UTC]      0B 20/
[2020-11-17 17:04:29 UTC]      0B 21/
[2020-11-17 17:04:29 UTC]      0B 22/
[2020-11-17 17:04:29 UTC]      0B 23/
[2020-11-17 17:04:29 UTC]      0B 24/
[2020-11-17 17:04:29 UTC]      0B 25/
[2020-11-17 17:04:29 UTC]      0B 26/
[2020-11-17 17:04:29 UTC]      0B 27/
[2020-11-17 17:04:29 UTC]      0B 28/
[2020-11-17 17:04:29 UTC]      0B 29/
[2020-11-17 17:04:29 UTC]      0B 3/
[2020-11-17 17:04:29 UTC]      0B 30/
[2020-11-17 17:04:29 UTC]      0B 31/
[2020-11-17 17:04:29 UTC]      0B 32/
[2020-11-17 17:04:29 UTC]      0B 33/
[2020-11-17 17:04:29 UTC]      0B 34/
[2020-11-17 17:04:29 UTC]      0B 35/
[2020-11-17 17:04:29 UTC]      0B 36/
[2020-11-17 17:04:29 UTC]      0B 37/
[2020-11-17 17:04:29 UTC]      0B 38/
[2020-11-17 17:04:29 UTC]      0B 39/
[2020-11-17 17:04:29 UTC]      0B 4/
[2020-11-17 17:04:29 UTC]      0B 40/
[2020-11-17 17:04:29 UTC]      0B 41/
[2020-11-17 17:04:29 UTC]      0B 42/
[2020-11-17 17:04:29 UTC]      0B 43/
[2020-11-17 17:04:29 UTC]      0B 44/
[2020-11-17 17:04:29 UTC]      0B 45/
[2020-11-17 17:04:29 UTC]      0B 46/
[2020-11-17 17:04:29 UTC]      0B 47/
[2020-11-17 17:04:29 UTC]      0B 48/
[2020-11-17 17:04:29 UTC]      0B 49/
[2020-11-17 17:04:29 UTC]      0B 5/
[2020-11-17 17:04:29 UTC]      0B 50/
[2020-11-17 17:04:29 UTC]      0B 51/
[2020-11-17 17:04:29 UTC]      0B 52/
[2020-11-17 17:04:29 UTC]      0B 53/
[2020-11-17 17:04:29 UTC]      0B 54/
[2020-11-17 17:04:29 UTC]      0B 55/
[2020-11-17 17:04:29 UTC]      0B 56/
[2020-11-17 17:04:29 UTC]      0B 57/
[2020-11-17 17:04:29 UTC]      0B 58/
[2020-11-17 17:04:29 UTC]      0B 59/
[2020-11-17 17:04:29 UTC]      0B 6/
[2020-11-17 17:04:29 UTC]      0B 60/
[2020-11-17 17:04:29 UTC]      0B 61/
[2020-11-17 17:04:29 UTC]      0B 62/
[2020-11-17 17:04:29 UTC]      0B 63/
[2020-11-17 17:04:29 UTC]      0B 64/
[2020-11-17 17:04:29 UTC]      0B 65/
[2020-11-17 17:04:29 UTC]      0B 66/
[2020-11-17 17:04:29 UTC]      0B 67/
[2020-11-17 17:04:29 UTC]      0B 68/
[2020-11-17 17:04:29 UTC]      0B 69/
[2020-11-17 17:04:29 UTC]      0B 7/
[2020-11-17 17:04:29 UTC]      0B 70/
[2020-11-17 17:04:29 UTC]      0B 71/
[2020-11-17 17:04:29 UTC]      0B 72/
[2020-11-17 17:04:29 UTC]      0B 73/
[2020-11-17 17:04:29 UTC]      0B 74/
[2020-11-17 17:04:29 UTC]      0B 75/
[2020-11-17 17:04:29 UTC]      0B 76/
[2020-11-17 17:04:29 UTC]      0B 77/
[2020-11-17 17:04:29 UTC]      0B 78/
[2020-11-17 17:04:29 UTC]      0B 79/
[2020-11-17 17:04:29 UTC]      0B 8/
[2020-11-17 17:04:29 UTC]      0B 80/
[2020-11-17 17:04:29 UTC]      0B 81/
[2020-11-17 17:04:29 UTC]      0B 82/
[2020-11-17 17:04:29 UTC]      0B 83/
[2020-11-17 17:04:29 UTC]      0B 84/
[2020-11-17 17:04:29 UTC]      0B 85/
[2020-11-17 17:04:29 UTC]      0B 86/
[2020-11-17 17:04:29 UTC]      0B 87/
[2020-11-17 17:04:29 UTC]      0B 88/
[2020-11-17 17:04:29 UTC]      0B 89/
[2020-11-17 17:04:29 UTC]      0B 9/
[2020-11-17 17:04:29 UTC]      0B 90/
[2020-11-17 17:04:29 UTC]      0B 91/
[2020-11-17 17:04:29 UTC]      0B 92/
[2020-11-17 17:04:29 UTC]      0B 93/
[2020-11-17 17:04:29 UTC]      0B 94/
[jamoore@idrftp-ftp ~]$ mc ls idr-upload/idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s2/
[2020-11-12 16:53:58 UTC]    143B attributes.json
[2020-11-17 17:04:32 UTC]      0B 0/
[2020-11-17 17:04:32 UTC]      0B 1/
[2020-11-17 17:04:32 UTC]      0B 10/
[2020-11-17 17:04:32 UTC]      0B 11/
[2020-11-17 17:04:32 UTC]      0B 12/
[2020-11-17 17:04:32 UTC]      0B 13/
[2020-11-17 17:04:32 UTC]      0B 14/
[2020-11-17 17:04:32 UTC]      0B 15/
[2020-11-17 17:04:32 UTC]      0B 16/
[2020-11-17 17:04:32 UTC]      0B 17/
[2020-11-17 17:04:32 UTC]      0B 18/
[2020-11-17 17:04:32 UTC]      0B 19/
[2020-11-17 17:04:32 UTC]      0B 2/
[2020-11-17 17:04:32 UTC]      0B 20/
[2020-11-17 17:04:32 UTC]      0B 21/
[2020-11-17 17:04:32 UTC]      0B 22/
[2020-11-17 17:04:32 UTC]      0B 23/
[2020-11-17 17:04:32 UTC]      0B 24/
[2020-11-17 17:04:32 UTC]      0B 25/
[2020-11-17 17:04:32 UTC]      0B 26/
[2020-11-17 17:04:32 UTC]      0B 27/
[2020-11-17 17:04:32 UTC]      0B 28/
[2020-11-17 17:04:32 UTC]      0B 29/
[2020-11-17 17:04:32 UTC]      0B 3/
[2020-11-17 17:04:32 UTC]      0B 30/
[2020-11-17 17:04:32 UTC]      0B 31/
[2020-11-17 17:04:32 UTC]      0B 32/
[2020-11-17 17:04:32 UTC]      0B 33/
[2020-11-17 17:04:32 UTC]      0B 34/
[2020-11-17 17:04:32 UTC]      0B 35/
[2020-11-17 17:04:32 UTC]      0B 36/
[2020-11-17 17:04:32 UTC]      0B 37/
[2020-11-17 17:04:32 UTC]      0B 38/
[2020-11-17 17:04:32 UTC]      0B 39/
[2020-11-17 17:04:32 UTC]      0B 4/
[2020-11-17 17:04:32 UTC]      0B 40/
[2020-11-17 17:04:32 UTC]      0B 41/
[2020-11-17 17:04:32 UTC]      0B 42/
[2020-11-17 17:04:32 UTC]      0B 43/
[2020-11-17 17:04:32 UTC]      0B 44/
[2020-11-17 17:04:32 UTC]      0B 45/
[2020-11-17 17:04:32 UTC]      0B 46/
[2020-11-17 17:04:32 UTC]      0B 47/
[2020-11-17 17:04:32 UTC]      0B 48/
[2020-11-17 17:04:32 UTC]      0B 49/
[2020-11-17 17:04:32 UTC]      0B 5/
[2020-11-17 17:04:32 UTC]      0B 50/
[2020-11-17 17:04:32 UTC]      0B 51/
[2020-11-17 17:04:32 UTC]      0B 52/
[2020-11-17 17:04:32 UTC]      0B 53/
[2020-11-17 17:04:32 UTC]      0B 54/
[2020-11-17 17:04:32 UTC]      0B 55/
[2020-11-17 17:04:32 UTC]      0B 56/
[2020-11-17 17:04:32 UTC]      0B 57/
[2020-11-17 17:04:32 UTC]      0B 58/
[2020-11-17 17:04:32 UTC]      0B 59/
[2020-11-17 17:04:32 UTC]      0B 6/
[2020-11-17 17:04:32 UTC]      0B 60/
[2020-11-17 17:04:32 UTC]      0B 61/
[2020-11-17 17:04:32 UTC]      0B 62/
[2020-11-17 17:04:32 UTC]      0B 63/
[2020-11-17 17:04:32 UTC]      0B 64/
[2020-11-17 17:04:32 UTC]      0B 65/
[2020-11-17 17:04:32 UTC]      0B 66/
[2020-11-17 17:04:32 UTC]      0B 67/
[2020-11-17 17:04:32 UTC]      0B 68/
[2020-11-17 17:04:32 UTC]      0B 69/
[2020-11-17 17:04:32 UTC]      0B 7/
[2020-11-17 17:04:32 UTC]      0B 70/
[2020-11-17 17:04:32 UTC]      0B 71/
[2020-11-17 17:04:32 UTC]      0B 8/
[2020-11-17 17:04:32 UTC]      0B 9/
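
One way to turn such listings into a to-do list is to parse the `mc ls` lines and subtract the result from the local folder names. The sketch below assumes the line format shown above and purely numeric folder names; `listed_folders` and `missing_folders` are hypothetical helper names. A folder that shows up remotely is not necessarily complete, so this only finds folders that were never started.

```python
import re

# Matches lines such as "[2020-11-17 17:04:15 UTC]      0B 100/".
LINE = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s+(?P<size>\S+)\s+(?P<name>.+)$")

def listed_folders(mc_output):
    """Return the set of folder names found in `mc ls` output."""
    names = set()
    for line in mc_output.splitlines():
        match = LINE.match(line.strip())
        if match and match.group("name").endswith("/"):
            names.add(match.group("name").rstrip("/"))
    return names

def missing_folders(local_names, mc_output):
    """Local folder names absent from the remote listing, in numeric order."""
    remote = listed_folders(mc_output)
    return sorted(set(local_names) - remote, key=int)
```

With `local_names` taken from `os.listdir` on the corresponding local `s0`, `s1` or `s2` directory, the result is the list of subfolders that would still need a `cp` or `sync`.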

tischi · 2020-11-17T19:01:44Z

@joshmoore
Using sync I am now getting this error:

bash-4.2$ /g/cba/tischer/software/aws --profile tischi --endpoint-url=https://idr-ftp.openmicroscopy.org s3 sync /g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/94 s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/94
fatal error: Read timeout on endpoint URL: "https://idr-ftp.openmicroscopy.org/idr-upload?list-type=2&prefix=tischi%2Fsbem-6dpf-1-whole-raw.n5%2Fsetup0%2Ftimepoint0%2Fs1%2F94%2F&encoding-type=url"

any ideas?

tischi · 2020-11-17T19:06:46Z

@joshmoore
And using cp I am also getting an error:

upload failed: ../../g/arendt/EM_6dpf_segmentation/platy-browser-data/data/rawdata/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/94/100/0 to s3://idr-upload/tischi/sbem-6dpf-1-whole-raw.n5/setup0/timepoint0/s1/94/100/0 An error occurred (ServiceUnavailable) when calling the PutObject operation (reached max retries: 4): Please reduce your request rate.

Maybe the server is kind of down?
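
Since the message asks to reduce the request rate, one generic option is to retry failed operations with exponential backoff instead of giving up after a fixed number of quick retries. The sketch below is not tied to the AWS CLI; `retry_with_backoff` is a hypothetical helper and `operation` stands for any callable that performs a single upload. The AWS CLI also has an `s3.max_concurrent_requests` setting that could be lowered, though I have not tried that against this server.

```python
import random
import time

def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, retry_on=(Exception,),
                       sleep=time.sleep):
    """Call operation() until it succeeds, waiting longer after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            # Double the wait each time, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps parallel uploads from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```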

tischi assigned joshmoore Oct 22, 2020

tischi added the critical label Oct 22, 2020

joshmoore mentioned this issue Nov 12, 2020

Add docker build zarr-developers/zarr_implementations#14

Merged

joshmoore mentioned this issue Nov 12, 2020

n5: prevent failure on missing group metadata zarr-developers/zarr-python#651

Merged

6 tasks

constantinpape mentioned this issue Nov 18, 2020

Plans for converting the data #2

Open

joshmoore mentioned this issue Nov 23, 2020

3D vs 5D array layout #10

Open
