This repository has been archived by the owner on Sep 16, 2020. It is now read-only.

Add benchmark result to README.md #17

Closed
AkihiroSuda opened this issue Jul 29, 2017 · 0 comments


FILEgrain benchmark

FILEgrain version: 969524a

amount of blobs

openjdk:8@sha256:5da842d59f76009fa27ffde06888ebd560c7ad17607d7cd1e52fc0757c9a45fb

$ ../du.sh
Pure blobs (excludes continuity): 704288157 [671.6615266799927MiB]
Tarred blobs (excludes continuity): 718243840 [684.970703125MiB]
Tarred + Gzipped blobs (excludes continuity): 273990124 [261.2973442077637MiB]
FILEgrain (gzipped): 273344520 [260.68164825439453MiB]
  • Mount: 2 blobs, 5.416MiB
  • then sh: 8 blobs, 7.31MiB
  • then java -version: 30 blobs, 88.18MiB
  • then javac HelloWorld.java: 50 blobs, 137.3MiB

kdeneon/all@sha256:e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247

$ ../du.sh
Pure blobs (excludes continuity): 5123620038 [4.771743005141616GiB]
Tarred blobs (excludes continuity): 5228851200 [4.869747161865234GiB]
Tarred + Gzipped blobs (excludes continuity): 2236129577 [2.082557954825461GiB]
FILEgrain (gzipped): 2235640216 [2.0821022018790245GiB]
  • Mount: 2 blobs, 34.49MiB
  • then sh: 8 blobs, 36.73MiB
  • then DISPLAY=:1 startkde, with host-side Xephyr -screen 1024x768 :1: 4267 blobs, 742.7MiB
  • then start Firefox via the KDE start-menu: 4506 blobs, 866.6MiB

kaggle/python@sha256:335103c998aea22a5608c2eeca7dcf109e0828ed233b75f5098182c5b058fe98

$ ../du.sh
Pure blobs (excludes continuity): 8937194028 [8.323410551995039GiB]
Tarred blobs (excludes continuity): 9025382400 [8.405542373657227GiB]
Tarred + Gzipped blobs (excludes continuity): 3818209353 [3.5559845650568604GiB]
FILEgrain (gzipped): 3822555054 [3.5600318145006895GiB]
  • Mount: 2 blobs, 38.18MiB
  • then sh: 8 blobs, 40.14MiB
  • then ipython -c 'print("hello")': 1033 blobs, 75.4MiB
  • then ipython -c 'import nltk': 2779 blobs, 352MiB
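
To make the comparison between the last two du.sh lines easier to read, the following sketch computes how much larger or smaller the per-file gzipped FILEgrain blobs are relative to a single tar+gzip stream. The byte counts are copied from the outputs above; the names `SIZES`, `to_mib`, and `relative_overhead` are made up for this illustration and are not part of FILEgrain.

```python
# Byte counts copied from the du.sh outputs above.
SIZES = {
    "openjdk:8": {"tar_gz": 273990124, "filegrain_gz": 273344520},
    "kdeneon/all": {"tar_gz": 2236129577, "filegrain_gz": 2235640216},
    "kaggle/python": {"tar_gz": 3818209353, "filegrain_gz": 3822555054},
}


def to_mib(n_bytes):
    """Convert a byte count to MiB (1 MiB = 2**20 bytes)."""
    return n_bytes / 2**20


def relative_overhead(tar_gz, filegrain_gz):
    """Relative size of FILEgrain vs tar+gzip; negative means FILEgrain is smaller."""
    return (filegrain_gz - tar_gz) / tar_gz


if __name__ == "__main__":
    for image, s in SIZES.items():
        pct = 100 * relative_overhead(s["tar_gz"], s["filegrain_gz"])
        print(f"{image}: {to_mib(s['filegrain_gz']):.1f} MiB, {pct:+.3f}% vs tar+gzip")
```

For all three images the difference stays well under 1% in either direction.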

deduplication benchmark

$ (cd kdeneon-all-sha256-e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247/blobs/sha256; find .) > /tmp/a
$ (cd kaggle-python-sha256-335103c998aea22a5608c2eeca7dcf109e0828ed233b75f5098182c5b058fe98/blobs/sha256; find .) > /tmp/b
$ wc -l /tmp/a /tmp/b
  156916 /tmp/a
  131552 /tmp/b
  288468 total
$ cat /tmp/a /tmp/b | sort | uniq | wc -l
279749
$ cat /tmp/a /tmp/b | sort | uniq -D | uniq | wc -l
8719
$ echo $((156916 + 131552 - 8719))
279749
$ sum=0; for f in $(cat /tmp/a /tmp/b | sort | uniq -D | uniq);do let s=$(stat -c %s kdeneon-all-sha256-e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247/blobs/sha256/$f); sum=$(($sum + $s)); done; echo $sum
79064496 [75.40177917480469MiB]

These two images are unrelated, yet they share 8719 blob entries totaling about 75MiB, which are common Debian files.
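
The same check can be sketched in Python as a set intersection over blob file names. This is only an illustration, not part of FILEgrain: `list_blobs` and `shared_blobs` are hypothetical helper names, and the sketch assumes each `blobs/sha256` directory is flat and that a blob's file name is its content digest. Unlike `find .`, it counts only regular files, so the count may differ slightly from the shell result, which also lists the `.` entry.

```python
import os


def list_blobs(blob_dir):
    """Return {blob name: size in bytes} for regular files directly under blob_dir."""
    blobs = {}
    with os.scandir(blob_dir) as entries:
        for entry in entries:
            if entry.is_file():
                blobs[entry.name] = entry.stat().st_size
    return blobs


def shared_blobs(dir_a, dir_b):
    """Return (number of blobs present in both directories, their total size in bytes)."""
    a = list_blobs(dir_a)
    b = list_blobs(dir_b)
    common = a.keys() & b.keys()  # blobs are content-addressed, so equal names mean equal content
    return len(common), sum(a[name] for name in common)
```

Calling `shared_blobs` with the two `blobs/sha256` directories used above would return the shared blob count and the bytes saved by storing those blobs once.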

FUSE

(on Fedora 26, 2 vCPUs, 2GB RAM, VMware Fusion on MacBookPro)

Elapsed wall-clock time in seconds for each of 10 consecutive runs of the following command on openjdk:8:

$ export TIMEFORMAT=%R; for f in $(seq 1 10); do bash -c "cd /; time tar cf - usr | tar tvf - > /dev/null"; done

docker run -it --rm:

9.238
9.950
10.098
10.446
6.487
7.425
3.004
0.846
0.775
0.714

FILEgrain without FOPEN_KEEP_CACHE (old commit: b33bc29):

35.777 [pull & cache blobs]
20.870
13.877
19.071
18.319
18.053
19.357
14.154
22.630
17.400

FILEgrain with FOPEN_KEEP_CACHE (not so effective?):

28.318 [pull & cache blobs]
15.833
15.014
16.962
18.809
17.566
15.545
17.971
18.071
15.742
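
To compare the three series at a glance, the sketch below summarizes the timings listed above. The first FILEgrain run is skipped because it includes pulling and caching blobs. The names `RUNS` and `summarize` are made up for this illustration; the numbers are copied from the lists above.

```python
from statistics import mean, median

# Elapsed seconds per run, copied from the results above.
RUNS = {
    "docker run": [9.238, 9.950, 10.098, 10.446, 6.487, 7.425, 3.004, 0.846, 0.775, 0.714],
    "FILEgrain without FOPEN_KEEP_CACHE": [35.777, 20.870, 13.877, 19.071, 18.319,
                                           18.053, 19.357, 14.154, 22.630, 17.400],
    "FILEgrain with FOPEN_KEEP_CACHE": [28.318, 15.833, 15.014, 16.962, 18.809,
                                        17.566, 15.545, 17.971, 18.071, 15.742],
}


def summarize(times, skip=0):
    """Return mean, median, and min of the runs after dropping the first `skip` runs."""
    kept = times[skip:]
    return {"mean": mean(kept), "median": median(kept), "min": min(kept)}


if __name__ == "__main__":
    for name, times in RUNS.items():
        # Drop the "pull & cache blobs" run for the FILEgrain series only.
        skip = 1 if name.startswith("FILEgrain") else 0
        s = summarize(times, skip)
        print(f"{name}: mean={s['mean']:.2f}s median={s['median']:.2f}s min={s['min']:.2f}s")
```

On these numbers the warm-run mean is roughly 18.2 seconds without FOPEN_KEEP_CACHE and roughly 16.8 seconds with it, while docker run drops below 1 second by the last runs.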

Docker Registry I/O (TODO)

N/A because FILEgrain does not yet support the Docker Registry API.
TODO: integrate FILEgrain into containerd and run a real benchmark

Appendix

du.sh

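#!/bin/sh
# Prints blob sizes in bytes. Expects ./blobs and index.json in the current directory.
set -e

# Apparent size of the blobs, skipping those listed by print-du-exclude-extra-blobs.py.
printf 'Pure blobs (excludes continuity): '
du -bs $(../print-du-exclude-extra-blobs.py) ./blobs | awk '{print $1}'

# Size of a single uncompressed tar archive of the same blobs.
printf 'Tarred blobs (excludes continuity): '
tar cf - $(../print-du-exclude-extra-blobs.py) ./blobs | wc -c

# Size of the same tar archive after gzip -9.
printf 'Tarred + Gzipped blobs (excludes continuity): '
tar cf - $(../print-du-exclude-extra-blobs.py) ./blobs | gzip -9 | wc -c

# Sum of the sizes of every blob gzipped individually.
sum=0
for f in $(find ./blobs -type f); do
    sum=$(($sum + $(gzip -9c "$f" | wc -c)))
done
echo "FILEgrain (gzipped): $sum"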
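
The difference between the "Tarred + Gzipped" and "FILEgrain (gzipped)" measurements is that the former compresses one stream and the latter compresses every file separately. A rough Python sketch of both measurements follows, under these assumptions: `per_file_gzip_size` and `tar_gzip_size` are hypothetical names, files are read fully into memory, the `--exclude` handling of du.sh is omitted, and Python's gzip output may differ by a few bytes per file from the `gzip` CLI.

```python
import gzip
import io
import os
import tarfile


def per_file_gzip_size(root):
    """Sum of the gzip -9 sizes of every regular file under root, compressed one by one."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as f:
                total += len(gzip.compress(f.read(), compresslevel=9))
    return total


def tar_gzip_size(root):
    """Size of a single gzip -9 compressed tar archive of root, built in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz", compresslevel=9) as tar:
        tar.add(root)
    return len(buf.getvalue())
```

Per-file compression pays a fixed gzip header and trailer for each blob and cannot share a dictionary across files, whereas the tar stream pays tar header and padding overhead for each entry.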

print-du-exclude-extra-blobs.py

#!/usr/bin/python3
# Usage: du -bs $(this.py) ./blobs
# Prints an --exclude option for every manifest, config, and layer blob listed in index.json.
import json


def dig2blobpath(s):
    # Map a digest such as "sha256:<hex>" to its path under blobs/.
    algo, heks = s.split(':', 1)
    return 'blobs/' + algo + '/' + heks


def load_json(path):
    with open(path) as f:
        return json.load(f)


excludes = []
for m_entry in load_json('index.json')['manifests']:
    m_blob = dig2blobpath(m_entry['digest'])
    excludes.append(m_blob)
    m = load_json(m_blob)
    excludes.append(dig2blobpath(m['config']['digest']))
    for layer in m['layers']:
        excludes.append(dig2blobpath(layer['digest']))

for f in excludes:
    print('--exclude ' + f)