-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added blob table data structure. #677
Changes from 6 commits
1fa355e
af9e50f
1e8702a
ccfb9da
5db2284
cd04356
494ab5c
1054348
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
package table | ||
|
||
// BlobMetadata encapsulates various information about a blob written by the traffic generator. | ||
type BlobMetadata struct { | ||
// key of the blob, set when the blob is initially uploaded. | ||
key []byte | ||
|
||
// checksum of the blob. | ||
checksum *[16]byte | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this need to be a pointer? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed to a non-pointer variable. |
||
|
||
// size of the blob, in bytes. | ||
size uint | ||
|
||
// blobIndex of the blob. | ||
blobIndex uint | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is blob index? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Blob index is one of the arguments needed when retrieving a blob. I am unsure of the deeper meaning of this field, I figured out I needed it through reverse engineering.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yea, this index is the the position of the blob in the batch (the batch is a list of blobs) |
||
|
||
// remainingReadPermits describes the maximum number of remaining reads permitted against this blob. | ||
// If -1 then an unlimited number of reads are permitted. | ||
remainingReadPermits int | ||
} | ||
|
||
// NewBlobMetadata creates a new BlobMetadata instance. The readPermits parameter describes the maximum number of | ||
// remaining reads permitted against this blob. If -1 then an unlimited number of reads are permitted. | ||
func NewBlobMetadata( | ||
key []byte, | ||
checksum *[16]byte, | ||
size uint, | ||
blobIndex uint, | ||
readPermits int) *BlobMetadata { | ||
|
||
if readPermits == 0 { | ||
panic("readPermits must be greater than 0, or -1 for unlimited reads") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could return error instead of crashing when validation fails There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. change made |
||
} | ||
|
||
return &BlobMetadata{ | ||
key: key, | ||
checksum: checksum, | ||
size: size, | ||
blobIndex: blobIndex, | ||
remainingReadPermits: readPermits, | ||
} | ||
} | ||
|
||
// Key returns the key of the blob. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What about making the member variables public? These getters are quite simple so seem not worth the verbosity. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Intention was to make it possible to read these values without the capability of updating them. But maybe that's more of a java design pattern. I've made the member variables public as you suggest. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yea it feels quite Java-ish |
||
func (blob *BlobMetadata) Key() []byte { | ||
return blob.key | ||
} | ||
|
||
// Checksum returns the checksum of the blob. | ||
func (blob *BlobMetadata) Checksum() *[16]byte { | ||
return blob.checksum | ||
} | ||
|
||
// Size returns the size of the blob, in bytes. | ||
func (blob *BlobMetadata) Size() uint { | ||
return blob.size | ||
} | ||
|
||
// BlobIndex returns the blobIndex of the blob. | ||
func (blob *BlobMetadata) BlobIndex() uint { | ||
return blob.blobIndex | ||
} | ||
|
||
// RemainingReadPermits returns the maximum number of remaining reads permitted against this blob. | ||
func (blob *BlobMetadata) RemainingReadPermits() int { | ||
return blob.remainingReadPermits | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
package table | ||
|
||
import "sync" | ||
|
||
// BlobStore is a thread safe data structure that tracks blobs written by the traffic generator. | ||
type BlobStore struct { | ||
|
||
// blobs contains all blobs currently tracked by the store. | ||
blobs map[uint64]*BlobMetadata | ||
|
||
// nextKey describes the next key to used for the blobs map. | ||
nextKey uint64 | ||
|
||
lock sync.Mutex | ||
} | ||
|
||
// NewBlobStore creates a new BlobStore instance. | ||
func NewBlobStore() *BlobStore { | ||
return &BlobStore{ | ||
blobs: make(map[uint64]*BlobMetadata), | ||
nextKey: 0, | ||
} | ||
} | ||
|
||
// Add a blob to the store. | ||
func (store *BlobStore) Add(blob *BlobMetadata) { | ||
store.lock.Lock() | ||
defer store.lock.Unlock() | ||
|
||
store.blobs[store.nextKey] = blob | ||
store.nextKey++ | ||
} | ||
|
||
// GetNext returns the next blob in the store. Decrements the blob's read permits, removing it | ||
// from the store if the permits are exhausted. This method makes no guarantees about the order | ||
// in which blobs are returned. Returns nil if no blobs are available. | ||
func (store *BlobStore) GetNext() *BlobMetadata { | ||
store.lock.Lock() | ||
defer store.lock.Unlock() | ||
|
||
for key, blob := range store.blobs { | ||
// Always return the first blob found. | ||
|
||
if blob.remainingReadPermits > 0 { | ||
blob.remainingReadPermits-- | ||
if blob.remainingReadPermits == 0 { | ||
delete(store.blobs, key) | ||
} | ||
} | ||
|
||
return blob | ||
} | ||
return nil | ||
} | ||
|
||
// Size returns the number of blobs currently stored. | ||
func (store *BlobStore) Size() uint { | ||
store.lock.Lock() | ||
defer store.lock.Unlock() | ||
|
||
return uint(len(store.blobs)) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,111 @@ | ||
package table | ||
|
||
import ( | ||
tu "github.com/Layr-Labs/eigenda/common/testutils" | ||
"github.com/stretchr/testify/assert" | ||
"golang.org/x/exp/rand" | ||
"testing" | ||
) | ||
|
||
// randomMetadata generates a random BlobMetadata instance. | ||
func randomMetadata(permits int) *BlobMetadata { | ||
key := make([]byte, 32) | ||
checksum := [16]byte{} | ||
_, _ = rand.Read(key) | ||
_, _ = rand.Read(checksum[:]) | ||
return NewBlobMetadata(key, &checksum, 1024, 0, permits) | ||
} | ||
|
||
// TestBasicOperation tests basic operations of the BlobTable. Adds blobs and iterates over them. | ||
func TestBasicOperation(t *testing.T) { | ||
tu.InitializeRandom() | ||
|
||
store := NewBlobStore() | ||
assert.Equal(t, uint(0), store.Size()) | ||
|
||
size := 1024 | ||
expectedMetadata := make([]*BlobMetadata, 0) | ||
for i := 0; i < size; i++ { | ||
metadata := randomMetadata(1) | ||
store.Add(metadata) | ||
expectedMetadata = append(expectedMetadata, metadata) | ||
assert.Equal(t, uint(i+1), store.Size()) | ||
} | ||
|
||
for i := 0; i < size; i++ { | ||
assert.Equal(t, expectedMetadata[i], store.blobs[uint64(i)]) | ||
} | ||
} | ||
|
||
// TestGetRandomWithRemoval tests getting a random blob data, but where the number of permits per blob is unlimited. | ||
func TestGetRandomNoRemoval(t *testing.T) { | ||
tu.InitializeRandom() | ||
|
||
table := NewBlobStore() | ||
assert.Equal(t, uint(0), table.Size()) | ||
|
||
// Requesting a random element from an empty table should return nil. | ||
element := table.GetNext() | ||
assert.Nil(t, element) | ||
|
||
expectedMetadata := make(map[string]*BlobMetadata) | ||
size := 128 | ||
for i := 0; i < size; i++ { | ||
metadata := randomMetadata(-1) // -1 == unlimited permits | ||
table.Add(metadata) | ||
expectedMetadata[string(metadata.key)] = metadata | ||
assert.Equal(t, uint(i+1), table.Size()) | ||
} | ||
|
||
randomIndices := make(map[string]bool) | ||
|
||
// Query more times than the number of blobs to ensure that blobs are not removed. | ||
for i := 0; i < size*8; i++ { | ||
metadata := table.GetNext() | ||
assert.NotNil(t, metadata) | ||
assert.Equal(t, expectedMetadata[string(metadata.key)], metadata) | ||
randomIndices[string(metadata.key)] = true | ||
} | ||
|
||
// Sanity check: ensure that at least 10 different blobs were returned. This check is attempting to verify | ||
// that we are actually getting random blobs. The probability of this check failing is extremely low if | ||
// the random number generator is working correctly. | ||
assert.GreaterOrEqual(t, len(randomIndices), 10) | ||
} | ||
|
||
// TestGetRandomWithRemoval tests getting a random blob data, where the number of permits per blob is limited. | ||
func TestGetRandomWithRemoval(t *testing.T) { | ||
tu.InitializeRandom() | ||
|
||
table := NewBlobStore() | ||
assert.Equal(t, uint(0), table.Size()) | ||
|
||
// Requesting a random element from an empty table should return nil. | ||
element := table.GetNext() | ||
assert.Nil(t, element) | ||
|
||
permitCount := 2 | ||
|
||
size := 1024 | ||
expectedMetadata := make(map[string]uint) | ||
for i := 0; i < size; i++ { | ||
metadata := randomMetadata(permitCount) | ||
table.Add(metadata) | ||
expectedMetadata[string(metadata.Key())] = 0 | ||
assert.Equal(t, uint(i+1), table.Size()) | ||
} | ||
|
||
// Requesting elements a number of times equal to the size times the number of permits should completely | ||
// drain the table and return all elements a number of times equal to the number of permits. | ||
for i := 0; i < size*permitCount; i++ { | ||
metadata := table.GetNext() | ||
assert.NotNil(t, metadata) | ||
|
||
k := string(metadata.Key()) | ||
permitsUsed := expectedMetadata[k] + 1 | ||
expectedMetadata[k] = permitsUsed | ||
assert.LessOrEqual(t, permitsUsed, uint(permitCount)) | ||
} | ||
|
||
assert.Equal(t, uint(0), table.Size()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the blobKey:
eigenda/disperser/disperser.go
Line 52 in 2566c21
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key, in this context, is the
[]byte
return value of this method:The
BlobKey
seems like it is holding the same data, although it is in string form, not byte array form. I originally had theMetadataHash
field as well, although that was removed based on a prior comment.Should I be using the
BlobKey
struct here?