Added blob table data structure. #677

Merged · 8 commits · Aug 6, 2024
Changes from 6 commits
67 changes: 67 additions & 0 deletions tools/traffic/table/blob_metadata.go
@@ -0,0 +1,67 @@
package table

// BlobMetadata encapsulates various information about a blob written by the traffic generator.
type BlobMetadata struct {
// key of the blob, set when the blob is initially uploaded.
key []byte
Contributor:
Is this the blobKey:

type BlobKey struct {
?

Contributor Author (@cody-littley), Aug 5, 2024:
The key, in this context, is the []byte return value of this method:

func (c *disperserClient) DisperseBlob(ctx context.Context, data []byte, quorums []uint8) (*disperser.BlobStatus, []byte, error) {

The BlobKey seems like it is holding the same data, although it is in string form, not byte array form. I originally had the MetadataHash field as well, although that was removed based on a prior comment.

Should I be using the BlobKey struct here?
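To make the key's role concrete, here is a minimal sketch of how the []byte key returned by DisperseBlob might be recorded in a BlobMetadata entry. The recordBlob helper, the MD5 checksum, and the externally supplied blobIndex are illustrative assumptions, not part of this PR.

package traffic

import (
	"crypto/md5"

	"github.com/Layr-Labs/eigenda/tools/traffic/table"
)

// recordBlob stores metadata for a blob that has just been dispersed. The key parameter is the
// []byte value returned by DisperseBlob; blobIndex is assumed to be looked up once the blob is
// confirmed in a batch.
func recordBlob(store *table.BlobStore, key []byte, data []byte, blobIndex uint) {
	checksum := md5.Sum(data) // [16]byte checksum of the payload (MD5 is an assumption here)
	// -1 read permits means an unlimited number of reads are permitted.
	store.Add(table.NewBlobMetadata(key, &checksum, uint(len(data)), blobIndex, -1))
}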


// checksum of the blob.
checksum *[16]byte
Contributor:
Does this need to be a pointer?

Contributor Author:
Changed to a non-pointer variable.


// size of the blob, in bytes.
size uint

// blobIndex of the blob.
blobIndex uint
Contributor:
What is blob index?

Contributor Author:
Blob index is one of the arguments needed when retrieving a blob. I'm unsure of the deeper meaning of this field; I figured out I needed it through reverse engineering.

func (r *retrievalClient) RetrieveBlob(
	ctx context.Context,
	batchHeaderHash [32]byte,
	blobIndex uint32,  👈
	referenceBlockNumber uint,
	batchRoot [32]byte,
	quorumID core.QuorumID) ([]byte, error) {

Contributor:
Yeah, this index is the position of the blob in the batch (the batch is a list of blobs).
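For context, a hypothetical sketch of the read path, showing how a stored blobIndex would be handed back to a retrieval client. The retriever interface simply restates the RetrieveBlob signature quoted above; the batch parameters are assumed to be recorded elsewhere when the batch is confirmed.

package traffic

import (
	"context"

	"github.com/Layr-Labs/eigenda/core"
	"github.com/Layr-Labs/eigenda/tools/traffic/table"
)

// retriever restates the RetrieveBlob signature quoted above.
type retriever interface {
	RetrieveBlob(ctx context.Context, batchHeaderHash [32]byte, blobIndex uint32,
		referenceBlockNumber uint, batchRoot [32]byte, quorumID core.QuorumID) ([]byte, error)
}

// readBlob reads a tracked blob back using the blobIndex stored in its metadata. The batch
// parameters (batchHeaderHash, referenceBlockNumber, batchRoot) identify the batch the blob was
// confirmed in and are assumed to be tracked elsewhere.
func readBlob(ctx context.Context, client retriever, blob *table.BlobMetadata,
	batchHeaderHash [32]byte, referenceBlockNumber uint, batchRoot [32]byte,
	quorumID core.QuorumID) ([]byte, error) {

	return client.RetrieveBlob(ctx, batchHeaderHash, uint32(blob.BlobIndex()),
		referenceBlockNumber, batchRoot, quorumID)
}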


// remainingReadPermits describes the maximum number of remaining reads permitted against this blob.
// If -1 then an unlimited number of reads are permitted.
remainingReadPermits int
}

// NewBlobMetadata creates a new BlobMetadata instance. The readPermits parameter describes the maximum number of
// remaining reads permitted against this blob. If -1 then an unlimited number of reads are permitted.
func NewBlobMetadata(
key []byte,
checksum *[16]byte,
size uint,
blobIndex uint,
readPermits int) *BlobMetadata {

if readPermits == 0 {
panic("readPermits must be greater than 0, or -1 for unlimited reads")
Contributor:
We could return an error instead of crashing when validation fails.

Contributor Author:
Change made.

}

return &BlobMetadata{
key: key,
checksum: checksum,
size: size,
blobIndex: blobIndex,
remainingReadPermits: readPermits,
}
}
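
Following the suggestion above to return an error rather than panic, the constructor might look roughly like this after that change. This is a sketch of only the panic-to-error change; later commits in this PR may differ in other details.

// NewBlobMetadata creates a new BlobMetadata instance. The readPermits parameter describes the
// maximum number of remaining reads permitted against this blob; -1 permits unlimited reads.
// Returns an error if readPermits is 0. (Requires importing "errors".)
func NewBlobMetadata(
	key []byte,
	checksum *[16]byte,
	size uint,
	blobIndex uint,
	readPermits int) (*BlobMetadata, error) {

	if readPermits == 0 {
		return nil, errors.New("readPermits must be greater than 0, or -1 for unlimited reads")
	}

	return &BlobMetadata{
		key:                  key,
		checksum:             checksum,
		size:                 size,
		blobIndex:            blobIndex,
		remainingReadPermits: readPermits,
	}, nil
}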

// Key returns the key of the blob.
Contributor:
What about making the member variables public? These getters are quite simple, so they don't seem worth the verbosity.

Contributor Author:
The intention was to make it possible to read these values without the ability to update them. But maybe that's more of a Java design pattern. I've made the member variables public as you suggest.

Contributor:
Yeah, it feels quite Java-ish.

func (blob *BlobMetadata) Key() []byte {
return blob.key
}

// Checksum returns the checksum of the blob.
func (blob *BlobMetadata) Checksum() *[16]byte {
return blob.checksum
}

// Size returns the size of the blob, in bytes.
func (blob *BlobMetadata) Size() uint {
return blob.size
}

// BlobIndex returns the blobIndex of the blob.
func (blob *BlobMetadata) BlobIndex() uint {
return blob.blobIndex
}

// RemainingReadPermits returns the maximum number of remaining reads permitted against this blob.
func (blob *BlobMetadata) RemainingReadPermits() int {
return blob.remainingReadPermits
}
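
With the review feedback above applied (exported fields instead of getters, and a non-pointer checksum), the struct would read roughly as follows. This is a sketch; later commits may differ in detail.

type BlobMetadata struct {
	// Key of the blob, set when the blob is initially uploaded.
	Key []byte

	// Checksum of the blob.
	Checksum [16]byte

	// Size of the blob, in bytes.
	Size uint

	// BlobIndex of the blob.
	BlobIndex uint

	// RemainingReadPermits describes the maximum number of remaining reads permitted against
	// this blob. If -1 then an unlimited number of reads are permitted.
	RemainingReadPermits int
}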
62 changes: 62 additions & 0 deletions tools/traffic/table/blob_store.go
@@ -0,0 +1,62 @@
package table

import "sync"

// BlobStore is a thread safe data structure that tracks blobs written by the traffic generator.
type BlobStore struct {

// blobs contains all blobs currently tracked by the store.
blobs map[uint64]*BlobMetadata

// nextKey describes the next key to be used for the blobs map.
nextKey uint64

lock sync.Mutex
}

// NewBlobStore creates a new BlobStore instance.
func NewBlobStore() *BlobStore {
return &BlobStore{
blobs: make(map[uint64]*BlobMetadata),
nextKey: 0,
}
}

// Add a blob to the store.
func (store *BlobStore) Add(blob *BlobMetadata) {
store.lock.Lock()
defer store.lock.Unlock()

store.blobs[store.nextKey] = blob
store.nextKey++
}

// GetNext returns the next blob in the store. Decrements the blob's read permits, removing it
// from the store if the permits are exhausted. This method makes no guarantees about the order
// in which blobs are returned. Returns nil if no blobs are available.
func (store *BlobStore) GetNext() *BlobMetadata {
store.lock.Lock()
defer store.lock.Unlock()

for key, blob := range store.blobs {
// Always return the first blob found.

if blob.remainingReadPermits > 0 {
blob.remainingReadPermits--
if blob.remainingReadPermits == 0 {
delete(store.blobs, key)
}
}

return blob
}
return nil
}

// Size returns the number of blobs currently stored.
func (store *BlobStore) Size() uint {
store.lock.Lock()
defer store.lock.Unlock()

return uint(len(store.blobs))
}
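
A hypothetical usage sketch of the store: a writer adds a blob with a bounded number of read permits, and a reader drains it through GetNext, which removes the blob once its permits are exhausted. The surrounding package and placeholder values are assumptions for illustration.

package traffic

import (
	"fmt"

	"github.com/Layr-Labs/eigenda/tools/traffic/table"
)

// drainExample adds a single blob with two read permits and reads until the store is empty.
func drainExample() {
	store := table.NewBlobStore()

	checksum := [16]byte{}
	// A blob that may be read at most twice; key, checksum, size, and index are placeholder values.
	store.Add(table.NewBlobMetadata([]byte("example-key"), &checksum, 1024, 0, 2))

	// GetNext decrements the blob's permits on each call and removes it once they hit zero.
	for {
		blob := store.GetNext()
		if blob == nil {
			break // store drained after two reads
		}
		fmt.Printf("read blob: size=%d index=%d\n", blob.Size(), blob.BlobIndex())
	}
}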
111 changes: 111 additions & 0 deletions tools/traffic/table/blob_store_test.go
@@ -0,0 +1,111 @@
package table

import (
tu "github.com/Layr-Labs/eigenda/common/testutils"
"github.com/stretchr/testify/assert"
"golang.org/x/exp/rand"
"testing"
)

// randomMetadata generates a random BlobMetadata instance.
func randomMetadata(permits int) *BlobMetadata {
key := make([]byte, 32)
checksum := [16]byte{}
_, _ = rand.Read(key)
_, _ = rand.Read(checksum[:])
return NewBlobMetadata(key, &checksum, 1024, 0, permits)
}

// TestBasicOperation tests basic operations of the BlobTable. Adds blobs and iterates over them.
func TestBasicOperation(t *testing.T) {
tu.InitializeRandom()

store := NewBlobStore()
assert.Equal(t, uint(0), store.Size())

size := 1024
expectedMetadata := make([]*BlobMetadata, 0)
for i := 0; i < size; i++ {
metadata := randomMetadata(1)
store.Add(metadata)
expectedMetadata = append(expectedMetadata, metadata)
assert.Equal(t, uint(i+1), store.Size())
}

for i := 0; i < size; i++ {
assert.Equal(t, expectedMetadata[i], store.blobs[uint64(i)])
}
}

// TestGetRandomNoRemoval tests getting random blob data where the number of permits per blob is unlimited.
func TestGetRandomNoRemoval(t *testing.T) {
tu.InitializeRandom()

table := NewBlobStore()
assert.Equal(t, uint(0), table.Size())

// Requesting a random element from an empty table should return nil.
element := table.GetNext()
assert.Nil(t, element)

expectedMetadata := make(map[string]*BlobMetadata)
size := 128
for i := 0; i < size; i++ {
metadata := randomMetadata(-1) // -1 == unlimited permits
table.Add(metadata)
expectedMetadata[string(metadata.key)] = metadata
assert.Equal(t, uint(i+1), table.Size())
}

randomIndices := make(map[string]bool)

// Query more times than the number of blobs to ensure that blobs are not removed.
for i := 0; i < size*8; i++ {
metadata := table.GetNext()
assert.NotNil(t, metadata)
assert.Equal(t, expectedMetadata[string(metadata.key)], metadata)
randomIndices[string(metadata.key)] = true
}

// Sanity check: ensure that at least 10 different blobs were returned. This check is attempting to verify
// that we are actually getting random blobs. The probability of this check failing is extremely low if
// the random number generator is working correctly.
assert.GreaterOrEqual(t, len(randomIndices), 10)
}

// TestGetRandomWithRemoval tests getting random blob data where the number of permits per blob is limited.
func TestGetRandomWithRemoval(t *testing.T) {
tu.InitializeRandom()

table := NewBlobStore()
assert.Equal(t, uint(0), table.Size())

// Requesting a random element from an empty table should return nil.
element := table.GetNext()
assert.Nil(t, element)

permitCount := 2

size := 1024
expectedMetadata := make(map[string]uint)
for i := 0; i < size; i++ {
metadata := randomMetadata(permitCount)
table.Add(metadata)
expectedMetadata[string(metadata.Key())] = 0
assert.Equal(t, uint(i+1), table.Size())
}

// Requesting elements a number of times equal to the size times the number of permits should completely
// drain the table and return all elements a number of times equal to the number of permits.
for i := 0; i < size*permitCount; i++ {
metadata := table.GetNext()
assert.NotNil(t, metadata)

k := string(metadata.Key())
permitsUsed := expectedMetadata[k] + 1
expectedMetadata[k] = permitsUsed
assert.LessOrEqual(t, permitsUsed, uint(permitCount))
}

assert.Equal(t, uint(0), table.Size())
}