
Added blob table data structure. #677

Merged (8 commits) on Aug 6, 2024

Conversation

cody-littley (Contributor) commented on Jul 31, 2024:

Why are these changes needed?

This is code split off from another PR that got too big: #666

This PR adds a data structure called a BlobTable, which is used by the traffic generator to track blobs.

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

// checksum of the blob.
checksum *[16]byte

// batchHeaderHash of the blob in bytes.
Reviewer:

this comment should be updated

cody-littley (author):

oops, fixed

defer table.lock.Unlock()

blob.index = table.size
table.blobs[table.size] = blob
Reviewer:

What if table.size is out of bounds?

cody-littley (author), Aug 1, 2024:

Fixed.

Good catch. It just so happened that the original capacity was 1024, which exceeded the number of blobs I was testing with.

I also made the initial capacity 0, thus making existing unit tests sensitive to this problem (as well as eliminating a magic number).

// NewBlobTable creates a new BlobTable instance.
func NewBlobTable() BlobTable {
return BlobTable{
blobs: make([]*BlobMetadata, 1024),
Reviewer:

why don't we make the size of this slice variable?

cody-littley (author):

That would work as long as we only add blobs. When blobs are removed from the table, the size of the table may actually be smaller than the size of this slice.

size uint

// lock is used to synchronize access to the requiredReads.
lock sync.Mutex
Reviewer:

we can use RWMutex for more granular control

cody-littley (author):

done

table.size++
}

// AddOrReplace adds a blob to the requiredReads if there is capacity or replaces an existing blob at random
Reviewer:

What's the requiredReads?

cody-littley (author):

Documentation was out of date. Fixed.

// AddOrReplace is equivalent to Add if there is capacity, or replaces an existing blob at random
// if there is no remaining capacity. This method is a no-op if maximumCapacity is 0.

defer table.lock.Unlock()

if table.size >= maximumCapacity {
// replace random existing blob
Reviewer:

what if the existing blob hasn't been retrieved the required number of times?

cody-littley (author):

Then it is removed.

This is not a problem for our use case though, since we keep two blob tables: one for required blobs, and another for optional blobs. The code never calls AddOrReplace() on the table that contains the required blobs.

blob := table.blobs[rand.Int31n(int32(table.size))]

removed := false
if decrement && blob.remainingReadPermits != -1 {
Reviewer:

shouldn't the condition be blob.remainingReadPermits != -1?
Say blob.remainingReadPermits is 0 before this call. Then it gets decremented to -1 inside this block and is never removed

cody-littley (author):

Read permits for blobs in the table are never 0. They are always -1 or greater than 0, and a blob is removed the moment its count reaches 0.

Just in case, I added an assertion in the NewBlobMetadata() method to validate that this invariant is not violated.

	if readPermits == 0 {
		panic("readPermits must be greater than 0, or -1 for unlimited reads")
	}

Reviewer:

ohh got it

}

// remove a blob from the requiredReads.
func (table *BlobTable) remove(blob *BlobMetadata) {
Reviewer:

Is there a particular reason why we need table to be a slice?
I feel like removing/adding elements would be a lot simpler and less brittle if we used a map

cody-littley (author):

Primary reason why I use a slice is to give an O(1) implementation of GetRandom(). Can you think of a good way to do this backed by a map?

Reviewer:

Do we need to access a random element from the slice?
For sampling, could we just use the first element from the map?

cody-littley (author):

The entire reason why this complex data structure exists in the first place is to facilitate random access. 😜

For the blobs with a number of required reads: yes, we could get away without random access. But I'm under the impression a random access pattern is preferable to a fixed one when simulating workloads like this.

For the pool of optional blobs to read, I think random access is necessary. Otherwise, we'd just be reading the same blob over and over until we get a new blob to start reading.

I'm open to discussing this more in depth if you are not convinced by my reasoning.

ian-shim (Contributor), Aug 1, 2024:

> For the blobs with a number of required reads: yes, we could get away without random access. But I'm under the impression a random access pattern is preferable to a fixed one when simulating workloads like this.

We could get semi-random access with a map in constant time (just generate a random number n < 10 and pick the nth element from the map) if the access doesn't require sampling from a uniform distribution.

> For the pool of optional blobs to read, I think random access is necessary. Otherwise, we'd just be reading the same blob over and over until we get a new blob to start reading.

I don't think optional blob reads were ever part of the spec. The primary goal for this observability tool is to monitor whether the network can handle a given retrieval traffic. Are there benefits of saturating the network with optional reads?

Not a big deal since you have it implemented already, but I would bias toward simplicity vs. optimization.

cody-littley (author):

Refactored to use a map based implementation.
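For reference, one common way to get a map-backed structure that still supports O(1) uniform random sampling (a sketch under assumed names, not necessarily what the refactor looks like) is to pair the map with a slice of keys and use swap-delete on removal:

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomAccessMap pairs a map with a key slice so that insertion, removal,
// and uniform random sampling are all O(1). Names are illustrative.
type randomAccessMap struct {
	index map[string]int // key -> position in keys
	keys  []string
}

func newRandomAccessMap() *randomAccessMap {
	return &randomAccessMap{index: make(map[string]int)}
}

// Add appends the key and records its position.
func (r *randomAccessMap) Add(key string) {
	r.index[key] = len(r.keys)
	r.keys = append(r.keys, key)
}

// Remove swap-deletes: the last key moves into the removed slot, keeping
// the slice dense so sampling stays uniform.
func (r *randomAccessMap) Remove(key string) {
	pos, ok := r.index[key]
	if !ok {
		return
	}
	last := r.keys[len(r.keys)-1]
	r.keys[pos] = last
	r.index[last] = pos
	r.keys = r.keys[:len(r.keys)-1]
	delete(r.index, key)
}

// GetRandom samples a key uniformly at random in O(1).
func (r *randomAccessMap) GetRandom() string {
	return r.keys[rand.Intn(len(r.keys))]
}

func main() {
	m := newRandomAccessMap()
	m.Add("a")
	m.Add("b")
	m.Add("c")
	m.Remove("b")
	fmt.Println(len(m.keys))                      // 2
	got := m.GetRandom()
	fmt.Println(got == "a" || got == "c")         // true
}
```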

// BlobMetadata encapsulates various information about a blob written by the traffic generator.
type BlobMetadata struct {
// key of the blob, set when the blob is initially uploaded.
key *[]byte
Reviewer:

Here and below: you may not need the pointer, as a slice is already a reference type and cheap to copy around.

cody-littley (author):

Interesting, TIL. Will simplify code to not pass slice pointers.

// checksum of the blob.
checksum *[16]byte

// batchHeaderHash of the blob in bytes.
Reviewer:

batchHeaderHash is fixed 32 bytes so no need to track it

cody-littley (author):

Field removed.


// Size returns the total number of blobs currently tracked by the requiredReads.
func (table *BlobTable) Size() uint {
table.lock.Lock()
Reviewer:

Here and a few places below: can just use read lock

cody-littley (author):

Yup, I've switched to a read/write lock pattern.
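As a sketch of the read/write lock pattern adopted here (hypothetical names, not the PR's exact code): read-only methods such as `Size` take the shared lock so concurrent reads don't serialize, while mutating methods take the exclusive lock:

```go
package main

import (
	"fmt"
	"sync"
)

// blobTable guarded by an RWMutex: readers take the shared read lock,
// writers take the exclusive write lock.
type blobTable struct {
	lock  sync.RWMutex
	blobs map[string]int
}

// Size only reads state, so the shared lock suffices; many Size calls
// may run concurrently.
func (t *blobTable) Size() int {
	t.lock.RLock()
	defer t.lock.RUnlock()
	return len(t.blobs)
}

// Add mutates state, so it needs the exclusive lock, which blocks both
// readers and other writers.
func (t *blobTable) Add(key string) {
	t.lock.Lock()
	defer t.lock.Unlock()
	t.blobs[key] = len(t.blobs)
}

func main() {
	t := &blobTable{blobs: make(map[string]int)}
	t.Add("x")
	fmt.Println(t.Size()) // 1
}
```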

Signed-off-by: Cody Littley <[email protected]>
// remove a blob from the requiredReads.
func (table *BlobTable) remove(blob *BlobMetadata) {
if table.blobs[blob.index] != blob {
panic(fmt.Sprintf("blob %x is not present in the requiredReads at index %d", blob.Key(), blob.index))
Reviewer:

Should we log at error level and handle this case gracefully vs. crashing the whole program?

cody-littley (author):

This code was replaced with the simpler implementation.

}


readPermits int) *BlobMetadata {

if readPermits == 0 {
panic("readPermits must be greater than 0, or -1 for unlimited reads")
Reviewer:

We could return error instead of crashing when validation fails

cody-littley (author):

change made
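The change can be sketched as a constructor that returns an error instead of panicking, letting the caller decide how to recover (illustrative signature; the PR's actual code may differ):

```go
package main

import (
	"errors"
	"fmt"
)

// BlobMetadata sketch carrying the validated readPermits invariant.
type BlobMetadata struct {
	readPermits int
}

// NewBlobMetadata reports invalid input as an error rather than crashing
// the whole program. Valid values are -1 (unlimited) or any positive count.
func NewBlobMetadata(readPermits int) (*BlobMetadata, error) {
	if readPermits == 0 || readPermits < -1 {
		return nil, errors.New("readPermits must be greater than 0, or -1 for unlimited reads")
	}
	return &BlobMetadata{readPermits: readPermits}, nil
}

func main() {
	if _, err := NewBlobMetadata(0); err != nil {
		fmt.Println("validation failed:", err)
	}
	m, _ := NewBlobMetadata(-1)
	fmt.Println(m.readPermits) // -1
}
```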

ian-shim (Contributor) reviewed:

lgtm! Can we address this comment from last review?

key []byte

// checksum of the blob.
checksum *[16]byte
Reviewer:

Does this need to be a pointer?

cody-littley (author):

Changed to a non-pointer variable.

size uint

// blobIndex of the blob.
blobIndex uint
Reviewer:

What is blob index?

cody-littley (author):

Blob index is one of the arguments needed when retrieving a blob. I am unsure of the deeper meaning of this field; I figured out that I needed it through reverse engineering.

func (r *retrievalClient) RetrieveBlob(
	ctx context.Context,
	batchHeaderHash [32]byte,
	blobIndex uint32,  // <-- this one
	referenceBlockNumber uint,
	batchRoot [32]byte,
	quorumID core.QuorumID) ([]byte, error) {

Reviewer:

Yeah, this index is the position of the blob in the batch (the batch is a list of blobs).

// BlobMetadata encapsulates various information about a blob written by the traffic generator.
type BlobMetadata struct {
// key of the blob, set when the blob is initially uploaded.
key []byte
Reviewer:

Is this the blobKey (type BlobKey struct)?

cody-littley (author), Aug 5, 2024:

The key, in this context, is the []byte return value of this method:

func (c *disperserClient) DisperseBlob(ctx context.Context, data []byte, quorums []uint8) (*disperser.BlobStatus, []byte, error) {

The BlobKey seems like it is holding the same data, although it is in string form, not byte array form. I originally had the MetadataHash field as well, although that was removed based on a prior comment.

Should I be using the BlobKey struct here?

}, nil
}

// Key returns the key of the blob.
Reviewer:

What about making the member variables public? These getters are quite simple so seem not worth the verbosity.

cody-littley (author):

The intention was to make it possible to read these values without the capability of updating them. But maybe that's more of a Java design pattern. I've made the member variables public as you suggested.

Reviewer:

Yea it feels quite Java-ish

@cody-littley cody-littley merged commit 7020083 into Layr-Labs:master Aug 6, 2024
6 checks passed
@cody-littley cody-littley deleted the blob-table-fragment branch August 6, 2024 12:57