Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

share!(shwap): implement GetRange over Shwap #3773

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

vgonkivs
Copy link
Member

@vgonkivs vgonkivs commented Sep 25, 2024

Implement GetRange over Shwap

@vgonkivs vgonkivs force-pushed the implement_range_container branch 3 times, most recently from a89eaf5 to 7df200c Compare December 3, 2024 14:34
@vgonkivs vgonkivs self-assigned this Dec 3, 2024
@codecov-commenter
Copy link

codecov-commenter commented Dec 6, 2024

Codecov Report

Attention: Patch coverage is 43.77990% with 470 lines in your changes missing coverage. Please review.

Project coverage is 44.93%. Comparing base (2469e7a) to head (daf1162).
Report is 433 commits behind head on main.

Files with missing lines Patch % Lines
share/shwap/pb/shwap.pb.go 36.76% 155 Missing and 29 partials ⚠️
share/shwap/range_namespace_data.go 52.22% 56 Missing and 19 partials ⚠️
share/shwap/range_namespace_data_id.go 52.68% 31 Missing and 13 partials ⚠️
...re/shwap/p2p/bitswap/range_namespace_data_block.go 52.63% 24 Missing and 12 partials ⚠️
share/shwap/p2p/bitswap/getter.go 0.00% 28 Missing ⚠️
share/shwap/namespace_data.go 0.00% 18 Missing ⚠️
share/shwap/getters/cascade.go 0.00% 14 Missing ⚠️
nodebuilder/share/share.go 0.00% 13 Missing ⚠️
share/eds/proofs_cache.go 75.00% 8 Missing and 4 partials ⚠️
store/getter.go 0.00% 12 Missing ⚠️
... and 8 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3773      +/-   ##
==========================================
+ Coverage   44.83%   44.93%   +0.10%     
==========================================
  Files         265      311      +46     
  Lines       14620    23302    +8682     
==========================================
+ Hits         6555    10471    +3916     
- Misses       7313    11669    +4356     
- Partials      752     1162     +410     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@vgonkivs vgonkivs marked this pull request as ready for review December 6, 2024 11:40
@vgonkivs vgonkivs added kind:misc Attached to miscellaneous PRs kind:break! Attached to breaking PRs labels Dec 6, 2024
@vgonkivs vgonkivs changed the title Implement range container share!(shwap): implement GetRange over Shwap Dec 6, 2024
@vgonkivs vgonkivs force-pushed the implement_range_container branch from 6f8e471 to 1327371 Compare December 6, 2024 11:45
ns libshare.Namespace,
from, to shwap.SampleCoords,
) (shwap.RangeNamespaceData, error) {
rawShares := make([][]libshare.Share, 0, to.Row-from.Row+1)
Copy link
Contributor

@cristaloleg cristaloleg Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we verify to.Row > from.Row ?

Copy link
Member Author

@vgonkivs vgonkivs Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it makes sense. I have checked this but on a deeper level. I guess the best place to check it in the cascade getter(sanity check before passing the request further)
cc @Wondertan

namespace libshare.Namespace,
from, to shwap.SampleCoords,
) (shwap.RangeNamespaceData, error) {
ctx, span := tracer.Start(ctx, "cascade/get-shares-by-namespace", trace.WithAttributes(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong tag? Just cascade/get-shares ?

Copy link
Member

@Wondertan Wondertan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass. Looks good

Comment on lines -92 to -94
namespacedShares := make([]libshare.Share, count)
copy(namespacedShares, shares[from:from+count])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a subtle reason for doing that, and I don't remember exactly which one. @walldiss had to debug it for quite some time because initially, we avoided the copy as well. The comment explaining this is definitely missing.
Let's recover what is going on here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need an extra copy here? Maybe it was left as an artifact? cc @walldiss

store/getter.go Outdated
Comment on lines 105 to 132
acc, err := g.store.GetByHeight(ctx, h.Height())
if err != nil {
return shwap.RangeNamespaceData{}, err
}
return acc.RangeNamespaceData(
ctx,
ns,
from, to,
)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error wrappings are helpful here. Let's keep them as with other methods.

Comment on lines 31 to 32
// Verify checks the integrity of the NamespaceData against a provided root and namespace.
func (nd NamespaceData) Verify(root *share.AxisRoots, namespace libshare.Namespace) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rule of thumb we have: Validate - stateless, Verify - stateful.

In this case, it is stateful as it expects additional states to do verification, so it should be kept Verify.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reworked

@@ -143,3 +143,18 @@ jobs:

- name: run sync tests
run: make test-integration SHORT=true TAGS=pruning

share_tests:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract to a separate PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 58 to 62
GetRange(
ctx context.Context, ns libshare.Namespace, height uint64, from, to uint32, proofsOnly bool,
) (*Range, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to use shwap,SampleCoords in the node API. As we use them already in other methods and internally it is still SampleCoords, I propose to move use them here as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Akchually, the SampleCoords are for the whole EDS, while this index is ODS only. For that reason, we should consider having another type, particularly for ODS coordinates or indices. We could also try to have them both in a single type, but we should be careful with documentation and input validation. (This is for all API, Getters, Accessors)

Copy link
Member Author

@vgonkivs vgonkivs Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my understanding SampleCoords can serve for both cases as it contains RowIndex and ColIndex which will be same. We should only check that range request asks for ODS only shares(RowIndex < len(dah.RowRoots)/2 && ColIndex < len(dah.ColRoots)/2).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to be consistent with all other methods. Reworked to coordinates

for offset := range shrs[from.Col:end] {
if !namespace.Equals(shrs[from.Col+offset].Namespace()) {
return RangeNamespaceData{},
fmt.Errorf("targeted namespace was not found in share at {Row: %d, Col:%d}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fmt.Errorf("targeted namespace was not found in share at {Row: %d, Col:%d}",
fmt.Errorf("targeted namespace was not found in share at {Row: %d, Col: %d}",


// Validate performs a validation of the incoming data. It ensures that the response contains proofs and performs
// data validation in case the user has requested it.
func (rngdata *RangeNamespaceData) Validate(root *share.AxisRoots, req *RangeNamespaceDataID) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use Verify

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

Comment on lines 281 to 285
for _, blk := range blks {
rnd := blk.(*RangeNamespaceDataBlock).Container
return rnd, nil
}
return shwap.RangeNamespaceData{}, errors.New("no data inside block")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Fetch guarantees having all blocks populated if error is nil, so we can simply do

Suggested change
for _, blk := range blks {
rnd := blk.(*RangeNamespaceDataBlock).Container
return rnd, nil
}
return shwap.RangeNamespaceData{}, errors.New("no data inside block")
return blks[0].(*RangeNamespaceDataBlock).Container, nil

Comment on lines +29 to +32
message NamespaceData {
repeated RowNamespaceData namespaceData = 1;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about also adding RangeNamespaceData in proto? It would embed NamespaceData. The main motivation would be keeping all the containers part of proto "spec" for consistency and clarity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree

message Share {
bytes data = 1;
}

enum AxisType {
ROW = 0;
COL = 1;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get back the \n

Copy link
Member

@rach-id rach-id left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some questions to understand more

@@ -56,7 +55,9 @@ type Module interface {
ctx context.Context, height uint64, namespace libshare.Namespace,
) (shwap.NamespaceData, error)
// GetRange gets a list of shares and their corresponding proof.
GetRange(ctx context.Context, height uint64, start, end int) (*GetRangeResult, error)
GetRange(
ctx context.Context, ns libshare.Namespace, height uint64, from, to uint32, proofsOnly bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[docs]
would be good to know if the from and to are inclusive or exclusive. FYI, for core we use [from:to[

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to coordinates which represents the first and the last share of the range

}

// RangedNamespaceDataFromShares builds a range of namespaced data for the given coordinates:
// shares is a list of shares(grouped by the rows) relative to the data square and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// shares is a list of shares(grouped by the rows) relative to the data square and
// shares is a list of shares (grouped by the rows) relative to the data square and

fmt.Errorf("start row must be less or equal to end row: %d-%d", from.Row, to.Row)
}

odsSize := len(shares[0]) / 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if the set of shares is smaller than a row?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shares[0] is a full row

// from is the coordinates of the first share of the range within the EDS.
// to is the coordinates of the last inclusive share of the range within the EDS.
func RangedNamespaceDataFromShares(
shares [][]libshare.Share,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do shares include all the row shares or just the shares we want to prove?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RangedNamespaceDataFromShares accepts all shares from the row but the data structure contains only shares that we want to prove

Comment on lines 51 to 74
nmtTree := nmt.New(
appconsts.NewBaseHashFunc(),
nmt.NamespaceIDSize(libshare.NamespaceSize),
nmt.IgnoreMaxNamespace(true),
)
tree.SetTree(nmtTree)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
nmtTree := nmt.New(
appconsts.NewBaseHashFunc(),
nmt.NamespaceIDSize(libshare.NamespaceSize),
nmt.IgnoreMaxNamespace(true),
)
tree.SetTree(nmtTree)

I don't think creating the tree manually or setting the tree here is necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdym? I want to build nmt proof

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line:

tree := wrapper.NewErasuredNamespacedMerkleTree(uint64(odsSize), uint(row))

does that for you. you don't need those other lines I deleted

)
tree.SetTree(nmtTree)

shrs := shares[i]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]

Suggested change
shrs := shares[i]
rowShares := shares[i]

Comment on lines +69 to +93

outside, err := share.IsOutsideRange(namespace, root, root)
if err != nil {
return RangeNamespaceData{}, err
}
if outside {
return RangeNamespaceData{}, ErrNamespaceOutsideRange
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check could be moved to the top of the function to something like:

Suggested change
outside, err := share.IsOutsideRange(namespace, root, root)
if err != nil {
return RangeNamespaceData{}, err
}
if outside {
return RangeNamespaceData{}, ErrNamespaceOutsideRange
}
outside, err := share.IsOutsideRange(namespace, shares[0][0], shares[len(shares)-1][len(shares[len(shares)-1])-1])
if err != nil {
return RangeNamespaceData{}, err
}
if outside {
return RangeNamespaceData{}, ErrNamespaceOutsideRange
}

This if we assume that the provided list of shares is ordered by namespace. If this is the case, we should document it as a precondition in the function. Otherwise, we should add more checks on the list of shares provided

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the shares are out of order, then tree.Push() fails with

if nID.Less(n.leaves[curSize-1][:nidSize]) {
			return nil, fmt.Errorf(
				"%w: last namespace: %x, pushed: %x",
				ErrInvalidPushOrder,
				n.leaves[curSize-1][:nidSize],
				nID,
			)
		}

Also, this function is called from the accessors(within 1 machine), so there is no reason not to trust them.

end := odsSize
if i == len(shares)-1 {
// `to.Col` is an inclusive index
end = to.Col + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]
would be nice to reflect this in the name and rename it to exclusiveEnd

Comment on lines 87 to 94
for offset := range shrs[from.Col:end] {
if !namespace.Equals(shrs[from.Col+offset].Namespace()) {
return RangeNamespaceData{},
fmt.Errorf("targeted namespace was not found in share at {Row: %d, Col:%d}",
row, from.Col+offset,
)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also this would be nice to do at the beginning of the function to avoid running all the logic to find out the provided range of shares is malformed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving it to the beginning of the function could allow us to avoid extra checks of the share.IsOutsideRange. wdyt?

}

// ProveRange proves that a range of shares exist in the set of rows that are part of the merkle tree of the data root.
func (rngdata *RangeNamespaceData) ProveRange(roots *share.AxisRoots, startRow int) *types.ShareProof {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this proving? how do you specify the set of shares to be proved?

Copy link
Member Author

@vgonkivs vgonkivs Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RangeNamespaceData is a slice of the RowNamespaceData, that contains shares and nmt proofs for these shares:

type RowNamespaceData struct {
	Shares []libshare.Share `json:"shares"` // Shares within the namespace.
	Proof  *nmt.Proof       `json:"proof"`  // Proof of the shares' inclusion in the namespace.
}

ProveRange builds a proof of the shares to the data root

@vgonkivs vgonkivs force-pushed the implement_range_container branch from 6229518 to bf00550 Compare January 29, 2025 15:47
@vgonkivs vgonkivs force-pushed the implement_range_container branch from a8f5dde to fcc89e3 Compare January 29, 2025 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:break! Attached to breaking PRs kind:misc Attached to miscellaneous PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants