Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Per topic validation #56

Closed
Stebalien opened this issue Jan 13, 2018 · 8 comments
Closed

Per topic validation #56

Stebalien opened this issue Jan 13, 2018 · 8 comments
Assignees

Comments

@Stebalien
Copy link
Member

In #55, validation is per-subscription. Unfortunately, this means:

  1. If we have multiple subscribers subscribing with the same validator, we'll run it multiple times (e.g., check a signature multiple times). This is really inefficient.
  2. If we have multiple subscribers subscribing with different validators, these subscriptions will affect each other. That's not what a user will suspect.

Really, validation is a property of the topic, not the subscription. Eventually, I'd like to be able to tell pubsub to relay a topic without explicitly subscribing (#28).


In terms of implementation, we have a few choices:

  1. Require that all topics be namespaced and use the namespace to pick the validator.
  2. Allow validators to decide whether or not they care about a topic.

2 is probably the easiest but not the cleanest. Possible interface:

// Register a validator with the pubsub.
func (p *PubSub) RegisterValidator(v Validator)

type Validator interface {
    // Apply this validator to this topic.
    Validates(topic string) bool // or `ValidatorFor(topic string) TopicValidator`?
    // Apply this validator to this topic by descriptor. Do we even need topic descriptors anymore? We don't use them... Having both of these is annoying...
    ValidatesTopicDescriptor(topic *pb.TopicDescriptor) bool
    Validate(msg) bool
}
@Stebalien
Copy link
Member Author

/cc @whyrusleeping thoughts?

@Stebalien
Copy link
Member Author

Note: if we want to do this on a per-subscription basis, we'd probably want to forward messages that validate against at least one of the validators ("or", not "and"). We'd also probably want to deduplicate duplicate validators (store them in a set).

@vyzo
Copy link
Collaborator

vyzo commented Jan 14, 2018

I think the simplest possible interface is to support direct registration per topic:

type Validator func(ctx context.Context, msg *Message) bool // this could be error for reversed semantics
func (p *PubSub) registerValidator(topic string, val Validator)

I don't like the idea of having a complex validator interface that we have to query to find topic association, let's just keep it a function.

@vyzo
Copy link
Collaborator

vyzo commented Jan 14, 2018

One thing I don't like with per-topic validation, compared to per subscription validation, is that we have to make two api calls in order to have validation: first to register the validator, and then to subscribe.

There is also the issue of multiple registered validators that we have with the per subscription interface; we still have to make a decision on whether to broadcast a message or not based on all the validators which suffers with the same semantics problem of per subscription validators -- do we accept a message based on a conjunction or a disjunction of validator outputs?

Perhaps we should only allow a single validator per topic.

@vyzo
Copy link
Collaborator

vyzo commented Jan 14, 2018

One final point: as @Stebalien suggests, we can keep per subscription validators but make the decision to broadcast based on the disjunction of validator output. We'd have to be careful to dispatch only to subscriptions for which the validator succeeded though.

@Stebalien
Copy link
Member Author

One thing I don't like with per-topic validation, compared to per subscription validation, is that we have to make two api calls in order to have validation: first to register the validator, and then to subscribe.

Only with the "register a validator per topic approach". The other approach of registering global validators doesn't have this limitation (you register a validator up-front for the topic family).

There is also the issue of multiple registered validators that we have with the per subscription interface; we still have to make a decision on whether to broadcast a message or not based on all the validators which suffers with the same semantics problem of per subscription validators -- do we accept a message based on a conjunction or a disjunction of validator outputs?

If multiple validators want to validate a message, we should take the and.

Note: we could also just require that all topics have a namespace prefix. I'd actually prefer this. Then, we'd register one validator per namespace. This would be consistent with the DHT.

we can keep per subscription validators but make the decision to broadcast based on the disjunction of validator output.

My one worry with this approach is that peers may end up disagreeing on what messages are valid on a topic. In that case, we effectively have different topics.

It also means that we can't disconnect from peers that send us many bad messages (assuming they're spamming us). We'd be able to determine if a message is valid but not if the message is definitely invalid.

@vyzo
Copy link
Collaborator

vyzo commented Jan 15, 2018

The other approach of registering global validators doesn't have this limitation (you register a validator up-front for the topic family).

that's a big plus -- and we can ship validators for well known channels in the future.

If multiple validators want to validate a message, we should take the and.

That's what we do with the per subscription based validator (it's effectively a registerValidator and subscribe)

Note: we could also just require that all topics have a namespace prefix. I'd actually prefer this. Then, we'd register one validator per namespace. This would be consistent with the DHT.

It makes sense to do this for well known or ipfs-the-daemon-used channels, but I don't want to restrict userland.
Users should be free to name their channels however they like.

My one worry with this approach is that peers may end up disagreeing on what messages are valid on a topic. In that case, we effectively have different topics.
It also means that we can't disconnect from peers that send us many bad messages (assuming they're spamming us). We'd be able to determine if a message is valid but not if the message is definitely invalid.

Agreed, and that's a really big issue. So we can throw away the validator disjunction idea already.

@Stebalien
Copy link
Member Author

That's what we do with the per subscription based validator

That reasoning only makes sense when registering per-topic validators. When validation is a property of the topic, I'd expect all the validators to pass. However, when validation is a property of the subscription, I wouldn't expect the subscriptions to interfere with each other.

Agreed, and that's a really big issue. So we can throw away the validator disjunction idea already.

In that case, I'd also throw away the concept of validators as a property of the subscription.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants