Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zero-allocation* redesign #76

Open
clarkmcc opened this issue Aug 3, 2024 · 3 comments
Open

Zero-allocation* redesign #76

clarkmcc opened this issue Aug 3, 2024 · 3 comments

Comments

@clarkmcc
Copy link
Owner

clarkmcc commented Aug 3, 2024

This issue is meant to be a scratchpad for ideas and feedback on improving how types and values are handled in the interpreter. There have been several different feature requests recently that could be solved by the a fundamental shift in how CEL values are managed.

Today any value referenced in a CEL expression is owned by either the Program or the Context. The Program owns values that were created directly in the CEL expression, for example in the expression ("foo" == "bar") == false, there are three values owned by the Program, "foo", "bar", and false. Values that are owned by the Context are values that are dynamically provided at execution time to a Program as variables, like foo == bar, both foo and bar are values owned by the Context.

  • I like the idea of fundamentally changing Context so that it does not own any data, meaning that you do not need to clone your types to use them in CEL. Instead I'd like the Context to have references to that data.
  • In the case of deeply nested structs, we would provide a derive macro to generate field accessors that the interpreter would call when doing variable resolution. When referencing a property on a struct, CEL would operate on a reference to that data.

Questions:

  • Would we even need Arc/Rc's anymore or could we get away with just this since we would assume the caller owned all the data. RIght now, an Arc is required for values owned by Program because a Program can be executed one or more times simultaneously.
    pub enum Value<'a> {
        String(&'a str)
    }
  • We can easily get references to, and represent primitive types in the interpreter, but what if an expression returned a property whose type was some other user-defined struct? How would we work with that? Perhaps instead of a Value enum, we have a Value trait that exposes every behavior supported in a CEL expression, i.e.:
    pub trait Value {
        fn get_property(&self, key: &str) -> Box<dyn Value>;
    
        fn add(&self, other: &dyn Value) -> Box<dyn Value>;
    
        fn sub(&self, other: &dyn Value) -> Box<dyn Value>;
    }
@kyle-mccarthy
Copy link

kyle-mccarthy commented Aug 26, 2024

While not zero-allocation, have you considered integrating with the protobuf ecosystem? There is prost-reflect which provides a DynamicMessage trait allowing field access via get_field. Program could then accept a DescriptorPool to help with field and type resolution.

@Caellian
Copy link
Contributor

Caellian commented Oct 14, 2024

I like the idea of fundamentally changing Context so that it does not own any data, meaning that you do not need to clone your types to use them in CEL. Instead I'd like the Context to have references to that data.

This only works if you're not passing temporaries into Context, otherwise end-user is forced to keep (moving) them around for as long as the Context is used which is annoying and can be inefficient. I think simply using Cow for values would be simplest and most versatile, while also removing any extra copies/clones.

EDIT 1:

Also, if Value trait is added, and has add function, an additional add_to is required. In math, commutation is proven for numbers via induction and can't be assumed to hold true for every Value.

So the default implementation should be:

/// Called when other [`Value`] is added to this type.
#[inline]
fn add(&self, second: Value) -> ResolveResult {
    Err(ExecutionError::UnsupportedBinaryOperatorExternal(
        "add", std::any::type_name::<Self>(),
    ))
}

/// Called when this type is added to other [`Value`].
/// 
/// Override if addition is non-commutative.
#[inline]
fn add_to(&self, first: Value) -> ResolveResult {
    Self::add(self, first)
}

@Caellian
Copy link
Contributor

Caellian commented Nov 5, 2024

Tried doing this at some point, just letting you know you'll need to introduce an additional lifetime to a lot of functions in magic.rs because FunctionContext will require one Value<'the_one>, and that will have to stick around in fully type erased version as well.

I personally don't understand what magic does without spending a day/two examining the code and explanation from the forum, but I'm not invested enough (i.e. too busy) to do so atm.

This is however one of those changes that are better done earlier than later because as more features are tackled on it will be more and more difficult to implement it without breaking/rewriting everything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants