[BUG]: Model extraction cases #407

liunelson · 2024-12-10T22:28:48Z

I found some edge cases for template_model_from_sympy_odes.

Case 1

Fixed birth (l) and proportionate death (m X) processes are interpreted correctly as natural production and natural degradation:

\frac{d S}{d t} = -b S I + l - m S
\frac{d I}{d t} = b S I - g I - m I
\frac{d R}{d t} = g I - m R

However, in the variant below, I had expected the m S terms in d I/d t and d R/d t to be interpreted as controlled degradation templates with rate law m * S:

\frac{d S}{d t} = -b S I + l - m S
\frac{d I}{d t} = b S I - g I - m S
\frac{d R}{d t} = g I - m S

Doing it directly with template_model_from_sympy_odes gives a model that completely ignores all the m S terms:

odes = [
    sympy.Eq(S(t).diff(t), - b * S(t) * I(t) + l - m * S(t)),
    sympy.Eq(I(t).diff(t), b * S(t) * I(t) - g * I(t) - m * S(t)),
    sympy.Eq(R(t).diff(t), g * I(t) - m * S(t)),
]

Doing it within Terarium (which styles the above LaTeX and converts it to SymPy before sending it to MIRA) gives a strange model in which the rate laws are "-I*S*b + l", "I*S*b - I*g", "I*g" (possibly unrelated to MIRA).

Case 2

This is a case with "branching ratios":

\frac{d S}{d t} = -b S I
\frac{d I}{d t} = b S I - g I
\frac{d R}{d t} = k g I
\frac{d V}{d t} = (1 - k) g I

This gives a model where I has a natural degradation g I and two controlled productions of R, V, instead of two natural conversions from I into R, V.

If we were to rewrite the 2nd equation as \frac{d I}{d t} = b S I - k g I - (1 - k) g I, then MIRA returns the correct model (where I branches into R, V with ratio k).
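A quick way to confirm that the two forms of the d I/d t equation describe the same ODE, and differ only in how the terms are grouped, is to compare them in SymPy. This is only an illustrative sketch: the symbol names follow the equations above, and it neither calls MIRA nor shows how MIRA matches templates.

```python
import sympy

t = sympy.Symbol("t")
b, g, k = sympy.symbols("b g k")
S, I = sympy.Function("S")(t), sympy.Function("I")(t)

# Compact form of dI/dt as written in Case 2
compact = b * S * I - g * I
# Rewritten form with one outflow term per branch (I -> R and I -> V)
branched = b * S * I - k * g * I - (1 - k) * g * I

# The two right-hand sides are algebraically identical once expanded
difference = sympy.expand(compact - branched)
print(difference)

# But they carry a different number of additive terms
n_compact = len(sympy.Add.make_args(compact))
n_branched = len(sympy.Add.make_args(branched))
print(n_compact, n_branched)
```

The term count is what differs between the two forms: the compact form has a single outflow term from I, while the rewritten form has one term per branch that can be paired with the matching inflow term in d R/d t and d V/d t.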

Such a branching case was actually involved in a paper from which the UCSD team wanted to extract a model, and it took a careful reading of the text to realize that the equations needed to be rewritten.

Equations S1 in the SI of this paper: https://www.sciencedirect.com/science/article/pii/S0264410X21013086#s0120
Note how I branches into Z, R with ratio eta

Do you have a forthcoming solution to this problem or do you expect users/Terarium to rewrite the equations?


liunelson · 2024-12-10T23:52:04Z

I have more cases described here:
DARPA-ASKEM/terarium#5805

The issues there (Cases 1, 3) appear to be related to SymPy's parse_latex adding unnecessary parentheses around multiple terms, causing MIRA to treat the parenthesized terms as a single template.

Can simplify_rate_law or MIRA fix this problem, or do we need to ditch the SymPy parser and hope that an LLM agent can convert the LaTeX to SymPy better?

Case 3a

odes_latex = [
    r"\frac{d S(t)}{d t} = -b * S(t) * I(t) + l - m * S(t)",
    r"\frac{d I(t)}{d t} = b * S(t) * I(t) - g * I(t) - m * I(t)", 
    r"\frac{d R(t)}{d t} = g * I(t) - m * R(t)",
]

odes_sympy = [
    Eq(Derivative(S(t), t), -m*S(t) + (l + ((-b)*S(t))*I(t))),
    Eq(Derivative(I(t), t), -m*I(t) + (-g*I(t) + (b*S(t))*I(t))),
    Eq(Derivative(R(t), t), g*I(t) - m*R(t))
]

Case 3b

odes_latex = [
    r"\frac{d S(t)}{d t} = -b * S(t) * I(t)",
    r"\frac{d I(t)}{d t} = b * S(t) * I(t) - k * g * I(t) - (1 - k) * g * I(t)", 
    r"\frac{d R(t)}{d t} = k * g * I(t)",
    r"\frac{d V(t)}{d t} = (1 - k) * g * I(t)"
]

odes_sympy = [
    Eq(Derivative(S(t), t), -b*I(t)*S(t)),
    Eq(Derivative(I(t), t), -g*(1 - k)*I(t) + ((b*S(t))*I(t) - g*k*I(t))),
    Eq(Derivative(R(t), t), (g*k)*I(t)),
    Eq(Derivative(V(t), t), (g*(1 - k))*I(t))
]

bgyori · 2024-12-11T21:20:57Z

For Case 1, I think the term \frac{d R}{d t} = - m S is not physically plausible and doesn't really fit a canonical pattern that could be recognized. A physically plausible controlled degradation would be something like \frac{d R}{d t} = - m R S. Does this come up in practice or is it a hypothetical example?
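One way to illustrate the plausibility concern is to evaluate both forms of d R/d t when the R compartment is empty. The sketch below assumes positive parameters and a positive S, and uses plain symbols instead of functions of t for brevity. These choices are assumptions made for this check only.

```python
import sympy

# Assumption: rates and S are positive, I and R are non-negative populations
m, g, S = sympy.symbols("m g S", positive=True)
I, R = sympy.symbols("I R", nonnegative=True)

# dR/dt from the hypothetical Case 1 variant and from the suggested form
dR_variant = g * I - m * S
dR_controlled = g * I - m * R * S

# Evaluate both with an empty R compartment and no inflow from I
empty = {I: 0, R: 0}
variant_at_zero = dR_variant.subs(empty)
controlled_at_zero = dR_controlled.subs(empty)

print(variant_at_zero, variant_at_zero.is_negative)
print(controlled_at_zero)
```

With the - m S term, d R/d t stays negative even when R is zero, so R would be driven below zero. With - m R S, the loss vanishes when R is empty.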

bgyori · 2024-12-11T21:27:17Z

For Case 2, I agree this is an ambiguous case (both models produce correct ODEs) and it would be nice to recognize the natural conversions, though not trivial. This requires some further thinking and algorithmic improvement.

liunelson · 2024-12-12T18:38:40Z

@bgyori
Case 1 was meant to be purely hypothetical: I was adding the usual natural death processes (dX/dt = ... - m * X) and tried out this admittedly unphysical variation. Is ignoring unphysical patterns such as this one an intended feature?

I just updated our LaTeX style guide (which an LLM agent is instructed to follow when cleaning up LaTeX provided by users or an upstream service). * will now be used to explicitly denote multiplication, to prevent SymPy's parse_latex(...) from converting the LaTeX a b (1 - g) I(t) to the SymPy expression a * b(t = 1 - g) * I(t).

liunelson · 2024-12-12T18:42:37Z

I imagine Case 2 is quite nontrivial to tackle automatically, despite how commonly I see it in model-extraction scenarios. I'm split on whether to try to instruct the equation-styling LLM agent to recognize and expand branching terms. I'll experiment.

liunelson · 2024-12-12T18:47:38Z

Could you comment on Cases 3a/b? We're trying to figure out how to pass SymPy strings (as opposed to sympy.core.relational.Equality objects) to the MIRA function.

Previously, we simply did:

model = template_model_from_sympy_odes([sympy.parsing.latex.parse_latex(ode) for ode in odes_latex])

If MIRA didn't get tripped up by the extra (), we wouldn't have to switch to an LLM solution.
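If the nested grouping does turn out to matter, one possible experiment is to re-evaluate the parsed expression so that SymPy flattens the nested sums. The sketch below builds the nested structure by hand with evaluate=False to mimic the Case 3a output, so that it does not depend on parse_latex. Using doit() as a preprocessing step before template_model_from_sympy_odes is an assumption here, not something established in this thread.

```python
import sympy

t = sympy.Symbol("t")
b, l, m = sympy.symbols("b l m")
S, I = sympy.Function("S")(t), sympy.Function("I")(t)

# Mimic the grouping from Case 3a: -m*S(t) + (l + (-b*S(t))*I(t))
inner = sympy.Add(l, -b * S * I, evaluate=False)
nested = sympy.Add(-m * S, inner, evaluate=False)
print(nested, len(nested.args))

# doit() rebuilds the expression with evaluation enabled,
# which merges the nested Add into a single flat sum
flat = nested.doit()
print(flat, len(flat.args))
```

The nested expression has two top-level arguments, one of which is itself a sum, whereas the re-evaluated expression exposes each term separately.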

bgyori · 2024-12-16T18:36:07Z

For Case 3a, I believe we are getting the expected result, despite the parentheses.

ControlledConversion 		 I*S*b
NaturalProduction 		 l
NaturalDegradation 		 S*m
NaturalConversion 		 I*g
NaturalDegradation 		 I*m
NaturalDegradation 		 R*m

In particular, the first two templates look correct in terms of a separate production and conversion template.

bgyori · 2024-12-16T18:47:40Z

Case 3b appears to be working correctly as well; I get these templates:

ControlledConversion 		 I*S*b
NaturalConversion 		 I*g*k
NaturalConversion 		 I*g*(1 - k)

which look correct (just printing some basic details, the actual subjects/objects/controllers are also correct)

liunelson · 2024-12-16T19:47:41Z

That's quite weird, I get different results from you:

Case 3a: model.json
Case 3b: model.json

I'm also on the latest version of MIRA.

liunelson · 2024-12-17T20:41:33Z

Here are the code snippets that I used:

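# Case 3a
import sympy
import sympy.parsing.latex  # parse_latex requires the antlr4 Python runtime
from mira.sources.sympy_ode import template_model_from_sympy_odes

odes_latex = [
    r"\frac{d S(t)}{d t} = -b * S(t) * I(t) + l - m * S(t)",
    r"\frac{d I(t)}{d t} = b * S(t) * I(t) - g * I(t) - m * I(t)",
    r"\frac{d R(t)}{d t} = g * I(t) - m * R(t)",
]

# Parse each LaTeX equation into a SymPy Equality
odes_sympy = [sympy.parsing.latex.parse_latex(ode) for ode in odes_latex]

# Print the parsed equations to inspect how the terms were grouped
for ode in odes_sympy:
    print(ode)

model = template_model_from_sympy_odes(odes_sympy)

# generate_summary_table is not defined in this snippet; it is assumed to be available in the session
generate_summary_table(model)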

# Case 3b
odes_latex = [
    r"\frac{d S(t)}{d t} = -b * S(t) * I(t)",
    r"\frac{d I(t)}{d t} = b * S(t) * I(t) - k * g * I(t) - (1 - k) * g * I(t)", 
    r"\frac{d R(t)}{d t} = k * g * I(t)",
    r"\frac{d V(t)}{d t} = (1 - k) * g * I(t)"
]

odes_sympy = [sympy.parsing.latex.parse_latex(ode) for ode in odes_latex]

for ode in odes_sympy:
    print(ode)

model = template_model_from_sympy_odes(odes_sympy)
generate_summary_table(model)

liunelson changed the title ~~[BUG]:~~ [BUG]: Model extraction cases Dec 10, 2024
