Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formatter's is_decorated property is temporarily disabled during remove_format breaking concurrent threads #423

Open
vfazio opened this issue Apr 28, 2024 · 9 comments
Milestone

Comments

@vfazio
Copy link

vfazio commented Apr 28, 2024

See python-poetry/poetry#9334

Poetry relies on the formatter's is_decorated property to be consistent when multiple threads query the property.

If one thread is in the middle of updating a section and has changed the formatter's is_decorated property via remove_format, a second thread may query it and get back an incorrect result.

If cleo is not thread-safe, the developers of poetry should be made aware, otherwise there may need to be changes to how remove_format and format work in relation to whether decorated output is supported in general vs decorated output is being emitted for the current format operation.

@vfazio vfazio changed the title Formatter's decorated property is temporarily disabled during remove_format breaking concurrent threads Formatter's is_decorated property is temporarily disabled during remove_format breaking concurrent threads Apr 28, 2024
@vfazio vfazio changed the title Formatter's is_decorated property is temporarily disabled during remove_format breaking concurrent threads Formatter's is_decorated property is temporarily disabled during remove_format breaking concurrent threads Apr 28, 2024
@vfazio
Copy link
Author

vfazio commented Apr 29, 2024

Note that this seems to be an issue because the Formatter object is a shared instance across the IO objects, in Poetry's case StreamOutput and child SectionOutput objects (see StreamOutput.section)

One "workaround" would be to duplicate the Formatter object instead of sharing the object. That way the StreamOutput object could query its Formatter object independent of what the SectionOutput objects are using. This would allow the sections to manipulate their formatters independently and existing code may "just work" without having to make:

Formatter.remove_format -> Formatter.format -> Formatter.format_and_wrap -> Formatter._apply_current_style

aware that while the formatter supports decorating, this operation should not be decorated.

If that's not ideal because formatter styles need to be kept in sync between the parent output object and child sections, maybe leverage a different private instance flag:

    def remove_format(self, text: str) -> str:
        self._suppress_decoration = True
        text = re.sub(r"\033\[[^m]*m", "", self.format(text))
        self._suppress_decoration = False

        return text

    def _apply_current_style(
        self, text: str, current: str, width: int, current_line_length: int
    ) -> tuple[str, int]:
        if not text:
            return "", current_line_length

        if not width:
            if self.is_decorated() and not self._suppress_decoration:
                return self._style_stack.current.apply(text), current_line_length

            return text, current_line_length

</snip>
        if self.is_decorated() and not self._suppress_decoration:
            apply = self._style_stack.current.apply
            text = "\n".join(map(apply, lines))
            
        return text, current_line_length

@devkral
Copy link

devkral commented Oct 29, 2024

Use contextvars and your problems are gone.
They are per Thread.
You may want to use the scope feature of context vars:

contextvar = ContextVar(..)

...

token = contextvar.set(1)
try:
   ...
finally:
   contextvar.reset(token)

@devkral
Copy link

devkral commented Oct 29, 2024

the property can be something like:

@property
def is_decorated(self):
   return contextvar.get()

@Secrus
Copy link
Member

Secrus commented Oct 29, 2024

Thank you @devkral. Would you have a minute to make a PR?

@vfazio
Copy link
Author

vfazio commented Oct 29, 2024

Not saying this isn't the right fix, but one of the questions I posited was whether this library was intended to be thread-safe. If it is, this fixes a single instance where there was a thread conflict, but there may be others. If not, then its on the caller to use the library as intended.

@Secrus
Copy link
Member

Secrus commented Oct 29, 2024

@vfazio I'd want it to be thread-safe (at the very least, UI/terminal printing should be), since Poetry uses Cleo in threaded jobs. I don't know of an easy way to check if the code is thread-safe, but if you know, I would be happy to hear about it.

@vfazio
Copy link
Author

vfazio commented Oct 29, 2024

Im on holiday atm but saw these emails come through.

I don't think theres an "easy" way, but the reality is that if any public attribute/property can be changed on an object, its open to a race condition without some critical section. If these classes were frozen, once created the attribute values didn't change, itd be less of an issue.

Thats the crux of this problem if i recall correctly, an internal function was flipping the value of a publicly queryable property which caused a race.

@vfazio
Copy link
Author

vfazio commented Nov 25, 2024

@vfazio I'd want it to be thread-safe (at the very least, UI/terminal printing should be), since Poetry uses Cleo in threaded jobs. I don't know of an easy way to check if the code is thread-safe, but if you know, I would be happy to hear about it.

I just realize i dropped the ball on actually replying to this until i just reviewed the poetry repo for updates.

Poetry already takes responsibility for locking when updating sections, I'm pretty sure that's what this lock is used for

https://github.com/python-poetry/poetry/blob/c70cbf4b493b21294ecc0f230f7b621ce3225b11/src/poetry/installation/executor.py#L93

So I don't think Poetry expects cleo to be thread safe in general.

The central issue is that Poetry expected an Output object's is_decorated property to be frozen after the class has been instantiated. That would seem mostly reasonable given how the class is declared. However, that's now how the property works since code internal to the formatter flips this value when updating sections.

If Poetry had been made aware that this was not a thread-safe method, they could have just acquired the lock before reading the property:

with self._lock:
    decorated = self.supports_fancy_output()

My PR avoided the lock and just cached the value, since we expect that the final output will be decorated based on how the Output object was constructed. Regardless of the current state of the formatter as sections get written out, we know that we can count on the output being formatted when finally displayed.

python-poetry/poetry#9335

I hopefully didn't give the impression that I thought this library needs to be thread-safe, only that a decision and some communication should be made about the intentions of the library.

If it's not intended to be thread safe, that's 100% ok IMO and consumers need to realize that they're on their own for solving issues that come up.

However, if there is real desire to make this thread safe and to potentially avoid consumers from having to declare their own critical sections, that's a much larger task.

For the formatter, i threw out a couple of bad ideas. Another bad idea is to just use a lock in any method that references to self._decorated to synchronize access while it's potentially being changed.

However, that's just one property and thread safety can be a game of whack-a-mole. Public methods like format and format_and_wrap which utilize an internal stack are likely to also get corrupted if multiple threads have access to the formatter and aren't synchronized.

@Secrus
Copy link
Member

Secrus commented Nov 27, 2024

Let's put it this way: if it can be made thread-safe easily, why not do it? In this case the fix seems quite easy to do, so it will be fixed in the next version

@Secrus Secrus added this to the 3.0 milestone Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants