Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mypyc] Support str.replace #10088

Merged
merged 5 commits into from
Feb 24, 2021
Merged

Conversation

97littleleaf11
Copy link
Collaborator

Description

  • Support str.replace primitive

Test Plan

  • Add a new test function in test-data/run-strings.test, but error occurred:
run-strings.test:156: error: "str" has no attribute "replace"

@TH3CHARLie
Copy link
Collaborator

Thanks for contributing! Looks pretty close, the error is caused by missing fixture. You can try to update mypyc/test-data/fixtures/ir.py and add function definition for str.replace. Use str.split as a close reference.

@TH3CHARLie TH3CHARLie requested a review from JukkaL February 13, 2021 15:26
@97littleleaf11
Copy link
Collaborator Author

I got several compiler warnings, which would lead to pytest failure.
Then I added these flags to solve it: -Wno-nullability-completeness, -Wno-expansion-to-defined

  • Pytest version: 6.2.1
  • Compiler: Apple Clang 12.0.0

Copy link
Collaborator

@TH3CHARLie TH3CHARLie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! @JukkaL Please take a look if you have time

# str.replace(old, new)
method_op(
name='replace',
arg_types=[str_rprimitive, str_rprimitive],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the signature here https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_Replace, I think each these arg_types might need one additional str_primitive to account for the "self" argument.

I noticed after compiling:

message = 'hello world'.replace('world', 'mars')
print(message)

in build/opts.txt we get:

...
L3:
    r5 = 'hello world'
    r6 = 'world'
    r7 = 'mars'
    r8 = 'replace'
    r9 = CPyObject_CallMethodObjArgs(r5, r8, r6, r7, 0)
...

(I think because the arg types don't quite align in the mypyc/irbuild/ll_builder.py::matching_call_c check and mypyc/irbuild/ll_builder.py::translate_special_method_call will return None instead of the op for PyUnicode_Replace). With an additional str_primitive arg, I think we will get the desired:

...
L3:
    r5 = 'hello world'
    r6 = 'world'
    r7 = 'mars'
    r8 = PyUnicode_Replace(r5, r6, r7, -1)
...

Does that make sense?

Copy link
Collaborator

@TH3CHARLie TH3CHARLie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomasjohns Thanks for pointing it out! Yeah you are right, I've just suggested the changes and @97littleleaf11 please also include an IR test for this primitive. You can find examples of IR test in mypyc/test-data/irbuild-str.test

# str.replace(old, new)
method_op(
name='replace',
arg_types=[str_rprimitive, str_rprimitive],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
arg_types=[str_rprimitive, str_rprimitive],
arg_types=[str_rprimitive, str_rprimitive, str_rprimitive],

mypyc/primitives/str_ops.py Outdated Show resolved Hide resolved
@97littleleaf11
Copy link
Collaborator Author

97littleleaf11 commented Feb 18, 2021

@thomasjohns @TH3CHARLie Thanks for reviewing!
With the additional str_rprimitive on line 125 would cause a pytest error:

FAILED mypyc/test/test_run.py::TestRun::testStringOps

I checked the IR and found the issue:

a = "foofoofoo"
b = a.replace("foo", "bar", 3) 
      r1 = 'foofoofoo'                          (diff)
      a = r1                                    (diff)
      r2 = 'foo'                                (diff)
      r3 = 'bar'                                (diff)
      r4 = PyUnicode_Replace(a, r2, r3, 6)      (diff)

We cannot directly feed the int_rprimitive into PyUnicode_Replace .
Then I wrote a wrapped function in primitives/str_ops.c to fix it, using CPyStr_Split as a reference:

PyObject *CPyStr_Replace(PyObject *str, PyObject *old_substr, PyObject *new_substr, CPyTagged max_replace)
{
    Py_ssize_t temp_max_replace = CPyTagged_AsSsize_t(max_replace);
    if (temp_max_replace == -1 && PyErr_Occurred()) {
        PyErr_SetString(PyExc_OverflowError, "Python int too large to convert to C ssize_t");
            return NULL;
    }
    return PyUnicode_Replace(str, old_substr, new_substr, temp_max_replace);
}

It can pass the pytest now.
Should I make a new commit about it?

@TH3CHARLie
Copy link
Collaborator

TH3CHARLie commented Feb 18, 2021

I see. It's caused by mypyc's tagged integer which uses its lowest bit to represent tag and remaining bits to store the immediate value or the object pointer: https://github.com/python/mypy/blob/master/mypyc/doc/dev-intro.md#value-representation

So yes, you need a wrapper function and you find the correct way to do so. Please send a new commit to include your changes(both the IR test and the wrapper function)

Copy link
Collaborator

@TH3CHARLie TH3CHARLie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! LGTM! Only a small nit to patch and I believe it's good to go.

r6 = PyUnicode_Replace(s, old_substr, new_substr, -1)
return r6
L5:
unreachable
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a newline at the end of IR test

@TH3CHARLie TH3CHARLie merged commit 15bd486 into python:master Feb 24, 2021
sthagen added a commit to sthagen/python-mypy that referenced this pull request Feb 24, 2021
@97littleleaf11 97littleleaf11 deleted the str-replace branch February 25, 2021 12:17
JukkaL pushed a commit that referenced this pull request Mar 8, 2021
Here are a few updates/fixes after a read through the mypyc docs.

For the corresponding primitive method updates, see:

* str.replace: #10088
* dict.copy: #9721
* dict.clear: #9724
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants