Replace old string formatting syntax with f-strings #29547

Closed
99 tasks
ShaharNaveh opened this issue Nov 11, 2019 · 137 comments
Labels
Code Style Code style, linting, code_checks good first issue

Comments

@ShaharNaveh
Member

ShaharNaveh commented Nov 11, 2019

Since we no longer support Python 3.5, we can now use f-strings instead of the old .format() (and, obviously, the % formatting).
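
As a minimal illustration of the change being asked for (the variable names are made up for this example and do not come from the pandas codebase), the three styles below build the same string:

```python
# Hypothetical values, used only to illustrate the three formatting styles.
name = "axis"
size = 5

old_percent = "%s has size %d" % (name, size)     # %-formatting
old_format = "{} has size {}".format(name, size)  # str.format()
new_fstring = f"{name} has size {size}"           # f-string (Python 3.6+)

# All three spellings produce an identical result.
assert old_percent == old_format == new_fstring
print(new_fstring)
```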

Notes:

  • Don't forget to link this issue in your pull request's body message; simply paste https://github.com/pandas-dev/pandas/issues/29547 there.

  • If any of your changed files are related to Replace "foo!r" to "repr(foo)" syntax #29886, please link your pull request to that issue as well; simply paste https://github.com/pandas-dev/pandas/issues/29886 in your pull request's body message too.

  • Please comment on what you are planning to work on, so we don't duplicate work.

  • If a file that should be marked as done is not marked, please leave a comment to let me know.
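
For files that also touch #29886, here is a short sketch of how the `!r` conversion relates to an explicit `repr()` call inside an f-string. The variable name is hypothetical, and which spelling to prefer is a matter for that issue, not something this sketch decides:

```python
col = "A"  # hypothetical value, for illustration only

with_conversion = f"column {col!r} not found"     # !r conversion flag
with_repr_call = f"column {repr(col)} not found"  # explicit repr() call

# Both spellings call repr() on the value, so the results are identical.
assert with_conversion == with_repr_call
print(with_conversion)
```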


To check which files still need to be fixed in the pandas directory:

grep -l -R '%s' --include=*.{py,pyx} pandas/
grep -l -R '%d' --include=*.{py,pyx} pandas/
grep -l -R '\.format(' --include=*.{py,pyx} pandas/

All of the above can also be combined into a one-liner:

grep -l -R -e '%s' -e '%d' -e '\.format(' --include=*.{py,pyx} pandas/
Tip:

If you want to see the line number of each occurrence, replace -l with -n, for example:

grep -n -R '%s' --include=*.{py,pyx} pandas/
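
For anyone without grep at hand, the same check can be sketched in Python. This is only an approximation of the commands above: the function name `find_old_formatting` is made up for this example, and like grep it is a plain text search, so it may also flag `%s` or `%d` in places that are not string formatting (for instance date format strings).

```python
import re
from pathlib import Path

# Same three patterns as the grep commands: '%s', '%d' and '.format('.
OLD_STYLE = re.compile(r"%s|%d|\.format\(")


def find_old_formatting(root):
    """Return the sorted paths of .py/.pyx files under root that match."""
    hits = []
    for path in Path(root).rglob("*"):
        # Mirror --include=*.{py,pyx}: skip everything else.
        if path.suffix not in {".py", ".pyx"} or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if OLD_STYLE.search(text):
            hits.append(str(path))
    return sorted(hits)
```

Calling `find_old_formatting("pandas")` from the repository root would then list the candidate files, in the same spirit as `grep -l`.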

The current list is:

  • pandas/compat/pickle_compat.py

  • pandas/_config/config.py

  • pandas/core/arrays/datetimelike.py

  • pandas/core/arrays/datetimes.py

  • pandas/core/arrays/integer.py

  • pandas/core/arrays/period.py

  • pandas/core/computation/pytables.py

  • pandas/core/config_init.py

  • pandas/core/frame.py

  • pandas/core/generic.py

  • pandas/core/groupby/generic.py

  • pandas/core/groupby/groupby.py

  • pandas/core/indexes/base.py

  • pandas/core/indexes/multi.py

  • pandas/core/indexes/range.py

  • pandas/core/ops/docstrings.py

  • pandas/core/ops/__init__.py

  • pandas/core/reshape/merge.py

  • pandas/core/tools/datetimes.py

  • pandas/io/formats/css.py

  • pandas/io/formats/excel.py

  • pandas/io/formats/format.py

  • pandas/io/formats/html.py

  • pandas/io/formats/info.py

  • pandas/io/formats/latex.py

  • pandas/io/formats/printing.py

  • pandas/io/formats/style.py

  • pandas/io/parsers.py

  • pandas/io/pytables.py

  • pandas/io/sas/sas_xport.py

  • pandas/io/stata.py

  • pandas/_libs/tslibs/c_timestamp.pyx

  • pandas/_libs/tslibs/frequencies.pyx

  • pandas/_libs/tslibs/parsing.pyx

  • pandas/_libs/tslibs/period.pyx

  • pandas/_libs/tslibs/strptime.pyx

  • pandas/_libs/tslibs/timedeltas.pyx

  • pandas/plotting/_matplotlib/converter.py

  • pandas/tests/arrays/categorical/test_operators.py

  • pandas/tests/arrays/test_datetimelike.py

  • pandas/tests/dtypes/test_dtypes.py

  • pandas/tests/extension/base/setitem.py

  • pandas/tests/frame/test_constructors.py

  • pandas/tests/frame/test_missing.py

  • pandas/tests/frame/test_to_csv.py

  • pandas/tests/groupby/aggregate/test_other.py

  • pandas/tests/indexes/datetimes/test_date_range.py

  • pandas/tests/indexes/datetimes/test_datetime.py

  • pandas/tests/indexes/datetimes/test_formats.py

  • pandas/tests/indexes/datetimes/test_partial_slicing.py

  • pandas/tests/indexes/interval/test_constructors.py

  • pandas/tests/indexes/interval/test_interval.py

  • pandas/tests/indexes/multi/test_format.py

  • pandas/tests/indexes/period/test_formats.py

  • pandas/tests/indexes/test_base.py

  • pandas/tests/indexes/timedeltas/test_timedelta.py

  • pandas/tests/indexing/test_categorical.py

  • pandas/tests/indexing/test_coercion.py

  • pandas/tests/io/excel/test_openpyxl.py

  • pandas/tests/io/excel/test_writers.py

  • pandas/tests/io/formats/test_format.py

  • pandas/tests/io/formats/test_printing.py

  • pandas/tests/io/formats/test_style.py

  • pandas/tests/io/formats/test_to_csv.py

  • pandas/tests/io/formats/test_to_html.py

  • pandas/tests/io/formats/test_to_latex.py

  • pandas/tests/io/parser/test_compression.py

  • pandas/tests/io/parser/test_encoding.py

  • pandas/tests/io/parser/test_header.py

  • pandas/tests/io/parser/test_parse_dates.py

  • pandas/tests/io/parser/test_usecols.py

  • pandas/tests/io/test_html.py

  • pandas/tests/io/test_sql.py

  • pandas/tests/io/test_stata.py

  • pandas/tests/reductions/test_reductions.py

  • pandas/tests/reshape/test_concat.py

  • pandas/tests/reshape/test_melt.py

  • pandas/tests/scalar/period/test_period.py

  • pandas/tests/scalar/timedelta/test_timedelta.py

  • pandas/tests/scalar/timestamp/test_constructors.py

  • pandas/tests/series/indexing/test_numeric.py

  • pandas/tests/series/indexing/test_take.py

  • pandas/tests/series/indexing/test_where.py

  • pandas/tests/series/methods/test_rename.py

  • pandas/tests/series/test_api.py

  • pandas/tests/series/test_constructors.py

  • pandas/tests/series/test_datetime_values.py

  • pandas/tests/series/test_repr.py

  • pandas/tests/test_strings.py

  • pandas/tests/tools/test_to_datetime.py

  • pandas/tests/tseries/holiday/test_calendar.py

  • pandas/tests/tseries/holiday/test_holiday.py

  • pandas/tests/tslibs/test_parsing.py

  • pandas/tests/util/test_assert_frame_equal.py

  • pandas/tseries/frequencies.py

  • pandas/util/_decorators.py

  • pandas/util/_test_decorators.py

  • pandas/util/_validators.py

  • pandas/_version.py


NOTE:

The list may change as files are moved/renamed constantly.


Inherited files and commands from this PR.

@ShaharNaveh ShaharNaveh changed the title Go from old string formating to new F-string formatting. Go from old string formating(% or .format()) to new F-string formatting. Nov 11, 2019
@ShaharNaveh ShaharNaveh changed the title Go from old string formating(% or .format()) to new F-string formatting. Replace old string formatting syntax with f-string format Nov 11, 2019
@ShaharNaveh ShaharNaveh changed the title Replace old string formatting syntax with f-string format Replace old string formatting syntax with f-string Nov 11, 2019
@ShaharNaveh
Member Author

ShaharNaveh commented Nov 11, 2019

I'm taking:

  • pandas/_libs/groupby.pyx

  • pandas/_libs/hashing.pyx

  • pandas/_libs/index.pyx

  • pandas/_libs/internals.pyx

  • pandas/_libs/interval.pyx

  • pandas/_libs/lib.pyx

  • pandas/_libs/ops.pyx

  • pandas/_libs/parsers.pyx

  • pandas/_libs/reduction.pyx

  • pandas/_libs/sparse.pyx

  • pandas/_libs/testing.pyx

  • pandas/_libs/tslib.pyx

  • pandas/_libs/window.pyx

@yashukla
Contributor

yashukla commented Nov 12, 2019

I'll take:

  • pandas/tests/indexes/test_base.py
  • pandas/tests/indexes/test_category.py
  • pandas/tests/indexes/test_common.py
  • pandas/tests/indexes/test_numeric.py

to start, if that's alright!

@SaturnFromTitan
Contributor

SaturnFromTitan commented Nov 12, 2019

Hi @MomIsBestFriend, can you recommend any tools for this conversion? A quick look gave me these:

  1. pyupgrade
  2. fstringify
  3. flynt

I have no experience with any of them, but they could be very helpful here.

@ShaharNaveh
Member Author

ShaharNaveh commented Nov 12, 2019

Hello @SaturnFromTitan, I personally use pyupgrade sometimes, but only when the file contains just a few outdated string formats. Then I look at the changes and fix anything pyupgrade got wrong.

When a file has a lot of occurrences, I handle the "complex" ones manually (e.g. '%.2f' % my_float) and let pyupgrade deal with the common ones; it usually gets them right.

Also, some of the changes will make the changed file non-PEP 8-compliant, so that needs fixing as well, otherwise it will not pass the tests.
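
To make the "complex" cases concrete, here is a small sketch of how %-style format specifiers map onto f-string format specs, followed by one way to keep a long f-string within the line-length limit. The values and names are invented for the example:

```python
my_float = 3.14159  # hypothetical values, for illustration only
count = 42

# Precision, width and alignment carry over into the part after the colon.
assert "%.2f" % my_float == f"{my_float:.2f}"
assert "%5d" % count == f"{count:5d}"
assert "%-8s|" % "ab" == f"{'ab':<8}|"
# %r corresponds to the !r conversion.
assert "%r" % "ab" == f"{'ab'!r}"

# A long f-string can be split with implicit string concatenation,
# which keeps each source line short.
long_message = (
    f"value {my_float:.2f} was rounded "
    f"and counted {count} times"
)
print(long_message)
```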

@ShaharNaveh
Member Author

ShaharNaveh commented Nov 14, 2019

Will take next:

  • pandas/compat/__init__.py

  • pandas/compat/numpy/function.py

  • pandas/compat/numpy/__init__.py

  • pandas/compat/_optional.py

@yashukla
Contributor

yashukla commented Nov 15, 2019

I'll take:

  • pandas/core/reshape/concat.py
  • pandas/core/reshape/melt.py
  • pandas/core/reshape/merge.py
  • pandas/core/reshape/pivot.py
  • pandas/core/reshape/reshape.py
  • pandas/core/reshape/tile.py

What are everyone's thoughts on tagging this as a good first issue? It should apply to most of the files here. The changes that need to be made are usually only a few lines or so per file, and whoever is making the changes doesn't need to worry too much about affecting other parts of the code (since the end function performed is the same).

I'm picturing a setup similar to #28926.

This was referenced Nov 16, 2019
@lucassa3
Contributor

lucassa3 commented Nov 19, 2019

f-string replacement placed on:

  • pandas/core/groupby/generic.py
  • pandas/core/groupby/groupby.py
  • pandas/core/groupby/grouper.py
  • pandas/core/groupby/ops.py

ref #29701

@ShaharNaveh
Member Author

ShaharNaveh commented Nov 19, 2019

Will take next:

  • pandas/_libs/tslibs/conversion.pyx

  • pandas/_libs/tslibs/c_timestamp.pyx

  • pandas/_libs/tslibs/fields.pyx

  • pandas/_libs/tslibs/nattype.pyx

  • pandas/_libs/tslibs/np_datetime.pyx

  • pandas/_libs/tslibs/offsets.pyx

  • pandas/_libs/tslibs/parsing.pyx

  • pandas/_libs/tslibs/timestamps.pyx

  • pandas/_libs/tslibs/timezones.pyx

  • pandas/_libs/tslibs/tzconversion.pyx

@ohad83
Contributor

ohad83 commented Nov 21, 2019

I'll take

  • pandas/plotting/_core.py
  • pandas/plotting/_matplotlib/boxplot.py
  • pandas/plotting/_matplotlib/converter.py
  • pandas/plotting/_matplotlib/core.py
  • pandas/plotting/_matplotlib/misc.py
  • pandas/plotting/_matplotlib/style.py
  • pandas/plotting/_matplotlib/timeseries.py
  • pandas/plotting/_matplotlib/tools.py
  • pandas/plotting/_misc.py

ref #29781

@alimcmaster1
Member

> I'll take:
>
>   • pandas/core/reshape/concat.py
>   • pandas/core/reshape/melt.py
>   • pandas/core/reshape/merge.py
>   • pandas/core/reshape/pivot.py
>   • pandas/core/reshape/reshape.py
>   • pandas/core/reshape/tile.py
>
> What are everyone's thoughts on tagging this as a good first issue? It should apply to most of the files here. The changes that need to be made are usually only a few lines or so per file, and whoever is making the changes doesn't need to worry too much about affecting other parts of the code (since the end function performed is the same).
>
> I'm picturing a setup similar to #28926.

Sure, I've labelled it accordingly. Thanks.

@OlivierLuG
Contributor

OlivierLuG commented May 25, 2020

These two files were modified:
pandas/_libs/tslibs/timedeltas.pyx
pandas/_libs/tslibs/timestamps.pyx

Note that there were no issues in the following ones, so you can mark them as done too:
pandas/_libs/tslibs/c_timestamp.pyx
pandas/_libs/tslibs/frequencies.pyx
pandas/_libs/tslibs/parsing.pyx
pandas/_libs/tslibs/period.pyx
pandas/_libs/tslibs/strptime.pyx

Note: this is my first PR ever. Let me know if something needs to be improved.

@OlivierLuG
Contributor

OlivierLuG commented May 26, 2020

I've gone through the thread to update the list and check some files.

Files marked as done without any commit

(no need to change anything):

  • pandas/_config/config.py
  • pandas/_version.py
  • pandas/compat/pickle_compat.py
  • pandas/core/computation/pytables.py
  • pandas/core/indexes/base.py
  • pandas/core/indexes/multi.py
  • pandas/core/indexes/range.py
  • pandas/core/ops/__init__.py
  • pandas/core/ops/docstrings.py
  • pandas/core/reshape/merge.py
  • pandas/core/tools/datetimes.py
  • pandas/io/formats/css.py
  • pandas/io/formats/excel.py
  • pandas/io/formats/format.py
  • pandas/io/formats/html.py
  • pandas/io/formats/info.py
  • pandas/io/formats/latex.py
  • pandas/io/formats/printing.py
  • pandas/io/formats/style.py
  • pandas/io/parsers.py
  • pandas/io/pytables.py
  • pandas/io/sas/sas_xport.py
  • pandas/io/stata.py
  • pandas/tests/arrays/categorical/test_operators.py
  • pandas/tests/arrays/test_datetimelike.py
  • pandas/tests/dtypes/test_dtypes.py
  • pandas/tests/extension/base/setitem.py
  • pandas/tests/frame/test_constructors.py
  • pandas/tests/frame/test_missing.py
  • pandas/tests/frame/test_to_csv.py
  • pandas/tests/groupby/aggregate/test_other.py
  • pandas/tests/indexes/datetimes/test_datetime.py
  • pandas/tests/indexes/datetimes/test_formats.py
  • pandas/tests/indexes/datetimes/test_partial_slicing.py
  • pandas/tests/indexes/interval/test_constructors.py
  • pandas/tests/indexes/multi/test_format.py
  • pandas/tests/indexes/period/test_formats.py
  • pandas/tests/indexes/timedeltas/test_timedelta.py
  • pandas/tests/indexing/test_categorical.py
  • pandas/tests/io/excel/test_openpyxl.py
  • pandas/tests/io/excel/test_writers.py
  • pandas/tests/io/formats/test_format.py
  • pandas/tests/io/formats/test_printing.py
  • pandas/tests/io/formats/test_style.py
  • pandas/tests/io/formats/test_to_csv.py
  • pandas/tests/io/formats/test_to_html.py
  • pandas/tests/io/formats/test_to_latex.py

Remaining files to check:

  • pandas/core/arrays/datetimelike.py
  • pandas/tests/io/parser/test_compression.py
  • pandas/tests/io/parser/test_encoding.py
  • pandas/tests/io/parser/test_header.py
  • pandas/tests/io/parser/test_parse_dates.py
  • pandas/tests/io/parser/test_usecols.py
  • pandas/tests/io/test_html.py
  • pandas/tests/io/test_sql.py
  • pandas/tests/io/test_stata.py
  • pandas/tests/reductions/test_reductions.py
  • pandas/tests/reshape/test_melt.py
  • pandas/tests/scalar/period/test_period.py
  • pandas/tests/scalar/timedelta/test_timedelta.py
  • pandas/tests/scalar/timestamp/test_constructors.py
  • pandas/tests/series/indexing/test_numeric.py
  • pandas/tests/series/indexing/test_take.py
  • pandas/tests/series/indexing/test_where.py
  • pandas/tests/series/methods/test_rename.py
  • pandas/tests/series/test_api.py
  • pandas/tests/series/test_constructors.py
  • pandas/tests/series/test_datetime_values.py
  • pandas/tests/series/test_repr.py
  • pandas/tests/test_strings.py
  • pandas/tests/tools/test_to_datetime.py
  • pandas/tests/tseries/holiday/test_calendar.py
  • pandas/tests/tseries/holiday/test_holiday.py
  • pandas/tests/tslibs/test_parsing.py
  • pandas/tests/util/test_assert_frame_equal.py
  • pandas/tseries/frequencies.py
  • pandas/util/_test_decorators.py
  • pandas/util/_validators.py

@matteosantama
Contributor

I took care of pandas/util/_validators.py. Many of these other files already seem ok to me too.

  • pandas/util/_test_decorators.py
  • pandas/tseries/frequencies.py
  • pandas/tests/util/test_assert_frame_equal.py
  • pandas/tests/tslibs/test_parsing.py
  • pandas/tests/tseries/holiday/test_holiday.py
  • pandas/tests/tseries/holiday/test_calendar.py
  • pandas/tests/tools/test_to_datetime.py
  • pandas/tests/test_strings.py
  • pandas/tests/series/test_repr.py
  • pandas/tests/series/test_datetime_values.py
  • pandas/tests/series/test_constructors.py
  • pandas/tests/series/test_api.py

@warden706

warden706 commented May 26, 2020 via email

@matteosantama
Contributor

Hey @warden706, I'm actually pretty new here too, so I wouldn't have much to show you. I've found this resource very helpful as I've stumbled around; you should check it out.

@MatteoFelici
Contributor

MatteoFelici commented May 27, 2020

Hi,
I'm also pretty new to contributing here. I'm taking care of:

  • pandas/tests/io/parser/test_header.py
  • pandas/tests/io/test_sql.py
  • pandas/tests/io/test_html.py
  • pandas/tests/reductions/test_reductions.py
  • pandas/tests/reshape/test_melt.py
  • pandas/tests/scalar/timedelta/test_timedelta.py

I checked these other files and they seem OK to me:

  • pandas/tests/io/parser/test_compression.py
  • pandas/tests/io/parser/test_encoding.py
  • pandas/tests/io/parser/test_parse_dates.py
  • pandas/tests/io/parser/test_usecols.py
  • pandas/tests/io/test_stata.py
  • pandas/tests/scalar/period/test_period.py
  • pandas/tests/scalar/timestamp/test_constructors.py

@DanBasson
Contributor

DanBasson commented May 28, 2020

I'm also new here.
I'll take:

  • pandas/tests/series/indexing/test_numeric.py
  • pandas/tests/series/indexing/test_take.py
  • pandas/tests/series/indexing/test_where.py

I have a question regarding the code change.
For instance, pandas/tests/series/indexing/test_take.py contains this snippet:

msg = "index {} is out of bounds for( axis 0 with)? size 5"
with pytest.raises(IndexError, match=msg.format(10)):
    ser.take([1, 10])

So my suggestion is to replace it with:

msg = lambda x: f"index {x} is out of bounds for( axis 0 with)? size 5"
with pytest.raises(IndexError, match=msg(10)):
    ser.take([1, 10])

Is that good enough?
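
One possible alternative, offered as a sketch rather than a project decision: PEP 8 recommends a `def` over assigning a lambda to a name, so the lambda could become a small helper, or the f-string could be written inline when it is used only once. `pytest.raises(match=...)` checks the pattern with `re.search`, so the sketch below uses `re.search` directly to stay self-contained. The helper name and the error text are assumptions made for the example.

```python
import re


def out_of_bounds_msg(index):
    # Hypothetical helper; returns the same regex pattern as the snippet above.
    return f"index {index} is out of bounds for( axis 0 with)? size 5"


# Assumed error text, chosen only so the pattern has something to match.
error_text = "index 10 is out of bounds for axis 0 with size 5"

# Helper-function variant.
assert re.search(out_of_bounds_msg(10), error_text)

# Inline variant, for a pattern that is used only once.
index = 10
inline_pattern = f"index {index} is out of bounds for( axis 0 with)? size 5"
assert re.search(inline_pattern, error_text)
```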

@MatteoFelici
Contributor

Hi,
I'd like to make a PR, so I'm running the tests, but I'm getting a couple of failures. So I also tried running the tests on master.
Is it normal that running pytest pandas on an unedited fork of master returns a couple of failures?

@matteosantama
Contributor

Master should generally pass the tests. Make sure you've pulled the latest commits. Which tests are failing?

@MatteoFelici
Contributor

@matteosantama I pulled the latest commits, re-installed the environment and re-ran the tests with pytest pandas. These are the results:


================= short test summary info =================
FAILED pandas/tests/io/test_parquet.py::TestParquetFastParquet::test_s3_roundtrip - ValueError: Invalid timestamp "Ven, 29 Mag 2020 07:59:19 GMT": Unknown string format: Ven, 29 Mag 2020 07:59:19 GMT
FAILED pandas/tests/plotting/test_datetimelike.py::TestTSPlot::test_ts_plot_with_tz['UTC'] - AttributeError: 'numpy.datetime64' object has no attribute 'hour'
================= 2 failed, 87804 passed, 1185 skipped, 1005 xfailed, 5637 warnings in 2437.06s (0:40:37) =================

I noticed that if I run the tests on a single directory only (for example with pytest pandas/tests/io), there are no failures:

 7273 passed, 344 skipped, 53 xfailed, 5584 warnings in 351.76s (0:05:51) 

@MatteoFelici
Contributor

MatteoFelici commented Jun 20, 2020

Since @OlivierLuG's comment, it seems that almost all of the files have either been corrected or were already OK without any modification. I'll try to update the list of "still open" files.

Corrected

  • pandas/util/_validators.py
  • pandas/tests/io/parser/test_header.py
  • pandas/tests/io/test_sql.py
  • pandas/tests/reductions/test_reductions.py

No modification needed

  • pandas/util/_test_decorators.py
  • pandas/tseries/frequencies.py
  • pandas/tests/util/test_assert_frame_equal.py
  • pandas/tests/tslibs/test_parsing.py
  • pandas/tests/tseries/holiday/test_holiday.py
  • pandas/tests/tseries/holiday/test_calendar.py
  • pandas/tests/tools/test_to_datetime.py
  • pandas/tests/test_strings.py
  • pandas/tests/series/test_repr.py
  • pandas/tests/series/test_datetime_values.py
  • pandas/tests/series/test_constructors.py
  • pandas/tests/series/test_api.py
  • pandas/tests/io/parser/test_compression.py
  • pandas/tests/io/parser/test_encoding.py
  • pandas/tests/io/parser/test_parse_dates.py
  • pandas/tests/io/parser/test_usecols.py
  • pandas/tests/io/test_stata.py
  • pandas/tests/scalar/period/test_period.py
  • pandas/tests/scalar/timestamp/test_constructors.py
  • pandas/tests/io/test_html.py
  • pandas/tests/reshape/test_melt.py
  • pandas/tests/scalar/timedelta/test_timedelta.py

Moreover, I think these are already OK as well

  • pandas/core/arrays/datetimelike.py
  • pandas/tests/series/methods/test_rename.py

Still to check/correct

  • pandas/tests/series/indexing/test_numeric.py
  • pandas/tests/series/indexing/test_take.py
  • pandas/tests/series/indexing/test_where.py

@DanBasson do you have any update?

@DanBasson
Contributor

I keep getting errors, and I don't know what they mean.
Any help will be appreciated.

@MatteoFelici
Contributor

Have you tried to fetch the latest modifications on master? Maybe it will fix some of the failed tests.

@DanBasson
Contributor

It didn't help.
If someone else wants to take it, you can.

@MatteoFelici
Contributor

I have a question: when we have a situation like this one in pandas/tests/reshape/test_melt.py:

msg = "The following '{Var}' are not present in the DataFrame: {Col}"
...
with pytest.raises(KeyError, match=msg.format(Var="value_vars", Col="\\['C'\\]")):
...
with pytest.raises(KeyError, match=msg.format(Var="id_vars", Col="\\['A'\\]")):
...

and so on, should we turn msg into a function and call it with different values of "Col"? Or is it better to leave it as it is?
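
For context on the trade-off: an f-string is evaluated at the point where it is written, so a message that is reused with different values needs either a template filled in with .format(), as above, or a small function. Below is a sketch of the function variant only, not a recommendation from the maintainers. The helper name `missing_msg` and the error text are assumptions, `re.escape` stands in for the hand-written backslashes, and `re.search` is used because that is what `pytest.raises(match=...)` applies.

```python
import re


def missing_msg(var, col):
    # Hypothetical helper: builds the regex for one (var, col) pair.
    # re.escape() makes the square brackets match literally.
    return (
        f"The following '{var}' are not present in the DataFrame: "
        f"{re.escape(col)}"
    )


# Assumed error text, chosen only so the pattern has something to match.
error_text = (
    "The following 'value_vars' are not present in the DataFrame: ['C']"
)

assert re.search(missing_msg("value_vars", "['C']"), error_text)
assert not re.search(missing_msg("id_vars", "['A']"), error_text)
```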

@WillAyd
Member

WillAyd commented Jun 26, 2020

@MatteoFelici thanks for that updated list. I checked the last few remaining modules you called out and they look OK, so I think we can close this issue.
