Commit ec48816

Merge branch 'main' into just-change-value-counts

MarcoGorelli authored Dec 27, 2022
2 parents 2d40eac + eff6566
Showing 161 changed files with 1,772 additions and 1,685 deletions.
2 changes: 1 addition & 1 deletion .github/actions/setup-conda/action.yml
@@ -18,7 +18,7 @@ runs:
  - name: Set Arrow version in ${{ inputs.environment-file }} to ${{ inputs.pyarrow-version }}
  run: |
  grep -q ' - pyarrow' ${{ inputs.environment-file }}
- sed -i"" -e "s/ - pyarrow<10/ - pyarrow=${{ inputs.pyarrow-version }}/" ${{ inputs.environment-file }}
+ sed -i"" -e "s/ - pyarrow/ - pyarrow=${{ inputs.pyarrow-version }}/" ${{ inputs.environment-file }}
  cat ${{ inputs.environment-file }}
  shell: bash
  if: ${{ inputs.pyarrow-version }}
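(Aside, not part of the diff: the sed call rewrites the conda environment file so the now-unpinned "- pyarrow" entry gets an explicit version. A tiny Python illustration of that text rewrite, using a made-up version number:)

    import re

    # Hypothetical stand-in for the sed substitution above: pin an
    # unpinned pyarrow entry in an environment file to a requested version.
    line = "  - pyarrow"
    print(re.sub(r"- pyarrow", "- pyarrow=9", line))  # "  - pyarrow=9"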
1 change: 1 addition & 0 deletions .github/workflows/macos-windows.yml
@@ -16,6 +16,7 @@ env:
  PANDAS_CI: 1
  PYTEST_TARGET: pandas
  PATTERN: "not slow and not db and not network and not single_cpu"
+ TEST_ARGS: "-W error:::pandas"


  permissions:
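(Aside, not part of the diff: "-W error:::pandas" follows Python's action:message:category:module warning-filter syntax, so the new TEST_ARGS setting makes the test run treat warnings raised from pandas code as failures. A rough Python equivalent of that filter is sketched below.)

    import warnings

    # Approximate effect of "-W error:::pandas": escalate warnings whose
    # originating module name starts with "pandas" into errors, so the
    # test run fails instead of merely printing them.
    warnings.filterwarnings("error", module="pandas")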
7 changes: 5 additions & 2 deletions .github/workflows/ubuntu.yml
@@ -29,7 +29,7 @@ jobs:
  matrix:
  env_file: [actions-38.yaml, actions-39.yaml, actions-310.yaml]
  pattern: ["not single_cpu", "single_cpu"]
- pyarrow_version: ["7", "8", "9"]
+ pyarrow_version: ["7", "8", "9", "10"]
  include:
  - name: "Downstream Compat"
  env_file: actions-38-downstream_compat.yaml
@@ -38,6 +38,7 @@
  - name: "Minimum Versions"
  env_file: actions-38-minimum_versions.yaml
  pattern: "not slow and not network and not single_cpu"
+ test_args: ""
  - name: "Locale: it_IT"
  env_file: actions-38.yaml
  pattern: "not slow and not network and not single_cpu"
@@ -62,10 +63,12 @@
  env_file: actions-310.yaml
  pattern: "not slow and not network and not single_cpu"
  pandas_copy_on_write: "1"
+ test_args: ""
  - name: "Data Manager"
  env_file: actions-38.yaml
  pattern: "not slow and not network and not single_cpu"
  pandas_data_manager: "array"
+ test_args: ""
  - name: "Pypy"
  env_file: actions-pypy-38.yaml
  pattern: "not slow and not network and not single_cpu"
@@ -93,7 +96,7 @@ jobs:
  LC_ALL: ${{ matrix.lc_all || '' }}
  PANDAS_DATA_MANAGER: ${{ matrix.pandas_data_manager || 'block' }}
  PANDAS_COPY_ON_WRITE: ${{ matrix.pandas_copy_on_write || '0' }}
- TEST_ARGS: ${{ matrix.test_args || '' }}
+ TEST_ARGS: ${{ matrix.test_args || '-W error:::pandas' }}
  PYTEST_WORKERS: ${{ contains(matrix.pattern, 'not single_cpu') && 'auto' || '1' }}
  PYTEST_TARGET: ${{ matrix.pytest_target || 'pandas' }}
  IS_PYPY: ${{ contains(matrix.env_file, 'pypy') }}
10 changes: 10 additions & 0 deletions .pre-commit-config.yaml
@@ -333,3 +333,13 @@ repos:
  additional_dependencies:
  - autotyping==22.9.0
  - libcst==0.4.7
+ - id: check-test-naming
+ name: check that test names start with 'test'
+ entry: python -m scripts.check_test_naming
+ types: [python]
+ files: ^pandas/tests
+ language: python
+ exclude: |
+ (?x)
+ ^pandas/tests/generic/test_generic.py # GH50380
+ |^pandas/tests/io/json/test_readlines.py # GH50378
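(Aside, not part of the diff: the hook invokes pandas' scripts.check_test_naming module, which is not shown here. Purely to illustrate the kind of check it performs, a minimal sketch follows; the helper name and the restriction to top-level functions are assumptions, not the real script.)

    import ast
    import sys

    def misnamed_tests(path: str) -> list[str]:
        # Hypothetical helper: report top-level functions in a test module
        # whose names do not start with "test".
        with open(path, encoding="utf-8") as f:
            tree = ast.parse(f.read())
        return [
            node.name
            for node in tree.body
            if isinstance(node, ast.FunctionDef) and not node.name.startswith("test")
        ]

    if __name__ == "__main__":
        failures = {path: misnamed_tests(path) for path in sys.argv[1:]}
        failures = {path: names for path, names in failures.items() if names}
        if failures:
            print(failures)
            sys.exit(1)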
2 changes: 1 addition & 1 deletion Dockerfile
@@ -8,6 +8,6 @@ RUN apt-get install -y build-essential
  RUN apt-get install -y libhdf5-dev

  RUN python -m pip install --upgrade pip
- RUN python -m pip install --use-deprecated=legacy-resolver \
+ RUN python -m pip install \
  -r https://raw.githubusercontent.com/pandas-dev/pandas/main/requirements-dev.txt
  CMD ["/bin/bash"]
2 changes: 1 addition & 1 deletion asv_bench/asv.conf.json
@@ -41,7 +41,7 @@
  // pip (with all the conda available packages installed first,
  // followed by the pip installed packages).
  "matrix": {
- "numpy": [],
+ "numpy": ["1.23.5"], // https://github.com/pandas-dev/pandas/pull/50356
  "Cython": ["0.29.32"],
  "matplotlib": [],
  "sqlalchemy": [],
2 changes: 1 addition & 1 deletion ci/deps/actions-310-numpydev.yaml
@@ -22,5 +22,5 @@ dependencies:
  - "cython"
  - "--extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple"
  - "--pre"
- - "numpy"
+ - "numpy<1.24"
  - "scipy"
4 changes: 2 additions & 2 deletions ci/deps/actions-310.yaml
@@ -18,7 +18,7 @@ dependencies:

  # required dependencies
  - python-dateutil
- - numpy
+ - numpy<1.24
  - pytz

  # optional dependencies
@@ -42,7 +42,7 @@ dependencies:
  - psycopg2
  - pymysql
  - pytables
- - pyarrow<10
+ - pyarrow
  - pyreadstat
  - python-snappy
  - pyxlsb
4 changes: 2 additions & 2 deletions ci/deps/actions-38-downstream_compat.yaml
@@ -19,7 +19,7 @@ dependencies:

  # required dependencies
  - python-dateutil
- - numpy
+ - numpy<1.24
  - pytz

  # optional dependencies
@@ -40,7 +40,7 @@ dependencies:
  - openpyxl
  - odfpy
  - psycopg2
- - pyarrow<10
+ - pyarrow
  - pymysql
  - pyreadstat
  - pytables
4 changes: 2 additions & 2 deletions ci/deps/actions-38.yaml
@@ -18,7 +18,7 @@ dependencies:

  # required dependencies
  - python-dateutil
- - numpy
+ - numpy<1.24
  - pytz

  # optional dependencies
@@ -40,7 +40,7 @@ dependencies:
  - odfpy
  - pandas-gbq
  - psycopg2
- - pyarrow<10
+ - pyarrow
  - pymysql
  - pyreadstat
  - pytables
4 changes: 2 additions & 2 deletions ci/deps/actions-39.yaml
@@ -18,7 +18,7 @@ dependencies:

  # required dependencies
  - python-dateutil
- - numpy
+ - numpy<1.24
  - pytz

  # optional dependencies
@@ -41,7 +41,7 @@ dependencies:
  - pandas-gbq
  - psycopg2
  - pymysql
- - pyarrow<10
+ - pyarrow
  - pyreadstat
  - pytables
  - python-snappy
2 changes: 1 addition & 1 deletion ci/deps/actions-pypy-38.yaml
@@ -19,6 +19,6 @@ dependencies:
  - hypothesis>=5.5.3

  # required
- - numpy
+ - numpy<1.24
  - python-dateutil
  - pytz
4 changes: 2 additions & 2 deletions ci/deps/circle-38-arm64.yaml
@@ -18,7 +18,7 @@ dependencies:

  # required dependencies
  - python-dateutil
- - numpy
+ - numpy<1.24
  - pytz

  # optional dependencies
@@ -40,7 +40,7 @@ dependencies:
  - odfpy
  - pandas-gbq
  - psycopg2
- - pyarrow<10
+ - pyarrow
  - pymysql
  # Not provided on ARM
  #- pyreadstat
1 change: 1 addition & 0 deletions doc/source/reference/indexing.rst
@@ -298,6 +298,7 @@ MultiIndex components
  MultiIndex.swaplevel
  MultiIndex.reorder_levels
  MultiIndex.remove_unused_levels
+ MultiIndex.drop

  MultiIndex selecting
  ~~~~~~~~~~~~~~~~~~~~
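(Aside, not part of the diff: MultiIndex.drop, newly listed above, removes the requested labels from the index, either as full tuples or within a single level. A short usage sketch:)

    import pandas as pd

    mi = pd.MultiIndex.from_tuples([("a", 1), ("a", 2), ("b", 1)])
    mi.drop([("a", 1)])      # drop one complete (level-0, level-1) entry
    mi.drop(["a"], level=0)  # drop every entry whose first level is "a"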
9 changes: 5 additions & 4 deletions doc/source/user_guide/io.rst
@@ -275,6 +275,9 @@ parse_dates : boolean or list of ints or names or list of lists or dict, default
  infer_datetime_format : boolean, default ``False``
  If ``True`` and parse_dates is enabled for a column, attempt to infer the
  datetime format to speed up the processing.
+
+ .. deprecated:: 2.0.0
+ A strict version of this argument is now the default, passing it has no effect.
  keep_date_col : boolean, default ``False``
  If ``True`` and parse_dates specifies combining multiple columns then keep the
  original columns.
@@ -916,12 +919,10 @@ an exception is raised, the next one is tried:

  Note that performance-wise, you should try these methods of parsing dates in order:

- 1. Try to infer the format using ``infer_datetime_format=True`` (see section below).
-
- 2. If you know the format, use ``pd.to_datetime()``:
+ 1. If you know the format, use ``pd.to_datetime()``:
  ``date_parser=lambda x: pd.to_datetime(x, format=...)``.

- 3. If you have a really non-standard format, use a custom ``date_parser`` function.
+ 2. If you have a really non-standard format, use a custom ``date_parser`` function.
  For optimal performance, this should be vectorized, i.e., it should accept arrays
  as arguments.
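(Aside, not part of the diff: a minimal sketch of option 1 above — passing a known format through pd.to_datetime via date_parser. The file name, column name, and format string are made-up placeholders.)

    import pandas as pd

    # Hypothetical CSV with a "date" column in a fixed, known format.
    df = pd.read_csv(
        "data.csv",
        parse_dates=["date"],
        date_parser=lambda x: pd.to_datetime(x, format="%Y-%m-%d"),
    )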
(Diffs for the remaining changed files are not shown.)