SEC: Add security disclosure process to developers page #8545
pls send me a private email and I'll take a look
Resolved. Looks like I needed to update to the latest pandas release. Thanks!
Going forward, for your project, it would be good to have a documented process for fielding security issues.
@westurner I get that you want to raise these types of issues. But I'm not sure that this is a pandas issue at all. It may be that the 'use' of pandas is incorrect, so possibly a doc note is in order as pandas is not directly web-facing.
I get all this, but what can pandas actually do about this?
ahh, you want to make this a doc issue, ok with that.
@westurner ok pull-request for 0.15.1 then!
Document which processes for reporting and resolving issues are optimal in a security-sensitive context (e.g. link to a mailing list, or whatever you feel is appropriate).
Here's a good example:
https://securitytxt.org/ recommends
@westurner : Seems reasonable. You're more than welcome to open a PR to add this!
This is already added in https://github.com/pandas-dev/pandas/blob/master/.github/SECURITY.md so I think we can close this issue
👍
Actually, this still isn't on the docs? Maybe;
|
Or would that be unhelpful because the Sphinx docs are in RST instead of the (newer) MyST Markdown?
Ah good point @westurner, this is not explicitly called out in the docs. Might be good to add a section in https://pandas.pydata.org/docs/development/policies.html with the security policy. I'll reopen this
Thanks.
From https://github.com/pandas-dev/pandas/security/policy 2023-07 :
To report a security vulnerability to pandas, please go to
https://tidelift.com/security and see the instructions there
https://github.com/pandas-dev/pandas/security/advisories lists zero
security advisories. Will need to check out how that works; does it feed
from OSV?
- Src:
- Docs: https://google.github.io/osv.dev/
From https://osv.dev/ :
Data sources
This infrastructure serves as an aggregator of vulnerability databases
that have adopted the OSV schema, including GitHub Security Advisories,
PyPA, RustSec, and Global Security Database, and more.
[...]
OSV schema
All advisories in this database use the OpenSSF OSV format, which was
developed in collaboration with open source communities.
The OSV schema provides a human and machine readable data format to
describe vulnerabilities in a way that precisely maps to open source
package versions or commit hashes.
- OSV API query endpoint:
https://api.osv.dev/v1/query
```sh
curl -d \
'{"version": "0.0.0",
"package": {"name": "pandas", "ecosystem": "PyPI"}}' \
"https://api.osv.dev/v1/query"
```
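The same query can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_osv_query` and `query_osv` are made up for illustration, and the assumption that results arrive under a `vulns` key should be checked against the OSV API docs.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name, version, ecosystem="PyPI"):
    """Return the JSON body for an OSV query about one package version."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def query_osv(name, version, ecosystem="PyPI", timeout=10):
    """POST the query to OSV and return the decoded JSON response."""
    body = json.dumps(build_osv_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (performs a network request, so it is left commented out):
# for vuln in query_osv("pandas", "1.0.3").get("vulns", []):
#     print(vuln.get("id"))
```

Building the body separately from sending it keeps the payload easy to inspect or reuse without touching the network.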
GitHub Advisory Database > Sources
https://github.com/github/advisory-database#sources :
- https://github.com/pypa/advisory-database
From https://github.com/pypa/advisory-database#readme :
*****
# Python Packaging Advisory Database
This is community owned repository of advisories for packages published on
https://pypi.org.
Advisories live in the [vulns](vulns/) directory and use a YAML encoding of
a [simple format](https://ossf.github.io/osv-schema/).
## Contributing advisories
### Making a pull request
Existing entries can be edited by simply creating a pull request.
To introduce a new entry, create a pull request with a new file that has a name
matching `PYSEC-0000-<anything>.yaml`. This will be later picked up by
automation to allocate a proper ID once merged.
### Triage process
Much of the existing set of vulnerabilities are collected from the
[NVD CVE](https://nvd.nist.gov/vuln/data-feeds) feed.
We use [this tool](https://github.com/google/osv/tree/master/vulnfeeds), which
performs a lot of heuristics to match CVEs with exact Python packages and
versions (which is a difficult problem!) and a small amount of human triage to
generate the `.yaml` entries here.
## Using this data
Vulnerabilities are integrated into the
[Open Source Vulnerabilities](https://osv.dev) project, which provides an API to
query for vulnerabilities like so:
```bash
$ curl -X POST -d \
  '{"version": "2.4.1", "package": {"name": "jinja2", "ecosystem": "PyPI"}}' \
  "https://api.osv.dev/v1/query"
```
Longer term, we are working with the PyPI team to
[build a pipeline](pypi/warehouse#9407) to
automatically get these vulnerabilities into PyPI. The goal is to
have the `pip install` (and an additional `pip audit`) command automatically
report vulnerabilities out of the box.
*****
-
- https://www.google.com/search?q=CVE-2020-13091
- pickle vuln in pandas<=1.0.3 due to upstream cpython/python#pickle vuln
- unpickling runs **code** embedded in the data: it calls whatever callable an object's `__reduce__()` method names, and there's (still?) not (yet?) a pickle protocol to prevent code execution on read
- SQLi: SQL Injection
Perhaps obviously, if you prepare unsafe SQL queries (for example, by string concatenation instead of query parameterization) and run them on a SQL database (with pandas (SQLAlchemy) or any other library in any programming language), there will be SQLi (SQL Injection) vulnerabilities in your app which depends upon pandas.
- ENH: sql support with SQLAlchemy #6292 (comment) (2014)
- https://github.com/pandas-dev/pandas/blob/main/pandas/tests/io/test_sql.py
- https://pandas.pydata.org/docs/user_guide/io.html#sql-queries
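To make the pickle point above concrete, here is a deliberately harmless sketch. The `Payload` class is hypothetical and the callable is just `int`; a hostile pickle could name any importable callable in its place.

```python
import pickle


class Payload:
    """Stand-in for attacker-controlled data (hypothetical, for illustration)."""

    def __reduce__(self):
        # Tell pickle to "rebuild" this object by calling int("42").
        # A hostile pickle could name os.system or another callable here.
        return (int, ("42",))


blob = pickle.dumps(Payload())

# Loading calls int("42") and returns its result; no Payload is reconstructed.
loaded = pickle.loads(blob)
print(type(loaded).__name__, loaded)
```

The loader ran a callable chosen by whoever produced the bytes, which is why `read_pickle()` and `pickle.loads()` should only ever be given data from a trusted source.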
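The SQLi point can be illustrated with the standard library's `sqlite3` module. This is a minimal sketch: the `users` table and the `untrusted` string are made up for the example, and other drivers may use a different placeholder syntax than `?`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)

# Hypothetical attacker-supplied input.
untrusted = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and changes the query's logic.
unsafe_sql = "SELECT name FROM users WHERE name = '" + untrusted + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the driver binds the input as a value, so it is only compared as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (untrusted,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every row while the parameterized one returns none. `pandas.read_sql_query` accepts a `params` argument that is passed through to the underlying driver, so the same pattern applies when the query is run via pandas.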
*****
- https://pandas.pydata.org/docs/user_guide/io.html#general-parsing-configuration `dtype_backend="pyarrow"`
- https://arrow.apache.org/blog/2022/02/16/introducing-arrow-flight-sql/
- Arrow Flight SQL is faster than JDBC/ODBC and is designed to be the basis for a SQL JDBC/ODBC driver
- JDBC/ODBC are typically not zero-copy operations, and there's data reshaping because database, IPC, and object structs differ unnecessarily without Arrow
- https://github.com/BlazingDB
- BlazingSQL does GPU-accelerated CuDF w/ Dask, but from_arrow() *converts* the pyarrow.Table to a cudf.DataFrame; which is not zero-copy like zero_buffer
- https://arrow.apache.org/datafusion/user-guide/faq.html#how-does-datafusion-compare-with-xyz
- DataFusion and Polars accelerate data operations by utilizing the native SIMD support in many processors
- https://en.wikipedia.org/wiki/Single_instruction,_multiple_data
- https://github.com/simdjson/simdjson
- https://duckdb.org/faq.html#does-duckdb-use-simd :
Does DuckDB use SIMD?
DuckDB does not use explicit SIMD instructions because they greatly complicate portability and compilation. Instead, DuckDB uses implicit SIMD, where we go to great lengths to write our C++ code in such a way that the compiler can auto-generate SIMD instructions for the specific hardware. As an example why this is a good idea, porting DuckDB to the new [ARM64-compatible] architecture took 10 minutes
On Mon, Jul 10, 2023, 9:50 PM Matthew Roeschke ***@***.***> wrote:
Closed #8545 as completed via #54060.
http://pandas.pydata.org/developers.html
Examples: