[GR-56921] Add an option to warn when C extensions are loaded
PullRequest: graalpython/3429
timfel authored and msimacek committed Aug 9, 2024
2 parents b16cdea + d8108c9 commit 27cbe6f
Showing 9 changed files with 72 additions and 6 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@ language runtime. The main focus is on user-observable behavior of the engine.
* Updated developer metadata of Maven artifacts.

## Version 24.1.0
* GraalPy is now considered stable for pure Python workloads. While many workloads involving native extension modules work, we continue to consider them experimental. You can use the command-line option `--python.WarnExperimentalFeatures` to enable warnings for such modules at runtime. In Java embeddings the warnings are enabled by default and you can suppress them by setting the context option 'python.WarnExperimentalFeatures' to 'false'.
* Update to Python 3.11.7
* We now provide intrinsified `_pickle` module also in the community version.
* `polyglot.eval` now raises more meaningful exceptions. Unavailable languages raise `ValueError`. Exceptions from the polyglot language are raised directly as interop objects (typed as `polyglot.ForeignException`). The shortcut for executing python files without specifying language has been removed, use regular `eval` for executing Python code.
5 changes: 3 additions & 2 deletions README.md
@@ -6,6 +6,7 @@

GraalPy is a high-performance implementation of the [Python](https://www.python.org/) language for the JVM built on [GraalVM](https://www.graalvm.org/).
GraalPy has first-class support for embedding in Java and can turn Python applications into fast, standalone binaries.
GraalPy is ready for production running pure Python code and has experimental support for many popular native extension modules.

## Why GraalPy?

@@ -17,15 +18,15 @@ GraalPy has first-class support for embedding in Java and can turn Python applic

**Compatible with the Python ecosystem**

* Install [packages](docs/user/Python-Runtime.md#installing-packages) like *NumPy*, *PyTorch*, or *Tensorflow*; run [Hugging Face](https://huggingface.co/) models like *Stable Diffusion* or *GPT*
* See if the packages you need work with our [Python Compatibility Checker](https://www.graalvm.org/python/compatibility/)
* Use almost any standard Python feature, the CPython tests run on every commit and pass ~85%
![](docs/user/assets/mcd.svg#gh-light-mode-only)![](docs/user/assets/mcd-dark.svg#gh-dark-mode-only)<sup>
We run the tests of the [most depended on PyPI packages](https://libraries.io/pypi) every day.
For 96% of those packages a recent version can be installed on GraalPy and GraalPy passes about 50% of all tests of all packages combined.
We assume that CPython not passing 100% of all tests is due to problems in our infrastructure that may also affect GraalPy.
Packages where CPython fails all tests are marked as "not tested" for both CPython and GraalPy.
</sup>
* See if the packages you need work according to our [Python Compatibility Checker](https://www.graalvm.org/python/compatibility/)
* Support for native extension modules is considered experimental, but you can already install [packages](docs/user/Python-Runtime.md#installing-packages) like *NumPy*, *PyTorch*, or *Tensorflow*; run [Hugging Face](https://huggingface.co/) models like *Stable Diffusion* or *GPT*

**Runs Python code faster**

10 changes: 9 additions & 1 deletion docs/user/Embedding-Permissions.md
@@ -56,7 +56,7 @@ The Java backend is the default when GraalPy is run via the `Context` API, that
GraalPy can log information about known incompatibility of functions executed at runtime, which includes the OS interface-related functions.
To turn on this logging, use the command-line option `--log.python.compatibility.level=FINE` (or other desired logging level).

Known limitations of the of the Java backend are:
Known limitations of the Java backend are:

* Its state is disconnected from the actual OS state, which applies especially to:
* *file descriptors*: Python-level file descriptors are not usable in native code.
@@ -74,3 +74,11 @@ Known limitations of the of the Java backend are:
## Python Native Extensions

Python native extensions run by default as native binaries, with full access to the underlying system.
See [Embedding limitations](Native-Extensions.md#embedding-limitations)

The context permissions needed to run native extensions are:
```java
.allowIO(IOAccess.ALL)
.allowCreateThread(true)
.allowNativeAccess(true)
```
26 changes: 26 additions & 0 deletions docs/user/Native-Extensions.md
@@ -0,0 +1,26 @@
---
layout: docs-experimental
toc_group: python
link_title: Native Extensions Support
permalink: /reference-manual/python/Native-Extensions/
---

# Native Extensions Support

CPython provides a [native extensions API](https://docs.python.org/3/c-api/index.html){:target="_blank"} for writing Python extensions in C/C++.
GraalPy provides experimental support for this API, which allows many packages like NumPy and PyTorch to work well for many use cases.
The support extends only to the API, not the binary interface (ABI), so extensions built for CPython are not binary compatible with GraalPy.
Packages that use the native API must be built and installed with GraalPy, and the prebuilt wheels for CPython from pypi.org cannot be used.
For best results, it is crucial that you only use the `pip` command that comes preinstalled in GraalPy virtualenvs to install packages.
The version of `pip` shipped with GraalPy applies additional patches to packages upon installation to fix known compatibility issues and it is preconfigured to use an additional repository from graalvm.org where we publish a selection of prebuilt wheels for GraalPy.
Please do not update `pip` or use alternative tools such as `uv`.

## Embedding limitations

Python native extensions run by default as native binaries, with full access to the underlying system.
Native code is not sandboxed and can circumvent any protections Truffle or the JVM may provide, up to and including aborting the entire process.
Native data structures are not subject to the Java GC and the combination of them with Java data structures may lead to memory leaks.
Native libraries generally cannot be loaded multiple times into the same process, and they may contain global state that cannot be safely reset.
Thus, it is not possible to create multiple GraalPy contexts that access native modules within the same JVM.
This includes the case when you create a context, close it, and then create another context.
The second context will not be able to access native extensions.
@@ -757,7 +757,7 @@ protected void launch(Builder contextBuilder) {
}
contextBuilder.option("python.DontWriteBytecodeFlag", Boolean.toString(dontWriteBytecode));
if (verboseFlag) {
contextBuilder.option("log.python.level", "FINE");
contextBuilder.option("log.python.level", "INFO");
}
contextBuilder.option("python.QuietFlag", Boolean.toString(quietFlag));
contextBuilder.option("python.NoUserSiteFlag", Boolean.toString(noUserSite));
@@ -791,6 +791,10 @@ protected void launch(Builder contextBuilder) {
contextBuilder.option("python.PosixModuleBackend", "java");
}

if (!hasContextOptionSetViaCommandLine("WarnExperimentalFeatures")) {
contextBuilder.option("python.WarnExperimentalFeatures", "false");
}

if (multiContext) {
contextBuilder.engine(Engine.newBuilder().allowExperimentalOptions(true).options(enginePolyglotOptions).build());
}
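
The launcher hunk above sets `python.WarnExperimentalFeatures` to `false` only when the user has not set it on the command line, while the option itself is declared with a default of `true` in the `PythonOptions` hunk of this commit. That combination is why the warnings are on by default in Java embeddings and off by default in the launcher. A minimal sketch of that precedence follows, using only the JDK. The class and method names (`WarnDefaultSketch`, `launcherOptions`, `effectiveValue`) are hypothetical and are not part of GraalPy; the maps merely stand in for the real option handling.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone model; none of these names exist in GraalPy.
final class WarnDefaultSketch {
    static final String OPTION = "python.WarnExperimentalFeatures";

    // The OptionKey added in this commit is declared with a default of true.
    static final boolean OPTION_KEY_DEFAULT = true;

    // Models the launcher: keep an explicit user choice, otherwise set "false".
    static Map<String, String> launcherOptions(Map<String, String> commandLineOptions) {
        Map<String, String> options = new HashMap<>(commandLineOptions);
        options.putIfAbsent(OPTION, "false");
        return options;
    }

    // Models option lookup in a context: an unset option falls back to the default.
    static boolean effectiveValue(Map<String, String> contextOptions) {
        String value = contextOptions.get(OPTION);
        return value == null ? OPTION_KEY_DEFAULT : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        System.out.println("launcher, option not given: " + effectiveValue(launcherOptions(Map.of())));
        System.out.println("launcher, option=true:      " + effectiveValue(launcherOptions(Map.of(OPTION, "true"))));
        System.out.println("embedding, option unset:    " + effectiveValue(Map.of()));
    }
}
```

Under these assumptions, only the first case (launcher without the option) ends up with warnings disabled; an explicit command-line value and an untouched embedding context both keep them enabled.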
@@ -1,7 +1,6 @@
*graalpython.lib-python.3.test.test_urllib2net.CloseSocketTest.test_close
*graalpython.lib-python.3.test.test_urllib2net.OtherNetworkTests.test_custom_headers
*graalpython.lib-python.3.test.test_urllib2net.OtherNetworkTests.test_file
*graalpython.lib-python.3.test.test_urllib2net.OtherNetworkTests.test_ftp
*graalpython.lib-python.3.test.test_urllib2net.OtherNetworkTests.test_redirect_url_withfrag
*graalpython.lib-python.3.test.test_urllib2net.OtherNetworkTests.test_sites_no_connection_close
*graalpython.lib-python.3.test.test_urllib2net.OtherNetworkTests.test_urlwithfrag
@@ -53,6 +53,8 @@

import java.io.IOException;
import java.nio.file.LinkOption;
import java.util.Set;
import java.util.logging.Level;

import org.graalvm.collections.Pair;
import org.graalvm.shadowed.com.ibm.icu.impl.Punycode;
@@ -274,6 +276,15 @@ private static String dlopenFlagsToString(int flags) {
return str;
}

private static final Set<String> C_EXT_SUPPORTED_LIST = Set.of(
// Stdlib modules are considered supported
"_cpython_sre",
"_cpython_unicodedata",
"_sha3",
"_sqlite3",
"termios",
"pyexpat");

/**
* This method loads a C extension module (C API) and will initialize the corresponding native
* contexts if necessary.
@@ -294,6 +305,17 @@ private static String dlopenFlagsToString(int flags) {
@TruffleBoundary
public static Object loadCExtModule(Node location, PythonContext context, ModuleSpec spec, CheckFunctionResultNode checkFunctionResultNode)
throws IOException, ApiInitException, ImportException {
if (getLogger().isLoggable(Level.WARNING) && context.getOption(PythonOptions.WarnExperimentalFeatures)) {
boolean runViaLauncher = context.getOption(PythonOptions.RunViaLauncher);
if (!runViaLauncher || !C_EXT_SUPPORTED_LIST.contains(spec.name.toJavaStringUncached())) {
String message = "Loading C extension module %s from '%s'. Support for the Python C API is considered experimental.";
if (!runViaLauncher) {
message += " See https://www.graalvm.org/latest/reference-manual/python/Native-Extensions/#embedding-limitations for the limitations. " +
"You can suppress this warning by setting the context option 'python.WarnExperimentalFeatures' to 'false'";
}
getLogger().warning(message.formatted(spec.name, spec.path));
}
}

// we always need to load the CPython C API
CApiContext cApiContext = CApiContext.ensureCapiWasLoaded(location, context, spec.name, spec.path);
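
The condition added to `loadCExtModule` above can be read as a small decision table: nothing is logged unless the logger accepts `WARNING` and `WarnExperimentalFeatures` is on. In the launcher, modules on `C_EXT_SUPPORTED_LIST` stay silent, while in an embedding every C extension triggers the warning, with an extra hint on how to suppress it. The sketch below restates that logic with plain booleans so it can be run without GraalPy. The class `CExtWarningSketch`, its method names, and the example module name and path are hypothetical; the message text and the supported list are copied from the diff.

```java
import java.util.Set;

// Hypothetical stand-alone model of the check in the diff; not GraalPy code.
final class CExtWarningSketch {
    // Same entries as C_EXT_SUPPORTED_LIST in the diff.
    private static final Set<String> SUPPORTED = Set.of(
                    "_cpython_sre", "_cpython_unicodedata", "_sha3", "_sqlite3", "termios", "pyexpat");

    // warningLoggable stands in for getLogger().isLoggable(Level.WARNING).
    static boolean shouldWarn(boolean warningLoggable, boolean warnExperimentalFeatures,
                    boolean runViaLauncher, String moduleName) {
        if (!warningLoggable || !warnExperimentalFeatures) {
            return false;
        }
        // Embeddings always warn; the launcher skips the supported stdlib modules.
        return !runViaLauncher || !SUPPORTED.contains(moduleName);
    }

    // Builds the message the same way the diff does; embeddings get the extra hint.
    static String buildMessage(boolean runViaLauncher, String name, String path) {
        String message = "Loading C extension module %s from '%s'. Support for the Python C API is considered experimental.";
        if (!runViaLauncher) {
            message += " See https://www.graalvm.org/latest/reference-manual/python/Native-Extensions/#embedding-limitations for the limitations. " +
                            "You can suppress this warning by setting the context option 'python.WarnExperimentalFeatures' to 'false'";
        }
        return String.format(message, name, path);
    }

    public static void main(String[] args) {
        // "numpy" and the path are made-up example values.
        System.out.println(shouldWarn(true, true, true, "_sqlite3"));
        System.out.println(shouldWarn(true, true, true, "numpy"));
        System.out.println(shouldWarn(true, true, false, "_sqlite3"));
        System.out.println(buildMessage(false, "numpy", "/tmp/example/numpy.so"));
    }
}
```

One consequence worth noting: in an embedding, even the stdlib modules on the supported list produce the warning, because the `!runViaLauncher` branch short-circuits the list lookup.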
@@ -388,6 +388,9 @@ private PythonOptions() {
@EngineOption @Option(category = OptionCategory.INTERNAL, usageSyntax = "true|false", help = "If true, uses native storage strategy for primitive types") //
public static final OptionKey<Boolean> UseNativePrimitiveStorageStrategy = new OptionKey<>(false);

@Option(category = OptionCategory.EXPERT, usageSyntax = "true|false", help = "Print warnings when using experimental features at runtime.", stability = OptionStability.STABLE) //
public static final OptionKey<Boolean> WarnExperimentalFeatures = new OptionKey<>(true);

public static final OptionDescriptors DESCRIPTORS = new PythonOptionsOptionDescriptors();

@CompilationFinal(dimensions = 1) private static final OptionKey<?>[] ENGINE_OPTION_KEYS;
4 changes: 3 additions & 1 deletion mx.graalpython/mx_graalpython.py
@@ -721,6 +721,8 @@ def update_unittest_tags(args):
'graalpython.lib-python.3.test.test_buffer.TestBufferProtocol.test_ndarray_slice_multidim',
# Transient failure to delete semaphore on process death
'test.test_multiprocessing_spawn.test_misc.TestResourceTracker.test_resource_tracker_sigkill',
# Connecting to external page that sometimes times out
'graalpython.lib-python.3.test.test_urllib2net.OtherNetworkTests.test_ftp',
]

result_tags = linux_tags & darwin_tags
@@ -1593,7 +1595,7 @@ def graalpython_gate_runner(args, tasks):
svm_image = python_svm()
benchmark = os.path.join(PATH_MESO, "image-magix.py")
out = mx.OutputCapture()
mx.run([svm_image, "-v", "-S", "--log.python.level=FINEST", benchmark], nonZeroIsFatal=True, out=mx.TeeOutputCapture(out), err=mx.TeeOutputCapture(out))
mx.run([svm_image, "-S", "--log.python.level=FINE", benchmark], nonZeroIsFatal=True, out=mx.TeeOutputCapture(out), err=mx.TeeOutputCapture(out))
success = "\n".join([
"[0, 0, 0, 0, 0, 0, 10, 10, 10, 0, 0, 10, 3, 10, 0, 0, 10, 10, 10, 0, 0, 0, 0, 0, 0]",
])