Describe the bug
Successively applied Pandas UDFs and MapInPandas result in exceptions or make no progress in Databricks.
Steps/Code to reproduce bug
The first example, applying two pandas UDFs in succession, results in an error:
java.lang.ArrayIndexOutOfBoundsException: 0
at ai.rapids.cudf.Table.<init>(Table.java:58)
at com.nvidia.spark.rapids.GpuColumnVector.from(GpuColumnVector.java:524)
at org.apache.spark.sql.rapids.execution.python.RebatchingRoundoffIterator.$anonfun$next$3(GpuArrowEvalPythonExec.scala:158)
at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:30)
at org.apache.spark.sql.rapids.execution.python.RebatchingRoundoffIterator.next(GpuArrowEvalPythonExec.scala:157)
at org.apache.spark.sql.rapids.execution.python.RebatchingRoundoffIterator.next(GpuArrowEvalPythonExec.scala:51)
at org.apache.spark.sql.rapids.execution.python.BatchProducer$$anon$1.next(GpuArrowEvalPythonExec.scala:248)
at org.apache.spark.sql.rapids.execution.python.BatchProducer$$anon$1.next(GpuArrowEvalPythonExec.scala:233)
...
Also, based on print-statement output in the logs, the first UDF appears to complete fully before the second one starts. Batches should flow through both Python UDFs incrementally, as they do with baseline Spark.
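The first example's code is not included above, so the following is only a rough sketch of the kind of successively applied, iterator-style pandas UDF pipeline being described; the UDFs, column names, and the noop sink are illustrative assumptions, not the original repro (it assumes `spark` is the active session):

from typing import Iterator
import pandas as pd
from pyspark.sql.functions import pandas_udf

@pandas_udf("double")
def udf_a(batch_iter: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Print per batch to observe when batches arrive at this UDF.
    for s in batch_iter:
        print("udf_a batch")
        yield s * 2.0

@pandas_udf("double")
def udf_b(batch_iter: Iterator[pd.Series]) -> Iterator[pd.Series]:
    for s in batch_iter:
        print("udf_b batch")
        yield s + 1.0

# Per the report above, with baseline Spark the batches should stream through
# both UDFs incrementally (prints interleave), while with the plugin the first
# UDF appears to drain its whole input before the second one starts.
df = spark.range(1_000_000).selectExpr("cast(id as double) as x")
df2 = df.select(udf_a("x").alias("y"))
df2.select(udf_b("y").alias("z")).write.format("noop").mode("overwrite").save()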
A different behavior is observed with the following code (but I think the two issues may be related):
import logging

import numpy as np
import pandas as pd
import pyspark.sql.functions as F
from pyspark.sql.types import StructType, StructField, DoubleType

transformed_df = spark.read.parquet(
    "s3a://spark-rapids-ml-bm-datasets-public/pca/1m_3k_singlecol_float32_50_files.parquet"
)

features_col = 'feature_array'
prediction_col = 'label'
centers = np.random.rand(1000, 3000)

sc = transformed_df.rdd.context
centers_bc = sc.broadcast(centers)

def partition_score_udf(pdf_iter):
    local_centers = centers_bc.value.astype(np.float64)
    partition_score = 0.0
    logger = logging.getLogger('partition_score_udf')
    logger.info("in partition score udf")
    for pdf in pdf_iter:
        print("in partition score udf")
        input_vecs = np.array(list(pdf[features_col]), dtype=np.float64)
        predictions = list(pdf[prediction_col])
        center_vecs = local_centers[predictions, :]
        partition_score += np.sum((input_vecs - center_vecs) ** 2)
    yield pd.DataFrame({"partition_score": [partition_score]})

total_score = (
    # the below is extremely slow
    # if instead of transformed_df_w_label_2 we apply to transformed_df_w_label it runs fine
    # one difference is that transformed_df_w_label_2 is itself the output of another pandas udf
    # so data for this case is passing back and forth between jvm and python workers multiple times
    transformed_df_w_label_2.mapInPandas(
        partition_score_udf,  # type: ignore
        StructType([StructField("partition_score", DoubleType(), True)]),
    )
    .agg(F.sum("partition_score").alias("total_score"))
    .toPandas()
)  # type: ignore
total_score = total_score["total_score"][0]  # type: ignore
In this case, at least in 13.3ML, the computation slows dramatically and may be deadlocked.
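Note that transformed_df_w_label_2 is not defined in the snippet; its comments only say it is itself the output of another pandas UDF. Continuing from the snippet above, the following is a purely hypothetical construction consistent with that description; the label-assignment logic, the UDF name, and the use of the broadcast centers are assumptions, not the original code:

from typing import Iterator
from pyspark.sql.functions import pandas_udf, col

@pandas_udf("long")
def assign_label_udf(batch_iter: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Hypothetical stand-in: assign each row the index of its nearest center,
    # so the mapInPandas input has already passed through a JVM -> Python -> JVM
    # round trip before partition_score_udf runs.
    local_centers = centers_bc.value.astype(np.float64)   # (1000, 3000)
    centers_sq = (local_centers ** 2).sum(axis=1)          # (1000,)
    for features in batch_iter:
        vecs = np.array(list(features), dtype=np.float64)  # (batch, 3000)
        # squared distances via ||v - c||^2 = ||v||^2 - 2 v.c + ||c||^2
        dists = ((vecs ** 2).sum(axis=1)[:, None]
                 - 2.0 * vecs @ local_centers.T
                 + centers_sq[None, :])
        yield pd.Series(dists.argmin(axis=1))

transformed_df_w_label_2 = transformed_df.withColumn(
    prediction_col, assign_label_udf(col(features_col))
)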
Expected behavior
No exception and no slowdowns, like with baseline Spark without the plugin.
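As a sketch of how the baseline comparison might be run, reusing the names from the snippets above: the RAPIDS Accelerator can be toggled at runtime with its spark.rapids.sql.enabled config, so the same query can be timed with and without the plugin.

# Sketch: rerun the same pipeline on CPU Spark to compare behavior and timing.
spark.conf.set("spark.rapids.sql.enabled", "false")   # fall back to baseline Spark
baseline_score = (
    transformed_df_w_label_2.mapInPandas(
        partition_score_udf,
        StructType([StructField("partition_score", DoubleType(), True)]),
    )
    .agg(F.sum("partition_score").alias("total_score"))
    .toPandas()
)
spark.conf.set("spark.rapids.sql.enabled", "true")    # re-enable the plugin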
Environment details (please complete the following information)
Environment location: First example: Databricks 12.2ML or 13.3ML, spark-rapids 24.04. The second example is slow only in Databricks 13.3ML.
Cluster shape: 2x workers with g5.2xlarge and driver with g4dn.xlarge
fix #10751
A cuDF Table requires non-empty columns, so we need to check the number of columns when converting a batch to a cuDF table. This PR adds support for rows-only batches in RebatchingRoundoffIterator.
---------
Signed-off-by: Firestarman <[email protected]>