
Unexpected console message from IVY and spark #1423

Closed
mbalduini opened this issue Aug 9, 2021 · 11 comments · Fixed by #1562
Labels
tag:Upstream: A problem with one of the upstream packages installed in the docker images
type:Bug: A problem with the definition of one of the docker images maintained here

Comments

@mbalduini

Description

  • used docker image: jupyter/pyspark-notebook:b9f6ce795cfc
  • Started up via docker-compose:
version: '3'
services:
  qc-platform:
    image: jupyter/pyspark-notebook:b9f6ce795cfc
    ports:
      - 8888:8888
    environment:
      - GRANT_SUDO=yes 
      - JUPYTER_ENABLE_LAB=yes
      - JUPYTER_TOKEN=test
    user: root
    restart: unless-stopped
  • Additional Information: Downgraded from Java 11 to Java 8

When running a simple cell to create a Spark session (see code below), annoying messages from Ivy (related to the package configuration) and from the Spark startup are printed to the console.

(Screenshot, 2021-08-09 at 13:03:56: the Ivy and Spark console messages in the notebook output)

I tried to change the log4j configuration for Spark and used the logging lib to set the global logging level, with no results.

Any help in completely removing the messages shown in the screenshot?

mbalduini added the type:Bug label on Aug 9, 2021
@mathbunnyru
Member

I think the issue is that now, when something is printed to stderr, Jupyter shows it in a red box.

What you can do is:

import os
import sys

# send all Python-level stderr writes to /dev/null
f = open(os.devnull, 'w')
sys.stderr = f

Note that this sets stderr to /dev/null, so if you want to see it later, you need to save and restore the previous value.
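
A minimal sketch of that save/restore dance (the placeholder comment marks where the noisy Spark setup would go):

import os
import sys

# minimal sketch: save the current stderr, silence it, then restore it
saved_stderr = sys.stderr
sys.stderr = open(os.devnull, 'w')
# ... run the noisy code here ...
sys.stderr.close()
sys.stderr = saved_stderr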

@mbalduini
Author

Thank you for the suggestion @mathbunnyru, but the proposed solution doesn't seem to work. No change in the output.

Any chance to specify the redirection only for a specific source?
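
(For reference, Python can scope such a redirection to a single block with contextlib.redirect_stderr. A minimal sketch follows; noisy_spark_setup is a hypothetical placeholder, and note that this only swaps the Python-level sys.stderr, so it would not catch output written directly to file descriptor 2 by the JVM subprocess.)

import os
from contextlib import redirect_stderr

# minimal sketch: silence stderr only inside this block;
# only Python-level writes are affected, not the JVM's own fd 2 output
with open(os.devnull, 'w') as devnull:
    with redirect_stderr(devnull):
        noisy_spark_setup()  # hypothetical placeholder for the Spark session code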

@mathbunnyru
Member

mathbunnyru commented Aug 9, 2021

Could you please make your question reproducible by other people?
No one wants to copy-paste the code from the screenshot.
Also, please tell us why you downgraded Java and how you did it.

@mbalduini
Author

Got it, you are right.

  • I downgraded to Java 8 in order to avoid the additional warnings below:
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/spark-3.1.2-bin-hadoop3.2/jars/spark-unsafe_2.12-3.1.2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
  • I downgraded by installing the OpenJDK 8 package (sudo apt-get install openjdk-8-jre) and then selecting that version via the sudo update-alternatives --config java command

  • Here is the code I used to create the Spark session with additional packages:

from pyspark.sql import SparkSession

spark_jars = ",".join([
    "org.apache.hadoop:hadoop-aws:3.2.0",
    "org.postgresql:postgresql:42.2.18",
    "org.apache.spark:spark-avro_2.12:3.0.1",
    "org.apache.spark:spark-streaming-kafka-0-10_2.11:2.4.5",
    "org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1",
    "org.apache.kafka:kafka-clients:2.6.0",
    "com.databricks:spark-xml_2.12:0.12.0",
])

spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("test-edu")
    .config("spark.jars.packages", spark_jars)
    .getOrCreate()
)

spark
  • Here is the Ivy output:
Ivy Default Cache set to: /home/jovyan/.ivy2/cache
The jars for the packages stored in: /home/jovyan/.ivy2/jars
org.apache.hadoop#hadoop-aws added as a dependency
org.postgresql#postgresql added as a dependency
org.apache.spark#spark-avro_2.12 added as a dependency
org.apache.spark#spark-streaming-kafka-0-10_2.11 added as a dependency
org.apache.spark#spark-sql-kafka-0-10_2.12 added as a dependency
org.apache.kafka#kafka-clients added as a dependency
com.databricks#spark-xml_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-b38a7466-c91b-4810-b1f8-64ad6781a4d4;1.0
	confs: [default]
	found org.apache.hadoop#hadoop-aws;3.2.0 in central
	found com.amazonaws#aws-java-sdk-bundle;1.11.375 in central
	found org.postgresql#postgresql;42.2.18 in central
	found org.checkerframework#checker-qual;3.5.0 in central
	found org.apache.spark#spark-avro_2.12;3.0.1 in central
	found org.spark-project.spark#unused;1.0.0 in central
	found org.apache.spark#spark-streaming-kafka-0-10_2.11;2.4.5 in central
	found org.apache.spark#spark-sql-kafka-0-10_2.12;3.0.1 in central
	found org.apache.spark#spark-token-provider-kafka-0-10_2.12;3.0.1 in central
	found org.apache.commons#commons-pool2;2.6.2 in central
	found org.apache.kafka#kafka-clients;2.6.0 in central
	found com.github.luben#zstd-jni;1.4.4-7 in central
	found org.lz4#lz4-java;1.7.1 in central
	found org.xerial.snappy#snappy-java;1.1.7.3 in central
	found org.slf4j#slf4j-api;1.7.30 in central
	found com.databricks#spark-xml_2.12;0.12.0 in central
	found commons-io#commons-io;2.8.0 in central
	found org.glassfish.jaxb#txw2;2.3.3 in central
	found org.apache.ws.xmlschema#xmlschema-core;2.2.5 in central
:: resolution report :: resolve 491ms :: artifacts dl 13ms
	:: modules in use:
	com.amazonaws#aws-java-sdk-bundle;1.11.375 from central in [default]
	com.databricks#spark-xml_2.12;0.12.0 from central in [default]
	com.github.luben#zstd-jni;1.4.4-7 from central in [default]
	commons-io#commons-io;2.8.0 from central in [default]
	org.apache.commons#commons-pool2;2.6.2 from central in [default]
	org.apache.hadoop#hadoop-aws;3.2.0 from central in [default]
	org.apache.kafka#kafka-clients;2.6.0 from central in [default]
	org.apache.spark#spark-avro_2.12;3.0.1 from central in [default]
	org.apache.spark#spark-sql-kafka-0-10_2.12;3.0.1 from central in [default]
	org.apache.spark#spark-streaming-kafka-0-10_2.11;2.4.5 from central in [default]
	org.apache.spark#spark-token-provider-kafka-0-10_2.12;3.0.1 from central in [default]
	org.apache.ws.xmlschema#xmlschema-core;2.2.5 from central in [default]
	org.checkerframework#checker-qual;3.5.0 from central in [default]
	org.glassfish.jaxb#txw2;2.3.3 from central in [default]
	org.lz4#lz4-java;1.7.1 from central in [default]
	org.postgresql#postgresql;42.2.18 from central in [default]
	org.slf4j#slf4j-api;1.7.30 from central in [default]
	org.spark-project.spark#unused;1.0.0 from central in [default]
	org.xerial.snappy#snappy-java;1.1.7.3 from central in [default]
	:: evicted modules:
	org.apache.kafka#kafka-clients;2.0.0 by [org.apache.kafka#kafka-clients;2.6.0] in [default]
	org.apache.kafka#kafka-clients;2.4.1 by [org.apache.kafka#kafka-clients;2.6.0] in [default]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   21  |   0   |   0   |   2   ||   19  |   0   |
	---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-b38a7466-c91b-4810-b1f8-64ad6781a4d4
	confs: [default]
	0 artifacts copied, 19 already retrieved (0kB/26ms)
21/08/09 12:19:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).

The console output related to Ivy operations appears with both Java versions.

@mbalduini
Author

Any update on this issue?

I tested the code with the most recent version and the behaviour persists.

@mathbunnyru
Member

@mbalduini I've tried several solutions, but I didn't find anything that works. I think you need to somehow configure the pyspark.sql logger (I haven't used pyspark, which is why I can't help you further).

@mathbunnyru
Member

Another option would be to somehow configure JupyterLab / the JupyterLab cell not to show stderr (or something like this).
I don't know if it's easily possible.
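
For a per-cell approach, IPython's %%capture cell magic may be worth trying (a minimal sketch; like the sys.stderr redirection above, it replaces the Python-level streams, so it may not intercept output arriving at the file-descriptor level):

%%capture spark_startup_io
# minimal sketch: capture this cell's Python-level stdout/stderr into
# spark_startup_io (a hypothetical variable name for the captured output)
spark = SparkSession.builder.getOrCreate()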

@mbalduini
Author

Hi @mathbunnyru, thank you for your effort.
Unfortunately, I tried several options too, but with no success yet, even with the latest release.

Do you have any further information or suggestions to cope with this problem?

@romainx
Collaborator

romainx commented Jan 6, 2022

Hello @mbalduini and @mathbunnyru,

I have looked into this problem in more depth. The modification of the notebook output comes from one of the changes made in release 6.0.0 of ipykernel.

All outputs to stdout/stderr should now be captured, including subprocesses and output of compiled libraries (blas, lapack....). In the notebook server, some outputs that would previously go to the notebook logs will now go both to the notebook logs and to the notebook outputs.

A subsequent fix provides a way to restore the previous behavior. The fix consists in disabling the new capture behavior by turning it off through the capture_fd_output flag; see the following comment for more detail -> ipython/ipykernel#795 (comment).

You can configure it by turning it off in your ipython profile.

# create a default profile
ipython profile create

Edit the file ~/.ipython/profile_default/ipython_kernel_config.py and add the following line.

c.IPKernelApp.capture_fd_output = False
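
If you prefer to script the change instead of editing the file by hand, a minimal sketch in Python (assuming the profile has already been created at the default location):

from pathlib import Path

# minimal sketch: append the setting to the default kernel config file
config = Path.home() / '.ipython' / 'profile_default' / 'ipython_kernel_config.py'
with config.open('a') as f:
    f.write('\nc.IPKernelApp.capture_fd_output = False\n')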

That's it! All the output from Java, Spark, and Ivy will no longer be displayed in the notebook, only in the logs.
We have to check if we could / should do something here to provide this configuration by default.
@mathbunnyru what is your opinion?

romainx added the tag:Upstream label on Jan 6, 2022
@mathbunnyru
Member

@romainx nice!

I think we can try to add this file to the pyspark image (it will also be included in all-spark).
I think these logs are noisy: everyone using Spark sees them, and they don't convey much useful information when everything goes right.

@romainx
Collaborator

romainx commented Jan 7, 2022

@mathbunnyru 👍
And in fact they still appear in the container logs even after this change.
I will draft a PR for that (this will be the opportunity to start my contributions here again 😄).
