Add the support for the sparklens in spark-3.0.0 and later version of spark with scala-2.12 #63

Open
wants to merge 1 commit into
base: SPARK30

@SaurabhChawla100 SaurabhChawla100 commented Mar 19, 2021

Spark 3.0.0 introduced many changes from the community, so I am creating this improvement PR to make Sparklens work with Spark 3.0.0 and later versions of Spark with Scala 2.12.

I have made the code changes and tested them. Please find the output below:

./bin/spark-shell --jars file:///tmp//src/opensrc/sparklens/target/scala-2.12/sparklens_2.12-0.4.0.jar --conf spark.extraListeners=com.qubole.sparklens.QuboleJobListener --conf spark.eventLog.enabled=true


Printing application meterics. These metrics are collected at task-level granularity and aggregated across the app (all tasks, stages, and jobs).

 AggregateMetrics (Application Metrics) total measurements 20 
                NAME                        SUM                MIN           MAX                MEAN         
 diskBytesSpilled                            0.0 KB         0.0 KB         0.0 KB              0.0 KB
 executorRuntime                           344.0 ms         3.0 ms        30.0 ms             17.0 ms
 inputBytesRead                              0.0 KB         0.0 KB         0.0 KB              0.0 KB
 jvmGCTime                                   1.3 ss         0.0 ms       130.0 ms             65.0 ms
 memoryBytesSpilled                          0.0 KB         0.0 KB         0.0 KB              0.0 KB
 outputBytesWritten                          0.0 KB         0.0 KB         0.0 KB              0.0 KB
 peakExecutionMemory                         0.0 KB         0.0 KB         0.0 KB              0.0 KB
 resultSize                                 17.5 KB         0.9 KB         0.9 KB              0.9 KB
 shuffleReadBytesRead                        0.0 KB         0.0 KB         0.0 KB              0.0 KB
 shuffleReadFetchWaitTime                    0.0 ms         0.0 ms         0.0 ms              0.0 ms
 shuffleReadLocalBlocks                           0              0              0                   0
 shuffleReadRecordsRead                           0              0              0                   0
 shuffleReadRemoteBlocks                          0              0              0                   0
 shuffleWriteBytesWritten                    0.0 KB         0.0 KB         0.0 KB              0.0 KB
 shuffleWriteRecordsWritten                       0              0              0                   0
 shuffleWriteTime                            0.0 ms         0.0 ms         0.0 ms              0.0 ms
 taskDuration                                4.3 ss        10.0 ms       441.0 ms            215.0 ms




Total Hosts 1, and the maximum concurrent hosts = 1


Host 192.168.1.19 startTime 09:32:50:262 executors count 1
Done printing host timeline
======================



Printing executors timeline....

Total Executors 1, and maximum concurrent executors = 1
At 09:32 executors added 1 & removed  0 currently available 1

Done printing executors timeline...
============================



Printing Application timeline 

09:32:49:492 app started 
09:32:57:059 JOB 0 started : duration 00m 00s 
[      0                                      |||||||||||||||||||||||||||||||||||||||||| ]
09:32:57:448      Stage 0 started : duration 00m 00s 
09:32:57:889      Stage 0 ended : maxTaskTime 30 taskCount 10
09:32:57:899 JOB 0 ended 
09:33:01:125 JOB 1 started : duration 00m 00s 
[      1        |||||||||||||||||                                                        ]
09:33:01:132      Stage 1 started : duration 00m 00s 
09:33:01:149      Stage 1 ended : maxTaskTime 6 taskCount 10
09:33:01:150 JOB 1 ended 
09:33:03:044 app ended 



Checking for job overlap...

 
 JobGroup 1  SQLExecID (-1)
 Number of Jobs 1  JobIDs(0)
 Timing [09:32:57:059 - 09:32:57:899]
 Duration  00m 00s
 
 JOB 0 Start 09:32:57:059  End 09:32:57:899
 
 
 JobGroup 2  SQLExecID (-1)
 Number of Jobs 1  JobIDs(1)
 Timing [09:33:01:125 - 09:33:01:150]
 Duration  00m 00s
 
 JOB 1 Start 09:33:01:125  End 09:33:01:150
 

No overlapping jobgroups found. Good




 Time spent in Driver vs Executors
 Driver WallClock Time    00m 12s   93.62%
 Executor WallClock Time  00m 00s   6.38%
 Total WallClock Time     00m 13s
      


Minimum possible time for the app based on the critical path (with infinite resources)   00m 12s
Minimum possible time for the app with same executors, perfect parallelism and zero skew 00m 12s
If we were to run this app with single executor and single core                          00h 00m

       
 Total cores available to the app 16

 OneCoreComputeHours: Measure of total compute power available from cluster. One core in the executor, running
                      for one hour, counts as one OneCoreComputeHour. Executors with 4 cores, will have 4 times
                      the OneCoreComputeHours compared to one with just one core. Similarly, one core executor
                      running for 4 hours will OnCoreComputeHours equal to 4 core executor running for 1 hour.

 Driver Utilization (Cluster idle because of driver)

 Total OneCoreComputeHours available                             00h 03m
 Total OneCoreComputeHours available (AutoScale Aware)           00h 03m
 OneCoreComputeHours wasted by driver                            00h 03m

 AutoScale Aware: Most of the calculations by this tool will assume that all executors are available throughout
                  the runtime of the application. The number above is printed to show possible caution to be
                  taken in interpreting the efficiency metrics.

 Cluster Utilization (Executors idle because of lack of tasks or skew)

 Executor OneCoreComputeHours available                  00h 00m
 Executor OneCoreComputeHours used                       00h 00m        2.49%
 OneCoreComputeHours wasted                              00h 00m        97.51%

 App Level Wastage Metrics (Driver + Executor)

 OneCoreComputeHours wasted Driver               93.62%
 OneCoreComputeHours wasted Executor             6.22%
 OneCoreComputeHours wasted Total                99.84%

       


 App completion time and cluster utilization estimates with different executor counts

 Real App Duration 00m 13s
 Model Estimation  00m 12s
 Model Error       6%

 NOTE: 1) Model error could be large when auto-scaling is enabled.
       2) Model doesn't handles multiple jobs run via thread-pool. For better insights into
          application scalability, please try such jobs one by one without thread-pool.

       
 Executor count     1  (100%) estimated time 00m 12s and estimated cluster utilization 0.17%
 Executor count     1  (110%) estimated time 00m 12s and estimated cluster utilization 0.17%
 Executor count     1  (120%) estimated time 00m 12s and estimated cluster utilization 0.17%
 Executor count     1  (150%) estimated time 00m 12s and estimated cluster utilization 0.17%
 Executor count     2  (200%) estimated time 00m 12s and estimated cluster utilization 0.08%
 Executor count     3  (300%) estimated time 00m 12s and estimated cluster utilization 0.06%
 Executor count     4  (400%) estimated time 00m 12s and estimated cluster utilization 0.04%
 Executor count     5  (500%) estimated time 00m 12s and estimated cluster utilization 0.03%



Total tasks in all stages 20
Per Stage  Utilization
Stage-ID   Wall    Task      Task     IO%    Input     Output    ----Shuffle-----    -WallClockTime-    --OneCoreComputeHours---   MaxTaskMem
          Clock%  Runtime%   Count                               Input  |  Output    Measured | Ideal   Available| Used%|Wasted%                                  
       0   96.00   84.30        10    NaN    0.0 KB    0.0 KB    0.0 KB    0.0 KB    00m 00s   00m 00s    00h 00m    4.1   95.9    0.0 KB 
       1    3.00   15.70        10    NaN    0.0 KB    0.0 KB    0.0 KB    0.0 KB    00m 00s   00m 00s    00h 00m   19.9   80.1    0.0 KB 
Max memory which an executor could have taken =   0.0 KB


 Stage-ID WallClock  OneCore       Task   PRatio    -----Task------   OIRatio  |* ShuffleWrite% ReadFetch%   GC%  *|
          Stage%     ComputeHours  Count            Skew   StageSkew                                                
      0   96.29         00h 00m      10    0.63     1.03     0.07     0.00     |*   0.00           0.00   448.28  *|
      1    3.71         00h 00m      10    0.63     1.00     0.35     0.00     |*   0.00           0.00     0.00  *|

PRatio:        Number of tasks in stage divided by number of cores. Represents degree of
               parallelism in the stage
TaskSkew:      Duration of largest task in stage divided by duration of median task.
               Represents degree of skew in the stage
TaskStageSkew: Duration of largest task in stage divided by total duration of the stage.
               Represents the impact of the largest task on stage time.
OIRatio:       Output to input ration. Total output of the stage (results + shuffle write)
               divided by total input (input data + shuffle read)

These metrics below represent distribution of time within the stage

ShuffleWrite:  Amount of time spent in shuffle writes across all tasks in the given
               stage as a percentage
ReadFetch:     Amount of time spent in shuffle read across all tasks in the given
               stage as a percentage
GC:            Amount of time spent in GC across all tasks in the given stage as a
               percentage

If the stage contributes large percentage to overall application time, we could look into
these metrics to check which part (Shuffle write, read fetch or GC is responsible)
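
As a sanity check on the report above, several of its percentages can be recomputed from the timeline and the aggregate metrics. The sketch below is only a reading of the printed output, not code taken from Sparklens: it assumes that executor wall-clock time is the sum of the job durations, that driver time is the remainder of the application run, and that available core time is the core count multiplied by wall-clock time. All variable and function names are made up for illustration.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S:%f"  # report timestamps look like 09:32:49:492


def ms_between(start: str, end: str) -> int:
    """Whole milliseconds between two report timestamps on the same day."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta // timedelta(milliseconds=1)


# Values copied from the report above.
CORES = 16                 # "Total cores available to the app"
EXECUTOR_RUNTIME_MS = 344  # executorRuntime SUM
app_ms = ms_between("09:32:49:492", "09:33:03:044")
job_ms = [
    ms_between("09:32:57:059", "09:32:57:899"),  # JOB 0
    ms_between("09:33:01:125", "09:33:01:150"),  # JOB 1
]
stages = {  # stage id -> (wall-clock ms, maxTaskTime ms, taskCount)
    0: (ms_between("09:32:57:448", "09:32:57:889"), 30, 10),
    1: (ms_between("09:33:01:132", "09:33:01:149"), 6, 10),
}

# Assumption: executor wall-clock time is the sum of job durations,
# and driver time is whatever remains of the application run.
executor_ms = sum(job_ms)
executor_pct = 100 * executor_ms / app_ms
driver_pct = 100 - executor_pct

# Assumption: available core time is cores multiplied by wall-clock time.
total_core_ms = CORES * app_ms
executor_core_ms = CORES * executor_ms
used_pct = 100 * EXECUTOR_RUNTIME_MS / executor_core_ms
executor_wasted_pct = 100 * (executor_core_ms - EXECUTOR_RUNTIME_MS) / total_core_ms

print(f"Driver {driver_pct:.2f}%  Executor {executor_pct:.2f}%")
print(f"Executor core time used {used_pct:.2f}%")
print(f"Wasted: driver {driver_pct:.2f}% + executor {executor_wasted_pct:.2f}%")

# Per-stage ratios, following the PRatio and TaskStageSkew definitions above.
total_stage_ms = sum(wall for wall, _, _ in stages.values())
for stage_id, (wall_ms, max_task_ms, task_count) in stages.items():
    p_ratio = task_count / CORES          # tasks divided by cores
    stage_skew = max_task_ms / wall_ms    # largest task / stage duration
    stage_pct = 100 * wall_ms / total_stage_ms
    print(f"Stage {stage_id}: WallClock {stage_pct:.2f}%  "
          f"PRatio {p_ratio:.3f}  TaskStageSkew {stage_skew:.2f}")
```

Under these assumptions the script gives the same driver/executor split (93.62% / 6.38%), executor usage (2.49%), executor wastage (6.22%), per-stage wall-clock shares (96.29% / 3.71%) and TaskStageSkew values (0.07 / 0.35) as the report. PRatio comes out as 0.625 for both stages, which the report prints as 0.63.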

@SaurabhChawla100
Author

cc @itskals @iamrohit @mayurdb

@vishalrv1904

How did you run sbt compile?
I am facing this error on compilation:
sbt.ResolveException: unresolved dependency: org.spark-packages#sbt-spark-package;0.2.4: not found

@tayarajat

tayarajat commented Aug 11, 2021

> How did you run sbt compile?
> I am facing this error on compilation:
> sbt.ResolveException: unresolved dependency: org.spark-packages#sbt-spark-package;0.2.4: not found

Update the resolver URL to https://repos.spark-packages.org/

@vishalrv1904

> > How did you run sbt compile?
> > I am facing this error on compilation:
> > sbt.ResolveException: unresolved dependency: org.spark-packages#sbt-spark-package;0.2.4: not found
>
> Update the resolver URL to https://repos.spark-packages.org/

Wow, that works!
Thanks @tayarajat

@mayurdb
Collaborator

mayurdb commented Aug 21, 2021

@SaurabhChawla100 Should we have a separate branch for Spark 3.0 instead?

I have created a new branch for Spark 3.0. If you think it's better to have a separate branch, please raise this PR against https://github.com/qubole/sparklens/tree/SPARK30

@SaurabhChawla100 SaurabhChawla100 changed the base branch from master to SPARK30 August 23, 2021 08:36
@SaurabhChawla100
Author

SaurabhChawla100 commented Aug 23, 2021

> @SaurabhChawla100 Should we have a separate branch for Spark 3.0 instead?
>
> I have created a new branch for Spark 3.0. If you think it's better to have a separate branch, please raise this PR against https://github.com/qubole/sparklens/tree/SPARK30

@mayurdb - I have changed the base branch to SPARK30.
However, I think it would be better to keep a branch2.x branch and merge SPARK30 into master.

… spark with scala-2.12

  Author:    SaurabhChawla
  Date:      Fri Mar 19 22:37:13 2021 +0530
  Committer: Saurabh Chawla <[email protected]>

(cherry picked from commit 479468d)
@amaltb

amaltb commented Sep 5, 2021

How did you run the sbt compile? I have updated the resolver URL to the one mentioned above, but I am still getting an unresolved dependency error:

sbt.ResolveException: unresolved dependency: com.eed3si9n#sbt-assembly;0.12.0: not found
[error] unresolved dependency: org.spark-packages#sbt-spark-package;0.2.4: not found

My project/plugin.sbt is:

`addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.12.0")

resolvers += "Spark Package Main Repo" at "https://repos.spark-packages.org/"

addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.4")`

@jbornemann

Any update on this? This would be great to have
