The purpose of this enhancement is to create a qualification tool that analyzes customer Spark event files to determine which workloads are suitable for execution with Gluten. This is crucial when onboarding new customers, because not all workloads benefit from Gluten's native acceleration, especially workloads that rely on RDD operations, unsupported SQL operators, or UDFs.
Proposed Solution:
Develop a Java program to analyze the event files, given a Hadoop file path as input. The program will generate two reports:
Application Report:
Percentage of RDD usage
Percentage of Unsupported SQL operations
Percentage of supported SQL operations
Cumulative task time for each application
Recommendation on whether to use Gluten acceleration (recommended if the percentage of supported SQL operations is >= 70%)
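As a rough illustration of how the Application Report fields could fit together, here is a minimal sketch. The class name, field names, and the choice to measure each percentage as a share of cumulative task time are assumptions made for the example, not part of the proposal. Only the 70% threshold comes from the text above.

```java
import java.util.Locale;

/** Hypothetical per-application summary; all names here are illustrative. */
final class ApplicationReportSketch {
    /** Threshold taken from the proposal: recommend Gluten at >= 70% supported SQL. */
    static final double SUPPORTED_SQL_THRESHOLD_PERCENT = 70.0;

    // Assumption: each category is measured by the task time attributed to it.
    final long rddTaskTimeMs;
    final long unsupportedSqlTaskTimeMs;
    final long supportedSqlTaskTimeMs;

    ApplicationReportSketch(long rddTaskTimeMs, long unsupportedSqlTaskTimeMs, long supportedSqlTaskTimeMs) {
        this.rddTaskTimeMs = rddTaskTimeMs;
        this.unsupportedSqlTaskTimeMs = unsupportedSqlTaskTimeMs;
        this.supportedSqlTaskTimeMs = supportedSqlTaskTimeMs;
    }

    /** Cumulative task time for the application across all three categories. */
    long cumulativeTaskTimeMs() {
        return rddTaskTimeMs + unsupportedSqlTaskTimeMs + supportedSqlTaskTimeMs;
    }

    /** Share of the total as a percentage; an empty application yields 0. */
    static double percent(long part, long total) {
        return total == 0 ? 0.0 : 100.0 * part / total;
    }

    double rddPercent() {
        return percent(rddTaskTimeMs, cumulativeTaskTimeMs());
    }

    double unsupportedSqlPercent() {
        return percent(unsupportedSqlTaskTimeMs, cumulativeTaskTimeMs());
    }

    double supportedSqlPercent() {
        return percent(supportedSqlTaskTimeMs, cumulativeTaskTimeMs());
    }

    /** Applies the threshold rule stated in the proposal. */
    boolean recommended() {
        return supportedSqlPercent() >= SUPPORTED_SQL_THRESHOLD_PERCENT;
    }

    /** Formats one report row; the column layout is an assumption. */
    String toReportLine(String applicationId) {
        return String.format(Locale.ROOT, "%s,%.1f,%.1f,%.1f,%d,%s",
                applicationId, rddPercent(), unsupportedSqlPercent(), supportedSqlPercent(),
                cumulativeTaskTimeMs(), recommended() ? "RECOMMENDED" : "NOT_RECOMMENDED");
    }

    public static void main(String[] args) {
        System.out.println(new ApplicationReportSketch(100, 200, 700).toReportLine("app-example"));
    }
}
```

If the percentages are meant to be computed over operator counts rather than task time, only the inputs change; the threshold check stays the same.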
Unsupported Operator Report:
Unsupported SQL operators
Impact on cumulative CPU time
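One possible way to build the Unsupported Operator Report is to sum the CPU time attributed to each unsupported operator and rank operators by that total. The sketch below assumes the per-operator CPU time has already been extracted from the event files. The `Observation` type and the operator names are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical aggregation for the unsupported operator report. */
final class UnsupportedOperatorReportSketch {

    /** One occurrence of an unsupported operator and the CPU time attributed to it. */
    static final class Observation {
        final String operator;
        final long cpuTimeMs;

        Observation(String operator, long cpuTimeMs) {
            this.operator = operator;
            this.cpuTimeMs = cpuTimeMs;
        }
    }

    /** Sums CPU time per operator and returns entries sorted by total, largest first. */
    static List<Map.Entry<String, Long>> rankByCpuTime(List<Observation> observations) {
        Map<String, Long> totals = new HashMap<>();
        for (Observation o : observations) {
            totals.merge(o.operator, o.cpuTimeMs, Long::sum);
        }
        List<Map.Entry<String, Long>> ranked = new ArrayList<>(totals.entrySet());
        ranked.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return ranked;
    }

    public static void main(String[] args) {
        // Placeholder operator names; real names would come from the event files.
        List<Observation> observations = Arrays.asList(
                new Observation("OperatorA", 100),
                new Observation("OperatorB", 500),
                new Observation("OperatorA", 50));
        for (Map.Entry<String, Long> entry : rankByCpuTime(observations)) {
            System.out.println(entry.getKey() + "," + entry.getValue());
        }
    }
}
```

Ranking by total CPU time puts the operators whose support would unlock the most acceleration at the top of the report.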
Requirements:
Compatibility with Hadoop file paths that point to:
Single event files
Event directories with rolling event files
Deeply nested directories containing event files
Compressed event files
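The path-handling requirements can be sketched as a single discovery step that accepts either a file or a directory and walks nested directories recursively. To stay self-contained, the example below uses `java.nio.file` on a local filesystem as a stand-in. A Hadoop-based version would presumably use something like `FileSystem.listFiles(path, true)` instead, and decompression is left out because it depends on which codecs the event files use. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical discovery step; local-filesystem stand-in for a Hadoop path. */
final class EventFileDiscoverySketch {

    /**
     * Returns the input itself when it is a single file, otherwise every regular
     * file found under the directory at any depth, in sorted order.
     */
    static List<Path> findEventFiles(Path input) throws IOException {
        if (Files.isRegularFile(input)) {
            return Collections.singletonList(input);
        }
        try (Stream<Path> walk = Files.walk(input)) {
            return walk.filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no path is given.
        Path input = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : findEventFiles(input)) {
            System.out.println(file);
        }
    }
}
```

Treating rolling event directories and deeply nested directories the same way keeps the discovery logic simple. Grouping the discovered files back into applications would be a separate step.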