PIC S9(10)V USAGE COMP-3 is converted to long instead of Decimal(10,0) #678
I have a copybook where a field is defined as PIC S9(10)V USAGE COMP-3. When we read the file with Cobrix, this field is created as long instead of Decimal(10,0). Is this expected because of a Spark property, or can it be created as Decimal(10,0)?

Comments
Yes, Cobrix prefers primitive integral types (int/long) for fields that have no fractional part. You can cast such fields to DecimalType yourself if you need exact decimals.
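For illustration, here is a minimal sketch of the default behaviour described above. The copybook snippet, field name, and paths are made up for this example; the schema comment reflects what this issue reports:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Copybook with a field like the one from this issue (illustrative only)
val copybookContents =
  """        01  RECORD.
    |            05  AMOUNT    PIC S9(10)V USAGE COMP-3.
    |""".stripMargin

val df = spark.read
  .format("cobol")
  .option("copybook_contents", copybookContents)
  .load("/path/to/data")

df.printSchema()
// root
//  |-- AMOUNT: long (nullable = true)    <- long by default, not decimal(10,0)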
Sure, we can do that, but we would have to maintain a mapping of which fields to convert for each file, which is why I was asking.
One way to automate this is to use extended metadata. It adds more information from the copybook to each field:

.option("metadata", "extended")

You can view which metadata fields are available like this:

df.schema.fields.foreach { field =>
  val metadata = field.metadata
  println(s"Field: ${field.name}, metadata: ${metadata.json}")
}
// Returns: {"originalName":"MY_FIELD","precision":4,"usage":"COMP-3","signed":true,"offset":17,"byte_size":2,"sign_separate":false,"pic":"S9(4)","level":2}

The precision value reflects the number of digits in the PIC clause. You can apply the casting logic something like this:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DecimalType, IntegerType, LongType, ShortType}

val columns = df.schema.fields.map { field =>
  val metadata = field.metadata
  if (metadata.contains("precision")) {
    val precision = metadata.getLong("precision").toInt
    if (field.dataType == LongType || field.dataType == IntegerType || field.dataType == ShortType) {
      println(s"Cast ${field.name} to decimal($precision,0)")
      col(field.name).cast(DecimalType(precision, 0)).as(field.name)
    } else {
      col(field.name)
    }
  } else {
    col(field.name)
  }
}

val dfNew = df.select(columns: _*)
dfNew.printSchema
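For reuse across many files, the same casting logic can be wrapped in a small helper. This is only a sketch; the helper name upcastIntegralsToDecimal is made up and is not part of Cobrix:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DecimalType, IntegerType, LongType, ShortType}

// Hypothetical helper: upcasts integral columns back to decimal(precision, 0)
// using the "precision" entry from Cobrix's extended metadata.
def upcastIntegralsToDecimal(df: DataFrame): DataFrame = {
  val columns = df.schema.fields.map { field =>
    val isIntegral = field.dataType == LongType ||
      field.dataType == IntegerType ||
      field.dataType == ShortType
    if (isIntegral && field.metadata.contains("precision")) {
      val precision = field.metadata.getLong("precision").toInt
      col(field.name).cast(DecimalType(precision, 0)).as(field.name)
    } else {
      col(field.name)
    }
  }
  df.select(columns: _*)
}

// Usage:
// val dfDecimals = upcastIntegralsToDecimal(df)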
An experimental method:

val df = spark.read.format("cobol")
  .option("metadata", "extended")
  .load("/path/to/file")

val df2 = SparkUtils.covertIntegralToDecimal(df)
This is something we are going to implement as a new feature. Something like: .option("strict_integral_precision", "true") (not final)
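If the option ships in that form, usage would presumably look like the sketch below. The option name is explicitly not final, so treat this as illustrative only:

// Hypothetical usage of the proposed option (name not final)
val dfStrict = spark.read
  .format("cobol")
  .option("copybook", "/path/to/copybook.cpy")
  .option("strict_integral_precision", "true")
  .load("/path/to/data")

dfStrict.printSchema()
// PIC S9(10)V COMP-3 would then come back as decimal(10,0) instead of long.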
The changes are available in the
The feature is available in