Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blob and struct support #258

Merged
merged 7 commits into from
Oct 17, 2023
Merged

Blob and struct support #258

merged 7 commits into from
Oct 17, 2023

Conversation

honnix
Copy link
Member

@honnix honnix commented Oct 7, 2023

TL;DR

Restore the more or less useless blob support that we had before moving to 0.4. Also added struct support for both Java and Scala.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

It is more or less useless because blob is designed to go together with offloaded data (blob is not designed to be used directly as a task input/output) https://docs.flyte.org/projects/flytekit/en/latest/generated/flytekit.BlobType.html#flytekit-blobtype, which I think we have no plan to support any time soon.

Add struct support for both Java and Scala so a task can have nested input/output using either auto-value object or case class.

Note that originally this work was broken into 3 PRs (this and #259 and #262), each on top of a previous one, to avoid complex merging and conflict resolving. In a later stage of development, we merged #259 and #262 to this PR to ship everything in one goal.

Tracking Issue

NA

Follow-up issue

NA

@honnix honnix force-pushed the blob branch 8 times, most recently from 3280083 to 4d261c9 Compare October 8, 2023 21:15
@@ -230,6 +231,18 @@ object SdkScalaType {
implicit def durationLiteralType: SdkScalaLiteralType[Duration] =
DelegateLiteralType(SdkLiteralTypes.durations())

// fixme: create blob type from annotation
implicit def blobLiteralType: SdkScalaLiteralType[Blob] =
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried adding a new annotation for scala but it didn't work well because we will need to recreate a param, which is not easy. For the record, it is something like:

  private def updateParamIfNeeded[T](param: Param[Typeclass, T]) = {
    param.annotations
      .collectFirst { case ann: BlobTypeDescription =>
        SdkLiteralTypes
          .blobs(
            BlobType
              .builder()
              .format(ann.format)
              .dimensionality(ann.dimensionality)
              .build()
          )
          .asInstanceOf[SdkLiteralType[Any]]
      }
      .map(sdkBindingLiteralType =>
        // fixme: this apply function is not designed to be called from user code
        Param(
          param.label,
          param.typeName,
          param.index,
          param.repeated,
          CallByNeed(sdkBindingLiteralType),
          CallByNeed(param.default),
          param.annotationsArray,
          param.typeAnnotationsArray
        )
      )
      .getOrElse(param)
  }

  def combine[T](ctx: CaseClass[SdkScalaType, T]): SdkScalaProductType[T] = {
    // throwing an exception will abort implicit resolution for this case
    // very dirty down casting, but we need some evidence that all parameters are TypedLiterals
    // and that's the best we can do with magnolia unless we want to play with phantom types
    case class ParamsWithDesc(
        param: Param[SdkScalaLiteralType, T],
        desc: Option[String]
    )
    val params = ctx.parameters.map { param =>
      param.typeclass match {
        case _: SdkScalaProductType[_] =>
          sys.error("nested structs aren't supported")
        case _: SdkScalaLiteralType[_] =>
          val optDesc = param.annotations.collectFirst {
            case ann: Description => ann.value
          }

          val updatedParam = updateParamIfNeeded(param)
          ParamsWithDesc(
            updatedParam.asInstanceOf[Param[SdkScalaLiteralType, T]],
            optDesc
          )
      }
    }

First of all, the apply is not supposed to be used; second, I got weird compilation problem that I didn't have time (and it didn't seem worth it either) to dig deeply.

return SdkLiteralTypes.maps(toLiteralType(valueType, false, propName, declaringClassName));
return SdkLiteralTypes.maps(toLiteralType(valueType, false, propName, member));
} else if (Blob.class.isAssignableFrom(type)) {
BlobTypeDescription annotation = member.getAnnotation(BlobTypeDescription.class);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only reason we need to get hold of the whole member object.

@@ -80,7 +83,7 @@ private SdkBindingData<?> transform(JsonNode tree) {
}
}

private static SdkBindingData<? extends Serializable> transformScalar(JsonNode tree) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessarily strict and blob is not serializable.

@@ -51,6 +51,11 @@
<artifactId>auto-service-annotations</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed because we now reference Blob in flytekit-java.

@@ -172,7 +172,6 @@ object SdkBindingDataConverters {
jf.Function.identity()
)
}
case LiteralType.Kind.BLOB_TYPE => ??? // TODO not yet supported
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no thing to covert for Blob because it is the same class for scala and java.

default:
throw new UnsupportedOperationException(
"Type contains an unsupported scalar: " + scalarKind);
}
}

private static SdkBindingData<Blob> transformBlob(JsonNode tree) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main change to get blob back.

@honnix honnix marked this pull request as ready for review October 9, 2023 09:23
@honnix honnix force-pushed the blob branch 4 times, most recently from 6ea206b to 8284ee3 Compare October 9, 2023 21:09
@@ -48,8 +51,8 @@ public abstract static class AutoAllInputsInput {

public abstract SdkBindingData<Duration> d();

// TODO add blobs to sdkbinding data
// public abstract SdkBindingData<Blob> blob();
@BlobTypeDescription(format = "csv", dimensionality = BlobDimensionality.MULTIPART)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure whether we should add this annotation because Blob type is not meant to be used by users directly. We could decide to restore to exactly what it was in 0.3.

Copy link
Member Author

@honnix honnix Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this in a later commit. My reasoning is to lower the surface area until we figured out whether we want to support https://docs.flyte.org/projects/flytekit/en/latest/generated/flytekit.BlobType.html#flytekit-blobtype property.

@honnix
Copy link
Member Author

honnix commented Oct 17, 2023

Note that #262 is also targeting blob branch.

@honnix
Copy link
Member Author

honnix commented Oct 17, 2023

I don't know what DCO is complaining about. Every commit is signed I believe. Fixed it but I still don't know why :D.

@honnix
Copy link
Member Author

honnix commented Oct 17, 2023

Note that #262 is also targeting blob branch.

@andresgomezfrr Maybe let's just use this PR to ship everything including #262 ?

@andresgomezfrr
Copy link
Collaborator

Note that #262 is also targeting blob branch.

@andresgomezfrr Maybe let's just use this PR to ship everything including #262 ?

ok! I just reviewed this one, but before merge I'll review the scala one too

@andresgomezfrr
Copy link
Collaborator

@honnix we need to fix the DCO to merge this one

@honnix
Copy link
Member Author

honnix commented Oct 17, 2023

@honnix we need to fix the DCO to merge this one

Yeah. On it.

Signed-off-by: Hongxin Liang <[email protected]>
Signed-off-by: Hongxin Liang <[email protected]>
Signed-off-by: Hongxin Liang <[email protected]>
Signed-off-by: Hongxin Liang <[email protected]>
Signed-off-by: Hongxin Liang <[email protected]>
Signed-off-by: Hongxin Liang <[email protected]>
@honnix honnix changed the title Bring back Blob support Blob and struct support Oct 17, 2023
@honnix
Copy link
Member Author

honnix commented Oct 17, 2023

Let's :shipit: ? @andresgomezfrr

@andresgomezfrr andresgomezfrr merged commit 6a26d59 into master Oct 17, 2023
3 checks passed
@andresgomezfrr andresgomezfrr deleted the blob branch October 17, 2023 13:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants