"Keep getting: open `': Change reported by S3 during open at position 0. ETag was unavailable" when reading from S3 #242

Open
iwb-vhuysmans opened this issue Jan 18, 2023 · 0 comments

@iwb-vhuysmans

Hi,

I'm currently using the following Maven dependencies in my project:

        <dependency>
            <groupId>com.dimafeng</groupId>
            <artifactId>testcontainers-scala_2.12</artifactId>
            <version>0.40.12</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>com.dimafeng</groupId>
            <artifactId>testcontainers-scala-dynalite_2.12</artifactId>
            <version>0.40.12</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>com.dimafeng</groupId>
            <artifactId>testcontainers-scala-localstack_2.12</artifactId>
            <version>0.40.12</version>
            <scope>test</scope>
        </dependency>

I have some test code where I set up an AmazonS3 client using LocalStackContainer:

  override val container: LocalStackContainer = new LocalStackContainer(services = List(S3))
  implicit var client: AmazonS3 = _ // initialized in beforeAll, once the container is up
  var sparkCsvReader: SparkCsvReader = _

  override protected def beforeAll(): Unit = {
    container.start()
    client = AmazonS3ClientBuilder
      .standard()
      .withEndpointConfiguration(
        new AwsClientBuilder.EndpointConfiguration(
          container.container.getEndpointOverride(S3).toString,
          container.container.getRegion
        )
      )
      .withCredentials(
        new AWSStaticCredentialsProvider(
          new BasicAWSCredentials(container.container.getAccessKey, container.container.getSecretKey)
        )
      )
      .build()

    ss.sparkContext.hadoopConfiguration.set("fs.s3a.endpoint", container.container.getEndpointOverride(S3).toString)
    ss.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", container.container.getAccessKey)
    ss.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", container.container.getSecretKey)
    ss.sparkContext.hadoopConfiguration.set("fs.s3a.impl","org.apache.hadoop.fs.s3a.S3AFileSystem")
    ss.sparkContext.hadoopConfiguration.set("fs.s3.impl","org.apache.hadoop.fs.s3a.S3AFileSystem")

    sparkCsvReader = new SparkCsvReader() // Class I would like to test

  }
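For what it's worth, a LocalStack endpoint usually also needs path-style addressing, since virtual-hosted-style requests (bucket name as hostname) generally don't resolve against the container. A hedged addition to the setup above (assuming the same `ss` SparkSession; this is a sketch, not confirmed as the fix for this issue):

```scala
// Sketch: force path-style access so requests go to
// http://<endpoint>/<bucket>/... instead of http://<bucket>.<endpoint>/...
ss.sparkContext.hadoopConfiguration.set("fs.s3a.path.style.access", "true")
```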

When I use the client to create a bucket and upload two files, everything works fine. But when I try to read them back from the S3 bucket, I keep getting the following error:

open `s3a://bucket1/file1.csv': Change reported by S3 during open at position 0. ETag 079a45cc9a4cda24698dddf8f6263cdd was unavailable
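This error message comes from the S3A change-detection check: the connector records the object's ETag on open and aborts when the store can't return a matching one on subsequent reads, which some S3 emulators don't. A possible workaround, hedged (assuming the standard Hadoop 3.x `fs.s3a.change.detection.*` keys and that the emulator is the culprit), is to relax the check in the test setup:

```scala
// Sketch: disable S3A change detection for stores that don't return
// usable ETags. "none" turns the check off entirely for this filesystem.
ss.sparkContext.hadoopConfiguration.set("fs.s3a.change.detection.mode", "none")
// Alternatively, keep detection on but don't require an ETag/version:
// ss.sparkContext.hadoopConfiguration
//   .set("fs.s3a.change.detection.version.required", "false")
```

Whether this is appropriate depends on the test: against production S3 the check is a safety net and should stay on.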
