Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-5466. Refactor BlockOutputStream. #2442

Merged
merged 2 commits into from
Jul 28, 2021
Merged

Conversation

szetszwo
Copy link
Contributor

What changes were proposed in this pull request?

We propose to make BlockOutputStream abstract and move out the Ratis specific code out. Then, other implementations such as ECBlockOutputStream can extend it and reuse the common parts.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5466

How was this patch tested?

Modified existing tests.

Copy link
Contributor

@umamaheswararao umamaheswararao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@szetszwo Thanks a lot for refactoring BOS. I have added few comments, please take a look. Thanks

* @throws IOException if there is an I/O error while performing the call
* @throws OzoneChecksumException if there is an error while computing
* checksum
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we have Ratis specific logic here? What do you think if we move this to BlockOutputStream?
Actual write is depends on xceiverClient anyway ( whether it's Ratis or Standalone)

Even for EC we tried to use the same writeChunkToContainer API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This method should also be moved back to BlockOutputStream.

RatisBlockOutputStream.class);

private XceiverClientFactory xceiverClientFactory;
private XceiverClientSpi xceiverClient;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have exceiverClientRatis and exceiverClientGRPC.
So, currently both can set into RatisBlockOutputStream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Let me move them back to BlockOutputStream.

@@ -503,9 +453,16 @@ private void writeChunk(ChunkBuffer buffer)
bufferList = new ArrayList<>();
}
bufferList.add(buffer);
writeChunkToContainer(buffer.duplicate(0, buffer.position()));
final ChunkBuffer duplicated = buffer.duplicate(0, buffer.position());
final ChunkInfo chunk = writeChunkToContainer(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we override write API, the extended BOS need to complement their own way of writing, but they need a way to add the chunkInfo to containerBlockData as it's used in executePutBlock. Should we expose abstract addChunks method to update the chunkinfo into containerBlockData object?

Ex: for ECBOS, we decided to skip the buffered writing. So, overridden the write API and used writeChunkToContainer API to write directly to container. Here we also overridden putBlock call and just added getContainerBlock data API to get the structure.
Please check if there is better idea though.


/**
* Writes buffered data as a new chunk to the container and saves chunk
* information to be used later in putKey call.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think second point is about adding chunkinfo to containerBlockData. Since moved that part to invoking method, we may need to update this comment or moved the addchunkInfo here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since xceiverClient is moved back to BlockOutputStream, this method should also be moved back.

@umamaheswararao
Copy link
Contributor

Thank you @szetszwo for the updates.
The latest patch looks good to me. However I have one point to discuss: Most of the abstract APIs are related to commitWatcher. When we have other implementations like standalone or EC, they don't need most of them. is it a good idea to have them empty implementation in BOS, so that it will not force other classes to implement?

@szetszwo
Copy link
Contributor Author

That is a great idea. Let me update the change.

@kaijchen
Copy link
Contributor

Hi, a few questions here.

The BlockDataStreamOutput in HDDS-4454 is not going to extend BlockOutputStream right?
Because for avoid buffer copy, it won't extend java.io.OutputStream.

When seperating XceiverClientRatis and XceiverClientGrpc and removing XceiverClientSpi.
Will the RatisBlockOutputStream be exclusive to XceiverClientRatis?
Will there be something like GrpcBlockOutputStream for XceiverClientGrpc?

@szetszwo
Copy link
Contributor Author

The BlockDataStreamOutput in HDDS-4454 is not going to extend BlockOutputStream right?
Because for avoid buffer copy, it won't extend java.io.OutputStream.

Yes, you are correct.

When seperating XceiverClientRatis and XceiverClientGrpc and removing XceiverClientSpi.
Will the RatisBlockOutputStream be exclusive to XceiverClientRatis?
Will there be something like GrpcBlockOutputStream for XceiverClientGrpc?

In this case, there should be a GrpcBlockOutputStream. Otherwise, grpc won't be able to write anything.

@szetszwo szetszwo merged commit 59d6e95 into apache:master Jul 28, 2021
@umamaheswararao
Copy link
Contributor

Thank you @szetszwo for the patch and committing it.

errose28 added a commit to errose28/ozone that referenced this pull request Jul 30, 2021
* master: (48 commits)
  HDDS-5514. Skip check for UNHEALTHY containers for datanode finalize. (apache#2469)
  HDDS-5279. OFS mkdir -p does not work when Volume is not pre-created (apache#2412)
  HDDS-5328. Remove delete container command from admin CLI (apache#2456)
  HDDS-5382. Increase default container report interval to 60 mins (apache#2363)
  HDDS-5378 Add APIs to retrieve Namespace Summary from Recon (apache#2417)
  HDDS-5466. Refactor BlockOutputStream. (apache#2442)
  HDDS-5465. Delete redundant code when set、add and remove bucket acl (apache#2439)
  HDDS-5184. Use separate DB profile for Datanodes. (apache#2214)
  HDDS-5494. Reduce retry in Kubernetes test (apache#2461)
  HDDS-5414. Data buffers incorrectly filtered for Ozone Insight (apache#2387)
  HDDS-5450. Avoid refresh pipeline for S3 headObject (apache#2431)
  HDDS-5500. New k3s version breaks kubernetes test (apache#2464)
  HDDS-5489. Install OS-specific flekszible (apache#2462)
  Multi-raft style placement with permutations for offline data generator (apache#2434)
  HDDS-5484. Intermittent failure in TestReplicationManager#testMovePrerequisites (apache#2454)
  HDDS-5443 Create and then recreate a bucket with a randomized name (apache#2436)
  HDDS-5492. Disable failing kubernetes test (apache#2459)
  HDDS-4330. Bootstrap new OM node (apache#1494)
  HDDS-5418. Let Recon send reregisterCommand to Datanodes if DatanodeDetails changed (apache#2392)
  HDDS-5479. s3g bucket list failed when there is non-english key name. (apache#2450)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants