-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ALLUXIO-2226] Add a LocalFirstPolicy that without evict action #4445
Conversation
alluxio-bot, check this please |
Automated checks report:
All checks passed! |
Merged build finished. Test PASSed. |
Test PASSed. |
Automated checks report:
All checks passed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gjhkael Thanks for adding this new policy with good test cases! I've left a few comments on the code. Can you explain more about the purpose of the USER_FILE_WRITE_CAPACITY_RESERVED_RATIO? If possible I'd like to keep this PR simple and consider that change separately from the new policy.
LocalFirstWithoutEvictionPolicy policy = new LocalFirstWithoutEvictionPolicy(); | ||
List<BlockWorkerInfo> workerInfoList = new ArrayList<>(); | ||
workerInfoList.add(new BlockWorkerInfo(new WorkerNetAddress().setHost("worker1") | ||
.setRpcPort(PORT).setDataPort(PORT).setWebPort(PORT), Constants.GB, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to set these ports, they will be zero by default which is fine for this test
LocalFirstWithoutEvictionPolicy policy = new LocalFirstWithoutEvictionPolicy(); | ||
List<BlockWorkerInfo> workerInfoList = new ArrayList<>(); | ||
workerInfoList.add(new BlockWorkerInfo(new WorkerNetAddress().setHost("worker1") | ||
.setRpcPort(PORT).setDataPort(PORT).setWebPort(PORT), Constants.GB, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like worker1 has 1GB available space, so it's confusing that it's not allowed to allocate a 1GB block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the policy is local first, even if worker1 has the available space but local will be choice first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, meant to write this on line 73
@@ -30,6 +30,9 @@ alluxio.user.file.worker.client.threads: | |||
How many threads to use for file worker clients to read from workers. | |||
alluxio.user.file.write.location.policy.class: | |||
The default location policy for choosing workers for writing a file's blocks | |||
alluxio.user.file.write.capacity.reserved.ratio: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this property useful? We don't use anything like this for the regular LocalFirstPolicy
so it seems strange to use it here. Is it possible to break adding this property into a separate PR so that it doesn't block merging the new location policy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The property is discussed in this pull request(#4423), Because my git commit history is corrupted, So, I open the new pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link, this makes more sense now. Instead of using a ratio, do you think it makes more sense for the configuration to specify a certain number of MB to reserve instead of a ratio, and have it default to 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think it is more suitable to use a configuration to specify the number of MB to reserve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for you review, I think this feature is demand by many people. So, I think it is of great value to users.
* If No worker meets the demands, return local host. | ||
*/ | ||
@ThreadSafe | ||
public class LocalFirstWithoutEvictionPolicy implements FileWriteLocationPolicy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This policy could still cause eviction when all workers are out of space, so it might be better to name it LocalFirstAvoidEvictionPolicy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems quite reasonable.
workerInfoList.add(new BlockWorkerInfo(new WorkerNetAddress().setHost(localhostName) | ||
.setRpcPort(PORT).setDataPort(PORT).setWebPort(PORT), Constants.GB, Constants.MB)); | ||
Assert.assertEquals(localhostName, | ||
policy.getWorkerForNextBlock(workerInfoList, Constants.GB).getHost()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the block needs 1GB , the local worker has (1GB - 1MB) available, and the remote worker has 1GB available, should the block be written to the remote worker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because getAvailableBytes() is not return the actual available bytes of the worker, for the used bytes are updated only after a file commits. So, I need to reserve some space to get the available bytes. But, as you say, this policy would happened that the worker1 actually has 1GB, but the client do not get it to store 1GB data.
Merged build finished. Test FAILed. |
Test FAILed. |
@aaudiber can you review it? |
Merged build finished. Test PASSed. |
Test PASSed. |
Merged build finished. Test PASSed. |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few more small comments, looks good overall
public long getAvailableBytes() { | ||
mUserFileWriteCapacityReserved = Configuration | ||
.getBytes(PropertyKey.USER_FILE_WRITE_CAPACITY_RESERVED_SIZE_BYTES); | ||
return mCapacityBytes - mUsedBytes - mUserFileWriteCapacityReserved; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The space reservation logic is specific to LocalFirstAvoidEvictionPolicy
, so we should put the logic there instead of modifying BlockWorkerInfo
/** | ||
* A policy that returns local host first, and if the local worker doesn't have enough availability, | ||
* it randomly picks a worker from the active workers list for each block write. | ||
* If No worker meets the demands, return local host. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also specify the behavior of USER_FILE_WRITE_CAPACITY_RESERVED_SIZE_BYTES
?
@@ -252,6 +252,8 @@ | |||
Name.USER_FILE_WORKER_CLIENT_POOL_GC_THRESHOLD_MS, 300 * Constants.SECOND_MS), | |||
USER_FILE_WRITE_LOCATION_POLICY(Name.USER_FILE_WRITE_LOCATION_POLICY, | |||
"alluxio.client.file.policy.LocalFirstPolicy"), | |||
USER_FILE_WRITE_CAPACITY_RESERVED_SIZE_BYTES( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This property only modifies the new policy, so what do you think of naming it
USER_FILE_WRITE_AVOID_EVICTION_POLICY_RESERVED_BYTES
?
@@ -30,6 +30,9 @@ alluxio.user.file.worker.client.threads: | |||
How many threads to use for file worker clients to read from workers. | |||
alluxio.user.file.write.location.policy.class: | |||
The default location policy for choosing workers for writing a file's blocks | |||
alluxio.user.file.write.capacity.reserved.size.bytes: | |||
The portion of space reserced in worker when user use the LocalFirstAvoidEvictionPolicy class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reserced -> reserved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about this small mistake.
Merged build finished. Test PASSed. |
Test PASSed. |
alluxio-bot, check this please |
Test PASSed. |
Merged build finished. Test PASSed. |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for adding this much-requested policy!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this useful policy. Left some minor comments
* to store the block, for the values mCapacityBytes minus mUsedBytes is not the available bytes. | ||
*/ | ||
@ThreadSafe | ||
public class LocalFirstAvoidEvictionPolicy implements FileWriteLocationPolicy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your reminder.
* A policy that returns local host first, and if the local worker doesn't have enough availability, | ||
* it randomly picks a worker from the active workers list for each block write. | ||
* If No worker meets the demands, return local host. | ||
* USER_FILE_WRITE_AVOID_EVICTION_POLICY_RESERVED_BYTES is use to reserved some space of the worker |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is used to reserve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Merged build finished. Test PASSed. |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gjhkael Thanks for this feature! I left a few comments.
return workerInfo.getNetAddress(); | ||
} | ||
} | ||
return localWorkerNetAddress; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there was no worker on the local host? That means localWorkerNetAddress
would have been null. Should null
be returned, or should a random worker still be picked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gpang After long deliberation, if no worker on the local host and have on worker have available capacity to store the block, it should pick a random worker. When no worker in this cluster, it should return null. Thanks for you remind, it make sense.
|
||
/** | ||
* @param workerInfo BlockWorkerInfo of the worker | ||
* @return the available bytes of the worker |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment how USER_FILE_WRITE_AVOID_EVICTION_POLICY_RESERVED_BYTES
is used to compute the available bytes of the worker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original idea is use the variable mCapacityBytes minus mUsedBytes of class BlockWorkerInfo to get the valid Bytes of the worker. But yupeng9 tell me that this way does not actually get the available Bytes for the information of class BlockWorkerInfo is update only after a file is completely write. So, he suggest me to reserve some space to store the block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will comment above describe to the method. Thanks for you advise.
Merged build finished. Test PASSed. |
Test PASSed. |
How to resolve conflicts? |
Build finished. Test FAILed. |
Test FAILed. Failed Tests: 1org.alluxio:alluxio-core-client: 1Test FAILed. |
Build finished. Test PASSed. |
Test PASSed. |
@gjhkael You can run git merge master which will automatically merge the non-conflicting changes. You will then need to manually select which changes you want to keep (or if you want to keep parts of both) for the conflicting portions. Here is the github documentation: https://help.github.com/articles/resolving-a-merge-conflict-using-the-command-line/ |
Build finished. Test PASSed. |
Test PASSed. |
Merged build finished. Test PASSed. |
Test PASSed. |
@calvinjia thanks for you help. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gjhkael Thanks for this feature!
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
https://alluxio.atlassian.net/browse/ALLUXIO-2226