[ALLUXIO-2226] Add a LocalFirstPolicy that without evict action #4445

Merged
merged 19 commits into from
Jan 25, 2017

Conversation

gjhkael
Contributor

@gjhkael gjhkael commented Dec 15, 2016

@gjhkael
Contributor Author

gjhkael commented Dec 15, 2016

alluxio-bot, check this please

@alluxio-bot
Contributor

Automated checks report:

  • Valid pull request title: PASS
  • Contains link to JIRA ticket: PASS
  • Commits associated with Github account: PASS

All checks passed!

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12598/
Test PASSed.

@alluxio-bot
Contributor

Automated checks report:

  • Valid pull request title: PASS
  • Contains link to JIRA ticket: PASS
  • Commits associated with Github account: PASS

All checks passed!

Contributor

@aaudiber aaudiber left a comment

@gjhkael Thanks for adding this new policy with good test cases! I've left a few comments on the code. Can you explain more about the purpose of the USER_FILE_WRITE_CAPACITY_RESERVED_RATIO? If possible I'd like to keep this PR simple and consider that change separately from the new policy.

LocalFirstWithoutEvictionPolicy policy = new LocalFirstWithoutEvictionPolicy();
List<BlockWorkerInfo> workerInfoList = new ArrayList<>();
workerInfoList.add(new BlockWorkerInfo(new WorkerNetAddress().setHost("worker1")
.setRpcPort(PORT).setDataPort(PORT).setWebPort(PORT), Constants.GB, 0));
Contributor

We don't need to set these ports; they default to zero, which is fine for this test.

LocalFirstWithoutEvictionPolicy policy = new LocalFirstWithoutEvictionPolicy();
List<BlockWorkerInfo> workerInfoList = new ArrayList<>();
workerInfoList.add(new BlockWorkerInfo(new WorkerNetAddress().setHost("worker1")
.setRpcPort(PORT).setDataPort(PORT).setWebPort(PORT), Constants.GB, 0));
Contributor

It looks like worker1 has 1GB available space, so it's confusing that it's not allowed to allocate a 1GB block.

Contributor Author

Because the policy is local-first: even if worker1 has the available space, the local worker is chosen first.

Contributor

Sorry, I meant to write this on line 73.

@@ -30,6 +30,9 @@ alluxio.user.file.worker.client.threads:
How many threads to use for file worker clients to read from workers.
alluxio.user.file.write.location.policy.class:
The default location policy for choosing workers for writing a file's blocks
alluxio.user.file.write.capacity.reserved.ratio:
Contributor

Why is this property useful? We don't use anything like this for the regular LocalFirstPolicy so it seems strange to use it here. Is it possible to break adding this property into a separate PR so that it doesn't block merging the new location policy?

Contributor Author

The property was discussed in pull request #4423. Because my git commit history was corrupted, I opened this new pull request.

Contributor

Thanks for the link, this makes more sense now. Do you think it makes more sense for the configuration to specify a certain number of MB to reserve instead of a ratio, and have it default to 0?

Contributor Author

Yes, I think it is more suitable to use a configuration to specify the number of MB to reserve.
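
To make the trade-off in this thread concrete, here is a minimal sketch comparing a ratio-based reservation with a fixed-size one. Every name in it (`ReservationStyleSketch`, `availableWithRatio`, `availableWithReservedBytes`) is hypothetical and not part of Alluxio; only the idea of a ratio versus a fixed amount that defaults to 0 comes from the discussion.

```java
/** Hypothetical illustration of the two reservation styles; not Alluxio code. */
final class ReservationStyleSketch {
  static final long MB = 1024L * 1024L;
  static final long GB = 1024L * MB;

  /** Ratio style: the reserved amount scales with each worker's capacity. */
  static long availableWithRatio(long capacityBytes, long usedBytes, double reservedRatio) {
    long reserved = (long) (capacityBytes * reservedRatio);
    return capacityBytes - usedBytes - reserved;
  }

  /** Size style: the same fixed amount is held back on every worker; 0 disables it. */
  static long availableWithReservedBytes(long capacityBytes, long usedBytes, long reservedBytes) {
    return capacityBytes - usedBytes - reservedBytes;
  }

  public static void main(String[] args) {
    long[] capacities = {1 * GB, 100 * GB};
    for (long capacity : capacities) {
      // With a ratio, a larger worker gives up proportionally more space.
      long byRatio = availableWithRatio(capacity, 0L, 0.5);
      // With a fixed size, every worker gives up the same 512MB.
      long bySize = availableWithReservedBytes(capacity, 0L, 512 * MB);
      System.out.println(capacity / MB + " MB capacity: ratio leaves " + byRatio / MB
          + " MB, fixed size leaves " + bySize / MB + " MB");
    }
  }
}
```

A fixed size with a default of 0 keeps the existing behaviour unless a user opts in, which may be why it was preferred here.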

Contributor Author

Thanks for your review. I think this feature is in demand by many people, so it is of great value to users.

* If No worker meets the demands, return local host.
*/
@ThreadSafe
public class LocalFirstWithoutEvictionPolicy implements FileWriteLocationPolicy {
Contributor

This policy could still cause eviction when all workers are out of space, so it might be better to name it LocalFirstAvoidEvictionPolicy

Contributor Author

Seems quite reasonable.

workerInfoList.add(new BlockWorkerInfo(new WorkerNetAddress().setHost(localhostName)
.setRpcPort(PORT).setDataPort(PORT).setWebPort(PORT), Constants.GB, Constants.MB));
Assert.assertEquals(localhostName,
policy.getWorkerForNextBlock(workerInfoList, Constants.GB).getHost());
Contributor

If the block needs 1GB, the local worker has (1GB - 1MB) available, and the remote worker has 1GB available, should the block be written to the remote worker?

Contributor Author

@gjhkael gjhkael Dec 16, 2016

getAvailableBytes() does not return the actual available bytes of the worker, because the used bytes are updated only after a file commits. So I need to reserve some space when computing the available bytes. But, as you say, with this policy it can happen that worker1 actually has 1GB available, yet the client does not choose it to store 1GB of data.
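
The staleness described here can be illustrated with a small sketch. It assumes that a worker's reported used bytes count only committed files, as stated above. The class and method names are hypothetical, and the `>=` comparison is an assumption rather than the check used in the PR.

```java
/** Hypothetical illustration of stale usage reports; not Alluxio code. */
final class StaleUsageSketch {
  static final long MB = 1024L * 1024L;
  static final long GB = 1024L * MB;

  /** What the client sees: used bytes only count committed files. */
  static long reportedAvailable(long capacityBytes, long committedUsedBytes) {
    return capacityBytes - committedUsedBytes;
  }

  /** What is really free once uncommitted, in-flight writes are counted too. */
  static long actualAvailable(long capacityBytes, long committedUsedBytes, long inFlightBytes) {
    return capacityBytes - committedUsedBytes - inFlightBytes;
  }

  /** Conservative check: hold back reservedBytes as headroom (the >= is an assumption). */
  static boolean fitsWithReserve(long capacityBytes, long committedUsedBytes,
      long reservedBytes, long blockSizeBytes) {
    return reportedAvailable(capacityBytes, committedUsedBytes) - reservedBytes >= blockSizeBytes;
  }

  public static void main(String[] args) {
    // A 1GB worker with nothing committed but 512MB of writes still in flight.
    long reported = reportedAvailable(GB, 0L);
    long actual = actualAvailable(GB, 0L, 512 * MB);
    System.out.println("reported=" + reported / MB + " MB, actual=" + actual / MB + " MB");
    // Without a reserve a 768MB block looks like it fits; with a 512MB reserve it does not.
    System.out.println(fitsWithReserve(GB, 0L, 0L, 768 * MB));
    System.out.println(fitsWithReserve(GB, 0L, 512 * MB, 768 * MB));
    // The downside raised in review: an idle 1GB worker is rejected for a 1GB block.
    System.out.println(fitsWithReserve(GB, 0L, MB, GB));
  }
}
```

The last call shows the cost of being conservative: any non-zero reserve makes a worker with exactly the block size free look too small.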

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12634/
Test FAILed.

@gjhkael
Contributor Author

gjhkael commented Dec 19, 2016

@aaudiber can you review it?

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12637/
Test PASSed.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12638/
Test PASSed.

Contributor

@aaudiber aaudiber left a comment

Left a few more small comments, looks good overall

public long getAvailableBytes() {
mUserFileWriteCapacityReserved = Configuration
.getBytes(PropertyKey.USER_FILE_WRITE_CAPACITY_RESERVED_SIZE_BYTES);
return mCapacityBytes - mUsedBytes - mUserFileWriteCapacityReserved;
Contributor

The space reservation logic is specific to LocalFirstAvoidEvictionPolicy, so we should put the logic there instead of modifying BlockWorkerInfo

/**
* A policy that returns local host first, and if the local worker doesn't have enough availability,
* it randomly picks a worker from the active workers list for each block write.
* If No worker meets the demands, return local host.
Contributor

Can you also specify the behavior of USER_FILE_WRITE_CAPACITY_RESERVED_SIZE_BYTES?

@@ -252,6 +252,8 @@
Name.USER_FILE_WORKER_CLIENT_POOL_GC_THRESHOLD_MS, 300 * Constants.SECOND_MS),
USER_FILE_WRITE_LOCATION_POLICY(Name.USER_FILE_WRITE_LOCATION_POLICY,
"alluxio.client.file.policy.LocalFirstPolicy"),
USER_FILE_WRITE_CAPACITY_RESERVED_SIZE_BYTES(
Contributor

This property only modifies the new policy, so what do you think of naming it
USER_FILE_WRITE_AVOID_EVICTION_POLICY_RESERVED_BYTES?

@@ -30,6 +30,9 @@ alluxio.user.file.worker.client.threads:
How many threads to use for file worker clients to read from workers.
alluxio.user.file.write.location.policy.class:
The default location policy for choosing workers for writing a file's blocks
alluxio.user.file.write.capacity.reserved.size.bytes:
The portion of space reserced in worker when user use the LocalFirstAvoidEvictionPolicy class
Contributor

reserced -> reserved

Contributor Author

Sorry about this small mistake.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12641/
Test PASSed.

@gjhkael
Contributor Author

gjhkael commented Dec 19, 2016

alluxio-bot, check this please

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12642/
Test PASSed.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12643/
Test PASSed.

Contributor

@aaudiber aaudiber left a comment

LGTM, thanks for adding this much-requested policy!

Contributor

@yupeng9 yupeng9 left a comment

Thanks for adding this useful policy. Left some minor comments

* to store the block, for the values mCapacityBytes minus mUsedBytes is not the available bytes.
*/
@ThreadSafe
public class LocalFirstAvoidEvictionPolicy implements FileWriteLocationPolicy {
Contributor

final

Contributor Author

Thanks for your reminder.

* A policy that returns local host first, and if the local worker doesn't have enough availability,
* it randomly picks a worker from the active workers list for each block write.
* If No worker meets the demands, return local host.
* USER_FILE_WRITE_AVOID_EVICTION_POLICY_RESERVED_BYTES is use to reserved some space of the worker
Contributor

is used to reserve

Contributor Author

Fixed

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12674/
Test PASSed.

Contributor

@gpang gpang left a comment

@gjhkael Thanks for this feature! I left a few comments.

return workerInfo.getNetAddress();
}
}
return localWorkerNetAddress;
Contributor

What if there was no worker on the local host? That means localWorkerNetAddress would have been null. Should null be returned, or should a random worker still be picked?

Contributor Author

@gpang After long deliberation: if there is no worker on the local host and no worker has available capacity to store the block, it should pick a random worker. When there is no worker in the cluster, it should return null. Thanks for the reminder, it makes sense.
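
One reading of the selection order agreed in this thread can be sketched as follows. This is not the merged Alluxio class: `LocalFirstAvoidEvictionSketch`, its nested `Worker` type, and `pickHost` are hypothetical stand-ins, and the `>=` capacity check is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the selection order; not the merged Alluxio class. */
final class LocalFirstAvoidEvictionSketch {
  /** Stand-in for a worker's reported state (hypothetical, not BlockWorkerInfo). */
  static final class Worker {
    final String host;
    final long capacityBytes;
    final long usedBytes;

    Worker(String host, long capacityBytes, long usedBytes) {
      this.host = host;
      this.capacityBytes = capacityBytes;
      this.usedBytes = usedBytes;
    }
  }

  private final String localHost;
  private final long reservedBytes;

  LocalFirstAvoidEvictionSketch(String localHost, long reservedBytes) {
    this.localHost = localHost;
    this.reservedBytes = reservedBytes;
  }

  private boolean hasRoom(Worker worker, long blockSizeBytes) {
    return worker.capacityBytes - worker.usedBytes - reservedBytes >= blockSizeBytes;
  }

  /** Returns the chosen host, or null when the cluster has no workers. */
  String pickHost(List<Worker> workers, long blockSizeBytes) {
    if (workers.isEmpty()) {
      return null;
    }
    Worker local = null;
    for (Worker worker : workers) {
      if (worker.host.equals(localHost)) {
        local = worker;
        break;
      }
    }
    // 1. The local worker wins whenever it has room.
    if (local != null && hasRoom(local, blockSizeBytes)) {
      return local.host;
    }
    // 2. Otherwise take a random worker that has room.
    List<Worker> shuffled = new ArrayList<>(workers);
    Collections.shuffle(shuffled);
    for (Worker worker : shuffled) {
      if (hasRoom(worker, blockSizeBytes)) {
        return worker.host;
      }
    }
    // 3. Nobody has room: fall back to local, or to a random worker if there is no local one.
    return local != null ? local.host : shuffled.get(0).host;
  }

  public static void main(String[] args) {
    long gb = 1024L * 1024L * 1024L;
    LocalFirstAvoidEvictionSketch policy = new LocalFirstAvoidEvictionSketch("local", 0L);
    List<Worker> workers = Arrays.asList(
        new Worker("local", gb, gb),     // local worker is full
        new Worker("worker1", gb, 0L));  // remote worker is empty
    System.out.println(policy.pickHost(workers, gb));
  }
}
```

Step 3 is where eviction can still happen, which matches the earlier point that the policy avoids eviction rather than ruling it out.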


/**
* @param workerInfo BlockWorkerInfo of the worker
* @return the available bytes of the worker
Contributor

Please comment how USER_FILE_WRITE_AVOID_EVICTION_POLICY_RESERVED_BYTES is used to compute the available bytes of the worker

Contributor Author

My original idea was to use mCapacityBytes minus mUsedBytes from class BlockWorkerInfo to get the available bytes of the worker. But yupeng9 told me that this does not give the actual available bytes, because the information in BlockWorkerInfo is updated only after a file is completely written. So he suggested that I reserve some space to store the block.

Contributor Author

@gjhkael gjhkael Dec 22, 2016

I will add the description above as a comment on the method. Thanks for your advice.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12744/
Test PASSed.

@gjhkael
Contributor Author

gjhkael commented Dec 29, 2016

How do I resolve the conflicts?

@AmplabJenkins

Build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12829/

Failed Tests: 1

org.alluxio:alluxio-core-client: 1


Test FAILed.

@AmplabJenkins

Build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/12830/
Test PASSed.

@calvinjia
Contributor

@gjhkael You can run git merge master which will automatically merge the non-conflicting changes. You will then need to manually select which changes you want to keep (or if you want to keep parts of both) for the conflicting portions. Here is the github documentation: https://help.github.com/articles/resolving-a-merge-conflict-using-the-command-line/

@AmplabJenkins

Build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/13036/
Test PASSed.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Alluxio-Pull-Request-Builder/13037/
Test PASSed.

@gjhkael
Contributor Author

gjhkael commented Jan 10, 2017

@calvinjia thanks for your help.

Contributor

@gpang gpang left a comment

@gjhkael Thanks for this feature!

LGTM

Contributor

@aaudiber aaudiber left a comment

LGTM

@aaudiber aaudiber merged commit 2046ad8 into Alluxio:master Jan 25, 2017
7 participants