Installation #2

Open
achintya-kumar opened this issue May 5, 2017 · 13 comments

@achintya-kumar
Owner

No description provided.

@achintya-kumar achintya-kumar added this to the Labs milestone May 5, 2017
@achintya-kumar achintya-kumar self-assigned this May 5, 2017
@achintya-kumar
Owner Author

achintya-kumar commented May 10, 2017

Hi @HorizonNet and @tantalus1984 !
I have two questions about the first task. Kindly help me understand the problem better.

  1. It is common practice to disable SELinux, but that is not in the list of tasks. Shouldn't it be one of them?
  2. Linux reserves 5% of the blocks on an ext filesystem for the root user by default, which implies the remainder can be used by non-root users. Is that what you mean by reserve space for non-root volumes?

Thank you in advance.

@HorizonNet
Collaborator

@achintya-kumar Good questions.

  1. Best answer: it depends. Some Hadoop components do not work very well with SELinux. You can create policies, but they are hard to implement. Where does the "it depends" come into play? It depends on whether your cluster is on a public network or not. It also depends on your security policies. For a POC cluster, which is what you're going to build, it is in my opinion totally fine to have SELinux disabled.
    It is not in the list of System Configuration Checks because this list is not complete from an installation point of view (the list is not intended to be a step-by-step guide). It only helps us (and you) to see if there are possible problems you can run into when the cluster is running for a while.
  2. Not exactly. We want to see that you made sure to have enough disk space before starting the installation.

Hope this answers your questions.
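
As a rough illustration of the disk-space point in item 2, here is a minimal Python sketch that reports how much space a volume has left for non-root users and how much is held back for root. It relies on the standard `os.statvfs` call. The helper names `summarize`, `space_report` and `has_room` are made up for this example, and the figures are only an approximation of the ext reserve (the difference between `f_bfree` and `f_bavail`), not the command the task asks for.

```python
import os


def summarize(frsize, blocks, bfree, bavail):
    """Convert raw statvfs block counts into byte figures."""
    return {
        "total": blocks * frsize,                    # size of the filesystem
        "free_for_users": bavail * frsize,           # usable by unprivileged users
        "reserved_for_root": (bfree - bavail) * frsize,  # free, but only for root
    }


def space_report(path="/"):
    """Return the byte figures for the filesystem that holds `path`."""
    st = os.statvfs(path)
    return summarize(st.f_frsize, st.f_blocks, st.f_bfree, st.f_bavail)


def has_room(path, required_bytes):
    """True if unprivileged users can still write `required_bytes` under `path`."""
    return space_report(path)["free_for_users"] >= required_bytes
```

Calling `has_room("/opt", 10 * 1024**3)` before the installation would, under these assumptions, tell you whether 10 GiB are still available on the volume holding `/opt` (the path is only an example).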

@achintya-kumar
Owner Author

Thank you so much for your reply. It is crystal clear now.
I have one additional question.

I am using the Azure Free Tier for this assignment, which allows me 4 cores per region. I currently have 2 machines with 2 cores and 16 GB of memory each. My initial plan was to go with the recommended cluster size, i.e. 5. However, because of the regional limit, I am limited to 2 machines for now.

I am allowed to create new VM instances in other regions. My question is: is there any way to have nodes in different regions and still build a cluster out of them without added complexity?

@HorizonNet
Collaborator

This should be possible, but a problem you can run into is the traffic between the regions. Normally traffic inside a datacenter is free, but in- and outbound traffic is not. I'm not sure whether this can be a problem in the free tier. Can you, by chance, increase the core limit via a service request to the Azure team? I did this to increase the core limit on my MSDN subscriptions, but I'm not sure if that is possible in the free tier.

@achintya-kumar
Owner Author

Thank you!

When I requested an increase, they asked me to upgrade to the 'Pay As You Go' tier. I suppose I should do that once I have everything working on my free-tier nodes.

@HorizonNet
Collaborator

You should stay in the free tier. Try to use different regions, but first review the in- and outbound traffic limitations. It could be that this isn't a problem at all.

@achintya-kumar
Owner Author

achintya-kumar commented May 16, 2017

Hi!
Here is a report of two things I have learnt so far.

  1. CM demands that we disable SELinux at the time of installation.
  2. I created some solid hosts (4 cores, 28 GB memory, 200 GB SSD), but in different regions (West Europe, North Europe and West US). As a result, the data has to be routed over the internet for connectivity. While it works, I believe a cluster inside a single datacentre would outperform the current setup severalfold.
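
For the SELinux point in item 1, a small Python sketch of how the configured mode could be read on each host before starting the installation. It assumes the usual `/etc/selinux/config` layout with a `SELINUX=` line; the function name `selinux_mode` is made up for this example. The file reflects the setting applied at boot, so the running mode may differ until the host is rebooted.

```python
def selinux_mode(config_text):
    """Return the SELINUX= value from /etc/selinux/config-style text, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        # Match the SELINUX key exactly, not SELINUXTYPE.
        if sep and key.strip() == "SELINUX":
            return value.strip().lower()
    return None
```

On a host, this could be fed with the contents of `/etc/selinux/config`, e.g. `selinux_mode(open("/etc/selinux/config").read())`, and compared against `"disabled"`.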

@HorizonNet
Collaborator

Below is a short review.

Tasks

  • System Configuration Checks
  • MySQL installation
  • CM/CDH installation
  • Using the CM API
  • Upgrade Cloudera Manager
  • Using a local parcel repository

Open points:

  • System Configuration Checks
    • The command for the mount attributes of all volumes is wrong. Hint: You have to check a specific file on disk for that.
    • The command for showing the reserve space of any non-root, ext-based volumes is already in your documentation and just needs some adjustment.
  • CM/CDH installation
    • You deployed CDH 5.8.4 instead of CDH 5.8.3. Seems weird because you seem to have the correct CM version according to your documentation.
    • There's no need to use the installer. The agents and the server are service-based.
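
To illustrate the mount-attributes point, here is a minimal Python sketch that extracts the options per mount point from text in fstab format (whitespace-separated device, mount point, filesystem type, options). Which file the hint refers to is not stated here; the sketch only assumes that the file uses this format, as `/etc/fstab` and `/proc/mounts` do. The function name `mount_options` is made up for this example.

```python
def mount_options(table_text):
    """Map each mount point to its list of mount options."""
    result = {}
    for line in table_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete entry
        _device, mount_point, _fs_type, options = fields[:4]
        result[mount_point] = options.split(",")
    return result
```

With the output in hand, checking for an attribute such as `noatime` on a data volume becomes a simple membership test on the returned list.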

General feedback:

You're definitely on the right track, but details are important. Have a look at the open points above. The amount of documentation you have written so far is pretty good and helps me understand where you went in the wrong direction.

@achintya-kumar
Owner Author

Thank you for taking the time to give me this detailed feedback. I shall rectify what's wrong here and get back to you.

Best Regards

@achintya-kumar
Owner Author

[Image: screenshot from 2017-06-07 23-29-30]

Hi!
I did this installation yesterday. When I reached the CDH installation phase, it did not let me pick the same version of CDH even though my CM is version 5.8.3, and it showed the note in the image above. This is why I have CM 5.8.3 and CDH 5.8.4 in my installation: it doesn't let one choose.

Thanks! :)

@HorizonNet
Collaborator

That shouldn't be possible. You cannot manage a CDH version newer than your CM version. I don't know if that's also true for patch versions, but it definitely is for minor and major versions.

Did you install CM via an installer or YUM?

@achintya-kumar
Owner Author

This was done using YUM.

@HorizonNet
Collaborator

I have gone through the documentation. The minor version of CM must be equal to the minor version of CDH. Nevertheless, you're working with the default parcel version of CM 5.8.3, which is the latest CDH 5.8.x version. Changing the default lets you install CDH 5.8.3. This link may be helpful.
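
The version rule described above can be sketched in a few lines of Python. This encodes the rule only as it is stated in this thread (major and minor versions must match, patch versions may differ) and has not been checked against the Cloudera documentation; the function names `parse_version` and `cdh_allowed` are made up for this example.

```python
def parse_version(text):
    """Turn a dotted version such as '5.8.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def cdh_allowed(cm_version, cdh_version):
    """True if CM and CDH agree on major and minor version."""
    cm = parse_version(cm_version)
    cdh = parse_version(cdh_version)
    # Compare only the first two components; the patch level is ignored.
    return cm[:2] == cdh[:2]
```

Under this rule, CM 5.8.3 with CDH 5.8.4 passes the check, which matches what the installation accepted, while the task still calls for CDH 5.8.3 specifically.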
