Choosing a Kerberos approach for a Hadoop cluster in an enterprise environment

Factors to consider before choosing an approach for Kerberos implementation within an enterprise.

Choosing an approach for implementing Kerberos on a Hadoop cluster is critical from a long-term maintenance standpoint. Enterprises have their own security policies and guidelines, and a successful Kerberos implementation needs to adhere to the enterprise security architecture. There are multiple guides on how to implement Kerberos, but I couldn't find information on which approach to choose or on the pros and cons of each.
In a Hortonworks Hadoop cluster, there are three ways to generate and manage keytabs and principals.
a. Use an MIT KDC specific to the Hadoop cluster - automated keytab management using Ambari
A KDC specific to the Hadoop cluster can be installed and maintained on one of the Hadoop nodes. All principals and keytabs required for the Kerberos implementation are managed automatically by Ambari.
Pros:
  • Enterprise security teams are not involved in KDC setup. Hadoop administrators have complete control of the KDC installation.
  • Automated keytab management using Ambari. Keytabs do not need to be managed manually when the cluster configuration or topology changes.
  • Non-expiring keytabs can be generated and distributed to Hadoop developers, so each developer can hold a keytab tied to their own ID.
  • A one-way trust can be set up between the cluster KDC and the enterprise Active Directory, typically so that users who authenticate against AD are recognized by the Hadoop cluster.
Cons:
  • May be against enterprise security policies.
  • Hadoop administrators take on the additional responsibility of managing the KDC. Any security vulnerabilities become their responsibility.
  • Ensuring the KDC is set up for high availability and disaster recovery is the responsibility of Hadoop administrators.
  • Requires manual keytab generation for developers. For every new developer, a keytab must be generated and distributed by Hadoop administrators.
  • Procedures need to be set up for handling lost keytabs.
b. Use an existing enterprise Active Directory - manual setup
An alternative to a local KDC for the Hadoop cluster is to use Ambari to produce the usernames and principals required for Kerberos, and then create those users manually in the corporate AD.
Pros:
  • Meets enterprise security standards by leveraging existing corporate AD infrastructure.
  • Developers are already part of the existing AD, so no keytabs need to be generated for them.
Cons:
  • Manually managing keytabs in a large cluster becomes tedious and difficult to maintain with continuous changes to the cluster structure.
  • Any change to the Hadoop cluster structure (adding or deleting a node, adding or deleting a service on a node) requires new keytabs to be generated and distributed.
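To give a rough feel for why manual management gets tedious, here is a minimal Python sketch that lists the service principals a cluster would need, using the standard Kerberos `service/host@REALM` form. The host names, service short names, and realm are made up for illustration, and the sketch assumes every service runs on every host, which a real cluster would not do.

```python
from itertools import product


def service_principals(hosts, services, realm):
    """Return one principal per (host, service) pair in service/host@REALM form."""
    return [f"{svc}/{host}@{realm}" for host, svc in product(hosts, services)]


# Hypothetical three-node cluster; names and realm are illustrative only.
hosts = [f"worker{i:02d}.example.com" for i in range(1, 4)]
services = ["dn", "nm", "HTTP"]
realm = "EXAMPLE.COM"

principals = service_principals(hosts, services, realm)
for principal in principals:
    print(principal)
print(f"{len(principals)} principals to create in AD and distribute as keytabs")

# Adding one node means one more principal and keytab for each service on it.
extra = service_principals(["worker04.example.com"], services, realm)
print(f"adding one node: {len(extra)} more")
```

Every line printed is an account someone has to create in AD and a keytab someone has to copy to the right node, and the list changes with every topology change.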
c. Use an existing enterprise AD with automated management using Ambari
In this approach a new organizational unit (OU) is created in the enterprise AD, along with an AD account that has full administrative privileges on the new OU. This account and OU are then used during the automated setup in Ambari, which allows Ambari to manage principal generation, keytab generation, and keytab distribution automatically. The OU holds the principals for all Hadoop internal users required for Kerberos functionality.
Pros:
  • Satisfies corporate security policies, since complete auditing of user creation and maintenance is available within AD.
  • All developers and users are part of the enterprise AD and are already issued a Kerberos ticket. Existing tickets are used for any communication with the Kerberized cluster.
  • Backup, high availability, and other administrative tasks for the KDC are handled by the enterprise teams managing AD.
  • A separate OU within AD ensures Hadoop internal users are not mixed with other users in AD.
  • Existing Active Directory groups are available in Ranger for implementing security policies.
  • Automated keytab generation and distribution for all Hadoop internal users.
  • Changes to cluster topology and configuration are handled by Ambari.
Cons:
  • Any manually managed service users (with non-expiring passwords) for the Hadoop cluster need to be added to Active Directory by hand, and their keytabs distributed by hand. This may require service requests to other enterprise groups to generate the new IDs and keytabs.
  • Developers do not have usable keytabs for their own IDs. Keytabs tied to developer IDs are invalidated by password change policies (password expiration after a certain number of days). Developers can instead use the ticket that Active Directory issues for their ID.
  • Some Java applications and tools require a copy of a keytab file. It may be difficult to find a workaround that lets these applications and tools use cached tickets.
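For Java tools that authenticate through JAAS, one possible workaround is a login entry that reads the ticket cache instead of a keytab. The Python sketch below renders both variants of a `Krb5LoginModule` entry so the difference is easy to see. The entry name `HadoopClient`, the principals, and the keytab path are hypothetical, and whether a given tool honors such an entry depends on the tool.

```python
def jaas_entry(name, principal, keytab=None):
    """Render a JAAS login entry for Krb5LoginModule.

    With a keytab path the entry logs in from the keytab; without one it
    falls back to the user's existing ticket cache (e.g. from kinit or AD logon).
    """
    options = {"doNotPrompt": "true", "principal": f'"{principal}"'}
    if keytab:
        options.update(useKeyTab="true", keyTab=f'"{keytab}"', storeKey="true")
    else:
        options.update(useTicketCache="true", renewTGT="true")
    body = "\n".join(f"  {key}={value}" for key, value in options.items())
    return (
        f"{name} {{\n"
        "  com.sun.security.auth.module.Krb5LoginModule required\n"
        f"{body};\n"
        "};"
    )


# Hypothetical developer using the ticket already issued by Active Directory.
cache_entry = jaas_entry("HadoopClient", "alice@EXAMPLE.COM")
print(cache_entry)

# Hypothetical service account that was issued a keytab through a service request.
keytab_entry = jaas_entry(
    "HadoopClient", "svc_etl@EXAMPLE.COM", "/etc/security/keytabs/svc_etl.keytab"
)
print(keytab_entry)
```

The ticket-cache variant fits developers whose passwords expire, while the keytab variant only makes sense for service accounts with non-expiring passwords.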
This is a preliminary guide based on my experience implementing Kerberos. Any other suggestions or ideas are welcome.
