Exam AWS Certified Big Data - Specialty topic 2 question 13 discussion

Exam question from Amazon's AWS Certified Big Data - Specialty
Question #: 13
Topic #: 2

An organization needs to store sensitive information on Amazon S3 and process it through Amazon EMR. Data must be encrypted on Amazon S3 and Amazon EMR at rest and in transit. Using Thrift Server, the Data Analysis team uses Hive to interact with this data. The organization would like to grant access to only specific databases and tables, giving permission only to the SELECT statement.
Which solution will protect the data and limit user access to the SELECT statement on a specific portion of data?

  • A. Configure Transparent Data Encryption on Amazon EMR. Create an Amazon EC2 instance and install Apache Ranger. Configure the authorization on the cluster to use Apache Ranger.
  • B. Configure data encryption at rest for EMR File System (EMRFS) on Amazon S3. Configure data encryption in transit for traffic between Amazon S3 and EMRFS. Configure storage and SQL base authorization on HiveServer2.
  • C. Use AWS KMS for encryption of data. Configure and attach multiple roles with different permissions based on the different user needs.
  • D. Configure Security Group on Amazon EMR. Create an Amazon VPC endpoint for Amazon S3. Configure HiveServer2 to use Kerberos authentication on the cluster.
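For reference, the encryption requirements in the question (at rest and in transit, for both S3 and the cluster) are the kind of thing an EMR security configuration expresses. Below is a minimal Python sketch that only builds such a JSON document. The function name, the KMS key ARN, and the certificate bundle location are placeholders invented for illustration, and the field names should be checked against the EMR Management Guide before use.

```python
import json


def build_security_configuration(kms_key_arn: str, certs_s3_uri: str) -> dict:
    """Build an EMR security configuration document (illustrative sketch).

    Enables at-rest encryption for EMRFS data in S3 and for local disks,
    plus in-transit encryption using a PEM certificate bundle stored in S3.
    """
    return {
        "EncryptionConfiguration": {
            "EnableAtRestEncryption": True,
            "EnableInTransitEncryption": True,
            "AtRestEncryptionConfiguration": {
                # EMRFS data written to Amazon S3
                "S3EncryptionConfiguration": {
                    "EncryptionMode": "SSE-KMS",
                    "AwsKmsKey": kms_key_arn,
                },
                # Local disks on the cluster nodes
                "LocalDiskEncryptionConfiguration": {
                    "EncryptionKeyProviderType": "AwsKms",
                    "AwsKmsKey": kms_key_arn,
                },
            },
            "InTransitEncryptionConfiguration": {
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": certs_s3_uri,
                }
            },
        }
    }


# Placeholder values, not real resources.
config = build_security_configuration(
    "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    "s3://example-bucket/certs/my-certs.zip",
)
config_json = json.dumps(config, indent=2)
```

The resulting `config_json` string is what would be supplied as the `SecurityConfiguration` argument when creating a security configuration through the EMR API (for example with boto3's `create_security_configuration`). It covers encryption only; it does nothing to restrict users to SELECT.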
Suggested Answer: C

Comments

mattyb123
Highly Voted 3 years, 8 months ago
It's A. https://aws.amazon.com/blogs/big-data/implementing-authorization-and-auditing-using-apache-ranger-on-amazon-emr/
upvoted 5 times
apertus
3 years, 8 months ago
Transparent Data Encryption is for HDFS, not S3. B may be the correct answer: https://poonamkucheriya.wordpress.com/2019/01/11/how-to-implement-sql-standard-based-hive-authorization-in-emr-hive/
upvoted 1 times
apertus
3 years, 8 months ago
Transparent Data Encryption is for HDFS: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-data-encryption-options.html
upvoted 1 times
ru4aws
Most Recent 2 years, 11 months ago
Everyone is saying B, but https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive-differences.html says "Hive authorization : Amazon EMR supports Hive authorization for HDFS but not for EMRFS and Amazon S3. Amazon EMR clusters run with authorization disabled by default."
upvoted 1 times
DerekKey
3 years, 7 months ago
Answer A: Transparent Encryption - Security configurations offer settings to enable security for data in-transit and data at-rest in Amazon Elastic Block Store (Amazon EBS) storage volumes and EMRFS data in Amazon S3. Ranger - EMR steps are used to perform the following: Install and configure Ranger HDFS and Hive plugins
upvoted 1 times
rohitsingh
3 years, 7 months ago
It's A. Apache Ranger has the following goals:
  • Centralized security administration to manage all security related tasks in a central UI or using REST APIs.
  • Fine grained authorization to do a specific action and/or operation with Hadoop component/tool and managed through a central administration tool.
  • Standardize authorization method across all Hadoop components.
  • Enhanced support for different authorization methods: role based access control, attribute based access control, etc.
  • Centralize auditing of user access and administrative actions (security related) within all the components of Hadoop.
upvoted 1 times
askaron
3 years, 7 months ago
B is the only one to address "SQL base authorization": https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization which is basically the right to execute a SELECT statement, as required by the question. Answer A is tempting, but why install an additional EC2 instance if it can be done without one?
upvoted 2 times
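The SQL standard based authorization linked in the comment above is driven by GRANT statements. As a rough sketch, and assuming that this authorization mode is enabled on HiveServer2 and that the statements are run by a user holding the admin role, the SELECT-only restriction could be generated like this in Python. The helper name, the user name, and the table names are hypothetical.

```python
from typing import List, Sequence


def grant_select_statements(user: str, tables: Sequence[str]) -> List[str]:
    """Return one HiveQL GRANT statement per table, giving SELECT only.

    No INSERT, UPDATE, or DELETE privilege is granted, so the user is
    limited to reading the listed tables.
    """
    return [f"GRANT SELECT ON TABLE {table} TO USER {user}" for table in tables]


# Hypothetical analyst and tables, for illustration only.
statements = grant_select_statements("analyst1", ["sales.orders", "sales.customers"])
for statement in statements:
    print(statement + ";")
```

Each generated statement would then be submitted through a HiveServer2 client such as Beeline. Whether this mode is effective for tables backed by EMRFS/S3 is exactly the point disputed elsewhere in this thread.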
jkoffee
3 years, 7 months ago
B my choice https://aws.amazon.com/fr/blogs/big-data/encrypt-data-at-rest-and-in-flight-on-amazon-emr-with-security-configurations/
upvoted 1 times
freedomeox
3 years, 7 months ago
B. A: doesn't mention data encryption on S3. C: KMS is a key management service used to manage the encryption keys. No matter what encryption you use, you can always use KMS to manage the keys. So you can say "use KMS to manage encryption keys", but not "use KMS for encryption of data". D: a security group does not help in giving permission to the SELECT statement.
upvoted 2 times
k115
3 years, 7 months ago
B is the correct answer
upvoted 2 times
Bulti
3 years, 7 months ago
This is a good example which confirms that A is the answer: https://noise.getoto.net/2016/12/02/implementing-authorization-and-auditing-using-apache-ranger-on-amazon-emr/. The key is that the question is about EMR HDFS, but some of the choices offered talk about EMRFS, which may lead to picking a wrong answer. As this is AWS EMR, we are looking at an HDFS cluster that needs to be protected. Transparent Data Encryption meets the at rest and in transit requirements. So authorizing access to Hive tables is best done using Apache Ranger, as shown in the blog at the above link.
upvoted 3 times
Corram
3 years, 7 months ago
B is correct. "store sensitive information on Amazon S3 and process it through Amazon EMR" <- this is EMRFS at its best. Option A does not help with encrypting data in S3, nor in transit when it gets loaded from S3 to EMR, which the question clearly asks for.
upvoted 2 times
Zinty
3 years, 7 months ago
I don't think B is correct. Amazon EMR supports Hive authorization (Storage Based Authorization, SQL Standards Based Authorization in HiveServer2) for HDFS but not for EMRFS and Amazon S3. Amazon EMR clusters run with authorization disabled by default.
upvoted 1 times
Corram
3 years, 7 months ago
This article, written by AWS employees, indicates otherwise, stating "The EMRFS authorization feature specifically applies to access by using HiveServer2." https://idk.dev/best-practices-for-securing-amazon-emr/
upvoted 1 times
san2020
3 years, 7 months ago
my selection B
upvoted 2 times
aws123
3 years, 7 months ago
It's B https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization
upvoted 2 times
sriansri
3 years, 7 months ago
Why not C? Is it wrong?
upvoted 1 times
shwang
3 years, 8 months ago
Why has nobody considered C?
upvoted 1 times
srirampc
3 years, 7 months ago
Yes, C is the answer. Using KMS with IAM roles is a good pattern for controlling which users can access the data.
upvoted 1 times
DerekKey
3 years, 7 months ago
WRONG: IAM roles are not for protecting data in transit
upvoted 1 times
s3an
3 years, 8 months ago
A is wrong. https://aws.amazon.com/blogs/aws/new-at-rest-and-in-transit-encryption-for-amazon-emr/ "We already offer several data encryption options for EMR including server and client side encryption for Amazon S3 with EMRFS and Transparent Data Encryption for HDFS. While these solutions do a good job of protecting data at rest, they do not address data stored in temporary files or data that is in flight, moving between job steps. Each of these encryption options must be individually enabled and configured, making the process of implementing encryption more tedious that it need be"
upvoted 2 times
s3an
3 years, 8 months ago
B is the right answer. Kerberos authentication in D will not limit access to SELECT statements; it's simply an authentication mechanism.
upvoted 1 times
ME2000
3 years, 7 months ago
First of all, "transit for traffic between Amazon S3 and EMRFS" is invalid, because EMRFS is already on S3. Secondly, you can use different IAM roles for EMRFS requests to Amazon S3 based on cluster users, groups, or the location of EMRFS data in Amazon S3 (invalid - authorization on HiveServer2): https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-fs.html Therefore the correct answer is C.
upvoted 1 times
cybe001
3 years, 8 months ago
I choose B
upvoted 2 times
jlpl
3 years, 8 months ago
Apache Ranger is not an AWS product, so it might not be the right choice.
upvoted 1 times
mattyb123
3 years, 8 months ago
Please view the Big Data exam preparation course on AWS. Apache Ranger is mentioned quite heavily and the use case matches: https://www.aws.training/Details/Curriculum?id=21332
upvoted 1 times
jlpl
3 years, 8 months ago
https://www.aws.training/Details/Curriculum?id=21332 -> I cannot open this for some reason; it asks me to log in with credentials.
upvoted 1 times
mattyb123
3 years, 8 months ago
It's a free course. You just need to sign in with your Amazon/AWS account or APN account to access the training.
upvoted 1 times
Community vote distribution: A (35%, most voted), C (25%), B (20%), Other