Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 58 discussion

A data engineer needs to maintain a central metadata repository that users access through Amazon EMR and Amazon Athena queries. The repository needs to provide the schema and properties of many tables. Some of the metadata is stored in Apache Hive. The data engineer needs to import the metadata from Hive into the central metadata repository.
Which solution will meet these requirements with the LEAST development effort?

  • A. Use Amazon EMR and Apache Ranger.
  • B. Use a Hive metastore on an EMR cluster.
  • C. Use the AWS Glue Data Catalog.
  • D. Use a metastore on an Amazon RDS for MySQL DB instance.
Suggested Answer: C

Comments

rralucard_
Highly Voted 1 year, 5 months ago
Selected Answer: C
https://aws.amazon.com/blogs/big-data/metadata-classification-lineage-and-discovery-using-apache-atlas-on-amazon-emr/
Option C, using the AWS Glue Data Catalog, is the best solution to meet the requirements with the least development effort. The AWS Glue Data Catalog is designed to be a central metadata repository that integrates with various AWS services, including EMR and Athena, providing a managed and scalable solution for metadata management with built-in Hive compatibility.
upvoted 6 times
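A minimal sketch of what the built-in Hive compatibility mentioned above looks like on the EMR side: a cluster is pointed at the Glue Data Catalog through a configuration classification rather than through custom code. The helper name `glue_catalog_configurations` is hypothetical; the classification names and the client factory class are the ones AWS documents for EMR. Nothing here calls AWS, it only builds the configuration structure.

```python
import json

# Client factory class that EMR documents for using the Glue Data Catalog
# as the metastore for Hive (and, via spark-hive-site, for Spark SQL).
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def glue_catalog_configurations(include_spark=True):
    """Build an EMR `Configurations` list that selects the Glue Data Catalog.

    Hypothetical helper for illustration; it returns plain dicts only.
    """
    props = {"hive.metastore.client.factory.class": GLUE_FACTORY}
    configs = [{"Classification": "hive-site", "Properties": dict(props)}]
    if include_spark:
        configs.append(
            {"Classification": "spark-hive-site", "Properties": dict(props)}
        )
    return configs


if __name__ == "__main__":
    # Print the JSON that could be supplied as cluster configurations.
    print(json.dumps(glue_catalog_configurations(), indent=2))
```

The resulting list could be supplied as the cluster's configurations at creation time. Athena needs no equivalent setting in this sketch, since it reads from the Glue Data Catalog directly.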
vic614
Most Recent 1 year ago
Selected Answer: C
Data Catalog.
upvoted 1 time
Felix_G
1 year, 4 months ago
Option C, using the AWS Glue Data Catalog, requires the least development effort to meet the requirements for a central metadata repository accessed from EMR and Athena.
upvoted 2 times
Felix_G
1 year, 4 months ago
Here's an analysis of each option:
  • A. Amazon EMR and Apache Ranger would require significant coding to build a custom metadata repository, since Ranger manages access policies rather than table metadata.
  • B. A Hive metastore on an EMR cluster provides metadata to EMR, but sharing that metadata with Athena would require substantial additional work.
  • C. The AWS Glue Data Catalog integrates natively with EMR and Athena and provides a shared, Hive-compatible metadata catalog, making it the easiest solution.
  • D. A metastore on an Amazon RDS for MySQL DB instance would also require custom integration with Athena, EMR, and other services to share the metadata.
Because AWS Glue provides a fully managed data catalog purpose-built for metadata management across different analytics engines, Option C requires the least development effort.
upvoted 2 times
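To make the "import the metadata from Hive" step concrete, here is a rough sketch of how one Hive table description maps onto the shape of a Glue `TableInput`. The input dict layout, the function name `hive_table_to_glue_input`, the example bucket, and the choice of `EXTERNAL_TABLE` as the table type are all assumptions for illustration; only the output field names follow the Glue API structure. No AWS call is made here.

```python
def hive_table_to_glue_input(hive_table):
    """Map a hypothetical Hive table description to a TableInput-shaped dict."""

    def to_columns(pairs):
        # Glue describes each column as a dict with a Name and a Type.
        return [{"Name": name, "Type": col_type} for name, col_type in pairs]

    return {
        "Name": hive_table["name"],
        "TableType": "EXTERNAL_TABLE",  # assumption: data lives outside the catalog
        "PartitionKeys": to_columns(hive_table.get("partition_keys", [])),
        "Parameters": dict(hive_table.get("properties", {})),
        "StorageDescriptor": {
            "Columns": to_columns(hive_table["columns"]),
            "Location": hive_table["location"],
            "InputFormat": hive_table["input_format"],
            "OutputFormat": hive_table["output_format"],
            "SerdeInfo": {"SerializationLibrary": hive_table["serde"]},
        },
    }


# Hypothetical table metadata, as it might be read out of a Hive metastore.
example_table = {
    "name": "orders",
    "columns": [("order_id", "bigint"), ("amount", "double")],
    "partition_keys": [("order_date", "string")],
    "location": "s3://example-bucket/orders/",
    "input_format": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
    "output_format": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
    "serde": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
    "properties": {"classification": "parquet"},
}

table_input = hive_table_to_glue_input(example_table)
```

A dict of this shape is what boto3's Glue client accepts as the `TableInput` argument of `create_table`, alongside a `DatabaseName`. In practice a managed import path would do this mapping for every table, which is why option C involves so little custom development.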