Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 54 discussion

Exam question from Amazon's AWS Certified Data Engineer - Associate DEA-C01

Question #: 54
Topic #: 1

A company is planning to migrate on-premises Apache Hadoop clusters to Amazon EMR. The company also needs to migrate a data catalog into a persistent storage solution.
The company currently stores the data catalog in an on-premises Apache Hive metastore on the Hadoop clusters. The company requires a serverless solution to migrate the data catalog.
Which solution will meet these requirements MOST cost-effectively?

A. Use AWS Database Migration Service (AWS DMS) to migrate the Hive metastore into Amazon S3. Configure AWS Glue Data Catalog to scan Amazon S3 to produce the data catalog.
B. Configure a Hive metastore in Amazon EMR. Migrate the existing on-premises Hive metastore into Amazon EMR. Use AWS Glue Data Catalog to store the company's data catalog as an external data catalog.
C. Configure an external Hive metastore in Amazon EMR. Migrate the existing on-premises Hive metastore into Amazon EMR. Use Amazon Aurora MySQL to store the company's data catalog.
D. Configure a new Hive metastore in Amazon EMR. Migrate the existing on-premises Hive metastore into Amazon EMR. Use the new metastore as the company's data catalog.

Suggested Answer: B

by rralucard_ at Feb. 2, 2024, 11 a.m.

Comments

Asmunk

9 months, 1 week ago

Selected Answer: B

A and D can be discarded because of the added steps. This link documents this exact use case: https://aws.amazon.com/blogs/big-data/migrate-and-deploy-your-apache-hive-metastore-on-amazon-emr/ C is also discarded because of the "serverless" keyword: Aurora can be serverless, but the option does not specify that.

upvoted 2 times

Christina666

1 year, 4 months ago

Selected Answer: B

Serverless and Cost-Efficient: AWS Glue Data Catalog offers a serverless metadata repository, reducing operational overhead and making it cost-effective. Using it as an external data catalog means you don't have to manage additional database infrastructure.

Seamless Migration: Migrating your existing Hive metastore to Amazon EMR ensures compatibility with your current Hadoop setup. EMR is designed to run Hadoop workloads, facilitating this process.

Flexibility: An external data catalog in AWS Glue offers flexibility and separation of concerns. Your metastore remains managed by EMR for your Hadoop workloads, while Glue provides a centralized catalog for broader AWS data sources.

upvoted 2 times
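
As an illustration of what "using Glue Data Catalog as an external catalog" looks like on the EMR side, here is a minimal sketch that builds the cluster configuration list. The `hive-site` and `spark-hive-site` classifications and the client factory class name come from the AWS documentation; the function name and the `include_spark` flag are made up for this example, and nothing here actually creates a cluster.

```python
import json

# Client factory class that AWS documents for pointing Hive on EMR
# at the Glue Data Catalog instead of a self-managed metastore database.
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def glue_catalog_configurations(include_spark=True):
    """Return an EMR 'Configurations' list that uses Glue as the metastore.

    The function name and the include_spark flag are hypothetical helpers
    for this sketch; only the classification and property names are AWS's.
    """
    properties = {"hive.metastore.client.factory.class": GLUE_FACTORY}
    configurations = [
        {"Classification": "hive-site", "Properties": dict(properties)}
    ]
    if include_spark:
        # Spark SQL on EMR reads the same setting from spark-hive-site.
        configurations.append(
            {"Classification": "spark-hive-site", "Properties": dict(properties)}
        )
    return configurations


if __name__ == "__main__":
    # Print the JSON that could be supplied as the cluster's configurations.
    print(json.dumps(glue_catalog_configurations(), indent=2))
```

The resulting list has the shape EMR expects for cluster configurations, so it could be saved as JSON and supplied at cluster creation. With this in place there is no metastore database to run, which is the "serverless" part of option B.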

nyaopoko

1 year, 4 months ago

B is the answer! By leveraging AWS Glue Data Catalog as an external data catalog and migrating the existing Hive metastore into Amazon EMR, the company can achieve a serverless, persistent, and cost-effective solution for storing and managing their data catalog.

upvoted 1 time

arvehisa

1 year, 4 months ago

Selected Answer: B

B. https://aws.amazon.com/jp/blogs/big-data/migrate-and-deploy-your-apache-hive-metastore-on-amazon-emr/

upvoted 2 times

lucas_rfsb

1 year, 4 months ago

Selected Answer: A

I will go with A. Besides DMS being the typical choice for migration, it's the only option that explicitly addresses how the migration itself will be performed. The other options would require a script, or an AWS Glue ETL job if you will. But this logic of migration was never put

upvoted 2 times

LeoSantos121212121212121

1 year, 4 months ago

I will go with A

upvoted 2 times

jpmadan

1 year, 5 months ago

Selected Answer: B

Serverless catalog in AWS == Glue

upvoted 1 time

damaldon

1 year, 5 months ago

B. Set up an AWS Glue ETL job which extracts metadata from your Hive metastore (MySQL) and loads it into your AWS Glue Data Catalog. This method requires an AWS Glue connection to the Hive metastore as a JDBC source. An ETL script is provided to extract metadata from the Hive metastore and write it to AWS Glue Data Catalog. https://github.com/aws-samples/aws-glue-samples/blob/master/utilities/Hive_metastore_migration/README.md

upvoted 1 time
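
To make the "extract metadata and write it to the Glue Data Catalog" step more concrete, here is a rough sketch of the transformation at its core: turning one table description into the `TableInput` structure that Glue's `CreateTable` API accepts. This is not the script from the linked repository. The input dict layout, `SAMPLE_TABLE`, and the function names are assumptions invented for illustration; only the output keys (`Name`, `TableType`, `PartitionKeys`, `StorageDescriptor`, and so on) follow the Glue API.

```python
# Hypothetical, simplified description of one table as it might be read from
# a Hive metastore. The field names are illustrative, not the metastore schema.
SAMPLE_TABLE = {
    "name": "Orders",
    "location": "s3://example-bucket/warehouse/orders/",
    "columns": [("order_id", "bigint"), ("amount", "double")],
    "partition_keys": [("dt", "string")],
    "input_format": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
    "output_format": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
    "serde": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
}


def _fields(pairs):
    """Convert (name, type) pairs into Glue column dicts."""
    return [{"Name": name.lower(), "Type": hive_type} for name, hive_type in pairs]


def to_glue_table_input(table):
    """Build a Glue TableInput dict from the simplified table description."""
    return {
        "Name": table["name"].lower(),  # lowercased here for consistency
        "TableType": table.get("table_type", "EXTERNAL_TABLE"),
        "PartitionKeys": _fields(table.get("partition_keys", [])),
        "StorageDescriptor": {
            "Columns": _fields(table["columns"]),
            "Location": table["location"],
            "InputFormat": table["input_format"],
            "OutputFormat": table["output_format"],
            "SerdeInfo": {"SerializationLibrary": table["serde"]},
        },
    }


if __name__ == "__main__":
    # With boto3 installed and credentials configured, the result could be
    # passed as TableInput to the Glue client's create_table call.
    print(to_glue_table_input(SAMPLE_TABLE))
```

A real migration would also need to handle databases, partitions, and table parameters, which the AWS sample script covers; the sketch only shows the shape of the mapping for a single table.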

rralucard_

1 year, 6 months ago

Selected Answer: B

https://aws.amazon.com/blogs/big-data/migrate-and-deploy-your-apache-hive-metastore-on-amazon-emr/ Option B is likely the most suitable. Migrating the Hive metastore into Amazon EMR and using AWS Glue Data Catalog as an external catalog provides a balance between leveraging the scalable and managed services of AWS (like EMR and Glue Data Catalog) and ensuring a smooth transition from the on-premises setup. This approach leverages the serverless nature of AWS Glue Data Catalog, minimizing operational overhead and potentially reducing costs compared to managing database servers.

upvoted 3 times
