Exam AWS Certified Machine Learning - Specialty topic 1 question 52 discussion

A financial services company is building a robust serverless data lake on Amazon S3. The data lake should be flexible and meet the following requirements:
✑ Support querying old and new data on Amazon S3 through Amazon Athena and Amazon Redshift Spectrum.
✑ Support event-driven ETL pipelines
✑ Provide a quick and easy way to understand metadata
Which approach meets these requirements?

  • A. Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Glue ETL job, and an AWS Glue Data Catalog to search and discover metadata.
  • B. Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Batch job, and an external Apache Hive metastore to search and discover metadata.
  • C. Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Batch job, and an AWS Glue Data Catalog to search and discover metadata.
  • D. Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Glue ETL job, and an external Apache Hive metastore to search and discover metadata.
Suggested Answer: A 🗳️

Comments

DonaldCMLIN
Highly Voted 2 years, 7 months ago
Both A and B could work, but an external Apache Hive metastore might not be a serverless solution. The AWS Glue Data Catalog is your persistent metadata store. It is a managed service that lets you store, annotate, and share metadata in the AWS Cloud in the same way you would in an Apache Hive metastore. The Data Catalog is a drop-in replacement for the Apache Hive Metastore: https://docs.aws.amazon.com/zh_tw/glue/latest/dg/components-overview.html The best answer is A.
upvoted 45 times
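To make the "search and discover metadata" point concrete, here is a minimal sketch of listing table schemas from the Glue Data Catalog with boto3's `get_tables` call. The helper name `describe_tables` and the database name in the usage note are hypothetical, chosen only for illustration; the Glue client is passed in so the function can be exercised without AWS credentials.

```python
def describe_tables(glue, database_name):
    """Return {table_name: [(column_name, column_type), ...]} for one database.

    `glue` is expected to behave like boto3.client("glue"); it is passed in
    (an assumption of this sketch) so the logic can be tried with a stub.
    """
    schemas = {}
    kwargs = {"DatabaseName": database_name}
    while True:
        page = glue.get_tables(**kwargs)
        for table in page.get("TableList", []):
            # Column definitions live under the table's StorageDescriptor.
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            schemas[table["Name"]] = [
                (col["Name"], col.get("Type", "")) for col in columns
            ]
        token = page.get("NextToken")
        if not token:
            return schemas
        # Results are paginated; keep requesting until no token is returned.
        kwargs["NextToken"] = token
```

With real credentials this would be called roughly as `describe_tables(boto3.client("glue"), "my_datalake_db")`, where `my_datalake_db` stands in for a database that a crawler has already populated.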
rsimham
2 years, 7 months ago
I am thinking about answer C, because events can be triggered by CloudWatch with the Glue Data Catalog.
upvoted 1 times
qwerty456
2 years, 6 months ago
you can't schedule AWS Batch with CloudWatch
upvoted 4 times
kalyanvarma
2 years, 6 months ago
We can schedule Batch jobs with CloudWatch Events.
upvoted 1 times
...
qwerty456
2 years, 6 months ago
Sorry, it looks like you can, and not only with cron. The argument should be that AWS Batch isn't serverless.
upvoted 2 times
...
...
ComPah
2 years, 7 months ago
If we take "flexible" as the keyword, using Lambda might be a constraint.
upvoted 4 times
...
...
...
cybe001
Highly Voted 2 years, 7 months ago
Answer is A. Lambda is the preferred way of implementing an event-driven ETL job with S3: when new data arrives in S3, it notifies Lambda, which can start the ETL job.
upvoted 18 times
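The flow described in this comment (S3 notifies Lambda, Lambda starts the Glue job) could be sketched as follows. This is an illustrative handler, not a reference implementation: the job name `example-etl-job` and the `--source_bucket` / `--source_key` job arguments are hypothetical, and the optional `glue` parameter is an addition of this sketch so the parsing logic can be tried without AWS access. The only Glue call used is boto3's `start_job_run`.

```python
import urllib.parse

# Hypothetical Glue job name; replace with a job that exists in your account.
GLUE_JOB_NAME = "example-etl-job"


def s3_objects(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def lambda_handler(event, context, glue=None):
    """Start one Glue job run per newly arrived S3 object."""
    if glue is None:
        import boto3  # imported lazily so the handler can be tested offline

        glue = boto3.client("glue")
    run_ids = []
    for bucket, key in s3_objects(event):
        response = glue.start_job_run(
            JobName=GLUE_JOB_NAME,
            # Hypothetical job arguments; Glue passes them to the job script.
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
        run_ids.append(response["JobRunId"])
    return {"started": run_ids}
```

Starting one job run per object keeps the sketch simple; a real pipeline might batch keys or rely on Glue job bookmarks instead.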
rb39
1 year, 7 months ago
Agree, event-driven means Lambda; CloudWatch alarms are just for raising alarms based on log analysis.
upvoted 3 times
...
...
loict
Most Recent 8 months ago
Selected Answer: A
A. YES - all integrated components
B. NO - missing a component to invoke the Lambda
C. NO - CloudWatch will not trigger when there is a new file to process
D. NO - CloudWatch will not trigger when there is a new file to process
upvoted 2 times
...
Mickey321
8 months, 2 weeks ago
Selected Answer: A
A for me
upvoted 1 times
...
kaike_reis
9 months, 2 weeks ago
Selected Answer: A
Note that the question asks for a serverless system. Options B, C and D are therefore wrong, because they bring in components that are managed rather than serverless: AWS Batch (managed) and an external Apache Hive metastore (even more to manage). For serverless, event-driven ETL on AWS, triggering through a Lambda function is recommended, so the correct option is A. Note that CloudWatch alarms only fire based on log evaluation, which is not mentioned in the question.
upvoted 1 times
...
jackzhao
1 year, 2 months ago
I will choose A. I think C and D are wrong: you can use Amazon CloudWatch Events to trigger Lambda, but not a CloudWatch alarm.
upvoted 1 times
...
Valcilio
1 year, 2 months ago
Selected Answer: A
Batch is more for scheduled configuration tasks and similar work than for event-driven ETL data processing, so the answer is A.
upvoted 1 times
...
Jeremy1
1 year, 5 months ago
Selected Answer: A
Found this supporting A: Lambda is used to trigger the ETL job after the crawler completes. The crawler starts on a schedule or on events (files arriving).
upvoted 1 times
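The "Lambda triggers the ETL job after the crawler completes" pattern mentioned here might look roughly like the sketch below. It assumes the Lambda function is the target of a CloudWatch Events (EventBridge) rule matching Glue's "Glue Crawler State Change" events; the crawler and job names are hypothetical, and the optional `glue` parameter exists only so the filter logic can be tried without AWS access.

```python
# Hypothetical names, for illustration only.
CRAWLER_NAME = "example-raw-crawler"
GLUE_JOB_NAME = "example-etl-job"


def crawler_succeeded(event, crawler_name):
    """Return True if the event reports a successful run of the given crawler."""
    detail = event.get("detail", {})
    return (
        event.get("source") == "aws.glue"
        and event.get("detail-type") == "Glue Crawler State Change"
        and detail.get("crawlerName") == crawler_name
        and detail.get("state") == "Succeeded"
    )


def on_crawler_event(event, context, glue=None):
    """Start the ETL job only after the expected crawler has finished."""
    if not crawler_succeeded(event, CRAWLER_NAME):
        return {"started": None}
    if glue is None:
        import boto3  # imported lazily so the filter can be tested offline

        glue = boto3.client("glue")
    response = glue.start_job_run(JobName=GLUE_JOB_NAME)
    return {"started": response["JobRunId"]}
```

Filtering inside the function is a belt-and-braces choice; the same conditions can usually be expressed in the rule's event pattern so the function is only invoked for matching events.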
...
Skychaser
1 year, 10 months ago
Selected Answer: A
Based on Majority discussion
upvoted 2 times
...
exam887
1 year, 11 months ago
Selected Answer: C
Quite confused between A and C, since both are workable solutions. The AWS blog post below even combines CloudWatch and Lambda to drive Glue. Given the keyword "event-driven", I prefer CloudWatch. https://aws.amazon.com/blogs/big-data/build-and-automate-a-serverless-data-lake-using-an-aws-glue-trigger-for-the-data-catalog-and-etl-jobs/ https://docs.aws.amazon.com/glue/latest/dg/automating-awsglue-with-cloudwatch-events.html
upvoted 2 times
ZSun
1 year ago
CloudWatch and a Lambda function can work together to trigger events. But AWS Batch cannot perform ETL on its own and requires other services. When it comes to ETL, Glue is a much easier choice than Batch.
upvoted 1 times
...
VinceCar
1 year, 5 months ago
Agreed. CloudWatch can raise an event that launches Lambda. Refer to: https://docs.aws.amazon.com/lambda/latest/dg/services-cloudwatchevents.html
upvoted 1 times
...
...
syu31svc
2 years, 6 months ago
Answer is A 100%
upvoted 2 times
...
halfway
2 years, 6 months ago
A is preferred. Lambda can trigger ETL pipelines: https://aws.amazon.com/glue/
upvoted 3 times
...
PRC
2 years, 7 months ago
A is correct...Lambda is event driven and Glue is serverless as opposed to Hive
upvoted 4 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted