Exam AWS Certified Big Data - Specialty topic 1 question 61 discussion

Exam question from Amazon's AWS Certified Big Data - Specialty
Question #: 61
Topic #: 1

A company hosts a portfolio of e-commerce websites across the Oregon, N. Virginia, Ireland, and Sydney AWS regions. Each site keeps log files that capture user behavior. The company has built an application that generates batches of product recommendations with collaborative filtering in Oregon. Oregon was selected because the flagship site is hosted there and provides the largest collection of data to train machine learning models against. The other regions do NOT have enough historic data to train accurate machine learning models.
Which set of data processing steps improves recommendations for each region?

  • A. Use the e-commerce application in Oregon to write replica log files in each other region.
  • B. Use Amazon S3 bucket replication to consolidate log entries and build a single model in Oregon.
  • C. Use Kinesis as a buffer for web logs and replicate logs to the Kinesis stream of a neighboring region.
  • D. Use the CloudWatch Logs agent to consolidate logs into a single CloudWatch Logs group.
Suggested Answer: D

Comments

I_heart_shuffle_girls
Highly Voted 3 years, 8 months ago
I feel like we went down a rabbit hole here. We are looking for what improves recommendations for each region. The company seems to want to build regional models, so B would not improve Oregon's recommendations, C seems rather off base to me, and D again goes for consolidating logs. I feel the answer is A, as you can transfer logs from Oregon to the other regions, which will then give them enough data to begin training models for their own regions.
upvoted 11 times
...
guruguru
Most Recent 3 years, 7 months ago
B. Because the other regions don't have enough data to train their own models, funnel all the logs to a single region and train one model that applies to all of them to improve their recommendations. CloudWatch cross-region support has only been available since Nov 2019, but this question was already available 9 months ago, in Sep 2019. Therefore, D is wrong. https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-cloudwatch-launches-cross-account-cross-region-dashboards/
upvoted 1 times
...
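The argument for pooling logs into one model can be illustrated with a toy example. This is only a sketch: it uses simple item co-occurrence counts as a stand-in for collaborative filtering, not whatever algorithm the company in the question actually runs, and the session data, function names, and product names are all made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def train_cooccurrence(sessions):
    """Count how often each pair of items appears in the same session."""
    model = defaultdict(Counter)
    for items in sessions:
        # Every ordered pair of distinct items in a session co-occurs once.
        for a, b in permutations(set(items), 2):
            model[a][b] += 1
    return model


def recommend(model, item, k=1):
    """Return up to k items most often seen together with `item`."""
    return [other for other, _ in model[item].most_common(k)]


# Hypothetical session logs: Oregon has plenty of history, Sydney very little.
oregon = [
    ["tent", "stove"],
    ["tent", "stove"],
    ["tent", "lantern"],
    ["boots", "socks"],
]
sydney = [["tent", "sunscreen"]]

sydney_only = train_cooccurrence(sydney)
pooled = train_cooccurrence(oregon + sydney)

# A model trained only on Sydney's logs knows nothing about "boots";
# the pooled model can still suggest something for it.
print(recommend(sydney_only, "boots"))
print(recommend(pooled, "boots"))
```

Whether the pooled suggestions actually suit a region with different seasons or tastes is a separate question, which is the objection raised elsewhere in this thread.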
freedomeox
3 years, 7 months ago
I will go for C. IMO the key to improving the RM engine is to give the other regions access to the Oregon logs.
upvoted 1 times
...
MultiCloudGuru
3 years, 7 months ago
Answer is D Refer : https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-cloudwatch-launches-cross-account-cross-region-dashboards/
upvoted 1 times
...
chandrakatasani
3 years, 7 months ago
the answer is D https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-cloudwatch-launches-cross-account-cross-region-dashboards/
upvoted 1 times
...
Bulti
3 years, 7 months ago
I think A is the right answer. None of the options except A clearly talks about replicating the user behavior logs captured in Oregon to all the other regions so the model can be retrained there and thereby improve the predictions in those regions. Although A might look a bit more involved and less efficient, it is the only way to successfully improve the recommendations in the other regions.
upvoted 2 times
...
Bulti
3 years, 7 months ago
Two keywords in this question are "data manipulation" and "interactive". Only Option B, Pig with Tachyon, satisfies both requirements. Pig with Tachyon will be able to handle very high throughput working with large datasets and also be able to manipulate the data. Also, you can execute Pig commands interactively or in batch mode. To use Pig interactively, create an SSH connection to the master node and submit commands using the Grunt shell. Refer to this link: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-pig.html
upvoted 2 times
Bulti
3 years, 7 months ago
This is a comment against next question.
upvoted 3 times
...
DerekKey
3 years, 7 months ago
This is not what the question is about. Which set of data processing steps improves recommendations for each region
upvoted 1 times
...
...
Bulti
3 years, 7 months ago
Option B: Doesn't make sense, as it doesn't talk about how to improve the recommendations in each region. Option D: CloudWatch can consolidate logs into a log group within the same region but cannot replicate these logs (from Oregon, for example) into the other regions of interest. So it's between Option A and Option C. I would go with Option C over Option A because it uses AWS services to replicate the logs from one region to another through a relay mechanism, as opposed to having to write code in your e-commerce application to replicate the logs across all regions. Once the logs are replicated from one region to the next using the relay mechanism, the application can use these logs in its respective region to improve recommendations.
upvoted 1 times
...
san2020
3 years, 8 months ago
my selection D
upvoted 4 times
...
ME2000
3 years, 8 months ago
"Oregon was selected because the flagship site is hosted there and provides the largest collection of data to train machine learning models against." It means the ML model is already built and the application is ready. "The other regions do NOT have enough historic data to train accurate machine learning models." So the other sites are going to use the application based on the Oregon ML model, and for that they need... D. Use the CloudWatch Logs agent to consolidate logs into a single CloudWatch Logs group.
upvoted 2 times
...
PK1234
3 years, 8 months ago
The application that was built can consume log files and only generates recommendations. The question does not mention that the application can write log files; for that you need Kinesis.
upvoted 1 times
...
PK1234
3 years, 8 months ago
The recommendations for a cold place like Oregon (in Nov) will be different from those for Sydney (summer in Nov). So there cannot be any consolidation. By elimination, only C seems correct....
upvoted 1 times
...
s3an
3 years, 8 months ago
the question is "recommendation for EACH region", nothing says aggregate across regions. D seems like a good answer
upvoted 2 times
...
d00ku
3 years, 8 months ago
From AWS: "You can aggregate the metrics for AWS resources across multiple resources. Amazon CloudWatch can't aggregate data across Regions. Metrics are completely separate between Regions." sooo... B?
upvoted 2 times
...
BigEv
3 years, 8 months ago
Check this out. Seems like D is the closest answer https://aws.amazon.com/solutions/centralized-logging/
upvoted 1 times
BigEv
3 years, 8 months ago
Actually, I think @Hitu is correct. The CloudWatch agent could not aggregate cross-region logs into one group. After reading this article, I am voting for B now. Any comments, gents? "Logs are generated regionally by AWS services so the best practice is to funnel all regional logs into one region in order to analyze the data across regions. There are three options to centralize your AWS logs. (1) Use CloudWatch for your centralized log collection and then push them to a log analysis solution via Lambda or Kinesis. (2) Send all logs directly to S3 and further process them with Lambda functions. (3) Configure agents like Beats on EC2 instances and FunctionBeat on Lambdas to push logs to a logging solution." https://coralogix.com/log-analytics-blog/aws-centralized-logging-guide/
upvoted 3 times
Mountie
3 years, 7 months ago
Actually, CloudWatch has supported cross-region monitoring since Nov 8, 2019: https://aws.amazon.com/tw/about-aws/whats-new/2019/11/amazon-cloudwatch-launches-cross-account-cross-region-dashboards/
upvoted 1 times
...
...
...
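For anyone weighing option B, here is a rough sketch of what funneling regional logs into an Oregon bucket with S3 replication could look like. It only builds the configuration dict in the shape that boto3's `put_bucket_replication` call accepts; the bucket names, role ARN, account ID, rule ID, and `logs/` prefix are all hypothetical placeholders, and the actual API call is left as a comment because it needs boto3 and AWS credentials. S3 replication also requires versioning to be enabled on both the source and destination buckets.

```python
def build_replication_config(role_arn, dest_bucket_arn, prefix="logs/"):
    """Build a replication configuration dict for one source bucket."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-web-logs",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # only replicate log objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": dest_bucket_arn},
            }
        ],
    }


# Hypothetical source buckets, one per non-Oregon region in the question.
SOURCE_BUCKETS = {
    "us-east-1": "example-logs-virginia",
    "eu-west-1": "example-logs-ireland",
    "ap-southeast-2": "example-logs-sydney",
}

config = build_replication_config(
    role_arn="arn:aws:iam::123456789012:role/example-replication-role",
    dest_bucket_arn="arn:aws:s3:::example-logs-oregon",
)

# With boto3 installed and credentials configured, the rule would be applied
# to each source bucket roughly like this:
#   import boto3
#   for region, bucket in SOURCE_BUCKETS.items():
#       s3 = boto3.client("s3", region_name=region)
#       s3.put_bucket_replication(Bucket=bucket, ReplicationConfiguration=config)
```

This only covers moving the log objects; building the single model in Oregon from the consolidated bucket is a separate step.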
pkfe
3 years, 8 months ago
Centralized Log Management with AWS CloudWatch. Log groups are used to classify log streams together https://cloudacademy.com/blog/centralized-log-management-with-aws-cloudwatch-part-1-of-3/ so D
upvoted 1 times
...
exams
3 years, 8 months ago
I support D
upvoted 3 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other