Exam AWS Certified Big Data - Specialty topic 1 question 16 discussion

Exam question from Amazon's AWS Certified Big Data - Specialty
Question #: 16
Topic #: 1

An administrator needs to design the event log storage architecture for events from mobile devices. The event data will be processed by an Amazon EMR cluster daily for aggregated reporting and analytics before being archived.
How should the administrator recommend storing the log data?

  • A. Create an Amazon S3 bucket and write log data into folders by device. Execute the EMR job on the device folders.
  • B. Create an Amazon DynamoDB table partitioned on the device and sorted on date, write log data to table. Execute the EMR job on the Amazon DynamoDB table.
  • C. Create an Amazon S3 bucket and write data into folders by day. Execute the EMR job on the daily folder.
  • D. Create an Amazon DynamoDB table partitioned on EventID, write log data to table. Execute the EMR job on the table.
Suggested Answer: A

Comments

mattyb123
Highly Voted 3 years, 11 months ago
Thoughts on C?
upvoted 6 times
...
Mandy_007
Most Recent 1 year, 8 months ago
I think C is correct. We are running EMR daily, so partitioning by day will give data for all devices for each day.
upvoted 1 times
...
KartV
3 years, 9 months ago
I think it is A because the EMR job is run daily, so partitioning by day will lead to only one partition. Partitioning by device, on the other hand, allows for parallel jobs to be run for each device or group of devices.
upvoted 2 times
KartV
3 years, 9 months ago
On second thought, it could still be C, because one can sub-partition the logs by device under date, e.g. date/device, which still allows for parallelization. Whereas device/date will make the read query more complex, as the job needs to be run daily.
upvoted 1 times
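To make the date/device versus device/date comparison concrete, here is a minimal sketch of the two key layouts. The `logs/` root and the device IDs are made up for illustration; nothing in the question specifies them. It only shows how many prefixes a daily job would have to read under each layout.

```python
from datetime import date
from typing import List

ROOT = "logs"  # hypothetical key root, not given in the question


def prefixes_date_first(day: date) -> List[str]:
    """date/device layout: one prefix covers every device for the day."""
    return [f"{ROOT}/{day:%Y/%m/%d}/"]


def prefixes_device_first(day: date, device_ids: List[str]) -> List[str]:
    """device/date layout: the daily job needs one prefix per device."""
    return [f"{ROOT}/{dev}/{day:%Y/%m/%d}/" for dev in device_ids]


def key_date_first(day: date, device_id: str, name: str) -> str:
    """Object key under the date/device layout."""
    return f"{ROOT}/{day:%Y/%m/%d}/{device_id}/{name}"


if __name__ == "__main__":
    day = date(2020, 1, 5)
    devices = ["dev-a", "dev-b", "dev-c"]  # illustrative device IDs
    print(prefixes_date_first(day))
    print(prefixes_device_first(day, devices))
```

With the date-first layout the daily job points at a single prefix no matter how many devices exist; with the device-first layout the list of input prefixes grows with the device count.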
Corram
3 years, 9 months ago
Actually, you don't need subpartitioning. Partitioning isn't even a thing in S3. Instead, EMR is quite capable of reading many objects from one S3 folder in parallel. Also, if possible, it splits the objects into 64 MB chunks when reading, so it can even read a single object in parallel. Bottom line: C is correct, and thinking about partitioning does not make sense in this context. Hope it helps :)
upvoted 1 times
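As a rough illustration of the chunking arithmetic in the comment above, the sketch below estimates how many read tasks a daily folder could yield. The 64 MB split size is taken from the comment as an assumption; the real value depends on file format and cluster configuration, and some formats (gzip-compressed files, for example) generally cannot be split at all.

```python
from typing import Iterable

# Assumption taken from the comment above, not a guaranteed EMR setting.
SPLIT_BYTES = 64 * 1024 * 1024


def split_count(object_bytes: int, split_bytes: int = SPLIT_BYTES) -> int:
    """Number of chunks a splittable object of this size would be read in."""
    if object_bytes <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return -(-object_bytes // split_bytes)


def total_splits(object_sizes: Iterable[int]) -> int:
    """Total chunks across all objects under one daily folder."""
    return sum(split_count(size) for size in object_sizes)


if __name__ == "__main__":
    mb = 1024 * 1024
    # Hypothetical object sizes for one day's folder.
    print(total_splits([200 * mb, 10 * mb, 64 * mb]))
```

The point is only that parallelism comes from the number and size of objects under the folder, not from how the folder hierarchy is named.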
...
...
...
Debi_mishra
3 years, 9 months ago
Answer should be C. It's a daily aggregation not by device.
upvoted 1 times
...
jiedee
3 years, 9 months ago
Of course it is C. A is clearly wrong because the API call fees would be far too expensive.
upvoted 2 times
...
san2020
3 years, 9 months ago
My selection: C
upvoted 1 times
...
kalpanareddy
3 years, 10 months ago
Why not A?
upvoted 1 times
practicioner
3 years, 10 months ago
Because of the partition scheme. We should think about a simple access method. In this case, "yyyy/mm/dd" or similar is the best choice for partitioning.
upvoted 1 times
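A minimal sketch of the "yyyy/mm/dd" scheme mentioned above, assuming each event carries a timezone-aware timestamp and that day folders are cut on UTC. The bucket name, the file naming, and the UTC choice are all assumptions for illustration, not part of the question.

```python
from datetime import date, datetime, timedelta, timezone

BUCKET = "example-device-logs"  # hypothetical bucket name


def object_key(event_time: datetime, device_id: str, event_id: str) -> str:
    """Place an event under the folder for its UTC day."""
    if event_time.tzinfo is None:
        raise ValueError("event_time must be timezone-aware")
    utc = event_time.astimezone(timezone.utc)
    return f"{utc:%Y/%m/%d}/{device_id}-{event_id}.json"


def daily_input_uri(day: date) -> str:
    """Single input location the daily job would read for one day."""
    return f"s3://{BUCKET}/{day:%Y/%m/%d}/"


if __name__ == "__main__":
    # A device reporting late evening in UTC-5 lands in the next UTC day.
    local = datetime(2020, 3, 1, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
    print(object_key(local, "dev1", "e9"))
    print(daily_input_uri(date(2020, 3, 2)))
```

Normalizing to one timezone before building the key keeps every event for a reporting day under exactly one folder, which is what makes the daily read a single-location access.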
...
...
practicioner
3 years, 10 months ago
C is the right choice. Nothing else
upvoted 1 times
...
PK1234
3 years, 10 months ago
This is not for real-time analysis; the intent is to process log data in batch, hence S3 is better than DynamoDB.
upvoted 4 times
...
BigEv
3 years, 10 months ago
Why can't we use DynamoDB?
upvoted 2 times
...
M2
3 years, 10 months ago
C looks correct to me
upvoted 1 times
...
exams
3 years, 10 months ago
Yeah C looks correct. day/time mechanism is always better for storing logs
upvoted 3 times
...
pra276
3 years, 10 months ago
Answer is C.
upvoted 1 times
...
muhsin
3 years, 11 months ago
It is C. The EMR job runs daily, so the layout should be based on time, not device.
upvoted 3 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other