Exam AWS Certified Big Data - Specialty topic 2 question 12 discussion

Exam question from Amazon's AWS Certified Big Data - Specialty
Question #: 12
Topic #: 2

An advertising organization uses an application to process a stream of events that are received from clients in multiple unstructured formats.
The application does the following:
✑ Transforms the events into a single structured format and streams them to Amazon Kinesis for real-time analysis.
✑ Stores the unstructured raw events from the log files on local hard drives that are rotated and uploaded to Amazon S3.
The organization wants to extract campaign performance reporting using an existing Amazon Redshift cluster.
Which solution will provide the performance data with the LEAST number of operations?

  • A. Install the Amazon Kinesis Data Firehose agent on the application servers and use it to stream the log files directly to Amazon Redshift.
  • B. Create an external table in Amazon Redshift and point it to the S3 bucket where the unstructured raw events are stored.
  • C. Write an AWS Lambda function that triggers every hour to load the new log files already in S3 to Amazon Redshift.
  • D. Connect Amazon Kinesis Data Firehose to the existing Amazon Kinesis stream and use it to stream the events directly to Amazon Redshift.
Suggested Answer: B

Comments

Bulti
Highly Voted 3 years, 7 months ago
Not A: there is no use loading unstructured data in multiple formats into Redshift via the Kinesis Firehose agent. Not B: creating an external table with Redshift Spectrum will be an issue against unstructured data in multiple formats. Not C: not a good choice. I have never seen Lambda talking to Redshift, and why would you use it when KFH connects directly to Redshift? The correct option is D, because it loads structured data in a single format into Redshift.
upvoted 6 times
...
DerekKey
Most Recent 3 years, 6 months ago
Correct D: least number of operations
upvoted 1 times
...
jove
3 years, 7 months ago
D seems more reasonable. I go with D
upvoted 2 times
...
Bulti
3 years, 7 months ago
Option A also talks about shipping structured data from the application to Redshift using the Kinesis Firehose agent. So with the same agent it's possible to stream the data both to Kinesis Data Streams for analysis and to KFH to deliver it to Redshift. It seems like the most direct option.
upvoted 1 times
jove
3 years, 7 months ago
A is about log files which are unstructured. Not a good idea to move the log files to Redshift.
upvoted 1 times
...
...
susan8840
3 years, 7 months ago
B. the question is asking how to consume the data in Redshift not how to get/input the data which is already in place
upvoted 1 times
...
Zinty
3 years, 7 months ago
The unstructured data is already transformed to a single structured format prior to being put into Kinesis. So I will go with D for the LEAST number of operations. B = Spectrum is not needed.
upvoted 1 times
...
zhengtoronto
3 years, 7 months ago
To handle the unstructured data structure, Kinesis Data Firehose can invoke Lambda function to do data transformation and format conversion, so it's D
upvoted 2 times
...
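As a rough illustration of the Lambda transformation mentioned above: Firehose passes the function a batch of base64-encoded records and expects each one back with the same `recordId`, a `result` status, and re-encoded `data`. The sketch below assumes the incoming events are JSON objects; the key lower-casing step is a hypothetical normalization chosen only for the example, not something stated in the question.

```python
import base64
import json


def handler(event, context):
    """Firehose data-transformation handler (sketch).

    Assumption: each incoming record is a JSON object. The normalization
    (lower-casing the keys) is hypothetical and stands in for whatever
    format conversion the real events would need.
    """
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
            normalized = {key.lower(): value for key, value in payload.items()}
            # Newline-delimited JSON so the staged S3 objects can be loaded row by row.
            line = json.dumps(normalized, separators=(",", ":")) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
            })
        except ValueError:
            # Bad base64, bad UTF-8, or bad JSON: hand the record back as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Note that in the scenario from the question the application already emits a single structured format before the events reach Kinesis, so a transformation step like this may not be needed at all for option D.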
san2020
3 years, 7 months ago
my selection D
upvoted 2 times
...
mars2
3 years, 7 months ago
answer is D. The key here is multiple unstructured formats. You can't define an external table with multiple source formats.
upvoted 3 times
...
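To make the point about source formats concrete, here is a minimal sketch that builds a Redshift Spectrum `CREATE EXTERNAL TABLE` statement. The schema, table, column names, and S3 path in the usage are hypothetical. The statement has room for exactly one row format and one storage format, which is why a prefix holding raw events in several unstructured formats is a poor fit for option B.

```python
def external_table_ddl(schema, table, columns, s3_location, delimiter="|"):
    """Build a CREATE EXTERNAL TABLE statement for delimited text files.

    All names passed in are illustrative; the DDL declares a single
    row format and a single storage format for the whole S3 location.
    """
    column_list = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n  {column_list}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}';"
    )


# Hypothetical usage: every file under the location must match this one layout.
ddl = external_table_ddl(
    "spectrum",
    "campaign_events",
    [("campaign_id", "VARCHAR(64)"), ("event_time", "TIMESTAMP"), ("clicks", "INTEGER")],
    "s3://example-bucket/structured-events/",
)
print(ddl)
```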
Kuntazulu
3 years, 7 months ago
A. FH to Redshift is direct...
upvoted 1 times
...
sriansri
3 years, 7 months ago
For unstructured data, combining Redshift with S3 is the basic approach, because Redshift is not meant for unstructured data.
upvoted 4 times
DerekKey
3 years, 7 months ago
Transforms the events into a single structured format and streams them to Amazon Kinesis for real-time analysis.
upvoted 1 times
...
...
cybe001
3 years, 8 months ago
I go with D. Firehose can read the structured data from the Kinesis stream and store it in Redshift.
upvoted 2 times
...
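For anyone who wants to see what option D looks like in practice, the sketch below assembles the arguments for the Firehose `create_delivery_stream` call with the existing Kinesis stream as source and Redshift as destination. All ARNs, names, and credentials are placeholders, and the `CopyOptions` value assumes the structured events are JSON, which the question does not state. Firehose stages the data in an intermediate S3 bucket and then issues a Redshift COPY, so an S3 configuration is still required.

```python
def build_delivery_stream_request(stream_arn, role_arn, jdbc_url, table,
                                  bucket_arn, username, password):
    """Return keyword arguments for firehose.create_delivery_stream (sketch).

    Every value passed in is a placeholder. The stream name and the JSON
    copy option are assumptions made for the example.
    """
    return {
        "DeliveryStreamName": "campaign-events-to-redshift",  # hypothetical name
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,
        },
        "RedshiftDestinationConfiguration": {
            "RoleARN": role_arn,
            "ClusterJDBCURL": jdbc_url,
            "CopyCommand": {
                "DataTableName": table,
                # Assumes the structured events are JSON objects.
                "CopyOptions": "FORMAT AS JSON 'auto'",
            },
            "Username": username,
            "Password": password,
            # Firehose stages records here before running COPY.
            "S3Configuration": {"RoleARN": role_arn, "BucketARN": bucket_arn},
        },
    }


# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("firehose").create_delivery_stream(**request)
```

Once the delivery stream exists, no further per-batch operations are needed, which is the argument most commenters make for D.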
Zire
3 years, 8 months ago
B would be fine if the data were structured, since we could use Redshift Spectrum to create external tables pointing to S3. For this I'd go with D. At least as a solution it is correct.
upvoted 1 times
...
bigdatalearner
3 years, 8 months ago
B is the right answer
upvoted 2 times
d00ku
3 years, 8 months ago
How can B be the answer when it says 'point the table to the unstructured data'? The answer is D.
upvoted 3 times
shwang
3 years, 7 months ago
I referred to the FAQ: unstructured data in S3 can be an external table in Redshift, so it is B.
upvoted 1 times
DerekKey
3 years, 6 months ago
Transforms the events into a single structured format and streams them to Amazon Kinesis for real-time analysis.
upvoted 1 times
...
...
...
...
mattyb123
3 years, 8 months ago
Thoughts on D?
upvoted 4 times
mattyb123
3 years, 8 months ago
Amazon Redshift Spectrum uses external tables to query data that is stored in Amazon S3. You can query an external table using the same SELECT syntax you use with other Amazon Redshift tables. External tables are read-only. You can't write to an external table.
1. https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_TABLE.html
2. https://blog.openbridge.com/10-simple-tips-that-help-you-quickly-find-success-adopting-amazon-redshift-spectrum-810db089abbe
3. https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-external-tables.html
upvoted 2 times
jove
3 years, 7 months ago
Amazon Redshift now supports writing to external tables in Amazon S3 : https://aws.amazon.com/about-aws/whats-new/2020/06/amazon-redshift-now-supports-writing-to-external-tables-in-amazon-s3/
upvoted 1 times
...
...
mattyb123
3 years, 8 months ago
I think it's D because FH can automatically copy/write the data to Redshift, whereas with Redshift Spectrum you can only create read-only external tables and you would need to write the SQL to create the external table.
upvoted 4 times
...
...