Exam AWS Certified Data Analytics - Specialty topic 1 question 7 discussion

A streaming application is reading data from Amazon Kinesis Data Streams and immediately writing the data to an Amazon S3 bucket every 10 seconds. The application is reading data from hundreds of shards. The batch interval cannot be changed due to a separate requirement. The data is being accessed by Amazon Athena. Users are seeing degradation in query performance as time progresses.
Which action can help improve query performance?

  • A. Merge the files in Amazon S3 to form larger files.
  • B. Increase the number of shards in Kinesis Data Streams.
  • C. Add more memory and CPU capacity to the streaming application.
  • D. Write the files to multiple S3 buckets.
Suggested Answer: A

Comments

abhineet
Highly Voted 3 years, 8 months ago
It should be A. A large number of small files in S3 will slow down reads.
upvoted 41 times
testtaker3434
3 years, 7 months ago
Yep, I agree it's A.
upvoted 5 times
...
[Removed]
3 years, 4 months ago
You can speed up your queries dramatically by compressing your data, provided that files are splittable or of an optimal size (optimal S3 file size is between 200MB-1GB). Smaller data sizes mean less network traffic between Amazon S3 and Athena.
upvoted 2 times
...
...
Paitan
Highly Voted 3 years, 7 months ago
Merging the files in Amazon S3 to form larger files will definitely improve read performance. So option A is the right choice.
upvoted 10 times
...
chinmayj213
Most Recent 1 year, 2 months ago
Everyone is saying A, which is right, but why? With hundreds of shards, each with a write capacity of 1 MB/s, the application can produce hundreds of small files every 10 seconds. These need to be merged to improve query performance.
upvoted 1 times
...
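To put rough numbers on the comment above, here is a minimal sketch. It assumes the application writes one S3 object per shard per batch, which the question does not state, and uses a hypothetical count of 200 shards as a stand-in for "hundreds of shards".

```python
# Rough estimate of how many S3 objects the streaming application creates.
# Assumption (not stated in the question): one object per shard per batch.

SECONDS_PER_DAY = 24 * 60 * 60


def objects_per_day(shard_count: int, batch_interval_s: int) -> int:
    """Objects written per day under the one-object-per-shard-per-batch assumption."""
    batches_per_day = SECONDS_PER_DAY // batch_interval_s
    return shard_count * batches_per_day


if __name__ == "__main__":
    # 200 shards is a hypothetical value, not a figure from the question.
    print(objects_per_day(shard_count=200, batch_interval_s=10))
```

Under these assumptions the bucket gains well over a million objects per day, which is consistent with query performance degrading "as time progresses".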
NikkyDicky
1 year, 9 months ago
Selected Answer: A
A for sure
upvoted 1 times
...
pk349
2 years ago
A: I passed the test
upvoted 2 times
Priya_angre
2 years ago
What is the right answer?
upvoted 1 times
...
...
Aina
2 years, 1 month ago
A. This bit of AWS documentation: https://docs.aws.amazon.com/athena/latest/ug/performance-tuning-s3-throttling.html says "If possible, avoid having a large number of small files. Amazon S3 has a limit of 5500 requests per second, and your Athena queries share this same limit. If you scan millions of small objects in a single query, your query will likely be throttled by Amazon S3."
upvoted 1 times
...
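The throttling argument in the quoted documentation can be sketched as simple arithmetic. This is only a lower bound, and it assumes one GET request per object with every object subject to the same 5500 requests-per-second limit; real queries issue additional requests and run in parallel, so treat the numbers as illustrative.

```python
# Lower bound on the time spent on S3 requests when every object needs a GET.
# 5500 requests/second is the figure quoted from the Athena documentation.
# Assumption: exactly one GET per object, all under the same request limit.

S3_REQUESTS_PER_SECOND = 5500


def min_request_seconds(object_count: int,
                        limit: int = S3_REQUESTS_PER_SECOND) -> float:
    """Minimum seconds needed to request object_count objects at the given limit."""
    return object_count / limit


if __name__ == "__main__":
    # Hypothetical comparison: the same data as many small or few large objects.
    print(min_request_seconds(1_000_000))  # many small objects
    print(min_request_seconds(1_000))      # after merging
```

With these hypothetical counts, a million small objects need roughly three minutes of request budget, while a thousand merged objects need a fraction of a second.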
AwsNewPeople
2 years, 2 months ago
A. Merge the files in Amazon S3 to form larger files. To improve query performance when using Amazon Athena to access data from an Amazon S3 bucket, the streaming application should merge the files in S3 to form larger files. When the streaming application writes data to S3 every 10 seconds, it creates small files, which can lead to a large number of small files over time. This can lead to performance degradation in Athena queries as more small files mean more metadata needs to be scanned, and more file operations are required to read data. By merging small files into larger files, the number of files in the bucket can be reduced, which can significantly improve Athena query performance. Increasing the number of shards in Kinesis Data Streams, adding more memory and CPU capacity to the streaming application, or writing files to multiple S3 buckets are not directly related to the issue of degraded query performance in Athena.
upvoted 4 times
...
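One way to picture the merge step described above is as a planning pass that groups small objects into batches of a target size. The sketch below is an assumption-laden illustration, not the method any commenter used: the function name `plan_merges` is hypothetical, the 256 MB target is simply a value inside the 200MB-1GB range quoted earlier in the thread, and the actual reading and rewriting of S3 objects is left out.

```python
from typing import List, Tuple

# Hypothetical target size for each merged file (within the 200MB-1GB range
# mentioned in the thread); tune for your own data.
TARGET_BYTES = 256 * 1024 * 1024


def plan_merges(objects: List[Tuple[str, int]],
                target_bytes: int = TARGET_BYTES) -> List[List[str]]:
    """Group (key, size_in_bytes) pairs, in order, into batches of about target_bytes."""
    groups: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for key, size in objects:
        # Close the current group if adding this object would exceed the target.
        if current and current_size + size > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(key)
        current_size += size
    if current:
        groups.append(current)
    return groups
```

Each returned group would then be rewritten as a single larger object by a separate job, so the 10-second batch interval of the streaming application stays unchanged.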
itsme1
2 years, 2 months ago
Selected Answer: A
S3 has a limit of 5500 requests per second, and combining files reduces the number of requests. https://docs.aws.amazon.com/athena/latest/ug/performance-tuning.html
upvoted 1 times
...
cloudlearnerhere
2 years, 6 months ago
Selected Answer: A
Correct answer is A, as merging files to form bigger files can help optimize and improve query performance. Option B is wrong, as increasing shards would only increase the ingestion throughput. Options C & D are wrong, as they do not improve Athena's query performance.
upvoted 7 times
...
MultiCloudIronMan
2 years, 7 months ago
Selected Answer: A
Merging small files into larger files reduces the number of files Athena has to open and read, which speeds up queries.
upvoted 2 times
...
Abep
2 years, 8 months ago
https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/
upvoted 2 times
...
rocky48
2 years, 10 months ago
Selected Answer: A
Answer should be A
upvoted 2 times
...
ru4aws
2 years, 10 months ago
Selected Answer: A
A, as merging small files into larger files means fewer objects for Athena to list and open, which lets it scan the data faster.
upvoted 3 times
...
dushmantha
2 years, 11 months ago
Things that can be done to improve Athena performance: use columnar formats, use a small number of large files, and use partitions. So the answer should be A.
upvoted 2 times
...
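The three tips in the comment above (columnar format, fewer larger files, partitions) can be combined in an Athena CREATE TABLE AS SELECT statement. The helper below is a hedged sketch that only builds the SQL text; `build_ctas`, the table names, the column names, and the bucket path are all hypothetical, and nothing here is submitted to Athena.

```python
from typing import List


def build_ctas(source: str, target: str, location: str,
               columns: List[str], partition_column: str) -> str:
    """Build an Athena CTAS statement that rewrites a table as partitioned Parquet."""
    # Athena CTAS expects partition columns to come last in the SELECT list.
    select_cols = [c for c in columns if c != partition_column] + [partition_column]
    return (
        f"CREATE TABLE {target}\n"
        f"WITH (\n"
        f"  format = 'PARQUET',\n"
        f"  external_location = '{location}',\n"
        f"  partitioned_by = ARRAY['{partition_column}']\n"
        f") AS\n"
        f"SELECT {', '.join(select_cols)}\n"
        f"FROM {source}"
    )


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    print(build_ctas("raw_events", "events_compacted",
                     "s3://example-bucket/compacted/",
                     ["dt", "id", "payload"], "dt"))
```

Running such a statement on a schedule is one possible way to compact the small files without touching the streaming application; whether it fits depends on data volume and partition counts.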
Bik000
2 years, 12 months ago
Selected Answer: A
Answer should be A
upvoted 1 times
...
moon2351
3 years, 2 months ago
Selected Answer: A
Answer is A
upvoted 3 times
...
RSSRAO
3 years, 3 months ago
Selected Answer: A
A is the correct answer. Merging small files into larger files works as expected.
upvoted 3 times
...