A company is running Apache Spark on an Amazon EMR cluster. The Spark job writes to an Amazon S3 bucket. The job fails and returns an HTTP 503 `Slow Down` AmazonS3Exception error. Which actions will resolve this error? (Choose two.)
A. Add additional prefixes to the S3 bucket
B. Reduce the number of prefixes in the S3 bucket
C. Increase the EMR File System (EMRFS) retry limit
D. Disable dynamic partition pruning in the Spark configuration for the cluster
E. Add more partitions in the Spark configuration for the cluster
A - CORRECT. S3 request limits are defined on a per-prefix basis ( https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html , "3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned prefix" ). If you "add" more prefixes, meaning you change the write logic so that tasks write to less-shared prefixes, the limit is less likely to be hit.
B - WRONG. S3 request limits are defined on a per-prefix basis. If you reduce the number of prefixes, more requests land on each one and the limit is more likely to be hit.
C - CORRECT. EMRFS uses an exponential backoff strategy to retry requests to Amazon S3, with a default retry limit of 15. To increase the retry limit, change the value of the fs.s3.maxRetries parameter. ( https://aws.amazon.com/premiumsupport/knowledge-center/emr-s3-503-slow-down/ )
D - WRONG. Dynamic partition pruning decreases the number of requests to S3 because it helps select which prefixes to read, so disabling it would increase the request count.
E - WRONG. Increasing the number of partitions increases the number of Spark tasks, and hence the number of write requests to S3.
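The per-prefix reasoning behind A and B can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the limits are the ones quoted above, requests are assumed to be spread evenly across prefixes, and `min_prefixes`, `spread_key`, and the `shard=NN` key layout are hypothetical names, not part of any AWS or Spark API.

```python
import hashlib
import math

# Per-prefix request limits quoted from the S3 performance guidelines.
PUT_LIMIT_PER_PREFIX = 3500   # PUT/COPY/POST/DELETE requests per second
GET_LIMIT_PER_PREFIX = 5500   # GET/HEAD requests per second


def min_prefixes(write_rps: float, read_rps: float = 0.0) -> int:
    """Estimate how many prefixes are needed to stay under the limits.

    Assumes requests are distributed evenly across prefixes.
    """
    by_writes = math.ceil(write_rps / PUT_LIMIT_PER_PREFIX)
    by_reads = math.ceil(read_rps / GET_LIMIT_PER_PREFIX)
    return max(1, by_writes, by_reads)


def spread_key(base_prefix: str, file_name: str, num_prefixes: int) -> str:
    """Build a key under one of num_prefixes sub-prefixes (hypothetical layout).

    The shard is derived from a hash of the file name, so the same file
    always maps to the same sub-prefix.
    """
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % num_prefixes
    return f"{base_prefix}/shard={shard:02d}/{file_name}"


if __name__ == "__main__":
    # Example: 200 tasks issuing about 50 PUTs per second each.
    needed = min_prefixes(write_rps=200 * 50)
    print(needed)  # 3
    print(spread_key("output", "part-00000.parquet", needed))
```

With fewer prefixes than `min_prefixes` suggests, the same request volume concentrates on each prefix, which is why option B makes the 503 more likely rather than less.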
A C
https://aws.amazon.com/premiumsupport/knowledge-center/emr-s3-503-slow-down/
This error occurs when you exceed the Amazon Simple Storage Service (Amazon S3) request rate. The request rate is 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix in a bucket.
There are three ways to resolve this problem:
Add more prefixes to the S3 bucket.
Reduce the number of Amazon S3 requests.
Increase the EMR File System (EMRFS) retry limit.
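For the third option, the retry limit is usually raised through a cluster configuration object. The sketch below builds one in Python, assuming the `emrfs-site` classification and the `fs.s3.maxRetries` property described in the knowledge-center article linked above; the helper name `emrfs_retry_config` and the value 20 are illustrative assumptions, not recommendations.

```python
import json


def emrfs_retry_config(max_retries: int) -> list:
    """Return an EMR configuration list that sets the EMRFS retry limit.

    Assumes the emrfs-site classification; property values are strings.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be a positive integer")
    return [
        {
            "Classification": "emrfs-site",
            "Properties": {"fs.s3.maxRetries": str(max_retries)},
        }
    ]


if __name__ == "__main__":
    # Print JSON that could be supplied as the cluster's configuration.
    print(json.dumps(emrfs_retry_config(20), indent=2))
```

A higher retry limit lets the job ride out throttling but does not reduce the request rate itself, so it is typically combined with one of the other two options.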