An organization uses a custom MapReduce application to build monthly reports based on many small data files in an Amazon S3 bucket. The data is submitted from various business units on a frequent but unpredictable schedule. As the dataset continues to grow, it becomes increasingly difficult to process all of the data in one day. The organization has scaled up its Amazon EMR cluster, but other optimizations could improve performance.
The organization needs to improve performance with minimal changes to existing processes and applications.
What action should the organization take?