An organization currently runs a large Hadoop environment in their data center and is in the process of creating an alternative Hadoop environment on AWS, using Amazon EMR.
They generate around 20 TB of data each month. Also on a monthly basis, files need to be grouped and copied to Amazon S3 for use by the Amazon EMR environment. The data must be copied to multiple S3 buckets across AWS accounts. There is a 10 Gbps AWS Direct Connect link between their data center and AWS, and the network team has agreed to allocate 50% of the AWS Direct Connect bandwidth to data transfer. The data transfer cannot take more than two days.
What would be the MOST efficient approach to transfer data to AWS on a monthly basis?
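A quick feasibility check on the numbers in the question may help when weighing the options. The sketch below estimates the ideal transfer time over the allocated share of the link. It assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits/s) and ignores protocol overhead and real-world throughput limits, so treat the result as a lower bound rather than a measured figure. The function name `transfer_hours` is made up for illustration.

```python
def transfer_hours(data_tb: float, link_gbps: float, share: float) -> float:
    """Ideal transfer time in hours, assuming decimal units and no overhead."""
    bits = data_tb * 10**12 * 8              # total payload in bits
    usable_bps = link_gbps * 10**9 * share   # bandwidth allocated to the transfer
    return bits / usable_bps / 3600          # seconds converted to hours


# Values from the question: 20 TB per month, 10 Gbps link, 50% allocation.
hours = transfer_hours(data_tb=20, link_gbps=10, share=0.5)
print(f"{hours:.1f} hours")  # roughly 8.9 hours, against a 48-hour limit
```

Under these assumptions, the monthly volume fits within the two-day window with a wide margin over the existing Direct Connect link.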