Exam AWS Certified Solutions Architect - Professional topic 1 question 507 discussion

A company has a large on-premises Apache Hadoop cluster with a 20 PB HDFS database. The cluster is growing every quarter by roughly 200 instances and 1 PB. The company's goals are to enable resiliency for its Hadoop data, limit the impact of losing cluster nodes, and significantly reduce costs. The current cluster runs 24/7 and supports a variety of analysis workloads, including interactive queries and batch processing.
Which solution would meet these requirements with the LEAST expense and downtime?

  • A. Use AWS Snowmobile to migrate the existing cluster data to Amazon S3. Create a persistent Amazon EMR cluster initially sized to handle the interactive workload based on historical data from the on-premises cluster. Store the data on EMRFS. Minimize costs using Reserved Instances for master and core nodes and Spot Instances for task nodes, and auto scale task nodes based on Amazon CloudWatch metrics. Create similarly optimized, job-specific clusters for the batch workloads.
  • B. Use AWS Snowmobile to migrate the existing cluster data to Amazon S3. Create a persistent Amazon EMR cluster of a similar size and configuration to the current cluster. Store the data on EMRFS. Minimize costs by using Reserved Instances. As the workload grows each quarter, purchase additional Reserved Instances and add to the cluster.
  • C. Use AWS Snowball to migrate the existing cluster data to Amazon S3. Create a persistent Amazon EMR cluster initially sized to handle the interactive workloads based on historical data from the on-premises cluster. Store the data on EMRFS. Minimize costs using Reserved Instances for master and core nodes and Spot Instances for task nodes, and auto scale task nodes based on Amazon CloudWatch metrics. Create similarly optimized, job-specific clusters for the batch workloads.
  • D. Use AWS Direct Connect to migrate the existing cluster data to Amazon S3. Create a persistent Amazon EMR cluster initially sized to handle the interactive workload based on historical data from the on-premises cluster. Store the data on EMRFS. Minimize costs using Reserved Instances for master and core nodes and Spot Instances for task nodes, and auto scale task nodes based on Amazon CloudWatch metrics. Create similarly optimized, job-specific clusters for the batch workloads.
Suggested Answer: A
To migrate large datasets of 10 PB or more in a single location, you should use Snowmobile. For datasets less than 10 PB or distributed in multiple locations, you should use Snowball. In addition, you should evaluate the amount of available bandwidth in your network backbone. If you have a high speed backbone with hundreds of Gb/s of spare throughput, then you can use Snowmobile to migrate the large datasets all at once. If you have limited bandwidth on your backbone, you should consider using multiple Snowballs to migrate the data incrementally.
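As a rough sanity check on the bandwidth point (my own back-of-the-envelope arithmetic, not from the FAQ), moving this question's 20 PB over a network link would take:

```python
# Back-of-the-envelope transfer times for the 20 PB HDFS dataset (illustrative only).
PB = 10**15  # decimal petabyte, in bytes
DATASET_BYTES = 20 * PB

def transfer_days(link_gbps: float, utilization: float = 0.8) -> float:
    """Days needed to push the dataset over a link at a sustained utilization."""
    bytes_per_second = link_gbps * 1e9 / 8 * utilization
    return DATASET_BYTES / bytes_per_second / 86_400

for gbps in (1, 10, 100):
    print(f"{gbps:>3} Gbps: about {transfer_days(gbps):,.0f} days")
# prints roughly 2,315 / 231 / 23 days for 1 / 10 / 100 Gbps
```

Even a dedicated 10 Gbps Direct Connect link at 80% sustained utilization would take the better part of a year, which is why option D loses on downtime and the physical-transfer options win for a dataset this size.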

Comments

MGM
Highly Voted 3 years, 9 months ago
A. The AWS Snow Family FAQ (quoted in the suggested answer above) recommends Snowmobile for datasets of 10 PB or more in a single location, and Snowball for smaller or distributed datasets or when backbone bandwidth is limited.
upvoted 29 times
...
Moon
Highly Voted 3 years, 9 months ago
I support answer "A". Snowmobile is used for PBs of data; Snowball can't support that (so A or B). Of those, A is more cost-effective.
upvoted 12 times
Moon
3 years, 9 months ago
Snowball Edge supports 100 TB per device, so you would need 100 of them to reach 10 PB. Better to use Snowmobile.
upvoted 2 times
[Removed]
3 years, 8 months ago
Even less than that: Snowball Edge has 83 TB of usable disk space.
upvoted 2 times
...
...
...
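Following up on the device-count thread above, a quick ceiling-division check (my own arithmetic, assuming the 83 TB usable capacity per Snowball Edge quoted above):

```python
# How many Snowball Edge devices would these datasets need at 83 TB usable each?
TB = 10**12
PB = 10**15
USABLE_PER_DEVICE = 83 * TB  # usable capacity cited in the thread above

def devices_needed(total_bytes: int) -> int:
    """Ceiling division: number of devices needed to hold total_bytes."""
    return -(-total_bytes // USABLE_PER_DEVICE)

print(devices_needed(10 * PB))  # the FAQ's 10 PB threshold -> 121 devices
print(devices_needed(20 * PB))  # this question's dataset -> 241 devices
```

Over two hundred devices to ship, load, and track for 20 PB, which is the practical reason the FAQ draws the Snowmobile line at 10 PB.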
SkyZeroZx
Most Recent 2 years ago
Selected Answer: A
A. Per the AWS Snow Family FAQ (quoted in the suggested answer above): use Snowmobile for 10 PB or more in a single location; use Snowball for datasets under 10 PB, for data spread across multiple locations, or when backbone bandwidth is limited.
upvoted 1 times
...
Student1950
2 years, 11 months ago
Never mind: we can do auto scaling with Spot Instance pooling, as this link shows. It should be A. https://aws.amazon.com/getting-started/hands-on/ec2-auto-scaling-spot-instances/
upvoted 1 times
...
Student1950
2 years, 11 months ago
With A, can we apply auto scaling to Spot Instances? If not, I believe it should be B, since A minimizes costs using Reserved Instances for master and core nodes and Spot Instances for task nodes, and auto scales task nodes based on Amazon CloudWatch metrics.
upvoted 1 times
...
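For what it's worth, the task-node auto scaling that option A describes is EMR instance-group automatic scaling, which is driven by CloudWatch alarms. A minimal policy sketch, assuming the rule shape of the EMR `PutAutoScalingPolicy` API; the cluster and instance-group IDs in the trailing comment are hypothetical placeholders:

```python
# Sketch of an EMR automatic scaling policy for a Spot task instance group.
# Field names follow the EMR PutAutoScalingPolicy API shape; the capacity
# limits and threshold values below are illustrative, not recommendations.
scaling_policy = {
    "Constraints": {"MinCapacity": 2, "MaxCapacity": 50},
    "Rules": [
        {
            "Name": "ScaleOutOnLowYarnMemory",
            "Action": {
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": 4,   # add 4 task nodes per trigger
                    "CoolDown": 300,
                }
            },
            "Trigger": {
                # Scale out when available YARN memory drops below 15%.
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "LESS_THAN",
                    "MetricName": "YARNMemoryAvailablePercentage",
                    "Namespace": "AWS/ElasticMapReduce",
                    "Period": 300,
                    "Statistic": "AVERAGE",
                    "Threshold": 15.0,
                    "Unit": "PERCENT",
                }
            },
        },
    ],
}

# To attach it (requires AWS credentials and a real cluster):
# import boto3
# emr = boto3.client("emr")
# emr.put_auto_scaling_policy(ClusterId="j-XXXXXXXX", InstanceGroupId="ig-XXXXXXXX",
#                             AutoScalingPolicy=scaling_policy)
print(scaling_policy["Rules"][0]["Name"])
```

Because the policy targets only the task instance group, Spot interruptions cost capacity but never HDFS blocks, which live on the core nodes; that is what makes the Reserved-core/Spot-task split in option A safe.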
Anhdd
3 years ago
Selected Answer: A
A for sure, no doubt
upvoted 1 times
...
CGJoon
3 years, 4 months ago
The question says the current cluster runs 24/7. Doesn't that mean that using Spot Instances for task nodes in option B might not give you round-the-clock availability? In that case, wouldn't the correct answer be option A?
upvoted 1 times
...
cldy
3 years, 5 months ago
A. Snowmobile for PB data.
upvoted 1 times
...
Ni_yot
3 years, 6 months ago
A for me. Snowmobile supports PBs of data
upvoted 1 times
Ni_yot
3 years, 6 months ago
You also want to use spot instances for batch jobs
upvoted 1 times
...
...
AzureDP900
3 years, 6 months ago
A is right
upvoted 1 times
...
WhyIronMan
3 years, 7 months ago
I'll go with A
upvoted 2 times
...
Waiweng
3 years, 7 months ago
it's A
upvoted 3 times
...
Kian1
3 years, 7 months ago
going with A
upvoted 1 times
...
Ebi
3 years, 7 months ago
Answer is A, not C; Snowmobile is for datasets of 10 PB or more.
upvoted 3 times
...
Ashodwbi
3 years, 7 months ago
Guys, A and C are the same answer.
upvoted 1 times
Justu
3 years, 7 months ago
Snowmobile is not the same as Snowball! 10 PB or more of data -> use Snowmobile -> A
upvoted 2 times
...
...
consultsk
3 years, 8 months ago
I am not sure if anyone noticed: A and C have the same verbiage, word for word. I am not sure about the arguments made here. A is correct, and arguably C as well. :) A or C.
upvoted 1 times
consultsk
3 years, 8 months ago
Sorry, my misunderstanding. A uses Snowmobile, C uses Snowball; apart from that, they are the same. Only A is correct.
upvoted 1 times
...
...
petebear55
3 years, 8 months ago
I was initially drawn to C; however, having read this, it is clearly A.
upvoted 2 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other