Exam AWS Certified Machine Learning - Specialty topic 1 question 195 discussion

Exam question from Amazon's AWS Certified Machine Learning - Specialty

Question #: 195
Topic #: 1

A geospatial analysis company processes thousands of new satellite images each day to produce vessel detection data for commercial shipping. The company stores the training data in Amazon S3. The training data incrementally increases in size with new images each day.

The company has configured an Amazon SageMaker training job to use a single ml.p2.xlarge instance with File input mode to train the built-in Object Detection algorithm. The training process was successful last month but is now failing because of a lack of storage. Aside from the addition of training data, nothing has changed in the model training process.

A machine learning (ML) specialist needs to change the training configuration to fix the problem. The solution must optimize performance and must minimize the cost of training.

Which solution will meet these requirements?

A. Modify the training configuration to use two ml.p2.xlarge instances.
B. Modify the training configuration to use Pipe input mode.
C. Modify the training configuration to use a single ml.p3.2xlarge instance.
D. Modify the training configuration to use Amazon Elastic File System (Amazon EFS) instead of Amazon S3 to store the input training data.

Suggested Answer: B

by dunhill at Nov. 28, 2022, 4:45 p.m.

Comments

tsangckl

Highly Voted 1 year, 8 months ago

Selected Answer: B

Agreed with B https://aws.amazon.com/blogs/machine-learning/using-pipe-input-mode-for-amazon-sagemaker-algorithms/

upvoted 10 times

...

Mickey321

Most Recent 12 months ago

Selected Answer: B

Pipe input mode.

upvoted 1 times

...

oso0348

1 year, 4 months ago

Selected Answer: B

The correct solution would be to modify the training configuration to use Pipe input mode. This will allow the training data to stream directly into the training instance as it is being consumed, rather than first being downloaded from S3 into the instance's local storage. This can help reduce storage requirements and optimize performance, while also minimizing costs. Using more or larger instances may help with processing power, but it will not address the storage issue, and may even increase costs unnecessarily. Using Amazon EFS may also be an option, but it may come with additional costs and operational overhead.
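
As a rough illustration of the change described above, the request sent to SageMaker's CreateTrainingJob API could switch `TrainingInputMode` from `File` to `Pipe` while keeping the same single ml.p2.xlarge instance. The sketch below only builds the request dictionary and does not call AWS. The job name, role ARN, bucket, image URI placeholder, and volume size are hypothetical, hyperparameters are omitted, and it assumes the training images have been packed into RecordIO, the content type the built-in Object Detection algorithm accepts in Pipe mode.

```python
def build_training_request(input_mode: str = "Pipe") -> dict:
    """Build a CreateTrainingJob request body. All names below are placeholders."""
    if input_mode not in ("File", "Pipe"):
        raise ValueError(f"unsupported input mode: {input_mode}")
    return {
        "TrainingJobName": "vessel-detection-example",  # hypothetical
        "RoleArn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # hypothetical
        "AlgorithmSpecification": {
            "TrainingImage": "<object-detection-image-uri>",  # region specific, left unfilled
            "TrainingInputMode": input_mode,  # the only setting that changes
        },
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "ContentType": "application/x-recordio",  # assumes RecordIO input
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": "s3://example-bucket/train/",  # hypothetical
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/output/"},  # hypothetical
        "ResourceConfig": {
            "InstanceType": "ml.p2.xlarge",  # unchanged from the question
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,  # hypothetical size
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 86400},
    }


request = build_training_request("Pipe")
print(request["AlgorithmSpecification"]["TrainingInputMode"])
```

With real values filled in, a dictionary like this could be passed as `boto3.client("sagemaker").create_training_job(**request)`.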

upvoted 2 times

ZSun

1 year, 3 months ago

The explanation of EFS is not quite right. The default storage for a SageMaker training instance is its attached EBS volume, not EFS. In File mode, the data is first copied from S3 to that volume, and only then is the model fit. So "failing because of a lack of storage" indicates that the volume can no longer hold all of the S3 data. Pipe mode avoids this by streaming the data from S3 to the training algorithm instead of copying it all up front.
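
The storage limit described here can be put into simple arithmetic: in File mode the whole dataset must fit on the training volume, so a dataset that grows every day eventually exceeds any fixed volume size. The sketch below is only an illustration; the function name and the sizes are hypothetical and do not come from the question.

```python
import math


def days_until_volume_full(current_gb: float, daily_growth_gb: float, volume_gb: float) -> int:
    """Return the first day on which the dataset no longer fits on the volume."""
    if daily_growth_gb <= 0:
        raise ValueError("daily_growth_gb must be positive")
    if current_gb > volume_gb:
        return 0  # File mode already fails today
    # Smallest whole number of days d with current + d * growth > volume.
    return math.floor((volume_gb - current_gb) / daily_growth_gb) + 1


# Hypothetical numbers: 40 GB today, 0.5 GB of new images per day, 50 GB volume.
print(days_until_volume_full(40.0, 0.5, 50.0))
```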

upvoted 2 times

...

Peeking

1 year, 8 months ago

Selected Answer: B

Pipe mode solves the problem without incurring extra storage cost. Data is streamed directly to the training algorithm without the need to be stored in the EBS volume.
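
The difference between the two input modes can be shown with a toy model. This is not how SageMaker is implemented: an in-memory byte stream stands in for S3, and both function names are made up for the illustration. It only shows that downloading first needs room for the whole dataset, while streaming needs room for one chunk at a time.

```python
import io


def file_mode_peak_bytes(source: io.BytesIO) -> int:
    """Copy the whole dataset locally before training; peak usage is its full size."""
    local_copy = source.read()
    return len(local_copy)


def pipe_mode_peak_bytes(source: io.BytesIO, chunk_size: int = 1024) -> int:
    """Consume the dataset chunk by chunk; peak usage is the largest single chunk."""
    peak = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        peak = max(peak, len(chunk))  # a real trainer would consume the chunk here
    return peak


dataset = bytes(10_000)  # stand-in for training data stored in S3
file_peak = file_mode_peak_bytes(io.BytesIO(dataset))
pipe_peak = pipe_mode_peak_bytes(io.BytesIO(dataset), chunk_size=1024)
print(file_peak, pipe_peak)  # 10000 1024
```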

upvoted 3 times

...

dunhill

1 year, 8 months ago

I think the answer is B. D is incorrect because EFS is more expensive than S3. Scaling up or out does not appear to help with the storage issue, so A and C do not help either.

upvoted 3 times

...