Exam AWS Certified Machine Learning - Specialty topic 1 question 113 discussion

A Machine Learning Specialist is designing a scalable data storage solution for Amazon SageMaker. There is an existing TensorFlow-based model implemented as a train.py script that relies on static training data that is currently stored as TFRecords.
Which method of providing training data to Amazon SageMaker would meet the business requirements with the LEAST development overhead?

  • A. Use Amazon SageMaker script mode and use train.py unchanged. Point the Amazon SageMaker training invocation to the local path of the data without reformatting the training data.
  • B. Use Amazon SageMaker script mode and use train.py unchanged. Put the TFRecord data into an Amazon S3 bucket. Point the Amazon SageMaker training invocation to the S3 bucket without reformatting the training data.
  • C. Rewrite the train.py script to add a section that converts TFRecords to protobuf and ingests the protobuf data instead of TFRecords.
  • D. Prepare the data in the format accepted by Amazon SageMaker. Use AWS Glue or AWS Lambda to reformat and store the data in an Amazon S3 bucket.
Suggested Answer: B
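For illustration, answer B in practice is a short script-mode invocation. A minimal sketch, assuming a hypothetical bucket, role ARN, and framework versions (none of these values come from the question):

```python
from sagemaker.tensorflow import TensorFlow

# Script mode: train.py runs unchanged inside the managed TensorFlow
# container. Role, instance type, and versions below are placeholders.
estimator = TensorFlow(
    entry_point="train.py",
    role="arn:aws:iam::111122223333:role/SageMakerRole",  # hypothetical
    instance_count=1,
    instance_type="ml.m5.xlarge",
    framework_version="2.11",
    py_version="py39",
)

# Point the training invocation at the S3 bucket holding the
# TFRecords; the data is uploaded as-is, with no reformatting.
estimator.fit({"training": "s3://my-bucket/tfrecords/"})
```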

Comments

[Removed]
Highly Voted 3 years, 6 months ago
I would select B. Based on the following AWS documentation it appears this is the right approach: https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html https://github.com/aws-samples/amazon-sagemaker-script-mode/blob/master/tf-horovod-inference-pipeline/train.py
upvoted 21 times
SophieSu
Highly Voted 3 years, 6 months ago
B is my answer. Reading the data:
filenames = ["s3://bucketname/path/to/file1.tfrecord", "s3://bucketname/path/to/file2.tfrecord"]
dataset = tf.data.TFRecordDataset(filenames)
upvoted 12 times
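Rather than hard-coding S3 paths as above, a script-mode job can also read the files from the local directory SageMaker populates for each channel. A minimal sketch, assuming a channel named "training" passed to fit() and files ending in .tfrecord:

```python
import os

import tensorflow as tf

# In script mode (File mode), SageMaker downloads each S3 channel and
# exposes its local path via an environment variable.
train_dir = os.environ.get("SM_CHANNEL_TRAINING", ".")
filenames = tf.io.gfile.glob(os.path.join(train_dir, "*.tfrecord"))
dataset = tf.data.TFRecordDataset(filenames)
```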
MultiCloudIronMan
Most Recent 7 months, 1 week ago
Selected Answer: B
This approach leverages the existing TFRecord format and minimizes changes to the current setup, ensuring a smooth transition to using Amazon SageMaker with minimal development effort.
upvoted 1 times
Mickey321
1 year, 8 months ago
Selected Answer: B
Option B.
upvoted 1 times
kaike_reis
1 year, 9 months ago
Selected Answer: B
Options C and D require code development and are therefore discarded. Since we want scalable data storage, option B is the recommendation, because S3 is scalable. Option A is wrong because a local machine is not scalable.
upvoted 2 times
ashton777
1 year, 10 months ago
Where has Capslock Donald gone? I kind of miss his answers.
upvoted 8 times
himanshu10k
2 years ago
Internet connectivity issue: then how can IoT be a solution? (The correct answer should be A.)
upvoted 2 times
AjoseO
2 years, 1 month ago
Selected Answer: B
Amazon SageMaker script mode enables training a machine learning model using a script that you provide. By using the unchanged train.py script and putting the TFRecord data into an Amazon S3 bucket, you can easily point the Amazon SageMaker training invocation to the S3 bucket without reformatting the training data. This option avoids the need to rewrite the train.py script or to prepare the data in a different format. It also leverages the scalability and cost-effectiveness of Amazon S3 for storing large amounts of data, which is important for training machine learning models.
upvoted 3 times
ccpmad
1 year, 9 months ago
Thank you, ChatGPT.
upvoted 1 times
apprehensive_scar
3 years, 2 months ago
B, obviously
upvoted 1 times
KM226
3 years, 4 months ago
Selected Answer: B
I like answer B
upvoted 2 times
Zhubajie
3 years, 5 months ago
Why not A? Why can't we train from a local path?
upvoted 1 times
AddiWei
3 years, 2 months ago
SageMaker, to my understanding, requires the training data to be in S3.
upvoted 5 times
Huy
3 years, 6 months ago
B. https://aws.amazon.com/about-aws/whats-new/2019/01/amazon-sagemaker-batch-transform-now-supports-tfrecord-format/
upvoted 2 times
cnethers
3 years, 6 months ago
Unfortunately, you can't use the script entirely unchanged; there are some things that need to be added:
1. Make sure your script can handle --model_dir as an additional command line argument. If you did not specify a location when you created the TensorFlow estimator, an S3 location under the default training job bucket is used. Distributed training with parameter servers requires you to use the tf.estimator.train_and_evaluate API and to provide an S3 location as the model directory during training.
2. Load input data from the input channels. The input channels are defined when fit is called. (https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/using_tf.html)
Because of these prerequisites, answers A and B are easy to disqualify. There is no need to change the training format, so option C is a red herring. The answer is D. Not the most obvious answer.
upvoted 4 times
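For what it's worth, the two additions listed above amount to only a few lines in train.py. A minimal sketch, assuming a single channel named "training" (argument names follow the linked SageMaker docs):

```python
import argparse
import os

parser = argparse.ArgumentParser()
# 1. SageMaker passes --model_dir (an S3 URI by default) to every
#    script-mode training job.
parser.add_argument("--model_dir", type=str)
# 2. Each input channel defined in fit() is downloaded locally and its
#    path is exposed as an environment variable.
parser.add_argument("--train", type=str,
                    default=os.environ.get("SM_CHANNEL_TRAINING"))
args, _ = parser.parse_known_args()
```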
SophieSu
3 years, 6 months ago
According to your explanation, the correct answer should be B.
upvoted 2 times
akgarg00
1 year, 5 months ago
It mentions using SageMaker in "script mode", which is different from working with SageMaker through the Python SDK.
upvoted 1 times
Community vote distribution: A (35%), C (25%), B (20%), Other