Exam AWS Certified Machine Learning - Specialty topic 1 question 218 discussion

A company stores its documents in Amazon S3 with no predefined product categories. A data scientist needs to build a machine learning model to categorize the documents for all the company's products.

Which solution will meet these requirements with the MOST operational efficiency?

  • A. Build a custom clustering model. Create a Dockerfile and build a Docker image. Register the Docker image in Amazon Elastic Container Registry (Amazon ECR). Use the custom image in Amazon SageMaker to generate a trained model.
  • B. Tokenize the data and transform the data into tabular data. Train an Amazon SageMaker k-means model to generate the product categories.
  • C. Train an Amazon SageMaker Neural Topic Model (NTM) model to generate the product categories.
  • D. Train an Amazon SageMaker BlazingText model to generate the product categories.
Suggested Answer: C

Comments

AjoseO
Highly Voted 2 years, 2 months ago
Selected Answer: C
C. Train an Amazon SageMaker Neural Topic Model (NTM) model to generate the product categories. The task is to build a machine learning model to categorize documents for all the company's products. Among the given options, training an Amazon SageMaker Neural Topic Model (NTM) model would be the most efficient and effective solution. An NTM model can identify topics in text data and group similar documents into specific categories, making it a suitable model for document categorization. With an NTM model, the data scientist would not need to define product categories beforehand, as the model would automatically group similar documents into topics. This saves time and resources compared to the other options.
upvoted 11 times
ccpmad
1 year, 9 months ago
thank you chatgpt
upvoted 1 times
teka112233
1 year, 8 months ago
Thanks to ChatGPT, and thanks also to AjoseO for saving us the time of looking for evidence or proof of the right answer. Ajose, you did good work bringing this clarification to us, so thank you so much. Gracias, amigo :)
upvoted 3 times
loict
Highly Voted 1 year, 8 months ago
Selected Answer: C
A. NO - no need to build a custom model
B. NO - k-means is a supervised model
C. YES - unsupervised clustering algorithm
D. NO - Blazing Text will do word embedding, not classification
upvoted 5 times
wendaz
1 year, 7 months ago
No, k-means is an unsupervised learning algorithm.
upvoted 5 times
sheetalconect
Most Recent 11 months, 1 week ago
Selected Answer: B
C. Train an Amazon SageMaker Neural Topic Model (NTM) model to generate the product categories. -- option doesn't talk about any classification activity
upvoted 1 times
Mickey321
1 year, 9 months ago
Selected Answer: C
Neural Topic Model (NTM) is one of the built-in algorithms of Amazon SageMaker that can perform topic modeling on text data. Topic modeling is a technique that can discover latent topics or themes from a collection of documents. Topic modeling can be used for document categorization by assigning each document to one or more topics based on its content.
upvoted 1 times
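To make the last step of that comment concrete, here is a minimal sketch of turning per-document topic weights into category assignments. It assumes the topic model returns one weight per topic for each document; the `assign_topics` helper, the `threshold` value, and the sample weights are all hypothetical and only illustrate the idea.

```python
def assign_topics(doc_topic_weights, threshold=0.3):
    """Map each document to every topic whose weight meets the threshold.

    If no topic meets the threshold, fall back to the single strongest topic.
    """
    assignments = []
    for weights in doc_topic_weights:
        chosen = [i for i, w in enumerate(weights) if w >= threshold]
        if not chosen:
            # Fall back to the index of the largest weight.
            chosen = [max(range(len(weights)), key=weights.__getitem__)]
        assignments.append(chosen)
    return assignments


# Hypothetical topic weights for three documents over three topics.
weights = [
    [0.80, 0.10, 0.10],
    [0.40, 0.50, 0.10],
    [0.25, 0.25, 0.50],
]
assignments = assign_topics(weights)
print(assignments)
```

A document can land in more than one topic (the second one above), which matches the "one or more topics" wording. The topics themselves are unlabeled indices, so someone still has to inspect the top words per topic and give each one a product category name.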
kaike_reis
1 year, 9 months ago
Selected Answer: C
Blazing Text is only for supervised problems.
upvoted 1 times
mawsman
2 years, 1 month ago
Selected Answer: D
Assign pre-defined categories to documents in a corpus: categorize books in a library into academic disciplines - BlazingText algorithm https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html
upvoted 2 times
mawsman
2 years, 1 month ago
Reading it again - C
upvoted 3 times
oso0348
2 years, 2 months ago
Selected Answer: B
Option C is wrong because it suggests using a Neural Topic Model (NTM) to categorize documents. While NTM can be used to discover the underlying topics in a corpus of documents, it may not be the most efficient solution for categorizing documents for specific products. NTM is more suited for unsupervised learning problems where the goal is to discover the underlying themes or topics of the document corpus. In this scenario, the data scientist needs to categorize documents based on predefined product categories. Therefore, a supervised learning algorithm like a text classification model would be more suitable. Amazon SageMaker Blazing Text algorithm provides an efficient and scalable solution for text classification problems.
upvoted 1 times
zshafi13
2 years, 2 months ago
"no predefined product categories" -> unsupervised learning, C.
upvoted 4 times
vbal
1 year, 9 months ago
what is K-means?
upvoted 1 times
kaike_reis
1 year, 9 months ago
Good catch. The problem with B is that it is an incomplete option: "Tokenize the data and transform the data into tabular data". How are you going to do this, conrad?
upvoted 2 times
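On the "how" of tokenizing and making the data tabular: one common approach is a bag-of-words matrix, with one row per document and one column per vocabulary term. The sketch below is only an illustration in plain Python; the `tokenize` and `bag_of_words` helpers and the sample documents are hypothetical, and a real pipeline would likely also remove stop words and cap the vocabulary size.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, rows), where each row holds per-term counts."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: i for i, term in enumerate(vocab)}
    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(toks).items():
            row[index[term]] = count
        rows.append(row)
    return vocab, rows


# Hypothetical documents standing in for the files stored in S3.
docs = ["Router setup guide", "Router warranty terms", "Laptop setup guide"]
vocab, rows = bag_of_words(docs)
print(vocab)
for row in rows:
    print(row)
```

Either k-means or a topic model would need a numeric representation along these lines. The point about B still stands: the option adds this preprocessing as extra work, and k-means on raw word counts tends to cluster poorly without further feature engineering.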
drcok87
2 years, 3 months ago
No predefined product category: topic modeling with NTM or LDA (organize a set of documents into topics not known in advance, e.g. tag a document as belonging to a medical category based on the terms used in the document). Predefined product category: text classification with BlazingText (e.g. categorize books in a library into academic disciplines). So C.
upvoted 1 times
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted