Exam AWS Certified Machine Learning - Specialty topic 1 question 218 discussion

Exam question from Amazon's AWS Certified Machine Learning - Specialty

Question #: 218
Topic #: 1

A company stores its documents in Amazon S3 with no predefined product categories. A data scientist needs to build a machine learning model to categorize the documents for all the company's products.

Which solution will meet these requirements with the MOST operational efficiency?

A. Build a custom clustering model. Create a Dockerfile and build a Docker image. Register the Docker image in Amazon Elastic Container Registry (Amazon ECR). Use the custom image in Amazon SageMaker to generate a trained model.
B. Tokenize the data and transform the data into tabular data. Train an Amazon SageMaker k-means model to generate the product categories.
C. Train an Amazon SageMaker Neural Topic Model (NTM) model to generate the product categories.
D. Train an Amazon SageMaker BlazingText model to generate the product categories.

Suggested Answer: C

by drcok87 at Feb. 11, 2023, 12:32 a.m.

Comments

AjoseO

Highly Voted 2 years, 6 months ago

Selected Answer: C

C. Train an Amazon SageMaker Neural Topic Model (NTM) model to generate the product categories. The task is to build a machine learning model to categorize documents for all the company's products. Among the given options, training an Amazon SageMaker Neural Topic Model (NTM) model would be the most efficient and effective solution. An NTM model can identify topics in text data and group similar documents into specific categories, making it a suitable model for document categorization. With an NTM model, the data scientist would not need to define product categories beforehand, as the model would automatically group similar documents into topics. This saves time and resources compared to the other options.

upvoted 11 times
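
For context, here is a minimal sketch of what option C could look like with the SageMaker Python SDK (v2 is assumed). The role ARN, S3 URIs, instance type, and the default of 20 topics are placeholders, not values from the question. The sketch also assumes the documents have already been converted to word-count vectors in CSV form, since NTM trains on bag-of-words input rather than raw text.

```python
def ntm_hyperparameters(vocab_size, num_topics=20):
    """Build the two required hyperparameters for the built-in NTM algorithm.

    feature_dim must match the vocabulary size of the bag-of-words input;
    num_topics is a guess at how many product categories to look for.
    """
    if vocab_size <= 0 or num_topics <= 0:
        raise ValueError("vocab_size and num_topics must be positive")
    return {"feature_dim": vocab_size, "num_topics": num_topics}


def train_ntm(role_arn, train_s3_uri, output_s3_uri, vocab_size, num_topics=20):
    """Launch a training job with the built-in NTM image (not called here)."""
    # Imported lazily so the sketch can be read and checked without AWS access.
    import sagemaker
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput

    session = sagemaker.Session()
    # Built-in algorithm image: no Dockerfile or ECR push is needed.
    image_uri = sagemaker.image_uris.retrieve("ntm", session.boto_region_name)
    estimator = Estimator(
        image_uri=image_uri,
        role=role_arn,
        instance_count=1,
        instance_type="ml.c5.xlarge",  # placeholder instance type
        output_path=output_s3_uri,
        sagemaker_session=session,
    )
    estimator.set_hyperparameters(**ntm_hyperparameters(vocab_size, num_topics))
    # label_size=0 marks the CSV as unlabeled (unsupervised training).
    train_input = TrainingInput(train_s3_uri, content_type="text/csv;label_size=0")
    estimator.fit({"train": train_input})
    return estimator
```

The point of the comparison with option A is that the built-in image replaces the custom model, Dockerfile, and ECR registration steps.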

ccpmad

2 years ago

thank you chatgpt

upvoted 1 times

teka112233

1 year, 11 months ago

Thanks to ChatGPT, and thanks also to AjoseO for saving us the time of looking for evidence or proof of the right answer. AjoseO, you did good work bringing us this clarification, so thank you so much. Thanks, my friend :)

upvoted 3 times

...

loict

Highly Voted 1 year, 11 months ago

Selected Answer: C

A. NO - no need to build a custom model
B. NO - k-means is a supervised model
C. YES - unsupervised clustering algorithm
D. NO - BlazingText will do word embedding, not classification

upvoted 5 times

wendaz

1 year, 10 months ago

No, k-means is an unsupervised learning algorithm.

upvoted 5 times

...

sheetalconect

Most Recent 1 year, 2 months ago

Selected Answer: B

C. Train an Amazon SageMaker Neural Topic Model (NTM) model to generate the product categories. -- This option doesn't mention any classification step.

upvoted 1 times

...

Mickey321

2 years ago

Selected Answer: C

Neural Topic Model (NTM) is one of the built-in algorithms of Amazon SageMaker that can perform topic modeling on text data. Topic modeling is a technique that can discover latent topics or themes from a collection of documents. Topic modeling can be used for document categorization by assigning each document to one or more topics based on its content.

upvoted 1 times
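
To make the "assign each document to one or more topics" step concrete, here is a small sketch. It assumes the model returns one list of topic weights per document under a `topic_weights` key. The sample response and the 0.5 threshold are made up for illustration, and `assign_topics` is a hypothetical helper, not part of any AWS SDK.

```python
def assign_topics(predictions, min_weight=0.0):
    """Map each document to the index of its highest-weight topic.

    Returns None for a document whose best weight is below min_weight,
    i.e. a document that does not clearly belong to any single topic.
    """
    assignments = []
    for pred in predictions:
        weights = pred["topic_weights"]
        best = max(range(len(weights)), key=weights.__getitem__)
        assignments.append(best if weights[best] >= min_weight else None)
    return assignments


# Hypothetical inference output for three documents and three topics.
response = {
    "predictions": [
        {"topic_weights": [0.80, 0.15, 0.05]},
        {"topic_weights": [0.10, 0.20, 0.70]},
        {"topic_weights": [0.34, 0.33, 0.33]},
    ]
}
labels = assign_topics(response["predictions"], min_weight=0.5)
print(labels)
```

The topic indices carry no names of their own; someone still has to look at the top words of each topic and decide which product category it corresponds to.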

...

kaike_reis

2 years ago

Selected Answer: C

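BlazingText's text classification mode is supervised, so it needs labeled documents. (Its Word2vec mode is unsupervised, but it produces word embeddings, not document categories.)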

upvoted 1 times

...

mawsman

2 years, 4 months ago

Selected Answer: D

Assign pre-defined categories to documents in a corpus: categorize books in a library into academic disciplines - BlazingText algorithm https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html

upvoted 2 times

mawsman

2 years, 4 months ago

Reading it again - C

upvoted 3 times

...

oso0348

2 years, 5 months ago

Selected Answer: B

Option C is wrong because it suggests using a Neural Topic Model (NTM) to categorize documents. While NTM can be used to discover the underlying topics in a corpus of documents, it may not be the most efficient solution for categorizing documents for specific products. NTM is more suited for unsupervised learning problems where the goal is to discover the underlying themes or topics of the document corpus. In this scenario, the data scientist needs to categorize documents based on predefined product categories. Therefore, a supervised learning algorithm like a text classification model would be more suitable. Amazon SageMaker Blazing Text algorithm provides an efficient and scalable solution for text classification problems.

upvoted 1 times

zshafi13

2 years, 5 months ago

"no predefined product categories" -> unsupervised learning, C.

upvoted 4 times

vbal

2 years ago

what is K-means?

upvoted 1 times

kaike_reis

2 years ago

Good catch. The problem with B is that it is incomplete: "Tokenize the data and transform the data into tabular data". How are you going to do this, conrad?

upvoted 2 times
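
One possible reading of "tokenize the data and transform the data into tabular data" is a plain bag-of-words matrix: one row per document, one column per vocabulary word. The sketch below uses only the standard library. The regex tokenizer, the `bag_of_words` helper, and the sample documents are illustrative assumptions, not something the question specifies.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(docs):
    """Return (vocab, rows): rows[i][j] counts vocab[j] in document i."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            row[index[tok]] = count
        rows.append(row)
    return vocab, rows


# Made-up documents standing in for the company's files in S3.
docs = ["Wireless mouse user manual", "Mouse warranty terms", "Laptop user manual"]
vocab, rows = bag_of_words(docs)
print(vocab)
for row in rows:
    print(row)
```

A real pipeline would likely also drop stop words and cap the vocabulary size, which is the extra work option B leaves unstated.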

...

drcok87

2 years, 6 months ago

No predefined product categories: topic modeling with NTM or LDA (organize a set of documents into topics not known in advance, e.g. tag a document as belonging to a medical category based on the terms used in the document).
Predefined product categories: text classification with BlazingText (e.g. categorize books in a library into academic disciplines).
So the answer is C.

upvoted 1 times

...