Exam AWS Certified Machine Learning - Specialty topic 1 question 76 discussion

A company uses a long short-term memory (LSTM) model to evaluate the risk factors of a particular energy sector. The model reviews multi-page text documents to analyze each sentence of the text and categorize it as either a potential risk or no risk. The model is not performing well, even though the Data Scientist has experimented with many different network structures and tuned the corresponding hyperparameters.
Which approach will provide the MAXIMUM performance boost?

  • A. Initialize the words by term frequency-inverse document frequency (TF-IDF) vectors pretrained on a large collection of news articles related to the energy sector.
  • B. Use gated recurrent units (GRUs) instead of LSTM and run the training process until the validation loss stops decreasing.
  • C. Reduce the learning rate and run the training process until the training loss stops decreasing.
  • D. Initialize the words by word2vec embeddings pretrained on a large collection of news articles related to the energy sector.
Suggested Answer: D
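For context on why D is the suggested answer, here is a minimal sketch of what option D looks like in practice, using gensim and tf.keras. The file name, vocabulary handling, and layer sizes are illustrative assumptions, not details from the question:

```python
# Sketch of option D: initialize an LSTM sentence classifier with word2vec
# embeddings pretrained on energy-sector news. The file name, vocabulary
# handling, and layer sizes below are illustrative assumptions.
import numpy as np
import tensorflow as tf
from gensim.models import KeyedVectors

# Load embeddings pretrained on a large energy-news corpus (assumed file).
w2v = KeyedVectors.load_word2vec_format("energy_news_w2v.bin", binary=True)
embed_dim = w2v.vector_size  # dimensionality comes from the pretrained file

# Reserve index 0 for padding and index 1 for out-of-vocabulary tokens.
word_index = {word: i + 2 for i, word in enumerate(w2v.index_to_key)}
embedding_matrix = np.zeros((len(word_index) + 2, embed_dim))
for word, i in word_index.items():
    embedding_matrix[i] = w2v[word]

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        input_dim=embedding_matrix.shape[0],
        output_dim=embed_dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        mask_zero=True,   # ignore padded positions in the LSTM
        trainable=True,   # fine-tune the embeddings on the risk-labeling task
    ),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # risk vs. no risk
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

The key point is that the embedding layer starts from domain-specific weights rather than random ones, which is the transfer-learning boost the question is after.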

Comments

jiadong
Highly Voted 3 years, 8 months ago
I think the right answer is D
upvoted 24 times
...
SophieSu
Highly Voted 3 years, 8 months ago
D is correct. C is not the best answer because the question states that tuning hyperparameters didn't help much. Transfer learning would be a better solution!
upvoted 11 times
...
xicocaio
Most Recent 8 months, 3 weeks ago
Selected Answer: D
Using word2vec embeddings would give the model more accurate representations of words at the start, potentially leading to a significant performance boost for text classification tasks.
upvoted 1 times
...
ninomfr64
12 months ago
Selected Answer: D
A. No: transfer learning helps, but word2vec > TF-IDF, since the former takes part of each word's context into account (there is a hyperparameter for this).
B. No: LSTM generally delivers better results than GRU, which is a compromise architecture that trades some accuracy for training time/cost.
C. No: hyperparameter tuning has already been applied, so this will not help.
D. Yes: transfer learning will help, and word2vec is the better option in this scenario.
upvoted 2 times
...
3eb0542
1 year, 4 months ago
Selected Answer: D
How are the 'correct' answers being provided? I'm seeing so many answers that seem to be wrong and usually, the community vote seems to be correct. This is kind of frustrating.
upvoted 3 times
...
Mickey321
1 year, 9 months ago
Selected Answer: D
Word2vec is a technique that can learn distributed representations of words, also known as word embeddings, from large amounts of text data. Word embeddings can capture the semantic and syntactic similarities and relationships between words, and can be used as input features for neural network models. Word2vec can be trained on domain-specific corpora to obtain more relevant and accurate word embeddings for a particular task (a minimal training sketch follows this comment).
upvoted 3 times
...
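To illustrate the "trained on domain-specific corpora" point in the comment above, here is a minimal gensim sketch. The toy sentences and the output file name are placeholders:

```python
# Hypothetical sketch: train word2vec on a domain corpus with gensim.
from gensim.models import Word2Vec

# In practice these would be tokenized sentences from a large collection
# of energy-sector news articles; two toy examples stand in here.
sentences = [
    ["pipeline", "leak", "triggers", "regulatory", "review"],
    ["quarterly", "output", "met", "guidance"],
]

model = Word2Vec(
    sentences,
    vector_size=300,  # embedding dimensionality
    window=5,         # context window around each target word
    min_count=1,      # keep every token in this toy corpus
    sg=1,             # skip-gram, often better for rarer domain terms
)
model.wv.save_word2vec_format("energy_news_w2v.bin", binary=True)
```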
kaike_reis
1 year, 10 months ago
Selected Answer: D
From my perspective, B and C are wrong because the DS already tried something close to this. D is correct.
upvoted 1 times
...
vbal
2 years ago
I don't think high dimensionality is taken care of by w2v; TF-IDF is required. A.
upvoted 1 times
...
Peeking
2 years, 6 months ago
Selected Answer: D
Transfer learning, in my experience, has been a good way to boost performance when hyperparameter tuning did not work.
upvoted 2 times
...
Sidekick
3 years ago
The case asks for predicting labels for sentences, so the appropriate algorithm should be Text Classification, which, just like word2vec, is part of BlazingText.
upvoted 1 times
...
julpeg
3 years, 2 months ago
Selected Answer: D
The answer should be D. My reasoning is that by using word embeddings trained on domain-specific material, the relationships between words' embeddings are more domain-specific. This means that relations (good or bad) are represented in a better way, which also means that the model should be able to predict the results more accurately.
upvoted 3 times
...
bitsplease
3 years, 4 months ago
both A & D "seem" correct, but word2vec takes ORDER of words into acc (to some extent)--while TF-IDF does not. Thus max boost is from D. B,C are wrong because the DS has tried several network architectures (aka LSTM) and hyperparameter tuning (aka option C)
upvoted 6 times
...
ahmedelbhy
3 years, 7 months ago
I think the answer is A, as the model reviews multi-page text documents.
upvoted 1 times
GiyeonShin
2 years, 5 months ago
I think that general TF-IDF vectors cannot be fed directly to the deep learning model because of the large dimensionality of the vectors.
upvoted 1 times
...
...
puffpuff
3 years, 8 months ago
I think it should be B. A/D are false flags because the question doesn't specify what kind of data engineering is currently done on the inputs as a baseline. Per Wikipedia, for GRUs, "GRUs have been shown to exhibit better performance on certain smaller and less frequent datasets", which fits the context of a particular energy sector.
upvoted 2 times
...
ChanduPatil
3 years, 8 months ago
why not B??
upvoted 1 times
GiyeonShin
2 years, 5 months ago
Generally, LSTM has better performance than GRU on large datasets such as multi-page documents. GRU has advantages in memory allocation and training time.
upvoted 1 times
...
GiyeonShin
2 years, 5 months ago
Early stopping can give the model better performance, but I think the model needs additional conditions, such as a patience value, for early stopping (see the sketch after this thread). This is because the model doesn't always reach its maximum performance at the exact point where the validation loss stops decreasing.
upvoted 1 times
...
...
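To illustrate the patience point raised in the thread above, a minimal sketch using tf.keras's EarlyStopping callback (the values are arbitrary):

```python
# Hypothetical early-stopping setup with a patience window, so training
# does not halt on the first epoch where validation loss plateaus.
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,                 # tolerate 3 non-improving epochs
    restore_best_weights=True,  # roll back to the best checkpoint seen
)
# Usage: model.fit(x_train, y_train, validation_split=0.2,
#                  callbacks=[early_stop])
```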
jkreddy
3 years, 8 months ago
It cannot be C, because hyperparameter tuning didn't work, as stated in the question. Also, A and D are similar; however, word2vec produces far more efficient representations than TF-IDF. So the answer has got to be D.
upvoted 4 times
YJ4219
3 years, 8 months ago
But they need to classify the whole sentence. I think for such a case we would use Object2Vec, not word2vec; but since it's not available in the answers, B is the only answer left.
upvoted 2 times
...
...
tmld
3 years, 8 months ago
I go for C
upvoted 2 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other.