Exam AWS Certified Machine Learning - Specialty topic 1 question 76 discussion

A company uses a long short-term memory (LSTM) model to evaluate the risk factors of a particular energy sector. The model reviews multi-page text documents to analyze each sentence of the text and categorize it as either a potential risk or no risk. The model is not performing well, even though the Data Scientist has experimented with many different network structures and tuned the corresponding hyperparameters.
Which approach will provide the MAXIMUM performance boost?

  • A. Initialize the words by term frequency-inverse document frequency (TF-IDF) vectors pretrained on a large collection of news articles related to the energy sector.
  • B. Use gated recurrent units (GRUs) instead of LSTM and run the training process until the validation loss stops decreasing.
  • C. Reduce the learning rate and run the training process until the training loss stops decreasing.
  • D. Initialize the words by word2vec embeddings pretrained on a large collection of news articles related to the energy sector.
Suggested Answer: D
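For context on why D is the suggested answer, here is a minimal sketch of what option D looks like in practice, using gensim and tf.keras. The file name, vocabulary handling, and layer sizes are illustrative assumptions, not details from the question:

```python
# Sketch of option D: initialize an LSTM sentence classifier with word2vec
# embeddings pretrained on energy-sector news. The file name, vocabulary
# handling, and layer sizes below are illustrative assumptions.
import numpy as np
import tensorflow as tf
from gensim.models import KeyedVectors

# Load embeddings pretrained on a large energy-news corpus (assumed file).
w2v = KeyedVectors.load_word2vec_format("energy_news_w2v.bin", binary=True)
embed_dim = w2v.vector_size  # dimensionality comes from the pretrained file

# Reserve index 0 for padding and index 1 for out-of-vocabulary tokens.
word_index = {word: i + 2 for i, word in enumerate(w2v.index_to_key)}
embedding_matrix = np.zeros((len(word_index) + 2, embed_dim))
for word, i in word_index.items():
    embedding_matrix[i] = w2v[word]

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        input_dim=embedding_matrix.shape[0],
        output_dim=embed_dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        mask_zero=True,   # ignore padded positions in the LSTM
        trainable=True,   # fine-tune the embeddings on the risk-labeling task
    ),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # risk vs. no risk
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

The key point is that the embedding layer starts from domain-specific weights rather than random ones, which is the transfer-learning boost the question is after.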

Comments

jiadong
Highly Voted 3 years, 8 months ago
I think the right answer is D
upvoted 24 times
...
SophieSu
Highly Voted 3 years, 8 months ago
D is correct. C is not the best answer because the question states that tuning hyperparameters didn't help much. Transfer learning would be a better solution!
upvoted 11 times
...
xicocaio
Most Recent 8 months, 3 weeks ago
Selected Answer: D
Using word2vec embeddings would give the model more accurate representations of words at the start, potentially leading to a significant performance boost for text classification tasks.
upvoted 1 times
...
ninomfr64
12 months ago
Selected Answer: D
A. No: transfer learning helps, but word2vec > TF-IDF, since the former takes part of each word's context into account (there is a hyperparameter for this).
B. No: LSTM generally delivers better results than GRU, which is a compromise architecture that trades some accuracy for training time/cost.
C. No: hyperparameter tuning has already been applied, so this will not help.
D. Yes: transfer learning will help, and word2vec is the better option in this scenario.
upvoted 2 times
...
3eb0542
1 year, 4 months ago
Selected Answer: D
How are the 'correct' answers being provided? I'm seeing so many answers that seem to be wrong and usually, the community vote seems to be correct. This is kind of frustrating.
upvoted 3 times
...
Mickey321
1 year, 9 months ago
Selected Answer: D
Word2vec is a technique that can learn distributed representations of words, also known as word embeddings, from large amounts of text data. Word embeddings can capture the semantic and syntactic similarities and relationships between words, and can be used as input features for neural network models. Word2vec can be trained on domain-specific corpora to obtain more relevant and accurate word embeddings for a particular task (a minimal training sketch follows this comment).
upvoted 3 times
...
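To illustrate the "trained on domain-specific corpora" point in the comment above, here is a minimal gensim sketch. The toy sentences and the output file name are placeholders:

```python
# Hypothetical sketch: train word2vec on a domain corpus with gensim.
from gensim.models import Word2Vec

# In practice these would be tokenized sentences from a large collection
# of energy-sector news articles; two toy examples stand in here.
sentences = [
    ["pipeline", "leak", "triggers", "regulatory", "review"],
    ["quarterly", "output", "met", "guidance"],
]

model = Word2Vec(
    sentences,
    vector_size=300,  # embedding dimensionality
    window=5,         # context window around each target word
    min_count=1,      # keep every token in this toy corpus
    sg=1,             # skip-gram, often better for rarer domain terms
)
model.wv.save_word2vec_format("energy_news_w2v.bin", binary=True)
```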
kaike_reis
1 year, 10 months ago
Selected Answer: D
From my perspective, B and C are wrong because the DS already tried something close to this. D is correct.
upvoted 1 times
...
vbal
2 years ago
I don't think high dimensionality is taken care of by w2v; TF-IDF is required. A.
upvoted 1 times
...
Peeking
2 years, 6 months ago
Selected Answer: D
Transfer learning, in my experience, has been a good way to boost performance when hyperparameter tuning did not work.
upvoted 2 times
...
Sidekick
3 years ago
The case asks for predicting labels for sentences, so the appropriate algorithm should be Text Classification, which, just like word2vec, is part of BlazingText.
upvoted 1 times
...
julpeg
3 years, 2 months ago
Selected Answer: D
The answer should be D. My reasoning is that by using word embeddings trained on domain-specific material, the relationships between words' embeddings are more domain-specific. This means that relations (good or bad) are represented in a better way, which also means that the model should be able to predict the results more accurately.
upvoted 3 times
...
bitsplease
3 years, 4 months ago
both A & D "seem" correct, but word2vec takes ORDER of words into acc (to some extent)--while TF-IDF does not. Thus max boost is from D. B,C are wrong because the DS has tried several network architectures (aka LSTM) and hyperparameter tuning (aka option C)
upvoted 6 times
...
ahmedelbhy
3 years, 7 months ago
I think the answer is A, as the model reviews multi-page text documents.
upvoted 1 times
GiyeonShin
2 years, 5 months ago
I think that general TF-IDF vectors cannot be fed directly to the deep learning model because of the large dimensionality of the vectors.
upvoted 1 times
...
...
puffpuff
3 years, 8 months ago
I think it should be B. A/D are false flags because the question doesn't specify what kind of data engineering is currently done on the inputs as a baseline. Per Wikipedia, for GRUs, "GRUs have been shown to exhibit better performance on certain smaller and less frequent datasets", which fits the context of a particular energy sector.
upvoted 2 times
...
ChanduPatil
3 years, 8 months ago
why not B??
upvoted 1 times
GiyeonShin
2 years, 5 months ago
Generally, LSTM has better performance than GRU on large datasets such as multi-page documents. GRU has advantages in memory allocation and training time.
upvoted 1 times
...
GiyeonShin
2 years, 5 months ago
Early stopping can give the model better performance, but I think the model needs additional conditions, such as a patience value, for early stopping (see the sketch after this thread). This is because the model doesn't always reach its maximum performance at the exact point where the validation loss stops decreasing.
upvoted 1 times
...
...
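To illustrate the patience point raised in the thread above, a minimal sketch using tf.keras's EarlyStopping callback (the values are arbitrary):

```python
# Hypothetical early-stopping setup with a patience window, so training
# does not halt on the first epoch where validation loss plateaus.
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,                 # tolerate 3 non-improving epochs
    restore_best_weights=True,  # roll back to the best checkpoint seen
)
# Usage: model.fit(x_train, y_train, validation_split=0.2,
#                  callbacks=[early_stop])
```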
jkreddy
3 years, 8 months ago
It cannot be C, because hyperparameter tuning didn't work, as stated in the question. Also, A and D are similar; however, word2vec produces far more efficient representations than TF-IDF. So the answer has got to be D.
upvoted 4 times
YJ4219
3 years, 8 months ago
But they need to classify the whole sentence. I think for such a case we would use Object2Vec, not word2vec; but since it's not available in the answers, B is the only answer left.
upvoted 2 times
...
...
tmld
3 years, 8 months ago
I go for C
upvoted 2 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other.