Exam AWS Certified Machine Learning - Specialty topic 1 question 81 discussion

A Data Scientist is training a multilayer perceptron (MLP) on a dataset with multiple classes. The target class of interest is unique compared to the other classes within the dataset, but it does not achieve an acceptable recall metric. The Data Scientist has already tried varying the number and size of the MLP's hidden layers, which has not significantly improved the results. A solution to improve recall must be implemented as quickly as possible.
Which techniques should be used to meet these requirements?

  • A. Gather more data using Amazon Mechanical Turk and then retrain
  • B. Train an anomaly detection model instead of an MLP
  • C. Train an XGBoost model instead of an MLP
  • D. Add class weights to the MLP's loss function and then retrain
Suggested Answer: D 🗳️

Comments

[Removed]
Highly Voted 2 years, 7 months ago
For me the answer is D: assign a higher weight to the class of interest: https://androidkt.com/set-class-weight-for-imbalance-dataset-in-keras/. More data may or may not be available, and a data labeling job would take time. (A minimal class_weight sketch follows this comment.)
upvoted 36 times
...
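A minimal Keras sketch of the class_weight approach linked in the comment above. The architecture, the dummy data, the target class index (2), and the weight values are illustrative assumptions, not details from the question.

```python
# Sketch of option D: up-weight the minority class in the loss via class_weight.
import numpy as np
from tensorflow import keras

num_features, num_classes = 20, 4

# Dummy data just so the sketch runs end to end; replace with the real dataset.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, num_features)).astype("float32")
y_train = rng.integers(0, num_classes, size=1000)

model = keras.Sequential([
    keras.Input(shape=(num_features,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Give the underrepresented target class (index 2 here) a larger weight so that
# missing it costs more during training, which tends to raise its recall.
class_weight = {0: 1.0, 1: 1.0, 2: 10.0, 3: 1.0}

model.fit(X_train, y_train, epochs=2, batch_size=32,
          class_weight=class_weight, verbose=0)
```

Only the loss weighting changes; the existing MLP and training pipeline stay as they are, which is why this is the quickest option.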
rhuanca
Highly Voted 1 year, 11 months ago
I believe it is C, because we have already made all the possible changes to the MLP's hidden layers and the results have not improved, so we must change the model; XGBoost seems the best option
upvoted 5 times
...
Mickey321
Most Recent 8 months, 3 weeks ago
Selected Answer: D
In this case, the data scientist is training a multilayer perceptron (MLP), which is a type of neural network, on a dataset with multiple classes. The target class of interest is unique compared to the other classes within the dataset, but it does not achieve an acceptable recall metric. Recall is a measure of how well the model can identify the relevant examples from the minority class. The data scientist has already tried varying the number and size of the MLP's hidden layers, which has not significantly improved the results. A solution to improve recall must be implemented as quickly as possible. (See the recall sketch after this comment.)
upvoted 2 times
...
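Since the comment above describes recall as a per-class measure, here is a small scikit-learn sketch of how it is computed; the label arrays are made-up placeholders, not data from the question.

```python
# Sketch of measuring per-class recall with scikit-learn.
from sklearn.metrics import recall_score

y_true = [0, 0, 0, 1, 1, 1, 2, 2]   # class 2 is the rare target class
y_pred = [0, 0, 0, 1, 1, 1, 0, 2]   # one of the two class-2 examples is missed

# average=None returns one recall value per class: here [1.0, 1.0, 0.5],
# so class 2 has the poor recall that option D tries to improve.
print(recall_score(y_true, y_pred, average=None))
```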
kaike_reis
9 months, 2 weeks ago
Selected Answer: D
The fastest one is D
upvoted 1 times
...
ADVIT
10 months, 2 weeks ago
"quickly as possible" mean do not change to new stuff, so it's D.
upvoted 1 times
...
kukreti18
11 months, 2 weeks ago
Not C, as the question asks for a quick solution. I go with D.
upvoted 1 times
...
vbal
11 months, 2 weeks ago
Answer C : https://towardsdatascience.com/boosting-techniques-in-python-predicting-hotel-cancellations-62b7a76ffa6c
upvoted 1 times
...
AjoseO
1 year, 3 months ago
Selected Answer: D
Adding class weights to the MLP's loss function balances the class frequencies in the cost function during training, so the optimization process focuses more on the underrepresented class, improving recall. (A sketch of computing such weights follows this comment.)
upvoted 3 times
...
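As a follow-up to the comment above, a short sketch of deriving "balanced" weights from the class frequencies with scikit-learn; the class counts below are invented for illustration. The resulting dict could be passed to Keras fit(..., class_weight=...) or used to weight a cross-entropy loss.

```python
# Sketch of computing class weights inversely proportional to class frequency.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_train = np.array([0] * 900 + [1] * 80 + [2] * 20)   # class 2 is rare

classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)

# Each weight is n_samples / (n_classes * class_count), so the rare class gets
# the largest weight: roughly {0: 0.37, 1: 4.17, 2: 16.67}.
class_weight = {int(c): w for c, w in zip(classes, weights)}
print(class_weight)
```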
Tomatoteacher
1 year, 4 months ago
Selected Answer: D
I have done this before; class weights help with imbalanced data. It is the only logical option that would help if it has not been done already. XGBoost could make a difference, but who knows; NNs and XGBoost have comparable performance. Answer D!
upvoted 4 times
...
hamuozi
1 year, 7 months ago
Selected Answer: D
In this example, it is necessary to improve recall as soon as possible, so instead of creating additional datasets, it is more effective to change the weight of each class during training.
upvoted 4 times
...
victorlifan
1 year, 8 months ago
C: the fact that the target class is distinct indicates we can simplify this to a binary classification problem; an NN is then overkill. Plus, retraining an NN is much slower than training an XGBoost model. (See the XGBoost sketch after this comment.)
upvoted 2 times
...
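For completeness, a hedged sketch of the option-C alternative described in the comment above: binarize to "target class vs. rest" and train XGBoost, using scale_pos_weight to account for the imbalance. The data, the target label (2), and the hyperparameters are assumptions for illustration.

```python
# Sketch of option C: target-vs-rest binarization plus an imbalance-aware XGBoost model.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype("float32")
y_multiclass = rng.integers(0, 4, size=1000)

# 1 for the class of interest, 0 for everything else.
y = (y_multiclass == 2).astype(int)

# Roughly negatives/positives; up-weights the positive (rare) class in the loss.
ratio = float((y == 0).sum()) / max(int((y == 1).sum()), 1)

clf = xgb.XGBClassifier(n_estimators=100, max_depth=4,
                        scale_pos_weight=ratio, eval_metric="logloss")
clf.fit(X, y)
```

Note that this route still requires writing and validating a new training pipeline, which is the main argument the D voters raise against it.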
exam_prep
1 year, 11 months ago
I feel the answer is B. The question says the target is different from the rest of the data, which is a hint for anomaly detection.
upvoted 2 times
kaike_reis
9 months, 2 weeks ago
Stop overthinking.
upvoted 1 times
...
...
KM226
2 years, 4 months ago
I believe the answer is C because we need to use hyperparameters to improve model performance.
upvoted 2 times
...
ksarda11
2 years, 6 months ago
If we want the quickest possible way, D seems fine. For XGBoost, it would take a bit of time to code it again
upvoted 4 times
...
ahquiceno
2 years, 7 months ago
For me the answer is A. Why use another model like XGBoost? The model needs more labeled data to train on and to learn more positive examples.
upvoted 2 times
SophieSu
2 years, 6 months ago
A is incorrect. Even if you hire Amazon Mechanical Turk, you won't have more data. This question is NOT asking about "labeling".
upvoted 2 times
...
...
Community vote distribution: A (35%), C (25%), B (20%), Other