Exam AWS Certified Machine Learning - Specialty topic 1 question 223 discussion

A retail company wants to create a system that can predict sales based on the price of an item. A machine learning (ML) engineer built an initial linear model that resulted in the following residual plot:



Which actions should the ML engineer take to improve the accuracy of the predictions in the next phase of model building? (Choose three.)

  • A. Downsample the data uniformly to reduce the amount of data.
  • B. Create two different models for different sections of the data.
  • C. Downsample the data in sections where Price < 50.
  • D. Offset the input data by a constant value where Price > 50.
  • E. Examine the input data, and apply non-linear data transformations where appropriate.
  • F. Use a non-linear model instead of a linear model.
Suggested Answer: BEF

Comments

loict
8 months ago
Selected Answer: BEF
A. NO - reducing the data will not produce a better model; the more the merrier :-)
B. YES - it can address non-linearity across the full price range
C. NO - same as A: reducing the data will not produce a better model
D. NO - the residual is not constant when Price > 50
E. YES - that can help make non-linear data linear
F. YES - it can capture more complex relationships
upvoted 4 times
Mickey321
8 months, 3 weeks ago
Selected Answer: BEF
Option E suggests that you examine the input data and apply non-linear data transformations where appropriate. This helps because it can reduce the non-linearity in the data and make it more suitable for a linear model. For example, you can apply a logarithmic, square root, or inverse transformation to the price variable and see whether it improves the fit. You can also use the Box-Cox transformation, which estimates a power-transformation parameter from the data (it requires positive values).
Option F suggests that you use a non-linear model instead of a linear model. This also helps because it can capture the non-linear relationship between price and sales that is evident in the residual plot.
Option B suggests that you create two different models for different sections of the data. This also helps because it can account for the different behavior of the data in different price ranges.
upvoted 1 times
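As a rough illustration of option E as described above, here is a minimal sketch that compares a straight-line fit on raw price with the same fit on log(price). The data is synthetic and invented for this example (sales = 500 - 80 * log(price)); the helper names `fit_line` and `sse` are hypothetical, and nothing here comes from the actual exam dataset.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def sse(xs, ys, a, b):
    """Sum of squared residuals of the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Synthetic data (an assumption for illustration): sales depend on log(price).
prices = [float(p) for p in range(1, 101)]
sales = [500.0 - 80.0 * math.log(p) for p in prices]

# Linear model on the raw price: the curvature is left in the residuals.
a_raw, b_raw = fit_line(prices, sales)
sse_raw = sse(prices, sales, a_raw, b_raw)

# Same linear model after a log transformation of the input.
log_prices = [math.log(p) for p in prices]
a_log, b_log = fit_line(log_prices, sales)
sse_log = sse(log_prices, sales, a_log, b_log)

print("SSE on raw price:", sse_raw)
print("SSE on log(price):", sse_log)
```

On real data the right transformation is not known in advance, so one would typically try a few candidates and compare residual plots rather than rely on a single error number.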
Peng001
10 months ago
Selected Answer: BEF
The linear model y = ax + b works well for x < 50, but for x > 50 the residual increases linearly, meaning that the slope of the underlying linear relationship changes there, i.e., y = a'x + b' with a' != a. An offset will not help. Downsampling will not help either.
upvoted 1 times
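The situation Peng001 describes, and the two-model approach of option B, can be sketched roughly as follows. The data is synthetic and chosen for illustration only (slope 2 up to Price = 50, slope 5 above it); the threshold of 50 is taken from the question, and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

THRESHOLD = 50.0

# Synthetic data (an assumption): the slope changes from 2 to 5 at Price = 50.
prices = [float(p) for p in range(1, 101)]
sales = [2.0 * p + 1.0 if p <= THRESHOLD else 5.0 * p - 149.0 for p in prices]

# One global line: the change in slope shows up as patterned residuals.
a_all, b_all = fit_line(prices, sales)
max_abs_residual_all = max(
    abs(y - (a_all * x + b_all)) for x, y in zip(prices, sales)
)

# Option B: one model per section of the data.
low = [(x, y) for x, y in zip(prices, sales) if x <= THRESHOLD]
high = [(x, y) for x, y in zip(prices, sales) if x > THRESHOLD]
a_low, b_low = fit_line([x for x, _ in low], [y for _, y in low])
a_high, b_high = fit_line([x for x, _ in high], [y for _, y in high])

def predict(price):
    """Route each price to the model fitted on its own section."""
    if price <= THRESHOLD:
        return a_low * price + b_low
    return a_high * price + b_high

max_abs_residual_piecewise = max(
    abs(y - predict(x)) for x, y in zip(prices, sales)
)

print("slopes:", a_low, a_high)
print("max |residual|, one model:", max_abs_residual_all)
print("max |residual|, two models:", max_abs_residual_piecewise)
```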
Mllb
1 year, 1 month ago
The linear model doesn't capture the data complexity
upvoted 1 times
Mllb
1 year, 1 month ago
Then, BEF
upvoted 3 times
Mllb
1 year, 1 month ago
This question appeared on the exam on 2023-April-03.
upvoted 2 times
stjokerli
1 year, 2 months ago
Selected Answer: BEF
As wolfsong said.
upvoted 4 times
Chelseajcole
1 year, 2 months ago
Selected Answer: CDE
Two models, add a constant, or input data transformation.
upvoted 1 times
Chelseajcole
1 year, 2 months ago
BDE should be the answer.
upvoted 1 times
AjoseO
1 year, 2 months ago
Selected Answer: CDE
The residual plot shows that the linear model is not fitting the data well, with a clear pattern indicating that the model is underfitting. To improve the accuracy of the predictions, the ML engineer should take the following actions:
C. Downsample the data in sections where Price < 50: This could be an option since there seems to be a higher variance in the residuals in the region where Price < 50.
D. Offset the input data by a constant value where Price > 50: This could be an option since there seems to be a systematic bias in the residuals in the region where Price > 50.
E. Examine the input data, and apply non-linear data transformations where appropriate: This is necessary since the residual plot shows that the linear model is not capturing the non-linear relationships in the data.
upvoted 1 times
GiyeonShin
1 year, 2 months ago
If D is an answer to this question, isn't B an answer too? Suppose the initial linear model is y = aX + b. Then D means y = a(X - C) + b, i.e. y = aX + b' with b' = b - aC (when Price > 50). So I think D amounts to using two different linear models for the two sections of the data (split at Price = 50).
upvoted 1 times
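The algebra in GiyeonShin's comment can be checked numerically. The sketch below assumes a baseline line with a = 2, b = 1 (imagined as fitted on Price <= 50), synthetic data with slope 5 for Price > 50, and an arbitrary offset C = 20; all of these values are invented for illustration. It suggests that offsetting the input moves every residual by the same constant a*C and leaves the slope mismatch untouched.

```python
# Baseline linear model, assumed to have been fitted on Price <= 50.
a, b = 2.0, 1.0

# Synthetic data for Price > 50 (an assumption): the true slope here is 5.
prices_high = [float(p) for p in range(51, 101)]
sales_high = [5.0 * p - 149.0 for p in prices_high]

C = 20.0  # hypothetical constant offset applied to the input (option D)

# Residuals of the baseline model, before and after offsetting the input.
res_before = [y - (a * x + b) for x, y in zip(prices_high, sales_high)]
res_after = [y - (a * (x - C) + b) for x, y in zip(prices_high, sales_high)]

# The offset changes every residual by the same amount, a * C.
shifts = [ra - rb for ra, rb in zip(res_after, res_before)]

# The spread of the residuals (the upward trend) is unchanged.
spread_before = max(res_before) - min(res_before)
spread_after = max(res_after) - min(res_after)

print("shift per point:", shifts[0], "expected a*C =", a * C)
print("residual spread before/after:", spread_before, spread_after)
```

In other words, an offset only changes the intercept for Price > 50; fixing the trend in the residuals needs a different slope in that region.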
wolfsong
1 year, 2 months ago
A good residual plot is a flat line at y = 0. So...
- Not sure if D is right. If you offset by a constant value, you're just moving the plot up or down. You'd have to add a term like K*Price, applied where Price > 50 and with K > 0, to flatten the curve beyond Price = 50.
- Also unsure about C. The variance looks fairly good for Price < 50, as the residuals are mostly around zero, which is what you want. The problem is the residual at Price > 50, which goes way off.
I'd go with B, E & F:
E: obvious.
F: use a non-linear model instead, as it will remove the kink in the plot.
B: Not an answer I like, but if you can't use a non-linear model, you need a piecewise-linear model that separates the data in two. Something like this: https://towardsdatascience.com/piecewise-linear-regression-model-what-is-it-and-when-can-we-use-it-93286cfee452
upvoted 11 times
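wolfsong's idea of adding an extra slope term for Price > 50 can be sketched as a single regression with a hinge feature, max(0, Price - 50). Using the hinge form rather than a plain K*Price term is an assumption made here so that the fitted line stays continuous at the knot. The data is synthetic, and the small solver and helper names are hypothetical, written in plain Python to keep the example self-contained.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_least_squares(rows, ys):
    """Least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

KNOT = 50.0  # price at which the slope is allowed to change

def features(price):
    """Intercept, price, and a hinge term that is active only above KNOT."""
    return [1.0, price, max(0.0, price - KNOT)]

# Synthetic data (an assumption): slope 2 up to Price = 50, slope 5 above.
prices = [float(p) for p in range(1, 101)]
sales = [2.0 * p + 1.0 if p <= KNOT else 5.0 * p - 149.0 for p in prices]

b, a, k = fit_least_squares([features(p) for p in prices], sales)
residuals = [
    y - (b + a * p + k * max(0.0, p - KNOT)) for p, y in zip(prices, sales)
]

print("intercept, slope, extra slope above knot:", b, a, k)
print("max |residual|:", max(abs(r) for r in residuals))
```

Here k plays the role of wolfsong's K: it is the additional slope that applies only beyond the knot, so the residuals flatten out on both sides.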