Exam AWS Certified Machine Learning - Specialty topic 1 question 236 discussion

An online store is predicting future book sales by using a linear regression model that is based on past sales data. The data includes duration, a numerical feature that represents the number of days that a book has been listed in the online store. A data scientist performs an exploratory data analysis and discovers that the relationship between book sales and duration is skewed and non-linear.

Which data transformation step should the data scientist take to improve the predictions of the model?

  • A. One-hot encoding
  • B. Cartesian product transformation
  • C. Quantile binning
  • D. Normalization
Suggested Answer: C

Comments

CloudHandsOn
Highly Voted 9 months, 3 weeks ago
Selected Answer: C
C. Quantile binning: Quantile binning (a form of discretization) divides a continuous variable into bins based on quantiles, so each bin holds roughly the same number of observations. This helps with skewed data because the values are spread evenly across the bins. However, it turns the numerical feature into a categorical one, which might not be ideal for preserving the ordinal nature and the detailed variance of the 'duration' feature in a regression model. If the choice must be made from the given options, Option C is the most suitable, albeit not ideal. Outside the given options, the data scientist should also consider logarithmic or polynomial transformations as a more direct way to address the non-linearity.
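To make the "more evenly across the bins" point concrete, here is a minimal sketch using pandas. The `duration` values are synthetic (drawn from an exponential distribution as a stand-in for skewed data) and are not from the question; the bin count of 5 is also an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical skewed 'duration' values (days listed); synthetic, not from the question.
duration = pd.Series(rng.exponential(scale=200.0, size=1000), name="duration")

# Quantile binning: edges are placed at quantiles, so each bin holds about the same number of rows.
quantile_bins = pd.qcut(duration, q=5, labels=False)

# Equal-width binning for contrast: edges are evenly spaced, so counts are uneven on skewed data.
width_bins = pd.cut(duration, bins=5, labels=False)

quantile_counts = quantile_bins.value_counts().sort_index()
width_counts = width_bins.value_counts().sort_index()

print("rows per quantile bin:   ", quantile_counts.tolist())
print("rows per equal-width bin:", width_counts.tolist())
```

On skewed data like this, the equal-width bins put most rows in the first bin, while the quantile bins stay balanced.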
upvoted 5 times
...
sevosevo
Highly Voted 1 year, 7 months ago
Selected Answer: C
https://docs.aws.amazon.com/machine-learning/latest/dg/data-transformations-reference.html
upvoted 5 times
...
loict
Most Recent 1 year, 1 month ago
Selected Answer: C
A. NO - One-hot encoding is for featurizing categorical values
B. NO -
C. YES - Quantile binning lets a linear model capture the non-linearity (https://docs.aws.amazon.com/machine-learning/latest/dg/data-transformations-reference.html#quantile-binning-transformation)
D. NO - Normalization rescales and recenters the data; it does not change the shape of the relationship
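The point about normalization can be checked with a small sketch. The data below is synthetic, and the logarithmic link between `duration` and `sales` is an assumption made only for illustration. Because z-score normalization is a linear rescaling, both the correlation with the target and the skewness of the feature should come out unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: skewed duration and an assumed non-linear (log) relationship to sales.
duration = rng.exponential(scale=200.0, size=2000)
sales = 50.0 * np.log1p(duration) + rng.normal(0.0, 5.0, size=2000)

def skewness(x):
    """Sample skewness: mean of the cubed z-scores."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

# Z-score normalization: subtract the mean, divide by the standard deviation.
normalized = (duration - duration.mean()) / duration.std()

corr_raw = np.corrcoef(duration, sales)[0, 1]
corr_norm = np.corrcoef(normalized, sales)[0, 1]

print("correlation with sales (raw, normalized):", corr_raw, corr_norm)
print("skewness (raw, normalized):", skewness(duration), skewness(normalized))
```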
upvoted 2 times
...
Mickey321
1 year, 2 months ago
Selected Answer: C
quantile binning
upvoted 1 times
...
jackzhao
1 year, 7 months ago
C is correct
upvoted 3 times
...
blanco750
1 year, 7 months ago
Selected Answer: C
C is the best answer I guess
upvoted 3 times
...
oso0348
1 year, 7 months ago
Selected Answer: C
The correct answer is C, quantile binning. This transformation splits the values of the feature (in this case, duration) into bins that each hold roughly the same number of observations, rather than into equal-width intervals, and replaces each value with its bin. This can help capture a non-linear relationship: each bin becomes a category with its own weight, and quantile-based edges keep every bin well populated even when the data is skewed. The binned feature can then be fed to the linear regression model to better predict future book sales.
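A rough sketch of why binning can help a linear model, using only NumPy. Everything here is an assumption for illustration: the data is synthetic, the log-shaped relationship is invented, and 10 bins is an arbitrary choice. The one-hot bin columns give the least-squares fit a separate level per bin, which approximates the curve with steps.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Hypothetical data: skewed duration with an assumed non-linear (log) effect on sales.
duration = rng.exponential(scale=200.0, size=n)
sales = 50.0 * np.log1p(duration) + rng.normal(0.0, 5.0, size=n)

def r_squared(X, y):
    """Fit ordinary least squares and return the in-sample R^2."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return 1.0 - residuals.var() / y.var()

# Baseline: linear regression on the raw feature (intercept + duration).
X_raw = np.column_stack([np.ones(n), duration])

# Quantile binning into 10 bins: interior edges at the 10%, 20%, ..., 90% quantiles.
edges = np.quantile(duration, np.linspace(0.1, 0.9, 9))
bin_index = np.digitize(duration, edges)   # integer bin id in 0..9
X_binned = np.eye(10)[bin_index]           # one-hot columns, one level per bin

r2_raw = r_squared(X_raw, sales)
r2_binned = r_squared(X_binned, sales)

print("R^2 on raw duration:     ", r2_raw)
print("R^2 on quantile-binned:  ", r2_binned)
```

On data shaped like this, the binned fit should explain noticeably more variance than the straight line, though a log transform of `duration` would fit this particular synthetic curve more directly.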
upvoted 4 times
...