Exam AWS Certified Machine Learning - Specialty topic 1 question 175 discussion

A global financial company is using machine learning to automate its loan approval process. The company has a dataset of customer information. The dataset contains some categorical fields, such as customer location by city and housing status. The dataset also includes financial fields in different units, such as account balances in US dollars and monthly interest in US cents.
The company's data scientists are using a gradient boosting regression model to infer the credit score for each customer. The model has a training accuracy of 99% and a testing accuracy of 75%. The data scientists want to improve the model's testing accuracy.
Which process will improve the testing accuracy the MOST?

  • A. Use a one-hot encoder for the categorical fields in the dataset. Perform standardization on the financial fields in the dataset. Apply L1 regularization to the data.
  • B. Use tokenization of the categorical fields in the dataset. Perform binning on the financial fields in the dataset. Remove the outliers in the data by using the z-score.
  • C. Use a label encoder for the categorical fields in the dataset. Perform L1 regularization on the financial fields in the dataset. Apply L2 regularization to the data.
  • D. Use a logarithm transformation on the categorical fields in the dataset. Perform binning on the financial fields in the dataset. Use imputation to populate missing values in the dataset.
Suggested Answer: A
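As a rough illustration of the two preprocessing steps named in option A, here is a minimal sketch in plain Python. The toy rows and the helper names `one_hot` and `standardize` are hypothetical and not part of the question; a real pipeline would more likely use a library encoder and scaler.

```python
from statistics import mean, pstdev

def one_hot(values):
    """Map each category to a 0/1 indicator vector (columns in sorted order)."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation.

    Assumes the column is not constant, otherwise the divisor would be zero.
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical toy rows: one categorical field and two financial fields
# recorded in different units (US dollars vs. US cents).
housing = ["rent", "own", "rent", "mortgage"]
balance_usd = [1200.0, 530.0, 9800.0, 40.0]
interest_cents = [310.0, 95.0, 4200.0, 12.0]

cats, housing_encoded = one_hot(housing)
balance_z = standardize(balance_usd)
interest_z = standardize(interest_cents)
```

After standardization both financial columns are unit-free, so the dollars-versus-cents difference no longer shows up in their scale.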

Comments

ckkobe24
Highly Voted 2 years, 5 months ago
Selected Answer: A
agree it's A for me
upvoted 15 times
...
spaceexplorer
Highly Voted 2 years, 6 months ago
A: The model is overfitting, so regularization is needed. The financial fields need scaling because this is a regression problem, and the categorical fields, such as city and housing status, need one-hot encoding.
upvoted 11 times
...
kyuhuck
Most Recent 8 months, 4 weeks ago
Selected Answer: A
Option A is the most likely to improve the testing accuracy the most effectively because it uses appropriate preprocessing techniques for both categorical and numerical data and applies a regularization technique that can help in reducing overfitting, thereby potentially improving the model's generalization to unseen data.
upvoted 2 times
...
DimLam
1 year ago
Selected Answer: B
I will go with B. (A) suggests applying regularization to the data. It doesn't make sense. (B) answer is well framed. At least it doesn't use the wrong formulation.
upvoted 1 times
DimLam
1 year ago
B also looks suspicious.
upvoted 1 times
...
...
Mickey321
1 year, 3 months ago
Selected Answer: A
Use a one-hot encoder for the categorical fields in the dataset. Perform standardization on the financial fields in the dataset. Apply L1 regularization to the data.
upvoted 1 times
...
Tony_1406
1 year, 6 months ago
Selected Answer: A
Agree with A, but I think the answer is slightly inaccurate. L1 regularization is applied within the model, to the loss function. As a result, some features will be removed from the model. The answer suggests that L1 regularization is applied to the dataset directly.
upvoted 1 times
...
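To make the point about where L1 regularization lives concrete, here is a minimal sketch under simplifying assumptions: a linear model with squared error stands in for the gradient boosting model, and `l1_penalized_loss`, the toy data, and the weight vectors are all hypothetical. The penalty is a term in the loss computed from the weights; the data itself is never modified. In XGBoost, the analogous knob is the `alpha` hyperparameter (an L1 term on the weights).

```python
def l1_penalized_loss(weights, features, targets, alpha):
    """Mean squared error plus alpha * sum(|w|).

    The penalty depends only on the model weights, not on the data rows.
    """
    n = len(targets)
    preds = [sum(w * x for w, x in zip(weights, row)) for row in features]
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / n
    penalty = alpha * sum(abs(w) for w in weights)
    return mse + penalty

# Hypothetical toy data: the second feature carries only a small effect.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [2.0, 0.1, 2.1]

dense = [2.0, 0.1]   # fits the training targets exactly, uses both features
sparse = [2.0, 0.0]  # drops the weak feature at a small cost in fit

for alpha in (0.0, 0.1):
    print(alpha,
          l1_penalized_loss(dense, features, targets, alpha),
          l1_penalized_loss(sparse, features, targets, alpha))
```

Without the penalty the dense weights have the lower loss; with `alpha = 0.1` the sparse weights do, which is the feature-removal effect described above.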
AjoseO
1 year, 8 months ago
Selected Answer: A
Option A is the most appropriate approach to improve the testing accuracy of the model. One-hot encoding can effectively represent categorical variables in a numeric format that is suitable for machine learning models. Standardizing the financial fields can make the data more comparable and improve the model's performance. L1 regularization can help in feature selection and avoid overfitting by reducing the number of features.
upvoted 2 times
DimLam
1 year ago
How do you apply regularization to Data and not to the model params?
upvoted 1 times
...
...
Peeking
1 year, 10 months ago
Why are so many of the answers chosen by ExamTopics obviously wrong? There is no such thing as tokenisation of a categorical variable, so B is obviously wrong.
upvoted 1 times
ccpmad
1 year, 3 months ago
When they were published (first, they steal them by photo/camera), they didn't have ChatGPT to check the answers, and of course they don't have any ML specialists or the time to work through them.
upvoted 1 times
...
...
Shailendraa
2 years, 1 month ago
12-sep exam
upvoted 4 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other