Exam AWS Certified Machine Learning - Specialty topic 1 question 175 discussion

Exam question from Amazon's AWS Certified Machine Learning - Specialty

Question #: 175
Topic #: 1

A global financial company is using machine learning to automate its loan approval process. The company has a dataset of customer information. The dataset contains some categorical fields, such as customer location by city and housing status. The dataset also includes financial fields in different units, such as account balances in US dollars and monthly interest in US cents.
The company's data scientists are using a gradient boosting regression model to infer the credit score for each customer. The model has a training accuracy of 99% and a testing accuracy of 75%. The data scientists want to improve the model's testing accuracy.
Which process will improve the testing accuracy the MOST?

A. Use a one-hot encoder for the categorical fields in the dataset. Perform standardization on the financial fields in the dataset. Apply L1 regularization to the data.
B. Use tokenization of the categorical fields in the dataset. Perform binning on the financial fields in the dataset. Remove the outliers in the data by using the z-score.
C. Use a label encoder for the categorical fields in the dataset. Perform L1 regularization on the financial fields in the dataset. Apply L2 regularization to the data.
D. Use a logarithm transformation on the categorical fields in the dataset. Perform binning on the financial fields in the dataset. Use imputation to populate missing values in the dataset.

Suggested Answer: A 🗳️

by spaceexplorer at April 30, 2022, 9:21 p.m.

Comments

ckkobe24

Highly Voted 2 years, 7 months ago

Selected Answer: A

agree it's A for me

upvoted 15 times

...

spaceexplorer

Highly Voted 2 years, 8 months ago

A: The model is overfitting, so regularization is needed. The financial fields need scaling because this is a regression problem, and the city and housing status fields should be one-hot encoded.

upvoted 11 times

...

kyuhuck

Most Recent 10 months, 3 weeks ago

Selected Answer: A

Option A is the most likely to improve the testing accuracy because it uses appropriate preprocessing techniques for both categorical and numerical data and applies a regularization technique that can reduce overfitting, thereby potentially improving the model's generalization to unseen data.

upvoted 2 times

...

DimLam

1 year, 2 months ago

Selected Answer: B

I will go with B. (A) suggests applying regularization to the data. It doesn't make sense. (B) answer is well framed. At least it doesn't use the wrong formulation.

upvoted 1 times

DimLam

1 year, 2 months ago

B also looks suspicious.

upvoted 1 times

...

Mickey321

1 year, 4 months ago

Selected Answer: A

Use a one-hot encoder for the categorical fields in the dataset. Perform standardization on the financial fields in the dataset. Apply L1 regularization to the data.

upvoted 1 times

...

Tony_1406

1 year, 8 months ago

Selected Answer: A

Agree with A, but I think the answer is slightly inaccurate. L1 regularization is applied within the model, to the loss function, and as a result some features are removed from the model. The answer suggests L1 regularization is applied to the dataset directly.

upvoted 1 times
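
To make the point about where L1 regularization lives more concrete, here is a minimal sketch in plain Python. It assumes a simple linear model with a squared-error loss rather than the gradient boosting model from the question, and the names `l1_penalized_mse` and `soft_threshold` are hypothetical helpers introduced only for illustration. The penalty is a term added to the loss over the model weights, and the shrinkage step shows how small weights can be driven to exactly zero, which is what effectively removes features.

```python
def l1_penalized_mse(weights, rows, targets, lam):
    """Mean squared error of a linear model plus an L1 penalty on the weights.

    The penalty depends only on the model parameters, not on the data itself.
    """
    n = len(rows)
    mse = sum(
        (sum(w * x for w, x in zip(weights, row)) - t) ** 2
        for row, t in zip(rows, targets)
    ) / n
    penalty = lam * sum(abs(w) for w in weights)
    return mse + penalty


def soft_threshold(w, t):
    """Shrink a weight toward zero by t; weights within [-t, t] become exactly 0."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


# Illustrative weights (made up): the small middle weight is zeroed out.
weights = [2.0, -0.05, 0.3]
shrunk = [soft_threshold(w, 0.1) for w in weights]
```

Gradient boosting libraries typically expose a comparable penalty as a model hyperparameter rather than as a data transformation, which is consistent with the objection above.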

...

AjoseO

1 year, 10 months ago

Selected Answer: A

Option A is the most appropriate approach to improve the testing accuracy of the model. One-hot encoding can effectively represent categorical variables in a numeric format that is suitable for machine learning models. Standardizing the financial fields can make the data more comparable and improve the model's performance. L1 regularization can help in feature selection and avoid overfitting by reducing the number of features.

upvoted 2 times
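
The two preprocessing steps named in option A can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: the field names and values below are made up, and `one_hot` and `standardize` are hypothetical helpers, not part of any AWS or scikit-learn API.

```python
from statistics import mean, pstdev


def one_hot(values):
    """Return the sorted category list and one 0/1 indicator row per value."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up sample rows: a categorical field and two financial fields in different units.
housing = ["rent", "own", "rent", "mortgage"]
balance_usd = [1200.0, 560.0, 9800.0, 310.0]
interest_cents = [450.0, 120.0, 3100.0, 90.0]

categories, housing_encoded = one_hot(housing)
balance_z = standardize(balance_usd)
interest_z = standardize(interest_cents)
```

After standardization, the dollar and cent columns are on the same unitless scale. In a real workflow the mean and standard deviation would normally be computed on the training split only and then reused for the test split.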

DimLam

1 year, 2 months ago

How do you apply regularization to the data and not to the model params?

upvoted 1 times

...

Peeking

2 years ago

Why are most of the answers chosen by ExamTopics so obviously wrong? There is no such thing as tokenisation of a categorical variable, so B should be obviously wrong.

upvoted 1 times

ccpmad

1 year, 4 months ago

When they were published (first, they steal them by photo/camera), they didn't have ChatGPT to check the answers, and of course they don't have any ML specialists or the time to work through them.

upvoted 1 times

...

Shailendraa

2 years, 3 months ago

12-sep exam

upvoted 4 times

...