Exam AWS Certified Machine Learning - Specialty topic 1 question 65 discussion

A Machine Learning Specialist is building a model to predict future employment rates based on a wide range of economic factors. While exploring the data, the
Specialist notices that the magnitudes of the input features vary greatly. The Specialist does not want variables with a larger magnitude to dominate the model.
What should the Specialist do to prepare the data for model training?

  • A. Apply quantile binning to group the data into categorical bins to keep any relationships in the data by replacing the magnitude with distribution.
  • B. Apply the Cartesian product transformation to create new combinations of fields that are independent of the magnitude.
  • C. Apply normalization to ensure each field will have a mean of 0 and a variance of 1 to remove any significant magnitude.
  • D. Apply the orthogonal sparse bigram (OSB) transformation to apply a fixed-size sliding window to generate new features of a similar magnitude.
Suggested Answer: C
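Option C describes rescaling each field to a mean of 0 and a variance of 1, which several commenters below call standardization (z-scoring). A minimal sketch of that transformation, using only the Python standard library; the feature names and values are made up for illustration and are not from the question:

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale a list of numbers to mean 0 and variance 1 (z-scores)."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

# Hypothetical economic features on very different scales.
gdp_usd = [1.2e12, 1.5e12, 1.9e12, 2.1e12]
interest_rate_pct = [0.5, 1.0, 2.5, 4.0]

gdp_z = standardize(gdp_usd)
rate_z = standardize(interest_rate_pct)
```

After the transformation both columns are expressed in units of their own standard deviation, so neither dominates a distance- or gradient-based model purely because of its original scale.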

Comments

rsimham
Highly Voted 2 years, 8 months ago
Ans: C; Normalization is correct
upvoted 34 times
gcpwhiz
2 years, 6 months ago
Ans is not C. What is listed there is the definition of STANDARDIZATION. Normalization just scales and is not useful for reducing the effect of outliers
upvoted 4 times
gcpwhiz
2 years, 6 months ago
nevermind ignore this
upvoted 4 times
Phong
Highly Voted 2 years, 7 months ago
Guys, I passed the exam today. It is a tough one, but many of the questions are here. Good luck, everyone! Thanks, ExamTopics.
upvoted 14 times
haison8x
2 years, 6 months ago
Hi Phong! Please add my skype: haison8x
upvoted 2 times
Mickey321
Most Recent 8 months, 3 weeks ago
Selected Answer: C
Ans: C; Normalization is correct
upvoted 2 times
kaike_reis
9 months, 3 weeks ago
C (Yep, STANDARDIZATION is the correct name) That's an odd question for me
upvoted 1 times
OssamaAbdelatif
1 year, 5 months ago
Selected Answer: C
ans C is correct.
upvoted 1 times
Deepsachin
2 years, 6 months ago
Ans should be C, as normalization works best when the features differ in magnitude.
upvoted 1 times
grandgale
2 years, 7 months ago
Hi, guys. First, thanks to this website for the information it provides. However, the ML exam has updated most of the questions; only 20+ of the questions here were included in today's test. Anyway, it is still helpful. GOOD LUCK EVERYONE!
upvoted 10 times
joker34
2 years, 7 months ago
So there are 40+ other questions on the exam that aren't included in Examtopics?
upvoted 2 times
nez15
2 years, 7 months ago
QUESTION 69
A large consumer goods manufacturer has the following products on sale:
  • 34 different toothpaste variants
  • 48 different toothbrush variants
  • 43 different mouthwash variants
The entire sales history of all these products is available in Amazon S3. Currently, the company is using custom-built autoregressive integrated moving average (ARIMA) models to forecast demand for these products. The company wants to predict the demand for a new product that will soon be launched. Which solution should a Machine Learning Specialist apply?
  • A. Train a custom ARIMA model to forecast demand for the new product.
  • B. Train an Amazon SageMaker DeepAR algorithm to forecast demand for the new product.
  • C. Train an Amazon SageMaker k-means clustering algorithm to forecast demand for the new product.
  • D. Train a custom XGBoost model to forecast demand for the new product.
Correct Answer: B
upvoted 4 times
VB
2 years, 6 months ago
https://aws.amazon.com/blogs/machine-learning/forecasting-time-series-with-dynamic-deep-learning-on-aws/ Answer: B
upvoted 1 times
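For context on answer B: DeepAR trains one model across many related time series, which is why it is suggested for a product with no history of its own. A rough sketch of how the existing sales histories might be written as JSON Lines training records with the `start`, `target`, and `cat` fields that DeepAR accepts; the category IDs, dates, and demand values are invented for illustration:

```python
import json

# Hypothetical mapping from product type to an integer category ID.
CATEGORY_IDS = {"toothpaste": 0, "toothbrush": 1, "mouthwash": 2}

def to_deepar_record(start, demand, category_id):
    """Build one training record: a start timestamp, the observed
    demand values, and a categorical feature for the product type."""
    return {"start": start, "target": demand, "cat": [category_id]}

# Made-up sales histories, one entry per existing product variant.
sales_history = [
    ("toothpaste", "2020-01-01 00:00:00", [120, 135, 128, 140]),
    ("toothbrush", "2020-01-01 00:00:00", [80, 82, 79, 90]),
    ("mouthwash", "2020-01-01 00:00:00", [45, 50, 48, 55]),
]

lines = [
    json.dumps(to_deepar_record(start, demand, CATEGORY_IDS[kind]))
    for kind, start, demand in sales_history
]

# One JSON object per line, as expected for a JSON Lines training file.
training_file_contents = "\n".join(lines)
```

The categorical feature is what would let a model relate the new product to the existing variants of the same type; how well that works depends on the data and is not something this sketch demonstrates.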
nez15
2 years, 7 months ago
QUESTION 68
An agency collects census information within a country to determine healthcare and social program needs by province and city. The census form collects responses for approximately 500 questions from each citizen. Which combination of algorithms would provide the appropriate insights? (Select TWO.)
  • A. The factorization machines (FM) algorithm
  • B. The Latent Dirichlet Allocation (LDA) algorithm
  • C. The principal component analysis (PCA) algorithm
  • D. The k-means algorithm
  • E. The Random Cut Forest (RCF) algorithm
Correct Answer: CD
upvoted 5 times
VB
2 years, 6 months ago
https://aws.amazon.com/blogs/machine-learning/analyze-us-census-data-for-population-segmentation-using-amazon-sagemaker/ Answer: C and D
upvoted 4 times
cybe001
2 years, 7 months ago
I think the answer is A and B. The census question and answer will be in text. Use LDA (unsupervised algorithm) which takes the census question/answer and groups them into categories. Use the categorization to group the people and identify similar people. Use the Factorization Machine to group the people. For each person identify if they answer a question or not. Find the total questions they answered and that will be the Target variable. Now the problem is similar to movie recommendation (consider each question a movie and the total number of questions answered will be the Rating). Based on the questions a Person answered, Factorization Machine groups the people. Findings from both the algorithms can be used to compare and identify the people for the social programs.
upvoted 2 times
kaike_reis
9 months, 3 weeks ago
it's CD
upvoted 1 times
jasonsunbao
2 years, 7 months ago
FM is mainly used in recommendation systems to find hidden (latent) variables that capture the correlation between two known variables.
upvoted 1 times
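To illustrate the C and D combination discussed in this thread: PCA compresses the roughly 500 answers per citizen into a few components, and k-means then groups citizens in that reduced space. The following is a toy sketch only. It uses four made-up features instead of 500, keeps a single principal component found by power iteration, and runs a plain one-dimensional k-means; it is not the SageMaker built-in implementation of either algorithm.

```python
def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def first_principal_component(rows, iterations=100):
    """Power iteration on the covariance matrix of centered rows."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalars (k >= 2), centroids spread from min to max."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(x - centroids[c]))
                  for x in values]
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = sum(members) / len(members)
    return labels

# Hypothetical numeric census responses: two visibly different groups.
responses = [
    [1.0, 1.2, 0.9, 1.1],
    [0.8, 1.0, 1.1, 0.9],
    [1.2, 0.9, 1.0, 1.0],
    [5.0, 5.1, 4.9, 5.2],
    [5.2, 4.9, 5.1, 5.0],
    [4.8, 5.0, 5.0, 4.9],
]

centered = center(responses)
component = first_principal_component(centered)
# Project each respondent onto the first principal component.
scores = [sum(a * b for a, b in zip(row, component)) for row in centered]
labels = kmeans_1d(scores, k=2)
```

With real census data one would keep more than one component and cluster in that multi-dimensional space; the single-component version is used here only to keep the sketch short.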
nez15
2 years, 7 months ago
QUESTION 67 (answer options)
  • A. Use AWS Lambda to trigger an AWS Step Functions workflow to wait for dataset uploads to complete in Amazon S3. Use AWS Glue to join the datasets. Use an Amazon CloudWatch alarm to send an SNS notification to the Administrator in the case of a failure.
  • B. Develop the ETL workflow using AWS Lambda to start an Amazon SageMaker notebook instance. Use a lifecycle configuration script to join the datasets and persist the results in Amazon S3. Use an Amazon CloudWatch alarm to send an SNS notification to the Administrator in the case of a failure.
  • C. Develop the ETL workflow using AWS Batch to trigger the start of ETL jobs when data is uploaded to Amazon S3. Use AWS Glue to join the datasets in Amazon S3. Use an Amazon CloudWatch alarm to send an SNS notification to the Administrator in the case of a failure.
  • D. Use AWS Lambda to chain other Lambda functions to read and join the datasets in Amazon S3 as soon as the data is uploaded to Amazon S3. Use an Amazon CloudWatch alarm to send an SNS notification to the Administrator in the case of a failure.
Correct Answer: A
upvoted 6 times
nez15
2 years, 7 months ago
QUESTION 67
A Machine Learning Specialist is developing a daily ETL workflow containing multiple ETL jobs. The workflow consists of the following processes:
  • Start the workflow as soon as data is uploaded to Amazon S3.
  • When all the datasets are available in Amazon S3, start an ETL job to join the uploaded datasets with multiple terabyte-sized datasets already stored in Amazon S3.
  • Store the results of joining datasets in Amazon S3.
  • If one of the jobs fails, send a notification to the Administrator.
Which configuration will meet these requirements?
upvoted 3 times
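One way the Step Functions workflow from option A of QUESTION 67 could look, written as an Amazon States Language definition built in Python. This is a sketch under assumptions: the Lambda function name `check-uploads`, its `complete` output field, the Glue job name `join-datasets`, the 300-second polling interval, and the `REGION` and `ACCOUNT_ID` placeholders are all hypothetical. The failure notification is left out because the option handles it separately through a CloudWatch alarm.

```python
import json

state_machine = {
    "Comment": "Wait for all dataset uploads, then run the Glue join job",
    "StartAt": "CheckUploads",
    "States": {
        # Hypothetical Lambda that reports whether every dataset is in S3.
        "CheckUploads": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:check-uploads",
            "ResultPath": "$.uploads",
            "Next": "AllUploaded",
        },
        # Branch on the (assumed) boolean returned by the Lambda.
        "AllUploaded": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.uploads.complete",
                    "BooleanEquals": True,
                    "Next": "JoinDatasets",
                }
            ],
            "Default": "WaitForUploads",
        },
        # Poll again after a pause while uploads are still in progress.
        "WaitForUploads": {
            "Type": "Wait",
            "Seconds": 300,
            "Next": "CheckUploads",
        },
        # Run the Glue job and wait for it to finish (.sync integration).
        "JoinDatasets": {
            "Type": "Task",
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": "join-datasets"},
            "End": True,
        },
    },
}

definition_json = json.dumps(state_machine, indent=2)
```

The resulting `definition_json` string is what would be supplied as the state machine definition; the Lambda that starts the execution on S3 upload is not shown.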
nez15
2 years, 8 months ago
QUESTION 66
A Machine Learning Specialist must build out a process to query a dataset on Amazon S3 using Amazon Athena. The dataset contains more than 800,000 records stored as plaintext CSV files. Each record contains 200 columns and is approximately 1.5 MB in size. Most queries will span 5 to 10 columns only. How should the Machine Learning Specialist transform the dataset to minimize query runtime?
  • A. Convert the records to Apache Parquet format.
  • B. Convert the records to JSON format.
  • C. Convert the records to GZIP CSV format.
  • D. Convert the records to XML format.
Correct Answer: A
upvoted 11 times
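A back-of-the-envelope calculation shows why a columnar format such as Parquet helps in QUESTION 66. It uses the figures from the question and two simplifying assumptions that are not in the original: every column is the same size, and compression is ignored.

```python
RECORDS = 800_000        # from the question
RECORD_SIZE_MB = 1.5     # from the question
TOTAL_COLUMNS = 200      # from the question

def scanned_mb(columns_read, columnar):
    """Estimate how much data a query must read.

    Row-oriented files (CSV, JSON, XML) are read in full regardless of
    which columns are selected; a columnar file lets the engine read
    only the selected columns. Assumes equally sized columns.
    """
    total_mb = RECORDS * RECORD_SIZE_MB
    if not columnar:
        return total_mb
    return total_mb * columns_read / TOTAL_COLUMNS

csv_scan = scanned_mb(10, columnar=False)     # full dataset
parquet_scan = scanned_mb(10, columnar=True)  # 10 of 200 columns
```

Under these assumptions a 10-column query reads about 1.2 TB from CSV but about 60 GB from a columnar layout, a 20x reduction before any compression is taken into account.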