Exam AWS Certified Machine Learning - Specialty topic 1 question 73 discussion

An aircraft engine manufacturing company is measuring 200 performance metrics in a time-series. Engineers want to detect critical manufacturing defects in near-real time during testing. All of the data needs to be stored for offline analysis.
What approach would be the MOST effective to perform near-real time defect detection?

  • A. Use AWS IoT Analytics for ingestion, storage, and further analysis. Use Jupyter notebooks from within AWS IoT Analytics to carry out analysis for anomalies.
  • B. Use Amazon S3 for ingestion, storage, and further analysis. Use an Amazon EMR cluster to carry out Apache Spark ML k-means clustering to determine anomalies.
  • C. Use Amazon S3 for ingestion, storage, and further analysis. Use the Amazon SageMaker Random Cut Forest (RCF) algorithm to determine anomalies.
  • D. Use Amazon Kinesis Data Firehose for ingestion and Amazon Kinesis Data Analytics Random Cut Forest (RCF) to perform anomaly detection. Use Kinesis Data Firehose to store data in Amazon S3 for further analysis.
Suggested Answer: D

Comments

Joe_Zhang
Highly Voted 3 years, 7 months ago
D near-real time
upvoted 43 times
DimLam
1 year, 6 months ago
The main problem with D is that Amazon Kinesis Data Firehose cannot be a source service for Amazon Kinesis Data Analytics. The answer would be correct if it said "Use Amazon Kinesis Data Streams to ingest data, use Amazon Kinesis Data Analytics for defect detection, and use Amazon Kinesis Data Firehose to store data for further analysis." https://docs.aws.amazon.com/firehose/latest/dev/create-name.html
upvoted 2 times
VR10
1 year, 2 months ago
Actually, Kinesis Data Firehose can be used for data ingestion, so the correct option is still D.
upvoted 2 times
...
cnethers
Highly Voted 3 years, 7 months ago
Glad we are all in agreement that D is the correct answer.
upvoted 17 times
...
xicocaio
Most Recent 7 months ago
Selected Answer: D
Amazon Kinesis Data Firehose is a fully managed service for real-time data ingestion, which fits the requirement for near-real-time defect detection. It can ingest large volumes of data from various sources and reliably load the data into other AWS services like Amazon S3 for storage. Amazon Kinesis Data Analytics with Random Cut Forest (RCF) is highly efficient for detecting anomalies in streaming data in near real time, which is what the engineers need to catch manufacturing defects during testing. After detecting anomalies, the data can be stored in Amazon S3 via Kinesis Data Firehose for offline analysis.
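As a rough illustration of the ingestion step described above, the sketch below sends one test-bench reading to a Firehose delivery stream as a JSON line. `put_record` is the boto3 Firehose client call; the helper name `send_reading`, the stream name, and the field names are hypothetical. The client is passed in as an argument so the function can be tried with a stand-in object and no AWS credentials.

```python
import json


def send_reading(firehose_client, stream_name, reading):
    """Send one engine reading to a Firehose delivery stream as a JSON line.

    `firehose_client` is assumed to be a boto3 Firehose client,
    e.g. boto3.client("firehose"). `reading` is a plain dict of metrics.
    """
    # A trailing newline keeps records separable once Firehose
    # concatenates them into objects in S3.
    payload = (json.dumps(reading) + "\n").encode("utf-8")
    return firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": payload},
    )


# Hypothetical usage (requires boto3 and an existing delivery stream):
# import boto3
# client = boto3.client("firehose")
# send_reading(client, "engine-test-stream", {"engine_id": "E1", "metric_1": 0.52})
```

Whether Firehose or Kinesis Data Streams should sit in front of the analytics application is exactly what the replies above disagree on; the sketch only covers the "send a record" part.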
upvoted 1 times
...
SandyHenshaw
9 months, 1 week ago
Selected Answer: D
D - Firehose for near-real time
upvoted 1 times
...
VR10
1 year, 2 months ago
Selected Answer: D
Kinesis Data Firehose is a fully managed service that can ingest streaming data and load it into destinations like S3, Redshift, and Elasticsearch. Kinesis Data Analytics with RCF performs the anomaly detection, and Data Firehose is then used again to store the data in S3. D is the best choice.
upvoted 1 times
...
fa0d8b7
1 year, 4 months ago
https://docs.aws.amazon.com/managed-flink/latest/java/get-started-exercise-fh.html
upvoted 2 times
...
endeesa
1 year, 5 months ago
Selected Answer: D
Kinesis seems like the only viable option
upvoted 1 times
...
akgarg00
1 year, 5 months ago
The answer is D. Since data is continuously coming in, Kinesis Data Firehose is our streaming service (we also need near-real time defect detection and storage in S3), and anomaly detection can be done by a Kinesis Data Analytics application (RCF algorithm).
upvoted 1 times
...
AmeeraM
1 year, 6 months ago
Selected Answer: D
D, near real-time ingestion is the key
upvoted 1 times
...
loict
1 year, 7 months ago
Selected Answer: D
A. NO - AWS IoT Analytics will first store the data, then make it available for Analytics/Jupyter (https://docs.aws.amazon.com/iotanalytics/latest/userguide/welcome.html), so it is not real-time
B. NO - storing the data before analytics is not real-time
C. NO - storing the data before analytics is not real-time
D. YES - real-time pipeline, and RCF is best for anomalies
upvoted 1 times
...
DavidRou
1 year, 7 months ago
Selected Answer: D
How can someone use S3 for ingestion? Firehose is the right answer
upvoted 1 times
...
Mickey321
1 year, 8 months ago
Selected Answer: D
This option meets the requirements of performing near-real time defect detection, storing all the data for offline analysis, and handling 200 performance metrics in a time-series. Amazon Kinesis Data Firehose is a fully managed service that can ingest streaming data from various sources and deliver it to destinations such as Amazon S3, Amazon OpenSearch Service, and Amazon Redshift. Amazon Kinesis Data Analytics is a service that can process streaming data using SQL or Apache Flink applications. Amazon Kinesis Data Analytics provides a built-in RANDOM_CUT_FOREST function, a machine learning algorithm that can detect anomalies in streaming data. This function can handle high-dimensional data and assign an anomaly score to each record based on how distant it is from other records. The anomaly scores can then be delivered to another destination using Kinesis Data Firehose or consumed by other applications using Kinesis Data Streams.
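To give a feel for how a record "distant from other records" ends up with a higher anomaly score, here is a simplified sketch. It is not the AWS RANDOM_CUT_FOREST implementation: it only measures how many random axis-aligned cuts are needed to isolate a point, which is the intuition behind random-cut trees. The function names, parameters, and defaults are all hypothetical.

```python
import random


def isolation_depth(points, target, rng, depth=0, max_depth=32):
    """Number of random cuts needed to separate `target` from `points`."""
    if len(points) <= 1 or depth >= max_depth:
        return depth
    dims = len(target)
    # Range of each dimension over the points still grouped with the target.
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    spans = [hi - lo for lo, hi in zip(lows, highs)]
    if sum(spans) == 0:
        return depth  # only identical points remain
    # Choose the cut dimension with probability proportional to its range.
    d = rng.choices(range(dims), weights=spans)[0]
    cut = rng.uniform(lows[d], highs[d])
    # Keep only the points on the same side of the cut as the target.
    same_side = [p for p in points if (p[d] <= cut) == (target[d] <= cut)]
    return isolation_depth(same_side, target, rng, depth + 1, max_depth)


def mean_isolation_depth(points, target, trees=50, sample_size=64, seed=0):
    """Average isolation depth over several random subsamples.

    A lower value means the target is easier to isolate, i.e. more anomalous.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trees):
        sample = rng.sample(points, min(sample_size, len(points)))
        total += isolation_depth(sample + [target], target, rng)
    return total / trees
```

With a tight cluster of normal readings and one reading far away from it, the far reading should be isolated after noticeably fewer cuts than a typical reading. The managed service additionally handles streaming updates and time decay, which this sketch ignores.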
upvoted 1 times
...
kaike_reis
1 year, 9 months ago
D is correct. If the question says "data streaming", "real time data" or "near real time", you should look for Kinesis services. B and C are totally wrong: it's not possible to use S3 for ingestion, only for storage.
upvoted 2 times
...
ADVIT
1 year, 10 months ago
D, https://docs.aws.amazon.com/kinesisanalytics/latest/sqlref/sqlrf-random-cut-forest.html
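The SQL reference linked above documents RANDOM_CUT_FOREST as a function applied to a cursor over the input stream. A minimal sketch of generating such application code for a set of metric columns follows. The statement shape mirrors the example in that reference as best I recall it, so check it against the linked page. The helper `build_rcf_sql`, the column names, and the destination stream and pump names are hypothetical. I have also not verified how many numeric columns a single call accepts, so passing all 200 metrics at once is an open assumption.

```python
def build_rcf_sql(metric_columns,
                  source_stream="SOURCE_SQL_STREAM_001",
                  dest_stream="DEFECT_SCORES"):
    """Build Kinesis Data Analytics SQL that scores each record with RCF.

    All metric columns are assumed to be DOUBLE; the stream and pump
    names are placeholders for illustration.
    """
    declared = ",\n".join(f'    "{c}" DOUBLE' for c in metric_columns)
    selected = ", ".join(f'"{c}"' for c in metric_columns)
    return (
        f'CREATE OR REPLACE STREAM "{dest_stream}" (\n'
        f'{declared},\n'
        f'    "ANOMALY_SCORE" DOUBLE);\n'
        f'CREATE OR REPLACE PUMP "DEFECT_PUMP" AS\n'
        f'  INSERT INTO "{dest_stream}"\n'
        f'  SELECT STREAM {selected}, "ANOMALY_SCORE"\n'
        f'  FROM TABLE(RANDOM_CUT_FOREST(\n'
        f'    CURSOR(SELECT STREAM {selected} FROM "{source_stream}")));\n'
    )


# Hypothetical usage with three of the engine metrics:
sql = build_rcf_sql(["metric_1", "metric_2", "metric_3"])
print(sql)
```

The destination stream could then be attached to a Firehose delivery stream so that scores land in S3 alongside the raw data.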
upvoted 1 times
...
earthMover
1 year, 11 months ago
Selected Answer: D
At a minimum, the moderators should provide some explanation when the community votes overwhelmingly for a different option.
upvoted 4 times
...
oso0348
2 years ago
Selected Answer: C
Option D is not necessarily incorrect, but it may not be the most effective approach to perform near-real time defect detection in this scenario. Here are some potential drawbacks of this approach:
  • Amazon Kinesis Data Firehose is primarily used for data ingestion and delivery to other services, and may not be the best choice for real-time analysis.
  • Using Amazon Kinesis Data Analytics for anomaly detection may be less flexible than using Amazon SageMaker, which provides a wide range of algorithms and models for anomaly detection.
Random Cut Forest (RCF) is a popular anomaly detection algorithm used for time-series data, and Amazon SageMaker provides an RCF implementation that can be used for anomaly detection in real-time or offline. While Amazon Kinesis Data Analytics also provides RCF, using Amazon SageMaker may be a better choice for scalability and flexibility.
upvoted 1 times
...
oso0348
2 years ago
Selected Answer: C
Yes, option C can provide near real-time defect detection. Amazon SageMaker's Random Cut Forest (RCF) algorithm is designed to work with streaming data and can detect anomalies in near real-time. It can process data in batches as small as a single data point, making it well-suited for real-time anomaly detection. In this scenario, if the manufacturing process is generating data in real-time, it can be ingested into Amazon S3 and processed by Amazon SageMaker's RCF algorithm, allowing for near real-time detection of critical manufacturing defects during testing.
upvoted 1 times
ZSun
2 years ago
This is ridiculous. How can you store data in S3 and then conduct real-time analysis?
upvoted 2 times
...