Exam Professional Machine Learning Engineer topic 1 question 29 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 29
Topic #: 1

You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?

  • A. Validate the accuracy of the model that you trained on preprocessed data. Create a new model that uses the raw data and is available in real time. Deploy the new model onto AI Platform for online prediction.
  • B. Send incoming prediction requests to a Pub/Sub topic. Transform the incoming data using a Dataflow job. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
  • C. Stream incoming prediction request data into Cloud Spanner. Create a view to abstract your preprocessing logic. Query the view every second for new records. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
  • D. Send incoming prediction requests to a Pub/Sub topic. Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic. Implement your preprocessing logic in the Cloud Function. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
Suggested Answer: B
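A minimal sketch of the per-message step in option B, assuming the incoming Pub/Sub payloads are JSON and that the model expects a list of numeric features. The `preprocess` function, the field names `amount` and `age`, and the scaling are hypothetical placeholders for the real training-time preprocessing. In option B this logic would run inside a Dataflow (Apache Beam) transform between the Pub/Sub read and the prediction call; the sketch only shows how transformed records are packed into the `{"instances": [...]}` body that online prediction accepts.

```python
import json
from typing import Dict, List


def preprocess(record: Dict) -> List[float]:
    """Hypothetical stand-in for the expensive training-time preprocessing."""
    # Assumed features for illustration only; real logic must mirror training.
    return [record["amount"] / 100.0, float(record["age"])]


def to_prediction_request(messages: List[bytes]) -> Dict:
    """Pack a batch of raw Pub/Sub payloads into one online prediction body."""
    instances = [preprocess(json.loads(m.decode("utf-8"))) for m in messages]
    # Online prediction requests carry the inputs under the "instances" key.
    return {"instances": instances}


# Two example payloads, as they might arrive from the inbound topic.
raw = [b'{"amount": 250, "age": 41}', b'{"amount": 1000, "age": 29}']
body = to_prediction_request(raw)
print(json.dumps(body))  # {"instances": [[2.5, 41.0], [10.0, 29.0]]}
```

Sending several instances per request, as here, is one common way to keep the prediction endpoint busy when the goal is throughput rather than the latency of a single request.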

Comments

SparkExpedition
Highly Voted 3 years, 9 months ago
Supporting B: https://cloud.google.com/architecture/data-preprocessing-for-ml-with-tf-transform-pt1#where_to_do_preprocessing
upvoted 31 times
...
inder0007
Highly Voted 3 years, 11 months ago
I think it should be B.
upvoted 14 times
q4exam
3 years, 7 months ago
I also agree with B. This is how I would advise clients to do it as well.
upvoted 4 times
...
...
IrribarraC
Most Recent 2 months, 2 weeks ago
Selected Answer: B
Dataflow has autoscaling. And in my experience, you use Cloud Functions for small stuff.
upvoted 1 times
...
ship123
4 months, 1 week ago
Selected Answer: D
You are an ML engineer who has trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on the Vertex AI platform for high-throughput online prediction. Which architecture should you use? Answer: Send incoming prediction requests to a Pub/Sub topic. Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic. Implement your preprocessing logic in the Cloud Function. Submit a prediction request to the Vertex AI platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
upvoted 1 times
...
rajshiv
5 months ago
Selected Answer: D
B is incorrect. Dataflow is a great option for large-scale data processing but may introduce additional complexity and overhead for a real-time prediction scenario where you just need to preprocess data on-the-fly. This is more appropriate for batch processing or when large volumes of data need to be processed in parallel. Option D is better as it leverages Pub/Sub, Cloud Functions, and AI Platform to preprocess data and obtain predictions without needing complex infrastructure or additional systems like Dataflow or Cloud Spanner.
upvoted 1 times
...
f084277
5 months, 3 weeks ago
Selected Answer: B
Dataflow is superior to Cloud Functions for doing data transformations at high volume. The answer is clearly B.
upvoted 2 times
...
bludw
10 months, 1 week ago
Selected Answer: D
D. The issue with B is that Dataflow does not work well with high throughput.
upvoted 1 times
f084277
5 months, 3 weeks ago
You are incorrect. Dataflow can handle MUCH higher volumes of data than Cloud Functions.
upvoted 1 times
...
desertlotus1211
6 months, 2 weeks ago
Dataflow is ideal for handling computationally expensive preprocessing operations, as it scales automatically and can process the data in a distributed manner.
upvoted 1 times
...
...
PhilipKoku
11 months ago
Selected Answer: B
B) Pub/Sub + Dataflow
upvoted 1 times
...
Liting
1 year, 10 months ago
Selected Answer: B
Went with B. Using Dataflow to transform large amounts of data is the best option.
upvoted 3 times
...
SamuelTsch
1 year, 10 months ago
Selected Answer: B
I went with B. A is completely wrong. C: Cloud Spanner is not designed for high throughput, and it is not meant for preprocessing. D: a Cloud Function could not get enough resources to do the computationally expensive transformation.
upvoted 2 times
...
ashu381
1 year, 11 months ago
Selected Answer: B
Because the concern here is high throughput, not specifically latency, it is better to go with option B.
upvoted 1 times
...
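The throughput-versus-latency distinction in the comment above can be illustrated with simple arithmetic. This is a rough sketch with made-up numbers, assuming messages are independent and each worker handles one message at a time; `sustained_throughput` is a hypothetical helper, not part of any Google Cloud API.

```python
def sustained_throughput(workers: int, ms_per_message: int) -> float:
    """Messages per second when `workers` process messages in parallel."""
    return workers * 1000 / ms_per_message


# Each message still takes 200 ms to preprocess (latency is unchanged),
# but adding parallel workers raises the number handled per second.
one_worker = sustained_throughput(workers=1, ms_per_message=200)
fifty_workers = sustained_throughput(workers=50, ms_per_message=200)
print(one_worker, fifty_workers)  # 5.0 250.0
```

Under these assumptions, scaling out raises throughput without making any single prediction faster, which is why an architecture can suit high-throughput workloads even if it adds per-request latency.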
Voyager2
1 year, 11 months ago
Selected Answer: D
B. Send incoming prediction requests to a Pub/Sub topic. Transform the incoming data using a Dataflow job. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue https://dataintegration.info/building-streaming-data-pipelines-on-google-cloud
upvoted 1 times
...
M25
1 year, 12 months ago
Selected Answer: B
Went with B
upvoted 1 times
...
e707
2 years ago
Selected Answer: D
I think it's D as B is not a good choice because it requires you to run a Dataflow job for each prediction request. This is inefficient and can lead to latency issues.
upvoted 3 times
f084277
5 months, 3 weeks ago
The question doesn't mention anything about latency
upvoted 1 times
...
lucaluca1982
2 years ago
Yes, I agree. Dataflow can introduce latency.
upvoted 2 times
...
...
lucaluca1982
2 years ago
Selected Answer: D
I go for D. Option B uses Dataflow, which is more suitable for batch processing.
upvoted 1 times
...
SergioRubiano
2 years, 1 month ago
Selected Answer: B
It's B
upvoted 1 times
...
MithunDesai
2 years, 4 months ago
Selected Answer: B
Yes, the answer is B.
upvoted 1 times
...