Exam Professional Cloud Architect topic 1 question 119 discussion

Actual exam question from Google's Professional Cloud Architect
Question #: 119
Topic #: 1

You need to migrate Hadoop jobs for your company's Data Science team without modifying the underlying infrastructure. You want to minimize costs and infrastructure management effort. What should you do?

  • A. Create a Dataproc cluster using standard worker instances.
  • B. Create a Dataproc cluster using preemptible worker instances.
  • C. Manually deploy a Hadoop cluster on Compute Engine using standard instances.
  • D. Manually deploy a Hadoop cluster on Compute Engine using preemptible instances.
Suggested Answer: A 🗳️
Reference:
https://cloud.google.com/architecture/hadoop/hadoop-gcp-migration-jobs
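For context, the managed cluster that the suggested answer describes can be created with a single command. A minimal sketch, assuming placeholder values for the cluster name, region, and worker configuration:

```shell
# Minimal sketch: create a managed Dataproc cluster with standard (primary)
# workers, as in answer A. Cluster name, region, and machine type are
# illustrative placeholders, not values from the exam question.
gcloud dataproc clusters create hadoop-migration-cluster \
    --region=us-central1 \
    --num-workers=2 \
    --worker-machine-type=n1-standard-4
```

Existing Hadoop jobs can then be submitted to the cluster as-is, which is what satisfies "without modifying the underlying infrastructure".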

Comments

TotoroChina
Highly Voted 2 years, 10 months ago
Should be B, you want to minimize costs. https://cloud.google.com/dataproc/docs/concepts/compute/secondary-vms#preemptible_and_non-preemptible_secondary_workers
upvoted 65 times
J19G
2 years, 6 months ago
Agree; the migration guide also recommends considering preemptible worker nodes: https://cloud.google.com/architecture/hadoop/hadoop-gcp-migration-jobs#using_preemptible_worker_nodes
upvoted 3 times
ale_brd_
1 year, 5 months ago
I think it's A. When a GCP exam question requires minimizing costs, it states that requirement literally in the question. Also, to minimize costs with preemptible worker instances, you need to build jobs that are fault-tolerant, which requires some development effort. So if fault tolerance and cost minimization are not mentioned in the question, they are not required. The docs state: "Only use preemptible nodes for jobs that are fault-tolerant or that are low enough priority that occasional job failure won't disrupt your business."
upvoted 2 times
grejao
1 year ago
OMG, you again? zetalexg says: It's disappointing that you waste your time writing on this topic instead of paying attention to the questions.
upvoted 2 times
...
XDevX
2 years, 10 months ago
Hi TotoroChina, I had the same thought when I first read the question. The problem I see is that in a real business you would try to mix preemptible instances and on-demand instances, whereas here you have to choose between only preemptible instances and only on-demand instances. Preemptible instances have some downsides, so we would need more details, and ideally a mixed approach. That's why both answers might be correct, a) and b). Do you see that differently? Thanks! Cheers, D.
upvoted 5 times
kopper2019
2 years, 9 months ago
But you need to reduce management overhead, so B; creating a cluster manually and maintaining Compute Engine instances is not the way to go.
upvoted 5 times
HenkH
1 year, 5 months ago
B requires creating new instances at least every 24 hours.
upvoted 2 times
...
Sukon_Desknot
1 year, 6 months ago
"Without modifying the underlying infrastructure" is the watchword. They most likely did not use preemptible instances on-premises.
upvoted 7 times
...
Yogi42
1 year, 3 months ago
A cost-savings consideration: using preemptible VMs does not always save costs, since preemptions can cause longer job execution with resulting higher job costs. This is mentioned in the link above, so I think the answer should be A.
upvoted 3 times
...
firecloud
Highly Voted 2 years, 9 months ago
It's A; the primary workers can only be standard, whereas secondary workers can be preemptible. Per the docs: in addition to using standard Compute Engine VMs as Dataproc workers (called "primary" workers), Dataproc clusters can use "secondary" workers. There are two types of secondary workers: preemptible and non-preemptible. All secondary workers in your cluster must be of the same type, either preemptible or non-preemptible. The default is preemptible.
upvoted 33 times
Manh
2 years, 7 months ago
agreed
upvoted 2 times
...
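The thread above is the crux of the primary/secondary distinction: primary workers are always standard VMs, and preemptible capacity is added as secondary workers on top of them. A hedged sketch of that mixed shape (all names and counts are placeholders):

```shell
# Sketch: standard primary workers plus preemptible secondary workers.
# Flag values are illustrative placeholders. Even with --secondary-worker-type
# set to preemptible, the two primary workers remain standard VMs.
gcloud dataproc clusters create hadoop-migration-cluster \
    --region=us-central1 \
    --num-workers=2 \
    --num-secondary-workers=4 \
    --secondary-worker-type=preemptible
```

This is why "a cluster using preemptible worker instances" can only ever mean the secondary workers; a Dataproc cluster cannot run on preemptible VMs alone.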
Gino17m
Most Recent 4 days, 23 hours ago
Selected Answer: A
A - only secondary workers can be preemptible and "Using preemptible VMs does not always save costs since preemptions can cause longer job execution with resulting higher job costs" according to: https://cloud.google.com/dataproc/docs/concepts/compute/secondary-vms#preemptible_and_non-preemptible_secondary_workers
upvoted 1 times
...
dija123
1 week, 3 days ago
Selected Answer: B
Agree with B
upvoted 1 times
...
Diwz
1 month ago
Answer is B. The secondary worker type instance for default Dataproc cluster is preemptible VMs. https://cloud.google.com/dataproc/docs/concepts/compute/secondary-vms
upvoted 1 times
...
shashii82
1 month, 2 weeks ago
Dataproc: a fully managed Apache Spark and Hadoop service on Google Cloud Platform. It lets you run clusters without manually deploying and managing Hadoop on Compute Engine.
Preemptible worker instances: short-lived, cost-effective VM instances suitable for fault-tolerant and batch-processing workloads. Since Hadoop jobs can often tolerate interruptions, using preemptible instances can significantly reduce costs.
Option B leverages Dataproc for managing Hadoop clusters without manual deployment and takes advantage of preemptible instances to minimize costs. This aligns well with the goal of minimizing both costs and infrastructure management effort.
upvoted 1 times
...
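Whichever cluster shape is chosen, the "migrate jobs without modifying the underlying infrastructure" part of the question comes down to submitting the existing Hadoop jar to Dataproc unchanged. A sketch, assuming placeholder jar path, cluster name, and bucket names:

```shell
# Sketch: submit an existing Hadoop MapReduce jar to a Dataproc cluster
# unchanged. Jar path, cluster name, and gs:// buckets are placeholders.
gcloud dataproc jobs submit hadoop \
    --cluster=hadoop-migration-cluster \
    --region=us-central1 \
    --jar=file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    -- wordcount gs://my-input-bucket/ gs://my-output-bucket/
```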
VidhyaBupesh
2 months ago
Using preemptible VMs does not always save costs since preemptions can cause longer job execution with resulting higher job costs
upvoted 1 times
...
Amrita2012
2 months, 1 week ago
Selected Answer: A
Using standard Compute Engine VMs as Dataproc workers (called "primary" workers), Preemptible can be only used for secondary workers hence A is valid answer
upvoted 1 times
...
Pime13
2 months, 3 weeks ago
Selected Answer: B
minimize costs -> preemptible
upvoted 2 times
...
d0094d6
2 months, 3 weeks ago
Selected Answer: B
You want to minimize costs and infrastructure management effort > B
upvoted 2 times
...
d0094d6
2 months, 3 weeks ago
"You want to minimize costs and infrastructure management effort" -> B
upvoted 1 times
...
didek1986
3 months, 1 week ago
Selected Answer: A
It is A
upvoted 1 times
...
Romio2023
3 months, 2 weeks ago
Selected Answer: B
Answer should be B, because minimizing costs is wanted.
upvoted 1 times
...
discuss24
3 months, 3 weeks ago
A is the correct response. Per documentation: "You can gain low-cost processing power for your jobs by adding preemptible worker nodes to your cluster. These nodes use preemptible virtual machines." The focus of the question is to reduce cost, hence preemptible VMs work best.
upvoted 1 times
discuss24
3 months, 3 weeks ago
B not A
upvoted 1 times
...
odacir
5 months, 1 week ago
Selected Answer: B
B. Migrate Hadoop jobs -> Dataproc; saving money -> preemptible (Spot).
upvoted 1 times
...
CyanideX
6 months, 1 week ago
Selected Answer: B
The answer should be Spot VMs (this has been changed in the actual exam). Spot VMs have no expiration.
upvoted 2 times
...
cchiaramelli
6 months, 1 week ago
Selected Answer: B
I think it's B because it literally says "minimize costs" for a job-like workload.
upvoted 1 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other