
Exam DP-200 topic 2 question 9 discussion

Actual exam question from Microsoft's DP-200
Question #: 9
Topic #: 2

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
✑ A workload for data engineers who will use Python and SQL
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R
The enterprise architecture team at your company identifies the following standards for Databricks environments:
✑ The data engineers must share a cluster.
✑ The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
✑ All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a High Concurrency cluster for the jobs.
Does this meet the goal?

  • A. Yes
  • B. No
Suggested Answer: A
We need a High Concurrency cluster for the data engineers and the jobs.
Note:
Standard clusters are recommended for a single user. Standard clusters can run workloads developed in any language: Python, R, Scala, and SQL.
A High Concurrency cluster is a managed cloud resource. The key benefit of High Concurrency clusters is that they provide Apache Spark-native fine-grained sharing for maximum resource utilization and minimum query latencies.
References:
https://docs.azuredatabricks.net/clusters/configure.html
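
For context, both the cluster mode and the scenario's auto-termination requirement map directly onto fields of the cluster spec. Below is a minimal sketch against the legacy Databricks Clusters API 2.0; the workspace URL, token, node type, and cluster names are hypothetical placeholders, and per the referenced docs High Concurrency mode is selected by setting spark.databricks.cluster.profile to serverless:

```python
import requests

# Hypothetical workspace URL and personal access token -- replace with your own.
WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "dapi-example-token"

def create_cluster(payload):
    """POST a cluster spec to the legacy Clusters API 2.0."""
    resp = requests.post(
        f"{WORKSPACE_URL}/api/2.0/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=payload,
    )
    resp.raise_for_status()
    return resp.json()["cluster_id"]

# Shared High Concurrency cluster for the data engineers (Python and SQL only).
create_cluster({
    "cluster_name": "data-engineering-shared",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 4,
    # High Concurrency mode is selected via this Spark conf key.
    "spark_conf": {"spark.databricks.cluster.profile": "serverless"},
})

# One Standard cluster per data scientist, auto-terminating after 120 minutes
# of inactivity; Standard mode supports all four languages, including Scala and R.
for name in ["ds-cluster-1", "ds-cluster-2", "ds-cluster-3"]:
    create_cluster({
        "cluster_name": name,
        "spark_version": "7.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 2,
        "autotermination_minutes": 120,
    })
```

Whether the jobs cluster can also be High Concurrency is exactly what the discussion below disputes, since that mode restricts which notebook languages it will accept.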

Comments

P_11
Highly Voted 5 years ago
High Concurrency does not work for Scala, so B is the correct answer. Note: High Concurrency clusters work only for SQL, Python, and R. The performance, security, and fault isolation of High Concurrency clusters is provided by running user code in separate processes, which is not possible in Scala. The Table Access Control checkbox is available only for High Concurrency clusters. (See the config sketch after this thread.) https://docs.microsoft.com/en-gb/azure/databricks/clusters/configure
upvoted 31 times
MMM777
4 years ago
Actually, the data engineers do not require Scala, so High Concurrency is OK for them.
upvoted 1 times
RyuHayabusa
3 years, 11 months ago
What about reading the question again?
upvoted 1 times
Williammm
4 years ago
It says that they will use Scala... so A.
upvoted 1 times
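P_11's language point is visible in the cluster spec itself: on a High Concurrency cluster, the REPL language allow-list excludes Scala, so Scala notebooks cannot attach. A short illustrative snippet (the conf values follow the docs linked above; this is a sketch, not output from a live workspace):

```python
# Spark conf as it appears on a High Concurrency cluster: the allow-list
# names sql, python, and r -- Scala is absent, so a Scala notebook job fails.
high_concurrency_conf = {
    "spark.databricks.cluster.profile": "serverless",
    "spark.databricks.repl.allowedLanguages": "sql,python,r",
}
assert "scala" not in high_concurrency_conf["spark.databricks.repl.allowedLanguages"]
```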
MLCL
4 years, 11 months ago
Data Engineers do not use Scala.
upvoted 4 times
Ambujinee
4 years, 1 month ago
Agreed with you
upvoted 1 times
ichacas
4 years, 12 months ago
"You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs" Data engineers share same cluster and only use Python and SQL -> High Concurrency Each Data Scientists needs a cluster and they are going to work with R and Scala -> Standard Jobs are going to be executed with Scala -> Standard. So correct answer is Yes
upvoted 39 times
HeywwooodJab
4 years, 2 months ago
That's not what the question says ...
upvoted 6 times
Ambujinee
4 years, 1 month ago
What you have explained is correct, hence the answer is B: the proposed solution uses a High Concurrency cluster for the jobs, which is wrong.
upvoted 2 times
azmun
Most Recent 4 years, 1 month ago
The answer is No. High Concurrency clusters work only for SQL, Python, and R. The performance and security of High Concurrency clusters is provided by running user code in separate processes, which is not possible in Scala. https://docs.microsoft.com/en-us/azure/databricks/clusters/configure
upvoted 1 times
cadio30
4 years, 2 months ago
This statement "A workload for jobs that will run notebooks that use Python, Scala, and SQL" pertains to the propose solution "High Concurrency cluster for the jobs" and as we all know, high concurrency doesn't work for SCALA. Therefore, the answer is NO.
upvoted 2 times
sharma21
4 years, 2 months ago
Answer is NO
upvoted 1 times
UmashankarJanakiraman
4 years, 2 months ago
Answer is No. https://github.com/Azure/AzureDatabricksBestPractices/blob/master/Table2.PNG
upvoted 1 times
Hassan_Mazhar_Khan
4 years, 2 months ago
The correct answer is B, as High Concurrency does not work for Scala.
upvoted 1 times
brcdbrcd
4 years, 7 months ago
No, because a single-node cluster is appropriate for the jobs, not because of any other reason... A Single Node cluster has no workers and runs Spark jobs on the driver node. In contrast, Standard mode clusters require at least one Spark worker node in addition to the driver node to execute Spark jobs. (See the sketch after this comment.) https://docs.microsoft.com/en-us/azure/databricks/clusters/configure#--single-node-clusters
upvoted 3 times
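For reference, a Single Node cluster of the kind brcdbrcd describes is requested by zeroing out the workers and pinning Spark to the driver. A sketch reusing the hypothetical create_cluster helper from the earlier snippet, with field values taken from the linked single-node docs:

```python
# Single Node: no workers; Spark runs locally on the driver node.
create_cluster({
    "cluster_name": "single-node-jobs",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 0,
    "spark_conf": {"spark.master": "local[*]"},
    "custom_tags": {"ResourceClass": "SingleNode"},
})
```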
seaun
4 years, 8 months ago
Jobs should be a Standard cluster; the correct answer is Yes.
upvoted 2 times
sandGrain
4 years, 9 months ago
That is correct; the answer should be "Yes". High Concurrency does not support Scala.
upvoted 3 times
avix
4 years, 11 months ago
The answer is wrong, as a High Concurrency cluster can't support Scala.
upvoted 2 times
hart232
4 years, 9 months ago
There is no need to use Scala for the users of high concurrency. This is mentioned in the question.
upvoted 1 times
sirshanam
4 years, 8 months ago
hart232, it's mentioned as a requirement: "A workload for jobs that will run notebooks that use Python, Scala, and SQL".
upvoted 2 times
Community vote distribution: A (35%), C (25%), B (20%), Other