exam questions

Exam DP-200 All Questions

View all questions & answers for the DP-200 exam

Exam DP-200 topic 2 question 28 discussion

Actual exam question from Microsoft's DP-200
Question #: 28
Topic #: 2
[All DP-200 Questions]

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
✑ A workload for data engineers who will use Python and SQL
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R
The enterprise architecture team at your company identifies the following standards for Databricks environments:
✑ The data engineers must share a cluster.
✑ The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
✑ All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a Standard cluster for the data engineers, and a High Concurrency cluster for the jobs.
Does this meet the goal?

  • A. Yes
  • B. No
Show Suggested Answer Hide Answer
Suggested Answer: B 🗳️
We need a High Concurrency cluster for the data engineers and the jobs.
Note:
Standard clusters are recommended for a single user. Standard can run workloads developed in any language: Python, R, Scala, and SQL.
A high concurrency cluster is a managed cloud resource. The key benefits of high concurrency clusters are that they provide Apache Spark-native fine-grained sharing for maximum resource utilization and minimum query latencies.
References:
https://docs.azuredatabricks.net/clusters/configure.html

Comments

Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.
Switch to a voting comment New
Nieswurz
Highly Voted 4 years, 11 months ago
As job notebooks include scala and high concurrency clusters do not support scala, the answer should be no.
upvoted 26 times
Equalizer
4 years, 9 months ago
Correct, check: https://docs.microsoft.com/en-us/azure/databricks/clusters/configure
upvoted 3 times
...
TashaP
2 years, 9 months ago
this is 100% correct, it's as simple as this, job requires scala, high concurrency does not support scala. The answer is no.
upvoted 1 times
...
...
avix
Highly Voted 4 years, 10 months ago
If job needs to use scala then high concurrency cluster can't be used as that wont support Scala
upvoted 9 times
...
ffgghhjj
Most Recent 10 months, 1 week ago
B. as high concurrency clusters do not support Scala.
upvoted 1 times
...
hoangton
4 years, 1 month ago
B is corect There are two options for cluster mode: Standard: Single user / small group clusters - can use any language. High Concurrency: A cluster built for minimizing latency in high concurrency workloads. There are a few main reasons you would use a Standard cluster over a high concurrency cluster. The first is if you are a single user of Databricks exploring the technology. For most PoCs and exploration, a Standard cluster should suffice. The second is if you are a Scala user, as high concurrency clusters do not support Scala. The third is if your use case simply does not require high concurrency processes. High concurrency clusters, in addition to performance gains, also allow you utilize table access control, which is not supported in Standard clusters. Please note that High Concurrency clusters do not automatically set the auto shutdown field, whereas standard clusters default it to 120 minutes. https://www.mssqltips.com/sqlservertip/6604/azure-databricks-cluster-configuration/
upvoted 2 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...