You create an Azure Databricks cluster and specify an additional library to install. When you attempt to load the library in a notebook, the library is not found. You need to identify the cause of the issue. What should you review?
I would say cluster event logs:
Azure Databricks provides three kinds of logging of cluster-related activity:
Cluster event logs, which capture cluster lifecycle events, like creation, termination, configuration edits, and so on.
Apache Spark driver and worker logs, which you can use for debugging.
Cluster init-script logs, valuable for debugging init scripts.
https://docs.microsoft.com/en-us/azure/databricks/clusters/clusters-manage#event-log
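The same event log can also be pulled programmatically rather than through the UI. A minimal sketch against the Clusters REST API, assuming the standard /api/2.0/clusters/events endpoint; the host, token, and cluster ID are placeholders:

    # List recent events for a cluster (lifecycle, library, and init-script activity).
    # DATABRICKS_HOST, DATABRICKS_TOKEN and the cluster ID are placeholders.
    curl -s -X POST "$DATABRICKS_HOST/api/2.0/clusters/events" \
      -H "Authorization: Bearer $DATABRICKS_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"cluster_id": "1234-567890-abcde123", "limit": 50}'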
B. cluster event logs.
Explanation:
Cluster event logs provide information about the cluster's lifecycle events, including the initialization process.
When you specify an additional library to install on the Databricks cluster, the installation process is part of the cluster initialization.
Reviewing the cluster event logs can help you determine whether the library installation process encountered any errors or issues that prevented the library from being installed successfully.
Any errors or warnings during the library installation process would likely be logged in the cluster event logs, providing insights into the cause of the issue.
The correct answer is B
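As a quick cross-check, the per-library installation status on the cluster can also be queried directly. A hedged sketch using the Libraries API cluster-status endpoint, with placeholder host, token, and cluster ID:

    # Report the status of every library attached to the cluster.
    # A failed install typically surfaces as a FAILED status with messages in the response.
    curl -s "$DATABRICKS_HOST/api/2.0/libraries/cluster-status?cluster_id=1234-567890-abcde123" \
      -H "Authorization: Bearer $DATABRICKS_TOKEN"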
Cluster event logs capture two init script events: INIT_SCRIPTS_STARTED and INIT_SCRIPTS_FINISHED, indicating which scripts are scheduled for execution and which have completed successfully. INIT_SCRIPTS_FINISHED also captures execution duration.
https://docs.databricks.com/en/init-scripts/logs.html
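Those two event types can be requested explicitly when querying the event log. A sketch, assuming the Clusters API event_types filter accepts these names; all identifiers are placeholders:

    # Return only init-script events for the cluster.
    curl -s -X POST "$DATABRICKS_HOST/api/2.0/clusters/events" \
      -H "Authorization: Bearer $DATABRICKS_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"cluster_id": "1234-567890-abcde123",
           "event_types": ["INIT_SCRIPTS_STARTED", "INIT_SCRIPTS_FINISHED"]}'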
ChatGPT:
If the library was installed through:
- Standard Databricks library installation methods: Check the cluster event logs (B).
- A global init script: Check the global init scripts logs (C).
Without additional context or explicit mention of an init script being used, option B is typically the more standard choice for initial troubleshooting.
Legacy global init scripts and cluster-named init scripts are deprecated and cannot be used in new workspaces starting February 21, 2023. On September 1st, 2023, Azure Databricks will disable legacy global init scripts for all workspaces.
Cluster event logs in Azure Databricks provide detailed information about the cluster's lifecycle events, including the installation and initialization of libraries. By reviewing the cluster event logs, you can examine the events related to library installation and determine if any errors or issues occurred during the process.
Cluster event logs do not log init script events for each cluster node; only one node is selected to represent them all.
https://learn.microsoft.com/en-us/azure/databricks/clusters/init-scripts
That's incorrect. Library installation is part of what init scripts do.
Some examples of tasks performed by init scripts include:
Set system properties and environment variables used by the JVM.
Modify Spark configuration parameters.
Modify the JVM system classpath in special cases.
Install packages and libraries not included in Databricks Runtime. To install Python packages, use the Azure Databricks pip binary located at /databricks/python/bin/pip to ensure that Python packages install into the Azure Databricks Python virtual environment rather than the system Python environment. For example, /databricks/python/bin/pip install <package-name> (a minimal sketch follows below).
https://learn.microsoft.com/en-us/azure/databricks/init-scripts/
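A minimal cluster-scoped init script along those lines could look like this; the package name is only an illustration, not something from the question:

    #!/bin/bash
    # Install an extra Python package into the Databricks Python virtual environment,
    # not the system Python. "some-package" is a placeholder.
    set -e
    /databricks/python/bin/pip install some-package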
Additional libraries can be installed in global init scripts, so the correct answer is C.
Some examples of tasks performed by init scripts include:
- Install packages and libraries not included in Databricks Runtime. To install Python packages, use the Azure Databricks pip binary located at /databricks/python/bin/pip to ensure that Python packages install into the Azure Databricks Python virtual environment rather than the system Python environment. For example, /databricks/python/bin/pip install <package-name>.
- Modify the JVM system classpath in special cases.
- Set system properties and environment variables used by the JVM.
- Modify Spark configuration parameters.
ref: https://learn.microsoft.com/en-us/azure/databricks/clusters/init-scripts
There are two primary ways to install a library on a cluster:
- Install a workspace library that has already been uploaded to the workspace.
- Install a library for use with a specific cluster only.
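For the cluster-scoped case, the attachment can also be done through the Libraries API instead of the UI. A sketch with placeholder values, assuming a PyPI package:

    # Attach a PyPI library to one specific cluster.
    curl -s -X POST "$DATABRICKS_HOST/api/2.0/libraries/install" \
      -H "Authorization: Bearer $DATABRICKS_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"cluster_id": "1234-567890-abcde123",
           "libraries": [{"pypi": {"package": "some-package"}}]}'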
The best option in this scenario is to review the cluster event logs to identify why the additional library is not found on the Azure Databricks cluster.
Answer C.
A global init script runs on every cluster created in your workspace. Global init scripts are useful when you want to enforce organization-wide library configurations or security screens. Only admins can create global init scripts. You can create them using either the UI or REST API.
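For reference, creating one through the REST API looks roughly like this; I'm assuming the /api/2.0/global-init-scripts endpoint, and the script name, package, and values below are placeholders (the script body must be base64-encoded):

    # Create a global init script that installs an extra package on every new cluster.
    SCRIPT_B64=$(printf '#!/bin/bash\n/databricks/python/bin/pip install some-package\n' | base64)
    curl -s -X POST "$DATABRICKS_HOST/api/2.0/global-init-scripts" \
      -H "Authorization: Bearer $DATABRICKS_TOKEN" \
      -H "Content-Type: application/json" \
      -d "{\"name\": \"install-extra-libs\", \"script\": \"$SCRIPT_B64\", \"enabled\": true}"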
Community vote distribution: A (35%), C (25%), B (20%), other.