Exam DP-100 topic 2 question 2 discussion

Actual exam question from Microsoft's DP-100
Question #: 2
Topic #: 2

Your team is building a data engineering and data science development environment.
The environment must support the following requirements:
✑ support Python and Scala
✑ compose data storage, movement, and processing services into automated data pipelines
✑ the same tool should be used for the orchestration of both data engineering and data science
✑ support workload isolation and interactive workloads
✑ enable scaling across a cluster of machines
You need to create the environment.
What should you do?

  • A. Build the environment in Apache Hive for HDInsight and use Azure Data Factory for orchestration.
  • B. Build the environment in Azure Databricks and use Azure Data Factory for orchestration.
  • C. Build the environment in Apache Spark for HDInsight and use Azure Container Instances for orchestration.
  • D. Build the environment in Azure Databricks and use Azure Container Instances for orchestration.
Suggested Answer: B

Comments

Adi06
Highly Voted 3 years, 6 months ago
Isn't the answer D? They are trying to build a development environment (line 1). Nowhere does it say it's for a production environment.
upvoted 9 times
allanm
3 years ago
Agreed. If it were a production environment, it should be Kubernetes services. Since it's development, it should be container services. https://docs.microsoft.com/en-us/learn/modules/register-and-deploy-model-with-amls/2-deploy-model
upvoted 1 times
levm39
2 years, 11 months ago
You can't do orchestration with ACI, only with Data Factory. The answer is correct.
upvoted 8 times
prashantjoge
3 years ago
Definitely D.
upvoted 2 times
strikchao
2 years, 10 months ago
Not D. There is no autoscaling with ACI.
upvoted 1 times
phdykd
Highly Voted 1 year, 4 months ago
B. Azure Databricks is a fast, easy, and collaborative Apache Spark-based analytics platform that provides a complete environment for data engineering, machine learning, and data science. It supports Python and Scala, and allows you to compose data storage, movement, and processing services into automated data pipelines. Azure Data Factory, on the other hand, is a cloud-based data integration service that allows you to create, schedule, and orchestrate your data pipelines. By using both Databricks and Data Factory together, you can have a unified platform for both data engineering and data science that also supports workload isolation and interactive workloads, as well as enables scaling across a cluster of machines.
upvoted 8 times
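
As a rough illustration of the pairing described in the comment above: an Azure Data Factory pipeline can run Azure Databricks notebooks through activities of type `DatabricksNotebook`. The sketch below only builds such a pipeline definition as a Python dictionary and prints it as JSON; it does not call Azure. The `build_pipeline` helper, the pipeline name, the linked service name, and the notebook paths are all hypothetical placeholders, not taken from the question.

```python
import json


def build_pipeline(name, linked_service, notebook_paths):
    """Build an ADF-style pipeline definition that runs notebooks in order.

    Each notebook becomes a DatabricksNotebook activity. Every activity after
    the first depends on the previous one having succeeded.
    """
    activities = []
    previous = None
    for index, path in enumerate(notebook_paths, start=1):
        activity = {
            "name": f"RunNotebook{index}",
            "type": "DatabricksNotebook",
            "linkedServiceName": {
                "referenceName": linked_service,  # hypothetical linked service
                "type": "LinkedServiceReference",
            },
            "typeProperties": {"notebookPath": path},
            "dependsOn": [],
        }
        if previous is not None:
            activity["dependsOn"].append(
                {"activity": previous, "dependencyConditions": ["Succeeded"]}
            )
        activities.append(activity)
        previous = activity["name"]
    return {"name": name, "properties": {"activities": activities}}


# Hypothetical names and paths, for illustration only.
pipeline = build_pipeline(
    name="PrepareAndTrain",
    linked_service="AzureDatabricksLinkedService",
    notebook_paths=["/Shared/prepare_data", "/Shared/train_model"],
)
print(json.dumps(pipeline, indent=2))
```

Here a data engineering notebook and a data science notebook sit in the same pipeline, which is the "same tool for orchestration" requirement in miniature.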
phydev
Most Recent 10 months, 3 weeks ago
Selected Answer: B
Azure Databricks is an obvious choice for the environment. The deciding factor is whether to use Data Factory or Container Instances for orchestration. ADF is the more suitable choice in this scenario because of its (1) native orchestration capabilities, (2) visual workflow designer, (3) integration with diverse data sources and services, and (4) built-in monitoring and management. ACI offers none of these as dedicated orchestration features, so ADF is the recommended choice for orchestrating data pipelines in a data engineering and data science environment.
upvoted 4 times
dija123
2 years, 6 months ago
Selected Answer: B
B, without a doubt.
upvoted 3 times
kolakone
2 years, 10 months ago
B is the right answer. Since the same tool must orchestrate "both data engineering and data science", Data Factory is needed, so C and D are out. Because Scala and Python support is required, Databricks (B) is the correct answer.
upvoted 6 times
Navishmamta1111111111111
2 years, 11 months ago
B is correct
upvoted 3 times
okeyken1
2 years, 11 months ago
The correct answer is B.
upvoted 1 times
MAGGCol
3 years ago
Previewed last year, Microsoft's Azure Container Instances (ACI) is now ready for production usage, according to the company. ... Microsoft promises an uptime service level agreement of 99.9 percent for any container group. Each container is secured and isolated through a VM hypervisor.
upvoted 1 times
prashantjoge
3 years ago
Azure Databricks supports clustering, while Azure Data Factory supports orchestration (https://docs.microsoft.com/en-us/azure/databricks/clusters/configure). The orchestration here should be read in the context of data processing (think SSIS, ETL, Informatica, etc.). The answer should be B. Azure Container Instances provides some basic orchestration capabilities, but the context is different. https://docs.microsoft.com/en-us/azure/container-instances/container-instances-orchestrator-relationship
upvoted 5 times
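
To make the clustering point above a little more concrete, here is a minimal sketch of a cluster specification with autoscaling, written as a plain Python dictionary. The field names (`cluster_name`, `spark_version`, `node_type_id`, `autoscale`, `min_workers`, `max_workers`) follow the Databricks cluster configuration format; the `autoscaling_cluster_spec` helper and the specific values (cluster name, runtime version, VM size, worker counts) are illustrative assumptions and would need to match what is available in a given workspace.

```python
import json


def autoscaling_cluster_spec(name, min_workers, max_workers):
    """Return a cluster spec whose worker count may scale within a range."""
    if min_workers < 1 or max_workers < min_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": "13.3.x-scala2.12",  # assumed runtime version
        "node_type_id": "Standard_DS3_v2",    # assumed Azure VM size
        "autoscale": {
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
    }


# Hypothetical development cluster that can grow from 2 to 8 workers.
spec = autoscaling_cluster_spec("ds-dev-cluster", min_workers=2, max_workers=8)
print(json.dumps(spec, indent=2))
```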
chaudha4
3 years, 1 month ago
I think the best answer is B. ACI is used to deploy a model. ACI is just like Docker: for orchestration you would need something like Kubernetes, not Docker.
upvoted 5 times
LakeSky
3 years, 1 month ago
Wow, so what's the correct answer really? Why is Azure Container not an option?
upvoted 2 times
cab123
3 years, 1 month ago
I think Azure Container Instances cannot do orchestration.
upvoted 2 times