Exam DP-200 topic 2 question 4 discussion

Actual exam question from Microsoft's DP-200

Question #: 4
Topic #: 2

DRAG DROP -
You develop data engineering solutions for a company.
A project requires analysis of real-time Twitter feeds. Posts that contain specific keywords must be stored and processed on Microsoft Azure and then displayed by using Microsoft Power BI. You need to implement the solution.
Which five actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Select and Place:

Suggested Answer:

Step 1: Create an HDInsight cluster with the Spark cluster type
Step 2: Create a Jupyter Notebook

Step 3: Create a table -
The Jupyter Notebook that you created in the previous step includes code to create an hvac table.
Step 4: Run a job that uses the Spark Streaming API to ingest data from Twitter
Step 5: Load the hvac table into Power BI Desktop
You use Power BI to create visualizations, reports, and dashboards from the Spark cluster data.
References:
https://acadgild.com/blog/streaming-twitter-data-using-spark
https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-use-with-data-lake-store
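
The requirement that only posts containing specific keywords are kept can be illustrated with a minimal sketch in plain Python. This is not the Spark Streaming job from the suggested answer and makes no call to Twitter or Azure; the function names (`matches_keywords`, `filter_posts`) and the sample posts are hypothetical, and matching on whole words case-insensitively is an assumption about what "contain specific keywords" means. In the actual solution, equivalent logic would presumably run inside the streaming job before the results are written to the table.

```python
import re
from typing import Iterable, Iterator, List


def matches_keywords(text: str, keywords: Iterable[str]) -> bool:
    """Return True if any keyword occurs in text as a whole word (case-insensitive)."""
    # Split the post into lowercase word tokens; "#Azure" yields the token "azure".
    words = set(re.findall(r"[a-z0-9_]+", text.lower()))
    return any(keyword.lower() in words for keyword in keywords)


def filter_posts(posts: Iterable[str], keywords: Iterable[str]) -> Iterator[str]:
    """Yield only the posts that mention at least one keyword."""
    keyword_list: List[str] = list(keywords)  # materialize once so it can be reused per post
    for post in posts:
        if matches_keywords(post, keyword_list):
            yield post


if __name__ == "__main__":
    # Hypothetical sample data standing in for a live feed.
    sample_posts = [
        "Trying out #Azure HDInsight today",
        "Nothing technical in this post",
        "Learning SPARK streaming",
    ]
    for kept in filter_posts(sample_posts, ["azure", "spark"]):
        print(kept)
```

Because `filter_posts` is a generator, it handles posts one at a time, which loosely mirrors how a streaming job processes records as they arrive rather than loading the whole feed first.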

by JohnCrawford at April 9, 2021, 2:09 a.m.

Comments

cadio30

Highly Voted 4 years, 1 month ago

The proposed solution is correct. A table cannot be created until the notebook is available; the scenario assumes the table lives in the HDInsight Spark cluster.

upvoted 6 times