
Exam DP-201 topic 7 question 1 discussion

Actual exam question from Microsoft's DP-201
Question #: 1
Topic #: 7

Inventory levels must be calculated by subtracting the current day's sales from the previous day's final inventory.
Which two options provide Litware with the ability to quickly calculate the current inventory levels by store and product? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.

  • A. Consume the output of the event hub by using Azure Stream Analytics and aggregate the data by store and product. Output the resulting data directly to Azure Synapse Analytics. Use Transact-SQL to calculate the inventory levels.
  • B. Output Event Hubs Avro files to Azure Blob storage. Use Transact-SQL to calculate the inventory levels by using PolyBase in Azure Synapse Analytics.
  • C. Consume the output of the event hub by using Databricks. Use Databricks to calculate the inventory levels and output the data to Azure Synapse Analytics.
  • D. Consume the output of the event hub by using Azure Stream Analytics and aggregate the data by store and product. Output the resulting data into Databricks. Calculate the inventory levels in Databricks and output the data to Azure Blob storage.
  • E. Output Event Hubs Avro files to Azure Blob storage. Trigger an Azure Data Factory copy activity to run every 10 minutes to load the data into Azure Synapse Analytics. Use Transact-SQL to aggregate the data by store and product.
Suggested Answer: AE
A: Azure Stream Analytics is a fully managed service providing low-latency, highly available, scalable complex event processing over streaming data in the cloud.
You can use an Azure Synapse Analytics (formerly SQL Data Warehouse) database as an output sink for your Stream Analytics jobs.
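As a rough illustration of option A, a Stream Analytics query along these lines could aggregate the sales stream by store and product before writing to the Synapse output. This is a sketch only: the input and output aliases (SalesEventHub, SynapseSales) and column names are assumptions, not from the case study.

    -- Hypothetical Stream Analytics query for option A: aggregate POS events
    -- per store and product over a short tumbling window and send them to a
    -- Synapse output sink defined on the job.
    SELECT
        StoreNumber,
        ProductId,
        System.Timestamp() AS WindowEnd,
        COUNT(*) AS UnitsSold          -- assumes one event per item sold
    INTO SynapseSales                  -- Azure Synapse Analytics output of the job
    FROM SalesEventHub TIMESTAMP BY EventEnqueuedUtcTime
    GROUP BY StoreNumber, ProductId, TumblingWindow(minute, 5)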
E: Event Hubs Capture is the easiest way to get data into Azure. Using Azure Data Lake, Azure Data Factory, and Azure HDInsight, you can perform batch processing and other analytics using familiar tools and platforms of your choosing, at any scale you need.
Note: Event Hubs Capture creates files in Avro format.
Captured data is written in Apache Avro format: a compact, fast, binary format that provides rich data structures with inline schema. This format is widely used in the Hadoop ecosystem, Stream Analytics, and Azure Data Factory.
Scenario: The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
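For the "Use Transact-SQL to calculate the inventory levels" step, a minimal sketch of the Synapse query could look like the following. The table and column names (dbo.Inventory, dbo.DailySales, ClosingQuantity, UnitsSold) are hypothetical, not from the case study.

    -- Previous day's final inventory minus today's aggregated sales,
    -- per store and product (hypothetical table and column names).
    SELECT
        i.StoreNumber,
        i.ProductId,
        i.ClosingQuantity - ISNULL(s.UnitsSold, 0) AS CurrentInventory
    FROM dbo.Inventory AS i                          -- previous day's final inventory
    LEFT JOIN (
        SELECT StoreNumber, ProductId, SUM(UnitsSold) AS UnitsSold
        FROM dbo.DailySales                          -- today's sales from the stream
        WHERE SalesDate = CAST(GETDATE() AS date)
        GROUP BY StoreNumber, ProductId
    ) AS s
        ON s.StoreNumber = i.StoreNumber
       AND s.ProductId = i.ProductId
    WHERE i.InventoryDate = DATEADD(day, -1, CAST(GETDATE() AS date));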
Reference:
https://docs.microsoft.com/bs-latn-ba/azure/sql-data-warehouse/sql-data-warehouse-integrate-azure-stream-analytics
https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-capture-overview

Comments

pravinDataSpecialist
Highly Voted 5 years ago
Should be A & C, as the final result should end up in Synapse.
upvoted 21 times
extraego
4 years, 10 months ago
C is incorrect because there is no mention of a step connecting Azure Databricks to Azure Synapse to fetch the previous inventory level, so you cannot calculate the current inventory level from Databricks. A & E are the only options that output sales data to Azure Synapse and calculate the inventory level there. I've noticed that many people in DP-201 imagine steps that are not stated, which causes a lot of confusion.
upvoted 13 times
Rambaldi
4 years, 7 months ago
but the inventory levels are not calculated in E
upvoted 1 times
sturcu
4 years, 4 months ago
Yes they are. You load the sales into Synapse as upserts (subtracting the sold items). Aggregating then gives you the current inventory; see the sketch below.
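A minimal sketch of the upsert this comment describes, assuming hypothetical dbo.Inventory and dbo.DailySales tables; exact UPDATE ... FROM support varies between Synapse pool types, so treat this as generic T-SQL rather than the definitive load pattern.

    -- Subtract today's sold units from the stored inventory level
    -- (hypothetical names; a plain UPDATE is used here instead of MERGE).
    UPDATE inv
    SET inv.ClosingQuantity = inv.ClosingQuantity - s.UnitsSold
    FROM dbo.Inventory AS inv
    INNER JOIN (
        SELECT StoreNumber, ProductId, SUM(UnitsSold) AS UnitsSold
        FROM dbo.DailySales
        WHERE SalesDate = CAST(GETDATE() AS date)
        GROUP BY StoreNumber, ProductId
    ) AS s
        ON s.StoreNumber = inv.StoreNumber
       AND s.ProductId = inv.ProductId;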
upvoted 1 times
NikP
Highly Voted 4 years, 10 months ago
I believe the daily inventory data goes through ADLS Gen2 (as needed for staging) and on to the analytical data store (Synapse). Another thing: "Daily inventory data comes from a Microsoft SQL server located on a private network" — I believe this is an on-premises server, and they didn't mention how that SQL server gets the daily inventory data. The previous day's sales data is already in Synapse (assuming the daily inventory data is moved from SQL Server to the data lake and then to Synapse). The current day's sales data can be ingested into Synapse directly from the event hub by using Azure Stream Analytics, or it can be stored in Blob storage from Event Hubs and loaded into Synapse every 10 minutes with ADF. You can then build a Power BI dashboard that runs T-SQL in Synapse to calculate the inventory level and feeds a live report/dashboard shared with every store. Answer should be A and E.
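As an illustration of the "Power BI dashboard running T-SQL in Synapse" idea, a view like the following (all object names are hypothetical) could be the object the report queries in DirectQuery mode:

    -- Hypothetical view exposing current inventory per store and product;
    -- Power BI can query it directly so the dashboard stays near real time.
    CREATE VIEW dbo.vw_CurrentInventory
    AS
    SELECT
        i.StoreNumber,
        i.ProductId,
        i.ClosingQuantity - ISNULL(SUM(s.UnitsSold), 0) AS CurrentInventory
    FROM dbo.Inventory AS i
    LEFT JOIN dbo.DailySales AS s
        ON s.StoreNumber = i.StoreNumber
       AND s.ProductId = i.ProductId
       AND s.SalesDate = CAST(GETDATE() AS date)
    WHERE i.InventoryDate = DATEADD(day, -1, CAST(GETDATE() AS date))
    GROUP BY i.StoreNumber, i.ProductId, i.ClosingQuantity;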
upvoted 9 times
Prashantprp
Most Recent 4 years, 3 months ago
Some output types support partitioning, and output batch sizes vary to optimize throughput. ASA cannot output to Databricks. Per https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-define-outputs, the supported output types are:
Output type | Partitioning | Authentication
Azure Data Lake Storage Gen 1 | Yes | Azure Active Directory user, Managed Identity
Azure SQL Database | Yes, optional | SQL user auth, Managed Identity (preview)
Azure Synapse Analytics | Yes | SQL user auth, Managed Identity (preview)
Blob storage and Azure Data Lake Gen 2 | Yes | Access key, Managed Identity (preview)
Azure Event Hubs | Yes, need to set the partition key column in output configuration | Access key, Managed Identity (preview)
Power BI | No | Azure Active Directory user, Managed Identity
Azure Table storage | Yes | Account key
Azure Service Bus queues | Yes | Access key
Azure Service Bus topics | Yes | Access key
Azure Cosmos DB | Yes | Access key
Azure Functions | Yes | Access key
upvoted 1 times
TaherAli2020
4 years, 4 months ago
The previous day's data will be in the DW, so it's better to do the calculation in the DW: use Transact-SQL to calculate the inventory levels.
upvoted 1 times
spiitr
4 years, 4 months ago
I would not go for Databricks, because the case study mentions that ExpressRoute or VPN will not be available. https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/on-prem-network
upvoted 1 times
spiitr
4 years, 4 months ago
So the PolyBase option (not feasible on the Avro format) and the Databricks options are ruled out, and the given answer is correct.
upvoted 1 times
Aditya167
4 years, 5 months ago
E: using Avro is not useful, as PolyBase does not support Avro. So the only options left are A and C.
upvoted 2 times
syu31svc
4 years, 6 months ago
"Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store" D is out PolyBase does not support Avro B is out Running DataFactory every 10 minutes is unncessary E is out A and C are therefore correct
upvoted 8 times
M0e
4 years, 8 months ago
D is obviously not an option, since it does not fulfil the requirement of calculating quickly. Option C should be selected instead: it can receive the previous day's inventory data from the data lake mentioned in the case study (used for staging the data before it is loaded into the data warehouse). -> Answer: A & C
upvoted 2 times
M0e
4 years, 8 months ago
Sorry, I meant E is not an option.
upvoted 1 times
Trove
4 years, 8 months ago
If it is A and E, then aggregation by store and product is done twice, which may not give the right result. So, from a business perspective, A and C make more sense.
upvoted 2 times
NikP
4 years, 10 months ago
B is incorrect because PolyBase doesn't support Avro files. C is incorrect because, first of all, I'm not sure you can output the result directly to Azure Databricks; even if you could, you would still need to get the previous sales data from the data lake or from Synapse, calculate the inventory level by comparing the two, send it to Synapse, and feed the Power BI report — which doesn't make sense. In Synapse, with the right distribution key on a large table, join queries run fast, so I believe calculating the inventory level in Synapse will be faster and cheaper than in Databricks. D is incorrect because you don't want to store the result in Blob storage; it just doesn't make sense. With Synapse and Power BI, configured the right way, you can see historical inventory levels by day, by store, by product, and so on.
upvoted 7 times
sharnav
4 years, 11 months ago
Answer is A and E only. The case study states "Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store", so the final output should go to the analytical data store, and Azure storage should be used only for staging — the case study also says the files should be removed from it as soon as the data is loaded into the final stage.
upvoted 3 times
LeonLeon
4 years, 11 months ago
E cannot be correct, because it uses a time-scheduled trigger instead of an event (blob) trigger. Updates have to be done as close to real time as possible!
upvoted 4 times
Tommy65
4 years, 11 months ago
Clearly the assumption is that the previous day's inventory is already in Synapse Analytics. On that basis the suggested answer is correct: in A you aggregate the data, load it into Synapse, and use a T-SQL statement there to calculate the difference from the day before. In E you copy the raw data into Synapse, then use T-SQL to aggregate it and more T-SQL to calculate the difference from the day before.
upvoted 3 times
azurearch
5 years, 1 month ago
A and B seem to be correct as well. With the 10-minute ADF schedule mentioned in option E, there is a delay.
upvoted 2 times
azurearch
5 years, 1 month ago
B is wrong, as PolyBase does not support the Avro format. A and C seem to be correct.
upvoted 5 times
willdy123
5 years, 1 month ago
Option E does not calculate the inventory; it only groups the input by store and product. Databricks can calculate the inventory by reading the event hub data plus the inventory data from the data lake (staged) or Synapse. Since that step is neither mentioned nor ruled out in option C, I would suggest options A and C as the correct answers.
upvoted 5 times
hokigir
5 years, 2 months ago
Why not A, C?
upvoted 7 times
vistran
5 years, 1 month ago
As I see it, the inventory data as of the previous date is in Azure Synapse Analytics, so the inventory level can't be calculated in Azure Databricks unless there is a feed back from Azure Synapse to Databricks.
upvoted 5 times
santafe
5 years ago
It says they need to avoid VMs, so Data Factory is not suggested.
upvoted 1 times
santafe
5 years ago
So C should be correct
upvoted 1 times
Arsa
4 years, 10 months ago
A VM is IaaS and ADF is PaaS, so it can be used.
upvoted 1 times
envy
4 years, 11 months ago
Databricks doesn't list Event Hubs as a data source: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/
upvoted 2 times
peppele
4 years, 11 months ago
Incorrect, https://docs.databricks.com/spark/latest/structured-streaming/streaming-event-hubs.html
upvoted 3 times
Community vote distribution: A (35%), C (25%), B (20%), Other