
Exam DP-201 topic 7 question 1 discussion

Actual exam question from Microsoft's DP-201
Question #: 1
Topic #: 7

Inventory levels must be calculated by subtracting the current day's sales from the previous day's final inventory.
Which two options provide Litware with the ability to quickly calculate the current inventory levels by store and product? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.

  • A. Consume the output of the event hub by using Azure Stream Analytics and aggregate the data by store and product. Output the resulting data directly to Azure Synapse Analytics. Use Transact-SQL to calculate the inventory levels.
  • B. Output Event Hubs Avro files to Azure Blob storage. Use Transact-SQL to calculate the inventory levels by using PolyBase in Azure Synapse Analytics.
  • C. Consume the output of the event hub by using Databricks. Use Databricks to calculate the inventory levels and output the data to Azure Synapse Analytics.
  • D. Consume the output of the event hub by using Azure Stream Analytics and aggregate the data by store and product. Output the resulting data into Databricks. Calculate the inventory levels in Databricks and output the data to Azure Blob storage.
  • E. Output Event Hubs Avro files to Azure Blob storage. Trigger an Azure Data Factory copy activity to run every 10 minutes to load the data into Azure Synapse Analytics. Use Transact-SQL to aggregate the data by store and product.
Suggested Answer: AE
A: Azure Stream Analytics is a fully managed service providing low-latency, highly available, scalable complex event processing over streaming data in the cloud.
You can use an Azure Synapse Analytics (formerly SQL Data Warehouse) database as an output sink for your Stream Analytics jobs.
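As a rough illustration of option A, a Stream Analytics query along these lines could aggregate the sales stream by store and product before writing to the Synapse output. This is a sketch only: the input and output aliases (SalesEventHub, SynapseSales) and column names are assumptions, not from the case study.

    -- Hypothetical Stream Analytics query for option A: aggregate POS events
    -- per store and product over a short tumbling window and send them to a
    -- Synapse output sink defined on the job.
    SELECT
        StoreNumber,
        ProductId,
        System.Timestamp() AS WindowEnd,
        COUNT(*) AS UnitsSold          -- assumes one event per item sold
    INTO SynapseSales                  -- Azure Synapse Analytics output of the job
    FROM SalesEventHub TIMESTAMP BY EventEnqueuedUtcTime
    GROUP BY StoreNumber, ProductId, TumblingWindow(minute, 5)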
E: Event Hubs Capture is the easiest way to get data into Azure. Using Azure Data Lake, Azure Data Factory, and Azure HDInsight, you can perform batch processing and other analytics using familiar tools and platforms of your choosing, at any scale you need.
Note: Event Hubs Capture creates files in Avro format.
Captured data is written in Apache Avro format: a compact, fast, binary format that provides rich data structures with inline schema. This format is widely used in the Hadoop ecosystem, Stream Analytics, and Azure Data Factory.
Scenario: The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
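For the "Use Transact-SQL to calculate the inventory levels" step, a minimal sketch of the Synapse query could look like the following. The table and column names (dbo.Inventory, dbo.DailySales, ClosingQuantity, UnitsSold) are hypothetical, not from the case study.

    -- Previous day's final inventory minus today's aggregated sales,
    -- per store and product (hypothetical table and column names).
    SELECT
        i.StoreNumber,
        i.ProductId,
        i.ClosingQuantity - ISNULL(s.UnitsSold, 0) AS CurrentInventory
    FROM dbo.Inventory AS i                          -- previous day's final inventory
    LEFT JOIN (
        SELECT StoreNumber, ProductId, SUM(UnitsSold) AS UnitsSold
        FROM dbo.DailySales                          -- today's sales from the stream
        WHERE SalesDate = CAST(GETDATE() AS date)
        GROUP BY StoreNumber, ProductId
    ) AS s
        ON s.StoreNumber = i.StoreNumber
       AND s.ProductId = i.ProductId
    WHERE i.InventoryDate = DATEADD(day, -1, CAST(GETDATE() AS date));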
Reference:
https://docs.microsoft.com/bs-latn-ba/azure/sql-data-warehouse/sql-data-warehouse-integrate-azure-stream-analytics
https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-capture-overview

Comments

pravinDataSpecialist
Highly Voted 5 years ago
Should be A & C, as the final result should end up in Synapse.
upvoted 21 times
extraego
4 years, 10 months ago
C is incorrect because there is no mention of a step connecting Azure Databricks to Azure Synapse to fetch the previous inventory level, so you cannot calculate the current inventory level from Databricks. A & E are the only options that output sales data to Azure Synapse and calculate the inventory level there. I've noticed that many people in DP-201 imagine steps that are not stated, which causes a lot of confusion.
upvoted 13 times
Rambaldi
4 years, 7 months ago
but the inventory levels are not calculated in E
upvoted 1 times
sturcu
4 years, 4 months ago
Yes they are. You load the sales into Synapse as upserts (subtracting the sold items). Aggregating then gives you the current inventory; see the sketch below.
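A minimal sketch of the upsert this comment describes, assuming hypothetical dbo.Inventory and dbo.DailySales tables; exact UPDATE ... FROM support varies between Synapse pool types, so treat this as generic T-SQL rather than the definitive load pattern.

    -- Subtract today's sold units from the stored inventory level
    -- (hypothetical names; a plain UPDATE is used here instead of MERGE).
    UPDATE inv
    SET inv.ClosingQuantity = inv.ClosingQuantity - s.UnitsSold
    FROM dbo.Inventory AS inv
    INNER JOIN (
        SELECT StoreNumber, ProductId, SUM(UnitsSold) AS UnitsSold
        FROM dbo.DailySales
        WHERE SalesDate = CAST(GETDATE() AS date)
        GROUP BY StoreNumber, ProductId
    ) AS s
        ON s.StoreNumber = inv.StoreNumber
       AND s.ProductId = inv.ProductId;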
upvoted 1 times
NikP
Highly Voted 4 years, 10 months ago
I believe the daily inventory data goes through ADLS Gen2 (as needed for staging) and on to the analytical data store (Synapse). Another thing: "Daily inventory data comes from a Microsoft SQL server located on a private network" — I believe this is an on-premises server, and they didn't mention how that SQL server gets the daily inventory data. The previous day's sales data is already in Synapse (assuming the daily inventory data is moved from SQL Server to the data lake and then to Synapse). The current day's sales data can be ingested into Synapse directly from the event hub by using Azure Stream Analytics, or it can be stored in Blob storage from Event Hubs and loaded into Synapse every 10 minutes with ADF. You can then build a Power BI dashboard that runs T-SQL in Synapse to calculate the inventory level and feeds a live report/dashboard shared with every store. Answer should be A and E.
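As an illustration of the "Power BI dashboard running T-SQL in Synapse" idea, a view like the following (all object names are hypothetical) could be the object the report queries in DirectQuery mode:

    -- Hypothetical view exposing current inventory per store and product;
    -- Power BI can query it directly so the dashboard stays near real time.
    CREATE VIEW dbo.vw_CurrentInventory
    AS
    SELECT
        i.StoreNumber,
        i.ProductId,
        i.ClosingQuantity - ISNULL(SUM(s.UnitsSold), 0) AS CurrentInventory
    FROM dbo.Inventory AS i
    LEFT JOIN dbo.DailySales AS s
        ON s.StoreNumber = i.StoreNumber
       AND s.ProductId = i.ProductId
       AND s.SalesDate = CAST(GETDATE() AS date)
    WHERE i.InventoryDate = DATEADD(day, -1, CAST(GETDATE() AS date))
    GROUP BY i.StoreNumber, i.ProductId, i.ClosingQuantity;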
upvoted 9 times
Prashantprp
Most Recent 4 years, 3 months ago
Some output types support partitioning, and output batch sizes vary to optimize throughput. ASA cannot output to Databricks. Per https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-define-outputs, the supported output types are:
Output type | Partitioning | Authentication
Azure Data Lake Storage Gen 1 | Yes | Azure Active Directory user, Managed Identity
Azure SQL Database | Yes, optional | SQL user auth, Managed Identity (preview)
Azure Synapse Analytics | Yes | SQL user auth, Managed Identity (preview)
Blob storage and Azure Data Lake Gen 2 | Yes | Access key, Managed Identity (preview)
Azure Event Hubs | Yes, need to set the partition key column in output configuration | Access key, Managed Identity (preview)
Power BI | No | Azure Active Directory user, Managed Identity
Azure Table storage | Yes | Account key
Azure Service Bus queues | Yes | Access key
Azure Service Bus topics | Yes | Access key
Azure Cosmos DB | Yes | Access key
Azure Functions | Yes | Access key
upvoted 1 times
TaherAli2020
4 years, 4 months ago
The previous day's data will be in the DW, so it's better to do the calculation in the DW: use Transact-SQL to calculate the inventory levels.
upvoted 1 times
spiitr
4 years, 4 months ago
I would not go for Databricks, because the case study mentions that ExpressRoute or VPN will not be available. https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/on-prem-network
upvoted 1 times
spiitr
4 years, 4 months ago
So the PolyBase option (not feasible on the Avro format) and the Databricks options are ruled out, and the given answer is correct.
upvoted 1 times
Aditya167
4 years, 5 months ago
E: using Avro is not useful, as PolyBase does not support Avro. So the only options left are A and C.
upvoted 2 times
syu31svc
4 years, 6 months ago
"Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store" D is out PolyBase does not support Avro B is out Running DataFactory every 10 minutes is unncessary E is out A and C are therefore correct
upvoted 8 times
M0e
4 years, 8 months ago
D is obviously not an option, since it does not fulfil the requirement of calculating quickly. Option C should be selected instead: it can receive the previous day's inventory data from the data lake mentioned in the case study (used for staging the data before it is loaded into the data warehouse). -> Answer: A & C
upvoted 2 times
M0e
4 years, 8 months ago
Sorry, I meant E is not an option.
upvoted 1 times
Trove
4 years, 8 months ago
If it is A and E, then aggregation by store and product is done twice, which may not give the right result. So, from a business perspective, A and C make more sense.
upvoted 2 times
NikP
4 years, 10 months ago
B is incorrect because PolyBase doesn't support Avro files. C is incorrect because, first of all, I'm not sure you can output the result directly to Azure Databricks; even if you could, you would still need to get the previous sales data from the data lake or from Synapse, calculate the inventory level by comparing the two, send it to Synapse, and feed the Power BI report — which doesn't make sense. In Synapse, with the right distribution key on a large table, join queries run fast, so I believe calculating the inventory level in Synapse will be faster and cheaper than in Databricks. D is incorrect because you don't want to store the result in Blob storage; it just doesn't make sense. With Synapse and Power BI, configured the right way, you can see historical inventory levels by day, by store, by product, and so on.
upvoted 7 times
sharnav
4 years, 11 months ago
Answer is A and E only. The case study states "Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store", so the final output should go to the analytical data store, and Azure storage should be used only for staging — the case study also says the files should be removed from it as soon as the data is loaded into the final stage.
upvoted 3 times
LeonLeon
4 years, 11 months ago
E cannot be correct, because it uses a time-scheduled trigger instead of an event (blob) trigger. Updates have to be done as close to real time as possible!
upvoted 4 times
Tommy65
4 years, 11 months ago
Clearly the assumption is that the previous day's inventory is already in Synapse Analytics. On that basis the suggested answer is correct: in A you aggregate the data, load it into Synapse, and use a T-SQL statement there to calculate the difference from the day before. In E you copy the raw data into Synapse, then use T-SQL to aggregate it and more T-SQL to calculate the difference from the day before.
upvoted 3 times
azurearch
5 years, 1 month ago
A and B seem to be correct as well. With the 10-minute ADF schedule mentioned in option E, there is a delay.
upvoted 2 times
azurearch
5 years, 1 month ago
B is wrong, as PolyBase does not support the Avro format. A and C seem to be correct.
upvoted 5 times
willdy123
5 years, 1 month ago
Option E does not calculate the inventory; it only groups the input by store and product. Databricks can calculate the inventory by reading the event hub data plus the inventory data from the data lake (staged) or Synapse. Since that step is neither mentioned nor ruled out in option C, I would suggest options A and C as the correct answers.
upvoted 5 times
hokigir
5 years, 2 months ago
Why not A, C?
upvoted 7 times
vistran
5 years, 1 month ago
As I see it, the inventory data as of the previous date is in Azure Synapse Analytics, so the inventory level can't be calculated in Azure Databricks unless there is a feed back from Azure Synapse to Databricks.
upvoted 5 times
santafe
5 years ago
It says they need to avoid VMs, so Data Factory is not suggested.
upvoted 1 times
santafe
5 years ago
So C should be correct
upvoted 1 times
Arsa
4 years, 10 months ago
A VM is IaaS and ADF is PaaS, so it can be used.
upvoted 1 times
envy
4 years, 11 months ago
Databricks doesn't list Event Hubs as a data source: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/
upvoted 2 times
peppele
4 years, 11 months ago
Incorrect, https://docs.databricks.com/spark/latest/structured-streaming/streaming-event-hubs.html
upvoted 3 times
Community vote distribution: A (35%), C (25%), B (20%), Other