Exam DP-201 topic 2 question 47 discussion

Actual exam question from Microsoft's DP-201
Question #: 47
Topic #: 2

You are planning a streaming data solution that will use Azure Databricks. The solution will stream sales transaction data from an online store. The solution has the following specifications:
✑ The output data will contain items purchased, quantity, line total sales amount, and line total tax amount.
✑ Line total sales amount and line total tax amount will be aggregated in Databricks.
✑ Sales transactions will never be updated. Instead, new rows will be added to adjust a sale.
You need to recommend an output mode for the dataset that will be processed by using Structured Streaming. The solution must minimize duplicate data.
What should you recommend?

  • A. Append
  • B. Complete
  • C. Update
Suggested Answer: A
Append Mode: Only new rows appended in the result table since the last trigger are written to external storage. This is applicable only for the queries where existing rows in the Result Table are not expected to change.
Incorrect Answers:
B: Complete Mode: The entire updated result table is written to external storage. It is up to the storage connector to decide how to handle the writing of the entire table.
C: Update Mode: Only the rows that were updated in the result table since the last trigger are written to external storage. This is different from Complete Mode in that Update Mode outputs only the rows that have changed since the last trigger. If the query doesn't contain aggregations, it is equivalent to Append mode.
Reference:
https://docs.microsoft.com/en-us/azure/databricks/getting-started/spark/streaming
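
The three output modes described above can be compared with a small toy model. This is a sketch in plain Python, not Spark or Databricks code: the function `rows_written`, the row keys, and the amounts are all hypothetical and only mimic the definitions quoted in the answer. It assumes a query without aggregations, where each transaction or adjustment is its own row in the result table.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, float]  # (row key, line total sales amount)


def rows_written(prev: Dict[str, float], curr: Dict[str, float], mode: str) -> List[Row]:
    """Toy model: rows sent to the sink for one trigger.

    prev and curr are the result table before and after the trigger.
    This is NOT Spark code; it only mimics the definitions quoted above.
    """
    if mode == "complete":
        # The entire updated result table is written.
        return sorted(curr.items())
    if mode == "update":
        # Only rows that are new or whose value changed are written.
        return sorted((k, v) for k, v in curr.items() if prev.get(k) != v)
    if mode == "append":
        # Only rows added since the last trigger are written.
        return sorted((k, v) for k, v in curr.items() if k not in prev)
    raise ValueError(f"unknown output mode: {mode}")


# Hypothetical data: an original sale, then an adjustment arriving as a new row.
before = {"txn-1": 10.0}
after = {"txn-1": 10.0, "txn-1-adj": -2.0}

for mode in ("append", "update", "complete"):
    print(mode, rows_written(before, after, mode))
```

In this toy case Append and Update write the same single adjustment row because no existing row changes, while Complete writes `txn-1` again, which is the kind of duplicate output the question asks to minimize. The sketch ignores Spark-specific restrictions on which output modes a given query supports.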

Comments

toandm
Highly Voted 3 years, 11 months ago
Same question in DP 200
upvoted 5 times
lorenzoV
Most Recent 2 years, 8 months ago
There will be a new row for each adjusted transaction (record), so 'Append' is correct.
upvoted 1 time
nefarious_smalls
2 years, 11 months ago
I think the answer is correct because the question says the data will be aggregated in Databricks. As far as the streaming output mode goes, the data is only being appended. Aggregations will be calculated separately.
upvoted 1 time
MayankSh
3 years, 10 months ago
Sales transactions will never be updated --> no updates means there is no need to perform merge operations or updates, hence Append is the correct answer.
upvoted 1 time
erssiws
3 years, 10 months ago
The required conditions are confusing: condition 2 -> Update, condition 3 -> Append.
upvoted 2 times
maynard13x8
4 years ago
I think it should be Update because new data could be added to rows that have already been copied. Any opinions?
upvoted 1 time
maynard13x8
4 years ago
Sorry, I hadn't read the third condition. I think the answer is correct.
upvoted 5 times