Exam DP-201 topic 1 question 34 discussion

Actual exam question from Microsoft's DP-201
Question #: 34
Topic #: 1

DRAG DROP
You have data on the 75,000 employees of your company. The data contains the properties shown in the following table.

You need to store the employee data in an Azure Cosmos DB container. Most queries on the data will filter by the Current Department and the Employee Surname properties.
Which partition key and item ID should you use for the container? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Select and Place:

Suggested Answer:
Partition key: Current Department

Item ID: Employee ID
Reference:
https://docs.microsoft.com/en-us/rest/api/storageservices/designing-a-scalable-partitioning-strategy-for-azure-table-storage

Comments

Luke97
Highly Voted 5 years, 2 months ago
I think the partition key should be Department rather than Surname, because of read latency. As the question states, most queries filter by Current Department and Employee Surname. With Surname as the partition key you would have 40,000 partitions (40,000 unique values), and a query filtered by Department would need to go through all 40,000 of them, which would be really bad for performance. On the other hand, with Department as the partition key you would have 25 partitions, and a query filtered by Surname would be much faster than one running over 40,000 partitions.
upvoted 80 times
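A toy model of the fan-out argument above, for anyone who wants to play with it. This is not Cosmos DB's actual routing: it only groups a handful of made-up employee records into logical partitions by a chosen key and counts how many of them an equality filter on one property has to consider. The records, the property names, and the `partitions_scanned` helper are all hypothetical. Keep in mind that the real service fans cross-partition queries out over physical partitions, each of which can host many logical partitions, so a count of logical partitions overstates the real cost.

```python
from collections import defaultdict

# Made-up sample records; the property names are illustrative only.
EMPLOYEES = [
    {"id": "1", "surname": "Ng", "currentDepartment": "Sales"},
    {"id": "2", "surname": "Ng", "currentDepartment": "Engineering"},
    {"id": "3", "surname": "Okafor", "currentDepartment": "Sales"},
    {"id": "4", "surname": "Silva", "currentDepartment": "Engineering"},
    {"id": "5", "surname": "Kim", "currentDepartment": "Finance"},
]


def partitions_scanned(items, partition_key, filter_property):
    """Count the logical partitions an equality filter must consider."""
    partitions = defaultdict(list)
    for item in items:
        partitions[item[partition_key]].append(item)
    # A filter on the partition key itself is routed to exactly one
    # partition; a filter on any other property has to look in all of them.
    return 1 if filter_property == partition_key else len(partitions)


for key in ("currentDepartment", "surname"):
    for flt in ("currentDepartment", "surname"):
        count = partitions_scanned(EMPLOYEES, key, flt)
        print(f"pk={key:<18} filter={flt:<18} partitions={count}")
```

Whichever property is chosen, queries on the other one become cross-partition queries, so the real trade-off is between few large partitions and many small ones.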
Gashurb
5 years, 2 months ago
Yep, I agree too. 40k partitions can't be good.
upvoted 1 times
Isio05
4 years, 7 months ago
"Current" department suggests it's something that can change, therefore it can't be the partition key. The reasoning that we shouldn't use Surname because it would result in many logical partitions is completely wrong. The docs even state clearly that the item ID (with only unique values) is a valid option (though here Surname is more appropriate, as we use it as a predicate in queries).
upvoted 2 times
...
...
francisco94
4 years, 6 months ago
Wrong. Department can change, so it can't be a partition key. Moreover, your argument is that Department as the partition key with filtering on Surname would be better because you would only need to access 25 partitions. Yes, 25 partitions of thousands of values each! It is bad either way, but the better one is Surname.
upvoted 4 times
...
Manue
5 years, 1 month ago
From https://docs.microsoft.com/en-gb/azure/cosmos-db/partitioning-overview: "For all containers, your partition key should: Be a property that has a value which does not change. If a property is your partition key, you can't update that property's value. Have a high cardinality. In other words, the property should have a wide range of possible values. Spread request unit (RU) consumption and data storage evenly across all logical partitions. This ensures even RU consumption and storage distribution across your physical partitions." Firstly, "Current Department" is something that could change. Secondly, 25 is not high cardinality, and it does not guarantee an even distribution of data. E.g. if this were a huge IT company, 50k employees could be in the Engineering department, 50 in HR, 50 in Marketing, etc. So I think it should be Surname and Employee ID.
upvoted 57 times
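To put a rough number on the skew concern above, here is a minimal sketch. The 75,000 total and the 25 departments come from the question and the discussion; the per-department headcounts are invented for illustration (following the Engineering / HR / Marketing example), and `largest_partition_share` is a hypothetical helper, not anything from an SDK.

```python
TOTAL_EMPLOYEES = 75_000  # from the question
DEPARTMENTS = 25          # as reported in the discussion

# Invented headcounts: one dominant department and two tiny ones.
headcount = {"Engineering": 50_000, "HR": 50, "Marketing": 50}

# Spread the remaining employees evenly over the other departments.
# Integer division is used, so the total ends up slightly under 75,000.
others = DEPARTMENTS - len(headcount)
remaining = TOTAL_EMPLOYEES - sum(headcount.values())
for i in range(others):
    headcount[f"Dept{i + 1}"] = remaining // others


def largest_partition_share(partition_sizes):
    """Fraction of all items that land in the biggest logical partition."""
    total = sum(partition_sizes.values())
    return max(partition_sizes.values()) / total


share = largest_partition_share(headcount)
print(f"Largest logical partition holds {share:.0%} of the items")
```

With perfectly even departments the share would be 1/25, i.e. 4%; the further the real figure is from that, the less evenly storage and RU consumption are spread.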
LiamRT
3 years, 7 months ago
The 'Sales' department will not change its name. An employee may transfer from 'Sales' to 'Engineering', but that causes no issue for the partitioning.
upvoted 1 times
...
aksoumi
4 years, 3 months ago
agree 100%
upvoted 1 times
...
Dhaval_Azure
4 years ago
No. It can't be Surname, as its values are only 99% populated. Will an empty value in the partition key work? I think we need a column that is 100% populated. It can be Employee ID or Current Department. So I am thinking of going with key: Employee ID and item ID: Current Department.
upvoted 1 times
...
...
...
Ard
Highly Voted 5 years, 3 months ago
I think the answer should be: partition by Surname (as it has more unique values than Department) and Employee ID as the item ID, since it's unique.
upvoted 31 times
...
manasa203
Most Recent 2 years ago
I think the answer is correct. If you look at the data-populated column, Surname is only 99% populated. A partition key column should not have null values. Department is 100% populated, hence Department is the best choice here.
upvoted 1 times
...
satyamkishoresingh
3 years, 8 months ago
Isn't employee ID a good candidate for partitioning ?
upvoted 1 times
...
Marcus1612
3 years, 9 months ago
The answer is wrong! Look at this, because the Department could change: https://docs.microsoft.com/en-gb/azure/cosmos-db/partitioning-overview#choose-partitionkey The answer would be good for a large container, but in this use case we have a small one. Partition strategy depends on the container size. Since we do not have a read-heavy container, we should use a property that does not change. Partition key = Employee ID (both Department and Surname can change). "For large read-heavy containers, however, you might want to choose a partition key that appears frequently as a filter in your queries. Queries can be efficiently routed to only the relevant physical partitions by including the partition key in the filter predicate. If most of your workload's requests are queries and most of your queries have an equality filter on the same property, this property can be a good partition key choice."
upvoted 1 times
...
hello_there_
3 years, 10 months ago
The partition key should be Employee ID. From the documentation (https://docs.microsoft.com/en-us/azure/cosmos-db/partitioning-overview): "For all containers, your partition key should: Be a property that has a value which does not change. If a property is your partition key, you can't update that property's value. Have a high cardinality. In other words, the property should have a wide range of possible values. Spread request unit (RU) consumption and data storage evenly across all logical partitions. This ensures even RU consumption and storage distribution across your physical partitions." Current Department has low cardinality and can change. Surname can change (people get married), and 1% of records have a null surname, which creates one large partition and thus an uneven distribution. The same documentation states: "For small read-heavy containers or write-heavy containers of any size, the item ID is naturally a great choice for the partition key." This container certainly qualifies as small: it's just 75,000 employee records.
upvoted 1 times
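The three criteria quoted above can be lined up against the candidate properties with a few lines of arithmetic. This is only a sketch: the table from the question is not reproduced on this page, so the figures below are the ones reported by commenters (40,000 distinct surnames at 99% populated, 25 departments at 100% populated), Employee ID is assumed to be unique and fully populated, and the `may_change` flags are judgment calls rather than facts from the question.

```python
from dataclasses import dataclass

TOTAL_EMPLOYEES = 75_000  # from the question


@dataclass(frozen=True)
class Candidate:
    name: str
    distinct_values: int  # as reported in the thread (assumed for Employee ID)
    populated_pct: int    # as reported in the thread (assumed for Employee ID)
    may_change: bool      # judgment call, not stated in the question


CANDIDATES = [
    Candidate("Employee ID", 75_000, 100, False),
    Candidate("Employee Surname", 40_000, 99, True),
    Candidate("Current Department", 25, 100, True),
]


def items_without_value(c):
    """Items that would all share the 'missing value' partition."""
    return TOTAL_EMPLOYEES * (100 - c.populated_pct) // 100


def avg_items_per_value(c):
    """Average logical partition size, ignoring items with no value."""
    populated = TOTAL_EMPLOYEES - items_without_value(c)
    return populated / c.distinct_values


for c in CANDIDATES:
    print(
        f"{c.name:<20} avg items/partition={avg_items_per_value(c):>8.1f} "
        f"missing={items_without_value(c):>4} may_change={c.may_change}"
    )
```

None of this settles the question by itself; it just makes the trade-offs between cardinality, missing values, and mutability easier to compare side by side.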
...
Durga123
4 years, 2 months ago
Lot of confusion here. What is the correct answer?
upvoted 3 times
alain2
4 years, 1 month ago
Partition key: Employee Surname; item ID: Employee ID
upvoted 6 times
...
...
Deepu1987
4 years, 4 months ago
I agree with the given solution: partition key - Current Department, item ID - Employee ID.
upvoted 5 times
...
syu31svc
4 years, 6 months ago
The answer given is correct. Put aside all the theory and concepts about partitioning and just think about it: A company has different departments and each department has its own employees. Between name/surname and ID, ID is definitely the better identifier.
upvoted 2 times
...
ttAsh
4 years, 6 months ago
The partition key should be Current Department (100% populated), as Surname is only 99% populated. We cannot have a partition key that is NULL / not populated.
upvoted 9 times
BitchNigga
4 years, 1 month ago
Finally someone said it
upvoted 3 times
...
captainbee
3 years, 12 months ago
But Department is also a field that can change quite easily, which is something that partition keys cannot do. So ultimately this question sucks, but if at gunpoint I had to take one of them, I'd go with Surname.
upvoted 2 times
...
...
essdeecee
4 years, 8 months ago
I suspect it's Surname rather than Department. Firstly, Department simply has too few distinct values; it's also "current department", so it is likely to change. Surname is similarly bad in that it can change (e.g. a woman getting married), but all else being equal it's better than Department.
upvoted 1 times
...
tdaou
4 years, 9 months ago
I agree with the suggested answer, I would also argue that the distribution of names will be highly uneven and would result in partitions of very different sizes, including 40,000 with one unique entry. So Current Department by elimination really.
upvoted 3 times
...
zglat
4 years, 10 months ago
'Current' department suggests that it changes. Surnames change all the time. As a result, neither of them is a good choice for the partition key. I believe that leaves Employee ID.
upvoted 2 times
...
Ash666
4 years, 11 months ago
From the docs: "For all containers, your partition key should: Be a property that has a value which does not change. If a property is your partition key, you can't update that property's value." So Current Department can't be the partition key. So obviously it's Surname. Employee ID should be the item ID.
upvoted 8 times
Ash666
4 years, 11 months ago
https://docs.microsoft.com/en-us/azure/cosmos-db/partitioning-overview
upvoted 1 times
...
...
envy
4 years, 11 months ago
I think the answer is correct, per https://docs.microsoft.com/en-us/azure/cosmos-db/partitioning-overview#choose-partitionkey: "If your container could grow to more than a few physical partitions, then you should make sure you pick a partition key that minimizes cross-partition queries. Your container will require more than a few physical partitions when either of the following are true: Your container will have over 30,000 RU's provisioned. Your container will store over 100 GB of data." Surname has 40,000 values, which is "more than a few physical partitions", so we should pick a partition key that minimizes cross-partition queries and is used in filters, which is "Current Department".
upvoted 4 times
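A quick back-of-the-envelope check against the two thresholds quoted above (30,000 RU and 100 GB). This is only a sketch: the 2 KB average item size and the 400 RU/s throughput are assumptions picked for illustration, since the question gives neither, and `likely_many_physical_partitions` is a hypothetical helper.

```python
RU_THRESHOLD = 30_000       # RU/s, as quoted from the docs above
STORAGE_THRESHOLD_GB = 100  # GB, as quoted from the docs above


def likely_many_physical_partitions(provisioned_ru, storage_gb):
    """True if either quoted threshold is exceeded."""
    return provisioned_ru > RU_THRESHOLD or storage_gb > STORAGE_THRESHOLD_GB


EMPLOYEES = 75_000    # from the question
ASSUMED_ITEM_KB = 2   # assumption: average size of one employee document
ASSUMED_RU = 400      # assumption: a small provisioned throughput

storage_gb = EMPLOYEES * ASSUMED_ITEM_KB / (1024 * 1024)
print(f"Estimated storage: {storage_gb:.3f} GB")
print("Many physical partitions likely:",
      likely_many_physical_partitions(ASSUMED_RU, storage_gb))
```

Note that the thresholds are about physical partitions (throughput and storage), whereas the 40,000 surnames are logical partition key values; under these assumptions the container stays far below both limits.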
Needium
4 years, 3 months ago
Your partition key should be a value that does not change, because you would not be able to change it. Moreover, nothing in this question suggests that the database is that large or would have over 30,000 RUs provisioned. Yes, the nulls in Surname and the fact that a surname could change are a concern, but Surname is very unlikely to change compared to Current Department. "Current" tells us it is quite volatile. I would rather have Surname as the partition key. Thanks for raising this point though; it is worth considering too.
upvoted 2 times
...
...
Mathster
5 years, 1 month ago
Surname cannot be a partition key because ~750 records have a null value.
upvoted 5 times
drdean
5 years ago
That's not the worst thing in the world https://sqlstudies.com/2017/05/03/partitioning-on-a-nullable-column/
upvoted 1 times
...
...
HeB
5 years, 2 months ago
I think the answer for Partition Key should be Employee Surname. It has a wider range and more unique values, see: https://docs.microsoft.com/nl-nl/azure/cosmos-db/partitioning-overview#choose-partitionkey
upvoted 9 times
...