
Exam question from Amazon's AWS Certified Big Data - Specialty
Question #: 29
Topic #: 1

An online retailer is using Amazon DynamoDB to store data related to customer transactions. The items in the table contain several string attributes describing the transaction, as well as a JSON attribute containing the shopping cart and other details corresponding to the transaction. The average item size is 250KB, most of which is associated with the JSON attribute. The average customer generates 3GB of data per month.
Customers access the table to display their transaction history and review transaction details as needed.
Ninety percent of the queries against the table are executed when building the transaction history view, with the other 10% retrieving transaction details. The table is partitioned on CustomerID and sorted on transaction date.
The client has very high read capacity provisioned for the table and experiences very even utilization, but complains about the cost of Amazon DynamoDB compared to other NoSQL solutions.
Which strategy will reduce the cost associated with the client's read queries while not degrading quality?

  • A. Modify all database calls to use eventually consistent reads and advise customers that transaction history may be one second out-of-date.
  • B. Change the primary table to partition on TransactionID, create a GSI partitioned on customer and sorted on date, project small attributes into GSI, and then query GSI for summary data and the primary table for JSON details.
  • C. Vertically partition the table, store base attributes on the primary table, and create a foreign key reference to a secondary table containing the JSON data. Query the primary table for summary data and the secondary table for JSON details.
  • D. Create an LSI sorted on date, project the JSON attribute into the index, and then query the primary table for summary data and the LSI for JSON details.
Suggested Answer: D
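For context on option A: an eventually consistent read in DynamoDB is billed at half the read capacity of a strongly consistent one, and it is the default for `Query` and `GetItem`. A minimal sketch of what the history query's request parameters might look like with boto3 — table and attribute names here are hypothetical, not from the question:

```python
# Hypothetical request parameters for the transaction-history query.
# ConsistentRead=False (the DynamoDB default) bills eventually consistent
# reads at half the RCU cost of strongly consistent reads.

def build_history_query(customer_id: str) -> dict:
    """Build Query kwargs for a boto3 DynamoDB client; names are illustrative."""
    return {
        "TableName": "CustomerTransactions",          # hypothetical table name
        "KeyConditionExpression": "CustomerID = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
        "ScanIndexForward": False,                    # newest transactions first
        "ConsistentRead": False,                      # eventually consistent (option A)
    }

# Usage: boto3.client("dynamodb").query(**build_history_query("C-1001"))
```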

Comments

Bulti
Highly Voted 3 years, 7 months ago
Answer: A. Not B - you cannot change the primary table's partition key once created. If a GSI were created on TransactionID, we could use the base table for summary transactions and the GSI for transaction details, but even then it's hard to reduce the total RCU and WCU from what they currently have, since the RCU would now be distributed 90%/10% between the base table and the GSI. Not C - there is no concept of a foreign key in DynamoDB. Not D - you cannot create an LSI after the table is created, and the table already has date as the sort key. Besides, there is a limitation on the size of an item collection in a table with an LSI, which is 10GB. The answer is A - by forcing the consumers to use eventual consistency, the RCU cost can be cut in half.
upvoted 6 times
matthew95
3 years, 7 months ago
That's true: there is no concept of a foreign key in DynamoDB. So it must be A.
upvoted 1 times
...
...
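To put numbers on Bulti's point about halving RCU cost: DynamoDB charges 1 RCU per 4KB for a strongly consistent read and half that for an eventually consistent one. A rough back-of-the-envelope calculation for the 250KB items in the question:

```python
import math

def rcu_per_read(item_size_kb: float, eventually_consistent: bool) -> float:
    """RCUs consumed by one read: 1 RCU per 4 KB (strong), half that (eventual)."""
    units = math.ceil(item_size_kb / 4)  # round up to the next 4 KB boundary
    return units / 2 if eventually_consistent else units

strong = rcu_per_read(250, eventually_consistent=False)   # 63 RCUs per item
eventual = rcu_per_read(250, eventually_consistent=True)  # 31.5 RCUs per item
```

So simply flipping the read consistency halves the per-read capacity consumed, which is exactly option A's cost argument.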
ranabhay
Highly Voted 3 years, 8 months ago
I think the answer is A, as it will reduce cost and the history use case doesn't need strongly consistent reads. The table's partition key and sort key are already correct as well. Are these questions really on the actual exam? @mattyB123 did you appear for the exam? Please let us know how it was.
upvoted 5 times
mattyb123
3 years, 8 months ago
Yes, @ranabhay, the majority of these questions were on my exam. But as you have noticed, some of the selected answers are incorrect, which is why I have been so active discussing the reasons for certain answers. As you can tell, these questions aren't worded very well, on purpose, to make you either over- or under-think the solution.
upvoted 7 times
mattyb123
3 years, 8 months ago
@ranabhay I think you're right, must be A
upvoted 1 times
ranabhay
3 years, 8 months ago
Thanks
upvoted 1 times
...
VB
3 years, 7 months ago
But... the question says: "Which strategy will reduce the cost associated with the client's read queries while not degrading quality?" When we change from STRONG to EVENTUAL consistency, are we not degrading the quality?
upvoted 3 times
42Cert
3 years, 7 months ago
Yes, and not only for history. And nothing tells us that we are not already using eventually consistent reads.
upvoted 2 times
...
...
...
...
apertus
3 years, 8 months ago
For A, I think it degrades the quality. It can be D, as it is not mentioned that you cannot re-create the table.
upvoted 2 times
exams
3 years, 7 months ago
Agree with A
upvoted 1 times
...
...
...
hdesai
Most Recent 3 years, 7 months ago
It has to be B - the question clearly says the JSON holds the majority of the data, which is being returned in results even though it's not needed. Infrequently accessed bulk data has to be separated from small-sized data to reduce RCU cost, which is a best practice. One such practice is to store the data in S3 and just put a reference in DynamoDB. As someone pointed out, vertical partitioning is the way to do this, as mentioned on slide page 44. https://es.slideshare.net/AmazonWebServices/advanced-design-patterns-for-amazon-dynamodb-dat403-reinvent-2017
upvoted 2 times
...
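hdesai's S3-offload pattern (large JSON payload in S3, a pointer kept in DynamoDB) could be sketched like this. The bucket name, key scheme, and attribute names are all assumptions, and the actual uploads/writes would go through boto3's S3 and DynamoDB clients:

```python
import json

def split_transaction(item: dict, bucket: str = "tx-details-bucket") -> tuple:
    """Split a transaction item: small attributes stay in DynamoDB, the bulky
    JSON attribute is offloaded to S3 and replaced by a pointer.
    All names here are illustrative, not from the question."""
    details = item.pop("Details", None)  # remove the large JSON attribute
    s3_key = f"{item['CustomerID']}/{item['TransactionID']}.json"
    item["DetailsS3Uri"] = f"s3://{bucket}/{s3_key}"  # pointer stored in DynamoDB
    payload = json.dumps(details)        # body to upload via s3.put_object(...)
    return item, s3_key, payload
```

The slim item keeps reads cheap for the 90% history queries, while the 10% of detail views fetch the JSON from S3 by following the pointer.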
alopazo
3 years, 7 months ago
B. Look at page 44: https://es.slideshare.net/AmazonWebServices/advanced-design-patterns-for-amazon-dynamodb-dat403-reinvent-2017
upvoted 3 times
...
ub19
3 years, 7 months ago
90% of queries against the table are executed when building the transaction history view. In that case option A makes more sense: it will not impact quality much and reduces cost immediately. Options B and D require rebuilding the table. They also require further evaluation of whether what is projected into the index will meet the query requirements most of the time.
upvoted 1 times
...
srirampc
3 years, 7 months ago
A compromises quality. B: cannot change the partition key for an existing table. D: cannot add an LSI after the table is created. This leaves C; even though it is not ideal, it is possible. So, C is the answer.
upvoted 1 times
...
Bulti
3 years, 7 months ago
A is correct because B and D are not practically doable. Cannot change the partition key and cannot create an LSI after the table is created.
upvoted 1 times
...
Kuang
3 years, 7 months ago
I choose A. B: a GSI only provides eventual consistency (same as A), but it also needs its own provisioned capacity. D: the LSI max size per partition key is 10GB; each customer generates 3GB of data per month, so it will exceed the size limit.
upvoted 1 times
...
san2020
3 years, 7 months ago
my selection A
upvoted 1 times
...
richardxyz
3 years, 7 months ago
For D, both the primary table and the LSI contain the JSON attribute. When you query the primary table, it still returns the JSON attribute.
upvoted 2 times
...
yuriy_ber
3 years, 7 months ago
I think B - yes, we cannot change the partition key after creation, but the option says "Change the primary table to partition on TransactionID", so we can migrate without downtime. For D - we cannot create an LSI after the table is created. Furthermore, a local secondary index shares provisioned throughput settings for read and write activity with the table it is indexing, so there would be no improvement. A GSI can be added later and has its own provisioned throughput settings for read and write activity, separate from those of the table, so it would have a positive impact on costs.
upvoted 3 times
yuriy_ber
3 years, 7 months ago
One more thought concerning D - actually our table is already keyed on CustomerID and sorted by date, so the only additional value is projecting Details into the LSI. But that doesn't make sense - we can add a ProjectionExpression parameter to return only the attributes excluding Details even without an LSI, so an additional LSI (or GSI) doesn't give us any advantage here. Really very irritating, so A seems the only possible solution here, but it would degrade quality...
upvoted 2 times
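yuriy_ber's ProjectionExpression point comes with one caveat worth knowing: ProjectionExpression trims what is returned to the client, not what is read from storage, so unlike a small-projection index it does not reduce the RCUs consumed. A sketch of the parameter itself, with table and attribute names assumed for illustration:

```python
def build_summary_query(customer_id: str) -> dict:
    """Query kwargs returning only the small summary attributes.
    Caveat: DynamoDB still reads the full item, so RCU cost is unchanged.
    Table and attribute names are hypothetical."""
    return {
        "TableName": "CustomerTransactions",
        "KeyConditionExpression": "CustomerID = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
        "ProjectionExpression": "TransactionID, TxDate, Amount",  # omit the JSON blob
    }
```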
...
...
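A GSI like the one yuriy_ber describes can be added to an existing table via UpdateTable. A sketch of the request parameters, using an INCLUDE projection so only the small summary attributes are stored (and billed) in the index; the index, key, and attribute names are assumptions:

```python
def build_gsi_update(table_name: str = "CustomerTransactions") -> dict:
    """UpdateTable kwargs adding a GSI with a small INCLUDE projection.
    Index, key, and attribute names are illustrative."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "CustomerID", "AttributeType": "S"},
            {"AttributeName": "TxDate", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [{
            "Create": {
                "IndexName": "CustomerHistoryIndex",
                "KeySchema": [
                    {"AttributeName": "CustomerID", "KeyType": "HASH"},
                    {"AttributeName": "TxDate", "KeyType": "RANGE"},
                ],
                "Projection": {
                    "ProjectionType": "INCLUDE",
                    "NonKeyAttributes": ["Amount", "Status"],  # small attributes only
                },
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 100,   # separate from the base table
                    "WriteCapacityUnits": 100,
                },
            }
        }],
    }

# Usage: boto3.client("dynamodb").update_table(**build_gsi_update())
```

Because the index items carry only the small attributes, history-view queries against it read far fewer KB per item than queries against the 250KB base items.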
Raju_k
3 years, 7 months ago
I will choose B, since there is a limitation of max 10GB data size per partition key for tables with an LSI, and it is given that the average customer generates approx. 3GB of data per month, which will push the data size past the 10GB limit within a few months per customer. https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html#LSI.ItemCollections.SizeLimit
upvoted 2 times
...
asadao
3 years, 7 months ago
C no doubts
upvoted 2 times
...
Ro
3 years, 7 months ago
"Which strategy will reduce the cost associated with the client's read queries while not degrading quality?" - the question is about reducing the cost, and latency is not quality, so eventually consistent should be okay, but A doesn't seem right. So C?
upvoted 1 times
...
iwillsky
3 years, 7 months ago
Why not C? The item size is 250KB and the DDB max item size is 200KB. If we cut that 200KB out of the usual read, the cost will drop.
upvoted 1 times
42Cert
3 years, 7 months ago
C looks to me like B but with relational terms (foreign key) not NoSQL ones
upvoted 1 times
...
...
pkfe
3 years, 7 months ago
Attribute projection is an important DynamoDB tuning skill, so the answer has to be one of the projection options. In B, partitioning on TransactionID looks weird, so D.
upvoted 1 times
42Cert
3 years, 7 months ago
Partitioning on TransactionID helps when going from the history view to one transaction's details.
upvoted 1 times
...
...
bigdatalearner
3 years, 7 months ago
@mattyb123 you agree with A, but what about performance if we choose strongly consistent reads? Also, share your exam score if possible.
upvoted 1 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other