
Exam question from Amazon's AWS Certified Big Data - Specialty
Question #: 29
Topic #: 1

An online retailer is using Amazon DynamoDB to store data related to customer transactions. The items in the table contain several string attributes describing the transaction, as well as a JSON attribute containing the shopping cart and other details corresponding to the transaction. The average item size is 250KB, most of which is associated with the JSON attribute. The average customer generates 3GB of data per month.
Customers access the table to display their transaction history and review transaction details as needed.
Ninety percent of the queries against the table are executed when building the transaction history view, with the other 10% retrieving transaction details. The table is partitioned on CustomerID and sorted on transaction date.
The client has very high read capacity provisioned for the table and experiences very even utilization, but complains about the cost of Amazon DynamoDB compared to other NoSQL solutions.
Which strategy will reduce the cost associated with the client's read queries while not degrading quality?

  • A. Modify all database calls to use eventually consistent reads and advise customers that transaction history may be one second out-of-date.
  • B. Change the primary table to partition on TransactionID, create a GSI partitioned on customer and sorted on date, project small attributes into GSI, and then query GSI for summary data and the primary table for JSON details.
  • C. Vertically partition the table, store base attributes on the primary table, and create a foreign key reference to a secondary table containing the JSON data. Query the primary table for summary data and the secondary table for JSON details.
  • D. Create an LSI sorted on date, project the JSON attribute into the index, and then query the primary table for summary data and the LSI for JSON details.
Suggested Answer: D
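For context on option A: an eventually consistent read in DynamoDB is billed at half the read capacity of a strongly consistent one, and it is the default for `Query` and `GetItem`. A minimal sketch of what the history query's request parameters might look like with boto3 — table and attribute names here are hypothetical, not from the question:

```python
# Hypothetical request parameters for the transaction-history query.
# ConsistentRead=False (the DynamoDB default) bills eventually consistent
# reads at half the RCU cost of strongly consistent reads.

def build_history_query(customer_id: str) -> dict:
    """Build Query kwargs for a boto3 DynamoDB client; names are illustrative."""
    return {
        "TableName": "CustomerTransactions",          # hypothetical table name
        "KeyConditionExpression": "CustomerID = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
        "ScanIndexForward": False,                    # newest transactions first
        "ConsistentRead": False,                      # eventually consistent (option A)
    }

# Usage: boto3.client("dynamodb").query(**build_history_query("C-1001"))
```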

Comments

Bulti
Highly Voted 3 years, 7 months ago
Answer: A. Not B - you cannot change the primary table's partition key once created. If a GSI were created on TransactionID, we could use the base table for summary transactions and the GSI for transaction details, but even then it's hard to reduce the total RCU and WCU from what they currently have, since the RCU would now be distributed 90%/10% between the base table and the GSI. Not C - there is no concept of a foreign key in DynamoDB. Not D - you cannot create an LSI after the table is created, and the table already has date as the sort key. Besides, there is a limitation on the size of an item collection in a table with an LSI, which is 10GB. The answer is A - by forcing the consumers to use eventual consistency, the RCU cost can be cut in half.
upvoted 6 times
matthew95
3 years, 7 months ago
That's true: there is no concept of a foreign key in DynamoDB. So it must be A.
upvoted 1 times
...
...
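To put numbers on Bulti's point about halving RCU cost: DynamoDB charges 1 RCU per 4KB for a strongly consistent read and half that for an eventually consistent one. A rough back-of-the-envelope calculation for the 250KB items in the question:

```python
import math

def rcu_per_read(item_size_kb: float, eventually_consistent: bool) -> float:
    """RCUs consumed by one read: 1 RCU per 4 KB (strong), half that (eventual)."""
    units = math.ceil(item_size_kb / 4)  # round up to the next 4 KB boundary
    return units / 2 if eventually_consistent else units

strong = rcu_per_read(250, eventually_consistent=False)   # 63 RCUs per item
eventual = rcu_per_read(250, eventually_consistent=True)  # 31.5 RCUs per item
```

So simply flipping the read consistency halves the per-read capacity consumed, which is exactly option A's cost argument.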
ranabhay
Highly Voted 3 years, 8 months ago
I think the answer is A, as it will reduce cost and the history use case doesn't need strongly consistent reads. The table's partition key and sort key are already correct as well. Are these questions really on the actual exam? @mattyB123 did you appear for the exam? Please let us know how it was.
upvoted 5 times
mattyb123
3 years, 8 months ago
Yes, @ranabhay, the majority of these questions were on my exam. But as you have noticed, some of the selected answers are incorrect, which is why I have been so active discussing the reasons for certain answers. As you can tell, these questions aren't worded very well, on purpose, to make you either over- or under-think the solution.
upvoted 7 times
mattyb123
3 years, 8 months ago
@ranabhay I think you're right, must be A
upvoted 1 times
ranabhay
3 years, 8 months ago
Thanks
upvoted 1 times
...
VB
3 years, 7 months ago
But... the question says: "Which strategy will reduce the cost associated with the client's read queries while not degrading quality?" When we change from STRONG to EVENTUAL consistency, are we not degrading the quality?
upvoted 3 times
42Cert
3 years, 7 months ago
Yes, and not only for history. And nothing tells us that we are not already using eventually consistent reads.
upvoted 2 times
...
...
...
...
apertus
3 years, 8 months ago
For A, I think it degrades the quality. It can be D, as it is not mentioned that you cannot re-create the table.
upvoted 2 times
exams
3 years, 7 months ago
Agree with A
upvoted 1 times
...
...
...
hdesai
Most Recent 3 years, 7 months ago
It has to be B - the question clearly says the JSON holds the majority of the data, which is being returned in results even though it's not needed. Infrequently accessed bulk data has to be separated from small-sized data to reduce RCU cost, which is a best practice. One such practice is to store the data in S3 and just put a reference in DynamoDB. As someone pointed out, vertical partitioning is the way to do this, as mentioned on slide page 44. https://es.slideshare.net/AmazonWebServices/advanced-design-patterns-for-amazon-dynamodb-dat403-reinvent-2017
upvoted 2 times
...
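hdesai's S3-offload pattern (large JSON payload in S3, a pointer kept in DynamoDB) could be sketched like this. The bucket name, key scheme, and attribute names are all assumptions, and the actual uploads/writes would go through boto3's S3 and DynamoDB clients:

```python
import json

def split_transaction(item: dict, bucket: str = "tx-details-bucket") -> tuple:
    """Split a transaction item: small attributes stay in DynamoDB, the bulky
    JSON attribute is offloaded to S3 and replaced by a pointer.
    All names here are illustrative, not from the question."""
    details = item.pop("Details", None)  # remove the large JSON attribute
    s3_key = f"{item['CustomerID']}/{item['TransactionID']}.json"
    item["DetailsS3Uri"] = f"s3://{bucket}/{s3_key}"  # pointer stored in DynamoDB
    payload = json.dumps(details)        # body to upload via s3.put_object(...)
    return item, s3_key, payload
```

The slim item keeps reads cheap for the 90% history queries, while the 10% of detail views fetch the JSON from S3 by following the pointer.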
alopazo
3 years, 7 months ago
B. Look at page 44: https://es.slideshare.net/AmazonWebServices/advanced-design-patterns-for-amazon-dynamodb-dat403-reinvent-2017
upvoted 3 times
...
ub19
3 years, 7 months ago
90% of queries against the table are executed when building the transaction history view. In that case option A makes more sense: it will not impact quality much and reduces cost immediately. Options B and D require rebuilding the table. They also require further evaluation of whether what is projected into the index will meet the query requirements most of the time.
upvoted 1 times
...
srirampc
3 years, 7 months ago
A compromises quality. B: cannot change the partition key for an existing table. D: cannot add an LSI after the table is created. This leaves C; even though it is not ideal, it is possible. So, C is the answer.
upvoted 1 times
...
Bulti
3 years, 7 months ago
A is correct because B and D are not practically doable. Cannot change the partition key and cannot create an LSI after the table is created.
upvoted 1 times
...
Kuang
3 years, 7 months ago
I choose A. B: a GSI only provides eventual consistency (same as A), but it also needs its own provisioned capacity. D: the LSI max size per partition key is 10GB; each customer generates 3GB of data per month, so it will exceed the size limit.
upvoted 1 times
...
san2020
3 years, 7 months ago
my selection A
upvoted 1 times
...
richardxyz
3 years, 7 months ago
For D, both the primary table and the LSI contain the JSON attribute. When you query the primary table, it still returns the JSON attribute.
upvoted 2 times
...
yuriy_ber
3 years, 7 months ago
I think B - yes, we cannot change the partition key after creation, but the option says "Change the primary table to partition on TransactionID", so we can migrate without downtime. For D - we cannot create an LSI after the table is created. Furthermore, a local secondary index shares provisioned throughput settings for read and write activity with the table it is indexing, so there would be no improvement. A GSI can be added later and has its own provisioned throughput settings for read and write activity, separate from those of the table, so it would have a positive impact on costs.
upvoted 3 times
yuriy_ber
3 years, 7 months ago
One more thought concerning D - actually our table is already keyed on CustomerID and sorted by date, so the only additional value is projecting Details into the LSI. But that doesn't make sense - we can add a ProjectionExpression parameter to return only the attributes excluding Details even without an LSI, so an additional LSI (or GSI) doesn't give us any advantage here. Really very irritating, so A seems the only possible solution here, but it would degrade quality...
upvoted 2 times
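yuriy_ber's ProjectionExpression point comes with one caveat worth knowing: ProjectionExpression trims what is returned to the client, not what is read from storage, so unlike a small-projection index it does not reduce the RCUs consumed. A sketch of the parameter itself, with table and attribute names assumed for illustration:

```python
def build_summary_query(customer_id: str) -> dict:
    """Query kwargs returning only the small summary attributes.
    Caveat: DynamoDB still reads the full item, so RCU cost is unchanged.
    Table and attribute names are hypothetical."""
    return {
        "TableName": "CustomerTransactions",
        "KeyConditionExpression": "CustomerID = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
        "ProjectionExpression": "TransactionID, TxDate, Amount",  # omit the JSON blob
    }
```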
...
...
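A GSI like the one yuriy_ber describes can be added to an existing table via UpdateTable. A sketch of the request parameters, using an INCLUDE projection so only the small summary attributes are stored (and billed) in the index; the index, key, and attribute names are assumptions:

```python
def build_gsi_update(table_name: str = "CustomerTransactions") -> dict:
    """UpdateTable kwargs adding a GSI with a small INCLUDE projection.
    Index, key, and attribute names are illustrative."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "CustomerID", "AttributeType": "S"},
            {"AttributeName": "TxDate", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [{
            "Create": {
                "IndexName": "CustomerHistoryIndex",
                "KeySchema": [
                    {"AttributeName": "CustomerID", "KeyType": "HASH"},
                    {"AttributeName": "TxDate", "KeyType": "RANGE"},
                ],
                "Projection": {
                    "ProjectionType": "INCLUDE",
                    "NonKeyAttributes": ["Amount", "Status"],  # small attributes only
                },
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 100,   # separate from the base table
                    "WriteCapacityUnits": 100,
                },
            }
        }],
    }

# Usage: boto3.client("dynamodb").update_table(**build_gsi_update())
```

Because the index items carry only the small attributes, history-view queries against it read far fewer KB per item than queries against the 250KB base items.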
Raju_k
3 years, 7 months ago
I will choose B, since there is a limitation of max 10GB data size per partition key for tables with an LSI, and it is given that the average customer generates approx. 3GB of data per month, which will push the data size past the 10GB limit within a few months per customer. https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html#LSI.ItemCollections.SizeLimit
upvoted 2 times
...
asadao
3 years, 7 months ago
C no doubts
upvoted 2 times
...
Ro
3 years, 7 months ago
"Which strategy will reduce the cost associated with the client's read queries while not degrading quality?" - the question is about reducing the cost, and latency is not quality, so eventually consistent should be okay, but A doesn't seem right. So C?
upvoted 1 times
...
iwillsky
3 years, 7 months ago
Why not C? The item size is 250KB and the DDB max item size is 200KB. If we cut that 200KB out of the usual read, the cost will drop.
upvoted 1 times
42Cert
3 years, 7 months ago
C looks to me like B but with relational terms (foreign key) not NoSQL ones
upvoted 1 times
...
...
pkfe
3 years, 7 months ago
Attribute projection is an important DynamoDB tuning skill, so the answer has to be one of the projection options. In B, partitioning on TransactionID looks weird, so D.
upvoted 1 times
42Cert
3 years, 7 months ago
Partitioning on TransactionID helps when going from the history view to one transaction's details.
upvoted 1 times
...
...
bigdatalearner
3 years, 7 months ago
@mattyb123 you agree with A, but what about performance if we choose strongly consistent reads? Also, share your exam score if possible.
upvoted 1 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other