Exam AWS Certified Machine Learning - Specialty topic 1 question 79 discussion

A trucking company is collecting live image data from its fleet of trucks across the globe. The data is growing rapidly and approximately 100 GB of new data is generated every day. The company wants to explore machine learning use cases while ensuring the data is only accessible to specific IAM users.
Which storage option provides the most processing flexibility and will allow access control with IAM?

  • A. Use a database, such as Amazon DynamoDB, to store the images, and set the IAM policies to restrict access to only the desired IAM users.
  • B. Use an Amazon S3-backed data lake to store the raw images, and set up the permissions using bucket policies.
  • C. Set up Amazon EMR with Hadoop Distributed File System (HDFS) to store the files, and restrict access to the EMR instances using IAM policies.
  • D. Configure Amazon EFS with IAM policies to make the data available to Amazon EC2 instances owned by the IAM users.
Suggested Answer: B 🗳️
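
As a rough illustration of what "set up the permissions using bucket policies" in option B could look like, here is a minimal sketch that builds a bucket policy document granting access to a named list of IAM users. The bucket name, the account ID, and the user names are hypothetical placeholders, and the chosen actions (`s3:ListBucket`, `s3:GetObject`, `s3:PutObject`) are an assumption about what the ML users would need, not something the question specifies.

```python
import json


def build_bucket_policy(bucket_name, user_arns):
    """Build a bucket policy that allows only the listed IAM users.

    This only constructs the policy document; it does not call AWS.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    principals = {"AWS": list(user_arns)}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "AllowListForNamedUsers",
                "Effect": "Allow",
                "Principal": principals,
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
            },
            {
                # Reading and writing apply to the objects in the bucket.
                "Sid": "AllowObjectAccessForNamedUsers",
                "Effect": "Allow",
                "Principal": principals,
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


# Hypothetical bucket and users, for illustration only.
policy = build_bucket_policy(
    "example-truck-images",
    [
        "arn:aws:iam::111122223333:user/ml-engineer-1",
        "arn:aws:iam::111122223333:user/ml-engineer-2",
    ],
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON string could then be attached to the bucket, for example with boto3's `put_bucket_policy(Bucket=..., Policy=policy_json)`. A real setup would likely also need to consider the users' own identity policies and whether an explicit deny for everyone else is wanted.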

Comments

Paul_NoName
Highly Voted 3 years, 7 months ago
B is the right answer
upvoted 33 times
...
clawo
Highly Voted 3 years, 3 months ago
Selected Answer: B
B to use as storage with policies
upvoted 7 times
...
xicocaio
Most Recent 7 months, 1 week ago
Selected Answer: B
- Amazon S3-backed data lake: S3 is the best storage option for large and rapidly growing datasets like images from trucks. S3 scales easily, handles large volumes of data, and is cost-effective for long-term storage, making it a natural choice for this scenario.
- IAM access control: You can use bucket policies in S3 to set very specific access controls, ensuring that only certain IAM users have permission to access or modify the data. This satisfies the requirement for access control using IAM.
- Processing flexibility: Storing the images in S3 offers flexibility for future machine learning use cases. The data stored in S3 can easily be integrated with other AWS services like SageMaker, Athena, EMR, and more for processing and analysis.
upvoted 1 times
...
endeesa
1 year, 5 months ago
Selected Answer: B
EMR/HDFS is not more 'flexible' than S3
upvoted 1 times
...
loict
1 year, 7 months ago
Selected Answer: B
A. NO - volume too big for a DB
B. YES
C. NO - instance access will not control HDFS access
D. NO - EFS does not use IAM policies (it is unix)
upvoted 1 times
...
Mickey321
1 year, 8 months ago
Selected Answer: B
S3 indeed
upvoted 1 times
...
JK1977
1 year, 11 months ago
Selected Answer: B
S3 always
upvoted 1 times
...
Nadia0012
2 years, 1 month ago
Selected Answer: B
I would say the answer is B, not because of the cost of EMR (though that is also a valid point), but because "most processing flexibility" indicates that S3 is the better option: all ML solutions and workflows integrate with S3. The question doesn't say what the ML solution is or which services it uses, so I take the safe side and go with S3.
upvoted 2 times
...
KlaudYu
2 years, 9 months ago
Selected Answer: B
C is not affordable because it is ephemeral storage. https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-file-systems.html "HDFS is used by the master and core nodes. One advantage is that it's fast; a disadvantage is that it's ephemeral storage which is reclaimed when the cluster ends. It's best used for caching the results produced by intermediate job-flow steps."
upvoted 4 times
ZSun
2 years ago
the question does not require long-term storage.
upvoted 1 times
...
...
geekgirl007
3 years, 3 months ago
Selected Answer: C
C is correct. It says real-time data that is to be used for an ML process, so EMR is more suitable. Also, S3 bucket policies are not the same as IAM users, so B is not correct.
upvoted 4 times
ovokpus
2 years, 10 months ago
Why would you need to spin up servers (EMR) just to store visual data for ML?
upvoted 5 times
...
...
Abdo702
3 years, 5 months ago
I think Amazon EMR is more appropriate, as the data scheme stated is a big data scheme. https://aws.amazon.com/emr/?whats-new-cards.sort-by=item.additionalFields.postDateTime&whats-new-cards.sort-order=desc
upvoted 3 times
Sourabh1703
3 years, 3 months ago
IAM support is required at the storage level. That is not possible with the option as described, because IAM applies to the instances that HDFS runs on, not to HDFS itself; hence B should be correct.
upvoted 2 times
...
...
Vita_Rasta84444
3 years, 6 months ago
B is the right answer
upvoted 2 times
...
srinu3054
3 years, 6 months ago
S3 is the easy, scalable and secure option to store the image data.
upvoted 1 times
...
astonm13
3 years, 6 months ago
B is the right answer
upvoted 1 times
...
zzaibis
3 years, 6 months ago
B is an appropriate choice
upvoted 2 times
...