Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 41 discussion

A data engineer must ingest a source of structured data that is in .csv format into an Amazon S3 data lake. The .csv files contain 15 columns. Data analysts need to run Amazon Athena queries on one or two columns of the dataset. The data analysts rarely query the entire file.
Which solution will meet these requirements MOST cost-effectively?

  • A. Use an AWS Glue PySpark job to ingest the source data into the data lake in .csv format.
  • B. Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to ingest the data into the data lake in JSON format.
  • C. Use an AWS Glue PySpark job to ingest the source data into the data lake in Apache Avro format.
  • D. Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to write the data into the data lake in Apache Parquet format.
Suggested Answer: D

Comments

imymoco
7 months, 2 weeks ago
Why not B? I think Athena can also handle JSON.
upvoted 1 time
...
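On the JSON question: Athena can read JSON, but JSON is a row-oriented text format, so a query on one or two columns still reads every record in full, and Athena bills by the amount of data scanned. JSON also repeats every key name in every record, so the files tend to be larger than the equivalent .csv. A minimal sketch of that size difference, using only the Python standard library; the column names and values are made up for illustration:

```python
import csv
import io
import json

# Hypothetical 15-column dataset; names and values are invented for this sketch.
columns = [f"col_{i}" for i in range(15)]
rows = [[str(r * 15 + c) for c in range(15)] for r in range(1000)]

# CSV: the header appears once, then only values.
csv_buf = io.StringIO()
writer = csv.writer(csv_buf)
writer.writerow(columns)
writer.writerows(rows)
csv_bytes = len(csv_buf.getvalue().encode("utf-8"))

# JSON Lines: every record repeats all 15 key names.
json_text = "\n".join(json.dumps(dict(zip(columns, row))) for row in rows)
json_bytes = len(json_text.encode("utf-8"))

print(f"CSV:  {csv_bytes} bytes")
print(f"JSON: {json_bytes} bytes")
```

So option B would work functionally, but it does nothing to reduce the data scanned per query, which is what the question is optimizing for.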
pypelyncar
1 year ago
Selected Answer: D
Athena works efficiently with data stored in Parquet, a columnar format. It can scan only the columns a query needs, which reduces the amount of data processed. Because Athena charges by data scanned, this means faster queries and lower costs for analysts who mostly query one or two columns.
upvoted 2 times
...
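The column-pruning argument above can be put into rough numbers. This is a back-of-the-envelope sketch, not a pricing calculator: the 150 GiB dataset size is hypothetical, the function name is made up, and it assumes all 15 columns are the same size while ignoring compression and file metadata.

```python
def estimated_scan_bytes(total_bytes, total_columns, queried_columns, columnar):
    """Rough bytes scanned per query.

    Assumes equally sized columns; ignores compression and metadata.
    Row formats (CSV, JSON, Avro) must read whole records, so the full
    dataset is scanned; a columnar format reads only the queried columns.
    """
    if columnar:
        return total_bytes * queried_columns // total_columns
    return total_bytes

dataset_bytes = 150 * 1024**3  # hypothetical 150 GiB dataset

row_scan = estimated_scan_bytes(dataset_bytes, 15, 2, columnar=False)
col_scan = estimated_scan_bytes(dataset_bytes, 15, 2, columnar=True)

print(f"Row-oriented scan: {row_scan / 1024**3:.0f} GiB")
print(f"Columnar scan:     {col_scan / 1024**3:.0f} GiB")
```

Under these assumptions a two-column query touches 2/15 of the data. In practice Parquet's compression usually shrinks the scanned volume further.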
FunkyFresco
1 year ago
Selected Answer: D
For cost-effectiveness, and because the analysts will query only one or two columns, a columnar format is the right choice.
upvoted 2 times
...
GiorgioGss
1 year, 3 months ago
Selected Answer: D
MOST cost-effectively = parquet
upvoted 3 times
...
atu1789
1 year, 4 months ago
Selected Answer: D
Glue + Parquet for cost-effectiveness
upvoted 2 times
...