Exam AWS Certified Data Analytics - Specialty topic 1 question 64 discussion

A large financial company is running its ETL process. Part of this process is to move data from Amazon S3 into an Amazon Redshift cluster. The company wants to use the most cost-efficient method to load the dataset into Amazon Redshift.
Which combination of steps would meet these requirements? (Choose two.)

  • A. Use the COPY command with the manifest file to load data into Amazon Redshift.
  • B. Use S3DistCp to load files into Amazon Redshift.
  • C. Use temporary staging tables during the loading process.
  • D. Use the UNLOAD command to upload data into Amazon Redshift.
  • E. Use Amazon Redshift Spectrum to query files from Amazon S3.
Suggested Answer: AC

Comments

Priyanka_01
Highly Voted 3 years, 7 months ago
A & C: the COPY command, and loading into temporary staging tables.
upvoted 31 times
...
carol1522
Highly Voted 3 years, 7 months ago
A and C, because the goal is to move data from S3 to Redshift, and with E we are not moving any data.
upvoted 14 times
...
Debi_mishra
Most Recent 1 year, 11 months ago
A & C. But if you are going to take the exam in the near future, note that Redshift auto-copy is now a new no-ETL feature and may replace these options.
upvoted 2 times
...
pk349
2 years ago
AC: I passed the test
upvoted 1 times
...
cloudlearnerhere
2 years, 6 months ago
Selected Answer: AC
Correct answers are A & C. Option B is wrong as S3DistCp is used to copy data between S3 and HDFS. Option D is wrong as UNLOAD helps unloading the data from Redshift to S3. Option E is wrong as Redshift Spectrum does not load the data into Redshift, but the requirement is to load.
upvoted 8 times
cloudlearnerhere
2 years, 6 months ago
Option A is correct because the COPY command loads data in parallel from Amazon S3, Amazon EMR, Amazon DynamoDB, or multiple data sources on remote hosts. COPY loads large amounts of data much more efficiently than INSERT statements, and stores the data more effectively as well. Amazon S3 provides eventual consistency for some operations, so new data might not be available immediately after the upload, which can result in an incomplete data load or loading stale data. You can manage data consistency by using a manifest file to load data.

Option C is correct because you can efficiently update and insert new data by loading it into a staging table first. Amazon Redshift doesn't support a single merge statement (update or insert, also known as an upsert) to insert and update data from a single data source. However, you can effectively perform a merge operation: load your data into a staging table, then join the staging table with your target table for an UPDATE statement and an INSERT statement.
upvoted 4 times
...
...
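As a rough illustration of the COPY-with-manifest approach described in the comment above, the following Python sketch builds a manifest document and the matching COPY statement as plain strings. The bucket, object keys, table name, and IAM role ARN are all hypothetical placeholders, and the sketch only produces text; it does not upload the manifest or connect to a cluster. Data format options are omitted, so COPY would assume its default delimited-text format.

```python
import json


def build_manifest(s3_urls):
    """Return a COPY manifest dict that marks every listed S3 object as mandatory."""
    return {"entries": [{"url": url, "mandatory": True} for url in s3_urls]}


def build_copy_sql(table, manifest_url, iam_role_arn):
    """Return a COPY statement that reads its file list from a manifest object."""
    return (
        f"COPY {table}\n"
        f"FROM '{manifest_url}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "MANIFEST;"
    )


# Hypothetical object keys; replace with the files produced by the ETL step.
manifest = build_manifest([
    "s3://example-bucket/trades/part-0000",
    "s3://example-bucket/trades/part-0001",
])
manifest_json = json.dumps(manifest, indent=2)

# Hypothetical table name, manifest location, and role ARN.
copy_sql = build_copy_sql(
    "stage_trades",
    "s3://example-bucket/trades/load.manifest",
    "arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)

print(manifest_json)
print(copy_sql)
```

Marking each entry as mandatory makes the load fail if a listed file is missing, rather than silently loading a partial dataset.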
dushmantha
2 years, 8 months ago
Selected Answer: AC
B is not correct because it's used with EMR. D is not correct because UNLOAD is used to put data from Redshift into S3. C seems to involve a lot of work, but E does not move data into Redshift, which the organization requires, and A is correct anyway. So I would go with A and C.
upvoted 1 times
...
rocky48
2 years, 9 months ago
Selected Answer: AC
A, C are correct
upvoted 1 times
...
Bik000
2 years, 11 months ago
Selected Answer: AC
Answer is A & C
upvoted 1 times
...
jrheen
3 years ago
Answer - A,C
upvoted 1 times
...
aws2019
3 years, 5 months ago
A and C
upvoted 1 times
...
gunjan4392
3 years, 6 months ago
A, C are correct
upvoted 1 times
...
lostsoul07
3 years, 6 months ago
A,C is the right answer
upvoted 2 times
...
Subho_in
3 years, 6 months ago
https://aws.amazon.com/blogs/big-data/top-8-best-practices-for-high-performance-etl-processing-using-amazon-redshift/ See points 1 and 2. Options A and C must be the answer.
upvoted 10 times
Ramshizzle
2 years, 10 months ago
Point 5 in the article mentioned by Subho_in is also important to note. For why to use staging tables, also see: https://docs.aws.amazon.com/redshift/latest/dg/merge-create-staging-table.html
upvoted 1 times
...
...
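The staging-table pattern discussed in this thread can be sketched end to end. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a local stand-in for Redshift, fills the staging table with INSERT where Redshift would use COPY, and uses the delete-then-insert variant of the merge rather than the UPDATE plus INSERT variant. Table and column names are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Redshift cluster (assumption).
conn = sqlite3.connect(":memory:")

# Target table with two existing rows.
conn.execute("CREATE TABLE trades (trade_id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO trades VALUES (?, ?)", [(1, 100.0), (2, 200.0)])

# Temporary staging table. In Redshift this would be filled by COPY from S3.
conn.execute("CREATE TEMP TABLE stage_trades (trade_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO stage_trades VALUES (?, ?)",
    [(2, 250.0), (3, 300.0)],  # one changed row, one new row
)
conn.commit()

# Merge inside a single transaction: remove target rows that the staging
# table replaces, then insert everything from staging.
with conn:
    conn.execute(
        "DELETE FROM trades "
        "WHERE trade_id IN (SELECT trade_id FROM stage_trades)"
    )
    conn.execute("INSERT INTO trades SELECT trade_id, amount FROM stage_trades")

# The staging table is no longer needed once the merge has committed.
conn.execute("DROP TABLE stage_trades")

rows = conn.execute(
    "SELECT trade_id, amount FROM trades ORDER BY trade_id"
).fetchall()
print(rows)
```

Running the delete and insert in one transaction means readers of the target table never see a state where the replaced rows are missing.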
gtourkas
3 years, 6 months ago
I disagree with C. The question is about loading data. Staging tables are about transformation. It's A and E for me.
upvoted 1 times
APIsche
2 years, 8 months ago
"The organization want to load the dataset onto Amazon Redshift". Answer E does not move any data, nor does it help with that.
upvoted 1 times
...
...
jove
3 years, 6 months ago
It's asking for a "combination of steps", so they are A and C.
upvoted 2 times
...
sanjaym
3 years, 7 months ago
A and C
upvoted 2 times
...
syu31svc
3 years, 7 months ago
A & C for sure; the rest are clearly wrong
upvoted 2 times
...