Exam CCD-410 topic 1 question 21 discussion

Actual exam question from Cloudera's CCD-410
Question #: 21
Topic #: 1

What is the disadvantage of using multiple reducers with the default HashPartitioner and distributing your workload across your cluster?

  • A. You will not be able to compress the intermediate data.
  • B. You will no longer be able to take advantage of a Combiner.
  • C. By using multiple reducers with the default HashPartitioner, output files may not be in globally sorted order.
  • D. There are no concerns with this approach. It is always advisable to use multiple reducers.
Suggested Answer: C
Multiple reducers and total ordering
If your sort job runs with multiple reducers (either because mapreduce.job.reduces in mapred-site.xml has been set to a number larger than 1, or because you've used the -r option to specify the number of reducers on the command line), then by default Hadoop uses the HashPartitioner to distribute records across the reducers. Each reducer's output file is sorted on its own, but because keys are assigned to reducers by hash rather than by key range, you can't concatenate the output files to create a single sorted output file. To do this you'll need total ordering.
Reference: Sorting text files with MapReduce
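
The effect described above can be reproduced without a cluster. The following is a minimal sketch in plain Java, not Hadoop code: it assumes String keys (Hadoop's Text type computes its hash differently) and reuses the partition formula from Hadoop's HashPartitioner, (hashCode & Integer.MAX_VALUE) % numReduceTasks. The class name HashPartitionDemo and its helper methods are hypothetical, introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class HashPartitionDemo {

    // Same arithmetic as HashPartitioner.getPartition, applied to a String key.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Simulates the shuffle: each "reducer" receives the keys hashed to it,
    // sorted within that reducer (a TreeSet keeps its elements in order).
    static List<List<String>> run(List<String> keys, int numReducers) {
        List<TreeSet<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            buckets.add(new TreeSet<>());
        }
        for (String key : keys) {
            buckets.get(partition(key, numReducers)).add(key);
        }
        List<List<String>> outputs = new ArrayList<>();
        for (TreeSet<String> bucket : buckets) {
            outputs.add(new ArrayList<>(bucket));
        }
        return outputs;
    }

    // Concatenates the per-reducer outputs in reducer order,
    // like joining part-r-00000, part-r-00001, ... end to end.
    static List<String> concat(List<List<String>> outputs) {
        List<String> all = new ArrayList<>();
        for (List<String> output : outputs) {
            all.addAll(output);
        }
        return all;
    }

    // True if every element is >= the one before it.
    static boolean isSorted(List<String> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1).compareTo(values.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f");
        List<List<String>> outputs = run(keys, 2);
        System.out.println(outputs.get(0));            // [b, d, f]
        System.out.println(outputs.get(1));            // [a, c, e]
        System.out.println(isSorted(concat(outputs))); // false
    }
}
```

With two simulated reducers, each output list is sorted, yet joining them gives b, d, f, a, c, e. That is the situation answer C describes: the files are individually sorted but not globally sorted.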
