Hortonworks Exam HDPCD Topic 3 Question 50 Discussion

Actual exam question for Hortonworks's HDPCD exam
Question #: 50
Topic #: 3

You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into individual characters. For each one of these characters, you will emit the character as a key and an IntWritable as the value. As this will produce proportionally more intermediate data than input data, which two resources should you expect to be bottlenecks?
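
To make the data amplification concrete, here is a minimal plain-Java sketch of the map and reduce steps the question describes. It deliberately avoids the Hadoop API so it runs on its own. The class name `CharFrequencySketch` and the per-record byte sizes are illustrative assumptions, not values taken from the exam or measured on a cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the job in the question; not real Hadoop code.
public class CharFrequencySketch {
    // Assumed serialized sizes (illustrative only): a one-byte character key
    // with a one-byte length prefix, plus a four-byte integer value.
    static final int ASSUMED_KEY_BYTES = 2;
    static final int ASSUMED_VALUE_BYTES = 4;

    // Map step: emit (character, 1) for every character in the line.
    static List<Map.Entry<Character, Integer>> map(String line) {
        List<Map.Entry<Character, Integer>> out = new ArrayList<>();
        for (char c : line.toCharArray()) {
            out.add(new SimpleEntry<>(c, 1));
        }
        return out;
    }

    // Reduce step: sum the emitted counts per character.
    static Map<Character, Integer> reduce(List<Map.Entry<Character, Integer>> pairs) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (Map.Entry<Character, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    // Rough intermediate volume under the assumed record sizes above.
    static long intermediateBytes(String line) {
        return (long) map(line).size() * (ASSUMED_KEY_BYTES + ASSUMED_VALUE_BYTES);
    }

    public static void main(String[] args) {
        String line = "hello";
        System.out.println(reduce(map(line)));
        System.out.println(line.length() + " input chars -> "
                + intermediateBytes(line) + " intermediate bytes (assumed sizes)");
    }
}
```

Under these assumptions every input character becomes a record several times its own size, and in a real job that intermediate output has to be stored and moved between the map and reduce phases.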

Suggested Answer: B

Contribute your Thoughts:

Alease
7 months ago
I believe it's more about Processor and disk I/O, as we are dealing with individual characters.
upvoted 0 times
...
Irving
7 months ago
I see your point, Eugene. Maybe it's a combination of both.
upvoted 0 times
...
Eugene
7 months ago
But wouldn't Disk I/O and network I/O also be significant bottlenecks?
upvoted 0 times
...
Emilio
7 months ago
I agree with Irving, that makes sense since we are processing a lot of data.
upvoted 0 times
...
Irving
8 months ago
I think the bottlenecks will be Processor and network I/O.
upvoted 0 times
...