Hortonworks Exam HDPCD Topic 3 Question 62 Discussion

Actual exam question for Hortonworks's HDPCD exam
Question #: 62
Topic #: 3

You want to perform analysis on a large collection of images. You want to store this data in HDFS and process it with MapReduce, but you also want to give your data analysts and data scientists the ability to process the data directly from HDFS with an interpreted high-level programming language such as Python. Which format should you use to store this data in HDFS?

A) SequenceFiles
B) Avro
C) JSON
D) HTML
E) XML
F) CSV

Suggested Answer: B

Contribute your Thoughts:

Junita
4 months ago
XML? Hmm, I thought we were trying to avoid bloat and complexity. Avro or JSON seem like the way to go for this use case.
upvoted 0 times
...
Sharmaine
4 months ago
HTML? Really? I thought this was a data storage question, not a web design exam. Let's stick to the actual file formats, shall we?
upvoted 0 times
Amber
3 months ago
F) CSV
upvoted 0 times
...
Erasmo
3 months ago
E) XML
upvoted 0 times
...
Leah
3 months ago
C) JSON
upvoted 0 times
...
Linn
3 months ago
B) Avro
upvoted 0 times
...
Yolande
3 months ago
A) SequenceFiles
upvoted 0 times
...
...
Winifred
4 months ago
CSV is a classic choice, but it might not be the best for complex data structures. I'd say Avro or JSON are better options here.
upvoted 0 times
...
Kate
4 months ago
I'd go with JSON. It's human-readable, easy to parse, and widely supported by data tools.
upvoted 0 times
Elvis
3 months ago
JSON it is then. It's versatile and fits our requirements.
upvoted 0 times
...
Sheridan
3 months ago
I agree, JSON is a popular format for storing data in HDFS.
upvoted 0 times
...
Paris
3 months ago
JSON is a good choice for that. It's flexible and works well with Python.
upvoted 0 times
...
Denise
3 months ago
F) CSV
upvoted 0 times
...
Shaun
3 months ago
E) XML
upvoted 0 times
...
Mariann
3 months ago
D) HTML
upvoted 0 times
...
Timmy
3 months ago
C) JSON
upvoted 0 times
...
An
3 months ago
B) Avro
upvoted 0 times
...
Maryanne
3 months ago
A) SequenceFiles
upvoted 0 times
...
...
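One caveat on the JSON suggestions in this thread: the question is about images, which are binary data, and JSON has no native binary type. A minimal sketch of the usual workaround, assuming the image bytes are embedded as a base64 string (the helper name `image_to_json_record` and the record layout are hypothetical), shows the size cost:

```python
import base64
import json


def image_to_json_record(name: str, data: bytes) -> str:
    """Wrap raw image bytes in a JSON record by base64-encoding them."""
    encoded = base64.b64encode(data).decode("ascii")
    return json.dumps({"name": name, "data": encoded})


# 3072 bytes of placeholder data standing in for a real image file.
raw = bytes(range(256)) * 12
record = image_to_json_record("img-0001", raw)
payload = json.loads(record)["data"]

# base64 emits 4 output characters for every 3 input bytes.
print(len(raw), len(payload))  # 3072 4096
```

The encoded payload is about a third larger than the raw bytes, and every reader has to decode it before use, which is one reason a text format is a weaker fit for image data than a binary container.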
Brice
4 months ago
Avro, definitely! It's a compact, efficient, and schema-based format that works great with MapReduce and Python.
upvoted 0 times
Elin
4 months ago
I agree, it's a great choice for working with MapReduce and Python.
upvoted 0 times
...
Leonida
4 months ago
Avro is definitely the way to go for storing data in HDFS.
upvoted 0 times
...
...
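The "schema-based" point in this thread can be sketched briefly. Avro schemas are written in JSON, and `bytes` is one of Avro's primitive types, so image data can be stored without text encoding. The record name, namespace, and field names below are hypothetical, and the snippet only builds and inspects the schema with the standard library; actually writing Avro files would need an Avro library, which is not shown here.

```python
import json

# Hypothetical Avro record schema for one image; all names are illustrative.
IMAGE_SCHEMA = {
    "type": "record",
    "name": "Image",
    "namespace": "example.images",
    "fields": [
        {"name": "filename", "type": "string"},
        {"name": "format", "type": "string"},
        {"name": "data", "type": "bytes"},  # raw image bytes, no base64 needed
    ],
}


def field_types(schema: dict) -> dict:
    """Map each field name in a record schema to its declared type."""
    return {field["name"]: field["type"] for field in schema["fields"]}


# The schema is plain JSON, so it round-trips through the json module.
schema_json = json.dumps(IMAGE_SCHEMA, indent=2)
print(field_types(json.loads(schema_json)))
```

Because the schema travels with the data file, a MapReduce job and a Python script can both interpret the same records without a separate format description.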
Selma
5 months ago
I think CSV would be a good option as it is simple and easy to work with for data analysts and data scientists.
upvoted 0 times
...
Elly
5 months ago
I prefer JSON because it is human-readable and widely supported by programming languages like Python.
upvoted 0 times
...
Valentin
5 months ago
I agree with Leonor, Avro is a good choice for storing data in HDFS and processing it with MapReduce.
upvoted 0 times
...
Leonor
6 months ago
I think we should use Avro because it supports schema evolution and is efficient for MapReduce processing.
upvoted 0 times
...
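To illustrate what "schema evolution" means in practice: in Avro, a field added later can declare a default, so a reader using the newer schema can still read records written with the older one. The sketch below is a simplified stand-in written with plain dictionaries, not Avro's actual resolution algorithm; the function `read_with_schema` and the field names are hypothetical.

```python
# Simplified stand-in for Avro schema resolution; not the real algorithm.
OLD_FIELDS = [
    {"name": "filename", "type": "string"},
    {"name": "data", "type": "bytes"},
]
# The newer schema adds an optional field with a default value.
NEW_FIELDS = OLD_FIELDS + [
    {"name": "width", "type": ["null", "int"], "default": None},
]


def read_with_schema(record: dict, reader_fields: list) -> dict:
    """Project a record onto the reader's fields, filling in defaults."""
    out = {}
    for field in reader_fields:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return out


# A record written before the "width" field existed.
old_record = {"filename": "cat.png", "data": b"\x89PNG"}
print(read_with_schema(old_record, NEW_FIELDS))
```

A new field without a default would make old records unreadable under the new schema, which is why the sketch raises an error in that case.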
