Welcome to Pass4Success


Cloudera CCA175 Exam Questions

Status: RETIRED
Exam Name: CCA Spark and Hadoop Developer
Exam Code: CCA175
Related Certification(s): Cloudera Certified Associate Certification
Certification Provider: Cloudera
Number of CCA175 practice questions in our database: 96 (updated: 16-08-2024)
Expected CCA175 Exam Topics, as suggested by Cloudera:
  • Topic 1: Understand the fundamentals of querying datasets in Spark/ Write the results back into HDFS using Spark
  • Topic 2: Write queries that calculate aggregate statistics/ Load data from HDFS for use in Spark applications
  • Topic 3: Use meta store tables as an input source or an output sink for Spark applications/ Filter data using Spark
  • Topic 4: Generate reports by using queries against loaded data/ Produce ranked or sorted data
  • Topic 5: Perform standard extract, transform, load (ETL) processes on data using the Spark API/ Join disparate datasets using Spark
  • Topic 6: Use Spark SQL to interact with the meta store programmatically in your applications/ Read and write files in a variety of file formats
Discuss Cloudera CCA175 Topics, Questions or Ask Anything Related

Olga

8 months ago
Passing the Cloudera CCA Spark and Hadoop Developer exam was a great achievement for me, and I attribute my success to practicing with Pass4Success practice questions. The exam tested my knowledge of writing queries that calculate aggregate statistics and loading data from HDFS for use in Spark applications. One question that I found particularly tricky was about writing results back into HDFS using Spark. Despite my initial uncertainty, I was able to answer it correctly and pass the exam.
upvoted 0 times
...

Larae

9 months ago
My exam experience was successful as I passed the Cloudera CCA Spark and Hadoop Developer exam. The topics of loading data from HDFS for use in Spark applications were crucial for the exam. One question that I remember was about understanding the fundamentals of querying datasets in Spark. It was a challenging question, but I was able to answer it correctly and pass the exam.
upvoted 0 times
...

Daisy

10 months ago
Just passed the CCA Spark and Hadoop Developer exam! Be prepared for hands-on questions on Spark SQL transformations. Focus on understanding window functions and their applications. Thanks to Pass4Success for the spot-on practice questions that helped me prepare efficiently!
upvoted 0 times
...

Sharmaine

10 months ago
I recently passed the Cloudera CCA Spark and Hadoop Developer exam with the help of Pass4Success practice questions. The exam covered topics such as querying datasets in Spark and writing results back into HDFS. One question that stood out to me was related to writing queries that calculate aggregate statistics. I was a bit unsure of the answer, but I managed to pass the exam.
upvoted 0 times
...

Free Cloudera CCA175 Exam Actual Questions

Note: Premium Questions for CCA175 were last updated on 16-08-2024 (see below)

Question #1

Problem Scenario 32 : You have been given three files, as below.

spark3/sparkdir1/file1.txt

spark3/sparkdir2/file2.txt

spark3/sparkdir3/file3.txt

Each file contains some text.

spark3/sparkdir1/file1.txt

Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common and should be automatically handled by the framework

spark3/sparkdir2/file2.txt

The core of Apache Hadoop consists of a storage part known as Hadoop Distributed File System (HDFS) and a processing part called MapReduce. Hadoop splits files into large blocks and distributes them across nodes in a cluster. To process data, Hadoop transfers packaged code for nodes to process in parallel based on the data that needs to be processed.

spark3/sparkdir3/file3.txt

This approach takes advantage of data locality, where nodes manipulate the data they have access to, to allow the dataset to be processed faster and more efficiently than it would be in a more conventional supercomputer architecture that relies on a parallel file system where computation and data are distributed via high-speed networking.

Now write Spark code in Scala which will load all three files from HDFS and do a word count, filtering out the following words. The result should be sorted by word count in reverse order.

Filter words ("a","the","an", "as", "a","with","this","these","is","are","in", "for", "to","and","The","of")

Also please make sure you load all three files as a single RDD (all three files must be loaded using a single API call).

You have also been given the following codec:

import org.apache.hadoop.io.compress.GzipCodec

Please use the above codec to compress the file while saving to HDFS.

Correct Answer: A
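One possible solution, as a sketch: it assumes the code runs in spark-shell (where `sc` is the pre-built SparkContext) and that the output directory `spark3/result` is illustrative. Note that `sc.textFile` accepts a comma-separated list of paths, which satisfies the "single API call" requirement.

```scala
import org.apache.hadoop.io.compress.GzipCodec

// Load all three files as a single RDD with one API call
// (textFile accepts a comma-separated list of paths).
val content = sc.textFile(
  "spark3/sparkdir1/file1.txt,spark3/sparkdir2/file2.txt,spark3/sparkdir3/file3.txt")

// Words to filter out, as given in the problem statement.
val filterWords = Set("a", "the", "an", "as", "with", "this", "these",
                      "is", "are", "in", "for", "to", "and", "The", "of")

val counts = content
  .flatMap(_.split(" "))
  .filter(w => !filterWords.contains(w))
  .map(w => (w, 1))
  .reduceByKey(_ + _)
  .map { case (word, count) => (count, word) } // swap so we can sort by count
  .sortByKey(false)                            // reverse (descending) order

// Save to HDFS, compressed with the given codec.
counts.saveAsTextFile("spark3/result", classOf[GzipCodec])
```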

Question #2

Problem Scenario 79 : You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.orders

table=retail_db.order_items

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Columns of products table : (product_id | product_category_id | product_name | product_description | product_price | product_image )

Please accomplish the following activities.

1. Copy "retail_db.products" table to HDFS in a directory p93_products

2. Filter out all the empty prices

3. Sort all the products based on price in both ascending as well as descending order.

4. Sort all the products based on price as well as product_id in descending order.

5. Use the below functions to do data ordering or ranking and fetch the top 10 elements: top(), takeOrdered(), sortByKey()

Correct Answer: A
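One way to approach this, as a sketch assuming spark-shell with the pre-built SparkContext `sc`. The HDFS import itself is done outside Spark (a Sqoop command is shown in the comment); field positions assume the comma-delimited column order given above, with no commas inside field values.

```scala
// Step 1 (outside Spark): import the table with Sqoop, e.g.
//   sqoop import --connect jdbc:mysql://quickstart:3306/retail_db \
//     --username retail_dba --password cloudera \
//     --table products --target-dir p93_products

val products = sc.textFile("p93_products")

// 2. Filter out empty prices (price is the 5th comma-separated field).
val nonEmpty = products.filter(_.split(",")(4).nonEmpty)

// Key each record by price for sorting.
val byPrice = nonEmpty.map { line => (line.split(",")(4).toFloat, line) }

// 3. Ascending and descending by price.
val asc  = byPrice.sortByKey(true)
val desc = byPrice.sortByKey(false)

// 4. Descending by price, then product_id (a tuple key sorts field by field).
val byPriceAndId = nonEmpty.map { line =>
  val f = line.split(",")
  ((f(4).toFloat, f(0).toInt), line)
}.sortByKey(false)

// 5. Top 10 by price using top() / takeOrdered() on the keyed RDD.
val top10      = byPrice.top(10)         // 10 highest-priced records
val cheapest10 = byPrice.takeOrdered(10) // 10 lowest-priced records
```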

Question #3

Problem Scenario 80 : You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.products

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Columns of products table : (product_id | product_category_id | product_name | product_description | product_price | product_image )

Please accomplish the following activities.

1. Copy "retail_db.products" table to HDFS in a directory p93_products

2. Now sort the products data by product price per category; use the product_category_id column to group by category

Correct Answer: A
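A minimal sketch of the per-category sort, assuming the table was imported comma-delimited into p93_products (e.g. via Sqoop, as in the previous scenario) and that spark-shell provides `sc`. It groups by the 2nd field (product_category_id) and sorts each group by the 5th field (product_price):

```scala
val products = sc.textFile("p93_products")

// Group rows by product_category_id (2nd field), then sort each
// category's rows by product_price (5th field, parsed as Float).
val sortedPerCategory = products
  .map { line => (line.split(",")(1), line) }
  .groupByKey()
  .mapValues(_.toList.sortBy(_.split(",")(4).toFloat))

sortedPerCategory.collect().foreach(println)
```

For large categories, `groupByKey` pulls each whole group onto one executor; the RDD-level alternative is `repartitionAndSortWithinPartitions`, but the simple version above matches the scale of this exercise.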

Question #4

Problem Scenario 94 : You have to run your Spark application on YARN with 20 GB per executor, and the number of executors should be 50. Please replace XXX, YYY, ZZZ.

export HADOOP_CONF_DIR=XXX

./bin/spark-submit \

--class com.hadoopexam.MyTask \

XXX \

--deploy-mode cluster \ # can be client for client mode

YYY \

ZZZ \

/path/to/hadoopexam.jar \

1000

Correct Answer: A
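One plausible completion, as a sketch: the HADOOP_CONF_DIR path shown is an assumption (any directory holding your cluster's Hadoop config files works), the XXX inside the submit command becomes the YARN master flag, and YYY/ZZZ become the executor memory and count the scenario asks for.

```shell
export HADOOP_CONF_DIR=/etc/hadoop/conf   # XXX: assumed path to the Hadoop config dir

./bin/spark-submit \
  --class com.hadoopexam.MyTask \
  --master yarn \            # XXX inside the command: run on YARN
  --deploy-mode cluster \    # can be client for client mode
  --executor-memory 20G \    # YYY: 20 GB per executor
  --num-executors 50 \       # ZZZ: 50 executors
  /path/to/hadoopexam.jar \
  1000
```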

Question #5

Problem Scenario 88 : You have been given below three files

product.csv (create this file in HDFS)

productID,productCode,name,quantity,price,supplierid

1001,PEN,Pen Red,5000,1.23,501

1002,PEN,Pen Blue,8000,1.25,501

1003,PEN,Pen Black,2000,1.25,501

1004,PEC,Pencil 2B,10000,0.48,502

1005,PEC,Pencil 2H,8000,0.49,502

1006,PEC,Pencil HB,0,9999.99,502

2001,PEC,Pencil 3B,500,0.52,501

2002,PEC,Pencil 4B,200,0.62,501

2003,PEC,Pencil 5B,100,0.73,501

2004,PEC,Pencil 6B,500,0.47,502

supplier.csv

supplierid,name,phone

501,ABC Traders,88881111

502,XYZ Company,88882222

503,QQ Corp,88883333

products_suppliers.csv

productID,supplierID

2001,501

2002,501

2003,501

2004,502

2001,503

Now accomplish all the queries given in the solution.

1. It is possible that the same product can be supplied by multiple suppliers. Now find each product and its price according to each supplier.

2. Find all the supplier names who are supplying 'Pencil 3B'

3. Find all the products which are supplied by ABC Traders.

Correct Answer: B
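A sketch of the three queries with the RDD join API, assuming the CSVs sit in HDFS with the header rows shown above and that spark-shell provides `sc`. Note the data carries only one price per product (in product.csv), so "price according to each supplier" here means that product price joined out to each supplier link:

```scala
// Load the three CSVs, skipping the header rows.
val products = sc.textFile("product.csv")
  .filter(!_.startsWith("productID"))
  .map { l => val f = l.split(","); (f(0).toInt, (f(2), f(4).toFloat)) } // (productID, (name, price))

val suppliers = sc.textFile("supplier.csv")
  .filter(!_.startsWith("supplierid"))
  .map { l => val f = l.split(","); (f(0).toInt, f(1)) }                 // (supplierid, name)

val links = sc.textFile("products_suppliers.csv")
  .filter(!_.startsWith("productID"))
  .map { l => val f = l.split(","); (f(0).toInt, f(1).toInt) }           // (productID, supplierID)

// 1. Each product and its price, per supplier link.
val pricePerSupplier = links.join(products) // (productID, (supplierID, (name, price)))

// 2. Names of suppliers who supply 'Pencil 3B'.
val pencil3bSuppliers = pricePerSupplier
  .filter { case (_, (_, (name, _))) => name == "Pencil 3B" }
  .map { case (_, (supplierId, _)) => (supplierId, 1) }
  .join(suppliers)
  .map { case (_, (_, supplierName)) => supplierName }
  .distinct()

// 3. Products supplied by 'ABC Traders'.
val abcProducts = pricePerSupplier
  .map { case (_, (supplierId, (name, _))) => (supplierId, name) }
  .join(suppliers.filter { case (_, n) => n == "ABC Traders" })
  .map { case (_, (productName, _)) => productName }
  .distinct()
```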

