Databricks Exam Databricks Certified Associate Developer for Apache Spark 3.0 Topic 2 Question 15 Discussion

Actual exam question for Databricks's Databricks Certified Associate Developer for Apache Spark 3.0 exam

Question #: 15
Topic #: 2


Which of the following code blocks shuffles DataFrame transactionsDf, which has 8 partitions, so that it has 10 partitions?

A. transactionsDf.repartition(transactionsDf.getNumPartitions()+2)

B. transactionsDf.repartition(transactionsDf.rdd.getNumPartitions()+2)

C. transactionsDf.coalesce(10)

D. transactionsDf.coalesce(transactionsDf.getNumPartitions()+2)

E. transactionsDf.repartition(transactionsDf._partitions+2)


Suggested Answer: B

transactionsDf.repartition(transactionsDf.rdd.getNumPartitions()+2)

Correct. repartition() is the right operator for increasing the number of partitions, and it performs a full shuffle. Calling getNumPartitions() on DataFrame.rdd returns the current number of partitions (8 here), so adding 2 yields the 10 partitions requested.

transactionsDf.coalesce(10)

No. After this command, the result will still have only 8 partitions, because coalesce() can only decrease the number of partitions, not increase it.

transactionsDf.repartition(transactionsDf.getNumPartitions()+2)

Incorrect. The DataFrame class has no getNumPartitions() method.

transactionsDf.coalesce(transactionsDf.getNumPartitions()+2)

Wrong. coalesce() can only reduce the number of partitions, and the DataFrame class has no getNumPartitions() method.

transactionsDf.repartition(transactionsDf._partitions+2)

No, DataFrame has no _partitions attribute. You can find out the current number of partitions of a DataFrame with the DataFrame.rdd.getNumPartitions() method.

More info: pyspark.sql.DataFrame.repartition (PySpark 3.1.2 documentation), pyspark.RDD.getNumPartitions (PySpark 3.1.2 documentation)


by Cecily at May 08, 2022, 09:44 AM
