
Databricks Exam Databricks-Certified-Associate-Developer-for-Apache-Spark-3.0 Topic 1 Question 5 Discussion

Actual exam question for Databricks's Databricks-Certified-Associate-Developer-for-Apache-Spark-3.0 exam
Question #: 5
Topic #: 1

Which of the following code blocks displays various aggregated statistics of all columns in DataFrame transactionsDf, including the standard deviation and minimum of values in each column?

Suggested Answer: E

The DataFrame.summary() command is very practical for quickly calculating statistics of a DataFrame. You need to call .show() to display the results of the calculation. By default, the command calculates various statistics (see documentation linked below), including standard deviation and minimum. Note that the answer option that lists many statistics inside the summary() parentheses does not include the minimum, which is asked for in the question.
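
As an illustration of the approach the explanation describes (the actual answer options are not reproduced on this page), here is a minimal sketch; the sample data and column names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical stand-in for the exam's transactionsDf
transactionsDf = spark.createDataFrame(
    [(1, 120.50), (2, 89.99), (3, 45.00)],
    ["transactionId", "value"],
)

# summary() computes count, mean, stddev, min, the 25%/50%/75% quartiles,
# and max for all columns by default; .show() displays the result.
transactionsDf.summary().show()

# Passing specific statistic names restricts the output; an option that
# omits "min" would not satisfy the question.
transactionsDf.summary("stddev", "min").show()
```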

Answer options that include agg() do not work here as shown, since DataFrame.agg() expects more complex, column-specific instructions on how to aggregate values.
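
For contrast, a sketch of what agg() expects: explicit, per-column aggregation expressions rather than a blanket statistics report (the value column below is the same hypothetical column used above):

```python
from pyspark.sql import functions as F

# agg() needs column-specific instructions such as these; it does not
# produce a full per-column statistics summary on its own.
transactionsDf.agg(
    F.stddev("value").alias("stddev_value"),
    F.min("value").alias("min_value"),
).show()
```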

More info:

- pyspark.sql.DataFrame.summary --- PySpark 3.1.2 documentation

- pyspark.sql.DataFrame.agg --- PySpark 3.1.2 documentation


