Databricks Exam Databricks-Certified-Associate-Developer-for-Apache-Spark-3.0 Topic 2 Question 66 Discussion

Actual exam question for Databricks's Databricks-Certified-Associate-Developer-for-Apache-Spark-3.0 exam
Question #: 66
Topic #: 2

The code block shown below should add column transactionDateForm to DataFrame transactionsDf. The column should express the unix-format timestamps in column transactionDate as strings like Apr 26 (Sunday). Choose the answer that correctly fills the blanks in the code block to accomplish this.

transactionsDf.__1__(__2__, from_unixtime(__3__, __4__))

Suggested Answer: C

Correct code block:

transactionsDf.withColumn('transactionDateForm', from_unixtime('transactionDate', 'MMM d (EEEE)'))

The question specifically asks about 'adding' a column. Among the presented answers, DataFrame.withColumn() is the correct command for this. In theory, DataFrame.select() could also be used for this purpose, if all existing columns are selected and a new one is added. DataFrame.withColumnRenamed() is not the appropriate command, since it can only rename existing columns; it cannot add a new column or change the values of a column.

Once DataFrame.withColumn() is chosen, you can read in the documentation (see below) that the first input argument to the method should be the column name of the new column.
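
As a minimal sketch of the two approaches discussed above, the functions below add the column once with withColumn() and once with select(). The function names and the module-level constants are illustrative assumptions, not part of the exam question. The pyspark import sits inside each function only so that the sketch can be loaded without a Spark installation.

```python
# Names below are hypothetical helpers for illustration only.
SOURCE_COLUMN = "transactionDate"      # unix-format timestamps (seconds)
NEW_COLUMN = "transactionDateForm"     # formatted string column to add
DATE_PATTERN = "MMM d (EEEE)"          # e.g. Apr 26 (Sunday)


def add_with_column(transactions_df):
    """Add the formatted column with DataFrame.withColumn()."""
    from pyspark.sql.functions import from_unixtime

    # First argument: name of the new column; second: the column expression.
    return transactions_df.withColumn(
        NEW_COLUMN, from_unixtime(SOURCE_COLUMN, DATE_PATTERN)
    )


def add_with_select(transactions_df):
    """Same result via DataFrame.select(): keep all columns, append one."""
    from pyspark.sql.functions import from_unixtime

    return transactions_df.select(
        "*", from_unixtime(SOURCE_COLUMN, DATE_PATTERN).alias(NEW_COLUMN)
    )
```

With an active SparkSession, add_with_column(transactionsDf) should return transactionsDf plus the new string column. The rendered values depend on the session time zone, since from_unixtime() converts the timestamp in that zone.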

The final difficulty is the date format. The question indicates that the date format Apr 26 (Sunday) is desired. The answers give 'MMM d (EEEE)' and 'MM d (EEE)' as options. It can be hard to remember the details of the date format patterns used in Spark; the difference between MMM and MM in particular is probably not something you deal with every day. There is an easy way to remember it, though: M (one letter) is usually the shortest form: 4 for April. MM includes padding: 04 for April. MMM (three letters) is the three-letter month abbreviation: Apr for April. And MMMM is the longest possible form: April. The day-of-week letter E follows the same logic: EEE gives the abbreviation Sun, while EEEE gives the full name Sunday. Knowing this sequence helps you select the correct option here: 'MM d (EEE)' would produce 04 26 (Sun), not Apr 26 (Sunday).
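
The width rule for the pattern letters can be imitated in plain Python. This is only a sketch using the standard library, not Spark's formatter: the render() helper is hypothetical, it handles just the letters M, d and E with English names, and the date 2020-04-26 is an assumed example chosen because it falls on a Sunday.

```python
import re
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")
DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
        "Saturday", "Sunday")  # date.weekday() returns 0 for Monday


def render(d: date, pattern: str) -> str:
    """Expand runs of M, d and E in pattern; other characters stay literal."""

    def expand(match):
        run = match.group(0)
        letter, width = run[0], len(run)
        if letter == "M":
            if width == 1:
                return str(d.month)           # 4
            if width == 2:
                return f"{d.month:02d}"       # 04
            name = MONTHS[d.month - 1]
            return name[:3] if width == 3 else name  # Apr / April
        if letter == "d":
            return f"{d.day:0{width}d}"       # zero-padded to the run length
        name = DAYS[d.weekday()]              # letter is E
        return name if width == 4 else name[:3]      # Sunday / Sun

    return re.sub(r"M{1,4}|d{1,2}|E{1,4}", expand, pattern)


example = date(2020, 4, 26)
print(render(example, "MMM d (EEEE)"))  # Apr 26 (Sunday)
print(render(example, "MM d (EEE)"))    # 04 26 (Sun)
```

Only the first pattern reproduces the string the question asks for, which is why the format 'MMM d (EEEE)' is the one to look for among the options.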

More info: pyspark.sql.DataFrame.withColumn (PySpark 3.1.2 documentation)

Static notebook | Dynamic notebook: See test 3, Question: 35 (Databricks import instructions)


Contribute your Thoughts:

Mari
14 days ago
Option E is using the wrong method, 'withColumnRenamed' won't create a new column, it'll just rename an existing one.
upvoted 0 times
...
Norah
15 days ago
Haha, option D has the right idea, but 'MM d (EEE)' is not the correct date format. It should be 'MMM d (EEEE)'.
upvoted 0 times
...
Cruz
16 days ago
Option C is close, but the order of the arguments for from_unixtime() is wrong. It should be 'transactionDate' for the third argument.
upvoted 0 times
Elroy
5 days ago
A) 1. withColumn 2. transactionDateForm 3. MMM d (EEEE) 4. transactionDate
upvoted 0 times
...
...
Jamal
17 days ago
I'm not sure about option B, it seems to be selecting the 'transactionDate' column instead of creating a new one.
upvoted 0 times
...
Fredric
18 days ago
Option A looks good, the blanks are filled correctly to add the new column 'transactionDateForm' to the DataFrame.
upvoted 0 times
...
Abraham
19 days ago
Option E? Really? Who would want to rename the 'transactionDate' column? That's not what the question is asking for.
upvoted 0 times
Edison
1 day ago
B) 1. select 2. transactionDate 3. transactionDateForm 4. MMM d (EEEE)
upvoted 0 times
...
Dion
4 days ago
C) 1. withColumn 2. transactionDateForm 3. transactionDate 4. MMM d (EEEE)
upvoted 0 times
...
Tran
7 days ago
A) 1. withColumn 2. transactionDateForm 3. MMM d (EEEE) 4. transactionDate
upvoted 0 times
...
...
Domonique
24 days ago
I'm going with option D. 'MM d (EEE)' is a nice, concise way to display the date. Plus, it's shorter than 'MMM d (EEEE)'.
upvoted 0 times
Nieves
1 day ago
I disagree, I believe option A is the correct one. It specifies the column name and format clearly.
upvoted 0 times
...
Yong
14 days ago
I think option D is the best choice too. It's simple and clear.
upvoted 0 times
...
...
Earlean
1 month ago
Hmm, I'm torn between options C and D. Both seem to use the correct columns, but the date format in option D looks cleaner.
upvoted 0 times
...
Rolland
1 month ago
I think option D is the correct answer. The date format 'MM d (EEE)' seems more appropriate for the requirements.
upvoted 0 times
Yun
18 days ago
Option D with 'MM d (EEE)' format seems suitable.
upvoted 0 times
...
Lavera
19 days ago
I believe option D is the correct choice.
upvoted 0 times
...
Ma
21 days ago
I think option D is the correct answer.
upvoted 0 times
...
...
Salena
1 month ago
I agree with Margart. Option A seems to be the most logical choice for adding the new column.
upvoted 0 times
...
Ciara
1 month ago
I'm not sure, but I think option C could also work. It's a tough choice between A and C.
upvoted 0 times
...
Margart
2 months ago
I think the correct answer is A because we need to add a new column with the specified format.
upvoted 0 times
...
Carmelina
2 months ago
Option A looks good, but I'm not sure about the date format. Shouldn't it be 'MMM d (EEE)' instead of 'MMM d (EEEE)'?
upvoted 0 times
Annamae
18 days ago
Let's go with option A but change the date format to 'MMM d (EEE)'.
upvoted 0 times
...
Precious
19 days ago
Yeah, I also think the date format should be 'MMM d (EEE)'.
upvoted 0 times
...
Judy
22 days ago
I agree, but I think the date format should be 'MMM d (EEE)' instead of 'MMM d (EEEE)'.
upvoted 0 times
...
Charlette
28 days ago
I think option A is the correct one.
upvoted 0 times
...
...
