The code block displayed below contains an error. The code block should produce a DataFrame with color as the only column and three rows with color values of red, blue, and green, respectively.
Find the error.
Code block:
spark.createDataFrame([("red",), ("blue",), ("green",)], "color")
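Instead of calling spark.createDataFrame, just DataFrame should be called. (Incorrect: the corrected code block still calls spark.createDataFrame; the only change is from "color" to ['color'].)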
Correct code block:
spark.createDataFrame([('red',), ('blue',), ('green',)], ['color'])
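The createDataFrame syntax is not exactly straightforward, but the documentation (linked below) provides several examples of how to use it, including one very similar to the code block presented here, which should help you answer this question correctly.
The schema argument accepts a data type, a data type string, or a list of column names. A bare string such as "color" is therefore read as a data type string rather than a column name; wrapping it in a list, ['color'], names the column.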
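As a minimal sketch of the corrected call, assuming an already-created SparkSession is passed in as spark. The names ROWS, COLUMNS, and build_color_df are hypothetical and introduced only for illustration:

```python
# Each row is a one-element tuple; the trailing comma is what makes it a
# tuple, since ("red") without the comma is just the string "red".
ROWS = [("red",), ("blue",), ("green",)]

# Column names are passed as a list. A bare string such as "color" would be
# interpreted as a data type string instead of a column name.
COLUMNS = ["color"]


def build_color_df(spark):
    """Return a one-column DataFrame built from ROWS.

    `spark` is assumed to be an existing pyspark.sql.SparkSession.
    """
    return spark.createDataFrame(ROWS, COLUMNS)
```

With an active session, build_color_df(spark).show() should display three rows under a single color column.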
More info: pyspark.sql.SparkSession.createDataFrame --- PySpark 3.1.2 documentation
Static notebook | Dynamic notebook: See test 2, question 23 (Databricks import instructions)