Which of the following code blocks returns a DataFrame that matches the multi-column DataFrame itemsDf, except that integer column itemId has been converted into a string column?
itemsDf.withColumn('itemId', col('itemId').cast('string'))
Correct. You can convert the data type of a column using the cast method of the Column class. Note that you must use the withColumn method on itemsDf to replace the existing itemId column with the new version that contains strings.
itemsDf.withColumn('itemId', col('itemId').convert('string'))
Incorrect. The Column object that col('itemId') returns does not have a convert method.
itemsDf.withColumn('itemId', convert('itemId', 'string'))
Wrong. Spark's spark.sql.functions module does not have a convert method. The question is trying to mislead you by using the word 'converted'. Type conversion is also called 'type casting'. This may help you remember to look for a cast method instead of a convert method (see correct answer).
itemsDf.select(astype('itemId', 'string'))
False. While astype is a method of Column (and an alias of Column.cast), it is not a method of pyspark.sql.functions (what the code block implies). In addition, the question asks to return a full DataFrame that matches the multi-column DataFrame itemsDf. Selecting just one column from itemsDf as in the code block would return only a single-column DataFrame.
spark.cast(itemsDf, 'itemId', 'string')
No, the Spark session (referenced by the variable spark) does not have a cast method. You can find a list of all methods available on the Spark session in the documentation linked below.
More info:
- pyspark.sql.Column.cast --- PySpark 3.1.2 documentation
- pyspark.sql.Column.astype --- PySpark 3.1.2 documentation
- pyspark.sql.SparkSession --- PySpark 3.1.2 documentation
Static notebook | Dynamic notebook: See test 3, Question 42 (Databricks import instructions)