You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?
pandas.DataFrame.corr computes pairwise correlation of columns, excluding NA/null values.
Incorrect:
* freqItems
pyspark.sql.DataFrame.freqItems
Finds frequent items for columns, possibly with false positives, using the frequent element count algorithm described in https://doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou.
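* summary: pyspark.sql.DataFrame.summary is not an index operation. It computes the specified statistics for numeric and string columns (by default count, mean, stddev, min, approximate 25%/50%/75% percentiles, and max) and returns them as a DataFrame, so it arguably fits this question better than corr.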
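* rollup: pandas has no rollup method. pyspark.sql.DataFrame.rollup creates a multi-dimensional rollup over the specified columns so that aggregations can be run on them; it does not report distribution statistics, so it would not be correct anyway.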
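To see how a distribution summary differs from a correlation matrix, here is a minimal sketch in pandas on made-up data (the column names and values are hypothetical, not from the question). pandas is used only so the snippet runs without a Spark session; on a Spark DataFrame the analogous calls are pyspark.sql.DataFrame.describe and pyspark.sql.DataFrame.summary, which return the statistics as a DataFrame.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "age": [23, 35, 41, 29, 52],
    "income": [38000, 52000, 61000, 45000, 80000],
})

# Distribution statistics per column, one row per statistic:
# count, mean, std, min, 25%, 50%, 75%, max.
stats = df.describe()

# Pairwise Pearson correlation: one coefficient per pair of columns,
# which says nothing about how values within a column are distributed.
corr = df.corr()

print(stats)
print(corr)
```

The first table has one row per statistic and one column per input column, while the second is a square matrix indexed by column names on both axes.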