You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?
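Correct: summary
pyspark.sql.DataFrame.summary computes the specified statistics for numeric and string columns. With no arguments it returns count, mean, stddev, min, the approximate 25%, 50%, and 75% percentiles, and max, as a DataFrame that can be displayed as a table.
Incorrect:
* freqItems
pyspark.sql.DataFrame.freqItems finds frequent items for columns, possibly with false positives. It uses the frequent element count algorithm described in https://doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou.
* corr
pyspark.sql.DataFrame.corr calculates the correlation of two columns as a single value, and pandas.DataFrame.corr computes pairwise correlation of columns, excluding NA/null values. Neither describes the distribution of the data.
* rollup
pyspark.sql.DataFrame.rollup creates a multi-dimensional rollup over the specified columns so that aggregations can be run on them. It does not produce distribution statistics by itself.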