You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?
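Correct:
* summary
pyspark.sql.DataFrame.summary
Computes the specified statistics for numeric and string columns and returns them as a DataFrame. With no arguments it reports count, mean, stddev, min, approximate quartiles (25%, 50%, 75%), and max, which is the tabular view of the data distribution that the question asks for.
Incorrect:
* corr
pyspark.sql.DataFrame.corr calculates the correlation of two columns and returns a single double value, not a table. pandas.DataFrame.corr computes pairwise correlation of columns, excluding NA/null values. Neither describes how the data is distributed.
* freqItems
pyspark.sql.DataFrame.freqItems
Finds frequent items for columns, possibly with false positives, using the frequent element count algorithm described in https://doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou.
* rollup
pyspark.sql.DataFrame.rollup creates a multi-dimensional rollup over the specified columns so that aggregations can be run on them. It does not report distribution statistics. pandas has no rollup method.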