You are using a Python notebook in an Apache Spark pool in Azure Synapse Analytics.
You need to present the data distribution statistics from a DataFrame in a tabular view.
Which method should you invoke on the DataFrame?
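summary is the method to invoke. pyspark.sql.DataFrame.summary computes the requested statistics for numeric and string columns and returns them as a DataFrame, which displays as a table. With no arguments it reports count, mean, stddev, min, the approximate 25%, 50%, and 75% percentiles, and max.
Incorrect:
* freqItems
pyspark.sql.DataFrame.freqItems
Finds frequent items for columns, possibly with false positives, using the frequent element count algorithm described in https://doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou.
* corr measures the relationship between columns, not the distribution of each one. (pandas.DataFrame.corr computes pairwise correlation of columns, excluding NA/null values.)
* rollup creates a multi-dimensional rollup over the specified columns so that aggregations can be run on them. It does not report distribution statistics, and pandas has no rollup method.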
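To see why freqItems can return false positives, here is a minimal pure-Python sketch of a counter-based frequent-items pass in the spirit of the algorithm cited above. The function name `frequent_candidates` and its parameters are hypothetical, and this is not Spark's implementation, only an illustration of the general technique under the assumption that a fixed number of counters is kept and decremented when full.

```python
from math import ceil


def frequent_candidates(items, support):
    """Return a candidate set containing every item whose frequency
    exceeds `support` (a fraction of the input length).

    The set may also contain infrequent items (false positives).
    """
    if not 0 < support <= 1:
        raise ValueError("support must be in (0, 1]")
    # Keeping ceil(1/support) - 1 counters is enough to retain every item
    # that occurs in more than `support` of the input.
    max_counters = max(1, ceil(1 / support) - 1)
    counters = {}
    for item in items:
        if item in counters:
            counters[item] += 1
        elif len(counters) < max_counters:
            counters[item] = 1
        else:
            # No free counter: decrement every counter, drop those that
            # reach zero, and discard the current item.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return set(counters)


stream = list("abacadaeaf")  # "a" makes up 5 of the 10 items
candidates = frequent_candidates(stream, support=0.4)
```

In this run the candidate set keeps "a", as guaranteed, but "f" also survives even though it appears only once. A second pass over the data would be needed to filter out such false positives.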