A data analyst runs the following command:
SELECT age, country
FROM my_table
WHERE age >= 75 AND country = 'canada';
Which of the following tables represents the output of the above command?
A)
B)
C)
D)
E)
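The option tables are not reproduced here, so the sketch below only illustrates what the query's filter and column list do. It uses Python's built-in sqlite3 module as a stand-in engine. The sample rows and the extra `name` column are hypothetical, chosen to exercise each condition. They are not taken from the question.

```python
import sqlite3

# Hypothetical sample data: the source does not show the contents of my_table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, age INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("a", 80, "canada"),  # kept: both conditions hold
        ("b", 75, "canada"),  # kept: >= includes the boundary value 75
        ("c", 74, "canada"),  # dropped: age is below 75
        ("d", 90, "usa"),     # dropped: country does not match
    ],
)

# Same query as in the question; sorted only to make the printed order stable.
rows = sorted(
    conn.execute(
        "SELECT age, country FROM my_table "
        "WHERE age >= 75 AND country = 'canada'"
    ).fetchall()
)
print(rows)  # [(75, 'canada'), (80, 'canada')]
```

The correct output table therefore has exactly two columns, age and country, and contains only rows where both conditions hold at once.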
Option A uses the SELECT DISTINCT statement to remove duplicate rows from table_bronze and create a new table, table_silver, with the deduplicated data. This is the correct way to deduplicate data using Spark SQL. Option B simply inserts all the rows from table_bronze into table_silver, without removing any duplicates. Option C is not valid Spark SQL syntax, as there is no MERGE DEDUPLICATE statement. Option D appends all the rows from table_bronze to table_silver, without removing any duplicates. Option E overwrites the existing data in table_silver with the data from table_bronze, without removing any duplicates.
Reference: Delete Duplicate using SPARK SQL; Spark SQL - How to Remove Duplicate Rows
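The SELECT DISTINCT approach described above can be sketched as follows. This is a minimal illustration, again using Python's sqlite3 module rather than Spark. The column names `id` and `value` and the sample rows are assumptions, since the source names only the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and rows; only the table names come from the text.
conn.execute("CREATE TABLE table_bronze (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO table_bronze VALUES (?, ?)",
    [(1, "x"), (1, "x"), (2, "y")],  # the first two rows are exact duplicates
)

# Create the deduplicated table in one statement.
conn.execute("CREATE TABLE table_silver AS SELECT DISTINCT * FROM table_bronze")

silver = sorted(conn.execute("SELECT * FROM table_silver").fetchall())
print(silver)  # [(1, 'x'), (2, 'y')]
```

A plain INSERT INTO ... SELECT * without DISTINCT would have copied all three rows, which is why the insert, append, and overwrite options leave the duplicates in place.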