I'm taking the 30-day Spark course, but I still have a few questions, which are as follows:
DataFrame filtering
To read the contents of a table into a dataframe we can write:
df = spark.sql("SELECT * FROM SparkSetember.propertysales LIMIT 1000")
We can then filter the result, for example with:
df.filter(df.City.startswith("L")).show()

My question is this: why don't we just do the filtering straight away in the query:
df = spark.sql("SELECT * FROM SparkSetember.propertysales WHERE City LIKE 'L%'")