PySpark DataFrame Operations - Basics
In this post, we will discuss how to perform common DataFrame operations, such as aggregations, ordering, joins, and other similar data manipulations, on a Spark DataFrame.
Introduction
Spark provides the DataFrame API, which lets the user perform parallel, distributed processing of structured data. A Spark DataFrame is a dataset organized into a named set of columns.