Data Wrangling Using Grouping Method
The grouping method in data wrangling produces results for individual groups within a large data set. In pandas, the groupby() method splits a DataFrame into groups of rows that share the same value in one or more columns, so that each group can be inspected or summarized on its own.
Example: A car-selling company stocks brands from several car manufacturers, such as Maruti, Toyota, Mahindra, and Ford, and records how many cars of each brand were sold in each year. Suppose the company wants only the records for cars sold in 2010. For this problem, we can use the pandas groupby() method.
Creating a DataFrame to use with grouping methods (car selling dataset):
Python3
# Import module
import pandas as pd

# Creating Data
car_selling_data = {
    'Brand': ['Maruti', 'Maruti', 'Maruti', 'Maruti', 'Hyundai', 'Hyundai',
              'Toyota', 'Mahindra', 'Mahindra', 'Ford', 'Toyota', 'Ford'],
    'Year': [2010, 2011, 2009, 2013, 2010, 2011,
             2011, 2010, 2013, 2010, 2010, 2011],
    'Sold': [6, 7, 9, 8, 3, 5, 2, 8, 7, 2, 4, 2],
}

# Creating Dataframe of car_selling_data
df = pd.DataFrame(car_selling_data)

# Printing Dataframe
print(df)
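Output:
       Brand  Year  Sold
0     Maruti  2010     6
1     Maruti  2011     7
2     Maruti  2009     9
3     Maruti  2013     8
4    Hyundai  2010     3
5    Hyundai  2011     5
6     Toyota  2011     2
7   Mahindra  2010     8
8   Mahindra  2013     7
9       Ford  2010     2
10    Toyota  2010     4
11      Ford  2011     2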
Grouping the DataFrame by year and selecting the data for the year 2010:
Python3
# Import module
import pandas as pd

# Creating Data
car_selling_data = {
    'Brand': ['Maruti', 'Maruti', 'Maruti', 'Maruti', 'Hyundai', 'Hyundai',
              'Toyota', 'Mahindra', 'Mahindra', 'Ford', 'Toyota', 'Ford'],
    'Year': [2010, 2011, 2009, 2013, 2010, 2011,
             2011, 2010, 2013, 2010, 2010, 2011],
    'Sold': [6, 7, 9, 8, 3, 5, 2, 8, 7, 2, 4, 2],
}

# Creating Dataframe for Provided Data
df = pd.DataFrame(car_selling_data)

# Group the rows by the value in the 'Year' column
grouped = df.groupby('Year')

# Select and print only the group where Year == 2010
print(grouped.get_group(2010))
Output:
       Brand  Year  Sold
0     Maruti  2010     6
4    Hyundai  2010     3
7   Mahindra  2010     8
9       Ford  2010     2
10    Toyota  2010     4
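Selecting a single group is only one use of groupby(). It is more commonly combined with an aggregation so that every group is summarized at once. The following is a minimal sketch on the same car selling data; the variable names sold_per_year, sold_2010_by_brand, and filtered_2010 are illustrative choices and not part of the original example.

```python
import pandas as pd

# Same car selling data as in the examples above
car_selling_data = {
    'Brand': ['Maruti', 'Maruti', 'Maruti', 'Maruti', 'Hyundai', 'Hyundai',
              'Toyota', 'Mahindra', 'Mahindra', 'Ford', 'Toyota', 'Ford'],
    'Year': [2010, 2011, 2009, 2013, 2010, 2011,
             2011, 2010, 2013, 2010, 2010, 2011],
    'Sold': [6, 7, 9, 8, 3, 5, 2, 8, 7, 2, 4, 2],
}
df = pd.DataFrame(car_selling_data)

# Total number of cars sold in each year (one row per group)
sold_per_year = df.groupby('Year')['Sold'].sum()
print(sold_per_year)

# A plain boolean filter is an alternative way to keep only the 2010 rows
filtered_2010 = df[df['Year'] == 2010]

# Within 2010, total cars sold per brand
sold_2010_by_brand = filtered_2010.groupby('Brand')['Sold'].sum()
print(sold_2010_by_brand)
```

If only one year is ever needed, the boolean filter is usually the simpler choice. groupby() pays off when the same summary is wanted for every year or brand.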
Data Wrangling in Python
Data wrangling is the process of gathering raw data and transforming it into another format that is easier to understand, access, and analyze, so that decisions can be made in less time. Data wrangling is also known as data munging.