Summarize
Summarization means counting and describing the data present in a data frame. The describe() method returns summary statistics for each numeric column: count, mean, standard deviation, minimum, quartiles (25%, 50%, 75%), and maximum. It does not report sums or data types; use info() to see the data type of each column.
- describe(): This method returns descriptive statistics (count, mean, std, min, quartiles, max) for the numeric columns.
Syntax:
dataframe_name.describe()
- unique(): This method is used to get all unique values from the given column.
Syntax:
dataframe['column_name'].unique()
- nunique(): This method is similar to unique(), but it returns the number of unique values instead of the values themselves.
Syntax:
dataframe_name['column_name'].nunique()
- info(): This method prints a concise summary of the data frame: column names, non-null counts, data types, and memory usage.
Syntax:
dataframe.info()
- columns: This attribute (not a method, so it takes no parentheses) holds the names of all columns in the data frame.
Syntax:
dataframe.columns
Example:
We are going to analyze the student marks data in this example.
Python3
# importing pandas as pd for using data frame
import pandas as pd

# creating dataframe with student details
dataframe = pd.DataFrame({
    'id': [7058, 4511, 7014, 7033],
    'name': ['sravan', 'manoj', 'aditya', 'bhanu'],
    'Maths_marks': [99, 97, 88, 90],
    'Chemistry_marks': [89, 99, 99, 90],
    'telugu_marks': [99, 97, 88, 80],
    'hindi_marks': [99, 97, 56, 67],
    'social_marks': [79, 97, 78, 90],
})

# display dataframe
dataframe
Python3
# describing the data frame
print(dataframe.describe())
print("-----------------------------")

# finding unique values
print(dataframe['Maths_marks'].unique())
print("-----------------------------")

# counting unique values
print(dataframe['Maths_marks'].nunique())
print("-----------------------------")

# display the columns in the data frame
print(dataframe.columns)
print("-----------------------------")

# information about dataframe
# (info() prints its report itself and returns None, so it is not wrapped in print)
dataframe.info()
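The summary calls return ordinary pandas and Python objects, so individual statistics can be picked out of them rather than only printed. The following is a minimal sketch on a reduced copy of the student data; the variable names (marks, summary, maths_mean, and so on) are chosen here for illustration only.

```python
import pandas as pd

# Reduced copy of the student marks data used in the example
marks = pd.DataFrame({
    "name": ["sravan", "manoj", "aditya", "bhanu"],
    "Maths_marks": [99, 97, 88, 90],
    "Chemistry_marks": [89, 99, 99, 90],
})

# describe() returns a DataFrame indexed by statistic name;
# by default it covers only the numeric columns, so "name" is left out
summary = marks.describe()

# Pick single statistics by row label (statistic) and column label
maths_mean = summary.loc["mean", "Maths_marks"]
maths_range = summary.loc["max", "Maths_marks"] - summary.loc["min", "Maths_marks"]

# unique() returns the distinct values in order of first appearance,
# nunique() returns how many distinct values there are
chem_unique = marks["Chemistry_marks"].unique()
chem_distinct = marks["Chemistry_marks"].nunique()

# columns is an Index object; list() turns it into a plain Python list
column_names = list(marks.columns)

print(maths_mean, maths_range)
print(chem_unique, chem_distinct)
print(column_names)
```

Because 99 appears twice in Chemistry_marks, unique() keeps it once and nunique() counts three distinct values for four students.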
Pandas Groupby: Summarising, Aggregating, and Grouping data in Python
GroupBy is a simple concept: split the rows into groups by category, then apply a function to each group. Simple as it is, the technique is widely used in data science. Real projects involve large amounts of data and repeated experimentation, and groupby lets you summarize, aggregate, and group that data efficiently.
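As a rough illustration of grouping by category and applying a function to each group, the sketch below groups student marks by section. The "section" column and the variable names are hypothetical and were added only for this example; they are not part of the student data shown earlier.

```python
import pandas as pd

# Hypothetical data: the "section" column is invented for this illustration
marks = pd.DataFrame({
    "name": ["sravan", "manoj", "aditya", "bhanu"],
    "section": ["A", "B", "A", "B"],
    "Maths_marks": [99, 97, 88, 90],
    "Chemistry_marks": [89, 99, 99, 90],
})

# Split the rows by section, then apply one function (mean) to each group
mean_by_section = marks.groupby("section")[["Maths_marks", "Chemistry_marks"]].mean()

# Apply several functions at once with named aggregation:
# each keyword is an output column, each tuple is (input column, function)
maths_summary = marks.groupby("section").agg(
    students=("name", "count"),
    maths_max=("Maths_marks", "max"),
    maths_min=("Maths_marks", "min"),
)

print(mean_by_section)
print(maths_summary)
```

Each result has one row per section, so a summary that would otherwise need a loop over the categories is produced by a single call.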