Pandas.melt()
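melt() reshapes a wide DataFrame into long format. The columns named in id_vars are kept as identifiers, and every other column is unpivoted into two columns: variable (the original column name) and value (the entry from that column). Use it when one or more columns should act as identifiers while the remaining columns are treated as measurements.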
Syntax: pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)
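As a minimal sketch of how these parameters interact, the snippet below melts a small, made-up frame. The column names city, jan, and feb and the labels month and temp are assumptions chosen only for illustration; they do not come from the examples in this article. id_vars stays fixed, value_vars selects the columns to unpivot, and var_name and value_name rename the two generated columns.

```python
import pandas as pd

# Hypothetical wide frame: one row per city, one column per month
wide = pd.DataFrame({
    "city": ["Oslo", "Rome"],
    "jan": [-3, 8],
    "feb": [-2, 9],
})

# Keep "city" as the identifier and unpivot the two month columns
long = pd.melt(
    wide,
    id_vars=["city"],
    value_vars=["jan", "feb"],
    var_name="month",    # replaces the default column name "variable"
    value_name="temp",   # replaces the default column name "value"
)

print(long)
```

If value_vars is omitted, every column not listed in id_vars is melted.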
Example 1:
Initialize the DataFrame with data for 'DAYS', 'PATIENTS', and 'RECOVERY'.
Python3
# importing pandas library
import pandas as pd

# creating and initializing a list
values = [['Monday', 65000, 50000],
          ['Tuesday', 68000, 45000],
          ['Wednesday', 70000, 55000],
          ['Thursday', 60000, 47000],
          ['Friday', 49000, 25000],
          ['Saturday', 54000, 35000],
          ['Sunday', 100000, 70000]]

# creating a pandas dataframe
df = pd.DataFrame(values, columns=['DAYS', 'PATIENTS', 'RECOVERY'])

# displaying the data frame
df
Output: a DataFrame with 7 rows (one per day) and the columns DAYS, PATIENTS, and RECOVERY.
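Now reshape the DataFrame with melt(), using the 'DAYS' column as the identifier. The 'PATIENTS' and 'RECOVERY' columns are stacked into the 'variable' and 'value' columns.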
Python3
# melting with DAYS as column identifier
reshaped_df = df.melt(id_vars=['DAYS'])

# displaying the reshaped data frame
reshaped_df
Output: a DataFrame with 14 rows (7 days × 2 melted columns) and the columns DAYS, variable, and value.
Example 2:
Now a new column named 'DEATHS' is added to the DataFrame used above.
Python3
# importing pandas library
import pandas as pd

# creating and initializing a list
values = [['Monday', 65000, 50000, 1500],
          ['Tuesday', 68000, 45000, 7250],
          ['Wednesday', 70000, 55000, 1400],
          ['Thursday', 60000, 47000, 4200],
          ['Friday', 49000, 25000, 3000],
          ['Saturday', 54000, 35000, 2000],
          ['Sunday', 100000, 70000, 4550]]

# creating a pandas dataframe
df = pd.DataFrame(values, columns=['DAYS', 'PATIENTS', 'RECOVERY', 'DEATHS'])

# displaying the data frame
df
Output: a DataFrame with 7 rows and the columns DAYS, PATIENTS, RECOVERY, and DEATHS.
Next, reshape the DataFrame with melt(), using the 'PATIENTS' column as the identifier.
Python3
# reshaping data frame
# using pandas.melt()
reshaped_df = df.melt(id_vars=['PATIENTS'])

# displaying the reshaped data frame
reshaped_df
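Output: a DataFrame with 21 rows (7 days × 3 melted columns) and the columns PATIENTS, variable, and value.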
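One consequence of melting around 'PATIENTS' alone is that the text column 'DAYS' is stacked into the same 'value' column as the numeric counts. The sketch below uses a shortened two-row copy of the data above and compares that result with a variant that lists both 'DAYS' and 'PATIENTS' in id_vars. The variant and the names by_patients and by_both are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Two-row subset of the data above, to keep the sketch short
values = [['Monday', 65000, 50000, 1500],
          ['Tuesday', 68000, 45000, 7250]]
df = pd.DataFrame(values, columns=['DAYS', 'PATIENTS', 'RECOVERY', 'DEATHS'])

# Identifier is PATIENTS only: DAYS, RECOVERY and DEATHS are all melted,
# so the value column holds day names alongside numbers
by_patients = df.melt(id_vars=['PATIENTS'])
print(by_patients['value'].dtype)

# Illustrative variant: keeping DAYS as a second identifier leaves
# only numeric columns to melt
by_both = df.melt(id_vars=['DAYS', 'PATIENTS'])
print(by_both['value'].dtype)
```

Keeping every non-measurement column in id_vars is usually what keeps the 'value' column numeric and ready for aggregation.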
Reshaping Pandas Dataframes using Melt And Unmelt
Pandas is an open-source, BSD-licensed library for Python. It provides fast, easy-to-use data structures and data analysis tools for manipulating numeric tables and time series. Pandas is built on the NumPy library and is written in Python, Cython, and C. Wes McKinney began developing it in 2008. Pandas can import data from various sources and file formats, such as JSON, SQL databases, and Microsoft Excel. Its DataFrame structure is used to load and manipulate the data.
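Sometimes a Pandas DataFrame must be reshaped before it can be analyzed effectively, so reshaping plays a crucial role in data analysis. Pandas provides melt() for converting wide data to long format. The reverse operation, commonly called unmelt, has no function of that name in Pandas and is done with pivot().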
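A minimal sketch of the round trip, assuming a two-row version of the days/patients data: melt() produces the long form, and pivot() followed by reset_index() turns it back into the wide form. The variable names long_df and wide_df are chosen here for illustration.

```python
import pandas as pd

# Small wide frame: one row per day
df = pd.DataFrame(
    [['Monday', 65000, 50000], ['Tuesday', 68000, 45000]],
    columns=['DAYS', 'PATIENTS', 'RECOVERY'],
)

# Melt: wide to long, with DAYS as the identifier
long_df = df.melt(id_vars=['DAYS'])

# "Unmelt": spread the entries of the variable column back into columns
wide_df = (
    long_df.pivot(index='DAYS', columns='variable', values='value')
    .reset_index()
)
# pivot() labels the column axis "variable"; clear that leftover name
wide_df.columns.name = None

print(wide_df)
```

Note that pivot() sorts the result by the index values, so with the full week the rows would come back in alphabetical order of day name rather than in the original order.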