Slicing Pandas Dataframe
With the help of Pandas, we can slice a DataFrame to extract specific subsets of its data. The iloc[] indexer selects rows and columns by their integer positions, which are zero-based.
To slice with iloc[], you specify the row and column positions you want to include in the sliced DataFrame. The syntax follows Python's usual slicing rules, so the start position is included and the stop position is excluded. For example, df.iloc[1:5, 2:4] extracts the rows at positions 1 through 4 (the second to fifth rows) and the columns at positions 2 and 3 (the third and fourth columns).
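As a minimal sketch of the df.iloc[1:5, 2:4] example, the snippet below builds a small frame whose values encode their own row and column position. The frame `demo` and its column names are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical 6x5 frame: each value is row_position * 10 + column_position
demo = pd.DataFrame(
    [[r * 10 + c for c in range(5)] for r in range(6)],
    columns=["c0", "c1", "c2", "c3", "c4"],
)

# Row positions 1, 2, 3, 4 and column positions 2, 3.
# The stop values (5 for rows, 4 for columns) are excluded.
subset = demo.iloc[1:5, 2:4]
print(subset)
```

The result has four rows and two columns, which matches the half-open behaviour of ordinary Python list slicing.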
Slicing a DataFrame in Pandas includes the following steps:
- Create a DataFrame
- Slice the DataFrame
Let’s import the pandas library and create a DataFrame from a custom nested list.
Python3
import pandas as pd

# Initializing the nested list with the data set
player_list = [['M.S.Dhoni', 36, 75, 5428000],
               ['A.B.D Villers', 38, 74, 3428000],
               ['V.Kohli', 31, 70, 8428000],
               ['S.Smith', 34, 80, 4428000],
               ['C.Gayle', 40, 100, 4528000],
               ['J.Root', 33, 72, 7028000],
               ['K.Peterson', 42, 85, 2528000]]

# Creating a pandas DataFrame
df = pd.DataFrame(player_list, columns=['Name', 'Age', 'Weight', 'Salary'])

# DataFrame before slicing
df
Output:
            Name  Age  Weight   Salary
0      M.S.Dhoni   36      75  5428000
1  A.B.D Villers   38      74  3428000
2        V.Kohli   31      70  8428000
3        S.Smith   34      80  4428000
4        C.Gayle   40     100  4528000
5         J.Root   33      72  7028000
6     K.Peterson   42      85  2528000
1. Slicing Using iloc[]
A. Slicing Rows in Dataframe in Python
Python3
# Slicing rows in the DataFrame (positions 0 to 3)
df1 = df.iloc[0:4]

# DataFrame after slicing
df1
Output:
            Name  Age  Weight   Salary
0      M.S.Dhoni   36      75  5428000
1  A.B.D Villers   38      74  3428000
2        V.Kohli   31      70  8428000
3        S.Smith   34      80  4428000
In the above example, df.iloc[0:4] returns the rows at positions 0 through 3. The stop position 4 is excluded.
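Row slices with iloc[] also accept negative positions, omitted bounds, and a step, just like Python lists. The sketch below shows a few such variants on a shortened copy of the player data; the frame name `players` is an assumption made for this example.

```python
import pandas as pd

# Shortened copy of the article's data, rebuilt so the snippet runs on its own
players = pd.DataFrame(
    [["M.S.Dhoni", 36], ["A.B.D Villers", 38], ["V.Kohli", 31],
     ["S.Smith", 34], ["C.Gayle", 40]],
    columns=["Name", "Age"],
)

first_three = players.iloc[:3]   # omitted start defaults to position 0
last_two = players.iloc[-2:]     # negative positions count from the end
every_other = players.iloc[::2]  # step of 2: positions 0, 2, 4

print(first_three)
print(last_two)
print(every_other)
```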
B. Slicing Columns in Dataframe in Python
Python3
# Slicing columns in the DataFrame (all rows, column positions 0 and 1)
df1 = df.iloc[:, 0:2]

# DataFrame after slicing
df1
Output:
            Name  Age
0      M.S.Dhoni   36
1  A.B.D Villers   38
2        V.Kohli   31
3        S.Smith   34
4        C.Gayle   40
5         J.Root   33
6     K.Peterson   42
In the above example, df.iloc[:, 0:2] keeps every row and returns the columns at positions 0 and 1 (Name and Age).
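When the columns you want are not adjacent, iloc[] also accepts a list of positions in place of a slice. The following sketch assumes a three-row copy of the player data named `players`, which is introduced here only for illustration.

```python
import pandas as pd

# Three-row copy of the article's data, rebuilt so the snippet runs on its own
players = pd.DataFrame(
    [["M.S.Dhoni", 36, 75, 5428000],
     ["A.B.D Villers", 38, 74, 3428000],
     ["V.Kohli", 31, 70, 8428000]],
    columns=["Name", "Age", "Weight", "Salary"],
)

name_and_salary = players.iloc[:, [0, 3]]  # non-adjacent columns by position
last_two_cols = players.iloc[:, -2:]       # last two columns: Weight, Salary

print(name_and_salary)
print(last_two_cols)
```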
C. Selecting a Specific Cell in Dataframe in Python
Python3
# Row position 2, column position 3 (Salary); both are zero-based
specific_cell_value = df.iloc[2, 3]
print("Specific Cell Value:", specific_cell_value)
Output:
Specific Cell Value: 8428000
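For reading a single value by position, pandas also offers the iat[] accessor, which takes the same row and column positions as iloc[] but only ever returns a scalar. A small sketch, assuming a three-row copy of the data named `players`:

```python
import pandas as pd

# Three-row copy of the article's data, rebuilt so the snippet runs on its own
players = pd.DataFrame(
    [["M.S.Dhoni", 36, 75, 5428000],
     ["A.B.D Villers", 38, 74, 3428000],
     ["V.Kohli", 31, 70, 8428000]],
    columns=["Name", "Age", "Weight", "Salary"],
)

via_iloc = players.iloc[2, 3]  # general positional indexer
via_iat = players.iat[2, 3]    # scalar-only positional accessor
whole_row = players.iloc[2]    # a single position returns the row as a Series

print(via_iloc, via_iat)
print(whole_row["Name"])
```

Both lookups return the Salary of V.Kohli; iat[] is simply the narrower tool when exactly one cell is needed.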
D. Using Boolean Conditions in Dataframe in Python
Python3
# Select rows where Age is greater than 35; .iloc[:, :] then keeps all of them
filtered_data = df[df['Age'] > 35].iloc[:, :]
print("\nFiltered Data based on Age > 35:")
print(filtered_data)
Output:
Filtered Data based on Age > 35:
            Name  Age  Weight   Salary
0      M.S.Dhoni   36      75  5428000
1  A.B.D Villers   38      74  3428000
4        C.Gayle   40     100  4528000
6     K.Peterson   42      85  2528000
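Boolean conditions can be combined with & (and) or | (or), as long as each condition is wrapped in parentheses. The sketch below also passes the mask to loc[] so that rows and columns are chosen in one step. The second condition on Weight and the name `players` are assumptions added for illustration.

```python
import pandas as pd

# Copy of the article's data, rebuilt so the snippet runs on its own
players = pd.DataFrame(
    [["M.S.Dhoni", 36, 75, 5428000],
     ["A.B.D Villers", 38, 74, 3428000],
     ["V.Kohli", 31, 70, 8428000],
     ["S.Smith", 34, 80, 4428000],
     ["C.Gayle", 40, 100, 4528000],
     ["J.Root", 33, 72, 7028000],
     ["K.Peterson", 42, 85, 2528000]],
    columns=["Name", "Age", "Weight", "Salary"],
)

# Each condition needs its own parentheses because & binds tighter than >
mask = (players["Age"] > 35) & (players["Weight"] < 90)

# Rows where the mask is True, restricted to two columns
result = players.loc[mask, ["Name", "Salary"]]
print(result)
```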
2. Slicing Using loc[]
We can also slice with loc[], which selects by label rather than by position. It comes with some caveats:
- loc[] relies on labels, so if your DataFrame has custom labels, you need to be careful with how you specify them.
- If the labels are integers, it is easy to confuse integer positions with the actual labels.
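The confusion between integer labels and integer positions is easiest to see on a frame whose integer index is not in positional order. This is a hypothetical example; the frame `scores` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame whose integer labels run opposite to the row order
scores = pd.DataFrame({"score": [90, 80, 70]}, index=[2, 1, 0])

by_label = scores.loc[0, "score"]  # the row whose label is 0 (the last row)
by_position = scores.iloc[0, 0]    # the row at position 0 (the first row)

print(by_label)     # 70
print(by_position)  # 90
```

The same number 0 picks a different row depending on the indexer, which is why it helps to be explicit about whether a label or a position is meant.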
To try loc[] with custom labels, set the Name column as the index with the following code:
Python3
# Use the Name column as the row labels
df_custom = df.set_index('Name')
df_custom
Output:
               Age  Weight   Salary
Name
M.S.Dhoni       36      75  5428000
A.B.D Villers   38      74  3428000
V.Kohli         31      70  8428000
S.Smith         34      80  4428000
C.Gayle         40     100  4528000
J.Root          33      72  7028000
K.Peterson      42      85  2528000
A. Slicing Rows in Dataframe in Python
Python3
# Label-based slice: both the start and the stop label are included
sliced_rows_custom = df_custom.loc['A.B.D Villers':'S.Smith']
sliced_rows_custom
Output:
               Age  Weight   Salary
Name
A.B.D Villers   38      74  3428000
V.Kohli         31      70  8428000
S.Smith         34      80  4428000
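Note that the output above includes the S.Smith row: unlike iloc[], a loc[] slice includes its stop label. The sketch below contrasts the two indexers and also slices columns by label. The names `players` and `indexed` are assumptions made for this example.

```python
import pandas as pd

# Copy of part of the article's data, rebuilt so the snippet runs on its own
players = pd.DataFrame(
    [["M.S.Dhoni", 36, 75, 5428000],
     ["A.B.D Villers", 38, 74, 3428000],
     ["V.Kohli", 31, 70, 8428000],
     ["S.Smith", 34, 80, 4428000],
     ["C.Gayle", 40, 100, 4528000]],
    columns=["Name", "Age", "Weight", "Salary"],
)
indexed = players.set_index("Name")

# Label slices include both endpoints, for rows and for columns
by_label = indexed.loc["A.B.D Villers":"S.Smith", "Age":"Weight"]

# Position slices exclude the stop position, so only positions 1 and 2 remain
by_position = indexed.iloc[1:3]

print(by_label)
print(by_position)
```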
B. Selecting a Specific Cell in Dataframe in Python
Python3
# Row label 'V.Kohli', column label 'Salary'
specific_cell_value = df_custom.loc['V.Kohli', 'Salary']
print("\nValue of the Specific Cell (V.Kohli, Salary):", specific_cell_value)
Output:
Value of the Specific Cell (V.Kohli, Salary): 8428000
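The label-based counterpart of a single-cell lookup is the at[] accessor, and loc[] also accepts lists of labels when several rows and columns are needed at once. A short sketch, assuming a three-row copy of the data indexed by Name and called `indexed`:

```python
import pandas as pd

# Three-row copy of the article's data, indexed by Name
indexed = pd.DataFrame(
    [["M.S.Dhoni", 36, 75, 5428000],
     ["V.Kohli", 31, 70, 8428000],
     ["J.Root", 33, 72, 7028000]],
    columns=["Name", "Age", "Weight", "Salary"],
).set_index("Name")

salary = indexed.at["V.Kohli", "Salary"]  # scalar-only label accessor

# Lists of labels select several rows and columns in one call
pair = indexed.loc[["V.Kohli", "J.Root"], ["Age", "Salary"]]

print(salary)
print(pair)
```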
Slicing a Pandas DataFrame is a powerful way to extract specific subsets of data, either by integer position with iloc[] or by label with loc[]. This article walked through examples of row and column slicing, cell selection, and boolean conditions.