Boolean Indexing method

In this method, a condition on a column is evaluated for every row, giving True or False. Only the rows that evaluate to True are kept in the output. This can be done in several ways. The query used here is: select rows where the column Pid = 'p01'.
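To make the true/false check concrete, here is a minimal sketch on a small stand-in DataFrame. The values and the points column name are made up for illustration and are not the article's dataset; only the Pid and game_id column names come from the text. It prints the boolean Series produced by the condition before using it to filter rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the Pid and game_id names follow the article.
df = pd.DataFrame({
    'Pid': ['p01', 'p02', 'p01', 'p03'],
    'game_id': ['g1', 'g21', 'g21', 'g2'],
    'points': [10.0, np.nan, 7.0, 3.0],
})

# The comparison is evaluated row by row and yields a boolean Series.
condition = df['Pid'] == 'p01'
print(condition.tolist())

# Indexing with that Series keeps only the rows marked True.
selected = df[condition]
print(selected)
```

The intermediate condition variable is not required; it is only split out here so the mask can be inspected.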

Example 1: Select rows from a Pandas DataFrame based on values in a column

In this example, we select the rows whose Pid column has the value p01, using the equality operator.

Python3

# Choose entries with id p01
df_new = df[df['Pid'] == 'p01']

print(df_new)



Example 2: Specifying the condition ‘mask’ variable

Here, the condition is first stored in a mask variable. The selected rows are then assigned to a new DataFrame, which keeps the row index from the original DataFrame and has the same columns.

Python3

# condition mask
mask = df['Pid'] == 'p01'

# new dataframe with selected rows
df_new = pd.DataFrame(df[mask])

print(df_new)
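As a small illustration of the index behaviour described for this example, the following sketch uses a made-up DataFrame (hypothetical values and a hypothetical points column) to show that the filtered result keeps the original row labels. It also shows reset_index(drop=True), which can renumber the rows if a fresh 0-based index is preferred.

```python
import pandas as pd

# Hypothetical sample data; only the column name Pid follows the article.
df = pd.DataFrame({'Pid': ['p02', 'p01', 'p03', 'p01'],
                   'points': [4, 9, 1, 6]})

mask = df['Pid'] == 'p01'
df_new = df[mask]

# The row labels are carried over unchanged from the original DataFrame.
print(df_new.index.tolist())

# Optional: renumber the rows from 0 if the old labels are not needed.
df_reset = df_new.reset_index(drop=True)
print(df_reset.index.tolist())
```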



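Example 3: Combining a mask with the .values property

The query here is: select the rows with game_id 'g21'. The comparison is made against the column's underlying array (via .values), so the mask is a plain array of booleans rather than a pandas Series.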

Python3

# condition with df.values property
mask = df['game_id'].values == 'g21'

# new dataframe
df_new = df[mask]

print(df_new)
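The difference between the two kinds of mask can be checked with a short sketch. The data below is hypothetical; only the game_id and Pid column names come from the text. It builds the mask both ways and confirms that they select the same rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the article.
df = pd.DataFrame({'Pid': ['p01', 'p02', 'p03'],
                   'game_id': ['g21', 'g5', 'g21']})

# Comparing the column directly gives a pandas Series of booleans.
series_mask = df['game_id'] == 'g21'

# Comparing the underlying array gives an array of booleans with no index.
array_mask = df['game_id'].values == 'g21'

print(type(series_mask), type(array_mask))

# Both masks select the same rows.
same = df[series_mask].equals(df[array_mask])
print(same)
```

Because the array mask carries no index, it is matched to rows purely by position, so it must have exactly one entry per row of the DataFrame.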



How to select rows from a dataframe based on column values?

In this article, we will cover how to select rows from a DataFrame based on column values in Python.

The rows of a DataFrame can be selected based on conditions, much as we do with SQL queries. The various methods to achieve this are explained in this article with examples.

Importing Dataset for demonstration

To explain the methods, a dataset has been created that contains the points scored by 10 people in various games. The dataset is first loaded into a DataFrame and displayed. Ten people, each with a unique player id (Pid), have played different games, each with its own game id (game_id), and the points scored in each game are added as an entry in the table. Some players' points were not recorded, so NaN values appear in the table....

Method 1: Boolean Indexing method

...

Method 2: Positional indexing method

...

Method 3: Using dataframe.query() method

...

Method 4: Using isin() method

...

Method 5: Using numpy.where() method

...

Method 6: Comparison with other methods

The loc and iloc indexers can be used for slicing DataFrames in Python. Among the differences between loc and iloc, the important one to note is that iloc is position-based and takes integer indices, while loc is label-based and can also take boolean indices....
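A minimal sketch of that difference, using hypothetical data with string row labels (only the Pid column name comes from the article): loc is given the boolean mask directly, while iloc is given the integer positions of the same rows.

```python
import pandas as pd

# Hypothetical sample data with non-numeric row labels.
df = pd.DataFrame({'Pid': ['p01', 'p02', 'p01'],
                   'points': [5, 8, 2]},
                  index=['a', 'b', 'c'])

mask = df['Pid'] == 'p01'

# loc is label-based and accepts the boolean Series directly.
by_loc = df.loc[mask]

# iloc is position-based: the same rows are picked by integer position.
by_iloc = df.iloc[[0, 2]]

print(by_loc.equals(by_iloc))
```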