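Using the duplicated() function in R
duplicated() returns a logical vector that is TRUE for every element that repeats an earlier one. Negating it with !duplicated() marks the first occurrence of each value, so using that result as a row index keeps only the unique rows of a dataframe.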
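To make the return value concrete, here is a minimal sketch on a small character vector. The vector x is a made-up sample for illustration and is not part of the article's data.

```r
# Hypothetical sample vector, not taken from the article
x <- c("a", "b", "a", "c", "b")

# TRUE where a value repeats an earlier one
dup <- duplicated(x)
print(dup)       # FALSE FALSE  TRUE FALSE  TRUE

# Negating the vector keeps the first occurrence of each value
print(x[!dup])   # "a" "b" "c"
```

The same logical vector works as a row index for a dataframe, which is what the syntax in this section relies on.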
Syntax:
dataframe[!duplicated(dataframe$column_name), ]
Here, dataframe is the input dataframe and column_name is the column used to detect duplicates. For each value in that column, only the first row containing it is kept; the other columns are not compared.
Example: R program to remove duplicate data based on a particular column
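The syntax above can be sketched on a tiny dataframe. The names df, key and value are hypothetical and chosen only for illustration. The second call uses the fromLast argument of duplicated(), which the article does not cover, to keep the last row per key instead of the first.

```r
# Hypothetical dataframe for illustration
df <- data.frame(key = c("x", "y", "x"),
                 value = c(10, 20, 30))

# Keep the first row for each key; the value column is not compared
first_rows <- df[!duplicated(df$key), ]
print(first_rows)

# fromLast = TRUE scans from the end, so the last row per key is kept
last_rows <- df[!duplicated(df$key, fromLast = TRUE), ]
print(last_rows)
```

With this data, first_rows holds the rows with values 10 and 20, while last_rows holds the rows with values 20 and 30.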
R
# load the package
library(dplyr)

# create dataframe with three columns
# named id, name and address
data1 = data.frame(id = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
                   name = c('sravan', 'ojaswi', 'bobby', 'gnanesh',
                            'rohith', 'pinkey', 'dhanush', 'sravan',
                            'gnanesh', 'ojaswi'),
                   address = c('hyd', 'hyd', 'ponnur', 'tenali',
                               'vijayawada', 'vijayawada', 'guntur',
                               'hyd', 'tenali', 'hyd'))

# remove duplicate rows using duplicated()
# function based on name column
print(data1[!duplicated(data1$name), ])
print("=====================")

# remove duplicate rows using duplicated()
# function based on id column
print(data1[!duplicated(data1$id), ])
print("=====================")

# remove duplicate rows using duplicated()
# function based on address column
print(data1[!duplicated(data1$address), ])
print("=====================")
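Output:
The first two calls print rows 1 to 7, because rows 8, 9 and 10 repeat a name and an id that already appeared. The third call prints rows 1, 3, 4, 5 and 7, the first row for each of the five distinct addresses (hyd, ponnur, tenali, vijayawada, guntur).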
Remove Duplicate rows in R using Dplyr
In this article, we are going to remove duplicate rows in the R programming language using the dplyr package.