Melting and Casting
Reshaping data often takes several steps to reach the required format. One popular approach is to melt the data, which converts each row into a set of unique id-variable combinations, and then cast it into the desired shape. Two functions from the reshape2 package are used for this process:
melt():
It is used to convert a data frame into a molten data frame.
Syntax: melt(data, ..., na.rm = FALSE, value.name = "value")
where,
data: data to be melted
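...: further arguments passed to or from other methods
na.rm: if TRUE, removes NA values from the result, which converts explicit missings into implicit missings
value.name: name of the column used to store the values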
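As a minimal sketch of these arguments, the snippet below melts a small made-up data frame. The name `scores` and its columns are illustrative assumptions, not part of the original example. It uses na.rm = TRUE to drop the missing value and value.name to rename the value column.

```r
library(reshape2)

# Hypothetical data: one row per student, one column per test
scores <- data.frame(
  student = c("A", "B"),
  test1 = c(90, NA),
  test2 = c(85, 70)
)

# id.vars names the identifier column; every other column is melted.
# na.rm = TRUE drops the row whose value is NA.
# value.name sets the name of the column that stores the values.
molten <- melt(scores, id.vars = "student", na.rm = TRUE, value.name = "score")
print(molten)
```

The molten result has one row per non-missing student-test pair, with the test name in the `variable` column and the measurement in `score`.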
dcast():
It is used to aggregate the molten data frame into a new form.
Syntax: dcast(data, formula, fun.aggregate)
where,
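data: molten data frame to be cast
formula: formula that defines how to cast (row variables ~ column variables)
fun.aggregate: aggregation function, needed when the formula does not identify a single value for each output cell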
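The following sketch illustrates how the formula and fun.aggregate interact. The molten data frame is built by hand here as an assumption for illustration, and value.var is passed explicitly so the value column is not guessed.

```r
library(reshape2)

# Hypothetical molten data: two values per (id, variable) pair
molten <- data.frame(
  id = c("1", "1", "2", "2", "1", "1", "2", "2"),
  points = c("1", "2", "1", "2", "1", "2", "1", "2"),
  variable = rep(c("x1", "x2"), each = 4),
  value = c(5, 3, 6, 2, 6, 5, 1, 4)
)

# The left side of the formula defines rows, the right side defines columns.
# Each (id, variable) cell holds two values, so fun.aggregate combines them.
totals <- dcast(molten, id ~ variable, fun.aggregate = sum, value.var = "value")

# With id + points on the left, every cell holds exactly one value,
# so no aggregation function is needed.
wide <- dcast(molten, id + points ~ variable, value.var = "value")

print(totals)
print(wide)
```

Putting more variables on the left side of the formula keeps more detail in the rows, while a shorter formula forces several values into each cell and therefore requires aggregation.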
Example:
R
library(reshape2)

a <- data.frame(id = c("1", "1", "2", "2"),
                points = c("1", "2", "1", "2"),
                x1 = c("5", "3", "6", "2"),
                x2 = c("6", "5", "1", "4"))

# Convert numeric columns to actual numeric values
a$x1 <- as.numeric(as.character(a$x1))
a$x2 <- as.numeric(as.character(a$x2))

# Melt: keep id and points as identifiers, stack x1 and x2 into rows
print("Melting")
m <- melt(a, id.vars = c("id", "points"))
print(m)

# Cast: one row per id, one column per variable, averaging over points
print("Casting")
idmn <- dcast(m, id ~ variable, mean)
print(idmn)
Output:
[1] "Melting"
  id points variable value
1  1      1       x1     5
2  1      2       x1     3
3  2      1       x1     6
4  2      2       x1     2
5  1      1       x2     6
6  1      2       x2     5
7  2      1       x2     1
8  2      2       x2     4
[1] "Casting"
  id x1  x2
1  1  4 5.5
2  2  4 2.5
Data Reshaping in R Programming
In the R programming language, data is usually processed as a data frame, in which it is organized into rows and columns. Data frames are widely used because extracting data from them is simple. Sometimes, however, the data frame we receive is not in the format we need. In R, we can split, merge, and reshape data frames using various functions.
The various forms of reshaping data in a data frame are:
- Transpose of a Matrix
- Joining Rows and Columns
- Merging of Data Frames
- Melting and Casting
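The first three forms in the list can be sketched with base R alone. The data below is made up for illustration, and the variable names are assumptions rather than part of the original article.

```r
# Transpose of a matrix: rows become columns
m <- matrix(1:6, nrow = 2)
tm <- t(m)

# Joining columns and rows
left <- data.frame(id = c(1, 2), name = c("a", "b"))
with_score <- cbind(left, score = c(10, 20))              # adds a column
more_rows <- rbind(left, data.frame(id = 3, name = "c"))  # adds a row

# Merging two data frames on a shared key column
right <- data.frame(id = c(2, 3), city = c("X", "Y"))
merged <- merge(left, right, by = "id")  # inner join: only id 2 is in both

print(dim(tm))
print(with_score)
print(more_rows)
print(merged)
```

By default, merge() keeps only the rows whose key appears in both data frames. Passing all = TRUE would keep unmatched rows and fill the gaps with NA.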