JSON with nested lists
In this case, the nested JSON has a list of JSON objects as the value for some of its attributes. In such a case, we can choose the inner list items to be the records/rows of our dataframe using the record_path attribute.
Python3
# initialising the data data = { 'company' : 'XYZ pvt ltd' , 'location' : 'London' , 'info' : { 'president' : 'Rakesh Kapoor' , 'contacts' : { 'email' : 'contact@xyz.com' , 'tel' : '9876543210' } }, 'employees' : [ { 'name' : 'A' }, { 'name' : 'B' }, { 'name' : 'C' } ] } # converting the data to dataframe df = pd.json_normalize(data) |
Output:
Here, the nested list is not flattened. We need to use record_path attribute to flatten the nested list.
Python3
pd.json_normalize(data,record_path = [ 'employees' ]) |
Output:
Now, we observe that it does not include ‘info’ and other features. To include them we use another attribute, meta. Note that, in the below code, to include an attribute of an inner JSON we have specified the path as “[‘info’, ‘president’]”.
Python3
pd.json_normalize(data, record_path = [ 'employees' ], meta = [ 'company' , 'location' , [ 'info' , 'president' ]]) |
Output:
Now in the case of multiple nested JSON objects, we will get a dataframe with multiple records as shown below.
Python3
data = [ { 'id' : '001' , 'company' : 'XYZ pvt ltd' , 'location' : 'London' , 'info' : { 'president' : 'Rakesh Kapoor' , 'contacts' : { 'email' : 'contact@xyz.com' , 'tel' : '9876543210' } }, 'employees' : [ { 'name' : 'A' }, { 'name' : 'B' }, { 'name' : 'C' } ] }, { 'id' : '002' , 'company' : 'PQR Associates' , 'location' : 'Abu Dhabi' , 'info' : { 'president' : 'Neelam Subramaniyam' , 'contacts' : { 'email' : 'contact@pqr.com' , 'tel' : '8876443210' } }, 'employees' : [ { 'name' : 'L' }, { 'name' : 'M' }, { 'name' : 'N' } ] } ] df = pd.json_normalize(data, record_path = [ 'employees' ], meta = [ 'company' , 'location' , [ 'info' , 'president' ]]) print (df) |
Output :
Converting nested JSON structures to Pandas DataFrames
In this article, we are going to see how to convert nested JSON structures to Pandas DataFrames.