After calling a chomp_* function or es_search, any nested array in the JSON
shows up in the resulting data.table as a list column whose entries are
themselves data.frames (or vectors). This function expands that nested column,
adding its fields to the original data.table as new columns and duplicating
the remaining metadata down the rows as necessary.
This is a side-effect-free function: it returns a new data.table and the input data.table is unmodified.
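To make the expansion concrete, here is a minimal sketch of the same idea written with plain data.table calls. It is not the package's implementation; the table `nested_dt`, the column `purchases`, and the intermediate names are made up for illustration.

```r
library(data.table)

# Hypothetical input: each cell of `purchases` holds a data.frame
nested_dt <- data.table(
    cust_name = c("Austin", "James"),
    purchases = list(
        data.frame(film = c("The Notebook", "The Town"),
                   pmt_amount = c(6.25, 8.00),
                   stringsAsFactors = FALSE),
        data.frame(film = "Minions",
                   pmt_amount = 6.25,
                   stringsAsFactors = FALSE)
    )
)

# How many rows each nested data.frame contributes
n_rows <- vapply(nested_dt$purchases, nrow, integer(1))

# Stack the nested data.frames; fill = TRUE pads absent columns with NA
inner <- rbindlist(nested_dt$purchases, fill = TRUE)

# Repeat each metadata row once per nested row so the two pieces line up
idx <- rep(seq_len(nrow(nested_dt)), times = n_rows)
meta <- nested_dt[idx, .(cust_name)]

# Bind metadata and unpacked fields side by side
expanded <- cbind(meta, inner)
print(expanded)
```

Because the sketch builds `expanded` from copies, `nested_dt` still has its original two columns afterwards, which mirrors the side-effect-free behaviour described above.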
unpack_nested_data(chomped_df, col_to_unpack)
Argument | Description |
---|---|
chomped_df | a data.table |
col_to_unpack | a character vector of length one: the column name to unpack |
# A sample raw result from a hits query:
result <- '[{"_source":{"timestamp":"2017-01-01","cust_name":"Austin","details":{
"cust_class":"big_spender","location":"chicago","pastPurchases":[{"film":"The Notebook",
"pmt_amount":6.25},{"film":"The Town","pmt_amount":8.00},{"film":"Zootopia","pmt_amount":7.50,
"matinee":true}]}}},{"_source":{"timestamp":"2017-02-02","cust_name":"James","details":{
"cust_class":"peasant","location":"chicago","pastPurchases":[{"film":"Minions",
"pmt_amount":6.25,"matinee":true},{"film":"Rogue One","pmt_amount":10.25},{"film":"Bridesmaids",
"pmt_amount":8.75},{"film":"Bridesmaids","pmt_amount":6.25,"matinee":true}]}}},{"_source":{
"timestamp":"2017-03-03","cust_name":"Nick","details":{"cust_class":"critic","location":"cannes",
"pastPurchases":[{"film":"Aala Kaf Ifrit","pmt_amount":0,"matinee":true},{
"film":"Dopo la guerra (Apres la Guerre)","pmt_amount":0,"matinee":true},{
"film":"Avengers: Infinity War","pmt_amount":12.75}]}}}]'

# Chomp into a data.table
sampleChompedDT <- chomp_hits(hits_json = result, keep_nested_data_cols = TRUE)
#> INFO [2020-05-11 23:27:38] Keeping the following nested data columns. Consider using unpack_nested_data for one:
#>     details.pastPurchases
print(sampleChompedDT)
#>     timestamp cust_name details.cust_class details.location
#> 1: 2017-01-01    Austin        big_spender          chicago
#> 2: 2017-02-02     James            peasant          chicago
#> 3: 2017-03-03      Nick             critic           cannes
#>    details.pastPurchases
#> 1:          <data.frame>
#> 2:          <data.frame>
#> 3:          <data.frame>

# (Note: use es_search() to get here in one step)

# Unpack by details.pastPurchases
unpackedDT <- unpack_nested_data(chomped_df = sampleChompedDT
                                 , col_to_unpack = "details.pastPurchases")
print(unpackedDT)
#>      timestamp cust_name details.cust_class details.location
#>  1: 2017-01-01    Austin        big_spender          chicago
#>  2: 2017-01-01    Austin        big_spender          chicago
#>  3: 2017-01-01    Austin        big_spender          chicago
#>  4: 2017-02-02     James            peasant          chicago
#>  5: 2017-02-02     James            peasant          chicago
#>  6: 2017-02-02     James            peasant          chicago
#>  7: 2017-02-02     James            peasant          chicago
#>  8: 2017-03-03      Nick             critic           cannes
#>  9: 2017-03-03      Nick             critic           cannes
#> 10: 2017-03-03      Nick             critic           cannes
#>                                 film pmt_amount matinee
#>  1:                     The Notebook       6.25      NA
#>  2:                         The Town       8.00      NA
#>  3:                         Zootopia       7.50    TRUE
#>  4:                          Minions       6.25    TRUE
#>  5:                        Rogue One      10.25      NA
#>  6:                      Bridesmaids       8.75      NA
#>  7:                      Bridesmaids       6.25    TRUE
#>  8:                   Aala Kaf Ifrit       0.00    TRUE
#>  9: Dopo la guerra (Apres la Guerre)       0.00    TRUE
#> 10:           Avengers: Infinity War      12.75      NA
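
Once the nested column is unpacked into one row per purchase, ordinary data.table grouping applies. The following is a small sketch under the assumption that you want per-customer totals; the table `unpacked` is rebuilt by hand from the printed output above so the snippet runs without an Elasticsearch result, and the summary column names are invented for illustration.

```r
library(data.table)

# Hand-built copy of the relevant columns from the unpacked output above
unpacked <- data.table(
    cust_name = c(rep("Austin", 3), rep("James", 4), rep("Nick", 3)),
    film = c("The Notebook", "The Town", "Zootopia",
             "Minions", "Rogue One", "Bridesmaids", "Bridesmaids",
             "Aala Kaf Ifrit", "Dopo la guerra (Apres la Guerre)",
             "Avengers: Infinity War"),
    pmt_amount = c(6.25, 8.00, 7.50, 6.25, 10.25, 8.75, 6.25, 0, 0, 12.75),
    matinee = c(NA, NA, TRUE, TRUE, NA, NA, TRUE, TRUE, TRUE, NA)
)

# Total spend and number of matinee tickets per customer.
# matinee is NA where the field was absent in the JSON, so drop NAs when counting.
spend <- unpacked[, .(total_spent = sum(pmt_amount),
                      n_matinee = sum(matinee, na.rm = TRUE)),
                  by = cust_name]
print(spend)
```

Note that fields missing from some array elements (here `matinee`) come through as NA after unpacking, so aggregations usually need `na.rm = TRUE` or an explicit fill.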