ASHIVNI SHEKHAWAT
Research scientist at Lyft Inc.

This is a short post that describes how json-ized arrays can be converted to rows in presto db. Suppose you have the following table, named friends in presto db (both columns are of type VARCHAR).

namefriend_list
tom[“megan”, “harry”, “bob”]
brad[“sam”, “kelly”, “chad”]

Here we see that the column friend_list is a json-ized array. Further, let us assume that the list is rank ordered, and that you want to obtain the following table as final output (all friends who have rank <=2)

namefriendfriend_rank
bradsam1
bradkelly2
tommegan1
tomharry2

Then, you need the following query

WITH friends(name, friend_list) AS (
  VALUES
    ('tom', '["megan", "harry", "bob"]'),
    ('brad', '["sam", "kelly", "chad"]')
)

, friends_array AS 
(
  SELECT
    name
    , CAST(JSON_PARSE(friend_list) AS ARRAY(VARCHAR)) AS friends
  FROM friends
)

SELECT 
  name
  , friend
  , friend_rank
FROM friends_array
CROSS JOIN UNNEST(friends) WITH ORDINALITY AS friends(friend, friend_rank)
WHERE friend_rank <= 2
ORDER BY 1, 2 DESC

While most of the code here is self explanatory, the CROSS JOIN bit (line 20 above) is kind of tricky. It is one of those things where it becomes clear that any vendor who produces a new flavor of SQL has annoying the user as a necessary goal. Why is it that presto must introduce a UNNEST(...) when hive already has LATERAL VIEW EXPLODE to do essentially the same thing? I am sure this was hotly debated by the writers of presto. Standardization be damned, they couldn’t just inherit syntax from the slow and sluggish hive, could they? Or Perhaps it is just that they did not have my phone number handy to settle this debate once and for all **makes mental note to send phone number to support@presto.com**. Oh, well! There you have it — your query, and my rant. Happy SQLing!