This is a short post that describes how json-ized arrays can be converted to rows in presto db. Suppose you have the following table, named friends
in presto db (both columns are of type VARCHAR).
name | friend_list |
tom | [“megan”, “harry”, “bob”] |
brad | [“sam”, “kelly”, “chad”] |
Here we see that the column friend_list
is a json-ized array. Further, let us assume that the list is rank ordered, and that you want to obtain the following table as final output (all friends who have rank <=2)
name | friend | friend_rank |
brad | sam | 1 |
brad | kelly | 2 |
tom | megan | 1 |
tom | harry | 2 |
Then, you need the following query
WITH friends(name, friend_list) AS (
VALUES
('tom', '["megan", "harry", "bob"]'),
('brad', '["sam", "kelly", "chad"]')
)
, friends_array AS
(
SELECT
name
, CAST(JSON_PARSE(friend_list) AS ARRAY(VARCHAR)) AS friends
FROM friends
)
SELECT
name
, friend
, friend_rank
FROM friends_array
CROSS JOIN UNNEST(friends) WITH ORDINALITY AS friends(friend, friend_rank)
WHERE friend_rank <= 2
ORDER BY 1, 2 DESC
While most of the code here is self explanatory, the CROSS JOIN
bit (line 20 above) is kind of tricky. It is one of those things where it becomes clear that any vendor who produces a new flavor of SQL has annoying the user
as a necessary goal. Why is it that presto must introduce a UNNEST(...)
when hive already has LATERAL VIEW EXPLODE
to do essentially the same thing? I am sure this was hotly debated by the writers of presto. Standardization be damned, they couldn’t just inherit syntax from the slow and sluggish hive, could they? Or Perhaps it is just that they did not have my phone number handy to settle this debate once and for all **makes mental note to send phone number to support@presto.com**. Oh, well! There you have it — your query, and my rant. Happy SQLing!