Research scientist at Lyft Inc.
This is a short post that describes how json-ized arrays can be converted to rows in presto db. Suppose you have the following table, named
friends in presto db (both columns are of type VARCHAR).
|tom||[“megan”, “harry”, “bob”]|
|brad||[“sam”, “kelly”, “chad”]|
Here we see that the column
friend_list is a json-ized array. Further, let us assume that the list is rank ordered, and that you want to obtain the following table as final output (all friends who have rank <=2)
Then, you need the following query
WITH friends(name, friend_list) AS ( VALUES ('tom', '["megan", "harry", "bob"]'), ('brad', '["sam", "kelly", "chad"]') ) , friends_array AS ( SELECT name , CAST(JSON_PARSE(friend_list) AS ARRAY(VARCHAR)) AS friends FROM friends ) SELECT name , friend , friend_rank FROM friends_array CROSS JOIN UNNEST(friends) WITH ORDINALITY AS friends(friend, friend_rank) WHERE friend_rank <= 2 ORDER BY 1, 2 DESC
While most of the code here is self explanatory, the
CROSS JOIN bit (line 20 above) is kind of tricky. It is one of those things where it becomes clear that any vendor who produces a new flavor of SQL has
annoying the user as a necessary goal. Why is it that presto must introduce a
UNNEST(...) when hive already has
LATERAL VIEW EXPLODE to do essentially the same thing? I am sure this was hotly debated by the writers of presto. Standardization be damned, they couldn’t just inherit syntax from the slow and sluggish hive, could they? Or Perhaps it is just that they did not have my phone number handy to settle this debate once and for all **makes mental note to send phone number to firstname.lastname@example.org**. Oh, well! There you have it — your query, and my rant. Happy SQLing!