How to use array join to extract and query varying attributes using map keys and values
Question
If I have varying attributes in a column using map types, how can I extract them and use them in queries?
Answer
This is a basic example of extracting keys and values from a variable attributes field. This method will create seemingly duplicates from each row in the source/raw table. Due to the keys and values being extracted, however, they can be put into the Primary Key or a secondary with an index, such as a bloom filter.
In this example, we basically have a source that creates a metrics table, it has multiple attributes that can apply in an attributes field that has maps. If there are attributes that will always be present for records, it is better to pull those out into their own columns and populate.
You should be able to just copy and paste to see what the outputs would be and what the materialized view does in this instance.
Create a sample database:
Create the initial table that will have the rows and attributes:
Insert sample rows into the table. The sample size is intentionally small so that when the materialized view is created, you can see how the rows are multiplied for each attribute.
We can then create a materialized view with array join so that it can extract the map attributes onto keys and values columns. For demonstration, in the example below, it uses an implicit table (with the POPULATE command, and backing table like .inner.{uuid}...
). The recommended best practice, however, is to use an explicit table where you wouldd define the table first, then create a materialized view on top with the TO
command instead.
The new table will have more rows and will have the keys extracted, like this:
From here, in order to query for your rows that need certain attributes, you would do something like this: