Snowflake Materialized Table


Snowflake just announced Materialized Tables for streaming data. They were described as the mid-point between streams and tasks and materialized views on the flexibility/simplicity spectrum. The goal is to enable declarative transformation of streaming data.

Table Materialization, in my opinion, will enable real-time loading into Data Vault Satellites! Go Snowflake!

Sample SQL for creating Materialized Tables in Snowflake:

-- Number of records being ingested
select count (*) from streaming_ad_record;

-- Raw streaming Ad click records that Snowpipe ingests
select record_content
from streaming_ad_record
order by RECORD_METADATA: CreateTime
  desc Limit 10;

---- AD -> CAMPAIGN mapping reference data set that will 
---- be joined with ad data in Materialized Table

select * from ad_campaign_map;
--Create Materialized Table for computing Campaign spend
create or replace materialized table campaign_spend lag = '1 minute' AS
select c.campaign,
  as_varchar (k: channel) channel
  to_date(to_timestamp(to_number (k: timestamp))) date
  sum(to_decimal(k:click)) total clicks,
  sum(to_decimal(k:cost)) total_cost,
  sum(1) ads served
from
  ad_campaign_map c, 
  (select PARSE_JSON(RECORD_CONTENT) as k from streaming_ad_record) 
  where
  c. ad_id = to_number (k: ad_id) group by
  c. campaign, channel, date;

-- Materialized Table results which shows daily ad 
-- cost, clicks, etc. across channels & campaigns

select * from campaign_spend order by date desc Limit 500;
Creating a Materialized Table in Snowflake

Summary

Materialized Tables for streaming data makes it easy to join and transform Data in a streaming manner. A Materialized Table is a declarative Data Pipeline. Describe what to do instead of how to achieve i.e. write declarative code for the transformation and Snowflake handles the incremental refresh to materialize the Pipeline. The Snowflake Materialized Table automatically refreshes as new data streams in.