August 2022 dbt Update: v1.3 beta, Tech Partner Program, and Coalesce!
Semantic layer, Python model support, the new dbt Cloud UI and IDE… there’s a lot our product team is excited to share with you at Coalesce in a few weeks.
But how these things fit together—because of where dbt Labs is headed—is what I’m most excited to discuss.
You’ll hear more in Tristan’s keynote, but this feels like a good time to remind you that Coalesce isn’t just for answering tough questions… it’s for surfacing them. For sharing challenges we’ve felt in silos, finding the people you want to solve them with, and spending the rest of the year chipping away at them. As Tristan says in his latest blog, that’s how this industry moves forward.
What's new
- dbt Core v1.3 beta: Do you use Python for analytics? The first beta prerelease of dbt Core v1.3, which includes support for dbt models written in Python, is ready to explore! Check it out, and read more about dbt-supported Python models in our docs.
- Technology Partner Program: We just launched our new Technology Partner Program with 40+ friends in the Modern Data Stack to provide consistent support for seamless integrations that joint users can trust. Check our new dbt Cloud integrations page for what’s available today!
- Single-tenant users: dbt Cloud v1.1.60 is now available on dbt Cloud Enterprise.
What’s better
- dbt Cloud UI: The new dbt Cloud UI is in beta, and any multi-tenant dbt Cloud customer can opt in. It offers a cleaned-up interface, better ergonomics, and fewer clicks to frequently used screens.
- dbt Cloud IDE: Did you catch Staging last month (our quarterly product update)? The dbt Cloud IDE has been overhauled for greater speed and performance, and is now in beta—enroll to check it out!
New resources
Things to try 🛠️
- dbt_artifacts v1.2.0: Brooklyn Data Co just shipped a pretty significant rewrite of the dbt_artifacts package. Capture all the metadata generated by dbt at the end of an invocation (project nodes, success rate, test results, etc.), and store it directly in Snowflake, Databricks, or BigQuery for immediate analysis.
- dbt YAML validator using JSON schema: If you do any development in VS Code, this repo unlocks autocomplete and validation for dbt’s YAML files. Find those tests that never ran because you messed up the indentation. Not that that would ever happen to you.
- dbt Exposures for Hightouch: Exposures in dbt allow you to quickly see how downstream data applications are making use of your dbt models and sources. These don’t have to just represent dashboards in BI tools though — you can now represent your Hightouch syncs as dbt exposures too.
- Are you a certified dbt developer? We recently launched our new Analytics Engineering certification program, and would love to hear what you think. We personally dug this writeup from Charles Verleyen on what to expect, and exactly how much experience/prep he recommends.
Things to read 📚
- How to enforce rules at scale: It’s best practice to add model tests in dbt, but can you require it? In his latest blog, Benoit Perigaud (dbt Labs Senior Analytics Engineer) shares how to use the pre-commit-dbt package to do just that.
- How we shaved 90 minutes off a model: Check out how we used the model timing tab in dbt Cloud to find and re-architect our longest running model.
- How to decide between hashed or integer surrogate keys: Dave Connors (dbt Labs Senior Analytics Engineer) breaks down the pros and cons of each approach in dbt.
- How to think about dbt Python models in Snowpark: Eda Johnson wrote a nice primer on how to approach dbt-supported Python models in Snowflake with Snowpark Python.
- dbt Labs is officially partnering with Monte Carlo: The partnership makes it simple for analytics engineers to supplement dbt testing with end-to-end observability.
- How Comcast accidentally invented a feature store in 2013: What a genuinely delightful read. Josh Berry details the peaks and pits of a fast-moving data science team that transcended an initial aversion to documentation to build “Rosetta.”
Consulting corner 🌎
I just discovered the treasure trove of excellent resources from dbt Labs consulting partners, and want to start sharing more here. Here are a few you might have missed over the summer:
- Reduce ETL costs: I’ve only just seen this blog from Mighty Digital, but found it to be a super practical (and concise) introductory guide to rethinking your ETL pipeline with dbt.
- Explore data: Part two of a series on exploring data brought to you by Vivanti. This post focuses on working with JSON objects in dbt, but I also recommend the preceding post if you want to see how they spun up their stack.
- Track historical changes: Snapshots are a pretty handy feature for tracking changes in dbt, but they’re often overlooked during initial onboarding. Montreal Analytics explains how to set them up in dev/prod environments.
- Learn dbt: Have some new faces on the data team that might need an introduction to dbt? Our friends at GoDataDriven are hosting a virtual dbt Learn on Sept 12-14.
Thank you!
This month’s newsletter was brought to you by Joel, Gloria, Azzam, Amos, and me (Lauren).