Fionas Treasure Chest
Jun 20, 2024
3 Minutes
504 Words

sql hacks - querying faster than the puppet master

In a List

Here is a list of some useful SQL hacks that I use almost every day to make my work a little bit easier:

  • keep queries compact for easy debugging and to save query cost.
    • use “session” mode in bigquery, this stores your various query results for a couple of hours
      → this reduces both, query runtime and query cost!
    • reduce query complexity by creating a “reference” table at logical increments.
      → use CREATE OR REPLACE TABLE, and then referg to the table instead of continuing to build a very long query with many cte’s
      → this has the added benefit that you can easily sanity check stuff in the different tables instead of having to dig through the full query every time
    • reduce input table size by adding filters where possible.
      → listing only the necessary columns
      → using the WHERE clause right in the beginning
  • think ahead by creating actions that can be adjusted easily.
    • start the query with a cte for each table youre using so you can just refer to the cte instead of type the full name of the table each time
    • assign an alias to a column that will be used later for joining
      → this enables you to assign a join column with “USING (consistentColumnName)” instead of “ON table1.columnName = table2.differentColumnName”
    • always start your WHERE clause with “WHERE 1=1”
      → enables you to easily comment out other where filters during testing
  • future-proof the query by explaining everything unusual that’s going on in the query and avoiding clutter.
    • using table aliases for each table in a join and using these for each column added to the select statement
      → lets you easily identify where a column came from
    • keep a short note on each cte to state what it does
    • add an explanation for every filter in the WHERE clause
    • absolutely no cte within a cte! these are very difficult to read quickly

As a query example

Here’s a code that covers what I described in the bullet points bk who wants to read text if you can just read code, hehe

1
WITH
2
-- in this query we are pulling sales data for the past 90 days
3
-- on weekly level and seller account segment
4
-- for very important stakeholder ofc and for important reason
5
6
calendar AS (
7
SELECT
8
*,
9
date_day AS amsterdamEventDate
10
FROM 'bq_project.subfolder.dim_financial_calendar'
11
WHERE
12
date_day BETWEEN CURRENT_DATE()-90 AND CURRENT_DATE()
13
),
14
15
accounts AS (
39 collapsed lines
16
SELECT
17
*
18
FROM 'bq_project.subfolder.dim_account_list'
19
),
20
21
sales AS (
22
SELECT
23
*
24
FROM 'bq_project.subfolder.fct_sales'
25
WHERE
26
-- only last 90 days are relevant to request
27
amsterdamEventDate BETWEEN CURRENT_DATE()-90 AND CURRENT_DATE()
28
),
29
30
-- in this CTE we are joining all tables
31
-- and aggregating on segment and week level
32
sales_aggregations AS (
33
SELECT
34
calendar.date_week AS amsterdamEventWeek,
35
accounts.segment,
36
SUM(sales.orderTotal) AS salesSegment,
37
SUM(sales.orderTotal)
38
OVER (PARTITION BY calendar.date_week) AS salesAllSegments
39
FROM calendar
40
LEFT JOIN sales USING (amsterdamEventDate)
41
LEFT JOIN accounts USING (accountId)
42
GROUP BY ALL, calendar.date_week
43
44
)
45
46
SELECT
47
*,
48
ROUND(
49
SAFE_DIVIDE(salesSegment, salesAllSegments)
50
,2) AS proportionSegment
51
FROM sales_aggregations
52
WHERE 1=1
53
--AND segment = "Longtail"
54
ORDER BY amsterdamEventWeek

Image source: one of the first 5 images on google search

Copyright 2026
Sitemap