When you buy goods from a grocery store and the cashier registers them, the store’s application interfaces use SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. to update the transaction into a relationalData is organized relationally when it's distributed across columns, rows, and tables in such a way that entries are linked together using keys. For example, a user can be represented by a user_id both in a table of purchases (who made the purchase) and in a table of users (who is the user). databaseStructured storage for data that usually serves a singular purpose. For example, a company's financial records would be stored in a database. and to update the inventory stock count.
When you log into your favorite app on your phone, that app uses SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. to fetch your user data records from the user databaseStructured storage for data that usually serves a singular purpose. For example, a company's financial records would be stored in a database..
When you start building your own app, one of the first things you’ll do is create a databaseStructured storage for data that usually serves a singular purpose. For example, a company's financial records would be stored in a database. schemaAn analytics system uses schemas to parse, validate, and store events ingested by the collector. The schema dictates what a valid event looks like, what data types are accepted by the system, and what values are required in all incoming events. Schema can also be used to describe the structure of other things, such as the Data Layer. with some tables and entities – with SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets., of course.
And when you use a data analytics system like Google Analytics or Snowplow, you are using SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. either indirectly through an analysis interface or directly by typing the queries yourself.
SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. is ubiquitous in the digital world. It’s a language that’s available in a million different contexts – from application code to spreadsheet scripts, from analytics interfaces to smart TVs, and from AI assistants to smart kitchen appliances.
For technical marketers, it’s one of the key skills to learn. For a data-savvy marketer, SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. offers a way to communicate directly with the data structures of an organization. Without direct SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. access, the marketer is forced to rely on the user interfaces and the abstractions of the tools and services that the organization uses. While this is not an issue for many use cases, it does become limiting in the long run, especially if the marketer wants to have direct data access to verify or validate the results these tools offer.
In this Topic, you’ll learn the key concepts of SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. as well as some of the more ontological insights that structured dataStructured data is a semantic layer on a website, explaining what the website is about and what different entities on the page are. This information is usually provided for bots and crawlers to consume. can introduce to your “data-informed” world view. As with all languages, SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. is more than just a communication tool that lets you query data. It’s also a way to understand and to structure the world.
Don’t miss this fact!
Analytics systems often have reporting interfaces that let you pull in data from the data warehouseData warehouse is a repository of data collected by an organization from different sources. The data can then be transformed within the data warehouse before being made available for querying against. without having to write any SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets.. This is very useful. However, remember that these systems build their own abstractions against their own schemasAn analytics system uses schemas to parse, validate, and store events ingested by the collector. The schema dictates what a valid event looks like, what data types are accepted by the system, and what values are required in all incoming events. Schema can also be used to describe the structure of other things, such as the Data Layer.. By learning SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets., you can verify and validate how these systems work so that you can understand them better and so that you can structure the source data in optimal ways.
Validate and verify with SQL
When you use an analytics system for your reporting needs, you are relying on abstractions and visualizations determined by the system itself. Some analytics tools let you edit these definitions to some extent, but ultimately these tools draw up their own queries against their own data using their own preconceptions about what those queries should look like.
In many cases, this is just fine. After all, as long as the analytics system produces consistent results it doesn’t necessarily matter how the sausage gets made, right?
However, these systems are often designed for a global population of users with vastly different business needs and market fits. Vendored analytics systems can’t tailor to unique business cases of each individual customer – instead, they choose a more homogenous approach with dimensions, metrics, and queries that are generic enough to suit the majority of their customers without singling out any one of their unique approaches.
But what if you do want to be singled out? What if you want your analytics system to relate details about your data in a way that goes against the grain of what is generally accepted?
What if you don’t agree with the analytics system’s definitions? What if you believe your business questions to be unique enough to require flexibility from the analytics system that the user interface doesn’t offer?
This should always be the case! You should always be critical of the systems that you use and the abstractions they rely on.
Unfortunately, many analytics systems simply don’t give you the freedom of interacting with data in any way you like. Google Analytics, for example, places restrictions on what dimensions and metrics you can query together. Similarly, its key aggregation units of “Users” and “Sessions” are based on algorithms that you might not agree with.
This is where direct data access becomes your friend and ally. By using SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. on the data produced by these analytics systems, you can break the definition lock and choose your own approach without being hamstrung by the user interface of the analytics system.
-- Count users as the number of different "customer_id" values
SELECT
COUNT(DISTINCT(
SELECT
value.string_value
FROM
UNNEST(event_params)
WHERE
key = 'customer_id')) AS users
FROM
data_table
In the query above, the aggregation unit of users is redefined by counting the number of unique customer_id
values in the databaseStructured storage for data that usually serves a singular purpose. For example, a company's financial records would be stored in a database.. The analytics system itself doesn’t let you change the key metric “Users” to this in the user interface, so you need to resort to SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. to get the job done.
SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. lets you complement the analytics interfaces of the tools you use with your own approaches to querying the underlying data. SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. is like your second opinion generator – you can use it to validate and verify the aggregations and visualizations available in the analytics user interface.
Deep Dive
Fixed schema of the GA4 session
A solid example of an analytics system forcing you to adopt its schemaAn analytics system uses schemas to parse, validate, and store events ingested by the collector. The schema dictates what a valid event looks like, what data types are accepted by the system, and what values are required in all incoming events. Schema can also be used to describe the structure of other things, such as the Data Layer. is the “Session” definition of Google Analytics.
Sessions play a key part in Google Analytics. They bucket multiple events from the same user together, as long as they happened within a certain timeframe. Sessions are often called “visits”, too. Thus, they can be used as an analogy to a single “user journey” on the site or the app.
In Google Analytics 4, the definition of a session is very simple: a session expires after 30 minutes of no events collected from the user. In other words, when GA4 collects events from a user, they are always grouped into the same session until 30 minutes pass with no events collected from the user. At that point, the session ends. The next time the user sends an event to GA4, it will be part of a new session.
While this 30 minute timeout can be changed, it will always be the same for all your users.
The problem with this is how incredibly abstract and generic this definition is.
It doesn’t take into account the interactions that happen within the session, nor does it consider visits from different sources (e.g. first visiting the site from a Google search and shortly after from a Facebook ad), and it doesn’t stop the session when a clear milestone is reached (such as a purchase conversion).
When you use GA4, you will always be subject to this definition of a session. If you want to break free of this abstraction, you need to directly access the underlying data to reconstruct this aggregation.
SQL syntax
For data queries, SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. syntax is quite readable. It relies on English language patterns rather than obscure programming shorthand.
In SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets., you SELECT
data FROM
a table.
Sometimes you can further qualify which source data is accessed by instructing the query to pull in data WHERE
the data matches certain criteria.
user_id | purchase_date |
---|---|
adcd_1255 | 13/11/2023 |
gffh_4888 | 11/10/2023 |
adcd_1255 | 10/10/2023 |
pttt_3441 | 05/09/2023 |
-- Count the number of purchases for each user
SELECT
user_id,
COUNT(*) AS purchases
FROM
transaction_table
GROUP BY
user_id
user_id | purchases |
---|---|
adcd_1255 | 2 |
gffh_4888 | 1 |
pttt_3441 | 1 |
Often, the data you query has relations to other data – either in the same table or in other tables. For example, you might have a table of purchases where each transaction is associated with a user_id
. This user_id
can also be found in a table of users with additional details about the user, such as their country or the number of times they’ve already made a purchase.
In these cases, you might end up building a query that SELECT
s data FROM
a table that is JOIN
ed with another table for additional information.
user_id | country |
---|---|
adcd_1255 | Finland |
gffh_4888 | Finland |
pttt_3441 | Sweden |
-- Count the number of purchases by country
SELECT
country,
COUNT(*) AS purchases
FROM
user_table JOIN transaction_table USING user_id
GROUP BY
country
country | purchases |
---|---|
Finland | 3 |
Sweden | 1 |
In the query above, each purchase is associated with the country of the user who made the purchase by fetching the country
for each user_id
from the related user table.
SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. can get very messy very fast. Often, the readability, quality, and performance of any given query is directly related to the quality of the data being queried. If the data is messy, lacking structure, and spread across a number of tables, your queries will become structurally unviable, too.
In this way, your understanding of SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. can directly impact how you organize the data engineering efforts in your organization.
If you find it difficult to write efficient queries against your data, it might be a symptom of an unnecessarily complex data architectureHow data is structured, stored, and utilized within an organization. It's a collection of data models, schemas, data flow maps, and governance instructions. in your organization. If you run into roadblocks consistently, you might need to consider altering the architecture to make querying the data less of a chore. After all, in many organizations it is precisely the querying of the data that should dictate its organization rather than cost or complexity of storage.
Don’t miss this fact!
If you find yourself struggling with creating a clean and readable SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. query, it might not be because of your lack of skills with the language. A potential culprit is messy source data that requires a lot of wrangling before it produces a clean result.
Digital marketing use cases
Where there is an analytics question, there is a SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. query.
In digital marketing, having access to the raw data generated by your marketing tools is invaluable. Couple this data access with decent SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. skills, and you can vastly improve your understanding of the efficiency of your marketing channels and how they interact with each other.
Here is a non-exhaustive list of things you can do with SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. when querying relevant data sources.
Customer Behavior Analysis
SegmentWhen data is grouped by property, attribute, or value, it is segmented. When building audiences for ads, for example, you need to choose for which user segments to target the ads. audiences based on their actions on your sites and in your apps. Combine this with their purchase history and demographic information. Feed these segmentsWhen data is grouped by property, attribute, or value, it is segmented. When building audiences for ads, for example, you need to choose for which user segments to target the ads. into your campaign management tools for more control over audience building and targeting.
Query user data across online and offline data tables to build a more comprehensive look into their lifetime value to your organization.
Generalize observations across larger datasets that might suffer from gaps or patches of bad data.
Campaign Analysis and Optimization
Pull data from analytics, advertising, and social media data sources, and join it together around users and traffic sources. Analyze the performance of your marketing campaigns across different channels.
Calculate the return on investment (ROI) of your campaigns by aligning marketing spend with outcomes like sales and conversions across different channels.
Trends and Consumer Insights
Use machine learning to analyze patterns and changes in consumer behavior over time. Detect anomalies before they become disruptive problems for your organization.
Feed data from social media and customer reviews into dedicated systems that can gauge public sentiment towards your brand.
Personalization and Content Strategy
Analyze which type of content performs best with different audience segmentsWhen data is grouped by property, attribute, or value, it is segmented. When building audiences for ads, for example, you need to choose for which user segments to target the ads.. Design your content strategy and content calendars around these insights.
Create personalized and near-real-time product recommendations based on the user’s interactions with your brand both in the past and in the current session.
Fuel Marketing Integrations
Use SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. for automated data extraction, transformation, and reporting in supported marketing tools and platforms. Build scheduled pipelines that deliver the right type of the data at the right time to the right person.
Build useful dashboards using SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. on the source data in a data visualization tool that supports this approach.
As a technical marketer, you are most likely working with SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. on a daily basis. You might not type the queries yourself, but the tools and interfaces you interact with will most certainly query the underlying data using SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets..
SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. can be revelatory not only in how it validates and verifies the underlying data but also how it helps you restructure the data to be more performant when accessed through a query interface.
SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. is thus more than just a language – it’s an approach to organizing the data within your organization. The act of querying is the end result of a long chain of decisions that begins with data architectureHow data is structured, stored, and utilized within an organization. It's a collection of data models, schemas, data flow maps, and governance instructions. design. At its best, SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. can be a powerful feedback loop into improving the data structures that your organization relies on.
Key takeaway #1: Use SQL to skip the analytics UI
Analytics tools build their user interfaces as abstraction layers on top of the raw data. Often, there’s a lot of processing, thresholding, sampling, and normalizing going on that turns the thing you see in the UI quite different from what is actually stored in the data tables. With SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. and raw data access, you can build your understanding of the underlying data. You can generate your own abstractions that work better with your business questions than those used by the analytics tool.
Key takeaway #2: SQL makes you a better measurement planner
While it’s tempting to think of SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. as just a query language, it’s also a way to structure the world, too. The more you understand about the declarative logic of SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets., the better you can organize your data to support the query use cases. Your knowledge of SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. can feed directly into how you build your measurement plans. Instead of looking at data as arbitrary rows and entities that hopefully make sense once collected, you can approach data collection from the POV of how you’ll eventually want to query the data. And here, a solid understanding of SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. will be instrumental in making your measurement plans more comprehensive.
Key takeaway #3: Digital marketing use cases
SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. is everywhere in digital marketing. From data analysis with data from tools like Google Analytics and Piwik Pro to running dashboard reports in Looker Studio or Tableau, from building machine learning models on complex datasets to starting complex data flows in a cloud data pipeline. There’s not a single data-related discipline in digital marketing that wouldn’t benefit from SQLStructured Query Language (SQL) is a standardized, domain-specific language designed for accessing and manipulating data. It is particularly powerful with structured and relational datasets. knowledge.