Funnel.io x Metric Labs: Why Your Data Differs Inside and Outside GA4

As digital marketers, we’re all familiar with Google Analytics 4 (GA4). Yet, frustrations linger, especially when it comes to data discrepancies and mismatches. In collaboration with Funnel.io and Metric Labs, let’s delve into the reasons behind these discrepancies and explore strategies to enhance insights by leveraging GA data in other tools.

This blog is a recap of the webinar; you can view a recording of the full webinar here.

Meet the Experts:

Leading the charge is Vincent Maneno, heading a team of data analytics experts in Australia, specializing in GA and analytics. Joining forces is Gabi Somadelis, a Solution Consultant at Funnel.io, offering expertise to brands and agencies across APAC, streamlining marketing reporting processes.

Why Does Google Estimate Data?

Google Analytics 4 (GA4) estimates some metrics because precise counts require significant memory and can slow down reporting. For example, GA4’s session count is an estimate of the number of unique session IDs. To improve accuracy and reduce error rates, Google Analytics revised how it calculates session metrics in October 2021.
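
To make the memory trade-off concrete, here is a minimal Python sketch that contrasts an exact distinct count with a memory-bounded estimate. It uses a simple K-minimum-values estimator chosen purely for illustration; it is not presented as GA4’s actual algorithm, and the function names, the session ID format, and the value of k are all assumptions.

```python
import hashlib
import heapq


def _unit_hash(value: str) -> float:
    """Map a string to a pseudo-random float in [0, 1)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def exact_sessions(session_ids):
    """Exact distinct count: memory grows with every new session ID."""
    return len(set(session_ids))


def estimated_sessions(session_ids, k=1024):
    """K-minimum-values estimate: keeps at most k hashes in memory.

    Illustrative only; k is a hypothetical tuning parameter.
    """
    heap = []        # max-heap (negated values) of the k smallest hashes
    members = set()  # hashes currently held in the heap
    for sid in session_ids:
        h = _unit_hash(sid)
        if h in members:
            continue  # already tracked, duplicates do not change the sketch
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the largest one kept: swap them.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct IDs seen: count is exact
    # The k-th smallest hash sits near k / n, so n is roughly (k - 1) / hash.
    return round((k - 1) / -heap[0])
```

The exact count has to remember every ID it has seen, while the estimate holds at most k hashes regardless of traffic volume, at the cost of an error that is typically a few percent at this value of k.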

Discrepancy by Reporting Surface

Estimation is not the only source of discrepancy; figures also vary across reporting surfaces. While the GA4 UI and the Data API align closely, BigQuery data often diverges significantly, for the reasons below.
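
One way to quantify the gap between two surfaces is to compare the same metric by date and flag the days that differ by more than a chosen tolerance. The Python sketch below does this under stated assumptions: the function names, the 5% tolerance, and the daily session figures are all hypothetical and are not taken from the webinar.

```python
from typing import Dict


def relative_gap(ui_value: float, bq_value: float) -> float:
    """Signed gap of the BigQuery value relative to the GA4 UI value."""
    if ui_value == 0:
        raise ValueError("UI value is zero; relative gap is undefined")
    return (bq_value - ui_value) / ui_value


def flag_discrepancies(
    ui_by_date: Dict[str, float],
    bq_by_date: Dict[str, float],
    tolerance: float = 0.05,  # hypothetical threshold, tune to your needs
) -> Dict[str, float]:
    """Return dates present on both surfaces whose gap exceeds the tolerance."""
    flagged = {}
    for date in sorted(ui_by_date.keys() & bq_by_date.keys()):
        gap = relative_gap(ui_by_date[date], bq_by_date[date])
        if abs(gap) > tolerance:
            flagged[date] = gap
    return flagged


# Hypothetical daily session counts, for illustration only.
ui_sessions = {"2024-03-01": 1000, "2024-03-02": 1200}
bq_sessions = {"2024-03-01": 1020, "2024-03-02": 1320}
print(flag_discrepancies(ui_sessions, bq_sessions))
```

With these made-up numbers, only the second day is flagged, because its gap is above the tolerance while the first day's is within it.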

Factors Contributing to Discrepancies

  • Data Scoping: Google Analytics stores data in either aggregated or granular tables. BigQuery’s granular rows affect data retrieval, emphasising the importance of understanding metric and dimension compatibility.

  • High Cardinality: Dimensions with high cardinality, such as URL paths, pose challenges. Google Analytics may consolidate less common rows, particularly over shorter time frames.

  • Sampling: When a query exceeds GA4’s sampling limits, the results are sampled, which widens the discrepancy between BigQuery data and the GA4 UI.

  • Thresholding: Data thresholds prevent individual users from being identified through demographics or interests, and apply particularly when Google signals is enabled.

  • Modelling: Conversion and behavioural data may undergo modelling in GA4, leading to discrepancies, especially for users who decline cookies.

  • Data-Driven Attribution: Attribution models applied in the GA4 UI and the Data API may not be reflected in BigQuery data.
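
The high-cardinality point in the list above can be illustrated with a short Python sketch: once a breakdown has more rows than a reporting limit allows, the less common rows are rolled into a single "(other)" row. The function name, the row limit, and the page-view figures are hypothetical, and GA4’s real limits and consolidation logic are not reproduced here.

```python
from collections import Counter
from typing import Dict


def consolidate_rows(row_counts: Dict[str, int], max_rows: int) -> Dict[str, int]:
    """Keep the max_rows most common rows and sum the rest into "(other)".

    max_rows is a hypothetical limit chosen for illustration.
    """
    ranked = Counter(row_counts).most_common()
    kept = dict(ranked[:max_rows])
    remainder = sum(count for _, count in ranked[max_rows:])
    if remainder:
        kept["(other)"] = kept.get("(other)", 0) + remainder
    return kept


# Hypothetical page views by URL path, for illustration only.
views = {"/": 500, "/pricing": 200, "/blog/a": 3, "/blog/b": 2, "/blog/c": 1}
print(consolidate_rows(views, max_rows=2))
```

The total is preserved, but the detail for the low-volume paths is lost in the consolidated view, whereas a granular export still holds a row for each of them.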

Beyond the reporting surface, Google signals also affects the metrics reported.

Understanding Google Signals

Google signals are session data from sites and apps that Google associates with users who have signed in to their Google accounts, and who have turned on Ads Personalisation. This association of data with these signed-in users is used to enable cross-device reporting, cross-device remarketing, and cross-device conversion export to Google Ads.

In the screenshot below, we compare session data broken down by date; you can see the discrepancy between when Google signals is on and when it is off.

What is the Solution?

To address these challenges, a strategic approach is essential:

  • Start with the End in Mind: Prioritise long-term trends and insights over short-term discrepancies.

  • Take the Data Outside of GA4: Explore alternative tools to enhance visualisation capabilities and improve data retention.

  • Consider Your Options: Three types of tools are highlighted for data management – BI tools, storage tools, and marketing data hubs.

Diving Deeper into Tool Options

Looker Studio as a BI Tool

Looker Studio serves as an initial step for reporting outside of GA4, offering easy data connection and report sharing. However, limitations exist in data quota and historical data accessibility.

BigQuery as a Storage Option

BigQuery provides a storage solution with larger data quota limits and improved data retention for year-over-year analysis. However, technical expertise may be required for integration with multiple channels.

Marketing Data Hub

A marketing data hub like Funnel.io offers built-in storage, connectors to multiple channels, and automated API management. This option provides flexibility, eliminates quota limits and sampling issues, and allows for custom data harmonisation without requiring technical resources.

Moving Forward

Moving forward, it is critical that marketers take proactive steps to improve their data management practices. Businesses can gain deeper insights and make better decisions by prioritising long-term trends, exploring other tools for working with GA4 data, and weighing options such as BI tools, storage solutions, and marketing data hubs. To begin your journey to data-driven success, contact Metric Labs and Funnel.io now for expert advice and support.
