How I Set Up Nightly GA4 Data Pulls into Google Sheets
A technical guide to archiving GA4 data automatically. Overcoming API limits, quotas, and building a historical database for free.
The Problem: Data Retention & Agility
Most marketers don't know that detailed user-level data in standard GA4 properties expires after 2 months or 14 months (depending on your settings). If you want to analyze a cohort from 2 years ago, you are often out of luck.
I didn't want my data to have an expiration date. I also hated the latency—sometimes waiting 24-48 hours for data to settle in the interface.
So, I built an automated "Nightly Data Pull" system. Think of it as a methodical data harvest.
The Engineering: How It Works
This isn't enterprise-grade rocket science; it's a scrappy, robust script running on Google Apps Script.
Step 1: The 'Yesterday' Trigger
Every morning at 3:00 AM, my script wakes up. It calculates yesterday's date (today - 1). This ensures I'm always asking for a complete 24-hour cycle.
Step 2: The Dimensions & Metrics
I don't pull everything. I pull 3 specific tables to different sheets:
- Traffic_Daily:
date,sessionSourceMedium,campaignName,sessions,activeUsers. - Pages_Daily:
date,pagePath,views,entrances. - Events_Daily:
date,eventName,eventCount,conversions.
Step 3: Stacking the Data
The script doesn't overwrite the sheet. It appends rows. It finds the last empty row and pastes the new data. Over time, this builds a massive, vertical database inside the sheet.
Issues Faced (The "Gotchas")
1. Sampling on "Day" Dimension
Surprisingly, even daily pulls can get sampled if the property is huge. I solved this by splitting the request. If I pull "Traffic" separate from "Pages," sampling is rare. If I try to combine "Source" + "Page" + "Device" + "City" in one request, sampling hits.
2. The 10-Million Cell Limit
Google Sheets has a hard limit of 10 million cells. My "Pages_Daily" tab hit this after 18 months for a large client.
The Fix: I implemented a "Yearly Archival" logic. The script checks the date. If it's January 1st, it creates a new spreadsheet file "GA4_Data_2025" and starts writing there.
Why Go Through This Trouble?
1. Historical Analysis That Actually Works
I can now run a query: "Show me the performance of the 'Black Friday' campaign for the last 3 years."
In GA4, this is difficult because campaign names might have changed, or data expired. In my Sheet, it's just a SUMIFS formula.
2. Blending with Non-Google Data
I have another script that pulls Shopify sales data nightly. Because both datasets live in Sheets and share a date key, I can blend them perfectly. I can calculate "Total Profit / GA4 Session"—a metric impossible to get in GA4 alone.
3. Owning the Asset
Data is an asset. If Google Analytics goes down, or changes its pricing (like the UA sunset), I have my CSVs. I am platform-independent.
Build your own data warehouse (for free).
You don't need expensive ETL tools or Snowflake to start archiving your data. I can set up a robust Google Sheets archiving system for you in a weekend.
Start Archiving Data