Generating a list of dates between two dates can be very helpful in many situations. Many source systems typically provide you with a start and an end date for a record, but if you want to match values within the range – you would need to have every date as a single entity.
Understanding the GENERATE_DATE_ARRAY function
We can use a single function, GENERATE_DATE_ARRAY, in BQSQL to perform this task. This function creates an array of DATE or TIMESTAMP elements, starting from a specific date and ending at another.
The syntax is pretty straightforward:
GENERATE_DATE_ARRAY(start_date, end_date, [interval])
The ‘start_date’ and ‘end_date’ are quite self-explanatory. The ‘interval’ is optional and specifies the period between the dates in the array.
Practical example
I have a dataset in BigQuery containing campaign ID’s and their respective start and end dates. I want to match the sales associated with the campaign for every single day during the campaign period, but I am unable to do this now as I only have the start and end date.
Let’s use the GENERATE_DATE_ARRAY function to solve this. Since the function will add all the values into an array by default, we need to unnest the function, we can do so easily by wrapping it in a UNNEST function.
SELECT
campaignId,
date
FROM
campaigns.campaignlist,
UNNEST(GENERATE_DATE_ARRAY(startDate, endDate)) AS date
This query will group campaignId by all dates between (and including) start and end date. When we run the query, the result is as follows:
We can now match all the campaign ID’s individual dates to another table. Very simple, very powerful!