A step-by-step guide on how to track active subscriptions without using snapshots in Salesforce CRM Analytics

The article was originally published on our linkedin.com page.

If you run a subscription-based business, you likely want to track the number of subscriptions you have each year, month, and day. Additionally, you may want to monitor their dynamics, monthly average revenue, and other key metrics.

In this article, we'll show you how to achieve this using Salesforce CRM Analytics (also known as Tableau CRM, Einstein Analytics, Wave, EdgeSpring) without relying on data snapshots, just by utilizing the data you already have along with some SAQL. No special skills are required. Let's get started!

Example and Assumptions

The most frequent data structure involves having a record (contact, account, etc.) with two dates:

  • Start Date,

  • End Date.

Example: An opportunity starts in November 2021 and ends in April 2022, indicating a 6-month opportunity.

Your goal is to group your data by year-month and observe the number of all active opportunities for each month (as shown in the cover photo of this article). In the above case, you want to view this opportunity as active in November 2021, December 2021, and through April 2022. Like this:

As you can see in the chart, this opportunity is marked as active in each month between Nov 2021 and Apr 2022. How can this be achieved with only two dates?

Solution using SAQL and "fill" function

In my example:

  • I use the opportunity dataset,

  • where the Start Date is 'CreatedDate,'

  • and the End Date is 'CloseDate' (using generic fields).

data = load "opportunity2"

-- get created date and give it a universal name

start = group data by ('Id', 'CreatedDate_Year', 'CreatedDate_Month');
start = foreach start generate 
'Id', 
'CreatedDate_Year' as 'Date_Year',
'CreatedDate_Month' as 'Date_Month',
1 as 'mark_cr';

-- get close date and give it the same universal name

close = group data by ('Id', 'CloseDate_Year', 'CloseDate_Month');
close = foreach close generate 
'Id', 
'CloseDate_Year' as 'Date_Year',
'CloseDate_Month' as 'Date_Month',
1 as 'mark_cl';

First, create two streams of data and group by both date fields. In my case, it's by month because I track my data monthly. It's important that you generate 'Year' and 'Month' fields with the same names in both streams. I use the 'mark_cr' and 'mark_cl' fields later to indicate which month is a start date and which is an end date.

-- blend both stram

r = union start, close;
r = group r by ('Id', 'Date_Year', 'Date_Month');
r = foreach r generate 
'Id',  
'Date_Year', 
'Date_Month',
sum('mark_cr') as 'mark_cr',
sum('mark_cl') as 'mark_cl';

r = fill r by (dateCols=(Date_Year, Date_Month, "Y-M"), partition='Id');

Then union both streams and group by Id and our derived date fields. This will result in having two rows per opportunity - one with start date, the other with end date ⬇️

Once you have that, use the 'fill' function to fill the gaps between the dates! The system is smart enough to fill not only the dates between your two per opportunity but essentially takes the earliest and the latest from the whole dataset and fills them accordingly for each opportunity. We'll filter them out later. Here is what you should get:

In my case date range always starts with 01.2020.

r = group r by ('Id', 'Date_Year', 'Date_Month')
r = foreach r generate 
'Id',  
'Date_Year', 
'Date_Month',
sum('mark_cr') as 'mark_cr',
sum('mark_cl') as 'mark_cl',
sum(sum('mark_cr')) over ([..0] partition by 'Id' order by 'Id') as 'cr total',
sum(sum('mark_cl')) over ([0..] partition by 'Id' order by 'Id') as 'cl total';
r = filter r by 'cr total' == 1 && 'cl total' == 1;

To filter out irrelevant months, use a windowing function. The first function will fill with 1 for all months from the start date to eternity. The second function will fill with 1 for all months from the beginning (in my case, 01.2020) until the end date. In this situation, the relevant lines are only those that have '1' in both columns. You can do it more simply, but this method explains it better.

As you can see, both 'total' columns are filled only between the start and end dates. The filter in the last row of code removes all irrelevant lines.

r = group r by ('Date_Year', 'Date_Month')
r = foreach r generate 
'Date_Year' + "~~~" + 'Date_Month' as 'CloseDate_Year~~~CloseDate_Month',
sum('cr total') as 'Number of active oppties';
r = order r by 'CloseDate_Year~~~CloseDate_Month' asc;

As the last step, you group all records not by Id anymore but by date, aiming to obtain the number of Ids that are active in every month. In my code example, I generated the data using CloseDate because I used it in a Time Bar chart (which requires the use of a date field from the dataset - another issue we'll leave for now).

And that's it! My query has a total of 45 lines, including comments, a separate line for each field, and empty lines for clarity. When compressed, it has 16 lines.

data = load "opportunity2"

-- extract created date and give it a universal name, add created date epoch
start = group data by ('Id', 'CreatedDate_Year', 'CreatedDate_Month');
start = foreach start generate 
'Id', 
'CreatedDate_Year' as 'Date_Year',
'CreatedDate_Month' as 'Date_Month',
1 as 'mark_cr';

-- extract close date and give it a universal name, add close date epoch
close = group data by ('Id', 'CloseDate_Year', 'CloseDate_Month');
close = foreach close generate 
'Id', 
'CloseDate_Year' as 'Date_Year',
'CloseDate_Month' as 'Date_Month',
1 as 'mark_cl';

-- blend both strams
r = union start, close;
r = group r by ('Id', 'Date_Year', 'Date_Month');
r = foreach r generate 
'Id',  
'Date_Year', 
'Date_Month',
sum('mark_cr') as 'mark_cr',
sum('mark_cl') as 'mark_cl';
r = fill r by (dateCols=(Date_Year, Date_Month, "Y-M"), partition='Id');

r = group r by ('Id', 'Date_Year', 'Date_Month');
r = foreach r generate 
'Id',  
'Date_Year', 
'Date_Month',
sum('mark_cr') as 'mark_cr',
sum('mark_cl') as 'mark_cl',
sum(sum('mark_cr')) over ([..0] partition by 'Id' order by 'Id') as 'cr total',
sum(sum('mark_cl')) over ([0..] partition by 'Id' order by 'Id') as 'cl total';
r = filter r by 'cr total' == 1 && 'cl total' == 1;

r = group r by ('Date_Year', 'Date_Month');
r = foreach r generate 
'Date_Year' + "~~~" + 'Date_Month' as 'CloseDate_Year~~~CloseDate_Month',
sum('cr total') as 'Number of active oppties';
r = order r by 'CloseDate_Year~~~CloseDate_Month' asc;

Conclusion

With this solution, you can provide your business with real-time tracking of active opportunities or subscriptions, without using heavy snapshot recipes or multiplying datasets—all within your query.

This also serves as a solid foundation for measuring MRR, ARR, and tons of other metrics essential for your business.

Does the solution fit your business? Let’s explore this idea together:

Previous
Previous

UX Wednesday: 3 things about data we can learn from head-up displays [+ 3 examples]

Next
Next

A Cloud & Berry customer success story: How to sell more tailored services to SaaS customers using Salesforce CRM Analytics