BIGQUERY — Flow of events (CJM)

There are a lot of questions related to the chain of events accompanying the user’s path through the site / application.
In particular:
- what is the user path (CJM) when interacting with the site / application?
- how much time passes between the first user event and the last one, or any
others interesting events for us?
- the first / last event in the user’s path.
How to get an answer to them using Bigquery, I will describe below in several parts.
Part one - CJM user, by the minute.
As a result of executing the query, we will get the following table.
Within the framework of a single user, we see the events accompanying him through the site, the event_name field and the time when the event occurred in the Time column with details to the second.
So, the first user event … 5AB1 is screen_view committed at 13: 32.19
the last event is also a screen-view, but already committed at 13:33:55.
In total, when interacting with the site / application, 14 events occurred.
15 events if you pay attention already belongs to another user.

request code:
SELECT user_pseudo_id,
event_name,
FORMAT_TIME(
'%T',TIME(TIMESTAMP_MICROS(event_timestamp))
) AS TimeFROM `firebase-public-project.analytics_153293282.events_20181003`GROUP BY user_pseudo_id,
event_name,
event_timestampORDER BY user_pseudo_id,
event_timestamp
The table obtained as a result of this query, a kind of framework in order to understand the logic of building a query, and based on this, go further in order to get answers to deeper questions.
For example, “how much time passes between the two nearest events in CJM”.
Part two. The time difference between the nearest events.
In order to be able to calculate the time difference between the current event and the next one in time, we add two columns Next_event & Next_time, and we get the opportunity to calculate the time difference between two consecutive events. Сalculate the difference and place it in the Diff_time column.

The query that generates the table has an onion structure:
- The first block - is the flow of user events, the same basic table that
was considered earlier.
- The second block - adds fields describing the next event in the chain of
events.
- The third block - we calculate the difference in seconds between the two
nearest events.
- The fourth block - we format the time in a readable format.
-- ( 4 ) The fourth block - Format timeSELECT user_pseudo_id,
event_name,
FORMAT_TIME('%T',TIME) AS Time,
Next_event,
FORMAT_TIME('%T',Next_time ) AS Next_time,
Diff_Time
FROM (-- ( 3 ) The third block - Difference in timeSELECT *,
TIME_DIFF(Next_time, Time, SECOND) AS Diff_Time
FROM (-- ( 2 ) The second block - Next Events & TimeSELECT *,
LEAD(event_name) OVER (PARTITION BY user_pseudo_id
ORDER BY Time) AS Next_event,
LEAD(Time) OVER (PARTITION BY user_pseudo_id
ORDER BY Time) AS Next_time
FROM(-- ( 1 ) The first block - Events flowSELECT user_pseudo_id,
event_name,
TIME(TIMESTAMP_MICROS(event_timestamp)) AS TimeFROM `firebase-public-project.analytics_153293282.events_20181003`GROUP BY user_pseudo_id,
event_name,
TimeORDER BY user_pseudo_id,
Time
)GROUP BY user_pseudo_id,
event_name,
TimeORDER BY user_pseudo_id,
Time
)
)
So, let’s analyze the result of executing each of the blocks of sql code.
The first block - the flow of user events, in fact the same base table with which we began our meeting, the only difference is that the time is not in a human readable format. There is only one reason if we are now let’s format the time, later we won’t be able to calculate the time difference.

-- ( 1 ) The first block - Events flowSELECT user_pseudo_id,
event_name,
TIME(TIMESTAMP_MICROS(event_timestamp)) AS TimeFROM `firebase-public-project.analytics_153293282.events_20181003`GROUP BY user_pseudo_id,
event_name,
TimeORDER BY user_pseudo_id,
Time
The second block - adds columns describing the next event in the Next_event & Next_time chain of events

-- ( 2 ) The second block - Next Events & TimeSELECT *,
LEAD(event_name) OVER (PARTITION BY user_pseudo_id
ORDER BY Time) AS Next_event,
LEAD(Time) OVER (PARTITION BY user_pseudo_id
ORDER BY Time) AS Next_time
FROM(-- ( 1 ) The first block - Events flowSELECT user_pseudo_id,
event_name,
TIME(TIMESTAMP_MICROS(event_timestamp)) AS TimeFROM `firebase-public-project.analytics_153293282.events_20181003`GROUP BY user_pseudo_id,
event_name,
TimeORDER BY user_pseudo_id,
Time
)
The third block - adds a Diff_Time column with the difference in seconds between the two closest events.

-- ( 3 ) The third block - Difference in timeSELECT *,
TIME_DIFF(Next_time, Time, SECOND) AS Diff_Time
FROM (-- ( 2 ) The second block - Next Events & TimeSELECT *,
LEAD(event_name) OVER (PARTITION BY user_pseudo_id
ORDER BY Time) AS Next_event,
LEAD(Time) OVER (PARTITION BY user_pseudo_id
ORDER BY Time) AS Next_time
FROM(-- ( 1 ) The first block - Events flowSELECT user_pseudo_id,
event_name,
TIME(TIMESTAMP_MICROS(event_timestamp)) AS TimeFROM `firebase-public-project.analytics_153293282.events_20181003`GROUP BY user_pseudo_id,
event_name,
TimeORDER BY user_pseudo_id,
Time
)GROUP BY user_pseudo_id,
event_name,
TimeORDER BY user_pseudo_id,
Time
)
The fourth block - formatting the time into a readable format.

-- ( 4 ) The fourth block - Format timeSELECT user_pseudo_id,
event_name,
FORMAT_TIME('%T',TIME) AS Time,
Next_event,
FORMAT_TIME('%T',Next_time ) AS Next_time,
Diff_Time
FROM (-- ( 3 ) The third block - Difference in timeSELECT *,
TIME_DIFF(Next_time, Time, SECOND) AS Diff_Time
FROM (-- ( 2 ) The second block - Next Events & TimeSELECT *,
LEAD(event_name) OVER (PARTITION BY user_pseudo_id
ORDER BY Time) AS Next_event,
LEAD(Time) OVER (PARTITION BY user_pseudo_id
ORDER BY Time) AS Next_time
FROM(-- ( 1 ) The first block - Events flowSELECT user_pseudo_id,
event_name,
TIME(TIMESTAMP_MICROS(event_timestamp)) AS TimeFROM `firebase-public-project.analytics_153293282.events_20181003`GROUP BY user_pseudo_id,
event_name,
TimeORDER BY user_pseudo_id,
Time
)GROUP BY user_pseudo_id,
event_name,
TimeORDER BY user_pseudo_id,
Time
)
)