Outcome predictors

View this example in Motif here.

This is an investigation into the events and event pairs that predict an outcome of interest.

Video

SOL query

// Split into sessions wherever the gap between consecutive events exceeds one hour
match split Session()+
if duration(Session[-1], SUFFIX[0]) > 1h

// Focus our analysis on watch_end
match watch_end
// Create a binary outcome
set watched = length(watch_end)

// Store the baseline watch rate (the average of watched across sequences, 0.694 here)
set baseline_watchrate = 0.694

// Calculate each user's marginal contribution to the total
set marginalgain = watched - baseline_watchrate

// Remove the events after watch_end (the suffix)
replace SUFFIX with null
// Remove the watch_end event
replace MATCHED with null

// On each event, store a string called "pair" with the previous event and current event
set SEQ[1:].pair = concat(SEQ[:-1].name, " >> ", SEQ[1:].name)
set SEQ[0].pair = "BASE"

// Split each sequence into single events, keeping only pairs that are new for that sequence, and copy the pair value from the event onto the sequence
match split A()
if A.pair not in PREFIX.pair
set __PATH = A.pair
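
For readers who want to check the pair logic outside Motif, here is a minimal Python sketch of what the query does to a single session. It is an illustration, not SOL and not Motif's implementation: the function name `distinct_pairs` and the plain list of event names are hypothetical, and it assumes that sessions without a watch_end event are kept whole.

```python
from typing import List, Tuple

OUTCOME = "watch_end"  # outcome event name, taken from the query above


def distinct_pairs(events: List[str]) -> Tuple[int, List[str]]:
    """Return (watched, pairs) for one session (hypothetical helper).

    watched is 1 if the outcome occurs in the session, else 0.
    pairs holds each "previous >> current" label once, in order of first
    appearance, built only from the events before the first outcome.
    """
    watched = 1 if OUTCOME in events else 0
    # Drop the outcome event and everything after it.
    # Assumption: sessions without the outcome are kept in full.
    prefix = events[: events.index(OUTCOME)] if watched else list(events)

    pairs: List[str] = []
    seen = set()
    for i, name in enumerate(prefix):
        # The first event has no predecessor, so it gets the "BASE" label.
        label = "BASE" if i == 0 else f"{prefix[i - 1]} >> {name}"
        if label not in seen:  # keep a pair only the first time it appears
            seen.add(label)
            pairs.append(label)
    return watched, pairs
```

For example, a session of `open, search, open, search, watch_end, close` yields `watched = 1` and the pairs `BASE`, `open >> search`, and `search >> open`; the repeated `open >> search` is dropped.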

Key steps

  1. After sessionizing, we match on our key outcome, watch_end, and set a binary value, watched, on each user's sequence.
match watch_end
set watched = length(watch_end)
  2. We take the average of the binary value to get the baseline (0.694 in this example) and store it on the sequence. Subtracting the baseline from watched then gives each user's marginal contribution to the number of watch_end events.
set baseline_watchrate = 0.694
set marginalgain = watched - baseline_watchrate
  3. Create a new dimension on each event called "pair", which combines the previous event name and the current event name. For the first event, which has no previous event, set it to the value "BASE".
set SEQ[1:].pair = concat(SEQ[:-1].name, " >> ", SEQ[1:].name)
set SEQ[0].pair = "BASE"
  4. Split each sequence into single events, but only where the "pair" value is new for that sequence. Finally, copy the value of pair from the event to the sequence.
match split A()
if A.pair not in PREFIX.pair
set __PATH = A.pair
  5. We then use the table feature to identify the pairs (stored in __PATH) that drive increases or decreases in marginal gain.
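
The aggregation in the last step can be sketched in Python as follows. This is only an approximation of what the table shows: it assumes the table averages marginalgain per pair, and the function name `marginal_gain_by_pair` and the `(watched, pairs)` input shape are hypothetical. Here the baseline is computed from the data rather than hard-coded.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def marginal_gain_by_pair(
    sessions: Iterable[Tuple[int, List[str]]],
) -> Dict[str, Tuple[int, float]]:
    """Map each pair to (session count, mean marginal gain).

    Each input item is (watched, pairs): the binary outcome for a session
    and the pair labels seen in it. Hypothetical helper for illustration.
    """
    sessions = list(sessions)
    if not sessions:
        return {}
    # Baseline: the average of the binary outcome over all sessions.
    baseline = sum(watched for watched, _ in sessions) / len(sessions)

    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for watched, pairs in sessions:
        gain = watched - baseline  # this session's marginal gain
        for pair in set(pairs):  # count each pair once per session
            totals[pair] += gain
            counts[pair] += 1
    # Assumption: the table reports the mean marginal gain per pair.
    return {pair: (counts[pair], totals[pair] / counts[pair]) for pair in counts}
```

A pair with a positive mean appears in sessions that reach watch_end more often than the baseline; a negative mean points the other way. The session count is worth reading alongside the mean, since a pair seen in only a handful of sessions gives a noisy estimate.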
