Quickstart
This 15-minute tutorial teaches you how to run a basic analysis in Motif. You can interactively explore the examples in this guide by clicking the “Motif link” links below the screenshots.
We'll use the “Motiflix” dataset, which simulates common user sessions in a movie streaming service like Netflix. Let's imagine that we are analysts on the Motiflix growth team and are exploring ways to improve the search experience to increase movie watches.
Motif uses a small but powerful Sequence Operations Language (SOL) to query sequences. The most basic SOL query finds sequences containing a specific event, for example, when Motiflix users performed a search (viewed the search results page):
match search_results_page
This query finds the first event called search_results_page in each sequence and tags it, so it can be highlighted in visualizations and used in subsequent SOL operations. The best way to understand how SOL queries work is by viewing their results in the “Examples” view:
The match operation tags matched events but does not remove sequences without matches. Notice that matches to search_results_page events have blue tags above them. Sequences without matches don't have tags. Matched events are preceded by a Prefix tag, which is added automatically and contains all events before the match. You can expand Prefix to see its events by hovering over its pill in the "Tag Map" above the visualization.
To filter out sequences without matches, we can add the filter operation:
match search_results_page
filter MATCHED
Here MATCHED is another automatically added tag, consisting of all matched events. Filtering removes sequences without it.
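The difference between tagging and filtering can be mimicked in a few lines of Python. This is only a toy model, not Motif's implementation: sequences are plain lists of event names, and the helper names (match_first, tag_sequence) are hypothetical. It also assumes that Prefix is simply everything before the first match.

```python
# Toy model of `match search_results_page` followed by `filter MATCHED`.
# Sequences are plain lists of event names; this is not Motif's data model.

def match_first(sequence, event_name):
    """Return the index of the first event named event_name, or None."""
    for index, name in enumerate(sequence):
        if name == event_name:
            return index
    return None


def tag_sequence(sequence, event_name):
    """Tag the first match and the events before it (assumed Prefix/MATCHED behaviour)."""
    index = match_first(sequence, event_name)
    if index is None:
        # No match: the sequence is kept, but it carries no tags.
        return {"events": sequence, "tags": {}}
    tags = {"Prefix": sequence[:index], "MATCHED": [sequence[index]]}
    return {"events": sequence, "tags": tags}


sequences = [
    ["home_page", "search_results_page", "movie_page"],
    ["home_page", "movie_page"],
]

# Matching keeps every sequence; filtering drops the ones without MATCHED.
tagged = [tag_sequence(s, "search_results_page") for s in sequences]
filtered = [t for t in tagged if "MATCHED" in t["tags"]]
```

In this sketch, tagged still holds both sequences, while filtered holds only the one that contains a search.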
To narrow down to situations where searching was followed later by starting to watch a movie, we can use the following query:
match search_results_page >> * >> watch_start
filter MATCHED
Here we define a match pattern consisting of 3 consecutive states. The * in the 2nd state matches any event any number of times, similar to quantifiers in regular expressions. Here is a screenshot of the result:
Notice that matched watch_start events are displayed with red tags.
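The regex analogy can be made concrete with Python's re module. The sketch below is an illustration only: it assumes each event name is encoded as a single letter (the CODES mapping is hypothetical) and checks whether a sequence contains the three-state pattern at all, without claiming anything about which events SOL would tag.

```python
import re

# Hypothetical one-letter codes for event names, used only for this illustration.
CODES = {
    "home_page": "H",
    "search_results_page": "S",
    "movie_page": "M",
    "watch_start": "W",
}


def encode(sequence):
    """Turn a list of event names into a string of one-letter codes."""
    return "".join(CODES[name] for name in sequence)


# S, then any events any number of times, then W.
PATTERN = re.compile(r"S.*W")


def has_match(sequence):
    """True if a search is followed, at any later point, by a watch start."""
    return PATTERN.search(encode(sequence)) is not None
```

For example, has_match(["search_results_page", "home_page", "watch_start"]) is true, while a sequence in which the watch precedes the search does not match.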
As we can see from the screenshot above, the events in between might include home_page or other search_results_page events. To find situations where searching directly led to starting to watch a movie, we can add an exclude list of event names to the middle state:
match search_results_page >> (^home_page,search_results_page)* >> watch_start
filter MATCHED
The ^ symbol is borrowed from the NOT operator in regex, and the excluded events must be grouped in ().
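In regex terms, the exclude list behaves much like a negated character class. The following Python sketch uses the same hypothetical one-letter encoding of event names; it is an analogy to help read the query, not a description of how Motif evaluates it.

```python
import re

# Hypothetical one-letter codes for event names, used only for this illustration.
CODES = {
    "home_page": "H",
    "search_results_page": "S",
    "movie_page": "M",
    "watch_start": "W",
}


def encode(sequence):
    """Turn a list of event names into a string of one-letter codes."""
    return "".join(CODES[name] for name in sequence)


# S, then any number of events that are neither H nor S, then W.
DIRECT = re.compile(r"S[^HS]*W")


def searched_then_watched_directly(sequence):
    """True if some search reaches a watch start with no home page or new search in between."""
    return DIRECT.search(encode(sequence)) is not None
```

A sequence such as search, home page, watch no longer matches, whereas search, search, movie page, watch still does, because the second search leads directly to the watch.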
We can narrow user behavior further to situations where searching directly led to starting to watch a movie within 2 minutes with no more than 5 events in between:
match search_results_page >> (^home_page,search_results_page){0,5} >> watch_start
if watch_start.ts - search_results_page.ts < 2min
filter MATCHED
{0,5} is another event quantifier from the regex syntax. search_results_page.ts and watch_start.ts access the time dimension ts of the events tagged by the match. Here are the results:
Notice that the share of matched sequences, shown by the "Filtered actors to ...%" metric at the bottom of the page, went down from 23% to 12%.
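To see what the bounded quantifier and the time condition add, here is a plain-Python sketch of the same idea. It assumes events are (name, timestamp) pairs sorted by time, and the function name find_direct_watch is hypothetical. How SOL itself combines the if condition with alternative matches is not covered here, so treat this as one possible reading.

```python
from datetime import datetime, timedelta

# Events that must not appear between the search and the watch start.
EXCLUDED = {"home_page", "search_results_page"}


def find_direct_watch(events, max_between=5, max_gap=timedelta(minutes=2)):
    """Return (search_index, watch_index) for the first qualifying pair, or None.

    events is a list of (name, timestamp) pairs, assumed to be sorted by time.
    """
    for i, (name, ts) in enumerate(events):
        if name != "search_results_page":
            continue
        # Look at the next events: up to max_between in between, plus the watch itself.
        for j in range(i + 1, min(i + max_between + 2, len(events))):
            next_name, next_ts = events[j]
            if next_name == "watch_start":
                if next_ts - ts < max_gap:
                    return i, j
                break  # any later watch would be even further away in time
            if next_name in EXCLUDED:
                break  # a home page or a new search interrupts the path
    return None


t0 = datetime(2024, 1, 1, 12, 0, 0)
quick = [
    ("search_results_page", t0),
    ("movie_page", t0 + timedelta(seconds=30)),
    ("watch_start", t0 + timedelta(seconds=90)),
]
slow = [
    ("search_results_page", t0),
    ("movie_page", t0 + timedelta(seconds=30)),
    ("watch_start", t0 + timedelta(minutes=5)),
]
```

Here find_direct_watch(quick) returns a pair of indexes, while find_direct_watch(slow) returns None because the watch starts more than 2 minutes after the search.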
Next, let’s see if the number of returned search results is correlated with whether a user starts watching one of them. Because the query requires a watch_start event, it filters out sequences without watches. To capture whether users start a movie after searching, we can make the watch_start state optional by adding the ? quantifier to it:
match search_results_page >> (^home_page,search_results_page){0,5} >> watch_start?
if watch_start.ts - search_results_page.ts < 2min
filter MATCHED
Now we can view the correlation we are after in the "Outcome" view. Using the "Tag Map" above the plots, we select the outcome to be the presence of the watch_start tag and the condition to be the search_results dimension on the tagged search_results_page event:
As we can see in the top left plot, the probability of watching a movie decreases as more search results are returned:
Other plots display the duration and number of steps between searching and watching (top right), rates of watching over time (bottom), and correlations of other dimensions of the tagged search_results_page event with watch_start (right panel).
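Conceptually, the outcome-by-condition plot is a grouped rate: for each value of the condition, the fraction of matched searches where the outcome tag is present. The Python sketch below shows that arithmetic on made-up toy records; the function name and the numbers are illustrative only and are not taken from the Motiflix dataset or from Motif's internals.

```python
from collections import defaultdict


def watch_rate_by_condition(records):
    """Map each condition value to the fraction of records where the outcome is present.

    records is an iterable of (condition_value, has_watch_start) pairs.
    """
    totals = defaultdict(int)
    watched = defaultdict(int)
    for condition_value, has_watch_start in records:
        totals[condition_value] += 1
        if has_watch_start:
            watched[condition_value] += 1
    return {value: watched[value] / totals[value] for value in sorted(totals)}


# Made-up toy records: (search_results value, watch_start tag present).
toy_records = [
    (8, True), (8, True), (8, False),
    (12, False), (12, True), (12, False), (12, False),
]
rates = watch_rate_by_condition(toy_records)
```

With these toy records, two of the three searches with 8 results lead to a watch, compared with one of the four searches with 12 results.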
Next, let's explore what user behaviors might be contributing to the observed differences in watch rates. A good way to do this is to align sequences on the search_results_page event:
match search_results_page
filter MATCHED
... and displaying them in "Barcode" view:
Here you can see the prominence of various user paths starting from the matched event. In the screenshot above, one path is selected and can be viewed as a funnel in the panel on the right.
We can compare user paths with low and high numbers of returned search results by again setting the condition in the Tag Map to be the search_results dimension on the search_results_page event, and then setting groups A and B to be 8 and 12 search results, respectively, in the display options above the Barcode plot:
As you can see, the plot is mostly grey, with some events colored blue and some red where there are significant differences between the groups in reaching that event. In the screenshot above, we have selected the big blue node and can see in the right panel that it corresponds to more users starting to watch a movie from a movie_page in group A (8 search results) than in group B (12 results): 64.2% vs. 14.8%.
Finally, let’s compute the distribution of durations between search_results_page and watch_start events:
match search_results_page >> * >> watch_start
filter MATCHED
set search_to_watch_time = duration(search_results_page, watch_start)
This query uses the set SOL operation to create a new sequence dimension, search_to_watch_time, with the duration() SOL function. You can review a full list of supported functions in the documentation. Sequence dimensions can be plotted in the "Metrics" view:
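As a rough mental model of what this dimension holds, the Python sketch below computes the time between a search and a later watch start for one sequence. It assumes events are (name, timestamp) pairs sorted by time and pairs the first search with the first watch start after it; which events SOL's duration() actually pairs, and in what unit it reports, is not specified here.

```python
from datetime import datetime


def search_to_watch_time(events):
    """Seconds from the first search to the first later watch start, or None.

    events is a list of (name, timestamp) pairs, assumed to be sorted by time.
    """
    search_ts = None
    for name, ts in events:
        if search_ts is None and name == "search_results_page":
            search_ts = ts
        elif search_ts is not None and name == "watch_start":
            return (ts - search_ts).total_seconds()
    return None


example = [
    ("home_page", datetime(2024, 1, 1, 11, 59, 0)),
    ("search_results_page", datetime(2024, 1, 1, 12, 0, 0)),
    ("movie_page", datetime(2024, 1, 1, 12, 0, 40)),
    ("watch_start", datetime(2024, 1, 1, 12, 1, 30)),
]
seconds = search_to_watch_time(example)
```

Applying such a function to every filtered sequence yields one value per sequence, which is the kind of per-sequence quantity a distribution plot summarizes.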
Now you know how to write basic SOL queries and use core Motif data visualizations.
Next steps
- Learn how to load your data locally
- Learn more about the Sequence Operations Language (SOL)
- Explore analysis recipes