Quickstart

This tutorial takes about 15 minutes and teaches you how to run a basic analysis in Motif. You can explore the examples in this guide interactively by clicking the “Motif link” below each screenshot.

We'll be using the “Motiflix” dataset, which simulates common user sessions in a movie streaming service like Netflix. Let's imagine that we are analysts on the Motiflix growth team, exploring ways to improve the search experience in order to increase movie watches.

Motif uses a small but powerful Sequence Operations Language (SOL) to query sequences. The most basic SOL query finds sequences that contain a specific event. For example, to find when Motiflix users performed a search (viewed the search results page):

match search_results_page

This query finds the first event called search_results_page in each sequence and tags it, so it can be highlighted in visualizations and used in subsequent SOL operations. The best way to understand how SOL queries work is by viewing their results in the “Examples” view:

[Screenshot: match search_results_page] Motif link

The match operation tags matched events but does not remove sequences without matches. Notice that matched search_results_page events have blue tags above them, while sequences without matches have no tags. Each matched event is preceded by a Prefix tag, which is added automatically and contains all events before the match. You can expand Prefix to see its events by hovering over its pill in the "Tag Map" above the visualization.

To filter out sequences without matches, we can add the filter operation:

match search_results_page
filter MATCHED

Here MATCHED is another automatically added tag, consisting of all matched events. Filtering removes sequences without it.

[Screenshot: match search_results_page, filter MATCHED] Motif link
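To make the difference between tagging and filtering concrete, here is a minimal Python sketch of the behavior described above. It is not Motif's implementation: the toy sequences, the `match_first` helper, and the dictionary used to hold tags are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical toy data: each sequence is a list of event names.
sequences = {
    "user_a": ["home_page", "search_results_page", "movie_page"],
    "user_b": ["home_page", "movie_page"],
}

def match_first(events: list[str], name: str) -> Optional[dict]:
    """Tag the first event called `name`; return None if there is no match."""
    if name not in events:
        return None
    i = events.index(name)
    # Prefix holds every event before the match; MATCHED holds matched positions.
    return {"Prefix": events[:i], name: i, "MATCHED": [i]}

# Like `match`: tag events but keep every sequence, matched or not.
tagged = {actor: match_first(ev, "search_results_page") for actor, ev in sequences.items()}

# Like `filter MATCHED`: drop sequences that carry no MATCHED tag.
filtered = {actor: tags for actor, tags in tagged.items() if tags is not None}
```

In this sketch `tagged` still contains both users, while `filtered` keeps only `user_a`.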

To narrow down to situations where searching was followed later by starting to watch a movie, we can use the following query:

match search_results_page >> * >> watch_start
filter MATCHED

Here we define a match pattern consisting of three consecutive states. The * in the second state matches any event any number of times, similar to the * quantifier in regular expressions. Here is a screenshot of the result:

[Screenshot: match search_results_page >> * >> watch_start] Motif link

Notice that matched watch_start events are displayed with red tags.
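The three-state pattern can be pictured with a short Python sketch. This is only an illustration, not how Motif evaluates SOL: the function name is hypothetical, and it assumes the match starts at the first search and ends at the first watch_start after it.

```python
def match_search_then_watch(events):
    """Return (search_idx, watch_idx) for a search followed later by a watch, else None.

    Assumption: the match starts at the first search_results_page and ends
    at the first watch_start after it; any events may sit in between.
    """
    for i, name in enumerate(events):
        if name == "search_results_page":
            for j in range(i + 1, len(events)):
                if events[j] == "watch_start":
                    return i, j
            # No watch_start after the first search, so later searches cannot match either.
            return None
    return None

# Hypothetical session: two searches and a home page visit before the watch.
session = [
    "home_page", "search_results_page", "home_page",
    "search_results_page", "movie_page", "watch_start",
]
result = match_search_then_watch(session)
```

For this session the middle state absorbs the home_page, the second search, and the movie_page events.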

As we can see from the screenshot above, the events in between might include home_page or other search_results_page events. To find situations where searching directly led to starting to watch a movie, we can add an exclude list of event names to the middle state:

match search_results_page >> (^home_page,search_results_page)* >> watch_start
filter MATCHED

The ^ symbol is borrowed from negation in regular expressions (as in [^...] character classes), and the excluded event names must be grouped in parentheses.
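As a rough illustration of the exclude list, the following Python sketch rejects any candidate match whose middle events include an excluded name. The function name and the retry-from-a-later-search behavior are assumptions, not documented SOL semantics.

```python
EXCLUDED = {"home_page", "search_results_page"}

def match_direct(events, excluded=EXCLUDED):
    """Return (search_idx, watch_idx) where no excluded event lies in between, else None."""
    for i, name in enumerate(events):
        if name != "search_results_page":
            continue
        for j in range(i + 1, len(events)):
            if events[j] == "watch_start":
                return i, j
            if events[j] in excluded:
                break  # path was interrupted; try a later search instead
    return None

# Hypothetical session: the first search is interrupted by home_page,
# the second one leads directly to a watch.
session = ["search_results_page", "home_page", "search_results_page", "movie_page", "watch_start"]
result = match_direct(session)
```

Here the match anchors on the second search, because a home_page event follows the first one before any watch_start.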

We can narrow the behavior further to cases where searching led directly to starting to watch a movie within 2 minutes, with no more than 5 events in between:

match search_results_page >> (^home_page,search_results_page){0,5} >> watch_start
if watch_start.ts - search_results_page.ts < 2min
filter MATCHED

{0,5} is another quantifier borrowed from regex syntax: it allows between 0 and 5 events in the middle state. search_results_page.ts and watch_start.ts access the ts (time) event dimension of the events tagged by the match. Here are the results:

[Screenshot: complex match] Motif link

Notice that the share of matched sequences (the "Filtered actors to ...%" metric at the bottom of the page) went down from 23% to 12%.
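The bounded quantifier and the time condition can be sketched together in Python. This is a hedged illustration only: events are assumed to be (name, timestamp) pairs, the function name is hypothetical, and how SOL retries after a failed `if` condition is an assumption.

```python
from datetime import datetime, timedelta

EXCLUDED = {"home_page", "search_results_page"}

def match_quick_watch(events, max_gap=5, max_delay=timedelta(minutes=2)):
    """Return (search_idx, watch_idx) with at most `max_gap` non-excluded
    events in between and less than `max_delay` elapsed, else None."""
    for i, (name, ts) in enumerate(events):
        if name != "search_results_page":
            continue
        # watch_start may sit at most max_gap + 1 positions after the search.
        for j in range(i + 1, min(i + max_gap + 2, len(events))):
            other, other_ts = events[j]
            if other == "watch_start":
                if other_ts - ts < max_delay:
                    return i, j
                break  # too slow; try a later search
            if other in EXCLUDED:
                break  # interrupted; try a later search
    return None

# Hypothetical timestamps for two short sessions.
t0 = datetime(2024, 1, 1, 12, 0, 0)
fast = [("search_results_page", t0),
        ("movie_page", t0 + timedelta(seconds=30)),
        ("watch_start", t0 + timedelta(seconds=90))]
slow = [("search_results_page", t0),
        ("movie_page", t0 + timedelta(seconds=30)),
        ("watch_start", t0 + timedelta(seconds=150))]
```

With these inputs `match_quick_watch(fast)` finds a match, while `match_quick_watch(slow)` does not because the watch starts 150 seconds after the search.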

Next, let’s see if the number of returned search results is correlated with whether a user starts watching one of them. Because the query above requires a watch_start event, it filters out sequences without watches. To capture both outcomes (whether or not users start a movie after searching), we can add the optional ? quantifier to the watch_start state:

match search_results_page >> (^home_page,search_results_page){0,5} >> watch_start?
if watch_start.ts - search_results_page.ts < 2min
filter MATCHED

Now we can view the correlation we are after in the "Outcome" view. Using the "Tag Map" above the plots, set the outcome to the presence of the watch_start tag and the condition to the search_results dimension of the tagged search_results_page event:

[Screenshot: tag map]

As we can see in the top left plot, the probability of watching a movie decreases as more search results are returned:

[Screenshot: outcome view] Motif link

Other plots display duration and number of steps between searching and watching (top right), rates of watching over time (bottom), and correlations of other dimensions of the tagged search_results_page event with watch_start (right panel).
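The relationship shown in the top left plot amounts to a watch rate computed per value of search_results. A minimal Python sketch of that calculation follows; the records are hypothetical toy data rather than Motiflix results, and the function name is an assumption.

```python
from collections import defaultdict

# Hypothetical (search_results, watched) pairs, one per matched sequence.
records = [
    (8, True), (8, True), (8, False),
    (12, True), (12, False), (12, False), (12, False),
]

def watch_rate_by_results(records):
    """Return {search_results: share of sequences with a watch_start tag}."""
    totals = defaultdict(int)
    watched = defaultdict(int)
    for n_results, did_watch in records:
        totals[n_results] += 1
        watched[n_results] += int(did_watch)
    return {n: watched[n] / totals[n] for n in sorted(totals)}

rates = watch_rate_by_results(records)
```

Each value in `rates` is the outcome (watch_start present) conditioned on one value of the search_results dimension.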

Next, let's explore what user behaviors might be contributing to the observed differences in watch rates. A good way to do it is by aligning sequences on the search_results_page event:

match search_results_page
filter MATCHED

... and displaying them in "Barcode" view:

[Screenshot: barcode view] Motif link

Here you can see the prominence of various user paths starting from the matched event. In the screenshot above, one path is selected and can be viewed as a funnel in the panel on the right.

We can compare user paths for low and high numbers of returned search results. To do so, set the condition in the Tag Map to the search_results dimension of the search_results_page event again, then set groups A and B to 8 and 12 search results, respectively, in the display options above the Barcode plot:

[Screenshot: compare mode] Motif link

As you can see, the plot is mostly grey; events are colored blue or red where the two groups differ significantly in reaching them. In the screenshot above, we have selected the big blue node, and the right panel shows that it corresponds to more users starting to watch a movie from a movie_page in group A (8 search results) than in group B (12 search results): 64.2% vs. 14.8%.

Finally, let’s compute the distribution of durations between search_results_page and watch_start events:

match search_results_page >> * >> watch_start
filter MATCHED
set search_to_watch_time = duration(search_results_page, watch_start)

This query uses the set SOL operation to create a new sequence dimension, search_to_watch_time, using the duration() SOL function. You can review the full list of supported functions in the documentation. Sequence dimensions can be plotted in the "Metrics" view:

[Screenshot: metrics view] Motif link
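For intuition, the new dimension can be thought of as one elapsed time per matched sequence. The Python sketch below assumes that duration() is the difference between the timestamps of the two tagged events; the timestamps are hypothetical toy data.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical (search_ts, watch_ts) pairs, one per matched sequence.
t0 = datetime(2024, 1, 1, 12, 0, 0)
tagged = [
    (t0, t0 + timedelta(seconds=40)),
    (t0, t0 + timedelta(seconds=95)),
    (t0, t0 + timedelta(seconds=300)),
]

# One search_to_watch_time value per sequence, in seconds.
durations = [(watch - search).total_seconds() for search, watch in tagged]

# A few summary statistics of the resulting distribution.
summary = {"min": min(durations), "median": median(durations), "max": max(durations)}
```

Summary statistics like these condense the per-sequence values into a description of their distribution.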

Now you know how to write basic SOL queries and use core Motif data visualizations.

Next steps