Demo Datasets

We provide several example datasets to practice with and explore in Motif. To load them, go to the "Example Datasets" section of the dataset selector in the top left of the screen.

Motiflix

Motiflix is a simulated dataset that helps illustrate typical uses of Motif in a business-like setting. It simulates user behaviors on a video streaming platform, including browsing and searching for movies, bookmarking favorites, and watching trailers and movies.

Suggested exploration questions:

  • What events happen immediately before a user watches a movie?
  • Do more people start watching movies from a search result or from their favorites page?
  • Is the number of movie results returned by a search correlated with whether users end up watching a movie from the results list?
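
To make the first question concrete, here is a minimal sketch in plain Python, outside of Motif. It counts which event immediately precedes each movie watch in a set of per-user sequences. The event names (`watch_movie`, `open_favorites`, and so on) and the sample data are illustrative assumptions, not the actual Motiflix schema, and this is not how Motif itself evaluates such a query.

```python
from collections import Counter

# Hypothetical per-user event sequences, ordered by time.
# Event names are illustrative assumptions, not the real Motiflix schema.
sequences = {
    "user_1": ["browse", "search", "watch_trailer", "watch_movie"],
    "user_2": ["open_favorites", "watch_movie", "browse", "search", "watch_movie"],
    "user_3": ["search", "bookmark", "browse"],
}


def events_before(sequences, target):
    """Count which event immediately precedes each occurrence of `target`."""
    counts = Counter()
    for events in sequences.values():
        # Walk over adjacent (previous, current) pairs in each sequence.
        for prev, curr in zip(events, events[1:]):
            if curr == target:
                counts[prev] += 1
    return counts


preceding = events_before(sequences, "watch_movie")
print(preceding.most_common())
```

A watch event that opens a sequence has no predecessor and is therefore not counted. Depending on the question, you might instead record it under a placeholder such as "session start".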

ATUS (American Time Use Survey)

ATUS is a survey run by the US Bureau of Labor Statistics. It includes self-reported sequences of activities that respondents engaged in during one 24-hour period of their lives.

Suggested exploration questions:

  • What do Americans spend the most time on?
  • What do Americans do most often after waking up?
  • What types of activities do Americans with long commutes give up most?

Flights

This dataset contains flights from a carrier on-time performance dataset. Each event is a separate flight, including delayed and cancelled ones.

Suggested exploration questions:

  • How many flights does an airplane make per day on average?
  • Which airlines and airports have the longest average delays?
  • Are planes that get delayed on one flight able to catch up to their schedule on subsequent flights?
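
As a rough illustration of the last question, the sketch below groups flights by aircraft, orders each aircraft's flights by time, and checks how often a delayed flight is followed by a flight with a smaller delay. The tuple layout, the tail numbers, the 15-minute threshold, and the sample values are all assumptions made for the example. They do not reflect the dataset's actual fields, and this is plain Python rather than a Motif query.

```python
from collections import defaultdict

# Hypothetical flight events:
# (tail_number, scheduled_departure_hour, arrival_delay_minutes).
# Field layout and values are assumptions for illustration only.
flights = [
    ("N101", 6, 45), ("N101", 9, 20), ("N101", 13, 0),
    ("N202", 7, 30), ("N202", 11, 50),
    ("N303", 8, 0), ("N303", 12, 5),
]


def catch_up_rate(flights, min_delay=15):
    """Share of delayed flights whose next flight on the same plane has a smaller delay."""
    by_plane = defaultdict(list)
    for tail, departure, delay in flights:
        by_plane[tail].append((departure, delay))

    delayed = recovered = 0
    for legs in by_plane.values():
        legs.sort()  # order each plane's flights by scheduled departure
        for (_, first_delay), (_, next_delay) in zip(legs, legs[1:]):
            if first_delay >= min_delay:
                delayed += 1
                if next_delay < first_delay:
                    recovered += 1
    # Return None when no flight met the delay threshold.
    return recovered / delayed if delayed else None


rate = catch_up_rate(flights)
print(rate)
```

Comparing consecutive delays is only one possible definition of catching up. An alternative would be to require that the next flight arrives within some tolerance of its schedule.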

GitHub Issues

Events from the lifecycle of GitHub issues: opening, assigning, cross-referencing, commenting, closing, etc. The data was pulled from public GitHub repositories using the GitHub API.

NFL

NFL play-by-play data for the 2009-2019 NFL seasons, organized into drive/possession sequences.

Suggested exploration questions:

  • How often do teams run the ball a third time after two runs that don't result in a first down?
  • What is the most common play after a long pass of 20+ yards?
  • What is the most successful three-play sequence for getting a fresh set of downs (reaching a new first down)?

Wikispeedia

Wikispeedia is a game in which players get from one Wikipedia article to another exclusively by following links in the articles they encounter. The dataset includes successful and unsuccessful play paths, with each event corresponding to a visited article.

GA4 (Google Analytics Sample Data)

The ga4_obfuscated_sample_commerce dataset is a sample export from Google Analytics. The dataset was extracted following the instructions for the sample dataset in the BigQuery UI and exported as a JSON file. Its schema is described in Google's documentation.