More SOL concepts
Operation chaining
SOL supports operation chaining, allowing you to construct arbitrarily long queries.
match search_page >> * >> watch_start
set search_to_watch_time = watch_start.ts - search_page.ts
match split ...
filter ...
replace ...
combine ...
...
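As a rough analogy only (this is Python, not SOL, and says nothing about how SOL is implemented), chaining can be pictured as each operation receiving the sequences produced by the previous one. The `chain` helper and the two toy operations below are hypothetical names introduced purely for illustration.

```python
from functools import reduce
from typing import Callable, Dict, List

Event = Dict[str, object]
EventSeq = List[Event]
Operation = Callable[[List[EventSeq]], List[EventSeq]]


def chain(*operations: Operation) -> Operation:
    """Compose operations left to right into a single operation."""
    def run(sequences: List[EventSeq]) -> List[EventSeq]:
        return reduce(lambda seqs, op: op(seqs), operations, sequences)
    return run


def keep_with_event(name: str) -> Operation:
    """Toy filter: keep sequences containing at least one event called `name`."""
    return lambda seqs: [s for s in seqs if any(e["name"] == name for e in s)]


def drop_events(name: str) -> Operation:
    """Toy rewrite: remove every event called `name` from each sequence."""
    return lambda seqs: [[e for e in s if e["name"] != name] for s in seqs]


# Two steps chained into one "query"; more steps can be appended freely.
query = chain(keep_with_event("watch_start"), drop_events("ad_impression"))

sequences = [
    [{"name": "search_page"}, {"name": "ad_impression"}, {"name": "watch_start"}],
    [{"name": "search_page"}],
]
result = query(sequences)
```

Because every step has the same input and output shape, the chain can grow to any length without changing how earlier steps are written.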
Tag lifecycle
Tags are added to and removed from sequences according to the following lifecycle:
- The match operation removes existing tags and creates new ones
- The replace operation can create new tags and move or remove existing ones
- The combine operation removes all existing tags
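The lifecycle rules can be mimicked with a small Python model. This is only a sketch under assumptions: tags are represented as a dictionary from tag name to event positions, and the function names (`after_match`, `after_replace`, `after_combine`) are hypothetical, not part of SOL.

```python
from typing import Dict, List

Tags = Dict[str, List[int]]  # tag name -> positions of the tagged events


def after_match(existing: Tags, new_tags: Tags) -> Tags:
    """match: existing tags are discarded and the new pattern's tags take over."""
    return dict(new_tags)


def after_replace(existing: Tags, created: Tags, moved: Tags, removed: List[str]) -> Tags:
    """replace: may create new tags and move or remove existing ones."""
    tags = {name: pos for name, pos in existing.items() if name not in removed}
    tags.update(moved)    # existing tags pointing at new positions
    tags.update(created)  # brand-new tags
    return tags


def after_combine(existing: Tags) -> Tags:
    """combine: all existing tags are removed."""
    return {}


tags = after_match({"old_tag": [0]}, {"search_page": [1], "watch_start": [4]})
tags_after_replace = after_replace(
    tags,
    created={"summary": [1]},
    moved={"watch_start": [2]},
    removed=["search_page"],
)
tags_after_combine = after_combine(tags_after_replace)
```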
In addition to tags created explicitly by the match and replace operations, SOL automatically creates and provides access to several implicit tags:
- start and end are special tags that exist immediately before and after the sequence, respectively
- SEQ refers to the whole sequence and can be used even without a prior match operation
- PREFIX refers to a sub-sequence of events before the first event matched by the match operation
- SUFFIX refers to a sub-sequence of events after the last event matched by the match operation
- MATCHED refers to a sub-sequence of all matched events
match search_click >> * >> watch_start
replace PREFIX with null
replace SUFFIX with null
set matched_events_num = length(SEQ)
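To make the implicit tags concrete, here is a minimal Python sketch of how SEQ, PREFIX, SUFFIX and MATCHED relate to one another. It assumes the positions of the matched events are already known and that events matched by the `*` wildcard count as matched; both are assumptions for illustration, and `implicit_tags` is a hypothetical helper, not a SOL function.

```python
from typing import Dict, List


def implicit_tags(events: List[str], matched_positions: List[int]) -> Dict[str, List[str]]:
    """Split a sequence into SEQ, PREFIX, SUFFIX and MATCHED views.

    `matched_positions` holds the indices of events matched by a
    (hypothetical) prior match step and must be non-empty.
    """
    first, last = min(matched_positions), max(matched_positions)
    return {
        "SEQ": list(events),                    # the whole sequence
        "PREFIX": events[:first],               # events before the first match
        "SUFFIX": events[last + 1:],            # events after the last match
        "MATCHED": [events[i] for i in sorted(matched_positions)],
    }


events = ["app_open", "search_click", "scroll", "watch_start", "app_close"]
# Assumed match result for a pattern like search_click >> * >> watch_start.
tags = implicit_tags(events, [1, 2, 3])

# Dropping PREFIX and SUFFIX leaves only the span covered by the match.
remaining_length = len(tags["SEQ"]) - len(tags["PREFIX"]) - len(tags["SUFFIX"])
```

Under these assumptions `remaining_length` plays the role of `matched_events_num` in the query above: once the prefix and suffix are replaced with null, the length of the sequence is the length of the matched span.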
Query execution
SOL query execution follows the “show must go on” principle. Instead of throwing runtime errors and interrupting your analysis, when SOL encounters an expression it cannot compute, it casts the result as null. SOL then records these cases as “unexpected nulls”, which are surfaced to the user for monitoring and debugging. There are four possible types of unexpected nulls:
- referencing a tag or dimension name that does not exist
- trying to perform an operation on variables of incompatible types
- trying to perform an operation on tagged event arrays of different sizes
- trying to reference a tag index that does not exist
This design choice is made because event data is semi-structured, changes over time, and is often unpredictable. Almost any query will encounter issues on some percentage of events, yet small errors are usually inconsequential and don’t affect the big event patterns that users care about. For example:
- The query wants to grab an event dimension value, but that dimension doesn’t exist on the event
- A tag is being referenced, but it doesn’t exist on that sequence
- One number is divided by another, which happens to be zero
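The same idea can be sketched in Python: evaluate each expression defensively, return a null on failure, and keep a tally of what went wrong. This only illustrates the principle and is not SOL's actual mechanism; `safe_eval`, the `unexpected_nulls` counter, and the event fields are hypothetical.

```python
from collections import Counter
from typing import Any, Callable, Optional

# Tally of failures, keyed by expression label and failure kind.
unexpected_nulls: Counter = Counter()


def safe_eval(label: str, compute: Callable[[], Any]) -> Optional[Any]:
    """Run `compute`; on failure return None and count the failure under `label`."""
    try:
        return compute()
    except (KeyError, IndexError, TypeError, ZeroDivisionError) as exc:
        unexpected_nulls[f"{label}: {type(exc).__name__}"] += 1
        return None


events = [
    {"name": "watch_start", "duration": 120, "views": 4},
    {"name": "watch_start", "views": 3},                     # dimension missing
    {"name": "watch_start", "duration": "n/a", "views": 2},  # incompatible types
    {"name": "watch_start", "duration": 60, "views": 0},     # division by zero
]

# One bad event yields a None for that event instead of aborting the whole run.
per_view = [
    safe_eval("duration_per_view", lambda e=e: e["duration"] / e["views"])
    for e in events
]
```

After the run, `per_view` holds a value for the first event and `None` for the other three, while `unexpected_nulls` shows how many failures of each kind occurred, which is the kind of summary a user could monitor.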