Replacing

Sequence Operations Language (SOL) borrows another important concept from regular expressions: replacing event patterns in sequences, which is done with the replace operation.

replace is a data-wrangling operation: it transforms and coarsens event data so that the sequences fit the specific business question being asked.

The simplest replace operation is removing unwanted events by replacing them with null:

match split UnwantedEvent(event1 | event2)
// Replace matched events with null to remove
replace UnwantedEvent with null
// Combine sub-sequences into original sequences
combine
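
As a rough mental model of what the removal above does to a single sequence, the following Python sketch may help. It is not SOL and not how SOL is implemented: the list-of-dicts representation and the names `remove_events`, `sequence`, and `cleaned` are assumptions made only for illustration.

```python
from typing import Dict, List, Set

def remove_events(sequence: List[Dict], unwanted: Set[str]) -> List[Dict]:
    """Return a copy of the sequence without events whose name is unwanted.

    Hypothetical helper: models "replace ... with null" as dropping events.
    """
    return [event for event in sequence if event["name"] not in unwanted]

# A toy sequence; event names other than event1/event2 are made up.
sequence = [
    {"name": "login"},
    {"name": "event1"},
    {"name": "purchase"},
    {"name": "event2"},
]

cleaned = remove_events(sequence, {"event1", "event2"})
print([event["name"] for event in cleaned])
```

The remaining events keep their original order, which matches the intent of combining the sub-sequences back into the original sequence.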

In general, replace takes three parts. The first is a previously defined tag, passed as the argument. The second is a with clause, which specifies the new sub-sequence to substitute for the events labelled with that tag. The third is an optional dims clause, which defines how event dimensions are passed to the new events.

match A(a) >> B(^a)*
if duration(A, B) < 1d

// Replace the whole sequence with just the event tagged "A" followed by the events tagged "B"
replace SEQ with A >> B

// Insert a new event "churn" after events tagged "B"
replace B with B >> (churn)

// Combine all events in tag "B" into one event called "non_a" and labelled with tag "C"
replace B with C(non_a) dims
C.event_num = length(B),
C.duration = B[-1].ts - B[0].ts

// Duplicate event tagged "A" and label the 2nd copy with tag "C"
replace A with A >> C(@A)
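
The dims example above (collapsing all events tagged "B" into one event with `event_num` and `duration`) can be pictured with a small Python sketch. This only models the arithmetic, under the assumption that events are dicts with a `ts` timestamp; `collapse_tagged`, `b_events`, and `summary` are hypothetical names, not part of SOL.

```python
from datetime import datetime
from typing import Dict, List

def collapse_tagged(tagged: List[Dict], new_name: str) -> List[Dict]:
    """Collapse a run of tagged events into one summary event.

    Mirrors the two dims from the SOL example:
    event_num = length(B), duration = B[-1].ts - B[0].ts.
    """
    if not tagged:
        # Nothing was tagged, so there is nothing to substitute.
        return []
    return [{
        "name": new_name,
        "event_num": len(tagged),
        "duration": tagged[-1]["ts"] - tagged[0]["ts"],
    }]

# Toy events standing in for the ones tagged "B"; the timestamps are made up.
b_events = [
    {"name": "b", "ts": datetime(2024, 1, 1, 10, 0)},
    {"name": "c", "ts": datetime(2024, 1, 1, 10, 5)},
    {"name": "d", "ts": datetime(2024, 1, 1, 10, 20)},
]

summary = collapse_tagged(b_events, "non_a")
print(summary[0]["event_num"], summary[0]["duration"])
```

The sketch deliberately leaves out any other dimensions of the new event, since the source example only defines these two.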

The substitute sub-sequence in the with clause mostly follows the same syntax as match patterns in the match operation with a few exceptions:

  • it can’t use event quantifiers
  • it can’t use event include lists or exclude lists
  • it can reference an existing tag either as A (inserts all events tagged A and keeps the tag) or as (@A) (inserts the tagged events only, without the tag)

If the tag to be replaced does not exist in a given sequence, replace leaves that sequence unchanged.
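
This no-op behaviour can be illustrated with a Python sketch that models `replace B with B >> (churn)`. It assumes a sequence is a list of (event name, tag) pairs; that representation and the helper `append_after_tag` are hypothetical and only approximate what SOL does.

```python
from typing import List, Optional, Tuple

TaggedEvent = Tuple[str, Optional[str]]  # (event name, tag or None)

def append_after_tag(
    sequence: List[TaggedEvent], tag: str, new_event: str
) -> List[TaggedEvent]:
    """Insert an untagged new event right after the last event carrying `tag`.

    If no event carries the tag, an unchanged copy of the sequence is returned.
    """
    positions = [i for i, (_, t) in enumerate(sequence) if t == tag]
    if not positions:
        return list(sequence)
    last = positions[-1]
    return sequence[: last + 1] + [(new_event, None)] + sequence[last + 1 :]

with_tag = [("a", "A"), ("b", "B"), ("c", "B")]
without_tag = [("x", None), ("y", None)]

replaced = append_after_tag(with_tag, "B", "churn")
untouched = append_after_tag(without_tag, "B", "churn")
print(replaced)
print(untouched)
```

In the first call the new event lands after the last "B" event; in the second the sequence has no "B" tag, so it comes back as it was.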

You can find more common examples for using replace in the SOL recipes.