Report query language

The report query language (RQL) is an embedded language in CSL that allows defining relations on data from past events. It is possible to use relations in the expression language to compute aggregate information based on the data in the relations.

Overview

RQL is a variant of Datalog which is a declarative logic-based programming language. RQL introduces two new constructions:

relation definitions for describing relations between data of past events; and
for expressions for computing aggregate information from past events.

A relation definition merely declares a relationship between data - by itself it does not “do” anything. This decoupling of the logical specification from the actual query execution allows the CSL runtime system to apply various optimization techniques, such as incremental view maintenance using standard optimized Datalog evaluation algorithms. The for expressions are used to query the relations and apply a custom aggregation function to the result tuples. Relationships are expressed by merely stating them, and the computation of the final query result must take care of finding a set of tuples that satisfies the constraints set out in the definitions of the relations through a process called unification. Informally, each tuple in the result set will satisfy the property that if it is substituted into the relation being queried then it will be satisfied. Moreover, the result set will contain all tuples with this property that can be deduced from the system in its current state given the events seen so far.

Relations

A relation specifies a relation between values such that they satisfy one or more of the clauses given in the definition. These relations can be thought of as sets of n-ary tuples, and can be conveniently visualized as a table in this text.

To define a relation, we use the top-level definition relation:

relation relationName(<params>) <clauses>

Here, <params> (the “head”) and <clauses> (the “body”) stand for a list of parameters x_0, ..., x_n and a list of clauses clause_0, ..., clause_m, respectively. The name relationName is the name of a relation between values of the parameters such that at least one of the clauses clause_0 - clause_m can be satisfied. A clause is satisfied if there exists a value substitution of all parameters that occur in the clause such that the clause evaluates to True.

Parameters are formed using the normal syntax for constants and variables we use elsewhere in the language. In particular, type annotations may occur anywhere to either aid in documentation or to help the type checker, or both. The parameters in the head are special insofar that only variables, possibly type-annotated, may occur there. Relations themselves are also typed: the inferred type of a relation is a tuple consisting of each of the types of the parameters in the relation’s head.

There are three basic clause constructs and one combinator for forming bigger clauses from smaller ones:

equality clauses: x = y, stipulating the equivalence of two parameters. This clause is satisfied when the values being substituted for x and y are equal;
event-matching clauses: Event { field0 = param0 , ..., fieldN = paramN } @ cid, denoting that an event of the given type and with the given fields has been applied on a contract. This clause is satisfied when there exists a value substitution of param0 to paramN such that an event with the corresponding fields set to those values has happened on contract with contract instance cid;
relation-invocation clauses: relationName(param0, ..., paramN), which is satisfied only if the specified relation relationName exists and can be satisfied with the given parameters; and
conjunctive clauses: the conjunction and is used to form more advanced clauses from basic clauses: clause0 and ... and clauseN..

The parameters for a relation must be a subset of the variables that occur in the rule-body. For example, the following relation definition is not allowed, as none of a, b, or c occur in the clause:

relation gibberish(a, b, c)
  | Sale {} @ cid

Equality clauses

To stipulate that two terms must be equal, we use the equality clause: x = y. Either side of the equality symbol may be a constant.

Example

relation theNumberTwo(x)
  | x = 2

The relation theNumberTwo is unary, so it just consists of a set of values. This relation is satisfied for those values that can be substituted for x and make one of the clauses satisfied. In this example, there is only one clause, x = 2, stipulating that x must be equal to 2 for it to be satisfied. Clearly, there is exactly one satisfying assignment for x here, namely the value 2. As it is the only clause in the relation, it contains only the value 2.

Example

We can expand the simple relation from the previous example with some more clauses to make the unary relation bigger:

relation fourFirstPrimes(x)
  | x = 2
  | x = 3
  | x = 5
  | x = 7

This relation contains the values 2, 3, 5, and 7.

Example

In the previous two examples, constants were involved in all the equality clauses. We can also use parameters on both sides of the equality sign:

relation same(x : Int, y : Int)
  | x = y

The binary relation denoted by same contains all Int tuples (x, y) such that x is equal to y. This relation is of course exactly that denoted by the equality clause itself, so the rule same is superfluous. However, in more advanced settings, where there are multiple constraints on the parameters, requiring two parameters to be equivalent becomes useful.

Note

Note that we have used type annotations in the relation definition above. Relations must be well-typed and the contract writer may use type annotations to guide the type inferrer.

Event-matching clauses

A clause may contain a condition on past events of contracts:

Sale { saleId = x } @ someId

The clause above expresses a condition that is true if the contract with id equal to that assigned to the variable someId was matched with an event of type Sale where the field saleId was assigned a value equal to the value of the variable x.

Example

type Item | Book | Movie | Coffee

type Sale : Event {
    saleId : Int,
    price: Int,
    item: Item
}

relation saleIds(x, y)
    | Sale { saleId = x }@y

The relation saleIds contains pairs of values (x, y) of the type Tuple Int ContractInstance for all events of type Sale applied to contract with id y and where the value of saleId on the event was x.

Suppose the following events have been applied: (reusing the notation ev @ c to denote that event ev was applied to contract with contract-id c)

...
Sale { saleId = 1, price = 10, item = Book } @ c1
Sale { saleId = 2, price = 42, item = Coffee } @ c1
Sale { saleId = 1, price = 10, item = Book } @ c2
Sale { saleId = 2, price = 10, item = Book } @ c2
Sale { saleId = 3, price = 117, item = Movie } @ c1
...

The values related by the saleIds relation can be visualized in the following table:

`x`	`y`
`1`	`c1`
`2`	`c1`
`1`	`c2`
`2`	`c2`
`3`	`c1`

Example

A slightly extended relation that includes more information about the event-data is given below:

relation sales(saleId, price, item, cid)
  | Sale { saleId = saleId, price = price, item = item }@cid

In the context of the events given above, the following table visualizes the data as related by sales.

`saleId`	`price`	`item`	`cid`
`1`	`10`	`Book`	`c1`
`2`	`42`	`Coffee`	`c1`
`1`	`10`	`Book`	`c2`
`2`	`10`	`Book`	`c2`
`3`	`117`	`Movie`	`c1`

Relation-invocation clauses

A clause may consist of an invocation of a relation with a list of parameters: relationName(param0, ..., paramN). These parameters can be either constants or variables themselves. If a relation is invoked with only constants it corresponds to a check of whether the tuple containing those constants is in the relation. Invoking a relation with n separate variables as parameters denotes the same relation as the rule itself, but restrictions can be placed by giving the same variable multiple times.

Example

The relation lessThanFive denotes the set containing the integers 0 to 4:

relation lessThanFive(x) | x = 0 | x = 1 | x = 2 | x = 3 | x = 4

We can invoke a relation as follows, essentially creating an alias:

relation alsoLessThanFive(x) | lessThanFive(x)

This relation definition expresses that whatever integer is in the unary relation lessThanFive is also in the unary relation alsoLessThanFive.

Conjunctive clauses

A clause in a relation may contain multiple conditions using the conjunctive combinator and. The format of a conjunctive clause is:

clause0 and clause1 and ... and clauseN

where each of clause0 to clauseN are basic (non-conjunctive) clauses. Combining two or more clauses with and yields a new clauses that is satisfied when all of the constituent subclauses are satisfied.

Example

relation bookSales(s, p, cid)
  | Sale { saleId = s, price = p, item = i} @ cid and i = Book

The satisfying assignments of variables for the bookSales relation in the context of the previously shown events are:

`s`	`p`	`cid`
`1`	`10`	`c1`
`1`	`10`	`c2`
`2`	`10`	`c2`

Example

In this example, we assume that an event Payment can also be applied to the contracts to which Sale can be applied. A relation that collects information from all Payment-events is given by the payments rule.

type Payment: Event {
  receiver: Agent,
  amount: Int,
  id: Int
}

relation payments(r, a, i, cid)
  | Payment { receiver = r, amount = a, id = i } @ cid

Assume that the following events are added.

Payment { receiver = a1, amount = 10, id = 1} @ c1
Payment { receiver = a2, amount = 10, id = 1} @ c2
Payment { receiver = a3, amount = 117, id = 3} @ c1

The relation payments is visualized in the following table:

`r`	`a`	`i`	`cid`
`a1`	`10`	`1`	`c1`
`a2`	`10`	`1`	`c2`
`a3`	`117`	`3`	`c1`

Joining relations

By combining relation invocations with conjunctions we can formulate joins. Joining relations is central to the usage of RQL, so it does not have any dedicated syntax. By repeating variable names in multiple places in a clause we introduce constrains corresponding to joins: because the system finds satisfying assignments of all variables occurring in the clause, it is enough to repeat a variable multiple places to express that the same value must be found in that subpart.

Example

For example, the relations sales and payments can be combined into a joined relation that collects information for sales that have received payment:

relation paidSales(id, amount, item, cid)
  | sales(id, amount, item, cid) and payments(r, amount, id, cid)

The occurrence of the same variable in sales and payments ensures that only the assignments to variables that are part of both relations are included as part of the paidSales relation.

`id`	`amount`	`item`	`cid`
`1`	`10`	`Book`	`c1`
`1`	`10`	`Book`	`c2`
`3`	`117`	`Movie`	`c1`

Example

One can further constrain the paidSales relation and specialize it into a paidBookSales rule:

relation paidBookSales(id, amount, cid)
  | sales(id, amount, x, cid)
    and payments(r, amount, id, cid)
    and x = Book

In this case, we can inline the equality-condition on x = Book and get:

relation paidBookSales2(id, amount, cid)
  | sales(id, amount, Book, cid) and payments(r, amount, id, cid)

`id`	`amount`	`cid`
`1`	`10`	`c1`
`1`	`10`	`c2`

Multiple clause-bodies

A relation may be defined by the union (disjunction) of multiple clauses:

relation rName(params)
  | clause_1
  | clause_2
  | ...
  | clause_n

An assignment of values to parameters is satisfying if at least one of clause_1 … clause_2 can be evaluated to True.

Example

A relation that includes sales of either Coffee or Movie can be given as:

relation bookOrMovieSales(id, price, cid)
  | sales(id, price, Coffee, cid)
  | sales(id, price, Movie, cid)

The visualized version is below:

`id`	`price`	`cid`
`2`	`42`	`c1`
`3`	`117`	`c1`

Example

The clauses in disjunctions may be completely different:

relation saleOrPayment(id, cid)
  | sales(id, amount1, item, cid)
  | payments(r, amount2, id, cid)

Recursive relations

Relations can be recursive if the rec keyword is used. This allows us to express complex relationships in a succinct manner.

A recursive relation is declared in the same way as a normal one, except that we add the keyword rec:

relation rec relationName(<params>) | <clauses>

Clauses in the list <clauses> may now include relation invocations of the relation relationName itself. Relations may also be mutually recursive: this is achieved with the combination of rec and with also used for mutually recursive templates:

relation rec relation0(<params0>) | <clauses0>
with relation1(<params1>) | <clauses1>
...
with relationN(<paramsN>) | <clausesN>

The relations relation0 to relationN may freely refer to one another recursively through relation invocations.

Example

The following CSL code demonstrates how to express a transitive closure of another relation. Customers are related via their id strings through CustomerEvent events. In line 6 the relation directlyRelated is defined, containing pairs of customer ids such that an event has happened (on any contract instance, since cid is free) that relates the two. Next, line 9 defines the relation relatedCustomers as the transitive closure of directlyRelated. Customer id pairs belong to this relation if either 1) they are directly related, as per the directlyRelated relation, or 2) there exists some third customer id x that ancestor is related to and which itself is directly related to descendant. Hence, customer id pairs in the relatedCustomers may be either directly related or indirectly related through a chain of ids.

type CustomerEvent : Event {
  customerId: String,
  referrerId: String
}

relation directlyRelated(parent, child, cid)
  | CustomerEvent { customerId = parent, referrerId = child } @ cid

relation rec relatedCustomers(ancestor, descendant, cid)
  | directlyRelated(ancestor, descendant, cid)
  | relatedCustomers(ancestor, x, cid)
    and directlyRelated(x, descendant, cid)

Let us assume that the following events have occurred:

CustomerEvent { customerId = 2, referrerId = 1 } @ c1
CustomerEvent { customerId = 3, referrerId = 1 } @ c1
CustomerEvent { customerId = 4, referrerId = 3 } @ c1
CustomerEvent { customerId = 5, referrerId = 3 } @ c1
CustomerEvent { customerId = 6, referrerId = 2 } @ c1

The relation directlyRelated now contains:

`parent`	`child`	`cid`
`1`	`2`	`c1`
`1`	`3`	`c1`
`3`	`4`	`c1`
`3`	`5`	`c1`
`2`	`6`	`c1`

If we draw this relation as a tree we get something akin to a “referral hierarchy”:

The relation relatedCustomers expresses that a customer id occurs below another one in this tree:

`ancestor`	`descendant`	`cid`
`1`	`2`	`c1`
`1`	`6`	`c1`
`1`	`3`	`c1`
`1`	`4`	`c1`
`1`	`5`	`c1`
`2`	`6`	`c1`
`3`	`4`	`c1`
`3`	`5`	`c1`

If we want to take the reflexive transitive closure, i.e., express that a customer id is always related to one self, we simply add an extra clause:

relation relatedCustomersRefl(ancestor, descendant, cid)
  | CustomerEvent { customerId = ancestor } @ cid
    and ancestor = descendant
  | relatedCustomers(ancestor, descendant, cid)

`for` expressions

Relation definitions do not by themselves compute the tuples that are members of the relation, they merely state the constraints such tuples must satisfy. In order to get actual values that can be used in the expression language as part of reports we must execute a query over a relation. For this purpose, the expression language contains the for expression, which performs a computation over the result tuples of a query over a relation:

for pattern in relationName do reducer

A for expression provides a way to perform a computation over all the satisfying assignments of values to parametes for the given relation. It contains three parts:

the pattern is a pattern that is used to assign names to the values;
the relationName is some previously-defined relation; and
the reducer is an expression of the type t -> t, where t is the type of the result value that is being built by the query.

Any names bound in the pattern will be in scope in the reducer, and the reducer expression is called once for each value in the result set, carrying along the partially built result of type t and returning a “more complete” result of the same type. The type of the entire for expression is itself the type of the given reducer, i.e., t -> t. Thus, once we have built an expression with the for construct, we need to supply it some initial state in order to compute the complete result. An example of a an initial value would be 0 if the state type t is an Int and the reducer computes a sum.

Example

The following simple example demonstrates how the for expression is used to perform a query computing a sum based on past events. We first define a sum type BookOrMovie that we use as an example of data which can be used to discern between events of the type BookOrMovieEvent through the style field. Next we define the relation bookStyleEventTimeStamps between time stamps and contract instances - i.e., a set of tuples of the type Tuple Instant ContractInstance. Each tuple in the set has the property that the timestamp is from a BookOrMovieEvent where the style field is set to the value Book. This is ensured by the constraint put on the style field in line 7.

The relation is used in a query in lines 11 and 12, where we deconstruct each tuple in the result set with the pattern (_, c). This pattern ignores the actual timestamp values by using the _ wildcard pattern and it binds the contract instances to the name c. Our report countFrom takes a contract instance cid and yields a function that, given an initial integer value, returns the sum of the initial value and the number of Book-style BookOrMovieEvent’s that have happened on the contract cid. The report sum uses countFrom and sets the initial value to 0.

type BookOrMovie | Book | Movie
type BookOrMovieEvent : Event {
  style : BookOrMovie
}

relation bookStyleEventTimeStamps(t, cid)
  | BookOrMovieEvent { timestamp = t, style = Book }@cid

/// countFrom : ContractInstance -> Int -> Int
val countFrom = \cid ->
  for (_, c) in bookStyleEventTimeStamps do
  if (c = cid) \x -> 1 + x else \x -> x

/// sum : ContractInstance -> Int
val sum = \cid -> countFrom cid 0

Caveat

The use of for-expressions isn’t supported in where-clauses of contracts. Rules and for-expressions are supported only when running reports (functions) externally.

Example

A consequence of disallowing use of for-expressions is that the following definition is not allowed:

type BookOrMovie | Book | Movie
type BookOrMovieEvent : Event {
  style : BookOrMovie
}

relation bookStyleEventTimeStamps(t, cid)
  | BookOrMovieEvent { timestamp = t, style = Book }@cid

/// countFrom : Int -> Int
val countFrom =
  for (_, _) in bookStyleEventTimeStamps do
    \x -> 1 + x

/// sum : ContractInstance -> Int
val sum = countFrom 0

Report query language

Overview

Relations

Equality clauses

Event-matching clauses

Relation-invocation clauses

Conjunctive clauses

Joining relations

Multiple clause-bodies

Recursive relations

for expressions

Caveat

`for` expressions