# Report query language¶

The *report query language* (RQL) is an embedded language in CSL that allows defining relations on data from past events.
It is possible to use relations in the expression language to compute aggregate information based on the data in the relations.

## Overview¶

RQL is a variant of Datalog which is a declarative logic-based programming language. RQL introduces two new constructions:

*relation definitions*for describing relations between data of past events; and*for expressions*for computing aggregate information from past events.

A relation definition merely declares a relationship between data - by itself it does not “do” anything.
This decoupling of the logical specification from the actual query execution allows the CSL runtime system to apply various optimization techniques.
The `for`

expressions are used to query the relations and apply a custom aggregation function to the result tuples.
Relationships are expressed by merely stating them, and the computation of the final query result must take care of finding a set of tuples that satisfies the constraints set out in the definitions of the relations through a process called unification.
Informally, each tuple in the result set will satisfy the property that if it is substituted into the relation being queried then it will be satisfied.
Moreover, the result set will contain all tuples with this property that can be deduced from the system in its current state given the events seen so far.

## Relations¶

A `relation`

specifies a relation between values such that they satisfy one or more of the clauses given in the definition.
These relations can be thought of as sets of n-ary tuples, and can be conveniently visualized as a table in this text.

To define a relation, we use the top-level definition `relation`

:

```
relation relationName(<params>) <clauses>
```

Here, `<params>`

(the “head”) and `<clauses>`

(the “body”) stand for a list of parameters `x_0, ..., x_n`

and a list of clauses `clause_0, ..., clause_m`

, respectively.
The name `relationName`

is the name of a relation between values of the parameters such that at least one of the clauses `clause_0`

- `clause_m`

can be *satisfied*.
A clause is satisfied of there exists a value substitution of all parameters that occur in the clause such that the clause evaluates to `True`

.

Parameters are formed using the normal syntax for constants and variables we use elsewhere in the language. In particular, type annotations may occur anywhere to either aid in documentation or to help the type checker, or both. The parameters in the head are special insofar that only variables, possibly type-annotated, may occur there. Relations themselves are also typed: the inferred type of a relation is a tuple consisting of each of the types of the parameters in the relation’s head.

There are three basic clause constructs and one combinator for forming bigger clauses from smaller ones:

- equality clauses
`x = y`

, stipulating the equivalence of two parameters. This clause is satisfied when the values being substituted for`x`

and`y`

are equal;- event-matching clauses
`Event { field0 = param0 , ..., fieldN = paramN } @ cid`

, denoting that an event of the given type and with the given fields has been applied on a contract. This clause is satisfied when there exists a value substitution of`param0`

to`paramN`

such that an event with the corresponding fields set to those values has happened on contract with contract id`cid`

;- relation-invocation clauses
`relationName(param0, ..., paramN)`

, which is satisfied only if the specified relation`relationName`

exists and can be satisfied with the given parameters; and- conjunctive clauses
- the conjunction
`and`

is used to form more advanced clauses from basic clauses:`clause0 and ... and clauseN`

..

The parameters for a relation must be a *subset* of the variables that occur in the rule-body.
For example, the following relation definition is not allowed, as none of `a`

, `b`

, or `c`

occur in the clause:

```
relation gibberish(a, b, c)
| Sale {} @ cid
```

### Equality clauses¶

To stipulate that two terms must be equal, we use the equality clause: `x = y`

.
Either side of the equality symbol may be a constant.

Example

```
relation theNumberTwo(x)
| x = 2
```

The relation `theNumberTwo`

is unary, so it just consists of a set of values.
This relation is satisfied for those values that can be substituted for `x`

and make one of the clauses satisfied.
In this example, there is only one clause, `x = 2`

, stipulating that `x`

must be equal to `2`

for it to be satisfied.
Clearly, there is exactly one satisfying assignment for `x`

here, namely the value `2`

.
As it is the only clause in the relation, it contains only the value `2`

.

Example

We can expand the simple relation from the previous example with some more clauses to make the unary relation bigger:

```
relation fourFirstPrimes(x)
| x = 2
| x = 3
| x = 5
| x = 7
```

This relation contains the values `2`

, `3`

, `5`

, and `7`

.

Example

In the previous two examples, constants were involved in all the equality clauses. We can also use parameters on both sides of the equality sign:

```
relation same(x : Int, y : Int)
| x = y
```

The binary relation denoted by `same`

contains all `Int`

tuples `(x, y)`

such that `x`

is equal to `y`

.
This relation is of course exactly that denoted by the equality clause itself, so the rule `same`

is superfluous.
However, in more advanced settings, where there are multiple constraints on the parameters, requiring two parameters to be equivalent becomes useful.

Note

Note that we have used type annotations in the relation definition above. Relations must be well-typed and the contract writer may use type annotations to guide the type inferrer.

### Event-matching clauses¶

A clause may contain a condition on past events of contracts:

```
Sale { saleId = x } @ someId
```

The clause above expresses a condition that is true if the contract with id equal to that assigned to the variable `someId`

was matched with an event of type `Sale`

where the field `saleId`

was assigned a value equal to the value of the variable `x`

.

Example

```
type Item | Book | Movie | Coffee
type Sale : Event {
saleId : Int,
price: Int,
item: Item
}
relation saleIds(x, y)
| Sale { saleId = x }@y
```

The relation `saleIds`

contains pairs of values `(x, y)`

of the type `Tuple Int ContractId`

for all events of type `Sale`

applied to contract with id `y`

and where the value of `saleId`

on the event was `x`

.

Suppose the following events have been applied: (reusing the notation `ev @ c`

to denote that event `ev`

was applied to contract with contract-id `c`

)

```
...
Sale { saleId = 1, price = 10, item = Book } @ c1
Sale { saleId = 2, price = 42, item = Coffee } @ c1
Sale { saleId = 1, price = 10, item = Book } @ c2
Sale { saleId = 2, price = 10, item = Book } @ c2
Sale { saleId = 3, price = 117, item = Movie } @ c1
...
```

The values related by the `saleIds`

relation can be visualized in the following table:

`x` |
`y` |
---|---|

`1` |
`c1` |

`2` |
`c1` |

`1` |
`c2` |

`2` |
`c2` |

`3` |
`c1` |

Example

A slightly extended relation that includes more information about the event-data is given below:

```
relation sales(saleId, price, item, cid)
| Sale { saleId = saleId, price = price, item = item }@cid
```

In the context of the events given above, the following table visualizes the data as related by `sales`

.

`saleId` |
`price` |
`item` |
`cid` |
---|---|---|---|

`1` |
`10` |
`Book` |
`c1` |

`2` |
`42` |
`Coffee` |
`c1` |

`1` |
`10` |
`Book` |
`c2` |

`2` |
`10` |
`Book` |
`c2` |

`3` |
`117` |
`Movie` |
`c1` |

### Relation-invocation clauses¶

A clause may consist of an invocation of a relation with a list of parameters: `relationName(param0, ..., paramN)`

.
These parameters can be either constants or variables themselves.
If a relation is invoked with only constants it corresponds to a check of whether the tuple containing those constants is in the relation.
Invoking a relation with n separate variables as parameters denotes the same relation as the rule itself, but restrictions can be placed by giving the same variable multiple times.

Example

The relation `lessThanFive`

denotes the set containing the integers `0`

to `4`

:

```
relation lessThanFive(x) | x = 0 | x = 1 | x = 2 | x = 3 | x = 4
```

We can invoke a relation as follows, essentially creating an alias:

```
relation alsoLessThanFive(x) | lessThanFive(x)
```

This relation definition expresses that whatever integer is in the unary relation `lessThanFive`

is also in the unary relation `alsoLessThanFive`

.

### Conjunctive clauses¶

A clause in a relation may contain multiple conditions using the conjunctive combinator `and`

.
The format of a conjunctive clause is:

```
clause0 and clause1 and ... and clauseN
```

where each of `clause0`

to `clauseN`

are basic (non-conjunctive) clauses.
Combining two or more clauses with `and`

yields a new clauses that is satisfied when all of the constituent subclauses are satisfied.

Example

```
relation bookSales(s, p, cid)
| Sale { saleId = s, price = p, item = i} @ cid and i = Book
```

The satisfying assignments of variables for the `bookSales`

relation in the context of the previously shown events are:

`s` |
`p` |
`cid` |
---|---|---|

`1` |
`10` |
`c1` |

`1` |
`10` |
`c2` |

`2` |
`10` |
`c2` |

Example

In this example, we assume that an event `Payment`

can also be applied to the contracts to which `Sale`

can be applied.
A relation that collects information from all `Payment`

-events is given by the `payments`

rule.

```
type Payment: Event {
receiver: Agent,
amount: Int,
id: Int
}
relation payments(r, a, i, cid)
| Payment { receiver = r, amount = a, id = i } @ cid
```

Assume that the following events are added.

```
Payment { receiver = a1, amount = 10, id = 1} @ c1
Payment { receiver = a2, amount = 10, id = 1} @ c2
Payment { receiver = a3, amount = 117, id = 3} @ c1
```

The relation `payments`

is visualized in the following table:

`r` |
`a` |
`i` |
`cid` |
---|---|---|---|

`a1` |
`10` |
`1` |
`c1` |

`a2` |
`10` |
`1` |
`c2` |

`a3` |
`117` |
`3` |
`c1` |

### Joining relations¶

By combining relation invocations with conjunctions we can formulate *joins*.
Joining relations is central to the usage of RQL, so it does not have any dedicated syntax.
By repeating variable names in multiple places in a clause we introduce constrains corresponding to joins: because the system finds satisfying assignments of all variables occurring in the clause, it is enough to repeat a variable multiple places to express that the same value must be found in that subpart.

Example

For example, the relations `sales`

and `payments`

can be combined into a joined relation that collects information for sales that have received payment:

```
relation paidSales(id, amount, item, cid)
| sales(id, amount, item, cid) and payments(r, amount, id, cid)
```

The occurrence of the same variable in `sales`

and `payments`

ensures that only the assignments to variables that are part of *both* relations are included as part of the `paidSales`

relation.

`id` |
`amount` |
`item` |
`cid` |
---|---|---|---|

`1` |
`10` |
`Book` |
`c1` |

`1` |
`10` |
`Book` |
`c2` |

`3` |
`117` |
`Movie` |
`c1` |

Example

One can further constrain the `paidSales`

relation and specialize it into a `paidBookSales`

rule:

```
relation paidBookSales(id, amount, cid)
| sales(id, amount, x, cid)
and payments(r, amount, id, cid)
and x = Book
```

In this case, we can inline the equality-condition on `x = Book`

and get:

```
relation paidBookSales(id, amount, cid)
| sales(id, amount, Book, cid) and payments(r, amount, id, cid)
```

`id` |
`amount` |
`cid` |
---|---|---|

`1` |
`10` |
`c1` |

`1` |
`10` |
`c2` |

### Multiple clause-bodies¶

A relation may be defined by the union (disjunction) of multiple clauses:

```
relation rName(params)
| clause_1
| clause_2
| ...
| clause_n
```

An assignment of values to parameters is satisfying if at least one of `clause_1`

… `clause_2`

can be evaluated to `True`

.

Example

A relation that includes sales of either `Coffee`

or `Movie`

can be given as:

```
relation bookOrMovieSales(id, price, cid)
| sales(id, price, Coffee, cid)
| sales(id, price, Movie, cid)
```

The visualized version is below:

`id` |
`price` |
`cid` |
---|---|---|

`2` |
`42` |
`c1` |

`3` |
`117` |
`c1` |

Example

The clauses in disjunctions may be completely different:

```
relation saleOrPayment(id, cid)
| sales(id, amount1, item, cid)
| payments(r, amount2, id, cid)
```

### Recursive relations¶

Relations can be recursive if the `rec`

keyword is used.
This allows us to express complex relationships in a succinct manner.

A recursive relation is declared in the same way as a normal one, except that we add the keyword `rec`

:

```
relation rec relationName(<params>) | <clauses>
```

Clauses in the list `<clauses>`

may now include relation invocations of the relation `relationName`

itself.
Relations may also be mutually recursive: this is achieved with the combination of `rec`

and `with`

also used for mutually recursive templates:

```
relation rec relation0(<params0>) | <clauses0>
with relation1(<params1>) | <clauses1>
...
with relationN(<paramsN>) | <clausesN>
```

The relations `relation0`

to `relationN`

may freely refer to one another recursively through relation invocations.

Example

The following CSL code demonstrates how to express a transitive closure of another relation.
Customers are related via their id strings through `CustomerEvent`

events.
In line 6 the relation `directlyRelated`

is defined, containing pairs of customer ids such that an event has happened (on any contract id, since `cid`

is free) that relates the two.
Next, line 9 defines the relation `relatedCustomers`

as the transitive closure of `directlyRelated`

.
Customer id pairs belong to this relation if either 1) they are directly related, as per the `directlyRelated`

relation, or 2) there exists some third customer id `x`

that `ancestor`

is related to and which itself is directly related to `descendant`

.
Hence, customer id pairs in the `relatedCustomers`

may be either directly related or indirectly related through a chain of ids.

1 2 3 4 5 6 7 8 9 10 11 12 | ```
type CustomerEvent : Event {
customerId: String,
referrerId: String
}
relation directlyRelated(parent, child, cid)
| CustomerEvent { customerId = parent, referrerId = child } @ cid
relation rec relatedCustomers(ancestor, descendant, cid)
| directlyRelated(ancestor, descendant, cid)
| relatedCustomers(ancestor, x, cid)
and directlyRelated(x, descendant, cid)
``` |

Let us assume that the following events have occurred:

```
CustomerEvent { customerId = 2, referrerId = 1 } @ c1
CustomerEvent { customerId = 3, referrerId = 1 } @ c1
CustomerEvent { customerId = 4, referrerId = 3 } @ c1
CustomerEvent { customerId = 5, referrerId = 3 } @ c1
CustomerEvent { customerId = 6, referrerId = 2 } @ c1
```

The relation `directlyRelated`

now contains:

`parent` |
`child` |
`cid` |
---|---|---|

`1` |
`2` |
`c1` |

`1` |
`3` |
`c1` |

`3` |
`4` |
`c1` |

`3` |
`5` |
`c1` |

`2` |
`6` |
`c1` |

If we draw this relation as a tree we get something akin to a “referral hierarchy”:

```
1
/ \
2 3
/ / \
6 4 5
```

The relation `relatedCustomers`

expresses that a customer id occurs below another one in this tree:

`ancestor` |
`descendant` |
`cid` |
---|---|---|

`1` |
`2` |
`c1` |

`1` |
`6` |
`c1` |

`1` |
`3` |
`c1` |

`1` |
`4` |
`c1` |

`1` |
`5` |
`c1` |

`2` |
`6` |
`c1` |

`3` |
`4` |
`c1` |

`3` |
`5` |
`c1` |

If we want to take the reflexive transitive closure, i.e., express that a customer id is always related to one self, we simply add an extra clause:

```
relation relatedCustomersRefl(ancestor, descendant, cid)
| CustomerEvent { customerId = ancestor } @ cid
and ancestor = descendant
| relatedCustomers(ancestor, descendant, cid)
```

`for`

expressions¶

Relation definitions do not by themselves compute the tuples that are members of the relation, they merely state the constraints such tuples must satisfy.
In order to get actual values that can be used in the expression language as part of reports we must execute a query over a relation.
For this purpose, the expression language contains the `for`

expression, which performs a computation over the result tuples of a query over a relation:

```
for pattern in relationName do reducer
```

A `for`

expression provides a way to perform a computation over all the satisfying assignments of values to parametes for the given relation.
It contains three parts:

- the
`pattern`

is a pattern that is used to assign names to the values; - the
`relationName`

is some previously-defined relation; and - the
`reducer`

is an expression of the type`t -> t`

, where`t`

is the type of the result value that is being built by the query.

Any names bound in the pattern will be in scope in the `reducer`

, and the `reducer`

expression is called once for each value in the result set, carrying along the partially built result of type `t`

and returning a “more complete” result of the same type.
The type of the entire `for`

expression is itself the type of the given reducer, i.e., `t -> t`

.
Thus, once we have built an expression with the `for`

construct, we need to supply it some initial state in order to compute the complete result.
An example of a an initial value would be `0`

if the state type `t`

is an `Int`

and the reducer computes a sum.

Example

The following simple example demonstrates how the `for`

expression is used to perform a query computing a sum based on past events.
We first define a sum type `BookOrMovie`

that we use as an example of data which can be used to discern between events of the type `BookOrMovieEvent`

through the `style`

field.
Next we define the relation `bookStyleEventTimeStamps`

between time stamps and contract ids - i.e., a set of tuples of the type `Tuple DateTime ContractId`

.
Each tuple in the set has the property that the timestamp is from a `BookOrMovieEvent`

where the `style`

field is set to the value `Book`

.
This is ensured by the constraint put on the `style`

field in line 7.

The relation is used in a query in lines 11 and 12, where we deconstruct each tuple in the result set with the pattern `(_, c)`

.
This pattern ignores the actual timestamp values by using the `_`

wildcard pattern and it binds the contract ids to the name `c`

.
Our report `countFrom`

takes a contract id `cid`

and yields a function that, given an initial integer value, returns the sum of the initial value and the number of `Book`

-style `BookOrMovieEvent`

’s that have happened on the contract `cid`

.
The report `sum`

uses `countFrom`

and sets the initial value to `0`

.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | ```
type BookOrMovie | Book | Movie
type BookOrMovieEvent : Event {
style : BookOrMovie
}
relation bookStyleEventTimeStamps(t, cid)
| BookOrMovieEvent { timestamp = t, style = Book }@cid
/// countFrom : ContractId -> Int -> Int
val countFrom = \cid ->
for (_, c) in bookStyleEventTimeStamps do
if (c = cid) \x -> 1 + x else \x -> x
/// sum : ContractId -> Int
val sum = \cid -> countFrom cid 0
``` |

### Caveat¶

The use of `for`

-expressions isn’t supported in `where`

-clauses of contracts.
Rules and `for`

-expressions are supported only when running reports (functions) externally.

Example

A consequence of disallowing use of `for`

-expressions is that the following definition is not allowed:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | ```
type BookOrMovie | Book | Movie
type BookOrMovieEvent : Event {
style : BookOrMovie
}
relation bookStyleEventTimeStamps(t, cid)
| BookOrMovieEvent { timestamp = t, style = Book }@cid
/// countFrom : Int -> Int
val countFrom =
for (_, _) in bookStyleEventTimeStamps do
\x -> 1 + x
/// sum : ContractId -> Int
val sum = countFrom 0
``` |