Report query language
The report query language (RQL) is an embedded language in CSL that allows defining relations on data from past events. It is possible to use relations in the expression language to compute aggregate information based on the data in the relations.
Overview
RQL is a variant of Datalog which is a declarative logic-based programming language. RQL introduces two new constructions:
relation definitions for describing relations between data of past events; and
for expressions for computing aggregate information from past events.
A relation definition merely declares a relationship between data - by itself it does not “do” anything.
This decoupling of the logical specification from the actual query execution allows the CSL runtime system to apply various optimization techniques, such as incremental view maintenance using standard optimized Datalog evaluation algorithms.
The for
expressions are used to query the relations and apply a custom aggregation function to the result tuples.
Relationships are expressed by merely stating them, and the computation of the final query result must take care of finding a set of tuples that satisfies the constraints set out in the definitions of the relations through a process called unification.
Informally, each tuple in the result set will satisfy the property that if it is substituted into the relation being queried then it will be satisfied.
Moreover, the result set will contain all tuples with this property that can be deduced from the system in its current state given the events seen so far.
Relations
A relation
specifies a relation between values such that they satisfy one or more of the clauses given in the definition.
These relations can be thought of as sets of n-ary tuples, and can be conveniently visualized as a table in this text.
To define a relation, we use the top-level definition relation
:
relation relationName(<params>) <clauses>
Here, <params>
(the “head”) and <clauses>
(the “body”) stand for a list of parameters x_0, ..., x_n
and a list of clauses clause_0, ..., clause_m
, respectively.
The name relationName
is the name of a relation between values of the parameters such that at least one of the clauses clause_0
- clause_m
can be satisfied.
A clause is satisfied if there exists a value substitution of all parameters that occur in the clause such that the clause evaluates to True
.
Parameters are formed using the normal syntax for constants and variables we use elsewhere in the language. In particular, type annotations may occur anywhere to either aid in documentation or to help the type checker, or both. The parameters in the head are special insofar that only variables, possibly type-annotated, may occur there. Relations themselves are also typed: the inferred type of a relation is a tuple consisting of each of the types of the parameters in the relation’s head.
There are three basic clause constructs and one combinator for forming bigger clauses from smaller ones:
- equality clauses
x = y
, stipulating the equivalence of two parameters. This clause is satisfied when the values being substituted forx
andy
are equal;- event-matching clauses
Event { field0 = param0 , ..., fieldN = paramN } @ cid
, denoting that an event of the given type and with the given fields has been applied on a contract. This clause is satisfied when there exists a value substitution ofparam0
toparamN
such that an event with the corresponding fields set to those values has happened on contract with contract instancecid
;- relation-invocation clauses
relationName(param0, ..., paramN)
, which is satisfied only if the specified relationrelationName
exists and can be satisfied with the given parameters; and- conjunctive clauses
the conjunction
and
is used to form more advanced clauses from basic clauses:clause0 and ... and clauseN
..
The parameters for a relation must be a subset of the variables that occur in the rule-body.
For example, the following relation definition is not allowed, as none of a
, b
, or c
occur in the clause:
relation gibberish(a, b, c)
| Sale {} @ cid
Equality clauses
To stipulate that two terms must be equal, we use the equality clause: x = y
.
Either side of the equality symbol may be a constant.
Example
relation theNumberTwo(x)
| x = 2
The relation theNumberTwo
is unary, so it just consists of a set of values.
This relation is satisfied for those values that can be substituted for x
and make one of the clauses satisfied.
In this example, there is only one clause, x = 2
, stipulating that x
must be equal to 2
for it to be satisfied.
Clearly, there is exactly one satisfying assignment for x
here, namely the value 2
.
As it is the only clause in the relation, it contains only the value 2
.
Example
We can expand the simple relation from the previous example with some more clauses to make the unary relation bigger:
relation fourFirstPrimes(x)
| x = 2
| x = 3
| x = 5
| x = 7
This relation contains the values 2
, 3
, 5
, and 7
.
Example
In the previous two examples, constants were involved in all the equality clauses. We can also use parameters on both sides of the equality sign:
relation same(x : Int, y : Int)
| x = y
The binary relation denoted by same
contains all Int
tuples (x, y)
such that x
is equal to y
.
This relation is of course exactly that denoted by the equality clause itself, so the rule same
is superfluous.
However, in more advanced settings, where there are multiple constraints on the parameters, requiring two parameters to be equivalent becomes useful.
Note
Note that we have used type annotations in the relation definition above. Relations must be well-typed and the contract writer may use type annotations to guide the type inferrer.
Event-matching clauses
A clause may contain a condition on past events of contracts:
Sale { saleId = x } @ someId
The clause above expresses a condition that is true if the contract with id equal to that assigned to the variable someId
was matched with an event of type Sale
where the field saleId
was assigned a value equal to the value of the variable x
.
Example
type Item | Book | Movie | Coffee
type Sale : Event {
saleId : Int,
price: Int,
item: Item
}
relation saleIds(x, y)
| Sale { saleId = x }@y
The relation saleIds
contains pairs of values (x, y)
of the type Tuple Int ContractInstance
for all events of type Sale
applied to contract with id y
and where the value of saleId
on the event was x
.
Suppose the following events have been applied: (reusing the notation ev @ c
to denote that event ev
was applied to contract with contract-id c
)
...
Sale { saleId = 1, price = 10, item = Book } @ c1
Sale { saleId = 2, price = 42, item = Coffee } @ c1
Sale { saleId = 1, price = 10, item = Book } @ c2
Sale { saleId = 2, price = 10, item = Book } @ c2
Sale { saleId = 3, price = 117, item = Movie } @ c1
...
The values related by the saleIds
relation can be visualized in the following table:
|
|
---|---|
|
|
|
|
|
|
|
|
|
|
Example
A slightly extended relation that includes more information about the event-data is given below:
relation sales(saleId, price, item, cid)
| Sale { saleId = saleId, price = price, item = item }@cid
In the context of the events given above, the following table visualizes the data as related by sales
.
|
|
|
|
---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Relation-invocation clauses
A clause may consist of an invocation of a relation with a list of parameters: relationName(param0, ..., paramN)
.
These parameters can be either constants or variables themselves.
If a relation is invoked with only constants it corresponds to a check of whether the tuple containing those constants is in the relation.
Invoking a relation with n separate variables as parameters denotes the same relation as the rule itself, but restrictions can be placed by giving the same variable multiple times.
Example
The relation lessThanFive
denotes the set containing the integers 0
to 4
:
relation lessThanFive(x) | x = 0 | x = 1 | x = 2 | x = 3 | x = 4
We can invoke a relation as follows, essentially creating an alias:
relation alsoLessThanFive(x) | lessThanFive(x)
This relation definition expresses that whatever integer is in the unary relation lessThanFive
is also in the unary relation alsoLessThanFive
.
Conjunctive clauses
A clause in a relation may contain multiple conditions using the conjunctive combinator and
.
The format of a conjunctive clause is:
clause0 and clause1 and ... and clauseN
where each of clause0
to clauseN
are basic (non-conjunctive) clauses.
Combining two or more clauses with and
yields a new clauses that is satisfied when all of the constituent subclauses are satisfied.
Example
relation bookSales(s, p, cid)
| Sale { saleId = s, price = p, item = i} @ cid and i = Book
The satisfying assignments of variables for the bookSales
relation in the context of the previously shown events are:
|
|
|
---|---|---|
|
|
|
|
|
|
|
|
|
Example
In this example, we assume that an event Payment
can also be applied to the contracts to which Sale
can be applied.
A relation that collects information from all Payment
-events is given by the payments
rule.
type Payment: Event {
receiver: Agent,
amount: Int,
id: Int
}
relation payments(r, a, i, cid)
| Payment { receiver = r, amount = a, id = i } @ cid
Assume that the following events are added.
Payment { receiver = a1, amount = 10, id = 1} @ c1
Payment { receiver = a2, amount = 10, id = 1} @ c2
Payment { receiver = a3, amount = 117, id = 3} @ c1
The relation payments
is visualized in the following table:
|
|
|
|
---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Joining relations
By combining relation invocations with conjunctions we can formulate joins. Joining relations is central to the usage of RQL, so it does not have any dedicated syntax. By repeating variable names in multiple places in a clause we introduce constrains corresponding to joins: because the system finds satisfying assignments of all variables occurring in the clause, it is enough to repeat a variable multiple places to express that the same value must be found in that subpart.
Example
For example, the relations sales
and payments
can be combined into a joined relation that collects information for sales that have received payment:
relation paidSales(id, amount, item, cid)
| sales(id, amount, item, cid) and payments(r, amount, id, cid)
The occurrence of the same variable in sales
and payments
ensures that only the assignments to variables that are part of both relations are included as part of the paidSales
relation.
|
|
|
|
---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Example
One can further constrain the paidSales
relation and specialize it into a paidBookSales
rule:
relation paidBookSales(id, amount, cid)
| sales(id, amount, x, cid)
and payments(r, amount, id, cid)
and x = Book
In this case, we can inline the equality-condition on x = Book
and get:
relation paidBookSales2(id, amount, cid)
| sales(id, amount, Book, cid) and payments(r, amount, id, cid)
|
|
|
---|---|---|
|
|
|
|
|
|
Multiple clause-bodies
A relation may be defined by the union (disjunction) of multiple clauses:
relation rName(params)
| clause_1
| clause_2
| ...
| clause_n
An assignment of values to parameters is satisfying if at least one of clause_1
… clause_2
can be evaluated to True
.
Example
A relation that includes sales of either Coffee
or Movie
can be given as:
relation bookOrMovieSales(id, price, cid)
| sales(id, price, Coffee, cid)
| sales(id, price, Movie, cid)
The visualized version is below:
|
|
|
---|---|---|
|
|
|
|
|
|
Example
The clauses in disjunctions may be completely different:
relation saleOrPayment(id, cid)
| sales(id, amount1, item, cid)
| payments(r, amount2, id, cid)
Recursive relations
Relations can be recursive if the rec
keyword is used.
This allows us to express complex relationships in a succinct manner.
A recursive relation is declared in the same way as a normal one, except that we add the keyword rec
:
relation rec relationName(<params>) | <clauses>
Clauses in the list <clauses>
may now include relation invocations of the relation relationName
itself.
Relations may also be mutually recursive: this is achieved with the combination of rec
and with
also used for mutually recursive templates:
relation rec relation0(<params0>) | <clauses0>
with relation1(<params1>) | <clauses1>
...
with relationN(<paramsN>) | <clausesN>
The relations relation0
to relationN
may freely refer to one another recursively through relation invocations.
Example
The following CSL code demonstrates how to express a transitive closure of another relation.
Customers are related via their id strings through CustomerEvent
events.
In line 6 the relation directlyRelated
is defined, containing pairs of customer ids such that an event has happened (on any contract instance, since cid
is free) that relates the two.
Next, line 9 defines the relation relatedCustomers
as the transitive closure of directlyRelated
.
Customer id pairs belong to this relation if either 1) they are directly related, as per the directlyRelated
relation, or 2) there exists some third customer id x
that ancestor
is related to and which itself is directly related to descendant
.
Hence, customer id pairs in the relatedCustomers
may be either directly related or indirectly related through a chain of ids.
1type CustomerEvent : Event {
2 customerId: String,
3 referrerId: String
4}
5
6relation directlyRelated(parent, child, cid)
7 | CustomerEvent { customerId = parent, referrerId = child } @ cid
8
9relation rec relatedCustomers(ancestor, descendant, cid)
10 | directlyRelated(ancestor, descendant, cid)
11 | relatedCustomers(ancestor, x, cid)
12 and directlyRelated(x, descendant, cid)
Let us assume that the following events have occurred:
CustomerEvent { customerId = 2, referrerId = 1 } @ c1
CustomerEvent { customerId = 3, referrerId = 1 } @ c1
CustomerEvent { customerId = 4, referrerId = 3 } @ c1
CustomerEvent { customerId = 5, referrerId = 3 } @ c1
CustomerEvent { customerId = 6, referrerId = 2 } @ c1
The relation directlyRelated
now contains:
|
|
|
---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
If we draw this relation as a tree we get something akin to a “referral hierarchy”:
1
/ \
2 3
/ / \
6 4 5
The relation relatedCustomers
expresses that a customer id occurs below another one in this tree:
|
|
|
---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
If we want to take the reflexive transitive closure, i.e., express that a customer id is always related to one self, we simply add an extra clause:
relation relatedCustomersRefl(ancestor, descendant, cid)
| CustomerEvent { customerId = ancestor } @ cid
and ancestor = descendant
| relatedCustomers(ancestor, descendant, cid)
for
expressions
Relation definitions do not by themselves compute the tuples that are members of the relation, they merely state the constraints such tuples must satisfy.
In order to get actual values that can be used in the expression language as part of reports we must execute a query over a relation.
For this purpose, the expression language contains the for
expression, which performs a computation over the result tuples of a query over a relation:
for pattern in relationName do reducer
A for
expression provides a way to perform a computation over all the satisfying assignments of values to parametes for the given relation.
It contains three parts:
the
pattern
is a pattern that is used to assign names to the values;the
relationName
is some previously-defined relation; andthe
reducer
is an expression of the typet -> t
, wheret
is the type of the result value that is being built by the query.
Any names bound in the pattern will be in scope in the reducer
, and the reducer
expression is called once for each value in the result set, carrying along the partially built result of type t
and returning a “more complete” result of the same type.
The type of the entire for
expression is itself the type of the given reducer, i.e., t -> t
.
Thus, once we have built an expression with the for
construct, we need to supply it some initial state in order to compute the complete result.
An example of a an initial value would be 0
if the state type t
is an Int
and the reducer computes a sum.
Example
The following simple example demonstrates how the for
expression is used to perform a query computing a sum based on past events.
We first define a sum type BookOrMovie
that we use as an example of data which can be used to discern between events of the type BookOrMovieEvent
through the style
field.
Next we define the relation bookStyleEventTimeStamps
between time stamps and contract instances - i.e., a set of tuples of the type Tuple Instant ContractInstance
.
Each tuple in the set has the property that the timestamp is from a BookOrMovieEvent
where the style
field is set to the value Book
.
This is ensured by the constraint put on the style
field in line 7.
The relation is used in a query in lines 11 and 12, where we deconstruct each tuple in the result set with the pattern (_, c)
.
This pattern ignores the actual timestamp values by using the _
wildcard pattern and it binds the contract instances to the name c
.
Our report countFrom
takes a contract instance cid
and yields a function that, given an initial integer value, returns the sum of the initial value and the number of Book
-style BookOrMovieEvent
’s that have happened on the contract cid
.
The report sum
uses countFrom
and sets the initial value to 0
.
1type BookOrMovie | Book | Movie
2type BookOrMovieEvent : Event {
3 style : BookOrMovie
4}
5
6relation bookStyleEventTimeStamps(t, cid)
7 | BookOrMovieEvent { timestamp = t, style = Book }@cid
8
9/// countFrom : ContractInstance -> Int -> Int
10val countFrom = \cid ->
11 for (_, c) in bookStyleEventTimeStamps do
12 if (c = cid) \x -> 1 + x else \x -> x
13
14/// sum : ContractInstance -> Int
15val sum = \cid -> countFrom cid 0
Caveat
The use of for
-expressions isn’t supported in where
-clauses of contracts.
Rules and for
-expressions are supported only when running reports (functions) externally.
Example
A consequence of disallowing use of for
-expressions is that the following definition is not allowed:
1type BookOrMovie | Book | Movie
2type BookOrMovieEvent : Event {
3 style : BookOrMovie
4}
5
6relation bookStyleEventTimeStamps(t, cid)
7 | BookOrMovieEvent { timestamp = t, style = Book }@cid
8
9/// countFrom : Int -> Int
10val countFrom =
11 for (_, _) in bookStyleEventTimeStamps do
12 \x -> 1 + x
13
14/// sum : ContractInstance -> Int
15val sum = countFrom 0