Fluid interfaces for happy databases

Alistair O'Neill
Senior Software Engineer

I recently had the opportunity to participate in an XTDB workshop run by Jeremy Taylor. This opened my eyes to the power and flexibility of XTDB.

A database that allows you to query not only forward and backward in time, but orthogonally constrained on when data was known? That's awesome! If Nixon had used XTDB, then Howard Baker's famous question would have been trivial.

As a Kotlin engineer with my Clojure knowledge in its infancy, I set out to see how I would use XTDB within a purely Kotlin context. The Java API gave me a great place to start, so I cloned the repo and began my journey of discovery. My goal was to create an idiomatic way for Kotlin devs to interact with XTDB which requires zero knowledge of Clojure data structures.

To get started, we need to learn a bit about Kotlin's extension functions. Once we have that foundation, we'll take a deep dive into creating XTDB DSLs with Kotlin.

Kotlin Extensions

One of my favorite features of Kotlin is the ability to add functions to classes and interfaces without the requirement to wrap them in an additional class.

For example, let's say we have a Java class from a third party that we can't modify:

public class Person {
    private String name;
    private String lastName;

    public Person(String name, String lastName) {
        this.name = name;
        this.lastName = lastName;
    }

    public String getName() {
        return name;
    }

    public String getLastName() {
        return lastName;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }
}

We might want to have a convenient function to get the full name of the person. This can be implemented in the following ways:

fun Person.getFullName(): String {
    return "$name $lastName"
}

// Alternatively
fun Person.getFullName() = "$name $lastName"

// Or even
val Person.fullName get() = "$name $lastName"

// Usage
val me = Person("Alistair", "O'Neill")
println(me.getFullName())

// Or
println(me.fullName)

In Kotlin, we can use .name and .lastName rather than calling .getName() and .getLastName(). When interacting with Java classes, any fields which have a getter and setter can be used directly as if they are var (mutable reference). Any field with just a getter is treated as a val (immutable reference).

Indeed, the name of the underlying field is irrelevant. The Java getter/setter could look like the following, and our Kotlin would be the same:

public class Person {
    private String foo;

    // ... constructor elided ...

    public String getName() {
        return foo;
    }

    public void setName(String name) {
        this.foo = name;
    }

    // ... other accessor methods elided ...
}

Why is this useful, though? This allows us to add Kotlin functions to the Java API in xtdb-core without requiring any Kotlin code whatsoever in xtdb-core! All Kotlin code can live in a separate module entirely.

Idiomatic DSLs with Kotlin

In kotlin-dsl we have implemented extensions to IXtdbSubmitClient and IXtdbDataSource that allow us to write queries and transactions directly.

For example, to find people with the same forename and surname, a query looks like this:

// Set up
val node = XtdbKt.startNode { }
val person = "person".sym
val name = "name".sym
val forenameKey = "forename".kw
val surnameKey = "surname".kw

node.db().q {
    find {
        +person
        +name
    }

    where {
        person has forenameKey eq name
        person has surnameKey eq name
    }
}

This looks nice, but what is actually going on here?

IXtdbDataSource.q is an extension function which takes any number of parameters and, finally, an argument of type QueryContext.() -> Unit. The type QueryContext.() -> Unit is a function which has QueryContext as a Receiver, doesn't take arguments, and doesn't return anything.

When we say "Receiver" we mean that, within the function, a reference to this refers to an instance of QueryContext. Our scope has become QueryContext. Most of our DSL will operate within this scope, giving us a sandbox to build up queries using a syntax that's easy on the eyes.
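To make the idea of a Receiver concrete, here is a minimal, self-contained sketch. SketchQueryContext and buildQuery are hypothetical stand-ins invented for illustration; they are not the real QueryContext or q from kotlin-dsl, and they only record strings.

```kotlin
// Hypothetical stand-in for the real QueryContext; it only records find terms.
class SketchQueryContext {
    val findTerms = mutableListOf<String>()

    fun find(term: String) {
        findTerms.add(term)
    }
}

// The block parameter is a function type with a receiver:
// inside the lambda, `this` is the SketchQueryContext instance.
fun buildQuery(block: SketchQueryContext.() -> Unit): List<String> {
    val context = SketchQueryContext()
    context.block() // run the lambda with `context` as its receiver
    return context.findTerms
}

fun main() {
    val terms = buildQuery {
        find("person")    // implicit receiver, same as this.find("person")
        this.find("name") // explicit receiver
    }
    println(terms) // prints [person, name]
}
```

The caller never sees the context being created. It simply writes statements that happen to run against it.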

The sections ahead will explain our use of trailing lambdas (sometimes called "lambda block syntax" in other languages), scoped extensions, operator overloading, infix functions, and a bit of invocation mojo. Finally, we'll tie it all together to create an even more powerful DSL around XTDB Query Rules.

1. Using Lambdas

In Kotlin, when calling a function where the last argument you supply is a lambda (denoted by curly braces), you are allowed to move the lambda outside the function call's parentheses.

Thus, the above could equivalently have been called as follows (although your IDE will yell at you):

node.db().q({
    find {
        +person
        +name
    }

    where {
        person has forenameKey eq name
        person has surnameKey eq name
    }
})

// Moved outside of the parentheses
node.db().q() {
    find {
        +person
        +name
    }

    where {
        person has forenameKey eq name
        person has surnameKey eq name
    }
}

In fact, because the lambda is the only argument to IXtdbDataSource.q when no query parameters are supplied, we can omit the () entirely from the function call, which leaves us with the original syntax. This results in more human-readable queries.

So, within our curly braces, we are acting within the scope of an instance of QueryContext. This is why we are able to call functions like find which are defined in the QueryContext class.
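The same rule can be seen in isolation with a couple of throwaway functions. The names describe and shout are hypothetical and exist only for this sketch, which shows a lambda after ordinary parameters as well as a lambda as the sole argument.

```kotlin
// Hypothetical helper: one ordinary parameter followed by a lambda.
fun describe(label: String, block: () -> String): String = "$label: ${block()}"

// Hypothetical helper whose only parameter is a lambda.
fun shout(block: () -> String): String = block().uppercase()

fun main() {
    // Lambda passed inside the parentheses.
    println(describe("inside", { "lambda" }))  // prints inside: lambda
    // Trailing lambda moved outside the parentheses.
    println(describe("outside") { "lambda" })  // prints outside: lambda
    // The lambda is the only argument, so the parentheses disappear.
    println(shout { "no parentheses" })        // prints NO PARENTHESES
}
```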

2. Extensions with Scopes

Next, let's look at this curious usage of +person. As I've already mentioned, Kotlin allows you to add extension functions to extant types. For most purposes, this is done at the top level so that your extension is available everywhere. For example:

val String.sym: Symbol get() = Symbol.intern(this)

After adding this extension method, anywhere I want to create a symbol from a string, it is as convenient as "foo".sym.

However, extensions don't need to be created at a global level. They can be declared within a class to restrict their usage to within the scope of that class. Returning to the Person example from earlier, we could have the following:

// Person is the Java class defined above
class Family(val father: Person, val mother: Person, val child: Person) {
    val Person.fullName get() = "$name $lastName"
}

fun main() {
    val father = Person("Dan", "O'Neill")
    val mother = Person("Joanne", "O'Neill")
    val me = Person("Alistair", "O'Neill")

    val family = Family(father, mother, me)

    // This won't compile. Person.fullName is not defined in this scope.
    println(father.fullName)

    // This works because in the apply block, we have the scope of Family.
    family.apply {
        println(father.fullName)
    }
}

So, how is this utilised in our Query syntax?

In FindContext, we define

operator fun FindClause.unaryPlus() = add(this)

operator fun Symbol.unaryPlus() = +FindClause.SimpleFind(this)

We have added the extension function unaryPlus to the Symbol class within the scope of FindContext.

Restricting our extension of Symbol to the scope of our BuilderContext means that the body of the extension can interact with (and mutate!) the instance of BuilderContext. Furthermore, it avoids conflicts where we might wish to have +person mean something else in another context.
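Here is a small runnable sketch of a scoped operator extension that mutates its enclosing context. SketchFindContext and find are hypothetical names, and String stands in for Symbol so that the example needs no Clojure classes.

```kotlin
// Hypothetical stand-in for FindContext, using String where the real DSL uses Symbol.
class SketchFindContext {
    private val clauses = mutableListOf<String>()

    // Member extension: only visible where a SketchFindContext is a receiver,
    // and free to mutate the enclosing context.
    operator fun String.unaryPlus() {
        clauses.add(this)
    }

    fun build(): List<String> = clauses.toList()
}

fun find(block: SketchFindContext.() -> Unit): List<String> =
    SketchFindContext().apply(block).build()

fun main() {
    val found = find {
        +"person"
        +"name"
    }
    println(found) // prints [person, name]

    // +"person" // outside the block this does not compile: no unaryPlus for String here
}
```

Because the extension is a member of the context class, it can reach the private clauses list, which a top-level extension could not do.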

3. Operator Functions

unaryPlus is a special type of function known as an operator function. In practice, this means that it is called using special syntax.

Operator functions are usually used to define things like what a + b means. Let's have a look at a simple example.

data class Fraction(val n: Int, val d: Int) {
    override fun toString() = "$n/$d"
}

fun main() {
    val a = Fraction(1, 3)
    val b = Fraction(1, 2)
    println(a + b)
}

This won't compile because our compiler doesn't know what it means to add two Fraction instances together. We can define this using the operator function plus. (I don't account for simplifying fractions, but that isn't important for the example.)

data class Fraction(val n: Int, val d: Int) {
    override fun toString() = "$n/$d"
    operator fun plus(other: Fraction) = Fraction(n * other.d + other.n * d, d * other.d)
}

// Or we could do it as an extension
data class Fraction(val n: Int, val d: Int) {
    override fun toString() = "$n/$d"
}

operator fun Fraction.plus(other: Fraction) = Fraction(n * other.d + other.n * d, d * other.d)

Both of these implementations convert a + b to a.plus(b) with our supplied definitions. The difference is that the first requires us to change the Fraction class, whereas the second is an extension, so it can be added elsewhere (or in a given special scope).

In the XTDB query example, we have +person where person is an instance of Symbol. This line will call person.unaryPlus() which is defined in the scope of FindContext to add a SimpleFind clause to our list for the Find section of the query.
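For a runnable comparison of binary and unary operator functions, the Fraction example can be extended as below. The unaryMinus and minus definitions are additions made for this sketch; they are not part of the example above or of kotlin-dsl.

```kotlin
data class Fraction(val n: Int, val d: Int) {
    override fun toString() = "$n/$d"

    // a + b becomes a.plus(b)
    operator fun plus(other: Fraction) = Fraction(n * other.d + other.n * d, d * other.d)

    // -a becomes a.unaryMinus(); a unary operator takes no arguments
    operator fun unaryMinus() = Fraction(-n, d)
}

// a - b, written as an extension in terms of the two operators above
operator fun Fraction.minus(other: Fraction) = this + (-other)

fun main() {
    val a = Fraction(1, 3)
    val b = Fraction(1, 2)
    println(a + b) // prints 5/6
    println(-a)    // prints -1/3
    println(a - b) // prints -1/6
}
```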

4. Infix Functions

One of the benefits of having a DSL is to allow complicated function calls or data structures to read more naturally.

Let's take the example of adding a restriction to the where section of a query to check whether a person document has the key :name. Using normal Kotlin functions, this might look like:

fun hasKey(symbol: Symbol, key: Keyword) = TODO("Add to your list of clauses or w/e")

// Used like:
db.q {
    where {
        hasKey(person, name)
    }
}

// Or we could define it as an extension function
fun Symbol.hasKey(key: Keyword) = TODO("Do the thing")

// Used like:
db.q {
    where {
        person.hasKey(name)
    }
}

While the latter is certainly nicer than the former, it still doesn't give us the "plain English" feel of a fluid interface. We can do better by using an infix operation. Even if you have never specifically heard of infix operations, much less implemented one yourself, you have almost certainly used one.

val me = mapOf(
    "name" to "Alistair",
    "age" to 26,
    "National Insurance" to "lol, fat chance"
)

If we look at the signature of mapOf, we find:

public fun <K, V> mapOf(vararg pairs: Pair<K, V>): Map<K, V>

That's a little odd. When we were creating our map, we never explicitly created Pair objects. Come to think of it, the way we just have to between two values is rather strange.

Fortunately, Kotlin is Open Source, so we can press Ctrl-B on it and have a look at the definition to see what's going on:

public infix fun <A, B> A.to(that: B): Pair<A, B> = Pair(this, that)

This is a function that acts "on" an object of type A with a single argument of type B. These are used as the first and second values of the returned Pair.

The infix modifier is what allows us to call the function without needing a dot or parentheses. Note that you can call an infix function the "normal" way if you really want to:

// Using the infix
"Alistair" to "O'Neill"

// Calling normally
"Alistair".to("O'Neill")

There are some restrictions on what can be made an infix function and what can't:

• The function must take exactly one argument which cannot be a vararg, nor can it have a default value.

• The function must be a member of a class, or an extension function (i.e. it must have a Receiver).
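A minimal infix function that satisfies both restrictions might look like this. The name repeatedTimes is hypothetical; the only library call is String.repeat from the Kotlin standard library.

```kotlin
// Hypothetical infix extension: it has a receiver (String) and exactly one
// parameter, which is neither a vararg nor given a default value.
infix fun String.repeatedTimes(count: Int): String = this.repeat(count)

fun main() {
    println("ab" repeatedTimes 3)  // prints ababab
    println("ab".repeatedTimes(3)) // same call, written the "normal" way
}
```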

With this in mind, let's consider how to implement syntax that modifies our WhereContext and uses more human looking syntax:

// Desired syntax:
where {
    person has name
}

// The definition within the WhereContext class
infix fun Symbol.has(key: Keyword) = TODO("Mutate our context accordingly")

Happy days! We can now use has in between a Symbol and Keyword to do ...something. In our case, we might want to add a newly constructed clause to a list.

Things start to get a little hairy when we want to chain together infix functions. We not only want to be able to say that person has a name key, but also want to be able to constrain what the value of this key is.

// Desired syntax:
where {
    person has name               // Checks for the existence of a name key
    person has name eq "Alistair" // Checks that the person has a name key with value "Alistair"
}

When infix functions are chained in such a way, the order of operations is from left to right. That is to say, the above code is equivalent to:

where {
    (person has name) eq "Alistair"
}
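The left-to-right grouping is easy to check with an infix function that already exists in the standard library. This sketch uses shl (shift left) on Int, which is unrelated to the XTDB DSL but follows the same chaining rule.

```kotlin
fun main() {
    // shl is an infix function on Int from the standard library.
    println(1 shl 2 shl 3)   // prints 32
    println((1 shl 2) shl 3) // prints 32, the same grouping made explicit
    println(1 shl (2 shl 3)) // prints 65536, a different grouping
}
```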

If we want our DSL to look like this in practice, it means that our eq function will need to be defined on whatever the result type of the has function is.

Having has spit out a bespoke data class which is then extended with the eq function would be absolutely fine if we only wanted to support the five-word statement. We would just have the eq function run the code that mutates the Context.

However, we want both forms to be valid. This means that we still need to mutate the Context as a side effect of has, in case no eq call follows. When an eq call does follow, we don't want to add two clauses to our where section. In this specific instance, adding both would not change the query results, because the restrictions overlap: a document can't have a key with a specific value if it doesn't have that key at all!

I'm not happy with that, however. It would result in a Query object that doesn't strictly represent what a user has input, and that's just bad design (and probably has performance implications for the query engine). Instead, I implemented a buffer system where the most recent clause isn't added to the list for the section until either build time, or a new clause is started.

In the case that the eq form of the statement is used, it replaces the buffered HasKey clause with a HasKeyEqualTo clause:

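// Within WhereContext
data class SymbolAndKey(val symbol: Symbol, val key: Keyword)

infix fun Symbol.has(key: Keyword) =
    SymbolAndKey(this, key).also {
        // Buffer a plain HasKey clause; a following eq call may replace it.
        // Assumed: HasKey takes (symbol, key), mirroring HasKeyEqualTo.
        add(HasKey(this, key))
    }

infix fun SymbolAndKey.eq(value: Any) = replace(HasKeyEqualTo(symbol, key, value))

// The abstract class which sets up the buffer
abstract class ComplexBuilderContext<CLAUSE, TYPE>(
    private val constructor: (List<CLAUSE>) -> TYPE
): BuilderContext<TYPE> {
    private val clauses = mutableListOf<CLAUSE>()
    private var hangingClause: CLAUSE? = null

    // Starting a new clause commits the previous one and buffers the new one
    protected fun add(clause: CLAUSE) {
        lockIn()
        hangingClause = clause
    }

    protected fun replace(clause: CLAUSE) {
        hangingClause = clause
    }

    // Moves the buffered clause, if any, into the list
    private fun lockIn() {
        hangingClause?.let { clauses.add(it) }
        hangingClause = null
    }

    override fun build(): TYPE {
        lockIn()
        return constructor(clauses)
    }
}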
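To see the buffer behave end to end, here is a self-contained sketch of the same idea. Everything in it is a hypothetical simplification: String stands in for Symbol and Keyword, the clause types are local data classes, and SketchWhereContext folds the buffer into a single class rather than an abstract base.

```kotlin
// Hypothetical clause types, with String standing in for Symbol and Keyword.
sealed interface Clause
data class HasKey(val symbol: String, val key: String) : Clause
data class HasKeyEqualTo(val symbol: String, val key: String, val value: Any) : Clause

class SketchWhereContext {
    private val clauses = mutableListOf<Clause>()
    private var hangingClause: Clause? = null

    data class SymbolAndKey(val symbol: String, val key: String)

    // Starting a new clause commits the previous one and buffers the new one.
    infix fun String.has(key: String): SymbolAndKey {
        lockIn()
        hangingClause = HasKey(this, key)
        return SymbolAndKey(this, key)
    }

    // eq swaps the buffered HasKey for the more specific clause.
    infix fun SymbolAndKey.eq(value: Any) {
        hangingClause = HasKeyEqualTo(symbol, key, value)
    }

    // Moves the buffered clause, if any, into the list.
    private fun lockIn() {
        hangingClause?.let { clauses.add(it) }
        hangingClause = null
    }

    fun build(): List<Clause> {
        lockIn()
        return clauses.toList()
    }
}

fun buildWhere(block: SketchWhereContext.() -> Unit): List<Clause> =
    SketchWhereContext().apply(block).build()

fun main() {
    val clauses = buildWhere {
        "person" has "age"
        "person" has "name" eq "Alistair"
    }
    clauses.forEach(::println)
    // prints HasKey(symbol=person, key=age)
    // prints HasKeyEqualTo(symbol=person, key=name, value=Alistair)
}
```

Two statements go in and exactly two clauses come out, so the built list mirrors what was typed.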

5. Invoking Mojo

What does the following do?

"hello"("world")

If your answer is "That's a String, not a function. What do you think you are playing at?", you have a valid point. As it stands, it is meaningless and won't compile.

What if we could call a String as a function, though? In terms of natural looking, declarative input, it opens up a lovely design space. Well, this is Kotlin, and we can extend types as we please: including adding an invoke function.

operator fun String.invoke(other: String) {
    println("$this$other")
}

By adding the invoke operator function, we can now use a String as if it is a function which takes another String.

This functionality, in conjunction with the fact that trailing lambdas can be lifted out of the parentheses, enables our nested XtdbKt configuration to look especially pleasing:

XtdbKt.startNode {
    "xtdb/tx-log" {
        "kv-store" {
            module = "xtdb.rocksdb/->kv-store"
            "db-dir" to File("/tx")
        }
    }
    "xtdb/document-store" {
        "kv-store" {
            module = "xtdb.rocksdb/->kv-store"
            "db-dir" to File("/doc")
        }
    }
    "xtdb/index-store" {
        "kv-store" {
            module = "xtdb.rocksdb/->kv-store"
            "db-dir" to File("/index")
        }
    }
}

Code is data, data code, -- that is all

Ye know on earth, and all ye need to know

-- Keats (probably)

This is achieved by adding the following invoke extension function to String within the NodeConfigurationContext and ModuleConfigurationContext:

operator fun String.invoke(block: ModuleConfigurationContext.() -> Unit) {
    builder.with(this, build(block))
}
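The nesting trick can be tried without XTDB on the classpath. In the sketch below, SketchConfigContext and config are hypothetical names, and the context simply builds nested maps instead of calling a real node builder. The scoped to function is also an assumption about how such a context could record leaf values.

```kotlin
// Hypothetical stand-in for the configuration contexts: it builds nested maps.
class SketchConfigContext {
    private val entries = linkedMapOf<String, Any>()

    // "key" { ... } opens a nested section.
    operator fun String.invoke(block: SketchConfigContext.() -> Unit) {
        entries[this] = SketchConfigContext().apply(block).build()
    }

    // "key" to value records a leaf entry. Inside the context, this member
    // extension takes priority over the standard library's `to`.
    infix fun String.to(value: Any) {
        entries[this] = value
    }

    fun build(): Map<String, Any> = entries.toMap()
}

fun config(block: SketchConfigContext.() -> Unit): Map<String, Any> =
    SketchConfigContext().apply(block).build()

fun main() {
    val settings = config {
        "xtdb/tx-log" {
            "kv-store" {
                "db-dir" to "/tx"
            }
        }
    }
    println(settings) // prints {xtdb/tx-log={kv-store={db-dir=/tx}}}
}
```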

It's amazing how a few little concepts can be combined to create convenient bespoke syntax.

6. Tying it all together

We now have all the pieces in place. As a final example -- and perhaps my favorite example of creating one's own syntax -- this is how you define XTDB Query Rules and use them from kotlin-dsl:

// Setup
val node = XtdbKt.startNode()
val db = node.db()

val dwarf = "dwarf".sym
val descendentOf = "descendentOf".sym
val descendent = "descendent".sym
val ancestor = "ancestor".sym
val fatherKey = "father".kw
val intermediate = "intermediate".sym

db.q {
    find {
        +dwarf
    }

    where {
        rule(descendentOf) (dwarf, "farin")
    }

    rules {
        def(descendentOf) (descendent, ancestor) {
            descendent has fatherKey eq ancestor
        }

        def(descendentOf) (descendent, ancestor) {
            descendent has fatherKey eq intermediate
            rule(descendentOf) (intermediate, ancestor)
        }
    }
}

Here, we are declaring a Rule descendentOf that accepts two parameters, and then providing the restrictions on those parameters using a WhereContext.() -> Unit.

Being able to split out the parameters from the name of the Rule comes from the fact that the definition is created in two function calls.

data class RuleDeclaration(val name: Symbol)

fun def(name: Symbol) = RuleDeclaration(name)

operator fun RuleDeclaration.invoke(vararg params: Symbol, block: WhereContext.() -> Unit) =
    +RuleDefinition(
        name,
        params.toList(),
        WhereContext.build(block)
    )

The first call of def takes the name of the Rule and creates a RuleDeclaration. This declaration then gets invoked with a variable number of parameters (which are in the parentheses) and a lambda. Because the lambda is the final argument, it is written outside the parentheses in curly braces.
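The two-step call can be reproduced in a standalone sketch. All names here (SketchRule, SketchBodyContext, SketchRulesContext) are hypothetical, String replaces Symbol, and a rule body is reduced to a list of strings so the example stays runnable on its own.

```kotlin
// Hypothetical stand-ins: String replaces Symbol, and a rule body is a list of strings.
data class SketchRule(val name: String, val params: List<String>, val body: List<String>)

class SketchBodyContext {
    val lines = mutableListOf<String>()

    operator fun String.unaryPlus() {
        lines.add(this)
    }
}

class SketchRulesContext {
    private val rules = mutableListOf<SketchRule>()

    data class Declaration(val name: String)

    // First call: capture the rule name.
    fun def(name: String) = Declaration(name)

    // Second call: invoke the declaration with parameters and a trailing lambda.
    operator fun Declaration.invoke(vararg params: String, block: SketchBodyContext.() -> Unit) {
        val body = SketchBodyContext().apply(block).lines.toList()
        rules.add(SketchRule(name, params.toList(), body))
    }

    fun build(): List<SketchRule> = rules.toList()
}

fun main() {
    val rules = SketchRulesContext().apply {
        def("descendentOf")("descendent", "ancestor") {
            +"descendent has father eq ancestor"
        }
    }.build()

    val rule = rules.single()
    println(rule.name)   // prints descendentOf
    println(rule.params) // prints [descendent, ancestor]
    println(rule.body)   // prints [descendent has father eq ancestor]
}
```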

All of this comes together to form a very natural looking way of defining a Rule.

Conclusion

Overall, I am incredibly excited to see where we can take kotlin-dsl.

Our goal is to provide a seamless, strongly-typed utility layer which can lie on top of such a powerful and versatile database.

Stay tuned!