Introduction to Dex
Dex is a functional, statically typed language for array processing. There are many tools for array processing, from high-level libraries like NumPy and MATLAB to low-level languages like CUDA. Dex is a new approach for high-level array processing that aims for the clarity of high-level libraries while allowing for more granular expressivity. In particular, Dex does not force you to rewrite all operations in terms of batched tensor interactions, but allows for a range of interactions. Put more simply, when learning MATLAB students are told repeatedly to "avoid for loops". Dex gives for loops back.
Let us begin with the most useful component of Dex, the
builder. The best analogy for this construct is list comprehensions
in Python. For instance, in Python, we might write a list
x = [[1.0 for j in range(width)] for i in range(height)]
In Dex, this construct would be written as:
Once we have a variable, we can print it.
More interestingly, we can also see its type with
:t. This type tells us
x is a two-dimensional table, whose first dimension has type
second dimension has type
Fin stands for
finite represents the type of range from 0
to the value given minus one. The
: tells us the type of the
We can also display it as html. To do this we include the plot library. Right now our table is not so interesting :)
Once we have an table, we can use it in new comprehensions. For example,
let's try to add
5 to each table element. In Python, one might write this as:
x5 = [[x[i][j] + 5.0 for j in range(width)] for i in range(height)]
Dex can do something similar. The main superficial difference is the
table indexing syntax, which uses
table.i instead of square brackets for
However, we can make this expression nicer. Because
x has a known table type
j index into that type, Dex can infer the range of the loop.
That means that we can safely remove the explicit
Fin type annotations and
get the same result.
Dex also lets you reduce this expression to include multiple variables
in the same
Standard functions can be applied as well. Here we take the
mean over each column:
This style of using
for to construct type-inferred tables is central to what
makes Dex powerful. Tables do not only need to have
Let's consider another example. This one produces a list of ones in Python.
y = [1.0 for j in range(width) for i in range(height)]
The analogous table construct in Dex is written in the following form. It
produces a one-dimensional table of
Height x Width elements. Here
indicates a tuple constructor.
As before, we can implement "adding 5" to this table using a
enumerating over each of its elements:
And we can apply table functions to the table:
But things start to get interesting when we consider the type of the table. Unlike the Python example, which produces a flat list (or other examples like NumPy arrays), the Dex table maintains the index type of its construction. In particular, the type of the table remembers the original ranges.
The use of typed indices lets you do really neat things, but let's consider how it works. Critically, one cannot simply index an table with an integer.
Instead, it is necessary to cast the integer into the index type of the
current shape. This type annotation is done with the
If you are feeling lazy and sure of yourself, you can also let Dex infer
the type for you. This is also how
for works in the examples above that
did not provide and explicit type annotation.
If it helps, you can think of table indexing as function application:
a with index
i just like how
f x applies function
Another consequence is that you cannot use indices as integers. It
is necessary to explicitly annotate index types with
This is because finite sets i.e.
Fin are not closed under addition.
If we want to convert these values to floats, we do it manually with the
function. We can use this to make an image gradient.
As we have seen, indices are not limited to only integers. Many different Dex
types are valid index types. For example, we declared table
y as having a
pair of integers as its index type (
a & b means tuple type), so indexing
y requires creating a tuple value.
Tuple indices also provide an ordinal value.
Many algorithms in Dex come down to being able to pack and unpack these
indices. For example, we have seen that it is easy to sum over one dimension
of a 2D table. However, if we have a 1D table indexed by a pair, we can
easily turn it into a 2D table using two
Again, we rely on type inference in order to avoid explicitly spelling the ranges.
Defining functions over tables
One use case of packing and unpacking table indices is that it allows us to change the order of the axes. This is useful for applying functions on tables.
For instance, we saw the
mean function above which sums over the first axis
of an table. We can apply
y to produce the mean average over 24
mean function works independently of the index type of the table.
Let's see how we can define our own table functions. Defining a function in Dex uses the following syntax.
Functions also have types. Note that that function types in Dex
-> symbol whereas tables use
We can also write functions with type variables over their inputs. For
instance, we may want to be able to write a function that applies "adds 5"
to tables with any index type. This is possible by declaring an
n => Int32
table argument type: this declares the type variable
n as the index type of
the table argument.
But function types can help you out even more. For instance, since index types are statically known, type checking can ensure that table arguments have valid dimensions. This is "shape safety".
Imagine we have
transpose function. We can encode the shape change in the type.
We can even make it more generic by abstracting over the value type.
We can also use this to check for shape errors:
The type system checked for us that the input tables indeed have the same shape.
Case Study: Fashion MNist
To run this section, move the following binary files to examples:
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz; gunzip t10k-images-idx3-ubyte.gz
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz; gunzip t10k-labels-idx1-ubyte.gz
To make some of these concepts for tangible let us consider a real example using Fashion MNist clothing. For this example we will first read in a batch of images each with a fixed size.
To do this we will use Dex's IO to load some images from a file. This section uses features outside the scope of the tutorial, so you can ignore it for now.
This show the mean pixel value aggregation over all images.
This example overplots three different pairs of clothing.
This one shows all the images on one channel over the base plot.
This example utilizes the type system to help manipulate the shape of an image. Sum pooling downsamples the image as the max of each pixel in a tile grid pattern. See if you can figure out the other types.
Dex is a functional language - but when writing mathematical algorithms, it is often convenient to temporarily put aside immutability and write imperative code using mutation.
For example, let's say we want to actually implement the
ourselves by accumulating summed values in-place. In Python, implementing this
is not directly possible solely via list comprehensions, so we would write a
acc = 0.0
for i in range(len(x)):
`acc = acc + x[i]`
return acc / len(x)
In Dex, values are immutable, so we cannot directly perform mutation. But Dex
includes algebraic effects, which are a purely-functional way to modeling
side-effects like mutation. We can write code that looks like mutation using
State effect, which provides getter and setter functionality (via
:= assignment). Here's what it looks like:
So, even though Dex is a functional language, it is possible to write loops that look similar to ones that truly perform mutation. However, there is one line which is quite new and a bit scary. Let us look into that line in a bit more detail.
$. This symbol is used in Dex just like it is used in Haskell, but
if you haven't seen it before, it seems a bit strange. The symbol
$ is the function
application operator: it basically replaces of expression-grouping parentheses
(f x) when it is inconvenient to write them. For example, the following two
expressions are identical:
\. This symbol is the lambda sigil in Dex. It is analogous to the
lambda keyword in Python, and starts the definition of a function value
(i.e. closure). In
tableMean above: the lambda takes an argument named
and returns the body, which is the expression following the
constructor in this case).
withState. This function uses the
State effect, enabling us
to introduce imperative variables into the computation.
withState takes an initial value
init and a body function taking a
"mutable value" reference (
acc here), and returns the body function's result.
Here's a simple example:
The element returned is the body function's result (
Finally: this is a good point to talk a bit about some other operators in Dex.
In the examples above, we see two types of equal sign operators:
The first is the
let operator that creates an immutable assignment (a
"let-binding"). This one is built into the language and can be used anywhere.
The other is
:=, which is an assignment operator that can only be used
State effect is available (e.g. inside of a body function passed to
ref := x assigns the value
x to the mutable reference
Reading the value in
ref is possible via the
get function. or via using
the final result returned by
tableMean function is pretty neat. It lets us work with tables with any
index type and computes the sum. However,
tableMean explicitly takes only
integer tables (of type
n => Float32).
If we try to apply
tableMean to other types for get errors. For example,
tableMean does not work when applied to a table of pairs of floats.
Intuitively, supporting this seems possible. We just need to be able to add and divide pair types. Let's look closer at the exact types of the addition and division operators.
These function types are a bit complex.
a -> a -> a with a constraint that
a implements the
a -> Float32 -> a where
a implements the
If we look in the Prelude, we can see that these interfaces are defined as (This will throw error because it mirrors the prelude, but we are just repeating it here.):
interface Add a add : a -> a -> a sub : a -> a -> a zero : a
interface [Add a] VSpace a scaleVec : Float -> a -> a
Interfaces define requirements: the functions needed for a type to implement the interface (via an instance).
Here is an
Add instance for the float pair type.
And here is a
VSpace instance for the float pair type:
Once we have these two instance definitions, we can revisit our table sum function using them:
The instance values are hardcoded for the float pair type. To be more general,
we can and should instead define
VSpace instances for generic
' tuple types.
More Fashion MNist
Now that we have more functions we can revisit some of the Fashion MNist examples.
Function that uses state to produce a histogram:
Plot how many times each pixel value occurs in an image:
Find nearest images in the dataset:
Variable Length Lists
So far all the examples have assumed that we know the exact size of our tables. This is a common assumption in array languages, but it makes some operations surprisingly difficult to do.
For instance, we might want to filter our set of images to only allow for images of 5's. But what is the type of this table?
Dex allows for tables with an unknown and varying length
List construct. You can think of list as
hiding one finite dimension of a table.
Tables of lists can be concatenated down to single lists.
And they can be deconstructed to fetch a new table.
Using this construct we can return to extracting items with label of shoes (label 5) from the image set.
Note though that the type here does not tell us how many there are. The type system cannot know this. To figure it out we need to unpack the list.
However we can still utilize the table. For instance if we are summing over the hidden dimension, we never need to know how big it is.
We hope this gives you enough information to start playing with Dex.
This is just a start though of the different functionality available
in the language. If you are interested in continuing to learn, we recommend
you look at the examples in the
examples/ directory, check out the prelude
lib/prelude.dx, and file issues on the GitHub repo. We have a welcoming
and excited community, and we hope you are interested enough to join us.
Here are some topics to check out in the Prelude:
- Randomness and Keys
- Laziness of For Comprehensions
- Records and Variant Types
- File IO
- Effects Beyond State