Dan Doel
Dan's posts

Hey FedEx. Maybe, if a delivery requires a signature (a signature that I am not allowed to put on the packing slips you leave; it has to be given in person), you should attempt to contact me to collect said signature, by knocking on my door, for instance.

Maybe I shouldn't have to leave a note for the delivery person telling them to knock, because I am in fact home for the second day in a row, attempting to sign for the package.

Or am I required to sit on the porch and do nothing all morning?

I've been working on our SQL generator lately, and have become increasingly amazed at how complicated, inconsistent and arbitrary SQL's syntax is. Much of the recent work has had to do with 'order by' clauses, and they provide a lot of good examples. (Some of this may be Microsoft specific, but not all of it.)

Previously, I had thought 'order by' was part of 'select.' But it turns out you can apply it to things like 'union'. I was surprised, because this actually seems convenient and nicely orthogonal. But there are also many queries you can't apply it to. The one exclusion that makes sense is another 'order by', which would be redundant; but a query topped with 'order by' also can't appear as a subquery in a 'select', for instance. You can have a subquery that uses 'order by' together with 'offset/fetch' to get a subset of the relation, but a query limited with 'offset/fetch' is in turn not eligible for another 'order by', even though there are reasons to want to nest that. (You also can't union a limited result set directly, for some reason.) You must interpose a 'select' to promote the limited relation to the right syntactic class.

The real fun, though, has to do with the things you are allowed to order by. Since 'order by' can be attached to some subqueries, you can order by whatever columns that query has, without qualification (because the query has no name to qualify by). You are not allowed to order by constant expressions like '1 + 1'. However, you are allowed to order by constant numerals, like '1', which actually refers to the first column of the subquery, positionally.
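
The positional behavior can be poked at from Python using SQLite (the post is about MS SQL, so details like the treatment of constant expressions may differ, but SQLite shares the numeral-means-position quirk):

```python
# Minimal sketch of ordering by a constant numeral: '1' is taken as
# a positional reference to the first column, not as the constant 1.
import sqlite3

con = sqlite3.connect(":memory:")
rows = con.execute("""
    select * from (values (2, 'b'), (1, 'a'), (3, 'c'))
    order by 1
""").fetchall()
print(rows)  # sorted by the first column: [(1, 'a'), (2, 'b'), (3, 'c')]
```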

If the subquery is a 'select', you are also able to order by (non-constant) expressions. These expressions are allowed to mention the (possibly qualified) column names of the tables in the 'from' clause. But they are not allowed to mention the columns selected by the whole subquery.

If the subquery is a 'union' or similar, the things you are allowed to refer to of course come from the first subquery of the 'union', because 'union' is positional. So you can refer to the named columns selected in the first subquery, and are not allowed to refer to the named columns in the other subqueries.

If the first subquery of the 'union' is a 'select', you are also allowed to order by certain expressions in terms of the columns in scope in the 'from' clause. However, the only allowed expressions are the ones that can be detected to be equivalent (possibly up to qualification; not sure) to an expression that was selected. And in that case, ordering by the expression means to order by the positional column that the expression appeared in. So:

    select t.a a, t.b b, (t.a + 1) c
    from (values (1,2), (2,3)) as t(a,b)
    union
    select p, q, r
    from (values (1, 4, 37)) as s(p,q,r)
    order by a+1 desc

This has (1, 4, 37) as the first row, because 37 is the highest value in the third column, even though you might think that 'a' should refer to the first column, so that we would order by the successor of the first column's value.

And if you really want to order by a constant expression, you can, because SQL allows you to order by subqueries, so you can 'order by (select 1 + 1)'. I assume you just get an arbitrary order in that case.

Isn't that great?

I and other people have written about related stuff before, but some time ago I realized an amusing crystallization that I haven't seen explicitly written up anywhere. So here it is.

I usually tell people that 'finally tagless' stuff is Church/Boehm-Berarducci encoding.* To define μF, you use ∀r. (F r -> r) -> r. However, the details of the approach (in the papers I've read) are somewhat tailored to be nice in Haskell. For one, without extensions, that quantified type isn't first-class, so you only implicitly program with it by writing polymorphic functions.

More importantly, the (F r -> r) algebra portion is represented by type classes. This nicely affords extensible data types. Adding constructors corresponds to taking coproducts F + G, but the algebras of these coproducts can be represented by products of algebras. Classes are really nice here, because they can represent the product of a set of (named) algebras, so you don't have to keep track of the order your constructors come in, and can't include the same constructors twice.

When encoding higher-order things, r gets parameterized, so you end up with types like `(C1 r, C2 r, ...) => r T`. You can even encode GADTs this way, by having the algebra methods in Cn have more specific types for the arguments to r.
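
The papers do all of this with Haskell type classes, but the shape of the encoding can be sketched in Python, with interpreter objects standing in for class instances (all names here are invented for illustration):

```python
# Finally tagless, Python-flavored sketch: each interpreter object
# provides the (F r -> r) algebra, and a term is just a function
# polymorphic in which interpreter it receives.
class Eval:
    def lit(self, n): return n
    def add(self, x, y): return x + y

class Pretty:
    def lit(self, n): return str(n)
    def add(self, x, y): return f"({x} + {y})"

# One term, many interpretations. Extending the language means
# adding methods (a product of algebras), not reworking a closed
# data type.
def term(s):
    return s.add(s.lit(1), s.add(s.lit(2), s.lit(3)))

print(term(Eval()))    # 6
print(term(Pretty()))  # (1 + (2 + 3))
```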

Now, another big trend among Haskell/functional programmers is extensible, algebraic effect systems. For these, we have a set of algebraic effect specifications, and work with the free monad over those specifications. So, what happens if we apply the 'finally tagless' approach to this?

What you get is essentially the mtl. Every class like MonadState has methods that are what you'd get if you pushed the algebraic effect specifications through to the finally tagless approach. And the above-mentioned row-like (commutative and idempotent) behavior of class constraints in the finally tagless approach gives you all the nice 'extensible' aspects.

The mismatch in the mtl is that many of the classes contain non-algebraic components. For instance, throw is algebraic, but catch is not. These don't really hurt much for most purposes (and might be convenient sometimes); but strictly speaking it's something of a mismatch.

If you want to do algebraic effect style handlers, what this corresponds to is choosing a transformer to fill one of the constraints. If I have a (MonadState s m, MonadError e m) context, I can use `ExceptT e n` to handle the exceptions, and push the context down to (MonadState s n). Not all constraints have to be discharged at once (although that is the common way people use the library).
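
The discharge-one-constraint-at-a-time idea can be caricatured in Python (a loose sketch; all names here are invented, and Python exceptions stand in for the error effect):

```python
# 'decrement' is written against abstract state and error
# operations, like code in a (MonadState s m, MonadError e m)
# context: it only assumes something with get/put and a way to fail.
class Underflow(Exception):
    pass

def decrement(st):
    n = st.get()
    if n == 0:
        raise Underflow("underflow")
    st.put(n - 1)
    return n

# Handle only the error effect, leaving state abstract, the way
# ExceptT e n discharges MonadError e and pushes the remaining
# context down to MonadState s n:
def run_error(action, st):
    try:
        return ("Right", action(st))
    except Underflow as e:
        return ("Left", str(e))

# Finally, pick a concrete state handler:
class DictState:
    def __init__(self, s): self.s = s
    def get(self): return self.s
    def put(self, s): self.s = s

print(run_error(decrement, DictState(0)))  # ('Left', 'underflow')
print(run_error(decrement, DictState(5)))  # ('Right', 5)
```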

So, if you combine a couple of the trendiest things these days, what you get is something that has been in use for decades. :)

[*] For people who know more category theory than type theory folklore for some reason, Boehm-Berarducci encoding basically goes like this: imagine if one of your type-forming principles were, "any end formula I can write down exists by fiat." Then you get all kinds of right Kan extensions, and any time you want to define a 'free' thing, you just use the fact that `Ran U 1 ⊣ U`, and your ability to write down a formula for the left adjoint.

I've been trying out the dedicated YouTube music app on Android recently, and it's incredible how worthless it is. It's almost universally inferior to the normal YouTube app at its intended domain.

It can play music in the background, but so can the normal app. It can't create or edit playlists; you need to switch to the normal app for that. And when you do have a playlist, the music app will often judge that some of the entries aren't music, and omit them, so you can't use it for playback, either. The only thing it does that is unique is play random sequences of songs related to a video you pick.

I'm not sure I've ever seen something like this, where a dedicated application is almost strictly inferior to the more general version by the same company in (essentially) every way.

I was talking recently with someone about his idea for tweaking things about file handles. I could only think of one minor annoyance with the idea, and I noted that it would probably be a pretty moot point if you were using a copy-on-write filesystem (which are increasingly popular), and if they actually presented an interface that exposed the underlying immutability, which they don't.

But some time after, I realized the story is even more ridiculous than I had first thought, because SSDs are taking over long term storage these days. And mutation is not a great operation on them either. So the hardware is already doing something more like allocating modified values and garbage collecting, and only presenting mutation as an interface to that.

Then on top of the immutable disks presenting a mutable interface, people write immutable filesystems that present a mutable interface, and only use the underlying immutability for some extra features, like snapshotting.

Then, within the immutable filesystem presenting a mutable interface, people use version control systems, which are higher level immutable filesystems. But the feature everyone is obsessed with in these is the ability to appear to mutate the immutable structures, even though it's really a mutable-appearing view to an immutable structure with garbage collection.

Something to think about next time you hear a Real Programmer tell you that mutation is, "how things really work."

Someone mentioned to me after I wrote the adjoint triples blog post that the colim ⊣ const ⊣ lim triple can actually be seen as part of the cohesion adjoint quadruple, where the quadruple would have a right adjoint to lim (I think).

So, I fooled around, and figured out that the analogue of that triple in Haskell (or really, a total version thereof) actually extends into an adjoint quintuple, with a right adjoint to lim (forall) and a left adjoint to colim (exists). At least, it works for the co- and contravariant endofunctor cases; I didn't bother thinking about quantifying over higher-kinded types and whatnot.

Fun with adjoint triples and the (co-)Yoneda lemma.

I haven't bothered reading up on 5th edition D&D, so I don't know how good or bad it's supposed to be. But I imagine it's a bad sign that they're reprinting 2nd edition.

On the topic of bugs in Google products (the last one being their Android music player)....

The gmail spam filter is terrible these days. I have to regularly check my spam folder, because it has non-spam messages almost every week. I had to modify my haskell-cafe rule to skip the spam filter, because it was erroneously marking so many of those as spam (and I doubt it's commentary on the quality of that list). It's off target on most other mailing lists, too.

A google search for "google spam filter sucks" reveals that Linus Torvalds discovered this recently, too, after gmail started marking 30% of his Linux kernel mailing list traffic as spam.

Even better, it regularly marks Google's own alert messages about logins from new devices as spam.

Notes on notes on cubical type theory

After a few days of looking at these notes, the implementation, and some random background material, I think I have a rough understanding of what's going on in cubical type theory. I've heard that there's some confusion amongst people who aren't the implementers as to what's going on, and it's fairly easy to understand why. Certain notation in the notes is both non-standard and completely unexplained. And some (I think) important subtleties with the rules are also unexplained.

Fortunately, I've implemented and used enough type theories that I can go read the implementation and figure out what the notation is actually supposed to mean (in some cases). I'm not sure if this puts me in a small, advantaged subset of people, or if the stuff I understand is trivial and the 'confusion' is over bigger issues. But, here are my notes on how to understand the notes, in case it helps anyone.

First some background. Cubical type theory is based on cubical sets, which are used in category theory as a model for homotopy. These are in turn based on the cube category, which is a category whose objects are n-dimensional cubes. The 0 cube is a point, the 1 cube is a line segment, the 2 cube is a square, etc.

Cubical sets are presheaves on the cube category. So A : Cube^op -> Sets. This should be thought of as taking the 0-cube to the type A, the 1-cube to identities between values of A, the 2-cube to identities between identities, etc.

So, the core idea behind cubical type theory is that we add the 1-cube, to the formalism. This can be thought of as an interval type 𝕀, which has two elements that are identical to each other (but only propositionally). Functions f : 𝕀 -> A are then like two elements of A together with a proof that they are identical.  That is, 𝕀 -> A is like (Σ x y : A. Id A x y), which is the 1-cube portion of the cubical set for A. And we construct higher cubes by iterating the 𝕀 ->. This is like currying the product of intervals that form higher cubes.

The other angle is that when we have 𝕀 -> A, this can be flipped around to get back dependent types, because it can be used in slice categories. So, any particular t : 𝕀 -> A corresponds to an element in the fiber above its two values. So there is a corresponding eq : Id A (t 0) (t 1). This is what cubical type theory actually uses; there's a special lambda that lets us go from "Γ , i : 𝕀 ⊢ t(i) : A" to "Γ ⊢ <i> t(i) : Id A (t 0) (t 1)".

Now, onto the actual notes on the notes.

The first thing to know is the notation A(i0). This is used when A is defined in a context including i : 𝕀, and it means to substitute 0 in for i in A. This is one of the things that isn't explained anywhere in the notes, but reading the code makes it pretty clear that's the intention.

Another thing to know is that the above lambda is not the fully general formulation of things, I think. The code written in the cubicaltt language references an IdP type, and defines the normal identity type in terms of it. This is used for type families fibred over the interval, which is written in the notes like "Γ, i : 𝕀 ⊢ A". This makes IdP actually a sort of heterogeneous equivalence type, mediated by a specified equivalence of types, and can be thought of as Πi:𝕀.A(i) instead of 𝕀 -> A (perhaps this is where the name IdP comes from? P for Π?). You can mimic this (I think) in something like Agda by defining a type:

    IdP {A} {B} (AB : A == B) (x : A) (y : B) = subst id AB x == y

But needless to say, it seems significantly more convenient to work with in the cubical rules. I should note that this is something of a disconnect between the implementation and the rules. The implementation works with typing judgments as things of type U, and appears to let you form functions 𝕀 -> U, because judgments are behind the scenes in an implementation. This sort of slight mismatch is pretty common when moving from typing rules to a computer implementation.

One of the reasons why this comes up is the very last set of rules about "glueing" (sic). Glue is (part of) how univalence is added to the theory; it lets you demonstrate an isomorphism (up to identity) and get an identity. The thing to note here is that there are two "glue"s. One is term-level, and the other is type-level. The notes write them almost identically, but I'll write the type level one with a capital. Anyhow, types built with Glue are an example of something that is fibred over the interval. You demonstrate an Iso(T, A), and using Glue with that lets you build a type family that will reduce to T when i = 1, and A when i = 0. And glue lets you move elements along that isomorphism, so it lets you create identity types that match up corresponding elements of the isomorphic types.

The last very cryptic thing is "formulas" in the context. There is pretty much no explanation for them, but they are used in Glue and comp. If you read some cubicaltt programs, you'll see that they look completely different there, like "i = 0". So what I think they are is a way to do case analysis on interval elements in a controlled way. The rules for using them ensure that you can only 'branch' in ways such that the result of each branch will be judged equal in a joint context. The formula 'i' corresponds to the case 'i = 1' and the formula '1 - i' corresponds to the case 'i = 0'. The weird "u_{φ,ψ}v" notation is a way to collect up two branches into one expression, and it's a value that reduces to u when φ = 1 and v when ψ = 1.

The last problematic thing about formulas is how you even derive things with them in the context. I think one important thing that isn't explained is that you can probably weaken by formulas in some way. Like 'Γ ⊢ a : A' yields 'Γ, φ ⊢ a : A', possibly with side conditions. Or perhaps 'Γ ⊢ a : A' yields 'Γ, 1 ⊢ a : A'. Structural rules like this aren't given, though.

The last thing to understand is comp, but I already covered that in my last post. However, my explanation there might not give the clearest idea of how things work exactly. Namely, a 'x = y -> P x -> P y' type substitution is not fundamental in cubical type theory. To get that sort of substitution, you first prove 'P x = P y' and then use an operation 'comp' that takes 'A = B' to 'A -> B'. So, you never need to inspect a function like P, you only need to inspect its application to things. You can build expressions that demonstrate (effectively) 'P x = P y' in completely abstract contexts, using (interval) lambda and application, and induction on types is used to figure out how those lambda terms need to compute on values.
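
For people who like seeing the 'A = B to A -> B' move in a proof assistant, it is analogous to cast in Lean (analogy only: cubical comp has genuinely different computational behavior, and never needs to inspect the proof of equality):

```lean
-- Transporting along an equality of *types*, without inspecting
-- how the equality was built (cf. comp taking A = B to A -> B).
example (A B : Type) (h : A = B) (a : A) : B := cast h a
```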

The last notational bit about comp is that it binds a variable. The hypotheses of comp formation include 'i : 𝕀' in the context, and then 'comp^i' binds i for its subexpressions. But 'fill', which is defined in terms of comp, and uses a similar notation, does not bind its superscript.

I think that's about it. With the above, I think it's possible to get a feeling for the cubicaltt notes on a sort of operational level (figuring out why the computation rules for comp are what they are is then the difficult part). Whether it's any good as a model of homotopy stuff, I can't say, because I know very little about homotopy/topology. But maybe it can help some other folks like me who are more interested in the computational aspects than a syntactic theory of a branch of mathematics (or maybe it will help the latter folk, too; who knows).