And then there were fewer bugs
Oct 15, 2013
11 minute read

Intro

This deals with rock internals, so fasten your seatbelts and expect many weird things along the way. I’m not necessarily proud of the state of the implementation, I’m just rolling with it and trying to improve it gradually rather than throw everything away.

An error out of nowhere

While working on my current game, John Q. Adamant, I was looking to extract a class into another module - this is routine refactoring and shouldn’t be too hard.

The ‘hard part’ might be figuring out which imports you have to add in other modules so that everything still works properly.

So, I created another ooc file, added the few obvious imports I saw, and then stumbled upon this beautiful rock error:

source/johnq/mob.ooc:125:18 error Undefined symbol 'MobType' (Hint: there's such a type in johnq/mob)
            case MobType MOLAR  => updateMolar()
                 ~~~~~~~                        

See, there are a few problems with that:

  1. This part of the code is absolutely not at fault
  2. Is it really telling us to import our own module ?

That drove me crazy. Sure, I could have just retraced where everything was used and find out which imports to add myself, but… that would be crazy!

What if it was some huge project with thousands of classes and doing that by hand was impossible? No, I had a funny-sized example of a nasty bug and I was going to squash it this time.

After investigation, it seemed to be caused by this piece of code in rock’s source code, when resolving function calls:

score := getScore(candidate)
if(score == -1) {
    if(debugCondition()) "** Score = -1! Aboort" println()
    if(res fatal) {
        // check our arguments are all right
        checkArgumentValidity(res)
        // trigger a resolve on the candidate so that it'll display a more helpful error
        candidate resolve(trail, res)
    }
    return false
}

Now a bit of background. (Disclaimer: the following is mostly ugly). Every ooc file is parsed into an AST (Abstract Syntax Tree) which contains various nodes. FunctionCall is one of them. When a program is being compiled, the AST is to be resolved in a phase that I call tinkering but that other call desugaring (even worse right?).

rock’s tinkerer has a maximum of 32 rounds (by default), everything needs to be resolved before that. When it goes round in circle, losing all hope of getting everything to resolve, we enter the fatal round, in which we should throw error.

Now, when something doesn’t resolve, it might block several other parts of the AST from resolving. For example, if we have

a := someFunction()
a doStuff()

We need to infer the type of a - but we can’t do that if we can’t resolve the call to someFunction. If we do errors the naive way it’ll display an error on a, saying it can’t infer the type - but that’s not the real problem! What we really want is to say that the someFunction symbol is not found, because that’s the source of our problems.

When resolving AST nodes, we have what we call a ‘trail’. It corresponds to the hierarchy (parent, grandparent) etc. of the node in a given module. For example, when resolving the a := someFunction() above, we have a trail like this:

\_ Module
  \_ FunctionDecl
    \_ VariableDecl
      \_ FunctionCall

Now, to go back to our original rock code, things make more sense:

score := getScore(candidate)
if(score == -1) {
    if(debugCondition()) "** Score = -1! Aboort" println()
    if(res fatal) {
        // check our arguments are all right
        checkArgumentValidity(res)
        // trigger a resolve on the candidate so that it'll display a more helpful error
        candidate resolve(trail, res)
    }
    return false
}

getScore will return -1 if something unresolved prevented us from resolving a function call. For example, if one of the argument’s type hasn’t been inferred yet, which would prevent us from choosing the right function variant.

Here, when we’re in the fatal round, we trigger a resolve on the candidate so that it’ll display a more helpful error. But this is wrong, all wrong! Now our trail will look something like

\_ Module
  \_ ... something
    \_ FunctionCall
      \_ FunctionDecl - in another module somewhere!

That doesn’t make sense at all, and as a result, completely funky errors were displayed.

So, what should we do instead? Instead of triggering a resolve, just let it be thrown in its natural habitat, e.g. when resolving the module that actually contains the error.

It was more complicated to implement that than it sounds, but the correct error now appears:

source/johnq/mob.ooc:100:29 error Undefined type 'Shot' (Hint: there's such a type in johnq/shot)
    takeDamage: func (shot: Shot) {

Which makes a lot more sense.

A match made in hell

On the way to getting the correct error message, something else went wrong. Something I didn’t expect at all. Match came to fuck things up.

See, match is somewhat powerful, so it’ll allow you to compare object types in cases as long as they have a T matches?(T) -> Bool method. For that matter, there was a piece of code in rock’s source like this:

fCall := FunctionCall new(expr, "matches__quest", caseToken)
fCall args add(caze getExpr())
hmm := fCall resolve(trail, res)
if(fCall getRef() != null) {
    returnType := fCall getRef() getReturnType() getName()
    if(returnType != "Bool")
        res throwError(WrongMatchesSignature new(expr token, "matches? returns a %s, but it should return a Bool" format(returnType)))
    caze setExpr(fCall)
} // etc.

Forgive the very sloppy coding here, but let’s consider things a minute:

  1. rock turns round and round until it finds fatal
  2. stuff that isn’t marked as resolved, will get resolved again and again as we approach the fatal round and including within it
  3. the code above calls fCall resolve in the fatal round for something that might not always succeed!

So, now it threw error in matches that were perfectly legal, just because it was called during the fatal round.

The fix was in two pieces:

  1. First off, start marking stuff as resolved when it is, so that in further rounds, we leave them alone (previously, only Modules were marked as ‘resolved’ completely)
  2. Never try to find a matches? method in the fatal round - we don’t want an error to be thrown here.

And then that led to other kinds of problems…

Tree transforms

See, the problem when you start to mark nodes as ‘resolved’ - and the reason they weren’t before, is that sometimes, the tree gets transformed.

For example, one thing I dislike in ooc but is a necessary evil for the time being, is that due to our C backend, we have two kinds of ‘function pointers’. We have C function pointers, which are just addresses, and we have closures, which are addresses and a pointer to some context struct.

So when we do this:

a := 0

onCallback(||
    // some code here
    a = 3
)

We’re actually passing a closure struct to onCallback - made of the address of the code in the closure, and the address of the closure’s context. In this case, the address of a, which is a byRef, because we modify it inside the closure (I won’t go into too much detail here).

But we also want things like that to work:

onCallback(exit)

Where exit might very well be a C function. So obviously we can’t just pass a pointer where we expect a Closure struct. Which is why we have some code in VariableAccess, which does the following:

  • If we are of type ‘function pointer’, but not a closure
  • And if the parent node expects a Closure struct
  • Then we replace ourselves with a StructLiteral (functionAddress, null)

Thing is, in order to do that, we have to resolve the function first. And for the function to be resolved, it needs to know the types of its arguments. So we have a kind of mexican stand-off here.

The solution was to allow the function to half-resolve even if some type arguments are missing - that’ll give the function a ref (e.g. a FunctionDecl) but with a score of -1, meaning that some things are still missing.

That didn’t quite work yet though - the call resolved correctly but there were still cases where it didn’t work. See, this is perfectly valid code:

getFunction: func -> Func {
    exit
}

Not only this should be handled by autoReturn, but it should also be turned into a StructLiteral. autoReturn did its job, but the VariableAccess was not resolved again with its new Return node parent, which means it never got turned into a StructLiteral.

The solution to this? Add a refresh method to the Node class, from which all AST nodes inherit. (That’s where you see the evilness of mutable structures at work, I guess). It means “hey, your parent has changed, you might want to check your resolve once more”.

And with that, everything was fine again. Until the next problem…

Imports all around

There was one more very annoying thing. Sometimes, we access member variables from types we directly import, like this:

import Dog

dog := Dog new()
dog name = "fido"

In this case, no problem. But sometimes, we access member variables from types we get indirectly:

import DogFactory

dog := DogFactory makeDog()
dog name = "fido"

Due to the way rock (beautifully) handles modules and their translation into one .c file, and two header files (-fwd.h, and .h), this won’t work - but it didn’t get caught until the C compiler ran - which, again, is very annoying, and gives very little info:

/home/amos/Dev/johnqadamant/source/johnq/stages/game.ooc:98:54: error: dereferencing pointer to incomplete type
    if (distSquared < mob radiusSquared) {

Here, johnq/mob should’ve been imported from johnq/stages/game but it wasn’t, and as a result, the C compiler only saw the incomplete, forward declaration of Mob, and as such we couldn’t access the radiusSquared member of the Mob class.

The solution was quite difficult. First off, we have to understand that imports are classified within rock as either ‘loose’ or ‘tight’.

For example, doing this:

import Animal

Dog: class extends Animal { /* ... */ }

…will result in the import Animal import to be classified as tight, because we define a class that extends a class defined in Animal, so we need to access its structures, hence we need to include the .h file rather than the -fwd.h file.

But just using Animal like this:

import Animal

animal := Animal new()

…will leave import Animal as a loose import, only including -fwd.h, which contains forward type declarations and function prototypes.

But rock didn’t warn you if you tried to access a member from a class defined in a module that you didn’t import - and that caused problems. When trying to add a rock error for that (instead of having, say, a GCC error), I stumbled upon another problem: how do we even detect that case?

My first approach was to try and see for each VariableAccess to a member variable, which class it belonged too, where that class was defined, and if we imported the module from the module in which the VariableAccess lived.

But that is incorrect. For example, doing this is perfectly fine:

import Animal, AnimalFactory

dog := AnimalFactory make("Dog")
dog name = "Fido"

Here we’re assuming Dog extends Animal, and that the name field is defined in the Animal class. Since it is directly imported, we can access it without any problem, but the error checking I had just implemented was too stricted, and it triggered errors where there weren’t any.

The solution was, instead of checking imports directly, checking if there is a link, recursively. Let’s assume -> represents loose imports and => represents tight imports, the following case should be legal:

A -> B
A -> E

B -> C
C => D
D => E

Here we’re assuming that modules C, D, E define classes:

  • C which extends D
  • D which extends E
  • E which extends nothing (e.g. Object)

Even though A is dealing with an instance of C, given indirectly through B, the fact that A imports E is enough to access any field of C that was in fact defined in E’s class definition.

The solution for that is, instead of trying to be smart, just to actually base our analysis on loose and tight imports. If there is a ‘tight import chain’ between any of the modules we import and the module in which the class from which the field comes was defined, then we’re fine. Otherwise, we should err.

But this didn’t work at first, because import classification (between tight and loose) was done in the backend only, when everything was resolved! I lost some time trying to do that in the middle-end without realizing it.

Then I just split the classifying logic into some class in rock/middle/algo, made it run after the tinkerer but before the backend, and.. not having a choice, made a VariableAccessChecker, ran whenever the backend encountered a variable access.

Which means that for this particular case, rock might still generate a few C files before noticing that something is wrong with another module, but at least it’ll display a pretty error instead of letting the C compiler choke on it.

The error in all its beauty:

source/johnq/stages/game.ooc:98:55 error Module `johnq/mob` must be imported to access the variable `radiusSquared` from ClassDecl Mob
                                if (distSquared < mob radiusSquared) {
                                                      ~~~~~~~~~~~~~   

The end… for now

So, what should you remember from this article?

  • That it’s really hard to explain compiler internals in a barely 400-lines article, let alone 4 or 5 different bugs
  • That there’s always room for improvement in rock, but things are getting better
  • That working on real-world projects with your own tools (aka dog fooding) is good, yet one still has to pay for the design mistakes they have made in the past.

If you feel very brave, you can delve in the depths of rock’s source. Work happens in branches and discussions happen in issues, so get digging if you’re curious.

With that, I bid you au-revoir!