Archive for December, 2010

Optimisation

December 22, 2010

There was a big fat cloggage in the compilation process.

The tokenisation routine used to rely on regular expressions. Regular expressions are, I’m sure, the “proper” way to do it, but I just couldn’t figure out all that cryptic mess. I had one regex to split the source into space-separated words and “string values”, but then had to run further regexes over those results to break the code up into the correct-tokens/that@may^be*like+this, which was abhorrently slow. So I replaced it with a basic single pass that uses System.IO.StringReader to read each character and collect tokens along the way.
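To give a rough idea of what a single-pass tokeniser looks like, here is a minimal sketch in Python rather than the .NET code the project actually uses. Python's io.StringIO stands in for System.IO.StringReader, and the operator set, the double-quote string delimiter and the name tokenise are all assumptions made for illustration, not the real routine.

```python
import io

# Assumed set of single-character operator tokens; the real language may differ.
OPERATORS = set("/@^*+-=()<>.,")


def tokenise(source):
    """Split source into words, quoted strings and operators in one pass."""
    reader = io.StringIO(source)
    tokens = []
    word = []

    def flush():
        # Emit the word collected so far, if there is one.
        if word:
            tokens.append("".join(word))
            word.clear()

    ch = reader.read(1)
    while ch:
        if ch == '"':
            # Collect everything up to the closing quote as one token.
            # An unterminated string is simply closed at end of input.
            flush()
            literal = ['"']
            ch = reader.read(1)
            while ch and ch != '"':
                literal.append(ch)
                ch = reader.read(1)
            literal.append('"')
            tokens.append("".join(literal))
        elif ch.isspace():
            flush()
        elif ch in OPERATORS:
            flush()
            tokens.append(ch)
        else:
            word.append(ch)
        ch = reader.read(1)
    flush()
    return tokens
```

Each character is looked at once and there is no backtracking, which is presumably where most of the speed-up over the nested regexes comes from.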

The speed increase was dramatic, so I probably don’t even need to have that pre-compilation functionality anymore.

The other bottleneck I have in my sights is the way variables are stored and retrieved. At the moment they are just being added and removed ad-hoc as the parser executes. Doing this each time a custom column attribute is referenced has a noticeable performance impact since variables are added, referenced, and removed each time through thousands of loops. The solution would be to do what any decent compiler writer would suggest to me and use a symbol table.

This involves building a table of variables at compile time, noting each one’s name and the scope it belongs to. During execution the table is only consulted: values are read or altered as need be, and a reference from a scope where the variable isn’t visible raises an error.
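A symbol table along those lines could be sketched like this. It is Python rather than .NET, and SymbolTable, open_scope, declare and resolve are hypothetical names invented for the example. The idea is that every variable gets a slot number once, at compile time, and execution then only indexes into a flat list of values.

```python
class SymbolTable:
    """Maps (scope, name) to a slot index, built once at compile time."""

    def __init__(self):
        self.slots = {}    # (scope_id, name) -> slot index
        self.parents = {}  # scope_id -> parent scope_id (None for the root)

    def open_scope(self, scope_id, parent=None):
        # Record where this scope sits in the scope hierarchy.
        self.parents[scope_id] = parent

    def declare(self, scope_id, name):
        # Hand out the next free slot the first time a variable is seen.
        key = (scope_id, name)
        if key not in self.slots:
            self.slots[key] = len(self.slots)
        return self.slots[key]

    def resolve(self, scope_id, name):
        # Walk outwards from the referencing scope; an unknown or
        # out-of-scope name is reported here, before anything executes.
        current = scope_id
        while current is not None:
            slot = self.slots.get((current, name))
            if slot is not None:
                return slot
            current = self.parents[current]
        raise NameError(f"'{name}' is not visible in scope '{scope_id}'")


# Compile time: declare variables and resolve references to slot numbers.
table = SymbolTable()
table.open_scope("template")
table.open_scope("loop", parent="template")
table.declare("template", "total")
table.declare("loop", "column")
total_slot = table.resolve("loop", "total")

# Run time: values live in a flat list indexed by slot, no name lookups.
values = [None] * len(table.slots)
values[total_slot] = 0
```

Nothing is added or removed inside the loop any more; the thousands of iterations only read and write list slots.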

How I handle scope right now, which is embarrassingly inefficient, is to add the variable when entering its scope and remove it again when popping out of that scope. When a variable is referenced within a scope and it’s not there, the lookup tries again in the scope above.

You probably have already guessed how I did that due to slipping the word ‘try’ in there.

Try
  Look for variable
Catch
  Look for variable in scope above, because Steve is a n00b.
End Try

That’s a real /* WTF */ right there.
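For contrast, the same “look in the scope above” behaviour can be had without using an exception for control flow. This is only a sketch in Python with a made-up Scope class, not the actual fix: each scope keeps a link to its parent and the lookup walks the chain with plain membership tests.

```python
class Scope:
    """A block of variables with a link to the enclosing scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.variables = {}

    def lookup(self, name):
        # Check this scope first, then each enclosing scope in turn.
        scope = self
        while scope is not None:
            if name in scope.variables:
                return scope.variables[name]
            scope = scope.parent
        # Only a genuinely undefined variable ends up as an error.
        raise KeyError(name)


outer = Scope()
outer.variables["table_name"] = "customer"
inner = Scope(parent=outer)
inner.variables["column_name"] = "id"

found = inner.lookup("table_name")  # resolved from the scope above
```

The exception is now reserved for the case that really is exceptional, a variable that doesn’t exist anywhere.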

Categories: General

Parsing, Compiling

December 1, 2010

Implementing this idea of type transformations has been quite an intense mental exercise for a fairly average programmer such as myself. I have decided to make it scriptable like the templates. It is pretty simple syntax supporting only variable (entity attribute) assignment, and if-else-elseif logic branching.

It turned up some ugly skeletons in the metadrone closet since it’s been a steep learning curve in the field of code compilation. I’m experimenting and wikipediaing as I go along because I didn’t cover lexical analysis or parsing or syntax trees or any of that brainiac stuff at university.

As it turns out (embarrassingly obvious in hindsight), string comparison on the fly can be very expensive, and it’s pretty noticeable after a few thousand loops, not to mention when it’s used in the wild at work with real live templates.

So I’ve been brewing the code again, sifting through it and trying to compile as much of the template source up front as possible. Tokenising is only the start of it: it’s more efficient to evaluate enumerations or integers than the source strings themselves.
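As a small illustration of evaluating on enumerations instead of strings, here is a sketch in Python. TokenKind, KEYWORDS, classify and count_branches are names invented for the example, and the keyword list is only a guess based on the if-else-elseif and assignment syntax mentioned above. The string comparison happens once per token at compile time, and the loop that runs thousands of times only compares enum members.

```python
from enum import Enum, auto


class TokenKind(Enum):
    IF = auto()
    ELSE = auto()
    ELSEIF = auto()
    ASSIGN = auto()
    IDENTIFIER = auto()


# Assumed keyword spellings; anything else is treated as an identifier.
KEYWORDS = {
    "if": TokenKind.IF,
    "else": TokenKind.ELSE,
    "elseif": TokenKind.ELSEIF,
    "=": TokenKind.ASSIGN,
}


def classify(tokens):
    """Compile step: the only place the source strings are compared."""
    return [(KEYWORDS.get(t.lower(), TokenKind.IDENTIFIER), t) for t in tokens]


def count_branches(compiled):
    """Execution-style pass: dispatches on the enum, never on the text."""
    return sum(
        1 for kind, _ in compiled if kind in (TokenKind.IF, TokenKind.ELSEIF)
    )


compiled = classify(["If", "x", "=", "y", "ElseIf", "z", "Else"])
branches = count_branches(compiled)
```

The pay-off is likely to be bigger in a compiled .NET language than in Python, but the shape of the change is the same.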

Maybe I should just be a smart ass and compile it straight to MSIL bytecode while I’m at it.

Anyway, seriously, the transformations are looking very nice so far and even execute nimbly enough. But there are some optimisations that still need doing to make it a more robust experience. You want the confidence of being able to generate hundreds of thousands of lines of code without waiting 30 seconds for it.

Categories: General