Thursday, 29 September 2022

Are You Sure?

 A computer app will often throw up a message that says something like "Are you sure?" and you can click yes or no. I wanted to take some money out of a building society and the "yes" option of "are you sure?" opened up a boring side-quest IRL and the more I think about it the more ridiculous it seems.

Are you sure?
Download this pdf file and take it somewhere with a printing machine. Print the file onto a sheet of paper. Write your name briskly with an ink pen on the paper. Obtain another sheet of paper that has been folded and glued into a very specific shape. Put the first sheet of paper inside the second. Mark the second piece of paper with this exact string of symbols. Obtain a small sticky piece of paper that has a portrait of the last Queen on it. Stick the third piece of paper onto the second. Drop the whole bundle into a giant red cast-iron cupboard that you will find bolted down in the street.

Richard "paperless office" B

Tuesday, 20 September 2022

Mimosa

 In the early 90s my father built a very accurate model of the boat that he owned at the time. It was an 18' Plymouth Pilot called Mimosa. The building of the model was significant to me because it divided my life into two parts: The part where I didn’t understand the lines drawing of a hull shape, and the part where I could read them quite easily.

The model has been gathering dust and dirt and pieces have got broken or gone missing since my father's death. I inherited this model as part of my mother's estate and I have been cleaning and restoring it. The boat has a cuddy at the front and there are grab rails on its roof. On the model the clearance between the bottom of the grab rail and the top of the roof is so small that it was very hard to clean. The gap was far too small for a cotton bud, and a bit too small for a pipe cleaner. The answer was a small size interdental brush, but when I was wandering around the house trying to find something to squeeze into this tiny gap I remembered something. The gap on the real boat was also too small! If you walked up the sidedeck and quickly took hold of the grab rail you would skin you knuckles on the top of the cuddy. That's really fantastic model building!



Richard "repair shop" B


Saturday, 3 September 2022

Finite State Machine Tokenizers

"If you solve a problem with a regex, now you have two problems"

I have a dream! That one day software developers who have to do string processing will be just as likely to reach for a finite state machine as they are to start writing regular expressions. FSM's are not a magic bullet, but for some problems they're perfect: Easy to write, easy to understand, easy to test.

For the sake of this article, lets say that we want to grab the comments out of a c or java source file. These are the rules:

  • A line comment starts with a double slash and ends at the first newline
  • A block comment starts with slash star and ends at the first star slash
  • A double slash within a string literal does not start a comment
  • Slash star within a string literal does not start a comment
  • A string literal starts with doublequote and ends at the first double quote
  • A string literal does not end at a double quote preceded by a backslash
This is the definition of a FSM tokenizer that grabs the comments. This is not a real format, but I think it might be valid YAML. I also think it could be a fascinating project in java to implement a general purpose FSM tokenizer where the FSM is defined in yaml.

Start in the CODE state and for each character in the string do what your current state tells you, including moving to a new state.

---
CODE: {
'/': ONESLASH
'"': INQUOTE
?: CODE
}

ONESLASH: {
'/': LINECOMMENT
'*': BLOCKCOMMENT
?: CODE
}

LINECOMMENT: {
'\n': [endToken, CODE]
?: [appendTheChar, LINECOMMENT]
}

BLOCKCOMMENT: {
'*': BLOCKSTAR
?: [appendTheChar, BLOCKCOMMENT]
}

BLOCKSTAR: {
'/': [endToken, CODE]
?: [append:'*', appendTheChar; BLOCKCOMMENT]
}

INQUOTE: {
'"': CODE
'\\': ESCAPE
?: INQUOTE
}

ESCAPE: {
'"': INQUOTE
?: INQUOTE
}

Some of you are already familiar with this method of string processing. Some of you think that this type of FSM is better expressed with the characters at the root level. That is, all the state transitions that can be triggered by (say) a slash are grouped together, rather than all the transitions that start in (say) a quoted string being grouped together. "You're wrong and I hate you".

One of my teams has recently started to use the language Kotlin, so over the bank holiday weekend I gave myself a personal coding project: To write a FSM tokenizer in Kotlin and to make it as easy to read as I could. This what I ended up with.

/**
 * base class for Finite State Machine based tokenizers (lexers)
 */
abstract class FSMTokenizer {
    var token = StringBuilder();    // the current token
    var tokens = ArrayList<String>(); // all the tokens

    operator fun plus(c: Char?){    // add a character to the current token
        token.append(c);
    }

    operator fun plus(s: String){   // add a string to the current token
        token.append(s);
    }

    fun split() {                   // end one token and start the next
        tokens.add(token.toString());
        token = StringBuilder();
    }

    fun tokenize(s: String): List<String> { // feed the characters of a string into a Finite State Machine
        for (c in s) {
            step(c)
        }
        step(null) // the FSM is given a null after the last character - it might have some tidying up to do
        return tokens;    
    }

    abstract fun step(c: Char?)     // deal with one character. You need to implement this yourself
}

/*
 * tokeniser which grabs only the comments out of a java source file
 */
class FSMComments(): FSMTokenizer() {
    var state: States = States.CODE; // start the machine in the CODE state

    override fun step(c: Char?) = state.step(this, c) // pass this tokenizer into the enum class

    enum class States {                    // FSM model
        CODE, ONESLASH, LINECOMMENT, BLOCKCOMMENT, BLOCKSTAR, INQUOTE, ESCAPE;
        fun step(tokens: FSMComments, c: Char?) {
            tokens.state = when(tokens.state) { // our implementation of step() updates the state of the machine
            CODE ->
                when (c) {
                    '/' -> ONESLASH
                    '"' -> INQUOTE
                    else -> CODE
                }
            ONESLASH ->
                when (c) {
                    '/' -> LINECOMMENT
                    '*' -> BLOCKCOMMENT
                    else -> CODE
                }
            LINECOMMENT ->
                when (c) {
                    '\n' -> {tokens.split(); CODE}
                    else -> {tokens+c; LINECOMMENT}
                }
            BLOCKCOMMENT ->
                when (c) {
                    '*' -> BLOCKSTAR
                    else -> {tokens+c; BLOCKCOMMENT}
                }
            BLOCKSTAR ->
                when (c) {
                    '/' -> {tokens.split(); CODE}
                    else -> {tokens+'*'; tokens+c; BLOCKCOMMENT}
                }
            INQUOTE ->
                when (c) {
                    '"' -> CODE
                    '\\' -> ESCAPE
                    else -> INQUOTE
                }
            ESCAPE ->
                when (c) {
                    '"' -> INQUOTE
                    else -> INQUOTE
                }
            }
        }
    }
}

/*
 * tokeniser which splits a string on commas. It eats leading whitespace and multiple commas
 */
class FSMCommas(): FSMTokenizer() {
    var state: States = States.COMMA;    // start the machine in the COMMA state

    // extra operators can be implemented here

    override fun step(c: Char?) = state.step(this, c)

    enum class States {                    // FSM model
        TEXT, COMMA;
        fun step(tokens: FSMCommas, c: Char?) {
            tokens.state = when(tokens.state) {
            TEXT ->
                when (c) {
                    ',', null -> {tokens.split(); COMMA}
                    else -> {tokens+c; TEXT}
                }
            COMMA ->
                when (c) {
                    ',', ' ', '\t', '\n' -> COMMA
                    else -> {tokens+c; TEXT}
                }
            }
        }
    }
}

fun main() {
    val helloWorld = """
        /* The code below will print the words Hello World
        to the screen, and it is amazing */
        #include <stdio.h>
        int main() {
            // printf() displays the string inside quotation
            printf("Hello, World!");
            printf("//shhh this isn\"t a comment");
            return 0;
        }
        /* this block comment has a * in it */
    """.trimIndent();
    val list = """
            one, two,
            three, four""";
    for (t in FSMCommas().tokenize(list)) {
        println(t);    
    }
    println("---")
    for (t in FSMComments().tokenize(helloWorld)) {
        println(t);    
    }
}

Richard "Blogger is rubbish for code samples" B

Monday, 22 August 2022

GDK

 I grew up in Plymouth so I'm quite used to dealing with seagulls. I have pretty much never had an ice cream, pasty or sandwich stolen by them. The trick to eating outdoors while they are around is quite simple: Either put your back against a wall, or form a small huddle with your friends. You are responsible for looking at one sector of the sky and when one swoops down, try to grab it. They hate that and they invariably leave. I don't know what would happen if you actually managed to grab a seagull, I don’t believe it has ever happened in the history of human civilisation. I assume that it would crash the simulation that we live in and/or the nature of God would be revealed to humanity.

 I don’t know how much the herring gulls have been studying the development of aircraft fighter manoeuvres during world war two – but it's more than I would have guessed. One of the most important tactics is maintaining or evading visual contact, and now they know all about it.

 It was recommended to me that I should try German Donner Kebab and I did so on a very hot and sunny day. While I was eating this kebab in the street in the city centre a large herring gull dived down directly out of the sun (where I couldn't look) and tried to knock the kebab out of my hand. He made contact with me and the kebab but only a tiny amount of salad fell to the floor. Either they're getting better, or I'm getting worse.

 Richard "BFM" B

Parasol

 I wish I were one of those people who could make a good first impression, but instead I keep displaying the behaviour of an oddball.

I've got some ginger heritage, so I have to be very careful not to get sunburned. I spend most of Saturday standing in bright sunlight in a paddock with no shelter or shade. I'm English so I don't have a separate umbrella and parasol, so I was walking around on one of the hottest days of the year, in a drought, with an umbrella. Worse the handle broke off and the best temporary repair I could make was to stuff a large screwdriver up the broken shaft and use that instead. I think that the first thing everybody I met probably noticed was the broken umbrella grafted on to a screwdriver handle.

Richard "function over form" B

Monday, 8 August 2022

Socktopus

 I have recently been shamed by a laundry aid. I own a thing that I call a “socktopus” although I think its real name is an Ikea Pressa. It’s a plastic body with eight legs. The whole thing will dangle from the washing line, and each of the eight legs has two captive clothes pegs on it. I use it to dry socks. I have two feet and I change my socks daily, so if I do my laundry weekly then there are two unused pegs on the socktopus. That extra leg gives me one day of leeway in my laundry schedule. If it takes me more than eight days to get round to doing it, then the socktopus is full. Hanging a few socks on the line with individual wooden pegs feels like a punishment, and I can see the disapproval in the cartoonish painted eyes of the socktopus.

Richard “slovenly” B