PanuLogic Software Development Blog: November 2016

Friday, November 25, 2016

Getting the semantics right

In natural language "Getting the semantics right" means making sure other people understand you. It means picking the right words and putting them in the right order, taking into account your intended audience of course.

In another sense you always get the semantics "right" in natural language. The semantics of a sentence is what it is: the perceived meaning of the sentence. There may be more or less of it, however. You don't choose the semantics, it chooses you. Your sentence could be ambiguous, or (which is not the same thing) it could mean different things to different people. So yes, getting the semantics wrong means saying something that is either not understood at all, or easily misunderstood. ("Oh Lord, please don't let me …")

In programming "Getting the semantics right" is a more difficult thing. Not only do we have to pick the right statements to get our intention across to the machine, we must also INVENT new words, new APIs. The name of an API-method is like a word, a call to that API with some arguments is like a sentence.

Getting the semantics of new API-words and sentences right means their meaning, their effect, must be easily understood in relation to other existing and new words and sentences and ways of using them. Getting the "API sentences" right is sometimes called "Fluent programming style" (https://en.wikipedia.org/wiki/Fluent_interface).
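
To make the "API sentence" idea concrete, here is a minimal sketch of a fluent-style API. The class Query and its methods where, sortedBy, take and toArray are made up for this illustration and do not come from any real library. Each method returns a new Query, so the calls chain into something that reads a bit like a sentence.

```javascript
// Hypothetical API, invented for illustration only.
class Query {
  constructor(items) {
    this.items = items;
  }

  // Keep only the items for which the predicate holds.
  where(predicate) {
    return new Query(this.items.filter(predicate));
  }

  // Return a new Query ordered by the given property, ascending.
  sortedBy(key) {
    const copy = [...this.items].sort((a, b) =>
      a[key] < b[key] ? -1 : a[key] > b[key] ? 1 : 0
    );
    return new Query(copy);
  }

  // Keep only the first `count` items.
  take(count) {
    return new Query(this.items.slice(0, count));
  }

  // End the "sentence" and hand back a plain array.
  toArray() {
    return [...this.items];
  }
}

const posts = [
  { title: "Why Programming Is Difficult", day: 16 },
  { title: "Getting the semantics right", day: 25 },
  { title: "If 6 turned out to be 9", day: 5 },
];

// Reads roughly as: "posts where day > 4, sorted by day, take 2".
const earliestTwo = new Query(posts)
  .where(post => post.day > 4)
  .sortedBy("day")
  .take(2)
  .toArray();

console.log(earliestTwo.map(post => post.title));
```

Whether a chain like this counts as "fluent" depends on whether the method names fit together as a vocabulary, not on the chaining mechanics alone.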

"Fluent" means easy to read, easy to understand, using exactly the right combination of words to convey your meaning with nuance and precision. Being fluent means using the language of your choice like a native speaker would. In programming we must invent languages that _can be_ used fluently. But in programming there's more to it than just fluency because your API is creating a new language, there are no native speakers of it.

We could name our APIs arbitrarily, say with increasing numbers, API-1, API-2 etc. Then "getting semantics right" would simply mean calling the right APIs. But we would never do that. Compilers do that. They turn our meaningful API-names into addresses of binary code entry-points.

Making API-method-names and the possible ways of using them easy to understand means they must correspond to some real-world analogue. Example: "Search-Trees". When they do, we can understand them in terms of that analogue, consistent with how we understand other words and sentences in terms of the same model. But it's not easy to come up with a cohesive set of new words where understanding one term helps to understand the others.
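
As a rough sketch of the "Search-Trees" example, here is a tiny binary search tree whose vocabulary (root, leaf, left, right) is borrowed from the tree analogue. The class name SearchTree and its methods insert and contains are my own choices for this illustration, not a reference implementation.

```javascript
// Illustrative only: a minimal, unbalanced binary search tree.
class SearchTree {
  constructor() {
    this.root = null;
  }

  // Add a key; returns the tree itself so inserts can be chained.
  insert(key) {
    const leaf = { key, left: null, right: null };
    if (this.root === null) {
      this.root = leaf;
      return this;
    }
    let node = this.root;
    while (true) {
      if (key === node.key) return this; // already present, nothing to do
      const side = key < node.key ? "left" : "right";
      if (node[side] === null) {
        node[side] = leaf;
        return this;
      }
      node = node[side];
    }
  }

  // Walk down from the root, going left or right at each node.
  contains(key) {
    let node = this.root;
    while (node !== null) {
      if (key === node.key) return true;
      node = key < node.key ? node.left : node.right;
    }
    return false;
  }
}

const tree = new SearchTree().insert(6).insert(9).insert(3);
console.log(tree.contains(9));
console.log(tree.contains(7));
```

Once you know what "root" means here, "leaf", "left" and "right" more or less explain themselves. That is the kind of cohesion that is hard to achieve when there is no ready-made analogue.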

In summary, in programming getting semantics right means not only calling the right APIs but also coming up with API-method-names that are meaningful and consistent with other API-methods we "invent". In programming there's more to "getting semantics right" than just picking the right words in the right order. #fluent_programming

Wednesday, November 16, 2016

Why Programming Is Difficult

Programming is difficult because you cannot understand a program piecemeal. You cannot ultimately "Divide and Conquer" a program, although you must try your best.

You try to divide your program into independent "modules" to manage the complexity of understanding it. But that cannot fully solve the problem, because what the program does is caused by the interaction of its components.

There is no single component that would or could be "The Interaction". Each component must do its part to interact with others. In doing so it must assume something about those other parts. This makes it difficult to change any part of the program: changing it can break the assumptions other parts make about it.
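
A small, made-up example of such a hidden assumption. The function names loadScores, loadScoresNewestFirst and highestScore are hypothetical. One part quietly relies on the order in which another part returns its data, and a seemingly harmless change to the first part breaks the second.

```javascript
// Part A: happens to return the scores in ascending order.
function loadScores() {
  return [3, 6, 9];
}

// Part B: silently ASSUMES ascending order and takes the last element.
function highestScore(scores) {
  return scores[scores.length - 1];
}

// A "harmless" change to part A: return the newest score first instead.
function loadScoresNewestFirst() {
  return [9, 3, 6];
}

console.log(highestScore(loadScores()));            // 9, as intended
console.log(highestScore(loadScoresNewestFirst())); // 6, wrong, yet nothing crashes
```

Nothing in either function's name or signature states the assumption, which is why the breakage is so easy to miss.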

Software Engineering is about trying to devise ways to divide programs into minimally dependent parts, and also to divide the problem of creating programs into multiple more or less independent parts, while making it easy to understand what those parts are and what their dependencies, the interactions between them, are. That's a mouthful, right? But that just indicates what a big task it is.

In the end that cannot be done as well as we'd like, because each part must interact with others: it must encode assumptions about other parts to be able to interact with them.

In this way Software Engineering is like other engineering disciplines. We're trying to make programs stronger, less likely to collapse like a house of cards, cheaper, faster, easier to maintain, adapt and reuse for different purposes. There is no ultimate solution to this, only better solutions to be discovered. No Sorcerer's Stone. No mythical Silver Bullet to kill the "Beast of Interactions". But there is progress I believe, as we gain a better understanding of what Software Engineering is, what it can and cannot do.

Saturday, November 5, 2016

If 6 turned out to be 9

You may have heard the smash hit "If 6 was 9" by Jimi Hendrix (if not there's a link below). In my JavaScript code I have statements like this:

if (6 == 9)
{ // ... then something
}

You might rightfully wonder why. What is this guy up to? Has he gone off the edge, or even "jumped the shark"?

The reason I have code like that is that Visual Studio marks all unreachable code as errors, as red dots in the scroll-bar. While I think that's a great feature in an IDE, one I'd rather have than not have, I don't quite agree that unreachable code rises to the level of an error. It doesn't break things; the program still runs, correctly perhaps. It's good to know the dead code is there, so it can be removed before going to production. But it's also good to know it was left there on purpose, for the time being.

The reason I often don't remove unreachable code (right away) is that it may be an example of good working code. Often it contains an earlier version of something I'm in the process of re-designing. I want to be able to easily switch back to it if needed, if only to see how in fact it did work, when it did. The unreachable code works, it just doesn't do what I want the program to do at the moment, maybe later. Maybe not. Maybe. You want to keep it around.

It might be NEW code which doesn't quite yet work because of the latest changes I have made, or need to make elsewhere in the program under development. I still want to be able to test and execute other parts of my program which do not depend on the as-yet-not-ready code.

I could comment it out. But then I would need to uncomment it if and when I want to run it. The problem is that when uncommenting it is easy to uncomment too much, or too little. You cannot be sure whether what you uncommented ever actually worked. So I often like to keep unreachable code as is in my program during development.

The problem with the IDE marking unreachable code as errors is that such "errors" become noise which can hide real errors, real show-stoppers.

I've discovered over time that one of the biggest time-wasters in development can be the continuous restarting of the debugger. If you KNOW there are errors which will cause it to crash, perhaps an undeclared variable, you should fix those errors before restarting the debugger.

You would like the IDE to notify you of any such possible "show-stoppers" before trying to start the show.

And yes it does, for many such errors. The problem arises if you already have 3 "errors" in your error-list, 3 red dots in the scroll-bar caused by unreachable code. It then becomes hard to notice when a 4th red dot appears because of some real error, one you should fix before wasting time on a debugger restart that only crashes because of an undeclared variable. Fix that, rinse and repeat. The hour of coding time is soon up, my friend.

To clearly mark unreachable code as INTENTIONALLY unreachable, I put it inside:

if (6 == 9)
{ // ... then something
}

Voila! That segment no longer shows as an error in Visual Studio. BTW: Had I used 'false' as the condition the IDE would be "smart" enough to recognize it as unreachable code and would report it as an error. But Jimi saves the day.

When I see this segment of code later it reminds me of the Jimi Hendrix song, and makes it clear that this, at least, is not my coding error but something I put there deliberately, for a good reason.
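
Here is a sketch of how the guard might look around a parked earlier version of some code. The function formatDate and both of its bodies are hypothetical, made up just to show the pattern. Since 6 == 9 is never true, the guarded branch never runs, but it stays in place, syntax-checked and ready to be switched back on. Whether a given IDE or linter flags a constant comparison like this may depend on the tool and its version.

```javascript
function formatDate(date) {
  if (6 == 9)
  { // Earlier version, left here deliberately while the new one is tried out.
    return date.toDateString();
  }

  // Current version: ISO-style YYYY-MM-DD, based on UTC.
  return date.toISOString().slice(0, 10);
}

// Months are zero-based in JavaScript, so 10 means November.
console.log(formatDate(new Date(Date.UTC(2016, 10, 5)))); // 2016-11-05
```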

Check it out: https://youtu.be/vZuFq4CfRR8

Thursday, November 3, 2016

How to refer to functions when writing about them

When documenting or in general writing about or commenting code, what is the best way to refer to functions? I'm thinking in the context of commonly used languages such as JavaScript, Java, C, C# etc., and of writing about source code, not of writing code.

A function, in JavaScript, is defined like this:

function foo (argA, argB, argEtc)
{ // some code
}

When writing about a function like the one above, a common way to refer to it is to write something like "foo() is ... ". But there's a problem, an ambiguity, a possibility for confusion here. Does "foo()" refer to the function "foo"? OR does it refer to its RESULT?

To distinguish between the two it might be better to just refer to the function as "foo", and to its result as "foo()". The value of the expression foo() after all is the RESULT of the function "foo". But this opens another avenue to confusion. Maybe there is no function "foo" (or should I say "function 'foo()'"?). There might be a VARIABLE named "foo". So seeing just a reference to "foo" you don't know if I'm talking about a function, or a variable, or something else. There might be both a function AND a variable named like that. There might be a file named like that. Or a macro?
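
The difference between the function and its result is easy to see in JavaScript itself. This is just a small illustration. The body of foo and the names theFunction and theResult are made up for it.

```javascript
function foo(argA, argB) {
  return argA + argB;
}

const theFunction = foo;     // "foo" without parentheses: the function itself
const theResult = foo(6, 9); // "foo(...)" with parentheses: the RESULT of calling it

console.log(typeof theFunction); // "function"
console.log(typeof theResult);   // "number"
console.log(theResult);          // 15
```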

It would be good if names would somehow indicate what type of thing they are referring to, to make it clearer. But naming my function "fooFunction" (or should I say "fooFunction()"?) would be way too verbose. We want non-ambiguity but ALSO brevity. In fact you could say something like:

CLARITY = BREVITY - AMBIGUITY

So what's the best way to refer to a function, as opposed to its value, or a variable named like it? I've come to this conclusion. It is to make the reference have this form:

foo(){}

"foo(){}" is short for "function foo(some args) {some code}". You cannot confuse it with a variable named "foo", and you cannot assume it refers to the result of the function, because it does not end in ().

It clearly looks like a function, just with some parts of that omitted, for clarity. And you know, if it walks like a duck, and so on. And it is brief enough to write multiple times even within a single comment or paragraph.