Friday, November 25, 2016

Getting the semantics right

In natural language "Getting the semantics right" means making sure other people understand you. It means picking the right words and putting them in the right order, taking into account your intended audience, of course.

In another sense you always get the semantics "right" in natural language. The semantics of a sentence is what it is: the perceived meaning of the sentence. There may be more or less of it, however. You don't choose the semantics; it chooses you. Your sentence could be ambiguous, or (which is not the same thing) it could mean different things to different people. So yes, getting the semantics wrong means saying something that is either not understood at all, or easily misunderstood. ("Oh Lord, please don't let me …")

In programming "Getting the semantics right" is a more difficult thing. Not only do we have to pick the right statements to get our intention across to the machine.  We must INVENT new words, new APIs. The name of an API-method is like a word, a call to that API with some arguments is like a sentence.

Getting the semantics of new API-words and sentences right means that their meaning, their effect, must be easily understood in relation to other existing and new words and sentences and the ways of using them. Getting the "API sentences" right is sometimes called "Fluent programming style" (https://en.wikipedia.org/wiki/Fluent_interface).

"Fluent" means easy to read, easy to understand, using exactly the right combination of words to convey your meaning with nuance and precision. Being fluent means using the language of your choice like a native speaker would.  In programming we must invent languages that _can be_  used fluently. But in programming there's more to it than just fluency because your API is creating a new language, there are no native speakers of it.


We could name our APIs arbitrarily, say with increasing numbers: API-1, API-2, etc. Then "getting semantics right" would simply mean calling the right APIs. But we would never do that. Compilers do that. They turn our meaningful API-names into addresses of binary code entry-points.

Making API-method-names and the possible ways of using them easy to understand means they must correspond to some real-world analogue. Example: "Search-Trees". When they do, we can understand them in terms of that analogue, consistent with how we understand other words and sentences in terms of the same model. But it's not easy to come up with a cohesive set of new words where understanding one term helps to understand the others.
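As a sketch of what such an analogue buys you (the class and method names below are my own, not from any real library), consider a search-tree whose whole vocabulary comes from the tree picture: once you understand that values branch left or right from a root, names like insert and contains more or less explain each other.

 // A minimal binary search tree; the names borrow from the tree analogue.
 class SearchTree {
   constructor() { this.root = null; }
   insert(value) {
     const node = { value, left: null, right: null };
     if (!this.root) { this.root = node; return this; }
     let current = this.root;
     while (true) {
       if (value < current.value) {
         if (!current.left) { current.left = node; return this; }
         current = current.left;
       } else {
         if (!current.right) { current.right = node; return this; }
         current = current.right;
       }
     }
   }
   contains(value) {
     let current = this.root;
     while (current) {
       if (value === current.value) return true;
       current = value < current.value ? current.left : current.right;
     }
     return false;
   }
 }

 const tree = new SearchTree();
 tree.insert(6).insert(9).insert(3);  // insert() returns this, so it chains fluently
 console.log(tree.contains(9));       // true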

In summary: in programming, getting the semantics right means not only calling the right APIs but also coming up with API-method-names that are meaningful and consistent with the other API-methods we "invent". In programming there's more to "getting semantics right" than just picking the right words in the right order. #fluent_programming



© 2016 Panu Viljamaa. All rights reserved

Wednesday, November 16, 2016

Why Programming Is Difficult

Programming is difficult because you cannot understand a program piecemeal. You cannot ultimately "Divide and Conquer" a program, although you must try your best.

You try to divide your program into independent "modules" to manage the complexity of understanding it. But that cannot fully solve the problem, because what the program does is caused by the interaction of its components.

There is no single component that would or could be "The Interaction". Each component must do its part to interact with the others. In doing so it must assume something about those other parts. This makes it difficult to change any part of the program: changing it can break the assumptions other parts make about it.
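Here is a toy sketch of what such a hidden assumption can look like; the functions are hypothetical, made up for this illustration:

 // Part A: today this happens to return the scores in ascending order.
 function getScores() {
   return [10, 25, 40];
 }

 // Part B silently assumes getScores() is sorted ascending and takes
 // the last element as the maximum. Nothing in the code says so.
 function highestScore() {
   const scores = getScores();
   return scores[scores.length - 1];
 }

 // If Part A is later changed to return [40, 10, 25], it still "works"
 // on its own, but highestScore() quietly starts returning 25.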

Software Engineering is about trying to devise ways to divide programs into minimally dependent parts; to divide the problem of creating programs into multiple, more or less independent, parts; and to make it easy to understand what those parts are, what their dependencies are, and how they interact. That's a mouthful, right? But that just indicates what a big task it is.

In the end this cannot be done as well as we'd like, because each part must interact with the others, and to do so it must encode assumptions about them.

In this way Software Engineering is like other engineering disciplines. We're trying to make programs stronger, less likely to collapse like a house of cards; cheaper, faster, easier to maintain, adapt and reuse for different purposes. There is no ultimate solution to this, only better solutions to be discovered. No Sorcerer's Stone. No mythical Silver Bullet to kill the "Beast of Interactions". But there is progress, I believe, as we gain a better understanding of what Software Engineering is, what it can and cannot do.


© 2016 Panu Viljamaa. All rights reserved

Saturday, November 5, 2016

If 6 turned out to be 9

You may have heard the smash hit "If 6 was 9" by Jimi Hendrix (if not, there's a link below). In my JavaScript code I have statements like this:

if (6 == 9)
{ // ... then something
}

You might rightfully wonder why.  What is this guy up to? Has he gone off the edge, or even "jumped the shark"?

The reason I have code like that is that Visual Studio marks all unreachable code as errors, shown as red dots in the scroll-bar. While I think that's a great feature in an IDE, one I'd rather have than not have, I don't quite agree that unreachable code rises to the level of an error. It doesn't break things; the program still runs, perhaps even correctly. It's good to know the dead code is there, so it can be removed before going to production. But it's also good to know it was left there on purpose, for the time being.

The reason I often don't remove unreachable code (right away) is that it may be an example of good working code. Often it contains an earlier version of something I'm in the process of re-designing. I want to be able to easily switch back to it if needed, if only to see how it did in fact work, when it did. The unreachable code works; it just doesn't do what I want the program to do at the moment. Maybe later. Maybe not. Maybe. You want to keep it around.

It might be NEW code which doesn't quite work yet because of the latest changes I have made, or need to make, elsewhere in the program under development. I still want to be able to test and execute the other parts of my program which do not depend on the as-yet-not-ready code.

I could comment it out. But then I would need to uncomment it when and if I want to run it. The problem with that is that when uncommenting, it is easy to uncomment too much, or too little. You cannot be sure whether what you uncommented ever actually worked. So I often like to keep unreachable code as-is in my program, during development.

The problem with the IDE marking unreachable code as an error is that such "errors" become noise which can hide real errors, real show-stoppers.

I've discovered over time that one of the biggest time-wasters in development can be the continuous restarting of the debugger. If you KNOW there are errors which will cause it to crash, perhaps an undeclared variable, you should fix those errors before restarting the debugger.

You would like the IDE to notify you of any such possible "show-stoppers" before trying to start the show.

And yes it does, for many such errors. The problem arises if you already have 3 "errors" in your error-list, 3 red dots in the scroll-bar, caused by unreachable code. It then becomes hard to notice when a 4th red dot appears because of some real error - one you should fix before wasting time on a debugger restart, only to crash on an undeclared variable, fix that, rinse and repeat. The hour of coding time is soon up, my friend.


To clearly mark unreachable code as INTENTIONALLY unreachable,  I put it inside:  

if (6 == 9)
{ // ... then something
}

Voila! That segment no longer shows as an error in Visual Studio. BTW: had I used 'false' as the condition, the IDE would be "smart" enough to recognize it as unreachable code and would report it as an error. But Jimi saves the day.

When I see this segment of code later it reminds me of the Jimi Hendrix  song, and makes it clear that this, at least, is not my coding error but something I put there deliberately, for a good reason.

Check it out: https://youtu.be/vZuFq4CfRR8


© 2016 Panu Viljamaa. All rights reserved

Thursday, November 3, 2016

How to refer to functions when writing about them

When documenting, or in general writing about or commenting code, what is the best way to refer to functions? I'm thinking in the context of commonly used languages such as JavaScript, Java, C, C#, etc., and about writing about source-code (not writing the code itself).

A function, in JavaScript, is defined like this:

 function foo (argA, argB, argEtc)
 { // some code
 }

When writing about a function like the one above, a common way to refer to it is to write something like "foo() is ...". But there's a problem, an ambiguity, a possibility for confusion here. Does "foo()" refer to the function "foo"? OR does it refer to its RESULT?

To distinguish between the two it might be better to refer to the function as just "foo", and to its result as "foo()". The value of the expression foo(), after all, is the RESULT of the function "foo". But this opens another avenue to confusion. Maybe there is no function "foo" (or should I say "function 'foo()'"?). There might be a VARIABLE named "foo". So, seeing just a reference to "foo", you don't know if I'm talking about a function, a variable, or something else. There might be both a function AND a variable named like that. There might be a file named like that. Or a macro?
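A small hypothetical snippet shows how real the ambiguity is:

 function foo(argA, argB) {
   return argA + argB;
 }

 var aliasForFoo = foo;      // plain "foo" here denotes the function itself
 var result = foo(6, 9);     // "foo(6, 9)" denotes the RESULT: 15

 // And nothing stops another file from declaring a variable with
 // the same name, so that "foo" there means something else entirely:
 // var foo = 42;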

It would be good if names somehow indicated what type of thing they are referring to, to make things clearer. But naming my function "fooFunction" (or should I say "fooFunction()"?) would be way too verbose. We want non-ambiguity but ALSO brevity. In fact you could say something like:

  CLARITY  =  BREVITY   -   AMBIGUITY

So what's the best way to refer to a function, as opposed to its value, or a variable named like it? I've come to this conclusion. It is to make the reference have this form:

 foo(){}

"foo(){}" is short for "function foo(some args) {some code}".  You can not confuse it with a variable named "foo",, and you can not assume it refers to the result of function - because it does not end in ().

It clearly looks like a function, just with some parts of it omitted, for clarity. And you know, if it walks like a duck, and so on. And it is brief enough to write multiple times even within a single comment or paragraph.


© 2016 Panu Viljamaa. All rights reserved

Thursday, May 26, 2016

On the Nature of Software Development

I'll get straight to the point. What most non-developers and novice developers probably fail to appreciate is what makes software good, what makes software "high quality". What makes software good is continuous, incremental, iterative development. It is like those ancient Japanese swords that the sword-smiths produced through countless iterations of annealing their steel.

I can't say I know the viewpoints and thoughts of all non-developers and end-users, but I suspect they don't fully appreciate this fact because they can't see how many iterations the end-product they use has gone through. Like the user of an ancient Japanese sword who can appreciate its superb quality, they can't see beyond its excellence to what it took to create it.

A good software product serves many users. That's how it survives: by being fitter than others. It must find its ecological niche if it is to survive. All of those users have a different usage-profile for any given product. For the software app to become successful it must serve many such user-groups well. But for its original developer it is impossible to know beforehand what those usage-profiles might be and what their requirements might evolve into in the future.

This is because the world changes all the time. Competing products emerge which perhaps do some of those features better, causing users for whom that feature is a high priority to switch. But the developer cannot know what the current and future competition is up to. They can only make an educated guess, which is more or less wrong. Only as users start using the software does it become evident how it could actually serve them better.

In the end a mature software application becomes as comfortable as the old brown shoe the Beatles sang about. Like an old brown shoe, a mature software application eventually starts bursting at the seams, because the platform it is built on becomes outdated. Maybe it was COBOL before the web. But there is no wear and tear other than platform-erosion in the virtual world, so it can stay popular long after its heyday. Ideally, by the time its original platform becomes outdated it has reincarnated itself onto the newer platforms of the day.

Anyway: non-developers, managers, the sales-team, what have you, probably fail to fully appreciate what makes a software-application great: its incremental, continuous, iterative improvement and adaptation to the needs of its different user-groups, over time.



© 2016 Panu Viljamaa. All rights reserved

Tuesday, September 1, 2015

How I got my 40 GB disk-space back from Windows

1. THE C-DRIVE  
I was using the utility "Piriform Defraggler" to defragment the C-drive on an older Windows Vista PC. A nice thing about Defraggler is that it shows you the list of files it was NOT able to defragment, for whatever reason. This list also often reveals some of the very big files on the disk, because those are the ones that are hard to defrag, I think. Or maybe it's because some system process has locked them up for its own use.

So I saw there was a 15GB file in the System Volume Information folder that would not defragment. The System Volume Information folders exist for storing the System Restore snapshots, which Windows creates automatically or on demand, right? Plus they may contain some other stuff, like "shadow copies", I don't know about. So it looked like System Restore was taking up 15GB of my disk, which I think is too much. Yet I had already asked the Disk Cleanup utility to delete all but the latest snapshot, so it was a bit puzzling.

Here's how I solved the problem and got my disk-space back: I disabled System Restore totally. The 15GB file was immediately gone. I then re-enabled System Restore and created a single new restore point. The 15GB DID NOT COME BACK! The size of the System Volume Information folder on my C-drive is now 400 MB.

Now, do we really need that additional free space so much? Well, if the disk is getting full, there may not be enough free space even to defragment it any more. At that point this trick can help.

SOLUTION 1/2:  Disable System Restore totally, temporarily.  Then re-enable it and create a new restore-point. This will often (?) delete all old restore points that somehow are still hanging around.  At least on older Windows systems.



2. THE D-DRIVE
Having realized I'd been carrying 15GB of dead weight on my C-drive for who knows how long, I thought: could there perhaps be something on the D-drive as well that was taking space for no good reason? I don't usually have the "Show Protected System Files" option on, but since I now did, I switched to the D-drive and could see it too had a System Volume Information folder. This one turned out to be even bigger: 25 Gigabytes. Holy Matrimony! I had never before even suspected this kind of thing was going on behind my back.

So I tried the same techniques as on the C-drive and more, turning System Protection on and off on the D-drive, but it wouldn't go away. There was no way I could delete it either; trying to make myself its "owner" didn't work, access was denied. Booting into "Safe Mode" did not help. And none of the cleaning utilities I'd been deploying, like "CCleaner", ever told me this wasted space, this dead weight, was there, about to sink my ship.

A DISCLAIMER: I don't necessarily recommend that you do what I describe next, what I did to solve my problem. If you do, remember: you are solely responsible for your own actions. But I did find a way to reclaim that further 25GB as well. Buyer beware:

SOLUTION 2/2: I downloaded the "Knoppix Live CD" Linux distribution from the web, burned it onto a CD and rebooted the PC from that CD. Meaning: I booted into Linux. After that everything was relatively easy. I navigated to the D-drive and DELETED the System Volume Information folder. Windows was not there to prevent me from doing it; Windows was in deep sleep while I surgically removed this large chunk of "dead tissue" from it. I then removed the CD from the tray and rebooted again. Everything worked fine, and now I have 15 + 25 Gigabytes more space on my PC.

This is especially good for the system-drive C:\ since it was about to run out of space, meaning it would have been difficult or impossible to defragment any more, meaning it would have just kept on fragmenting and thus getting ever slower. And the system-drive is where speed matters, because it affects how fast your Windows reacts, starts, or even shuts down.

15 + 25 GigaBytes freed. Not a bad day :-)




 © 2015 Panu Viljamaa. All rights reserved

Saturday, August 22, 2015

The Paradox of Agility

In Software Development "Agility" means that you try to improve your process continually, be aware of it, measure it, learn from your mistakes. Right?

To be able to do this you have to repeat your process again and again. You need to have a specific metric within a specific process you try to improve. The unspoken assumption is you want to have a "repeatable process", so you have something against which to measure the effect of changing some process-parameters.

Say we try to improve our estimates of how much we can accomplish in the next 2-week "sprint". The fixed, repeating part is that we do two-week sprints. When you try to optimize the outcome, it makes sense to keep most parts of the process fixed and vary just a few parameters in small increments, to see how that affects the outcome.

That is the Paradox of Agility. You try to keep your process well-defined and repeatable in order to learn from it. But if you keep the process constant, it cannot change. And if you cannot change it, how can you improve it? Incrementally, yes. But small incremental change is not agility. Agility literally means "moving quickly and easily" (http://dictionary.reference.com/browse/agility).
  
The practice of incremental improvement sounds like a recipe for finding a local maximum. By adjusting your process-parameters, your "bearings", you move upwards or downwards or sideways on the slope of a hill. You can always measure your altitude, or your "burn-rate", and so try to maximize it. Eventually you will end up on top of a hill, having rejected the process-parameters that took you downwards and kept the ones that took you up.

Now you are on top of the hill, running your optimal Scrum process. Great. The only problem is it is a hill, but not The Hill.  You're on top of a small one.  Or at least you may be, you just don't know.

Take the agile SW development methodology "Scrum" as an example. Scrum is "iterative and incremental"; I would add "highly structured". It has 2-week "sprints" which have phases and events like "Sprint Planning", "Daily Scrum", "Sprint Review" and "Sprint Retrospective".

What I think is the greatest part of Scrum is the Sprint Retrospective. It "... Identifies and agrees continuous process improvement actions".

In other words, you are trying to continually improve your process by repeating it over and over, so as to learn the best ways to apply it. This is the Paradox: you need to repeat it to improve it, but if you repeat it, how can you change it? If you can't change it much, you can't improve it much. You can improve it, but only incrementally. And when you reach a local maximum, any further incremental change can only take you down.

One way to look at it is you are playing a game, and you can improve your game. Get a better score. But can you, should you, change the rules of the game? Why not, if other players agree.
 
So what can be done? Not much. Ending up at a local maximum is a well-known problem in operations research. It can be counteracted with techniques like "Simulated Annealing" (https://en.wikipedia.org/wiki/Simulated_annealing). How to apply that to process-improvement might be an interesting research-project for a PhD student.
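To show the idea, here is a toy simulated-annealing sketch in JavaScript; the score function and every parameter in it are invented for this example, so treat it as a cartoon rather than a recipe. The one thing to notice is that the annealer sometimes accepts a move downhill, which is exactly what lets it walk off a small hill and find a bigger one.

 // score(x) has a small hill near x = 2 and a taller one near x = 8.
 function score(x) {
   return Math.exp(-(x - 2) ** 2) + 2 * Math.exp(-(x - 8) ** 2);
 }

 function anneal(start, steps, startTemp) {
   let x = start;
   let temp = startTemp;
   for (let i = 0; i < steps; i++) {
     const candidate = x + (Math.random() - 0.5); // a small random tweak
     const delta = score(candidate) - score(x);
     // Always accept improvements; sometimes accept downhill moves, with
     // a probability that shrinks as the "temperature" cools. Pure
     // hill-climbing is the temp = 0 case: it never moves downhill, so
     // it stays on the first hill it finds.
     if (delta > 0 || Math.random() < Math.exp(delta / temp)) {
       x = candidate;
     }
     temp *= 0.999; // cooling schedule
   }
   return x;
 }

 console.log(anneal(2, 5000, 1.0)); // often ends near 8, the taller hill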

If there's not much we can do, there is one thing we can do: think. Think about different process-models: Scrum, Kanban, or Home-Grown? Understand that while a repeatable process sounds like a great goal, it can lead to stagnation and local optima. In my next post I plan to present an alternative to Scrum, to emphasize that we can at least THINK of other process models, other hills than the one we're currently on. That may be the biggest benefit of having process models: they allow us to think about how they could be different.


 © 2015 Panu Viljamaa. All rights reserved