PanuLogic Software Development Blog: 2013

Sunday, October 27, 2013

The "Hedge" -formatting convention

OR: Why it's better to put the separator in the beginning of the line.

Assume you have a data-structure such as the list of 'months' shown below. When you have such a structure, you often want to add new items to it. That is easiest to do by copying some existing item and then modifying it to suit your purpose. At the same time you hope this will automatically give you another syntactically correct structure - so you don't need to stress your brain much about whether syntax is correct or not, or type much to adjust it. The less you type, the fewer the errors you can make.

When you do that you also typically want to add the new items to the END of the structure. Why? Because that tells you most naturally the order in which the items were added. As an example think of writing down the months of the year as a list. You would most naturally add later months after the earlier ones:

var months =
[ January,
February,
March
];

Looks pretty tidy, right? You have three months already written in correct JavaScript syntax. But now how do you add April to the list?

You can not simply copy one of the existing lines and put it into the end of the list. Or you CAN, but you then come up with incorrect syntax:

var months =
[ January,
February,
March
February,
];

So the above is bad. Not too bad, but there is a better way to format code like this. I call it the Hedge -formatting convention:

var months =
[ January
, February
, March
];

Notice above that because we read code from left to right your eyes will immediately pick the vertical 'hedge' that now gives a visual indication that all content to the right of it is part of the same structure. It is easy to see where this structure starts and where it ends, by following this vertical 'hedge' downwards.

The opening and closing brackets are now PART of that same hedge because they can be aligned with the separator character. It becomes easy to find the closing bracket of any opening bracket.

Am I missing some commas perhaps? That is easy to find out by following the "hedge". Such a hedge needs to be on the left side if the opening and closing brackets are to be part of it. And because we read (most programming languages) from left to right, your eyes will pick up the commas faster the more left they are.
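As a minimal sketch of the layout described above, the following helper prints a flat list in hedge form. The function name toHedge and its parameters are assumptions made for illustration, not part of any library. It shows that the first item follows the opening bracket, every later item follows a comma, and appending an item only adds one line:

```javascript
// Hypothetical helper (an assumption for illustration): renders a flat list
// of already-formatted item strings in the hedge layout.
function toHedge (name, items)
{ var lines = [ 'var ' + name + ' =' ];
  if (items.length === 0)
  { lines.push ('[');                      // empty list: bracket alone on its line
  }
  items.forEach (function (item, i)
  { // The first item follows the opening bracket, all others follow a comma.
    lines.push ((i === 0 ? '[ ' : ', ') + item);
  });
  lines.push ('];');
  return lines.join ('\n');
}

var text = toHedge ('months', [ 'January', 'February', 'March' ]);
console.log (text);
```

Calling toHedge with 'April' added to the end of the array changes the output by exactly one new line, ', April', placed just before the closing bracket.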

With recursive structures the benefits of the hedge -convention are even greater, because it is easy to follow it consistently, and thus make the structure of your data visible and easy to verify syntactically. In the below structure it is easy to add new elements to the structure by copying any existing element AFTER the first, adding the copy to the end of the list, and modifying the copy. Note that most lists contain at least two elements, else there would be little need to have a list.

It is easy in the sense that you will most of the time have a syntactically correct structure and you know it.

var monthsAndWeeks =
{ January:
    [ week1
    , week2
    , week3
    , week4
    ]
, February:
    [ week5
    , week6
    ]
, December:
    [ week52
    ]
};

So why is this so? Can 'left' really be better than 'right'? Why is left better than right? The answer is: Because we read and write from left to right. That convention has implications for the best way to format code.

It's important to see where a new component of a data-structure starts. In the above example that is easy to see because the separator is at the beginning of the line. So reading from left to right we only need to read one character of each line, to know where each component starts.

Tuesday, July 23, 2013

Shipping-Inspired Project Planning

BACKGROUND

In a previous blog-post Linguistic Approach to System Description I wrote about how to specify and describe products, especially software. The post explored what are the important parts of a good specification - to make sure we include them all in about equal and balanced portions.

THIS post is about how to manage and describe the process of implementing a specification. In other words ... this is about Project Planning.

It is important to have a clear description of WHAT is to be produced, and also separately HOW it will be produced. Sounds like a no-brainer, but in real life when doing project planning we need checklists, and rules of thumb.

In particular this post is about project planning for software- and other creative projects. The shipping-metaphor presented below would not apply to "physical projects" like building a bridge. But it would apply to the project of designing the bridge, which needs to be done before it is built.

The earlier post was written from the viewpoint of someone ordering the creation of a software system. The current post addresses the issues of someone producing it. The buyer, and the seller.

DELIVERABLES

I will first describe the four documents I've found valuable in my projects, before explaining what they have in common with the shipping industry:

A) Action-Plan.rtf

Action Plan tells what I (or other team-member) should be working on NOW - or next time they sit on their work-chair. This is not about next week or next month, it is about the great now.

B) Release-Plan.rtf

Release Plan tells what will be in the NEXT release, be it an alpha-, beta- or production release. "Release" means it is given for someone to actually use, be it the developers themselves, test-team, or the public. This document might as well be named "Next Release Plan", since it is all about the next release.

C) Product-Plan.rtf

Product Plan tells what will be in releases that we intend to come after next release.

One specific purpose of the Product-Plan is to prevent you from trying to write one huge plan that includes everything. Having a separate document called Product-Plan.rtf makes you ask: Does this feature belong in the release-plan OR the product-plan? Does a specific feature really need to be in the next release, or could we "ship" the next release without it? That would bring the value of the new features in it to users earlier.

Release-Plan and Product-Plan describe WHAT is planned to be in the product, at some point in future. They don't really describe HOW to get there. The big difference with Action-Plan is that it describes what to DO. Release- and product-plans describe a GOAL.

Implicit in the definition of the above documents is that if you reach the goal stated in the release-plan, you are going in the "right direction". When you do get there, you can, and should re-visit the plans for your next destinations.

So is that all we need? No. There is a fourth one:

D) LogBook.rtf

We are getting close to the shipping-metaphor now. The LogBook.rtf records entries about how and when the project plans were executed, or not. Were they revised, how, when and why?

What was done and why is important information to keep track of. Writing it down helps our learning process and prevents others from making the same mistakes we did. Or us repeating our own mistakes.

Why '.rtf'? I keep my project-documents in this format since it is a relatively standard format, whose editor-app launches fast. I don't use .txt -documents since I like to emphasize portions of the text by making them bold, italic, or even colored.

REFLECTION: SHIPPING INDUSTRY

Shipping-industry moves goods and cargo (or are they the same thing?) from one harbor to another. But moving stuff from Manila to Murmansk is not the end of the story. A ship must deliver something to Murmansk, but once it gets there it will typically pick up some new cargo to take to the next destination.

So a shipper (or is it 'skipper'?) needs to plan where to go next and what to deliver there. Releasing cargo on a port creates the next portion of VALUE of the trip. Sounds like a "release plan", right?

While you are at the sea you must navigate the ship, maybe seek shelter from the storm, sometimes just keep the engines running and steer towards specific coordinates. That is the action-plan.

There is no value to the ship sailing on high seas until it actually releases its cargo to the next port of call. What exactly to transport to the next port is the release-plan.

When you take on your next cargo to ship it somewhere, you naturally think: Is there something in the city we are sailing to that we can profitably carry to the next destination? You try to optimize not just the value you get from taking the cargo to the next destination, you also think ahead: Where will you go after the next release? That is the product plan.

And of course, the captain needs his log-book. I call mine LogBook.rtf.

When you make your next port-of-call, the situation might have changed. You may get a new order. You may hear a forecast of a hurricane, or of pirates on your planned route. Maybe a new opportunity presents itself to ship pogo-sticks to Novosibirsk. That is the time to re-think your next cargo and your next destination. That is the time to create your next release-plan.

The shipping industry tells us we don't need to set in stone all the harbors we will visit on a given journey, or exactly what cargo we'll be carrying and delivering. In software terms: exactly what product features will be in the next release.

IS THERE MORE?

Is there more to the analogy? Well, when a ship comes to a harbor it often doesn't release ALL its cargo in that harbor. It will continue its journey to move the rest of it to other destinations. Similarly we don't (try to) release all our planned-for features in the next release.

The ship takes some new cargo from the city it arrived in, to take it further. Similarly we take feedback and new requirements from the users of the release delivered, some of which we will deliver in the next release.

But some tomatoes on board may get rotten when we reach the hot climate of Horse Latitudes. Some requirements we planned to be in next release may be found to make no sense after all. We will have to throw them overboard!

Software developers use tools that allow them to produce systems faster, or tools that support coordination among a bigger team. Ships can be faster or slower, and they can take a bigger or smaller cargo on them at a time. Some smugglers even use speed-boats ...

So, software developers would seem to have much in common with the shipping industry. Aye Aye Captain!

Lesson from Shippers to Coders: If you don't make it to the port-of-call, your trip is worthless. If you do but you can't sell your cargo, at least you can ship back some feedback.

http://panuviljamaablog.blogspot.com/2013/07/shipping-inspired-project-planning.html

Friday, June 28, 2013

Why Software Brains Can Not Feel Pain

Assume that at some point in future we'll be able to build an android whose brain is a computer running a program that makes it behave and react just like any human. Call him Andy. Can Andy be said to have consciousness? Can it, or he, feel pain for instance? I believe not, for reasons given below.

To simulate a human brain on a computer most accurately, we would model each neuron and its connections. Then we would feed it sensory data similar to what humans get from their eyes for instance. Looking at the activity of such a software brain we could monitor its synapses and see that they get activated in a similar manner as those in a real brain. Shouldn't we then assume it has "consciousness" as well? Would it "feel pain" for instance if we fed it appropriate sensory data?

No. Our software program simulating a brain is only a simulation of a real brain. What is simulation? It is a more or less accurate dynamic interactive description of something. But a simulation of a factory is not a factory. It is just a dynamic description of how a factory behaves, when we give it simulated inputs. The simulation is "isomorphic" (having structure similar) to a real factory, but it is not a factory.

Imagine taking a class in neurology. It is an e-learning course that gives you a fancy interactive web-page that shows how the brain reacts when you give it simulated stimulus. By using such an e-learning software you are able to understand how the brain functions. And in the coming years the fidelity and granularity of such dynamic brain-description software can be down to the level of individual neurons, including as many of them as in a real brain.

That e-learning application/simulation is still just a DESCRIPTION of the brain, not a real brain. It can behave just like a real brain would, just like a simulation of a factory could represent every action in a factory - yet not be a factory.

Think about software for weather forecasting. Meteorologists are ever improving their models of the weather to be as accurate as possible. They are simulating weather. But no-one would claim that a weather-forecasting application "is weather". It is just a dynamic, perhaps interactive description of it. It can't rain on you.

Conceptually a simulation does not differ much from an "interactive N-dimensional movie". We are familiar with 3D movies, but movies could also be N-dimensional. That would mean you can choose to view it from more than two viewpoints. Such a technologically advanced N-dimensional interactive movie would really be a "simulation".

Like movies, simulations can be replayed. They can be paused, and played backwards. If we say that software simulating the brain IS a brain, then we should similarly say that a (highly accurate interactive) movie of the brain is also a brain.

Such a movie could show us what happens in a human brain when we give it pain-inducing simulated inputs. That doesn't mean the movie is feeling pain.

In the end, does it really matter whether simulated pain is real, or not? It does from the viewpoint of ethics. It is unethical to cause unnecessary pain. But it is not unethical to DESCRIBE pain. Which is what simulation really is.

http://panuviljamaablog.blogspot.com/2013/06/why-software-brains-can-not-feel-pain.html

Tuesday, May 7, 2013

Use Functions Luke

JavaScript has been called Lisp with a C-style syntax. Its main building-block is Function. You create functions that call other functions, and also functions that return or take functions as arguments. That is called "support for higher-order functions".

This blog-post is NOT about higher-order functions. Nor is this about the 'module-pattern'. This blog-post is about using functions to encapsulate your "direct" reads and writes.

In JavaScript code written by a novice, you may see something like this:

myObject[mode] = anotherObject.xyz ;
...

// In some other Galaxy:
if (someObject[someVariable] == something)
{ ...
}

The problem with the above? You make a "direct write", and you make a "direct read". Why is that bad?

The problem with direct writes is that you can make them from anywhere. If you allow that, it becomes very difficult to locate the place where a specific value is written to a field of a specific object. Who dunnit? Which statement (among ten thousand) wrote that phantom value into my field? It is a problem of JavaScript that you can do that. But you must not give in to the temptation.

With code like above, you can't use your editor's "find" -command either to locate places where the given field is written. That is because field-name CAN be in a variable, as in the example above. But even if it is not, you might find too many places that write SOMETHING into the given field. And you need regular expressions to locate both ".fieldX" and ". fieldX" and ... you get the point.

There's an easy remedy to this maintenance nightmare.
Use Functions, Luke.

function setit (object, field, value)
{ // Single choke point for all writes: break here when a suspect value arrives.
  if (value == 'weird' && field == 'leftField')
  { debugger;
  }
  object[field] = value;
}

If you never assign a value EXCEPT inside setit(), you can start the debugger whenever you suspect something is written that shouldn't.

If you are a follower of Functional Programming (FP) you know that assignments are BAD. From that perspective the benefit of using setit() for all writes is that at least you KNOW where all the bad code is. So you can keep an eye on it.

The function setit() can be extended so that it does not allow assignment if the field already has a value. Then you are pretty close at least in spirit to FP. Another name for "once-only-assignment" is "binding". Binding is good, (multiple-) assignment is bad.
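A minimal sketch of such a once-only variant could look like the following. The name bindit, the error text and the sample object are assumptions for illustration. "Already has a value" is interpreted here as "the object already has its own property with that name", which is only one possible reading:

```javascript
// Hypothetical variant of setit(): bindit and its error message are
// illustrative assumptions, not code from the original post.
function bindit (object, field, value)
{ // Refuse the write if the field was bound earlier.
  if (Object.prototype.hasOwnProperty.call (object, field))
  { throw new Error ('Field "' + field + '" is already bound');
  }
  object[field] = value;
  return value;
}

var ship = {};
bindit (ship, 'name', 'Falcon');        // first write succeeds

var secondWriteFailed = false;
try
{ bindit (ship, 'name', 'Executor');    // second write is rejected
}
catch (e)
{ secondWriteFailed = true;
}
```

Throwing on the second write makes an accidental re-assignment fail loudly at the place where it happens, instead of silently replacing the earlier value.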

So is that all there is to it? Well it's also useful to never READ fields directly. If you code

var v = someObject [ fname ];

it becomes difficult to find all places that use data from that specific field of that specific object.

There is no way you can HALT your code every time the value of the field is read. So you can not see when it's used and by whom. That means you can't easily change the value to a different type because you can't find which other places assume it is something else.

It then becomes difficult to change anything without breaking something. And that problem usually only becomes obvious in mid-flight, when trying to escape the death-star.

So what do you do? Use Functions, Luke:

function getit (object, field)
{ if (field == 'field_of_interest')
  { debugger;
    // now we can see who's asking for this data
  }
  var value = object[field];
  return value;
}

This pattern in its slightly different O-O form is often called simply 'Getters and Setters'. The main thing about it is that you must follow it ALWAYS.
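One way the O-O form might look in JavaScript is with accessor properties, sketched below. The helper name watched, the log array and the sample object are assumptions for illustration. Object.defineProperty is the standard call used to install the getter and setter:

```javascript
// Hypothetical sketch: puts one field behind accessor functions and records
// every read and write. The names watched, log and hyperdrive are assumptions.
function watched (object, field, log)
{ var stored = object[field];             // keep the current value privately
  Object.defineProperty (object, field,
  { get: function ()
    { log.push ('read ' + field);
      return stored;
    }
  , set: function (value)
    { log.push ('write ' + field);
      stored = value;
    }
  , enumerable: true
  , configurable: true
  });
  return object;
}

var log = [];
var hyperdrive = watched ({ speed: 1 }, 'speed', log);
hyperdrive.speed = 5;          // goes through the setter
var v = hyperdrive.speed;      // goes through the getter
```

The call sites here still look like direct reads and writes, but each one passes through a function where a check or a debugger statement could be placed.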

If you don't follow it "as a rule" you soon start skipping its use in most places, because direct reads and writes are faster to code.

Then you will have 10,000 places in the code of your hyper-drive that do direct reads and writes. At that point it is prohibitively expensive to re-factor your engine into maintainable form. Meaning you can't catch phantom reads and writes. You must surrender to the dark side. Don't let this happen, Luke. Use Functions.

http://panuviljamaablog.blogspot.com/2013/05/use-functions-luke-javascript-has-been.htm

Friday, May 3, 2013

Critique of Technical Debt

"Technical Debt" is a term used in Software Development. It means you take shortcuts in your development effort, not following the best practices. You are "in debt" because you will later need to spend extra effort to re-write, or re-factor your code or system properly.

Sounds reasonable, but is the metaphor of "Technical Debt" really valid?

I can see one situation where Technical Debt is the right term. It is when you are 100% sure the code you are writing must be re-written later.

You are creating a quick-and-dirty prototype. You know it will need to be rewritten when used as the basis for the real product, so you know you can take shortcuts in your coding practices. You are thus taking on some Technical Debt, just to create a prototype that allows you to sell the project. Once the project is sold you can do the well-designed, maintainable, adaptable, extensible implementation and thus pay back the technical debt. Like a wise investor you took on some debt, invested it in product development, then paid it back.

If you are consciously taking on 'Technical Debt' that can be a wise thing to do. But the term is more often used with a negative connotation. That often happens in situations where "Technical Debt" is really not the right term after all.

Let's say you work on some code for a day, and use several less than best practices to get it working in 12 hours. How much deeper in technical debt are you then?

'Debt' is something we must pay back. But possibly your code will never need to be modified afterwards. There isn't any debt to pay back then. 'Debt' is not the right term for something that just possibly might increase our maintenance expenses in future.

Writing less than best-practices code is not like taking on debt. It is like not buying insurance, not buying an option-to-sell when buying stock.

You buy stock at $100. You also buy an option to sell it at $100. That option costs $10. If the stock goes down and you must sell it, you get $100. But you're not even, because you paid $10 for the option.

You write code for an hour at $100 per hour. You spend an additional 6 minutes (= $10) making sure the code follows the pattern "Pluggable Adapters". That means you can later adapt your code without having to modify it. You just need to create a new adapter around it.

Maybe you never need to change your code. But if you do, you have now paid for the option that makes it relatively cheap to adapt it to changing circumstances later. But if you don't need to adapt it - the time you took to make it adaptable is your loss.
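The arithmetic of the stock example can be sketched as follows. The figures $100 and $10 come from the text above. The function name netResult and the later price of $60 are assumptions chosen only to illustrate a drop:

```javascript
// Numbers from the text: stock bought at $100, option to sell at $100 costs $10.
// netResult and the sample later price of 60 are illustrative assumptions.
function netResult (buyPrice, laterPrice, strike, optionCost, hedged)
{ // With the option you can always sell for at least the strike price.
  var salePrice = hedged ? Math.max (laterPrice, strike) : laterPrice;
  var cost      = buyPrice + (hedged ? optionCost : 0);
  return salePrice - cost;
}

var hedgedDrop   = netResult (100,  60, 100, 10, true);    // -10
var unhedgedDrop = netResult (100,  60, 100, 10, false);   // -40
var hedgedFlat   = netResult (100, 100, 100, 10, true);    // -10
var unhedgedFlat = netResult (100, 100, 100, 10, false);   //   0
```

With the hedge the loss is capped at the price of the option. Without it the loss is zero if nothing changes, but unbounded down to the full price if things go badly. The extra six minutes spent on adaptable code play the role of optionCost.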

Instead of Technical Debt I think the term we should be using is 'Software Maintenance Risk' (SMR). Granted, "Technical Debt" is more catchy.

Software Maintenance Risk can be defined as the risk that you will need to modify your code in future. The way to eliminate SMR is to hedge against it by paying for the extra effort to write 'Perfectly Maintainable Code' (PMC).

What is that you ask? Can anything be 'perfectly maintainable'? Well we can define PMC technically, as software which never needs to be modified. If any maintenance-task can be achieved by simply adding a new adapter into your system, then your existing code never needs to be modified. It is PMC - at least until you discover it is not.

In equity markets you don't typically hedge against all loss because that can be costly and can limit your upside. You take some risk to make some profits. But you still want to reduce the risk to a reasonable level by buying some options. Sometimes you'll need a bigger hedge, sometimes smaller - depending on your estimate and tolerance of risk.

Similarly in SW development it may be too expensive to always write perfectly maintainable code. Writing less than perfectly maintainable code does not mean you get into Technical Debt. It means there is a risk you will need to pay more for maintenance work in the future.

In the stock-market you can lose everything if you don't hedge your bets with options. In software your application can lose all its users if you don't pay for the effort to keep it maintainable.

REFERENCE:
https://en.wikipedia.org/wiki/Technical_debt

Thursday, May 2, 2013

The Linguistic Approach to System Description

You are doing a software project. How should you structure its documentation? What guiding principles should be used for creating and structuring its documentation? Should you include project-planning documents in it?

I propose the "Linguistic Paradigm for System Description" here. It may have been proposed before, but probably not in exactly the same form. It is a tool for thinking not only about documentation, but about the structure of "systems" in general.

Note that we have computer applications which we often call "systems". Then we have project planning (documents, models) to help the creation of such systems in an orderly manner. But a project plan can also be seen as a system on its own. It has components that relate to each other, rules for its actors to follow, conditions and events that trigger further actions. Executing a well-defined project plan is really executing a program.

I focus here on system descriptions in general, whether those systems be computer programs or procedures and plans for creating them.

Before getting too philosophical here's the structure of documentation I advocate:

1. Syntax
2. Semantics
3. Interpreter
4. Meta

And now the explanation and purpose of each:

1. SYNTAX

A computer system is a "smart system" that helps us in some way. Because it is 'smart' we are able to control it via some kind of language. What kind of language? What primitives and command-sequences can we use to communicate with it? Describing that, means describing the SYNTAX of the language that controls the system.

For a graphical application (aren't they all?) this would mean describing its GUI controls and dialogs. In what sequence can they be exercised? As an example, to choose an item from a menu, you first need to click something else to get the menu to pop up. Thus we can see that a user-interface defines a SYNTAX for how you can interact with the system.

Therefore the SYNTAX -section of our documentation is there to describe how users will and can INTERACT with the system. It is important to describe this 'boundary' of the system separately from what is inside it, to keep it not too dependent on how it is implemented.

2. SEMANTICS

The actions that users can perform on a GUI, or on a command-line, have some MEANING, called their SEMANTICS. That means (pun intended) what those actions cause. What does the user hope to accomplish with them? What is the intention of the user, when activating certain UI controls?

For the user to hope to accomplish something by some action, they need a "mental model" of the concepts they are manipulating by their actions. That mental model, the available actions on it and expected results CREATES meaning, the semantics, of the user-actions.

Syntax describes what the user does or can do. Semantics describes why a user would do it.

3. INTERPRETER

So we have a language, described by both its syntax and its semantics. But who understands that language? The part of the system that reacts to the user interactions, implemented as code, is the part that 'understands' it. We call it the INTERPRETER.

We use the term interpreter here in a more general sense, than parser/lexer/interpreter/compiler used in computer science. Systems INTERPRET the messages they receive BY REACTING to them.

Think of calling a function or procedure as a linguistic act. It transforms the function-call to another form, consisting of other calls to other functions. Thus executing a computer program can be seen as a continuous, recursive process of interpretation.

The end-result of interpretation must be some way of arriving at the "meaning" of the commands used by the user. The system however does not need to produce some other final representation of the meaning. The meaning of the commands is really what they do, how they are executed, what is their effect.

Thus, meaning is born by the fact that the system reacts in a specific way to user-inputs, and that the user expects it will react that way. The part of the system that produces these reactions is the code that reacts to the inputs. In our paradigm we call that code the 'interpreter'.

In summary the meaning of user-actions is defined by their effects, and results.

Syntax = What actions user can do
Semantics = What effects user-actions have
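The three parts can be sketched with a toy system. Everything here (the counter, the command names, the function interpret) is a hypothetical example, not something taken from a real project:

```javascript
// Hypothetical toy system: a counter controlled by two commands.

// SYNTAX: which commands the system accepts.
var syntax = [ 'increment', 'reset' ];

// INTERPRETER: the code that reacts to each accepted command.
var handlers =
{ increment: function (state) { return { count: state.count + 1 }; }
, reset:     function (state) { return { count: 0 }; }
};

function interpret (state, command)
{ if (syntax.indexOf (command) < 0)
  { throw new Error ('Not part of the syntax: ' + command);
  }
  return handlers[command] (state);
}

// SEMANTICS: the observable effect of a command sequence on the state.
var state = { count: 0 };
var commands = [ 'increment', 'increment', 'reset', 'increment' ];
commands.forEach (function (command)
{ state = interpret (state, command);
});
```

After the four commands the counter holds 1. The meaning of 'reset' is nothing more than what the interpreter does to the state when it receives it.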

4. META

You've gone through three out of four sections of the documentation. But nobody has even told you why the system exists at all. What are the benefits of it?

Maybe you can infer some of those benefits by having understood what a user can do with the system (SYNTAX), and how the system will react (SEMANTICS). But shouldn't we also tell WHY the system was created? Yes. But not in the first 3 sections. Why not? Because the REASON the system was built is not PART of the system. But describing why our system exists is relevant for understanding it. Therefore that is explained in the META -section of the documentation.

The META -section is information "about" the system like why and how the documentation was created, which means describing why the system was created in the first place. It includes project plans, procedures, methodology, history, personnel, cost-benefit analyses etc.

Our purpose here is to come up with a rationale as to what information should be put into each section of documentation. Their order does not matter so much - except to make clear that META -section differs from others on a conceptual level. The META -section is not a 'blueprint' of one part of the system. The system does not have a PART called 'meta'.

Meta is information about the system, not part of it. The other three sections SYNTAX, SEMANTICS, INTERPRETER in contrast, are all "blueprints" of the system.

Recursive System Descriptions

One thing to note about the above way to describe and document systems is that it can be applied recursively, on multiple levels of the system. The INTERPRETER is the part of the system where most of its work gets done. It is typically implemented as a set of interacting SW-modules.

But each such module can be described as a system of its own, with its SYNTAX, SEMANTICS, INTERPRETER and META. The SYNTAX of a software module describes its 'methods' and the data-structures they consume and produce. Its SEMANTICS is described by telling for each method how its results relate to its arguments, and what side-effects it has. The private sub-modules inside a module are its INTERPRETER.

Wednesday, April 24, 2013

The Triple Stress of the Software Maintainer

First, let's face it, every programmer is a software maintainer. By maintaining I mean not only fixing bugs or adapting the software to changing environment, but also in general adapting it to new needs, new use-cases.

When you create a software module, it stays around. It is not like a bouquet of flowers you put together and sell somebody. Because software can be duplicated for free, you won't lose it when you give it to someone else. THEREFORE much of software development is just about modifying, adapting, maintaining, improving existing code.

Now let's focus on maintenance as it's more traditionally understood. There's a bug-report. The software does something we don't like. You are tasked to fix it, by yourself or someone else. What happens? A Three-Fold Stress ensues.

The threefold stress is not caused by the knowledge that the bug exists, but by the fact that you need to fix it. Why is bug-fixing stressful, more so than the fun task of programming "from scratch"?

The first stress is because you don't know what is causing the bug. Where is the "bug" located? Well typically it's not located in any SINGLE place, but is caused by the interaction of multiple parts of the system. It's not like "finding a cog in a machine". It's more about understanding why a subset of modules don't work together as intended. It can take a lot of work finding out "where" the bug is. The first stress comes from the fact that at the outset you have no way of knowing how long it's going to take to locate the cause of the problem.

The second stress comes from the fact that you also don't know how much time it will take to modify the system even after you have understood the cause. You may need to rewrite whole subsystems to get it working the way it needs to. Again, doing the rewrite is actually fun. The stress comes from the fact that you have no good way of knowing how long it's going to take. This is partially because once you start modifying the system, you typically break something else, which then needs to be fixed also, which ... you get my point.

The third stress comes from the fact that even when you've finished with the maintenance project, you can't be 100% sure whether you've actually created more problems than you've solved.

So there you have it. There are books to read about how to make your software very maintainable. But often you need to work with software created by others, or by yourself before you read those books, with no clear documentation attached.

Perhaps the best cure for this three-fold stress is the simplest: Realize the uncertainty you face, and live with it. Take a Zen.

Wednesday, March 6, 2013

The True Revenge of the Nerds (Why MOOCs are great)

There's a lot of controversy over MOOCs. If they're so great, why do we need 50k/year universities? Do you get what you pay for with MOOCs? I have my opinion, having worked a long time with e-learning in general.

Let me tell you why MOOCs are good. The quality of their learning materials can be superior because it pays to invest in quality when the user-base is so large. Just like Google invests in the quality of its search-results.

I studied in a "prestigious" university and saw that quality of the learning materials could vary greatly. On some courses they were great, on some they were just undecipherable lecture-notes written by a brilliant mathematician who was a flawed communicator.

Sometimes it felt the learning materials were difficult on purpose, to discourage the less-than-genius students from picking up theoretical physics. It was a process of natural selection for students who wanted to become scientists. But more likely the materials were difficult to understand because the professor just did not have the time or skill to produce great learning-aids.

MOOC developers get a lot of feedback, since they have so many students. Like with Open Source, with enough eyeballs, defects get weeded out.

So in a great university it is (more or less) about the process of selecting the best and most motivated students. In MOOCs the process is about selecting the best learning-aids.

MOOCs are the true revenge of the nerds. Now it is the professors who need to compete to produce the best study-aids.