PanuLogic Software Development Blog: May 2013

Tuesday, May 7, 2013

Use Functions Luke

JavaScript has been called Lisp with a C-style syntax. It's main building-block is Function. You create functions that call other functions, and also functions that return or take functions as arguments. That is called "support for higher-order functions".

This blog-post is NOT about higher-order functions. Nor is this about the 'module-pattern'. This blog-post is about using functions to encapsulate your "direct" reads and writes.

In JavaScript code written by a novice, you may see something like this:

myObject[mode] = anotherObject.xyz ;
...

// In some other Galaxy:
if (someObject[someVariable] == something)
{ ...
}

The problem with the above? You make a "direct write", and you make a "direct read". Why is that bad?

The problem with direct writes is that you can make them from anywhere. If you allow that, it becomes very difficult to locate the place where a specific value is written to a field of a specific object. Who dunnit? Which statement (among ten thousand) wrote that phantom value into my field? It is a problem of JavaScript that you can do that. But you must not give in to the temptation.

With code like above, you can't use your editor's "find" -command either to locate places where the given field is written. That is because field-name CAN be in a variable, as in the example above. But even if it is not, you might find too many places that write SOMETHING into the given field. And you need regular expressions to locate both ".fieldX" and ". fieldX" and ... you get the point.

There's an easy remedy to this maintenance nightmare.
Use Functions, Luke.

function setit (object, field, value)
{ if (value == 'weird') && (field == 'leftField')
{ debugger
}
object[field] = value;
}

If you never assign a value EXCEPT inside setit(), you can start the debugger whenever you suspect something is written that shouldn't.

If you are a follower of Functional Programming (FP) you know that assignments are BAD. From that perspective the benefit of using setit() for all writes is that at least you KNOW where all the bad code is. So you can keep an eye on it.

The function setit() can be extended so that it does not allow assignment if the field already has a value. Then you are pretty close at least in spirit to FP. Another name for "once-only-assignment" is "binding". Binding is good, (multiple-) assignment is bad.

So is that all there is to it? Well it's also useful to never READ fields directly. If you code

var v = someObject [ fname ];

it becomes difficult to find all places that use data from that specific field of that specific object.

There is no way you can HALT your code every time the value of the field is read. So you can not see when it's used and by whom. That means you can't easily change the value to a different type because you can't find which other places assume it is something else.

It then becomes difficult to change anything without breaking something. And that problem usually only becomes obvious in mid-flight, when trying to escape the death-star.

So what do you do? Use Functions, Luke:

function getit (object, field)
{ if (field == 'field_of_interest')
{ debugger
// now we can see whose's asking for this data
}
var value = object[field];
return value;
}

This pattern in its slightly different O-O form is often called simply 'Getters and Setters'. The main thing about it is that you must follow it ALWAYS.

If you don't follow it "as a rule" you soon start skipping its use in most places, because direct reads and writes are faster to code.

Then you will have 10,000 places in the code of your hyper-drive that do direct reads and writes. At that point it is prohibitively expensive to re-factor your engine into maintainable form. Meaning you can't catch phantom reads and writes. You must surrender to the dark side. Don't let this happen, Luke. Use Functions.

http://panuviljamaablog.blogspot.com/2013/05/use-functions-luke-javascript-has-been.htm

Friday, May 3, 2013

Critique of Technical Debt

"Technical Debt" is a term used in Software Development. It means you take shortcuts in your development effort, not following the best practices. You are "in debt" because you will later need to spend extra effort to re-write, or re-factor your code or system properly.

Sounds reasonable, but is the metaphor of "Technical Debt" really valid?

I can see one situation where Technical Debt is the right term. It is when you are 100% sure the code you are writing must be re-written later.

You are creating a quick-and-dirty prototype. You know it will need to be rewritten when used as the basis for the real product, so you know you can take shortcuts in your coding practices. You are thus taking on some Technical Debt, just to create a prototype that allows you to sell the project. Once the project is sold you can do the well -designed maintainable, adaptable, extensible implementation and thus pay back the technical debt. Like a wise investor you took on some debt, invested it in product development, then payed it back.

If you are consciously taking on 'Technical Debt' that can be a wise thing to do. But the term is more often used with a negative connotation. That often happens in situations where "Technical Debt" is really not the right term after all.

Let's say you work on some code for a day, and use several less than best practices to get it working in 12 hours. How much deeper in technical debt are you then?

'Debt' is something we must pay back. But possibly your code will never need to be modified afterwards. There isn't any debt to pay back then. 'Debt' is not the right term for something that just possibly might increase our maintenance expenses in future.

Writing less than best-practices code is not like taking on debt. It is like not buying insurance, not buying an option-to-sell when buying stock.

You buy stock at $100. You also buy an option to sell it at $100. That option costs $10. If the stock goes down and you must sell it, you get $100. But you're not even, because you paid $10 for the option.

You write code for an hour at $100 per hour. You spend an additional 6 minutes (= $10) making sure the code follows the pattern "Pluggable Adapters". That means you can later adapt your code without having to modify it. You just need to create a new adapter around it.

Maybe you never need to change your code. But if you do, you have now paid for the option that makes it relatively cheap to adapt it to changing circumstances later. But if you don't need to adapt it - the time you took to make it adaptable is your loss.

Instead of Technical Debt I think the term we should be using is 'Software Maintenance Risk' (SMR). Granted, "Technical Debt" is more catchy.

Software Maintenance Risk can be defined as the risk that you will need to modify your code in future. The way to eliminate SMR is to hedge against it by paying for the extra effort to write 'Perfectly Maintainable Code' (PMC).

What is that you ask? Can anything be 'perfectly maintainable'? Well we can define PMC technically, as software which never needs to be modified. If any maintenance-task can be achieved by simply adding a new adapter into your system, then your existing code never needs to be modified. It is PMC - at least until you discover it is not.

In equity markets you don't typically hedge against all loss because that can be costly and can limit your upside. You take some risk to make some profits. But you still want to reduce the risk to a reasonable level by buying some options. Sometimes you'll need a bigger hedge, sometimes smaller - depending on your estimate and tolerance of risk.

Similarly in SW development it may be too expensive to always write perfectly maintainable code. Writing less than perfectly maintainable code does not mean you get into Technical Debt. It means there is a risk you will need to pay more for maintenance work in the future.

In the stock-market you can lose everything if you don't hedge your bets with options. In software your application can lose all its users if you don't pay for the effort to keep it maintainable.

REFERENCE:
https://en.wikipedia.org/wiki/Technical_debt

Thursday, May 2, 2013

The Linguistic Approach to System Description

You are doing a software project. How should you structure its documentation? What guiding principles should be used for creating and structuring its documentation? Should you include project-planning documents in it?

I propose the "Linguistic Paradigm for System Description" here. It may have been proposed before, but probably not in exactly the same form. It is a tool for thinking about not only of documentation, but the structure of "systems" in general.

Note that we have computer applications which we often call "systems". Then we have project planning (documents, models) to help the creation of such systems in an orderly manner. But a project plan can also be seen as a system on its own. It has components that relate to each other, rules for its actors to follow, conditions and events that trigger further actions. Executing a well-defined project plan is really executing a program.

I focus here on system descriptions in general, whether those systems be computer programs or procedures and plans for creating them.

Before getting too philosophical here's the structure of documentation I advocate:

1. Syntax
2. Semantics
3. Interpreter
4. Meta

And now the explanation and purpose of each:

1. SYNTAX

A computer system is a "smart system" that helps us in some way. Because it is 'smart' we are able to control it via some kind of language. What kind of language? What primitives and command-sequences can we use to communicate with it? Describing that, means describing the SYNTAX of the language that controls the system.

For a graphical application (aren't they all?) this would mean describing its GUI controls and dialogs. In what sequence can they be exercised? As an example, to choose an item from a menu, you first need to click something else to get the menu to pop up. Thus we can see that a user-interface defines a SYNTAX for how you can interact with the system.

Therefore the SYNTAX -section of our documentation is there there to describe how users will and can INTERACT with the system. It is important to describe this 'boundary' of the system separately from what is inside it, to keep it not too dependent on how it is implemented.

2. SEMANTICS

The actions that users can perform on a GUI, or on a command-line have some MEANING, called its SEMANTICS. That means (pun intended) what those actions cause. What the user hopes to accomplish with them? What is the intention of the user, when activating certain UI controls?

For the user to hope to accomplish something by some action, they need a "mental model" of the concepts they are manipulating by their actions. That mental model, the available actions on it and expected results CREATES meaning, the semantics, of the user-actions.

Syntax describes what the user does or can do. Semantics describes why a user would do it.

3. INTERPRETER

So we have a language, described by both its syntax and its semantics. But who understands that language? The part of the system that reacts to the user interactions, implemented as code, is the part that 'understands' it. We call it the INTERPRETER.

We use the term interpreter here in a more general sense, than parser/lexer/interpreter/compiler used in computer science. Systems INTERPRET the messages they receive BY REACTING to them.

Think of calling a function or procedure as a linguistic act. It transforms the function-call to another form, consisting of other calls to other functions. Thus executing a computer program can be seen as a continuous, recursive process of interpretation.

The end-result of interpretation must be some way of arriving at the "meaning" of the commands used by the user. The system however does not need to produce some other final representation of the meaning. The meaning of the commands is really what they do, how they are executed, what is their effect.

Thus, meaning is born by the fact that the system reacts in a specific way to user-inputs, and that the user expects it will react that way. The part of the system that produces these reactions is the code that reacts to the inputs. In our paradigm we call that code the 'interpreter'.

In summary the meaning of user-actions is defined by their effects, and results.

Syntax = What actions user can do
Semantics = What effects user-actions have

4. META

You've gone through three out of four sections of the documentation. But nobody has even told you why the system exists at all. What are the benefits of it?

Maybe you can infer some of those benefits by having understood what a user can do with the system (SYNTAX), and how the system will react (SEMANTICS). But shouldn't we also tell WHY the system was created? Yes. But not in the first 3 sections. Why not? Because REASON the system was built is not PART of the system. But, describing why our system exists is a relevant for understanding it. Therefore that is explained in the META-section of the documentation.

The META -section is information "about" the system like why and how the documentation was created, which means describing why the system was created in the first place. It includes project plans, procedures, methodology, history, personnel, cost-benefit analyses etc.

Our purpose here is to come up with a rationale as to what information should be put into each section of documentation. Their order does not matter so much - except to make clear that META -section differs from others on a conceptual level. The META -section is not a 'blueprint' of one part of the system. The system does not have a PART called 'meta'.

Meta is information about the system, not part of it. The other three sections SYNTAX, SEMANTICS, INTERPRETER in contrast, are all "blueprints" of the system.

Recursive System Descriptions

One thing to note about the above way to describe and documents systems is that it can be applied recursively, on multiple levels of the system. The INTERPRETER is the part of the system where most of its work gets done. It is typically implemented as a set of interacting SW-modules.

But each such module can be described as a system of its own, with its SYNTAX, SEMANTICS, INTERPRETER and META. The SYNTAX of a software module describes its 'methods' and the data-structures they consume and produce. It SEMANTICS is described by telling for each method how its results related to its arguments, and what side-effects it has. The private sub-modules inside a module, are its INTERPRETER.