Fallacies of Distributed Computing
It would be an understatement to say that technology has improved drastically in the last 20 years. Rapid innovation has helped us scale heights previously thought impossible.
Both software and hardware are seeing unprecedented growth in speed, affordability, variety and, dare I say, reliability. For all practical purposes, unless one is working with embedded devices, disk space is considered unlimited, memory is available in abundance, network bandwidth runs to hundreds of megabytes per second, and so on. Gone are the days when developers had to fit their code into 64 kilobytes!
However, not everything has improved or is trouble-free. Mistakes are still being made, and so are wrong assumptions. When those assumptions eventually prove incorrect, applications have to be rewritten and corporations suffer huge losses.
Fallacies of Distributed Computing
During the 1990s, a number of distinguished engineers from Sun Microsystems (including Peter Deutsch and James Gosling) compiled a list of assumptions that programmers new to distributed computing tend to make and that turn out to be false.
Some 20 years have gone by since these fallacies were written, at a time when software and hardware technology was far less sophisticated than it is today. Yet they are still as true as they were when originally written.
A fallacy is a false or mistaken idea.
A brief explanation of each fallacy follows. Programmers and architects should not ignore them, today more than ever, because of the complex and hybrid nature of modern systems.
The network is reliable
Facilities such as redundant nodes and retry mechanisms are available today, but never assume that the network is reliable. It can go down because of human error, natural causes or errors made by artificial intelligence. With the complex networks that exist today, remember that finding and fixing issues is difficult as well.
There are also “simpler” failures that can hurt the software, such as lost messages, dropped packets of data or missed signals.
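One common way to cope with such transient failures is to retry with exponential backoff. The following is only a minimal sketch: the function name `call_with_retry` and its parameters are hypothetical, and it assumes the operation is safe to repeat (idempotent).

```python
import random
import time


def call_with_retry(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on ConnectionError or TimeoutError."""
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Double the wait after each failure and scale it by random
            # jitter in [0.5, 1.0) so many clients do not retry in lockstep.
            delay = base_delay * (2 ** attempt) * (0.5 + random.random() / 2)
            sleep(delay)
```

Retries hide short outages only. The caller still has to decide what to do when the last attempt fails.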
Latency is zero
Latency is the amount of time it takes for data to travel from source to destination. Most programmers today develop their applications in an intranet or standalone environment, where latency is barely noticeable.
When the same system is deployed for users over the internet, the latency becomes significant. Frozen screens are still a big problem in many large applications today. Always minimize the number of network calls and the amount of data that is passed over the network.
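A rough cost model shows why fewer round trips matter. The figures below (80 ms of latency per round trip, 1 ms of processing per item) are assumptions chosen for illustration, not measurements.

```python
def total_time_ms(round_trips, latency_ms, per_item_ms, items):
    """Estimate elapsed time: each round trip pays the latency once."""
    return round_trips * latency_ms + items * per_item_ms


# Fetching 100 items one request at a time versus in a single batched request.
one_by_one = total_time_ms(round_trips=100, latency_ms=80, per_item_ms=1, items=100)
batched = total_time_ms(round_trips=1, latency_ms=80, per_item_ms=1, items=100)

print(one_by_one)  # 8100
print(batched)     # 180
```

Under these assumptions the work is identical in both cases. The difference comes entirely from paying the latency 100 times instead of once.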
Bandwidth is infinite
This is closely related to latency. However, it is important to know the difference between the two. Whilst latency is the time it takes for data to travel from source to destination, bandwidth is the amount of data that can be carried in a given period of time.
Given the vast improvements in bandwidth, it might be less of a factor today if you are working with smaller amounts of data. But many applications being built today serve huge amounts of data (Big Data, video services etc).
Bandwidth does become an issue when there are many requests for large amounts of data over a great distance.
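The difference between the two can be put into a simple estimate: transfer time is roughly the latency plus the payload size divided by the bandwidth. The link speed and latency below are assumed values for illustration, and the model ignores protocol overhead and congestion.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_s, latency_s):
    """Estimate transfer time as latency plus size divided by bandwidth."""
    return latency_s + (size_bytes * 8) / bandwidth_bits_per_s


LINK = 100_000_000  # assumed 100 Mbit/s link
LATENCY = 0.05      # assumed 50 ms latency

small = transfer_seconds(1_000, LINK, LATENCY)          # 1 KB payload
large = transfer_seconds(1_000_000_000, LINK, LATENCY)  # 1 GB payload
```

For the small payload the estimate is about 0.05 seconds, almost all of it latency. For the large payload it is about 80 seconds, almost all of it bandwidth.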
The network is secure
In the connected world that we live in today, nothing is secure. Millions are being spent on protection against hacks, malware, Trojans, leaks etc. In 2013, Forbes reported that there were 30,000 hacks a day. These numbers have only increased since then.
Because most applications built today have to interface with external data or applications, they are exposed to the vulnerabilities of those systems as well.
Remember that hackers breach systems not just by breaking security, but also by creating valid user credentials that a system expects. All varieties of security threats need to be evaluated. Hire security specialists if necessary.
Topology doesn’t change
A network topology is the list of nodes and the connections that make the communication possible.
Never assume that the network topology will remain the same. When developing applications, this generally means not depending on specific nodes, network state, IP addresses, ports etc.
All networks today are heterogeneous and change continually. There is a huge number and variety of devices today, ranging from servers to laptops to mobile devices. More users are “connected” to a network than ever before in the history of mankind.
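The advice about not depending on specific IP addresses and ports can be sketched as reading the endpoint from configuration at run time. The environment variable names `ORDERS_HOST` and `ORDERS_PORT`, the "orders" service and the defaults are all hypothetical.

```python
import os


def service_endpoint(env=os.environ):
    """Return (host, port) for a hypothetical orders service."""
    # Use a host name rather than a hard-coded IP address, so the service
    # can move to another node without a code change.
    host = env.get("ORDERS_HOST", "localhost")
    port = int(env.get("ORDERS_PORT", "8080"))
    return host, port
```

For example, `service_endpoint({"ORDERS_HOST": "orders.internal", "ORDERS_PORT": "9000"})` returns `("orders.internal", 9000)`. Resolving the host name each time a connection is opened, rather than caching the address forever, lets the application follow changes in the topology.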
There is one administrator
This fallacy has less to do with coding the software and more to do with what happens after the application has been deployed in the network.
Today, whole departments of IT personnel are assigned to maintain critical networks. An incorrect change made by one of them could bring down an entire deployment. A good way to help administrators is to provide them with intelligent network monitoring tools to operate and maintain the network.
Transport cost is zero
This fallacy is about the cost of maintaining a network. This includes a number of aspects such as introducing new resources into the network, monitoring and resolving issues, general maintenance etc.
These costs add up to be quite significant over a period of time. They can, however, be reduced by the recent trend of using cloud-based SaaS solutions.
Also remember that technical people rarely think about costs when programming. It is probably the last thing on their minds when they have many other challenges to handle.
The network is homogeneous
There are two ways to look at this fallacy.
- At the transport level, don’t assume that the network is homogeneous. It would be fair to say that no engineer worth their salt would make this assumption.
- The second way to interpret this fallacy is at the application level. At this level, one should not assume that the application is going to work on all types of devices. This is very important today because of the sheer number of different types of devices.
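One practical consequence of a heterogeneous network is that machines may differ in byte order and default text encoding, so a wire format should state both explicitly. The message layout below (a 4-byte id, a 2-byte length, then UTF-8 text) is a hypothetical example, not a standard protocol.

```python
import struct

# "!" selects network (big-endian) byte order with standard sizes:
# I is a 4-byte unsigned int, H is a 2-byte unsigned int.
HEADER = struct.Struct("!IH")


def encode_message(message_id, text):
    """Pack an id and a UTF-8 payload (at most 65535 bytes) into bytes."""
    payload = text.encode("utf-8")
    return HEADER.pack(message_id, len(payload)) + payload


def decode_message(data):
    """Unpack bytes produced by encode_message into (id, text)."""
    message_id, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    return message_id, payload.decode("utf-8")
```

Because the byte order and encoding are fixed in the format itself, the same bytes decode to the same values on any device, whatever its native conventions.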
If you are a programmer or an architect, keep these fallacies in mind, and the system will be much more responsive, reliable, usable and secure. More importantly, there will be a happy customer using it!