Techniques For Scaling Distributed Systems
Three techniques are currently used for scaling. Each addresses some of the inherent problems of scaling, but each also has its own disadvantages. Organizations use one of these techniques, or a combination of them, depending on their priorities regarding data consistency, allowable performance degradation, acceptable latency and similar concerns.
Hiding Communication Latencies
Communication latencies are often mistaken for poor remote server performance or capability. Hiding them is therefore essential to achieving geographical scalability. There are two ways to hide communication latencies.
Avoid idly waiting for the remote server's response:
Use asynchronous communication so that the client can execute other, independent work in the time it takes for the server's response to arrive. When the response arrives, the client process is interrupted and a special handler is invoked to complete the pending request.
Pros: Communication latencies are hidden because the client process is busy with other work while the response is in transit
Cons: Most interactive applications have nothing else to execute until the response arrives, so the technique often does not apply.
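The idea of not waiting idly can be sketched with Python's asyncio. This is a minimal illustration rather than a prescribed design: `fetch_from_remote`, `do_local_work` and `client` are hypothetical names, and the remote call is simulated with a sleep instead of a real network request.

```python
import asyncio
from typing import Tuple


async def fetch_from_remote(delay_s: float) -> str:
    # Hypothetical stand-in for a remote call; the sleep simulates network latency.
    await asyncio.sleep(delay_s)
    return "remote response"


async def do_local_work(steps: int) -> int:
    # Independent local work that does not need the server's answer.
    total = 0
    for i in range(steps):
        total += i
        await asyncio.sleep(0)  # yield so the pending request can make progress
    return total


async def client() -> Tuple[str, int]:
    # Issue the request without blocking, then carry on with other work.
    pending = asyncio.create_task(fetch_from_remote(0.05))
    local_result = await do_local_work(100)
    # Execution resumes here once the response has arrived.
    response = await pending
    return response, local_result


if __name__ == "__main__":
    print(asyncio.run(client()))
```

If `do_local_work` had nothing to do, the client would simply block on `await pending`, which is the situation described in the Cons above.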
Reduce server communication:
One way to reduce server communication is to shift part of the computation from the server to the client. An example is validating form data on the client side, instead of communicating with the server for each field that is filled in.
Pros: Speeds up the execution of processes
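As a rough sketch of the form example, the client can check every field locally and contact the server once, only when the form is valid. The field names, the validation rules and the `send` callback are assumptions made for illustration, not part of any particular framework.

```python
import re

# Hypothetical form rules; field names and patterns are illustrative only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_form(form: dict) -> dict:
    """Check every field locally and return a dict of field -> error message."""
    errors = {}
    if not form.get("name", "").strip():
        errors["name"] = "name is required"
    if not EMAIL_RE.match(form.get("email", "")):
        errors["email"] = "email looks malformed"
    age = form.get("age", "")
    if not age.isdigit() or not 0 < int(age) < 150:
        errors["age"] = "age must be a number between 1 and 149"
    return errors


def submit(form: dict, send) -> bool:
    """Contact the server once, and only if local validation passes."""
    if validate_form(form):
        return False
    send(form)  # `send` is a placeholder for the single network round trip
    return True
```

With this arrangement an invalid form costs no round trips at all, and a valid form costs exactly one.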
Distribution
Distribution as a scaling technique involves partitioning a resource into smaller units and distributing them across the system. A good example is the way the Internet Domain Name System (DNS) is organized. The namespace is hierarchically organized into a tree of non-overlapping zones, and the name servers for each zone resolve only the names within that zone.
Pros: Distributes the burden of request handling; increases transparency, strengthening the single-system view
Cons: Performance can degrade, since a single request may have to pass through several partitions before it is answered
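The zone tree can be modelled in a few lines. The sketch below is a simplified, in-memory illustration and not how real DNS resolvers are implemented: the `Zone` class, the `resolve` function and the sample names are hypothetical, and the address comes from a range reserved for documentation.

```python
from typing import Dict, List, Optional, Tuple


class Zone:
    """One node of the name tree; it only knows its own records and child zones."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.records: Dict[str, str] = {}      # leaf label -> address
        self.children: Dict[str, "Zone"] = {}  # label -> delegated child zone


def resolve(root: Zone, hostname: str) -> Tuple[Optional[str], List[str]]:
    """Walk down from the root, one zone per label.

    Returns the address (or None) and the list of zones that were visited.
    """
    labels = hostname.split(".")[::-1]  # "www.example.org" -> ["org", "example", "www"]
    zone, visited = root, [root.name]
    for label in labels[:-1]:
        if label not in zone.children:
            return None, visited
        zone = zone.children[label]
        visited.append(zone.name)
    return zone.records.get(labels[-1]), visited


# Build a tiny tree: root -> org -> example.org
root = Zone(".")
org = Zone("org")
example = Zone("example.org")
root.children["org"] = org
org.children["example"] = example
example.records["www"] = "192.0.2.10"
```

Calling `resolve(root, "www.example.org")` visits three zones before returning an address. No single zone carries the whole load, but each lookup takes several steps.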
Replication
Replication as a scaling technique involves creating copies of resource components and distributing them across the system.
There is also a special case of replication called caching. It is similar to replication in that it places a copy of the resource in close proximity to the client. It differs in that a caching decision is taken by the client, on demand, whereas a replication decision is made by the owner of the resource and is planned in advance. An example of caching is a web browser keeping a local copy of a document it has fetched, even though the validity of that copy may not have been verified for some time.
Pros: Increases availability; balances the load between components, leading to increased performance; helps hide communication latencies in geographically distributed systems
Cons: Maintaining data consistency is extremely difficult for applications that require strong consistency; replication or caching in geographically distributed systems may require some form of global synchronization, which is very hard or even impossible to implement in a scalable way
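The client-driven, on-demand nature of caching and its consistency cost can be illustrated with a small time-limited cache. This is a sketch under stated assumptions: `TTLCache`, the `fetch` callback (a placeholder for a request to the origin server) and the injectable `clock` are hypothetical names chosen for the example.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Client-side cache: copies are made on demand and trusted for `ttl_s` seconds."""

    def __init__(self, fetch: Callable[[str], str], ttl_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch    # placeholder for the request to the origin server
        self._ttl_s = ttl_s
        self._clock = clock    # injectable so the behaviour can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, stored_at)

    def get(self, key: str) -> str:
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and now - entry[1] < self._ttl_s:
            return entry[0]        # served locally; may be stale
        value = self._fetch(key)   # miss or expired copy: go to the origin
        self._entries[key] = (value, now)
        return value
```

Within the time limit, repeated reads never reach the origin, which hides latency and reduces load. The price is that a client may see an outdated copy until the entry expires, a small-scale version of the consistency problem listed in the Cons above.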