Scalability addresses an architecture's need to support a large number of instances or concurrent interactions. Four basic approaches for dealing with scalability demands are identified and can be combined in various ways:
- scaling up - increasing the capacity of services, consumers, and network devices
- scaling out - distributing load across services and programs
- smoothing out - evening out the number of interactions across peak and non-peak periods to make better use of the infrastructure (reducing the impact of peaks so that capacity does not sit idle at other times)
- decoupling the consumption of finite resources, such as memory, from the number of concurrent consumers
Most REST constraints do little to support scaling up or smoothing out the number of interactions; instead, the focus is on supporting the scaling out of service instances and on aligning finite resource consumption with the number of active requests (rather than with the number of concurrent consumers).
REST-style architectures support load balancing across service instances via the Layered System and Uniform Contract constraints. Load balancing is further simplified by the Stateless constraint, which structures interactions as self-contained request-response pairs that each service instance can handle independently of other requests. There is no need to keep directing the same service consumer to the same service instance or to explicitly synchronize session state between service instances; the session data is contained in each request.
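A minimal sketch of this idea, using hypothetical names and an in-process round-robin dispatcher in place of a real load balancer: because each request carries all of its own session data, the service instances are interchangeable and no session synchronization between them is required.

```python
# Hypothetical sketch: a stateless handler plus interchangeable instances.
# All names (handle_request, ServiceInstance, etc.) are illustrative only.

def handle_request(request: dict) -> dict:
    """Stateless handler: everything it needs arrives inside the request."""
    user = request["user"]                # session data travels with the request
    items = request["cart_items"]
    total = sum(item["price"] * item["qty"] for item in items)
    return {"user": user, "total": total}

class ServiceInstance:
    """Holds no per-consumer state between requests."""
    def __init__(self, name: str):
        self.name = name

    def serve(self, request: dict) -> dict:
        response = handle_request(request)
        response["served_by"] = self.name
        return response

# A trivial round-robin "load balancer" over two interchangeable instances.
instances = [ServiceInstance("a"), ServiceInstance("b")]
requests = [
    {"user": "alice", "cart_items": [{"price": 5.0, "qty": 2}]},
    {"user": "alice", "cart_items": [{"price": 3.0, "qty": 1}]},
]
responses = [
    instances[i % len(instances)].serve(req)
    for i, req in enumerate(requests)
]
# Consecutive requests from the same consumer land on different instances,
# yet each is answered correctly with no shared session state.
```

Note that if the handler instead cached per-consumer session state in instance memory, the dispatcher would have to pin each consumer to one instance (sticky sessions) or replicate that state, which is exactly the coupling the Stateless constraint avoids.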