Chapter 5: Deployment Patterns
J.D. Meier, Alex Homer, David Hill,
Jason Taylor, Prashant Bansode, Lonnie Wall, Rob Boucher Jr, Akshay Bogawat
- Learn the key factors that influence deployment choices.
- Understand the recommendations for choosing a deployment pattern.
- Understand the effect of deployment strategy on performance, security, and other quality attributes.
- Understand the deployment scenarios for each of the application types covered in this guide.
- Learn common deployment patterns.
Application architecture designs exist as models, documents, and scenarios. However, applications must be deployed into a physical environment where infrastructure limitations may negate some of the architectural decisions. Therefore, you must consider the
proposed deployment scenario and the infrastructure as part of your application design process.
This chapter describes the options available for deployment of different types of applications, including distributed and non-distributed styles, ways to scale the hardware, and the patterns that describe performance, reliability, and security issues. By considering
the possible deployment scenarios for your application as part of the design process, you prevent a situation where the application cannot be successfully deployed, or fails to perform to its design requirements because of technical infrastructure limitations.
Choosing a Deployment Strategy
Choosing a deployment strategy requires design tradeoffs; for example, because of protocol or port restrictions, or specific deployment topologies in your target environment. Identify your deployment constraints early in the design phase to avoid surprises
later, and involve members of your network and infrastructure teams in this process.
When choosing a deployment strategy:
- Understand the target physical environment for deployment.
- Understand the architectural and design constraints based on the deployment environment.
- Understand the security and performance impacts of your deployment environment.
Distributed vs. Non-distributed Deployment
When creating your deployment strategy, first determine if you will use a distributed or a non-distributed deployment model. If you are building a simple application for which you want to minimize the number of required servers, consider a non-distributed deployment.
If you are building a more complex application that you will want to optimize for scalability and maintainability, consider a distributed deployment.
In a non-distributed deployment, all of the functionality and layers reside on a single server except for data storage functionality, as shown in Figure 1.
Figure 1 Non-distributed deployment
This approach has the advantage of simplicity and minimizes the number of physical servers required. It also minimizes the performance impact inherent when communication between layers has to cross physical boundaries between servers or server clusters. Keep
in mind that by using a single server, even though you minimize communication performance overhead, you can hamper performance in other ways. Because all of your layers share resources, one layer can negatively impact all of the other layers when it is under
heavy utilization. The use of a single tier reduces your overall scalability and maintainability since all of the layers share the same physical hardware.
In a distributed deployment, the layers of the application reside on separate physical tiers, as shown in Figure 2.
Figure 2 Distributed deployment
A distributed approach allows you to configure the application servers that host the various layers in order to best meet the requirements of each layer. Distributed deployment also allows you to apply more stringent security to the application servers; for
example, by adding a firewall between the Web server and the application servers, and by using different authentication and authorization options. For instance, in a rich client application, the client may use Web services exposed through a Web server, or
may access functionality in the application server tier using Distributed COM (DCOM) or Windows Communication Foundation (WCF) services.
Distributed deployment provides a more flexible environment where you can more easily scale out or scale up each physical tier as performance limitations arise, and when processing demands increase.
Performance and Design Considerations for Distributed Environments
Distributing components across physical tiers reduces performance because of the cost of remote calls across server boundaries. However, distributed components can improve scalability and manageability, and reduce costs over time.
Consider the following guidelines when designing an application that will run on a physically distributed infrastructure:
- Choose communication paths and protocols between tiers to ensure that components can securely interact with minimum performance degradation.
- Consider using services and operating system features such as distributed transaction support and authentication that can simplify your design and improve interoperability.
- Reduce the complexity of your component interfaces. Highly granular interfaces (“chatty” interfaces) that require many calls to perform a task work best when located on the same physical machine. Interfaces that make only one call to accomplish each task
(“chunky” interfaces) provide the best performance when the components are distributed across separate physical machines.
- Consider separating long-running critical processes from other processes that might fail by using a separate physical cluster.
- Determine your failover strategy. For example, Web servers typically provide plenty of memory and processing power, but may not have robust storage capabilities (such as RAID mirroring) that can be replaced rapidly in the event of a hardware failure.
- Take advantage of asynchronous calls, one-way calls, or message queuing to minimize blocking when making calls across physical boundaries.
- Determine how best to plan for the addition of extra servers or resources that will increase performance and availability.
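The difference between "chatty" and "chunky" interfaces in the guidelines above can be illustrated with a minimal sketch. The `Network` class is a hypothetical stand-in for a physical tier boundary that only counts round trips, and the customer service classes and their methods are invented for illustration; they are not part of any real framework.

```python
from dataclasses import dataclass


@dataclass
class Network:
    """Hypothetical stand-in for a tier boundary; it only counts round trips."""
    round_trips: int = 0

    def remote_call(self) -> None:
        self.round_trips += 1


class ChattyCustomerService:
    """Fine-grained interface: every property update is a separate remote call."""

    def __init__(self, network: Network) -> None:
        self._network = network
        self.record: dict = {}

    def set_name(self, name: str) -> None:
        self._network.remote_call()
        self.record["name"] = name

    def set_email(self, email: str) -> None:
        self._network.remote_call()
        self.record["email"] = email

    def set_city(self, city: str) -> None:
        self._network.remote_call()
        self.record["city"] = city


class ChunkyCustomerService:
    """Coarse-grained interface: one remote call carries the whole update."""

    def __init__(self, network: Network) -> None:
        self._network = network
        self.record: dict = {}

    def update_customer(self, name: str, email: str, city: str) -> None:
        self._network.remote_call()
        self.record = {"name": name, "email": email, "city": city}


chatty_network = Network()
chatty = ChattyCustomerService(chatty_network)
chatty.set_name("Ada")
chatty.set_email("ada@example.com")
chatty.set_city("London")

chunky_network = Network()
chunky = ChunkyCustomerService(chunky_network)
chunky.update_customer("Ada", "ada@example.com", "London")

print(chatty_network.round_trips, chunky_network.round_trips)  # prints: 3 1
```

Both services end with the same record, but the chatty interface pays for three cross-tier calls where the chunky one pays for one. On a single machine that difference is negligible; across physical tiers it is multiplied by network latency.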
Recommendations for Locating Components within a Distributed Deployment
When designing a distributed deployment, you need to determine which layers and components you will put into each physical tier. In most cases you will place the presentation layer on the client or on the Web server; the business, data access, and service layers
on the application server; and the database on its own server. In some cases you will want to modify this pattern.
Consider the following guidelines when determining where to locate components in a distributed environment:
- Only distribute components where necessary. Common reasons for implementing distributed deployment include security policies, physical constraints, shared business logic, and scalability.
- In Web applications, deploy business components that are used synchronously by user interfaces (UIs) or user process components in the same physical tier as the UI in order to maximize performance and ease operational management.
- Don’t place UI and business components on the same tier if there are security implications that require a trust boundary between them. For instance, you might want to separate business and UI components in a rich client application by placing the UI on
the client and the business components on the server.
- Deploy service agent components on the same tier as the code that calls the components, unless there are security implications that require a trust boundary between them.
- Deploy asynchronous business components, workflow components, and business services on a separate physical tier where possible.
- Deploy business entities on the same physical tier as the components that use them.
Scale Up vs. Scale Out
Your approach to scaling is a critical design consideration. Whether you plan to scale out your solution through a Web farm, a load-balanced middle tier, or a partitioned database, you need to ensure that your design supports this.
When you scale your application, you can choose between, or combine, two basic options:
- Scale up: Get a bigger box.
- Scale out: Get more boxes.
Scale Up: Get a Bigger Box
With this approach, you add hardware such as processors, RAM, and network interface cards (NICs) to your existing servers to support increased capacity. This is a simple option and can be cost-effective because it does not introduce additional maintenance and
support costs. However, any single points of failure remain, which is a risk. Beyond a certain threshold, adding more hardware to the existing servers may not produce the desired results. For an application to scale up effectively, the underlying framework,
run time, and computer architecture must scale up as well. When scaling up, consider which resources the application is bound by. If it is memory-bound or network-bound, adding CPU resources will not help.
Scale Out: Get More Boxes
To scale out, you add more servers and use load-balancing and clustering solutions. In addition to handling additional load, the scale-out scenario also protects against hardware failures. If one server fails, there are additional servers in the cluster that
can take over the load. For example, you might host multiple Web servers in a Web farm that hosts presentation and business layers, or you might physically partition your application’s business logic and use a separately load-balanced middle tier along with
a load-balanced front tier hosting the presentation layer. If your application is I/O-constrained and you must support an extremely large database, you might partition your database across multiple database servers. In general, the ability of an application
to scale out depends more on its architecture than on underlying infrastructure.
Consider Whether You Need to Support Scale Out
Scaling up with additional processor power and increased memory can be a cost-effective solution. This approach also avoids introducing the additional management cost associated with scaling out and using Web farms and clustering technology. You should look
at scale-up options first and conduct performance tests to see whether scaling up your solution meets your defined scalability criteria and supports the necessary number of concurrent users at an acceptable performance level. You should have a scaling plan
for your system that tracks its observed growth.
If scaling up your solution does not provide adequate scalability because you reach CPU, I/O, or memory thresholds, you must scale out and introduce additional servers. Consider the following practices in your design to ensure that your application can be scaled out:
- You need to be able to scale out your bottlenecks, wherever they are. If the bottlenecks are on a shared resource that cannot be scaled, you have a problem. However, having a class of servers that have affinity with one resource type could be beneficial,
but they must then be independently scaled. For example, if you have a single Microsoft SQL Server® instance that provides a directory, everyone uses it. In this case, when the server becomes a bottleneck, you can scale out and use multiple copies. Creating
an affinity between the data in the directory and the SQL Servers that serve the data allows you to concentrate those servers and does not cause scaling problems later, so in this case affinity is a good idea.
- Define a loosely coupled and layered design. A loosely coupled, layered design with clean, remotable interfaces is more easily scaled out than tightly-coupled layers with “chatty” interactions. A layered design will have natural clutch points, making
it ideal for scaling out at the layer boundaries. The trick is to find the right boundaries. For example, business logic may be more easily relocated to a load-balanced, middle-tier application server farm.
Consider Design Implications and Tradeoffs up Front
You need to consider aspects of scalability that may vary by application layer, tier, or type of data. Know your tradeoffs up front and know where you have flexibility and where you do not. Scaling up and then out with Web or application servers might not be
the best approach. For example, although you can have an 8-processor server in this role, economics would probably drive you to a set of smaller servers instead of a few big ones. On the other hand, scaling up and then out might be the right approach for your
database servers, depending on the role of the data and how the data is used. Apart from technical and performance considerations, you also need to take into account operational and management implications and the related total cost of ownership (TCO).
If you have stateless components (for example, a Web front end with no in-process state and no stateful business components), this aspect of your design supports both scaling up and scaling out. Typically, you optimize the price and performance within the boundaries
of the other constraints you may have. For example, 2-processor Web or application servers may be optimal when you evaluate price and performance compared with 4-processor servers; that is, four 2-processor servers may be better than two 4-processor servers.
You also need to consider other constraints, such as the maximum number of servers you can have behind a particular load-balancing infrastructure. In general, there are no design tradeoffs if you adhere to a stateless design. You optimize price, performance,
For data, decisions largely depend on the type of data:
- Static, reference, and read-only data. For this type of data, you can easily have many replicas in the right places if this helps your performance and scalability. This has minimal impact on design and can be largely driven by optimization considerations.
Consolidating several logically separate and independent databases on one database server may or may not be appropriate even if you can do it in terms of capacity. Spreading replicas closer to the consumers of that data may be an equally valid approach. However,
be aware that whenever you replicate, you will have a loosely synchronized system.
- Dynamic (often transient) data that is easily partitioned. This is data that is relevant to a particular user or session (and if subsequent requests can come to different Web or application servers, they all need to access it), but the data for user
A is not related in any way to the data for user B. For example, shopping carts and session state both fall into this category. This data is slightly more complicated to handle than static, read-only data, but you can still optimize and distribute quite easily.
This is because this type of data can be partitioned. There are no dependencies between the groups, down to the individual user level. The important aspect of this data is that you do not query it across partitions. For example, you ask for the contents of
user A’s shopping cart but do not ask to show all carts that contain a particular item.
- Core data. This type of data is well maintained and protected. This is the main case where the “scale up, then out” approach usually applies. Generally, you do not want to hold this type of data in many places because of the complexity of keeping
it synchronized. This is the classic case in which you would typically want to scale up as far as you can (ideally, remaining a single logical instance, with proper clustering), and only when this is not enough, consider partitioning and distribution scale-out.
Advances in database technology (such as distributed partitioned views) have made partitioning much easier, although you should do so only if you need to. This is rarely because the database is too big, but more often it is driven by other considerations such
as who owns the data, geographic distribution, proximity to the consumers, and availability.
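The handling of easily partitioned data such as shopping carts, described above, can be sketched as follows. This is a minimal illustration under stated assumptions: the partition count, the `PartitionedCartStore` class, and the use of a SHA-256 hash to pick a partition are all choices made for the example, and in-memory dictionaries stand in for separate stores.

```python
import hashlib

PARTITION_COUNT = 4  # hypothetical number of independent stores


def partition_for(user_id: str, partition_count: int = PARTITION_COUNT) -> int:
    """Map a user to a partition with a hash that is stable across processes."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


class PartitionedCartStore:
    """Shopping carts split across partitions, keyed by user."""

    def __init__(self, partition_count: int = PARTITION_COUNT) -> None:
        # Each dictionary stands in for a separate server or database.
        self._partitions = [{} for _ in range(partition_count)]

    def _partition(self, user_id: str) -> dict:
        return self._partitions[partition_for(user_id, len(self._partitions))]

    def add_item(self, user_id: str, item: str) -> None:
        self._partition(user_id).setdefault(user_id, []).append(item)

    def cart_for(self, user_id: str) -> list:
        """Single-partition read: only one store is touched."""
        return list(self._partition(user_id).get(user_id, []))

    def users_with_item(self, item: str) -> list:
        """Cross-partition query: every store must be visited."""
        return sorted(
            user
            for partition in self._partitions
            for user, items in partition.items()
            if item in items
        )


store = PartitionedCartStore()
store.add_item("user-a", "book")
store.add_item("user-a", "pen")
store.add_item("user-b", "book")

print(store.cart_for("user-a"))       # prints: ['book', 'pen']
print(store.users_with_item("book"))  # prints: ['user-a', 'user-b']
```

`cart_for` stays cheap however many partitions exist, because user A's data has no dependency on user B's. `users_with_item` is the kind of query the text warns against for this data: its cost grows with the number of partitions.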
Consider Database Partitioning at Design Time
If your application uses a very large database and you anticipate an I/O bottleneck, ensure that you design for database partitioning up front. Moving to a partitioned database later usually results in a significant amount of costly rework and often a complete redesign.
Partitioning provides several benefits:
- The ability to restrict queries to a single partition, thereby limiting the resource usage to only a fraction of the data.
- The ability to engage multiple partitions, thereby getting more parallelism and superior performance because you can have more disks working to retrieve your data.
Be aware that in some situations, multiple partitions may not be appropriate and could have a negative impact. For example, some operations that use multiple disks could be performed more efficiently with concentrated data. So when you partition, consider the
benefits together with alternate approaches.
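The two benefits of partitioning listed above can be sketched in a few lines. The example assumes a hypothetical table of order amounts partitioned by year; the figures are made up for illustration, and a thread pool stands in for multiple disks or servers working in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical order amounts, partitioned by year. Each list stands in for
# a partition held on its own disk or server.
PARTITIONS = {
    2006: [120, 80],
    2007: [200, 50, 75],
    2008: [300],
}


def total_for_year(year: int) -> int:
    """Restricted query: reads a single partition only."""
    return sum(PARTITIONS[year])


def total_all_years() -> int:
    """Fan-out query: each partition is scanned by its own worker."""
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(total_for_year, PARTITIONS))


print(total_for_year(2007))  # prints: 325
print(total_all_years())     # prints: 825
```

The sketch shows only the shape of the two access patterns. Whether the fan-out query is actually faster depends on the workload, which is the caveat the text raises about operations that run better on concentrated data.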
Performance deployment patterns represent proven design solutions to common performance problems. When considering a high-performance deployment, you can scale up or scale out. Scaling up entails improvements to the hardware on which you are already running.
Scaling out entails distributing your application across multiple physical servers to distribute the load. A layered application lends itself more easily to being scaled out. Consider the use of Web farms or load-balancing clusters when designing a scale-out strategy.
A Web farm is a collection of servers that run the same application. Requests from clients are distributed to each server in the farm, so that each has approximately the same load. Depending on the routing technology used, it may detect failed servers
and remove them from the routing list to minimize the impact of a failure. In simple scenarios, the routing may be on a “round robin” basis where a Domain Name System (DNS) server hands out the addresses of individual servers in rotation. Figure 3 illustrates
a simple Web farm where each server hosts all of the layers of the application except for the data store.
Figure 3 A simple Web farm
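Round-robin routing with removal of failed servers, as described above, can be sketched as follows. The `RoundRobinRouter` class and the server names are hypothetical; real Web farms do this in DNS or in dedicated load-balancing hardware or software rather than in application code.

```python
class RoundRobinRouter:
    """Hands out servers in rotation and skips servers marked as failed."""

    def __init__(self, servers) -> None:
        self._servers = list(servers)
        self._next = 0

    def mark_failed(self, server: str) -> None:
        """Remove a failed server from the routing list."""
        if server in self._servers:
            self._servers.remove(server)

    def route(self) -> str:
        """Return the server that should receive the next request."""
        if not self._servers:
            raise RuntimeError("no healthy servers available")
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


router = RoundRobinRouter(["web1", "web2", "web3"])
before_failure = [router.route() for _ in range(6)]

router.mark_failed("web2")
after_failure = [router.route() for _ in range(4)]

print(before_failure)  # prints: ['web1', 'web2', 'web3', 'web1', 'web2', 'web3']
print(after_failure)   # prints: ['web1', 'web3', 'web1', 'web3']
```

Each healthy server receives roughly the same number of requests, and a failure only reduces capacity instead of taking the application offline.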
Affinity and User Sessions
Web applications often rely on the maintenance of session state between requests from the same user. A Web farm can be configured to route all requests from the same user to the same server—a process known as affinity—in order to maintain state where this is
stored in memory on the Web server. However, for maximum performance and reliability, you should use a separate session state store with a Web farm to remove the requirement for affinity.
In ASP.NET, you must also configure all of the Web servers to use a consistent encryption key and method for viewstate encryption where you do not implement affinity. You should also enable affinity for sessions that use Secure Sockets Layer (SSL) encryption,
or use a separate cluster for SSL requests.
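The reason a separate session state store removes the need for affinity can be shown with a small sketch. The `WebServer` class is hypothetical, and a plain dictionary stands in for an out-of-process session state store; the sketch is not a model of how ASP.NET implements session state.

```python
class WebServer:
    """A Web server that keeps shopping-cart state per session."""

    def __init__(self, name: str, shared_store=None) -> None:
        self.name = name
        # Without a shared store, session state lives in this server's memory.
        self._sessions = shared_store if shared_store is not None else {}

    def handle(self, session_id: str, item=None) -> list:
        """Optionally add an item, then return the session's cart."""
        cart = self._sessions.setdefault(session_id, [])
        if item is not None:
            cart.append(item)
        return list(cart)


# In-memory state: a request routed to a different server loses the cart.
web1, web2 = WebServer("web1"), WebServer("web2")
web1.handle("session-1", "book")
lost_cart = web2.handle("session-1")

# Shared state: any server in the farm can serve any request.
store = {}
web3, web4 = WebServer("web3", store), WebServer("web4", store)
web3.handle("session-1", "book")
kept_cart = web4.handle("session-1")

print(lost_cart)  # prints: []
print(kept_cart)  # prints: ['book']
```

With in-memory state, the farm must route every request in a session to the same server, so losing that server loses the session. With the shared store, the router is free to pick any server, at the cost of an extra call to the store on each request.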
If you use a distributed model for your application, with the business layer and data layer running on different physical tiers from the presentation layer, you can scale out the business layer and data layer by using an application farm. Requests from the
presentation tier are distributed to each server in the farm so that each has approximately the same load. You may decide to separate the business layer components and the data layer components on different application farms, depending on the requirements
of each layer and the expected loading and number of users.
You can install your service or application onto multiple servers that are configured to share the workload, as shown in Figure 4. This type of configuration is known as a load-balanced cluster.
Figure 4 A load-balanced cluster
Load balancing scales the performance of server-based programs, such as a Web server, by distributing client requests across multiple servers. Load-balancing technologies, commonly referred to as load balancers, receive incoming requests and redirect them to
a specific host if necessary. The load-balanced hosts concurrently respond to different client requests, even multiple requests from the same client. For example, a Web browser might obtain the multiple images within a single Web page from different hosts
in the cluster. This distributes the load, speeds up processing, and shortens the response time to clients.
Reliability deployment patterns represent proven design solutions to common reliability problems. The most common approach to improving the reliability of your deployment is to use a failover cluster to ensure the availability of your application even if a server fails.
A failover cluster is a set of servers that are configured in such a way that if one server becomes unavailable, another server automatically takes over for the failed server and continues processing. Figure 5 shows a failover cluster.
Figure 5 A failover cluster
Install your application or service on multiple servers that are configured to take over for one another when a failure occurs. The process of one server taking over for a failed server is commonly known as failover. Each server in the cluster has at least one other server in the cluster identified as its standby server.
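The standby relationship can be sketched as a simple lookup. The `FailoverCluster` class and node names are hypothetical, and the sketch leaves out what real cluster software has to handle, such as failure detection and shared storage; it shows only how work is redirected to a standby server.

```python
class FailoverCluster:
    """Tracks failed servers and redirects work to each server's standby."""

    def __init__(self, standby_for: dict) -> None:
        # Maps each server to the server that takes over if it fails.
        self._standby_for = dict(standby_for)
        self._failed = set()

    def mark_failed(self, server: str) -> None:
        self._failed.add(server)

    def active_server(self, server: str) -> str:
        """Follow standby links until a healthy server is found."""
        visited = set()
        current = server
        while current in self._failed:
            if current in visited:
                raise RuntimeError("no healthy server available")
            visited.add(current)
            current = self._standby_for[current]
        return current


cluster = FailoverCluster({"node1": "node2", "node2": "node3", "node3": "node1"})
print(cluster.active_server("node1"))  # prints: node1

cluster.mark_failed("node1")
print(cluster.active_server("node1"))  # prints: node2

cluster.mark_failed("node2")
print(cluster.active_server("node1"))  # prints: node3
```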
Security patterns represent proven design solutions to common security problems. The impersonation/delegation approach is a good solution when you must flow the context of the original caller to downstream layers or components in your application. The trusted
subsystem approach is a good solution when you want to handle authentication and authorization in upstream components and access a downstream resource with a single trusted identity.
In the impersonation/delegation authorization model, resources and the types of operations (such as read, write, and delete) permitted for each one are secured using Windows Access Control Lists (ACLs) or the equivalent security features of the targeted resource
(such as tables and procedures in SQL Server). Users access the resources using their original identity through impersonation, as illustrated in Figure 6.
Figure 6 The impersonation/delegation authorization model
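A minimal sketch of the impersonation/delegation model follows. All class names, users, and the ACL contents are hypothetical, and a dictionary stands in for Windows ACLs or database permissions. The point illustrated is that the middle tier passes the original caller's identity downstream, and the resource itself decides whether to allow the operation.

```python
class AccessDenied(Exception):
    """Raised when the caller's identity is not permitted the operation."""


class SecuredResource:
    """A back-end resource protected by an ACL keyed on end-user identity."""

    def __init__(self, name: str, acl: dict) -> None:
        self.name = name
        self._acl = {user: set(ops) for user, ops in acl.items()}
        self.audit_log = []

    def access(self, caller: str, operation: str) -> str:
        allowed = operation in self._acl.get(caller, set())
        # The audit trail records the original caller, not a service account.
        self.audit_log.append((caller, operation, allowed))
        if not allowed:
            raise AccessDenied(f"{caller} may not {operation} {self.name}")
        return f"{operation} on {self.name} as {caller}"


class MiddleTier:
    """Flows the original caller's identity to the downstream resource."""

    def __init__(self, resource: SecuredResource) -> None:
        self._resource = resource

    def handle(self, original_caller: str, operation: str) -> str:
        return self._resource.access(original_caller, operation)


orders = SecuredResource("orders", {"alice": {"read", "write"}, "bob": {"read"}})
tier = MiddleTier(orders)

print(tier.handle("alice", "write"))  # prints: write on orders as alice
try:
    tier.handle("bob", "write")
except AccessDenied as error:
    print(error)  # prints: bob may not write orders
```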
In the trusted subsystem (or trusted server) model, users are partitioned into application-defined, logical roles. Members of a particular role share the same privileges within the application. With this role-based (or operations-based) approach to security, access to operations (typically expressed by method calls), not to back-end resources, is authorized based on the role membership of the caller. Roles, analyzed and defined at application
design time, are used as logical containers that group together users who share the same security privileges or capabilities within the application. The middle-tier service uses a fixed identity to access downstream services and resources, as illustrated in Figure 7.
Figure 7 The trusted subsystem (or trusted server) model
Multiple Trusted Service Identities
In some situations, you might require more than one trusted identity. For example, you might have two groups of users, one who should be authorized to perform read/write operations and the other read-only operations. The use of multiple trusted service identities
provides the ability to exert more granular control over resource access and auditing, without having a large impact on scalability. Figure 8 illustrates the multiple trusted service identities model.
Figure 8 The multiple trusted service identities model
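The trusted subsystem model with multiple trusted service identities can be sketched as follows. The role names, user names, and service identities (`svc-readwrite`, `svc-readonly`) are hypothetical. The middle tier authorizes the caller by role, then accesses the database under the fixed identity mapped to that role, so the database never sees individual users.

```python
# Hypothetical application roles, their permitted operations, and the
# trusted service identity each role maps to.
ROLE_OPERATIONS = {"Editors": {"read", "write"}, "Viewers": {"read"}}
ROLE_SERVICE_IDENTITY = {"Editors": "svc-readwrite", "Viewers": "svc-readonly"}
USER_ROLES = {"alice": "Editors", "bob": "Viewers"}


class Database:
    """Downstream resource that knows only the trusted service identities."""

    GRANTS = {"svc-readwrite": {"read", "write"}, "svc-readonly": {"read"}}

    def __init__(self) -> None:
        self.audit_log = []

    def execute(self, identity: str, operation: str) -> str:
        if operation not in self.GRANTS.get(identity, set()):
            raise PermissionError(f"{identity} may not {operation}")
        self.audit_log.append((identity, operation))
        return f"{operation} as {identity}"


class TrustedSubsystem:
    """Authorizes callers by role, then uses a fixed identity downstream."""

    def __init__(self, database: Database) -> None:
        self._database = database

    def handle(self, user: str, operation: str) -> str:
        role = USER_ROLES.get(user)
        if role is None or operation not in ROLE_OPERATIONS[role]:
            raise PermissionError(f"{user} may not {operation}")
        return self._database.execute(ROLE_SERVICE_IDENTITY[role], operation)


database = Database()
subsystem = TrustedSubsystem(database)

print(subsystem.handle("alice", "write"))  # prints: write as svc-readwrite
print(subsystem.handle("bob", "read"))     # prints: read as svc-readonly
print(database.audit_log)
# prints: [('svc-readwrite', 'write'), ('svc-readonly', 'read')]
```

Because the database sees only two identities, connections can be pooled per identity, while the audit trail still distinguishes read/write access from read-only access. Auditing individual users would have to be done in the middle tier.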
Network Infrastructure Security Considerations
Make sure that you understand the network structure provided by your target environment, and understand the baseline security requirements of the network in terms of filtering rules, port restrictions, supported protocols, and so on. Recommendations for maximizing
network security include:
- Identify how firewalls and firewall policies are likely to affect your application’s design and deployment. Firewalls should be used to separate the Internet-facing applications from the internal network, and to protect the database servers. These can limit
the available communication ports and, therefore, authentication options from the Web server to remote application and database servers. For example, Windows authentication requires additional ports.
- Consider what protocols, ports, and services are allowed to access internal resources from the Web servers in the perimeter network or from rich client applications. Identify the protocols and ports that the application design requires, and analyze the
potential threats that occur from opening new ports or using new protocols.
- Communicate and record any assumptions made about network and application layer security, and what security functions each component will handle. This prevents security controls from being missed when both development and network teams assume that the other
team is addressing the issue.
- Pay attention to the security defenses that your application relies upon the network to provide, and ensure that these defenses are in place.
- Consider the implications of a change in network configuration, and how this will affect security.
The choices you make when deploying an application affect the capabilities for managing and monitoring the application. You should take into account the following recommendations:
- Deploy components of the application that are used by multiple consumers in a single central location to avoid duplication.
- Ensure that data is stored in a location where backup and restore facilities can access it.
- Components that rely on existing software or hardware (such as a proprietary network that can only be established from a particular computer) must be physically located on the same computer.
- Some libraries and adaptors cannot be deployed freely without incurring extra cost, or may be charged on a per-CPU basis; therefore, you should centralize these features.
- Groups within an organization may own a particular service, component, or application that they need to manage locally.
- Monitoring tools such as System Center Operations Manager require access to physical machines to obtain management information, and this may impact deployment options.
- The use of management and monitoring technologies such as Windows Management Instrumentation (WMI) may impact deployment options.
Key patterns are organized by key categories such as Deployment, Manageability, Performance & Reliability, and Security in the following table. Consider using these patterns when making design decisions for each category.
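|Category|Relevant patterns|
|Deployment|Layered Application, Three-Layered Services Application, Tiered Distribution, Three-Tiered Distribution, Deployment Plan|
|Manageability|Adapter, Provider|
|Performance & Reliability|Server Clustering, Load-Balanced Cluster, Failover Cluster|
|Security|Brokered Authentication, Direct Authentication, Impersonation and Delegation, Trusted Subsystem|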
- For more information on the Layered Application, Three-Layered Services Application, Tiered Distribution, Three-Tiered Distribution, and Deployment Plan patterns, see “Deployment Patterns” at
- For more information on the Server Clustering, Load-Balanced Cluster, and Failover Cluster patterns, see “Performance and Reliability Patterns” at
- For more information on the Brokered Authentication, Direct Authentication, Impersonation and Delegation, and Trusted Subsystem patterns, see “Web Service Security” at
- For more information on the Provider pattern, see “Provider Model Design Pattern and Specification, Part 1” at
- For more information on the Adapter pattern, see “data & object factory” at
- Adapter. An object that supports a common interface and translates operations between the common interface and other objects that implement similar functionality with different interfaces.
- Brokered Authentication. A pattern that authenticates against a broker, which provides a token to use for authentication when accessing services or systems.
- Direct Authentication. A pattern that authenticates directly against the service or system that is being accessed.
- Layered Application. An architectural pattern where a system is organized into layers.
- Load-Balanced Cluster. A distribution pattern where multiple servers are configured to share the workload. Load balancing provides both improvements in performance by spreading the work across multiple servers, and reliability where one server
can fail and the others will continue to handle the workload.
- Provider. A pattern that implements a component that exposes an API that is different from the client API, in order to allow any custom implementation to be seamlessly plugged in. Many applications that provide instrumentation expose providers that
can be used to capture information about the state and health of your application and the system hosting the application.
- Tiered Distribution. An architectural pattern where the layers of a design can be distributed across physical boundaries.
- Trusted Subsystem. A pattern where the application acts as a trusted subsystem to access additional resources. It uses its own credentials instead of the user’s credentials to access the resource.
patterns & practices Solution Assets
- Enterprise Library provides a series of application blocks that simplify common tasks such as caching, exception handling, validation, logging, cryptography, credential management, and facilities for implementing design patterns such as Inversion
of Control and Dependency Injection. For more information, see
- Unity Application Block is a lightweight, extensible dependency injection container that helps you to build loosely coupled applications. For more information, see