Note that the information on this page is the BETA 1 of a guide that is now released. See
http://www.codeplex.com/AppArchGuide for the latest PDF and HTML content.
Chapter 4 – Deployment Patterns
- J.D. Meier, Alex Homer, David Hill,
Prashant Bansode, Lonnie Wall, Rob Boucher Jr, Akshay Bogawat
- Learn the key factors that influence deployment choices.
- Understand deployment scenarios for each of the application types covered in this guide.
- Learn common deployment patterns.
This chapter covers key issues to consider when deploying an application or service. The chapter contains recommendations for choosing a deployment pattern and the factors that affect performance, reliability, and security. These represent the key areas for
deployment where mistakes are most often made.
Choosing a Deployment Strategy
The target deployment environment for an application may already be rigidly defined, and so the application design must reflect the restrictions. Sometimes design tradeoffs are required; for example, because of protocol or port restrictions, or specific deployment
topologies. Identify constraints early in the design phase to avoid surprises later, and involve members of the network and infrastructure teams to help with this process. General recommendations are:
- Know your target physical deployment environment early, from the planning stage of the lifecycle.
- Clearly communicate the environmental constraints that drive software design and architecture decisions.
- Clearly communicate the software design decisions that require certain infrastructure attributes.
Distributed vs. Non-Distributed Deployment
A non-distributed deployment is where all of the functionality and layers reside on a single server except for data storage functionality.
This approach has the advantage of simplicity and minimizes the number of physical servers required. It also minimizes the performance impact inherent when communication between layers has to cross physical boundaries between servers or server clusters.
A non-distributed deployment does have some disadvantages:
- The processing requirements of the layers differ. For example, the presentation layer must cope with multiple concurrent users and short bursts of activity, while the business and data layers should be optimized to deal with a steady stream of requests
from a limited number of callers. Processing on one layer could absorb sufficient resources to slow the processing in other layers.
- The security requirements of the presentation layer may differ from those of the business and data layers. For example, the presentation layer will not store sensitive data, while this may be stored in the business and data layers.
- It is difficult to share business logic between applications.
A distributed deployment is where the layers of the application reside on separate physical tiers. Distributed deployment allows you to separate the layers of an application on different physical tiers as shown in the following figure.
This approach allows you to configure the application servers that host the various layers to best meet the requirements of each layer. Distributed deployment also allows you to apply more stringent security to the application servers; for example, by adding
a firewall between the Web server and the application servers and by using different authentication and authorization options.
In rich client applications, the client may use Web services exposed through a Web server, or may access functionality in the application server tier using DCOM or Windows Communication Foundation (WCF) services.
Distributed deployment provides a more flexible environment where you can more easily scale out or scale up each physical tier as performance limitations arise, and when processing demands increase.
Performance and Design Considerations for Distributed Environments
Distributing components across physical tiers reduces performance due to the cost of remote calls across server boundaries. However, distributed components can improve scalability opportunities, improve manageability, and reduce costs over time.
Consider the following guidelines when designing an application that will run on a physically distributed infrastructure:
- Choose communication paths and protocols between tiers to ensure that components can securely interact with minimum performance degradation.
- Use services and operating system features such as distributed transaction support and authentication that can simplify your design and improve interoperability.
- Reduce the complexity of your component interfaces. Highly granular interfaces ("chatty" interfaces) that require many calls to perform a task work best when on the same physical machine. Interfaces that make only one call to accomplish each task
("chunky" interfaces) provide the best performance when the components are distributed across separate physical machines.
- Consider separating long-running critical processes from other processes that might fail by using a separate physical cluster.
- Determine your failover strategy. For example, Web servers typically provide plenty of memory and processing power, but may not have robust storage capabilities (such as RAID mirroring) that can be replaced rapidly in the event of a hardware failure.
- Take advantage of asynchronous calls, one-way calls, or message queuing to minimize blocking when making calls across physical boundaries.
- Plan how best to add extra servers or resources that will increase performance and availability.
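The difference between "chatty" and "chunky" interfaces in the guidelines above can be illustrated with a minimal sketch. It assumes a hypothetical RemoteCustomerService class that stands in for a component on another physical tier and simply counts simulated cross-tier calls; the class, its methods, and the sample data are illustrative only and are not part of any real framework.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    email: str
    city: str


class RemoteCustomerService:
    """Hypothetical stand-in for a component hosted on another physical tier."""

    def __init__(self, customer: Customer) -> None:
        self._customer = customer
        self.round_trips = 0  # number of simulated cross-tier calls

    # Chatty interface: one remote call per field.
    def get_name(self) -> str:
        self.round_trips += 1
        return self._customer.name

    def get_email(self) -> str:
        self.round_trips += 1
        return self._customer.email

    def get_city(self) -> str:
        self.round_trips += 1
        return self._customer.city

    # Chunky interface: one remote call returns everything the task needs.
    def get_customer(self) -> Customer:
        self.round_trips += 1
        c = self._customer
        return Customer(c.name, c.email, c.city)


# Build the same display label through each style of interface.
chatty = RemoteCustomerService(Customer("Ann", "ann@example.com", "Leeds"))
label_chatty = f"{chatty.get_name()} <{chatty.get_email()}>, {chatty.get_city()}"

chunky = RemoteCustomerService(Customer("Ann", "ann@example.com", "Leeds"))
customer = chunky.get_customer()
label_chunky = f"{customer.name} <{customer.email}>, {customer.city}"
```

Both paths produce the same label, but the chatty path crosses the simulated boundary three times and the chunky path once. On a single machine the difference is negligible; across physical tiers each extra call adds network latency.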
Recommendations for locating components within a distributed deployment
Consider the following guidelines when determining where to locate components in a distributed environment:
- Only distribute components where necessary. Common reasons for implementing distributed deployment include security policies, physical constraints, shared business logic, and scalability.
- In Web applications, deploy business components that are used synchronously by user interfaces or user process components in the same physical tier as the user interface to maximize performance and ease operational management.
- Don’t place UI and business components on the same tier if there are security implications that require a trust boundary between them. For instance, you may wish to separate business and UI components in a rich client application by placing UI on the client
and business components on the server.
- Deploy service agent components on the same tier as the code that calls the components, unless there are security implications that require a trust boundary between them.
- Deploy asynchronous business components, workflow components, and business services on a separate physical tier where possible.
- Deploy business entities on the same physical tier as the code that uses them.
Scale Up vs. Scale Out
Your approach to scaling is a critical design consideration because whether you plan to scale out your solution through a Web farm, a load-balanced middle tier, or a partitioned database, you need to ensure that your design supports this.
When you scale your application, you can choose from and combine two basic choices:
- Scale up: Get a bigger box.
- Scale out: Get more boxes.
Scale Up: Get a Bigger Box
With this approach, you add hardware such as processors, RAM, and network interface cards to your existing servers to support increased capacity. This is a simple option and one that can be cost effective. It does not introduce additional maintenance and support
costs. However, any single points of failure remain, which is a risk. Beyond a certain threshold, adding more hardware to the existing servers may not produce the desired results. For an application to scale up effectively, the underlying framework, runtime,
and computer architecture must scale up as well. When scaling up, consider which resources the application is bound by. If it is memory-bound or network-bound, adding CPU resources will not help.
Scale Out: Get More Boxes
To scale out, you add more servers and use load balancing and clustering solutions. In addition to handling additional load, the scale-out scenario also protects against hardware failures. If one server fails, there are additional servers in the cluster that
can take over the load. For example, you might host multiple Web servers in a Web farm that hosts presentation and business layers, or you might physically partition your application's business logic and use a separately load-balanced middle tier along
with a load-balanced front tier hosting the presentation layer. If your application is I/O-constrained and you must support an extremely large database, you might partition your database across multiple database servers. In general, the ability of an application
to scale out depends more on its architecture than on underlying infrastructure.
Consider Whether You Need to Support Scale Out
Scaling up with additional processor power and increased memory can be a cost-effective solution. It also avoids introducing the additional management cost associated with scaling out and using Web farms and clustering technology. You should look at scale-up
options first and conduct performance tests to see whether scaling up your solution meets your defined scalability criteria and supports the necessary number of concurrent users at an acceptable level of performance. You should have a scaling plan for your
system that tracks its observed growth.
If scaling up your solution does not provide adequate scalability because you reach CPU, I/O, or memory thresholds, you must scale out and introduce additional servers. To ensure that your application can be scaled out successfully, consider the following practices
in your design:
- You need to be able to scale out your bottlenecks, wherever they are. If the bottlenecks are on a shared resource that cannot be scaled, you have a problem. However, having a class of servers that have affinity with one resource type could be beneficial,
but they must then be independently scaled. For example, if you have a single SQL Server™ that provides a directory, everyone uses it. In this case, when the server becomes a bottleneck, you can scale out and use multiple copies. Creating an affinity between
the data in the directory and the SQL Servers that serve the data allows you to specialize those servers and does not cause scaling problems later, so in this case affinity is a good idea.
- Define a loosely coupled and layered design. A loosely coupled, layered design with clean, remotable interfaces is more easily scaled out than tightly-coupled layers with "chatty" interactions. A layered design will have natural clutch
points, making it ideal for scaling out at the layer boundaries. The trick is to find the right boundaries. For example, business logic may be more easily relocated to a load-balanced, middle-tier application server farm.
Consider Design Implications and Tradeoffs Up Front
You need to consider aspects of scalability that may vary by application layer, tier, or type of data. Know your tradeoffs up front and know where you have flexibility and where you do not. Scaling up and then out with Web or application servers may not be
the best approach. For example, although you can have an 8-processor server in this role, economics would probably drive you to a set of smaller servers instead of a few big ones. On the other hand, scaling up and then out may be the right approach for your
database servers, depending on the role of the data and how the data is used. Apart from technical and performance considerations, you also need to take into account operational and management implications and the related total cost of ownership.
If you have stateless components (for example, a Web front end with no in-process state and no stateful business components), this aspect of your design supports scaling up and out. Typically, you optimize the price and performance within the boundaries of
the other constraints you may have. For example, 2-processor Web or application servers may be optimal when you evaluate price and performance compared with 4-processor servers; that is, four 2-processor servers may be better than two 4-processor servers.
You also need to consider other constraints, such as the maximum number of servers you can have behind a particular load-balancing infrastructure. In general, there are no design tradeoffs if you adhere to a stateless design. You optimize price and performance.
For data, decisions largely depend on the type of data:
- Static, reference, and read-only data. For this type of data, you can easily have many replicas in the right places if this helps your performance and scalability. This has minimal impact on design and can be largely driven by optimization considerations.
Consolidating several logically separate and independent databases on one database server may or may not be appropriate even if you can do it in terms of capacity. Spreading replicas closer to the consumers of that data may be an equally valid approach. However,
be aware that whenever you replicate, you will have a loosely synchronized system.
- Dynamic (often transient) data that is easily partitioned. This is data that is relevant to a particular user or session (and if subsequent requests can come to different Web or application servers, they all need to access it), but the data for user
A is not related in any way to the data for user B. For example, shopping carts and session state both fall into this category. This data is slightly more complicated to handle than static, read-only data, but you can still optimize and distribute quite easily.
This is because this type of data can be partitioned. There are no dependencies between the groups, down to the individual user level. The important aspect of this data is that you do not query it across partitions. For example, you ask for the contents of
user A's shopping cart but do not ask to show all carts that contain a particular item.
- Core data. This type of data is well maintained and protected. This is the main case where the "scale up, then out" approach usually applies. Generally, you do not want to hold this type of data in many places due to the complexity of keeping
it synchronized. This is the classic case in which you would typically want to scale up as far as you can (ideally, remaining a single logical instance, with proper clustering), and only when this is not enough, consider partitioning and distribution scale-out.
Advances in database technology (such as distributed partitioned views) have made partitioning much easier, although you should do so only if you need to. This is rarely because the database is too big, but more often it is driven by other considerations such
as who owns the data, geographic distribution, proximity to the consumers and availability.
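The "easily partitioned" case described above (per-user data such as shopping carts) can be sketched as follows. This is a minimal illustration, assuming a hypothetical in-memory PartitionedCartStore and a hash-based placement rule; the partition count, class name, and hashing choice are assumptions made for the example, not something this guide prescribes.

```python
import hashlib

PARTITION_COUNT = 4  # hypothetical number of partitions


def partition_for(user_id: str, partitions: int = PARTITION_COUNT) -> int:
    """Map a user ID to a partition index using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


class PartitionedCartStore:
    """Hypothetical cart store; each dict stands in for a separate server."""

    def __init__(self, partitions: int = PARTITION_COUNT) -> None:
        self._partitions = [{} for _ in range(partitions)]

    def _shard(self, user_id: str) -> dict:
        return self._partitions[partition_for(user_id, len(self._partitions))]

    def add_item(self, user_id: str, item: str) -> None:
        # Every write for a given user lands on exactly one partition.
        self._shard(user_id).setdefault(user_id, []).append(item)

    def cart(self, user_id: str) -> list:
        # A read touches only the partition that owns this user's data.
        return list(self._shard(user_id).get(user_id, []))


store = PartitionedCartStore()
store.add_item("user-a", "book")
store.add_item("user-a", "pen")
store.add_item("user-b", "lamp")
```

Asking for one user's cart touches a single partition. A query such as "all carts that contain a particular item" would have to visit every partition, which is why this kind of data is a good fit only when you do not query across partitions.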
Consider Database Partitioning at Design Time
If your application uses a very large database and you anticipate an I/O bottleneck, ensure that you design for database partitioning up front. Moving to a partitioned database later usually results in a significant amount of costly rework and often a complete redesign.
Partitioning provides several benefits:
- The ability to restrict queries to a single partition, thereby limiting the resource usage to only a fraction of the data.
- The ability to engage multiple partitions, thereby getting more parallelism and superior performance because you can have more disks working to retrieve your data.
Be aware that in some situations, multiple partitions may not be appropriate and could have a negative impact. For example, some operations that use multiple disks could be performed more efficiently with concentrated data. So, when you partition, consider
the benefits together with alternate approaches.
A Web farm is a collection of servers that run the same application. Requests from clients are distributed to each server in the farm, so that each has approximately the same loading. Depending on the routing technology used, it may detect failed servers and
remove them from the routing list to minimize the impact of a failure. In simple scenarios, the routing may be on a "round robin" basis where a DNS server hands out the addresses of individual servers in rotation. The following figure illustrates
a simple Web farm where each server hosts all of the layers of the application except for the data store.
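The round-robin routing and failed-server removal described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical RoundRobinRouter class and made-up server names; a real farm would rely on DNS or a dedicated load balancer rather than application code.

```python
class RoundRobinRouter:
    """Hypothetical router that hands out servers in rotation."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0  # index of the server that receives the next request

    def mark_failed(self, server):
        """Remove a failed server from the routing list."""
        if server not in self._servers:
            return
        index = self._servers.index(server)
        self._servers.remove(server)
        # Keep the rotation pointing at the same upcoming server.
        if index < self._next:
            self._next -= 1
        self._next = self._next % len(self._servers) if self._servers else 0

    def route(self):
        """Return the server that should handle the next request."""
        if not self._servers:
            raise RuntimeError("no servers available")
        server = self._servers[self._next]
        self._next = (self._next + 1) % len(self._servers)
        return server


router = RoundRobinRouter(["web1", "web2", "web3"])
before_failure = [router.route() for _ in range(4)]
router.mark_failed("web2")
after_failure = [router.route() for _ in range(3)]
```

In this sketch, before_failure cycles through all three servers and after_failure alternates between the two that remain, so requests stop reaching the failed server.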
Affinity and User Sessions
Web applications often rely on the maintenance of session state between requests from the same user. A Web farm can be configured to route all requests from the same user to the same server – a process known as affinity – in order to maintain state where this
is stored in memory on the Web server. However, for maximum performance and reliability, you should use a separate session state store with a Web farm to remove the requirement for affinity.
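To see why in-process session state forces affinity, consider the following minimal sketch. It assumes a hypothetical WebServer class whose handle method counts requests per session, with a plain dictionary standing in for a separate session state store; none of these names correspond to a real ASP.NET API.

```python
class WebServer:
    """Hypothetical Web server that tracks a per-session request count."""

    def __init__(self, name, shared_store=None):
        self.name = name
        # Use the shared store when one is supplied; otherwise keep
        # session state in this server's own memory.
        self._sessions = shared_store if shared_store is not None else {}

    def handle(self, session_id):
        count = self._sessions.get(session_id, 0) + 1
        self._sessions[session_id] = count
        return count


# In-process state, no affinity: requests alternate between servers,
# so each server sees only part of the session.
local_farm = [WebServer("web1"), WebServer("web2")]
local_counts = [local_farm[i % 2].handle("session-a") for i in range(4)]

# Separate session state store: any server can handle any request.
shared_store = {}
shared_farm = [WebServer("web1", shared_store), WebServer("web2", shared_store)]
shared_counts = [shared_farm[i % 2].handle("session-a") for i in range(4)]
```

With in-process state the counts come back as 1, 1, 2, 2 because each server keeps its own copy. With the shared store they come back as 1, 2, 3, 4 regardless of which server handles each request, which is what removes the need for affinity.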
In ASP.NET, you must also configure all of the Web servers to use a consistent encryption key and method for viewstate encryption where you do not implement affinity. You should also enable affinity for sessions that use SSL, or use a separate cluster for SSL requests.
If you use a distributed model for your application, with the business layer and data layer running on different physical tiers from the presentation layer, you can scale out the business layer and data layer using an application farm. An application farm is
a collection of servers that run the same application. Requests from the presentation tier are distributed to each server in the farm so that each has approximately the same loading. You may decide to separate the business layer components and the data
layer components on different application farms depending on the requirements of each layer and the expected loading and number of users.
Load Balancing Cluster
Install your service or application onto multiple servers that are configured to share the workload. This type of configuration is a load-balanced cluster.
Load balancing scales the performance of server-based programs, such as a Web server, by distributing client requests across multiple servers. Load balancing technologies, commonly referred to as load balancers, receive incoming requests and redirect them to
a specific host if necessary. The load-balanced hosts concurrently respond to different client requests, even multiple requests from the same client. For example, a Web browser may obtain the multiple images within a single Web page from different hosts in
the cluster. This distributes the load, speeds up processing, and shortens the response time to clients.
A failover cluster is a set of servers that are configured so that if one server becomes unavailable, another server automatically takes over for the failed server and continues processing.
Install your application or service on multiple servers that are configured to take over for one another when a failure occurs. The process of one server taking over for a failed server is commonly known as failover. Each server in the cluster has at least
one other server in the cluster identified as its standby server.
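The standby relationship described above can be sketched as a simple lookup. This is a minimal illustration, assuming a hypothetical FailoverCluster class in which every server has exactly one designated standby; real cluster software also handles health detection and state transfer, which are omitted here.

```python
class FailoverCluster:
    """Hypothetical cluster in which each server has a designated standby."""

    def __init__(self, standby_for):
        # Maps each server name to the name of its standby server.
        self._standby_for = dict(standby_for)
        self._failed = set()

    def fail(self, server):
        """Record that a server has become unavailable."""
        self._failed.add(server)

    def active_server(self, server):
        """Return the server currently doing the work assigned to `server`."""
        visited = set()
        current = server
        while current in self._failed:
            if current in visited:
                # Every server in the standby chain has failed.
                raise RuntimeError("no healthy server available")
            visited.add(current)
            current = self._standby_for[current]
        return current


cluster = FailoverCluster({"app1": "app2", "app2": "app1"})
before_failure = cluster.active_server("app1")
cluster.fail("app1")
after_failure = cluster.active_server("app1")
```

Before the failure, work assigned to app1 runs on app1; afterwards the same lookup returns app2. If both servers fail, the lookup raises an error rather than looping.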
In the impersonation/delegation authorization model, resources and the types of operation (such as read, write, and delete) permitted for each one are secured using Windows Access Control Lists (ACLs) or the equivalent security features of the targeted resource
(such as tables and procedures in SQL Server). Users access the resources using their original identity through impersonation, as illustrated in the following figure.
In the trusted subsystem (or trusted server) model, users are partitioned into application-defined, logical roles. Members of a particular role share the same privileges within the application. With this role-based (or operations-based) approach to security,
access to operations (typically expressed by method calls), not back-end resources, is authorized based on the role membership of the caller. Roles, analyzed and defined at application design time, are used as logical containers that group together users
who share the same security privileges or capabilities within the application. The middle tier service uses a fixed identity to access downstream services and resources, as illustrated in the following figure.
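The trusted subsystem model can be sketched as a role check in the middle tier followed by a downstream call under one fixed identity. This is a minimal illustration; the role names, users, the svc-orders identity, and the Database class are all hypothetical and stand in for whatever role store and back-end resource an application actually uses.

```python
SERVICE_IDENTITY = "svc-orders"  # hypothetical fixed identity of the middle tier

# Hypothetical roles, defined at application design time.
ROLE_MEMBERS = {"Manager": {"alice"}, "Clerk": {"bob"}}
OPERATION_ROLES = {
    "approve_order": {"Manager"},
    "view_order": {"Manager", "Clerk"},
}


class Database:
    """Stand-in for a downstream resource that records who called it."""

    def __init__(self):
        self.callers = []

    def execute(self, identity, statement):
        self.callers.append(identity)
        return f"{statement} as {identity}"


def roles_of(user):
    """Return the set of roles the user belongs to."""
    return {role for role, members in ROLE_MEMBERS.items() if user in members}


def perform(user, operation, database):
    # Authorize the operation, not the back-end resource, by role membership.
    if not roles_of(user) & OPERATION_ROLES[operation]:
        raise PermissionError(f"{user} may not call {operation}")
    # The downstream call always uses the fixed service identity.
    return database.execute(SERVICE_IDENTITY, operation)


db = Database()
result = perform("alice", "approve_order", db)
```

The database only ever sees the service identity, never the original caller. That keeps back-end permissions simple, but it also means any per-user auditing has to happen in the middle tier.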
Multiple Trusted Service Identities
In some situations, you may require more than one trusted identity. For example, you may have two groups of users, one who should be authorized to perform read/write operations and the other read-only operations. The use of multiple trusted service identities
provides the ability to exert more granular control over resource access and auditing, without having a large impact on scalability. The following figure illustrates the multiple trusted service identities model.
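A minimal sketch of the multiple trusted service identities model follows, using the read/write and read-only groups from the example above. The role names, identity names, and permission sets are hypothetical; the point is only that the middle tier selects one of a small number of service identities based on the caller's role.

```python
# Hypothetical mapping from application role to trusted service identity.
ROLE_TO_SERVICE_IDENTITY = {
    "Editor": "svc-readwrite",
    "Viewer": "svc-readonly",
}

# Hypothetical permissions granted to each identity on the back-end resource.
IDENTITY_PERMISSIONS = {
    "svc-readwrite": {"read", "write"},
    "svc-readonly": {"read"},
}


def service_identity_for(role):
    """Pick the trusted identity the middle tier uses for this role."""
    return ROLE_TO_SERVICE_IDENTITY[role]


def access(role, action, audit_log):
    """Attempt an action downstream and record which identity was used."""
    identity = service_identity_for(role)
    allowed = action in IDENTITY_PERMISSIONS[identity]
    audit_log.append((identity, action, allowed))
    return allowed


audit_log = []
editor_can_write = access("Editor", "write", audit_log)
viewer_can_write = access("Viewer", "write", audit_log)
viewer_can_read = access("Viewer", "read", audit_log)
```

Because only two identities reach the resource, the back end can still treat them as a small, fixed set of callers, while the audit log distinguishes read/write activity from read-only activity.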
Network Infrastructure Security Considerations
Make sure you understand the network structure provided by your target environment, and understand the baseline security requirements of the network in terms of filtering rules, port restrictions, supported protocols, and so on. Recommendations for maximizing
network security include:
- Identify how firewalls and firewall policies are likely to affect your application's design and deployment. Firewalls should be used to separate the Internet-facing applications from the internal network, and to protect the database servers. These can
limit the available communication ports and, therefore, authentication options from the Web server to remote application and database servers. For example, Windows authentication requires additional ports.
- Consider what protocols, ports, and services are allowed to access internal resources from the Web servers in the perimeter network or from rich client applications. Identify the protocols and ports that the application design requires and analyze the potential
threats that occur from opening new ports or using new protocols.
- Communicate and record any assumptions made about network and application layer security, and what security functions each component will handle. This prevents security controls from being missed when both development and network teams assume that the other
team is addressing the issue.
- Pay attention to the security defenses that your application relies upon the network to provide, and ensure that these defenses are in place.
- Consider the implications of a change in network configuration, and how this will affect security.
The choices you make when deploying an application affect the capabilities for managing and monitoring the application. You should take into account the following recommendations:
- Deploy components of the application that are used by multiple consumers in a single central location to avoid duplication.
- Ensure that data is stored in a location where backup and restore facilities can access it.
- Components that rely on existing software or hardware (such as a proprietary network that can only be established from a particular computer) must be physically located on the same computer.
- Some libraries and adaptors cannot be deployed freely without incurring extra cost, or may be charged on a per-CPU basis, and therefore you should centralize these features.
- Groups within an organization may own a particular service, component, or application that they need to manage locally.
- Monitoring tools such as System Center Operations Manager require access to physical machines to obtain management information, and this may impact deployment options.
- The use of management and monitoring technologies such as Windows Management Instrumentation (WMI) may impact deployment options.
Pattern map (table fragment): Three-Layered Services Application; Performance & Reliability; Federated Authentication (SSO); Impersonation and Delegation.
- Adapter – An object that supports a common interface and translates operations between the common interface and other objects that implement similar functionality with different interfaces.
- Brokered Authentication – Authenticate against a broker, which provides a token to use for authentication when accessing services or systems.
- Direct Authentication – Authenticate directly against the service or system that is being accessed.
- Layered Application – An architectural pattern where a system is organized into layers.
- Load-Balanced Cluster – A distribution pattern where multiple servers are configured to share the workload. Load balancing improves performance by spreading the work across multiple servers, and improves reliability because when one server
fails the others continue to handle the workload.
- Provider – Implement a component that exposes an API that is different from the client API to allow any custom implementation to be seamlessly plugged in. Many applications that provide instrumentation expose providers that can be used to capture
information about the state and health of your application and the system hosting the application.
- Tiered Distribution – An architectural pattern where the layers of a design can be distributed across physical boundaries.
- Trusted Sub-System – The application acts as a trusted subsystem to access additional resources. It uses its own credentials instead of the user's credentials to access the resource.
patterns & practices Solution Assets
- Enterprise Library provides a series of application blocks that simplify common tasks such as caching, exception handling, validation, logging, cryptography, credential management, and facilities for implementing design patterns such as Inversion
of Control and Dependency Injection. For more information, see
- Unity Application Block is a lightweight, extensible dependency injection container that helps you to build loosely coupled applications. For more information, see