Content Delivery Networks (CDNs) emerged around 1998 as a technique for improving Web performance by replicating Web content over several surrogate servers (mirrored servers) strategically placed at various locations to deal with flash crowds. A set of surrogate servers distributed around the world caches the origin server's content, while routers and network elements steer content requests to the optimal location and the optimal surrogate server. Under a CDN, the client-server communication is replaced by two communication flows: one between the client and the surrogate server, and another between the surrogate and the origin server. The key advantages of Content Delivery Networks are that they improve the quality, speed, and reliability of content delivery, reduce the load on the origin server, and bypass traffic jams over the Web.
INTRODUCTION TO CDN
Content Delivery Networks (CDNs) emerged around 1998 as a technique for improving Web performance by replicating Web content over several surrogate servers (mirrored servers) strategically placed at various locations to deal with flash crowds. A set of surrogate servers distributed around the world caches the origin server's content, and routers and other network elements deliver content requests to the optimal location and the optimal surrogate server. Under a CDN, the client-server communication is replaced by two communication flows: one between the client and the surrogate server, and another between the surrogate and the origin server. The figure below shows an overview of a Content Delivery Network. Akamai, Limelight, and Mirror Image are some examples of Content Delivery Networks: Akamai is the CDN of the website http://www.discovery.com, Limelight specializes in live delivery of video, audio, and games, and Mirror Image focuses on online content and application delivery.
The important benefits of Content Delivery Networks are:
• Improved content delivery quality, speed, and reliability.
• Reduced load on the origin server.
• Bypassing of traffic jams over the Web.
ORGANIZATION OF CDN
Two approaches are used for building a CDN:
1. Overlay approach
2. Network approach
In the overlay approach, application-specific servers and caches at several places in the network handle the distribution of specific content types. Apart from providing basic network connectivity and guaranteed QoS for specific requests/traffic, the core network components such as routers and switches play no active role in content delivery. Most commercial CDN providers, such as Akamai and Limelight Networks, follow the overlay approach for CDN organization. These CDN providers replicate content to cache servers worldwide. When content requests are received from end users, they are redirected to the nearest CDN server, thus improving Web site response time. As the CDN providers need not control the underlying network elements, management is simplified in an overlay approach, and it opens opportunities for new services.
In the network approach, the network components, including routers and switches, are equipped with code for identifying specific application types and for forwarding requests based on predefined policies. Examples of this approach include devices that redirect content requests to local caches or switch traffic to specific servers optimized to serve specific content types. Some CDNs use both the network and overlay approaches for CDN organization. In such cases, a network element can act at the front end of a server farm and redirect each content request to a nearby application-specific surrogate server.
ISSUES IN CONTENT DELIVERY NETWORKS
The major issues related to CDNs are:
1. Surrogate placement
2. Content selection and delivery
3. Content outsourcing
They are explained below.
Surrogate placement
Choosing the best location for each surrogate server is important for any CDN infrastructure. Determining the best network locations for CDN surrogate servers is known as the Web server replica placement problem. Mainly three approaches are used for selecting the locations of CDN surrogate servers.
Theoretical approaches: Theoretical approaches such as the minimum k-center problem and k-Hierarchically well-Separated Trees (k-HST) model the server placement problem as the center placement problem, which is defined as follows: for the placement of a given number of centers, minimize the maximum distance between a node and the nearest center. The k-HST algorithm solves the server placement problem according to graph theory. In this approach, the network is represented as a graph G(V,E), where V is the set of nodes and E ⊆ V × V is the set of links. The algorithm consists of two phases. In the first phase, a node is arbitrarily selected from the complete graph (the parent partition), and all the nodes within a random radius from this node form a new partition (a child partition). The radius of the child partition is a factor of k smaller than the diameter of the parent partition. This process continues until each node is in a partition of its own. Thus the graph is recursively partitioned, and a tree of partitions is obtained, with the root node being the entire network and the leaf nodes being individual nodes in the network. In the second phase, a virtual node is assigned to each of the partitions at each level. Each virtual node in a parent partition becomes the parent of the virtual nodes in the child partitions, and together the virtual nodes form a tree. Afterwards, a greedy strategy is applied to find the number of centers needed for the resulting k-HST tree when the maximum center-node distance is bounded by D. The minimum k-center problem can be described as follows: (1) Given a graph G(V,E) with all its edges arranged in non-decreasing order of edge cost c, i.e. c(e1) ≤ c(e2) ≤ … ≤ c(em), construct a set of square graphs G1², G2², …, Gm². The square graph of G, denoted by G², is the graph containing the nodes V and an edge (u,v) wherever there is a path of at most two hops between u and v in G. (2) Compute the maximal independent set Mi for each Gi². An independent set of G² is a set of nodes in G that are at least three hops apart in G, and a maximal independent set M is defined as an independent set V′ such that all nodes in V − V′ are at most one hop away from nodes in V′. (3) Find the smallest i such that |Mi| ≤ K, and call it j. Finally, Mj is the set of K centers.
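Because the minimum k-center problem is NP-hard, practical placement tools often fall back on a greedy approximation instead of the exact formulation above. The sketch below is illustrative only (the four-node distance matrix is invented): it repeatedly adds the node currently farthest from all chosen centers, a classic 2-approximation of the optimal maximum center-node distance.

```python
# Greedy 2-approximation for the minimum k-center problem.
# dist is a symmetric matrix of pairwise path costs between candidate nodes.

def greedy_k_center(dist, k):
    n = len(dist)
    centers = [0]                      # start from an arbitrary node
    # d[v] = distance from v to its nearest chosen center
    d = list(dist[0])
    while len(centers) < k:
        farthest = max(range(n), key=lambda v: d[v])
        centers.append(farthest)
        d = [min(d[v], dist[farthest][v]) for v in range(n)]
    return centers, max(d)             # centers and the max center-node distance

# Toy 4-node network (hypothetical hop costs):
dist = [[0, 2, 5, 7],
        [2, 0, 4, 6],
        [5, 4, 0, 3],
        [7, 6, 3, 0]]
print(greedy_k_center(dist, k=2))      # ([0, 3], 3)
```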
Hot spot: The hot spot algorithm places replicas near the clients generating the greatest load. It sorts the N potential sites according to the amount of traffic generated in their surroundings and places replicas at the top M sites that generate the maximum traffic.
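As a rough illustration (with invented site names and request counts), the hot-spot heuristic reduces to ranking candidate sites by observed demand and keeping the top M:

```python
# Hot-spot placement: rank candidate sites by the client traffic
# observed around them and replicate at the busiest M sites.

def hot_spot_placement(traffic_by_site, m):
    ranked = sorted(traffic_by_site, key=traffic_by_site.get, reverse=True)
    return ranked[:m]

# Hypothetical request counts per candidate site:
traffic = {"london": 90_000, "tokyo": 120_000, "dallas": 40_000, "mumbai": 75_000}
print(hot_spot_placement(traffic, m=2))   # ['tokyo', 'london']
```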
Topology-informed placement: In this strategy, servers are placed on candidate hosts in descending order of out-degree (the number of other nodes connected to a node). The assumption here is that nodes with higher out-degree can reach more nodes with lower latency. For surrogate server placement, CDN administrators also determine the optimal number of surrogate servers using the single-ISP or multi-ISP approach. In the single-ISP approach, a CDN provider typically deploys at least 40 surrogate servers around the network edge to support content delivery. The policy in a single-ISP approach is to put one or two surrogates in each major city within the ISP's coverage. The ISP equips the surrogates with large caches. An ISP with a global network can thus have extensive geographical coverage without relying on other ISPs. The drawback of this approach is that the surrogates may be placed far from the clients of the CDN provider. In the multi-ISP approach, the CDN provider places numerous surrogate servers at as many global ISP Points of Presence (POPs) as possible. It overcomes the problems of the single-ISP approach: surrogates are placed close to the users, and thus content is delivered reliably and timely from the requesting client's ISP. Large CDN providers such as Akamai have more than 25000 servers. Apart from the cost and complexity of setup, the main disadvantage of the multi-ISP approach is that each surrogate server receives fewer (or no) content requests, which may result in idle resources and poor CDN performance. Estimation of the performance of these two approaches shows that the single-ISP approach works better for sites with low-to-medium traffic volumes, while the multi-ISP approach is better for high-traffic sites.
Content selection and delivery
The efficiency of content delivery lies in the right selection of the content to be delivered to the end users. An appropriate content selection approach can assist in the reduction of client download time and server load. Content can be delivered to the customers in two ways:
1. Full-site content selection and delivery
2. Partial-site content selection and delivery
Full-site content selection and delivery: In this approach, surrogate servers perform entire replication in order to deliver the total content of a site to the end users. With this approach, a content provider configures its DNS in such a way that all client requests for the Web site are resolved by a CDN server, which then delivers all of the content. The main advantage of this approach is its simplicity. However, the approach is often not feasible: the sizes of Web objects are increasing day by day, so there is a chance of insufficient storage space on the CDN servers. Another problem is that Web objects are not static, and updating such a huge collection of Web objects is unmanageable.
Partial-site content selection and delivery: In this approach, surrogate servers perform partial replication of the Web objects; only embedded objects such as Web images are delivered from the CDN servers. (An object created with one application and embedded into a document created by another application is called an embedded object.) Thus the base HTML page is retrieved from the origin server, while embedded objects are retrieved from CDN cache servers. Selecting which Web objects to replicate from a large collection is necessary in partial-site content selection and delivery. There are mainly three methods.
Empirical-based: In this approach, the Web site administrator selects the content to be replicated to the CDN servers. Heuristics are used in making such an empirical decision. The main drawback of this approach lies in the uncertainty of choosing the right heuristics.
Popularity-based: In this approach, the most popular objects are replicated to the surrogates. This approach is time consuming, and reliable object request statistics are not guaranteed because the popularity of each object varies considerably. Moreover, such statistics are not available for newly introduced content.
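As a concrete (hypothetical) illustration of popularity-based selection, the sketch below counts requests per object over an observation window and outsources objects until a target fraction of total traffic is covered; the object names and counts are invented:

```python
# Popularity-based content selection: replicate the most requested
# objects first, until a target fraction of total traffic is covered.
from collections import Counter

def select_popular(request_log, coverage=0.8):
    counts = Counter(request_log)
    total = sum(counts.values())
    selected, covered = [], 0
    for obj, hits in counts.most_common():
        selected.append(obj)
        covered += hits
        if covered / total >= coverage:   # enough of the traffic is covered
            break
    return selected

# Hypothetical access log of object URLs:
log = ["/logo.png"] * 60 + ["/intro.mp4"] * 25 + ["/faq.html"] * 10 + ["/old.pdf"] * 5
print(select_popular(log))                # ['/logo.png', '/intro.mp4']
```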
Cluster-based: In a cluster-based approach, Web content is grouped based on either correlation or access frequency and is replicated in units of content clusters.
Content outsourcing
Choosing an efficient content outsourcing technique is as important as the placement of surrogate servers and the selection of content for delivery. Content outsourcing is performed using one of the following approaches.
Cooperative push-based: This approach depends on the pre-fetching of content to the surrogates. Content is pushed to the surrogate servers from the origin, and the surrogate servers cooperate to reduce replication and update costs. In this scheme, the CDN maintains a mapping between content and surrogate servers; each request is directed to the closest surrogate server, and if that server cannot handle the request, the request is directed to the origin server. In this approach, a heuristic algorithm is suitable for making replication decisions among cooperating servers.
Non-cooperative pull-based: In this approach, client requests are directed to their closest surrogate servers. If there is a cache miss, the surrogate servers pull the content from the origin server. Most CDN providers, such as Akamai and Mirror Image, use this approach. The drawback of this approach is that the optimal server is not always chosen to serve a content request. Many CDNs use this approach since the cooperative push-based approach is still at the experimental stage.
Cooperative pull-based: In this approach, the surrogate servers cooperate with each other to get the requested content in the case of a cache miss. In the cooperative pull-based approach, client requests are directed to the closest surrogate server through DNS redirection. Using a distributed index, the surrogate servers find nearby copies of the requested content and store them in their caches. In the context of content outsourcing, it is crucial to determine in which surrogate servers the outsourced content should be replicated. Several works in the literature demonstrate the effectiveness of different replication strategies for outsourced content. Kangasharju et al. have used four heuristics, namely random, popularity, greedy-single, and greedy-global, for the replication of outsourced content. Tse has presented a set of greedy approaches where placement is performed by balancing the loads and sizes of the surrogate servers. Pallis et al. have presented a self-tuning, parameterless algorithm called lat-cdn for optimally placing outsourced content on a CDN's surrogate servers. This algorithm uses an object's latency to make replication decisions. An object's latency is defined as the delay between a request for a Web object and receiving the object in its entirety. An improvement of the lat-cdn algorithm is il2p, which places the outsourced objects on surrogate servers with respect to both the latency and the load of the objects.
Several protocols are used in a CDN for the interaction between network elements:
1. Network Element Control Protocol (NECP)
2. Web Cache Control Protocol (WCCP)
3. Cache Array Routing Protocol (CARP)
4. Internet Cache Protocol (ICP)
5. Hyper Text Caching Protocol (HTCP)
Network Element Control Protocol (NECP)
NECP is a protocol used for signaling between servers and the network elements that forward traffic to them. It is a lightweight protocol (a protocol is called lightweight if it is designed with low complexity in order to reduce overhead, e.g. by using fixed-length headers). The network elements consist of a range of devices, including content-aware switches and load-balancing routers. NECP does not impose any load balancing policy; rather, the protocol provides methods for network elements to learn about server capabilities and availability, and hints as to which flows can and cannot be served, so that the network elements can make load balancing decisions.
NECP uses the Transmission Control Protocol (TCP). When a server is initialized, it establishes a TCP connection to the network element using a well-known port number. Messages can then be sent bi-directionally between the server and the network element. Most messages consist of a request followed by a reply or acknowledgement. Receiving a positive acknowledgement implies the recording of some state in a peer system, which can be assumed to remain in that peer until it expires or the peer crashes. Application-level KEEPALIVE messages are used to detect a crashed peer in such communications. When a node detects that its peer has crashed, it assumes that all the state in that peer needs to be reinstalled after the peer is revived.
Web Cache Control Protocol (WCCP)
The Web Cache Control Protocol (WCCP) specifies the interaction between one or more routers and one or more Web caches. It runs between a router functioning as a redirecting network element and interception proxies. The purpose of this interaction is to establish and maintain the transparent redirection of selected types of traffic flowing through a group of routers. The selected traffic is redirected to a group of Web caches in order to increase resource utilization and to minimize response time. WCCP allows one or more proxies to register with a single router to receive redirected traffic. This traffic includes user requests to view pages and graphics on World Wide Web (WWW) servers, whether internal or external to the network, and the replies to those requests. The protocol allows one of the proxies, the designated proxy, to dictate to the router how redirected traffic is distributed across the caching proxy array. WCCP provides the means to negotiate the specific method used to distribute load among Web caches, as well as methods to transport traffic between router and cache.
Cache Array Routing Protocol (CARP)
The Cache Array Routing Protocol (CARP) is a distributed caching protocol based on a known list of loosely coupled proxy servers and a hash function for dividing the URL space among those proxies. An HTTP client implementing CARP can route requests to any member of the proxy array. The proxy array membership table is defined as a plain ASCII text file retrieved from an Array Configuration URL. The hash function and routing algorithm of CARP take a member proxy defined in the proxy array membership table and make an on-the-fly determination of which proxy array member should be the proper container for a cached version of the resource pointed to by a URL. Since requests are sorted across the proxies, duplication of cache content is eliminated and global cache hit rates are improved.
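CARP's routing idea — combine a hash of the URL with a hash of each proxy's name and pick the highest-scoring proxy — can be sketched as follows. This is a simplified illustration with a generic hash, not CARP's exact membership-table format or hash function:

```python
# Simplified CARP-style request routing: every client computes the same
# deterministic score for (URL, proxy) pairs, so a given URL always maps
# to the same proxy and cache duplication across the array is avoided.
import hashlib

def score(url, proxy):
    h = hashlib.md5((url + proxy).encode()).digest()
    return int.from_bytes(h[:8], "big")          # combined URL+proxy hash

def route(url, proxy_array):
    return max(proxy_array, key=lambda p: score(url, p))

proxies = ["proxy-a.example", "proxy-b.example", "proxy-c.example"]
print(route("http://www.discovery.com/index.html", proxies))
```

A useful property of this highest-score scheme is that adding or removing a proxy only changes the winning proxy for the URLs that previously mapped to it, so most of the cached content stays in place.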
Internet Cache Protocol (ICP)
The Internet Cache Protocol (ICP) is a lightweight message format used for inter-cache communication. Caches exchange ICP queries and replies to gather information to use in selecting the most appropriate location from which to retrieve an object. Besides functioning as an object location protocol, ICP messages can also be used for cache selection. ICP is a widely deployed protocol. Although Web caches use HTTP for the transfer of object data, most caching proxy implementations support ICP in some form. It is used in a caching proxy mesh to locate specific Web objects in neighboring caches: one cache sends an ICP query to its neighbors, and the neighbors respond with an ICP reply indicating a HIT or a MISS. Failure to receive a reply from a neighbor within a short period of time implies that the network path is either congested or broken. Usually, ICP is implemented on top of the User Datagram Protocol (UDP) in order to provide important features to Web caching applications. Since UDP is an unreliable and connectionless transport protocol, an estimate of network congestion and availability can be derived from ICP message loss. This sort of loss measurement, together with the round-trip time, provides a basis for load balancing among caches.
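The query/HIT/MISS exchange over UDP can be illustrated with the toy responder below. This uses a plain-text payload for readability rather than the real binary ICP message format (RFC 2186); the host and cached URL are placeholders, and only the port number (3130, the customary ICP port) matches common practice:

```python
# Toy ICP-style neighbor: answers "HIT <url>" or "MISS <url>" over UDP.
import socket

CACHE = {"http://www.discovery.com/index.html"}   # objects this cache holds

def serve(host="127.0.0.1", port=3130):           # runs forever; Ctrl-C to stop
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        query, peer = sock.recvfrom(4096)          # e.g. b"QUERY <url>"
        url = query.decode().split(" ", 1)[1]
        verdict = "HIT" if url in CACHE else "MISS"
        sock.sendto(f"{verdict} {url}".encode(), peer)

def ask_neighbor(url, host="127.0.0.1", port=3130, timeout=0.05):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)                       # no reply => congested/broken path
    sock.sendto(f"QUERY {url}".encode(), (host, port))
    try:
        reply, _ = sock.recvfrom(4096)
        return reply.decode().startswith("HIT")
    except socket.timeout:
        return False
```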
Hyper Text Caching Protocol (HTCP)
The Hyper Text Caching Protocol (HTCP) is a protocol for discovering HTTP caches and cached data, managing sets of HTTP caches, and monitoring cache activity. HTCP is compatible with HTTP 1.0, in contrast with ICP, which was designed for HTTP 0.9. HTCP also expands the domain of cache management to include monitoring a remote cache's additions and deletions, requesting immediate deletions, and sending hints about Web objects, such as the third-party locations of cacheable objects or the measured uncacheability or unavailability of Web objects. HTCP messages may be sent over UDP or TCP. HTCP agents must not be isolated from network failures and delays; an HTCP agent should be prepared to act in useful ways in the absence of a response or in the case of lost or damaged responses.
CONTENT AND SERVICE TYPES
CDN providers host third-party content for fast delivery of any digital content, including static content, dynamic content, streaming media (e.g. audio, real-time video), and different content services (e.g. directory services, e-commerce services, and file transfer services). The sources of content are large enterprises, Web service providers, media companies, and news broadcasters. The variation in the content and services delivered requires a CDN to adopt application-specific characteristics, architectures, and technologies. For this reason, some CDNs are dedicated to delivering particular content and/or services. Here, we analyze the characteristics of the content/service types to reveal their nature.
Static content refers to content for which the frequency of change is low. It does not change depending on user requests. It includes static HTML pages, embedded images, executables, PDF documents, software patches, and audio and/or video files. All CDN providers support this type of content delivery. This type of content can be cached easily, and its freshness can be maintained using traditional caching technologies.
Dynamic content refers to content that is personalized for the user or created on demand by the execution of some application process. It changes frequently depending on user requests. It includes animations, scripts, and DHTML. Due to its frequently changing nature, dynamic content is usually considered uncachable.
Streaming media can be live or on-demand. Live media delivery is used for live events such as sports, concerts, channel broadcasts, and/or news broadcasts. In this case, content is delivered instantly from the encoder to the media server, and then on to the media client. In the case of on-demand delivery, the content is encoded and then stored as streaming media files on the media servers. The content is made available upon request from the media clients. On-demand media content can include audio and/or video on demand, movie files, and music clips. Streaming servers are equipped with specialized protocols for the delivery of content across the IP network.
A CDN can offer its network resources to be used as a service distribution channel and thus allow value-added service providers to offer their applications as an Internet infrastructure service. When the edge servers host the software of value-added services for content delivery, they may behave like transcoding proxy servers, remote callout servers, or surrogate servers. These servers also provide the capability for processing and special hosting of value-added Internet infrastructure services. Services provided by CDNs can be directory, Web storage, file transfer, and e-commerce services. Directory services are provided by the CDN for accessing database servers: user queries for certain data are directed to the database servers, and the results of frequent queries are cached at the edge servers of the CDN. The Web storage service provided by a CDN is meant for storing content at the edge servers and is essentially based on the same techniques used for static content delivery. File transfer services facilitate the worldwide distribution of software, virus definitions, movies on demand, and highly detailed medical images. All of this content is static by nature. Web services technologies are adopted by a CDN for their maintenance and delivery. E-commerce is highly popular for business transactions over the Web. Shopping carts for e-commerce services can be stored and maintained at the edge servers of the CDN, and online transactions (e.g. third-party verification, credit card transactions) can be performed at the edge of the CDN. To facilitate this service, CDN edge servers should be enabled with dynamic content caching for e-commerce sites.
CACHE ORGANIZATION AND MANAGEMENT
Content management is essential for CDN performance, which mainly depends on the cache organization followed by the CDN. Cache organization includes the caching technique used and the frequency of cache updates to ensure the freshness, availability, and reliability of the content.
Caching Techniques
Four caching techniques are used in CDNs.
Query-based: In the query-based scheme, when a cache miss occurs, the CDN server broadcasts a query to the other cooperating CDN servers. The problems with this scheme are the significant query traffic and the delay incurred because a CDN server has to wait for the last miss reply from all the cooperating surrogates before concluding that none of its peers has the requested content. Because of these drawbacks, the query-based scheme suffers from implementation overhead.
Digest-based: In the digest-based scheme, each CDN server maintains a digest of the content held by the other cooperating surrogates. The cooperating surrogates are informed about any update of the content by the updating CDN server. By checking the content digest, a CDN server can decide to route a content request to a particular surrogate. The main drawback is the update traffic overhead, because frequent exchanges of updates are needed to make sure that the cooperating surrogates have correct information about each other.
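Content digests of this kind are commonly implemented as Bloom filters (as in Squid's Cache Digests), which summarize a cache's contents compactly at the cost of occasional false positives. A minimal sketch, with made-up sizing parameters:

```python
# Minimal Bloom-filter content digest: each surrogate publishes this
# summary so peers can check "might hold URL" without a query round-trip.
import hashlib

class Digest:
    def __init__(self, bits=8192, hashes=4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, url):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{url}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.bits

    def add(self, url):
        for pos in self._positions(url):
            self.array[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url):       # False => definitely not cached
        return all(self.array[p // 8] & (1 << (p % 8)) for p in self._positions(url))

d = Digest()
d.add("http://www.discovery.com/index.html")
print(d.might_contain("http://www.discovery.com/index.html"))  # True
print(d.might_contain("http://www.discovery.com/other.html"))  # almost surely False
```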
Directory-based: The directory-based scheme is a centralized version of the digest-based scheme. In the directory-based scheme, a centralized server keeps content information for all the cooperating surrogates inside a cluster. Each CDN server notifies the directory server only when local updates occur, and it queries the directory server whenever there is a local cache miss. The main drawbacks of this scheme are that, due to the centralized approach, overall cache management becomes impossible if the directory server fails, and that the single directory server receives update and query traffic from all cooperating surrogates.
Hashing-based: In the hashing-based scheme, the cooperating CDN servers maintain the same hashing function. A designated CDN server holds a piece of content based on the content's URL, the IP addresses of the CDN servers, and the hashing function. All requests for that particular content are directed to that designated server. The hashing-based scheme is more efficient than the other schemes since it has the smallest implementation overhead and the highest content-sharing efficiency.
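One common way to realize such a shared hashing function is consistent hashing, which keeps most URL-to-server assignments stable when servers are added or removed. A sketch under that assumption (server names invented; a production version would add virtual nodes for better balance):

```python
# Consistent hashing: servers and URLs are hashed onto the same ring;
# a URL is owned by the first server clockwise from its hash value.
import bisect, hashlib

def h(key):
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, servers):
        self.ring = sorted((h(s), s) for s in servers)

    def lookup(self, url):
        keys = [k for k, _ in self.ring]
        idx = bisect.bisect(keys, h(url)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["cdn1.example", "cdn2.example", "cdn3.example"])
print(ring.lookup("http://www.discovery.com/index.html"))  # designated server
```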
Cache Updating Methods
To ensure the consistency and freshness of the content at the replicas, CDNs deploy different cache update techniques. The common methods used for updating caches are described below.
Periodic update: In the periodic update method, caches are updated in a regular fashion, i.e. at each interval the origin server updates the caches of the surrogate servers. This approach suffers from significant levels of unnecessary traffic generated by the updates at each interval.
Update propagation: Update propagation is triggered by a change in the content. It performs active content pushing to the cache servers: an updated version of a document is delivered to all caches whenever a change is made to the document at the origin server. For frequently changing content, this approach generates excessive update traffic.
On-demand update: On-demand update is a cache update mechanism where the latest copy of a document is propagated to the surrogate cache server based on a prior request for that content. This approach follows an assume-nothing structure, and content is not updated unless it is requested. The disadvantage of this approach is the back-and-forth traffic between the cache and the origin server needed to ensure that the delivered content is the latest.
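The back-and-forth freshness check mentioned above is typically an HTTP conditional GET: the cache revalidates its copy and only downloads the body when the origin reports a change. A minimal sketch (the origin URL in the usage comment is a placeholder):

```python
# On-demand revalidation with an HTTP conditional GET: a 304 reply means
# the cached copy is still fresh; a 200 reply carries the updated object.
import urllib.request
from urllib.error import HTTPError

def revalidate(url, cached_body, cached_last_modified):
    req = urllib.request.Request(url)
    req.add_header("If-Modified-Since", cached_last_modified)
    try:
        with urllib.request.urlopen(req) as resp:          # 200: object changed
            return resp.read(), resp.headers.get("Last-Modified")
    except HTTPError as err:
        if err.code == 304:                                 # 304: copy still fresh
            return cached_body, cached_last_modified
        raise

# body, stamp = revalidate("http://origin.example/page.html",
#                          body, "Mon, 01 Jan 2024 00:00:00 GMT")
```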
Invalidation: In this method, an invalidation message is sent to all surrogate caches when a document is changed at the origin server. The surrogate caches are blocked from accessing the document while it is being changed, and each cache needs to fetch an updated version of the document individually later. The drawback of this approach is that the delayed fetching of content by the caches may make consistency management among cached contents inefficient.
PERFORMANCE MEASUREMENT
Performance measurement of a CDN is done to measure its ability to serve customers with the desired content and/or service. Typically, content providers use five key metrics to evaluate the performance of a CDN. These are:
Cache hit ratio: It is defined as the ratio of the number of cached documents served versus the total number of documents requested. For example, if 850 of 1000 requested documents are served from the cache, the cache hit ratio is 0.85. A high hit ratio reflects that the CDN is using an effective caching technique to manage its caches.
Reserved bandwidth: It is the measure of the bandwidth used by the origin server. It is measured in bytes and is retrieved from the origin server.
Latency: It refers to the user-perceived response time. Reduced latency indicates that less bandwidth is reserved by the origin server.
Surrogate server utilization: It refers to the fraction of time during which the surrogate servers remain busy. This metric is used by administrators to calculate CPU load, the number of requests served, and storage usage.
Reliability: Packet-loss measurements are used to determine the reliability of a CDN. High reliability indicates that the CDN incurs less packet loss and is always available to clients.
Performance measurement can be accomplished based on internal measures as well as from the customer perspective. A CDN provider's own performance testing can be misleading, since the CDN may perform well for a particular Web site and/or content but poorly for others. To ensure reliable performance measurement, a CDN's performance can be measured by an independent third party.
CDN servers can be equipped with the ability to collect statistics in order to obtain an end-to-end measurement of their performance. In addition, probes can be deployed throughout the network, and the information they collect can be correlated with the cache and server logs to measure end-to-end performance.
In addition to internal performance measurement, external measurement by an independent third party informs CDN customers about verified and guaranteed performance. This process is effective because independent performance-measuring companies maintain benchmarking networks of strategically located measurement computers connected through major Internet backbones in several cities. These computers measure how a particular Web site performs from the end user's perspective, considering service performance metrics in critical areas.
Network statistics acquisition for performance measurement
For internal or external performance measurement, different network statistics acquisition techniques are deployed based on several parameters. Such techniques may involve network probing, traffic monitoring, and feedback from surrogates. Typical parameters in the network statistics acquisition process include geographical proximity, network proximity, latency, server load, and server performance as a whole.
Network probing is a measurement technique where the possible requesting entities are probed in order to determine one or more metrics from each surrogate or a set of surrogates. Network probing can be used for P2P-based cooperative CDNs where the surrogate servers are not controlled by a single CDN provider. An example of such a probing technique is an ICMP ECHO (ping) request. Probing techniques are sometimes unsuitable and limited for several reasons. Probing introduces additional network latency, which may be significant for small Web requests. Moreover, performing several probes to an entity often triggers intrusion-detection alerts, resulting in abuse complaints. Probing may also yield inaccurate metrics, as ICMP traffic can be ignored or reprioritized due to concerns about Distributed Denial of Service attacks.
Traffic monitoring is a measurement technique where the traffic between the client and the surrogate is monitored to determine the actual performance metrics. Once the client connects, the actual performance of the transfer is measured, and this data is then fed back into the request-routing system. An example of such monitoring is to watch the packet loss from a client to a surrogate, or the user-perceived response time (latency), by observing TCP behaviour. Latency is the simplest and most commonly used distance metric; it can be estimated by monitoring the packets traveling along the route between the client and the surrogate.
Performance measurement through simulation
Besides internal and external performance measurement, researchers use simulation tools to measure a CDN's performance. Some researchers also test their CDN policies on real platforms such as PlanetLab. CDN simulators implemented in software are valuable tools for researchers to develop, test, and diagnose a CDN's performance, since accessing real CDN traces and logs is not easy due to the proprietary nature of commercial CDNs. Such a simulation process is economical because no dedicated hardware is involved in carrying out the experiments. Moreover, it is flexible because it is possible to simulate a link with any bandwidth and propagation delay and a router with any queue size and queue management technique. A simulated network environment is free of uncontrollable factors, such as unwanted external traffic, which researchers may experience while running experiments in a real network. Hence, simulation results are reproducible and easy to analyze. A wide range of network simulators is available for simulating a CDN to measure its performance. Moreover, there are also some specific CDN simulation systems that allow a (closely) realistic approach for the research community and CDN developers to measure performance and experiment with their policies. However, the results obtained from a simulation may be misleading if the CDN simulation system does not take into account several critical factors of the underlying TCP/IP network infrastructure, such as the bottlenecks that are likely to occur in a network and the number of traversed nodes.
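As a small illustration of the latency-oriented statistics acquisition described above, the sketch below estimates proximity by timing a TCP connection to each surrogate and picking the fastest one. The hostnames are placeholders, and a real system would use repeated probes and smoothing rather than a single handshake:

```python
# Rough latency probe: time a TCP handshake to each surrogate and
# feed the results into the request-routing decision.
import socket, time

def probe_rtt(host, port=80, timeout=1.0):
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start          # seconds for the handshake
    except OSError:
        return float("inf")                          # unreachable / congested

def closest_surrogate(hosts):
    return min(hosts, key=probe_rtt)

surrogates = ["edge1.example.net", "edge2.example.net", "edge3.example.net"]
# print(closest_surrogate(surrogates))
```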
REQUEST ROUTING IN CDN
A request-routing system is responsible for routing client requests to an appropriate surrogate server for the delivery of content. It consists of a collection of network elements that support request-routing for a single CDN. It directs client requests to the replica server closest to the client. However, the closest server may not be the best surrogate server for servicing a particular request, so a request-routing system uses a set of metrics, such as network proximity, client-perceived latency, distance, and replica server load, in an attempt to direct users to the closest surrogate that can best serve the request. The content selection and delivery technique (i.e. full-site or partial-site) used by a CDN has a direct impact on the design of its request-routing system. If the full-site approach is used, the request-routing system directs all client requests to the surrogate servers, since they hold all the outsourced content. On the other hand, if the partial-site approach is used, the request-routing system is designed so that, on receiving a client request, the origin server delivers the basic content while the surrogate servers deliver the embedded objects. The request-routing system in a CDN has two parts: the deployment of a request-routing algorithm and the use of a request-routing mechanism. A request-routing algorithm is invoked on receiving a client request; it specifies how to select an edge server in response to the given request. A request-routing mechanism, on the other hand, is a way to inform the client about the selection: it first invokes a request-routing algorithm and then informs the client about the selection result it obtains.
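A request-routing algorithm of the kind described above can be sketched as a weighted score over the metrics the system tracks. The surrogate data and weights below are invented for illustration; a deployed system would feed the function live measurements:

```python
# Toy request-routing algorithm: pick the surrogate with the best
# combined score of client-perceived latency and current server load.

def select_surrogate(surrogates, latency_weight=0.7, load_weight=0.3):
    def score(s):
        return latency_weight * s["latency_ms"] + load_weight * s["load_pct"]
    return min(surrogates, key=score)      # lower score = better choice

candidates = [
    {"name": "edge-eu", "latency_ms": 40, "load_pct": 85},
    {"name": "edge-us", "latency_ms": 95, "load_pct": 10},
    {"name": "edge-ap", "latency_ms": 60, "load_pct": 30},
]
print(select_surrogate(candidates)["name"])   # 'edge-ap'
```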
CONCLUSION
Content Delivery Networks are still at an early stage of development, and their future evolution remains an open issue. It is essential to understand the existing practices in the CDN framework in order to propose or predict the evolutionary steps. The challenge is to strike a delicate balance between costs and customer satisfaction. In this framework, caching-related practices, content personalization processes, and data mining techniques seem to offer an effective roadmap for the further evolution of CDNs.
REFERENCES
1. Pallis, G. and Vakali, A.: Insight and Perspectives for Content Delivery Networks. Communications of the ACM 49(1) (January 2006) 101-106.
2. Buyya, R., Pathan, M. and Vakali, A. (eds.): Content Delivery Networks. Springer (2008).
3. Held, G.: A Practical Guide to Content Delivery Networks.