WO2012152824A1

WO2012152824A1 - Method for managing the infrastructure of a content distribution network service in an isp network and such an infrastructure

Info

Publication number: WO2012152824A1
Application number: PCT/EP2012/058527
Authority: WO
Inventors: Eguzki ASTIZ LEZAUN; Mattias Barthel; Parminder Chhabra; Armando Antonio GARCÍA MENDOZA; Nikolaos Laoutaris; Arcadio PANDO CAO; Pablo Rodriguez Rodriguez; Alvaro SAURIN PARRA; Xiaoyuan Yang
Original assignee: Telefonica, S.A
Priority date: 2011-05-12
Filing date: 2012-05-09
Publication date: 2012-11-15
Also published as: ES2410654B1; ES2410654R1; CL2013003223A1; BR112013028989A2; AR086342A1; ES2410654A2; EP2707997A1

Abstract

Method for managing the infrastructure of a Content Distribution Network service in an ISP network and such an Infrastructure. The method comprises using a CDN manager to administer one or more regions in an operating business, OB, by means of one or more Application Programming Interfaces, APIs, hosted by the CDN manager of the content delivery network. The infrastructure comprises a CDN manager hosting one or more APIs for implementing the method.

Description

Method for managing the infrastructure of a Content Distribution Network service in an ISP network and such an Infrastructure

Field of the art

The present invention generally relates, in a first aspect, to a method for managing the infrastructure of a Content Distribution Network (CDN) service in an ISP network, and more particularly to a method further comprising using a CDN manager to administer one or more regions in an operating business.

A second aspect of the invention relates to an infrastructure of a CDN service in an ISP network comprising a CDN manager implementing the method of the first aspect.

Prior State of the Art

A Content Delivery Network (CDN) is a network of computers that contain copies of data, placed at various points in the network to speed up access to data for clients throughout the network [1]. The key idea here is to ensure that end users do not all go to the central server to access data. Rather, the end user accesses a copy of the data close by, so as to avoid causing network bandwidth and computation bottlenecks at the central server.

As the demand for multimedia content grows on the Internet, so does the need to keep bandwidth costs low and reduce upgrade cycles of backbone links. CDNs have taken off in a big way in the last few years. CDNs can dynamically distribute content to different points of the edge network and manage network load and optimize capacity per customer. CDNs can offer content availability in the face of power, network and hardware outages.

Content delivery networks have been around for over a decade. There are several architectures for building and operating such networks. Akamai for instance, has several tens of thousands of servers spread across many remote PoPs. Akamai uses a hierarchy of DNS servers to identify the content servers that can best serve the requested content.

Web caches are also used to store the most popular content on the servers that receive the largest number of client requests. Web caches reduce server load and improve the end user response time for content that is stored in the cache.

Under Amazon Cloudfront, the customer first creates an S3 bucket that will serve as the origin server for the content. The customer then places the content in the S3 bucket and makes the content publicly readable. As a next step, the CDN customer creates a Cloudfront distribution and the infrastructure generates a unique distribution ID and a Cloudfront domain name. The CDN customer can then embed the URLs in their website so that when an end user requests content, Cloudfront will receive the appropriate URL that will be used to serve the content.

Cloudfront offers HTTP for serving basic files and RTMP streaming of media files. Cloudfront also allows a CDN customer to use their custom domain names and use of origin servers located at the customer's premises.

Limelight uses a small number of large datacenters connected by a high speed, high capacity optical network. These datacenters are strategically located to connect with access networks at multiple locations worldwide. Limelight uses a DNS redirection mechanism that maps requests to an end point in one of its datacenters. For instance, a DNS query of www.shrek3.com is first resolved to drmwrks.vo.llnwd.net via the public DNS infrastructure [2]. This is then resolved to a specific end point IP address in a datacenter by Limelight's private DNS infrastructure.

Limelight also allows customers to directly use Limelight hostnames on their websites. For example, customers downloading movies from Amazon Unbox may find that their request is answered by Amazon-xxx.vo.llnwd.net (where xxx is an integer) [2].

Limelight and Amazon Cloudfront have a small number of large datacenters that connect via a global network. Highwinds Network's architecture has many small, globally distributed datacenters connected via a global network. In all these cases, a server in the datacenter (of the hosting CDN network) closest to the requesting client is identified to deliver the content. However, building such large and overloaded datacenters involves significant capital expenditure and yet does not fully utilize the flexibility offered by existing network infrastructure.

There are several problems with existing solutions.

(1) DNS and load balancing are coupled. This makes it difficult to distinguish between an overloaded content server and an overloaded DNS server.

(2) Several of the above datacenter solutions are expensive and require significant capital expenditure to build and deploy.

(3) Some of these solutions require specialized hardware, in the form of load balancers and high-end servers, to deliver content.

(4) Some of the solutions are pure P2P based. Here, every end user is treated as a peer to distribute content. End user resources are required to distribute content. This implies additional requirements for security and authentication when distributing content.

(5) P2P-based solutions attempt to deliver content reliably using unreliable bandwidth at end users, who have different resource constraints depending on the applications they run on their machines. Reliable content delivery is impossible on a large scale in such a scenario.

(6) Only bittorrent-like protocols use the latest topology information when finding network links that are least loaded to distribute content, but these are pure P2P-based.

(7) It is not possible to deploy the solutions discussed above deep within an ISP topology. Of the above, only Akamai's solution may be deployed deep within an ISP. However, this would require the ISP to expose significant details about their topology.

Description of the Invention

It is necessary to offer an alternative to the state of the art that covers the gaps found therein, particularly those related to the lack of proposals offering solutions which can be deployed as deep within an ISP network as necessary and without exposing the details of the ISP network topology.

To that end, the present invention provides, in a first aspect, a method for managing the infrastructure of a Content Distribution Network service in an ISP network, which, contrary to known proposals, comprises using a CDN manager to administer one or more regions in an operating business, or OB, by means of one or more Application Programming Interfaces, APIs, hosted by the CDN manager.

Other embodiments of the method of the first aspect of the invention, including those relating to several proposed specific APIs, are described according to appended claims 2 to 24, and in a subsequent section related to the detailed description of several embodiments.

A second aspect of the invention relates to an infrastructure of a CDN service in an ISP network that comprises a CDN manager hosting one or more APIs for implementing the method of the first aspect of the invention according to several embodiments.

Next, some definitions are given that are useful for understanding the terminology used for both the prior art disclosures and the present invention.

Point of Presence (PoP): A point-of-presence is an artificial demarcation or interface point between two communication entities. It is an access point to the Internet that houses servers, switches, routers and call aggregators. ISPs typically have multiple PoPs.

ISP DNS Resolver: Residential users connect to an ISP. Any request to resolve an address is sent to a DNS resolver maintained by the ISP. The ISP DNS resolver will send the DNS request to one or more DNS servers within the ISP's administrative domain.

Bucket: A bucket is a logical container for a customer. A bucket either makes a link between origin server URL and CDN URL or it may contain the content itself (that is FTP-ed into the bucket at the entry point). An end point will replicate files from the origin server to files in the bucket. Each file in a bucket may be mapped to exactly one file in the origin server. A bucket has several attributes associated with it. These attributes are of two kinds: (i) file-system based (much like the attributes of a file under Unix) and (ii) for content distribution, e.g. duration that a piece of content is valid, geo-blocking of content, etc. Mechanisms are also in place to ensure that new versions of the content at the origin server get pushed to the bucket at the end points and old versions are removed.

A customer may create as many buckets as she wants. A bucket is really a directory that contains media files. A bucket may contain sub-directories and files within those sub-directories.

Geo-location: It is the identification of real-world geographic location of an Internet connected device. The device may be a computer, mobile device or an appliance that allows for connection to the Internet for an end user. The IP-address geo-location data can include information such as country, region, city, zip code, latitude / longitude of a user.

Consistent Hashing: This method provides hash-table functionality in such a way that adding or removing a slot does not significantly alter the mapping of keys to slots. Consistent hashing is a way of distributing requests among a large and changing population of web servers. The addition or removal of a web server does not significantly alter the load on the other servers.
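By way of illustration only, the consistent hashing defined above may be sketched as follows. The sketch is not taken from the invention: the class name HashRing, the use of MD5 to place nodes on the ring and the number of virtual points per node are assumptions chosen for the example.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping keys (e.g. file names) to nodes."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node (assumed value)
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        # MD5 is used here only to spread values evenly, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        """Place the node's virtual points on the ring."""
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        """Take the node's virtual points off the ring."""
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def lookup(self, key):
        """Return the node owning the first point clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["ep1", "ep2", "ep3"])
owner = ring.lookup("movie.mp4")  # one of "ep1", "ep2", "ep3"
```

In this sketch, removing a node only remaps the keys that the removed node owned; every other key keeps its previous assignment, which is the property the definition relies on.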

MD5: In cryptography, MD5 is a widely used cryptographic hash function with a 128-bit hash value. MD5 is widely used to test the integrity of files. An MD5 hash is typically expressed as a hexadecimal number.

OB (Operating Business): An OB is an arbitrary geographic area in which the TID's CDN is installed. An OB may operate in more than one region. A region is an arbitrary geographic area and may represent a country, or part of a country or even a set of countries. An OB may consist of more than one region. An OB may be composed of one or more ISPs. An OB has exactly one instance of DNS and Topology Server.

PID (Partition ID): This is merely a mapping of IP prefixes at the AS level to integers. This is a one-to-one mapping. This is a very coarse partitioning of IP prefixes.

DNS (Domain Name Service): DNS is a service that translates domain names into IP addresses. DNS resolution is hierarchical. If one DNS server does not know how to translate a domain name, it asks other DNS servers starting with the root DNS server until it finds the DNS server that can resolve the domain name.

CDN (Content Delivery Network): This refers to a system of nodes (or computers) that contain copies of customer content that is stored and placed at various points in a network (or public Internet). When content is replicated at various points in the network, bandwidth is better utilized throughout the network and users have faster access times to content. This way, the origin server that holds the original copy of the content is not a bottleneck.

URL: A Uniform Resource Locator (URL) is the address of a web page on the world-wide web. Each URL is unique: if two URLs are identical, they point to the same resource.

URL (or HTTP) Redirection: URL redirection is also known as URL forwarding. A page may need redirection (1) when its domain name has changed, (2) to create meaningful aliases for long or frequently changing URLs, (3) to correct spelling errors made by the user when typing a domain name, (4) to manipulate visitors, etc. For this purpose, a typical redirection service is one that redirects users to the desired content. A redirection link can be used as a permanent address for content that frequently changes hosts (much like DNS).

Video on Demand (VoD): These are systems that allow users to select and watch video streams. These systems may stream content via a set-top-box, a computer (allowing for content to be viewed in real-time) or allow the file to be downloaded before viewing.

Hypertext Transfer Protocol (HTTP): HTTP is a request-response protocol designed for client-server communication on the web. A web browser, acting as a client, submits HTTP requests to a web site that acts as a server.

Internet Service Provider (ISP): An Internet Service Provider (ISP) provides its customers with access to the Internet. The ISP may use dial-up, DSL, cable modem or high speed interconnects to do this.

Least Recently Used (LRU): It is an example of a replacement algorithm or replacement policy. LRU discards the least recently used items first. Although often used for cache replacement, in this invention it is used to replace files on a disk.
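As a non-limiting illustration, LRU replacement of files on a disk of limited capacity may be sketched as follows. The class LRUFileStore, its method names and the byte-based capacity accounting are assumptions made for the example and are not part of the invention as described.

```python
from collections import OrderedDict


class LRUFileStore:
    """Tracks files on a disk of limited capacity; evicts least recently used."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._files = OrderedDict()  # file name -> size, oldest access first

    def touch(self, name):
        """Record an access; return True if the file is on disk."""
        if name not in self._files:
            return False
        self._files.move_to_end(name)
        return True

    def add(self, name, size):
        """Store a file, evicting least recently used files to make room.

        Returns the list of evicted file names, oldest first.
        """
        if size > self.capacity_bytes:
            raise ValueError("file larger than disk capacity")
        if name in self._files:
            # Replacing an existing file: release its old size first.
            self.used_bytes -= self._files.pop(name)
        evicted = []
        while self.used_bytes + size > self.capacity_bytes:
            old_name, old_size = self._files.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_name)
        self._files[name] = size
        self.used_bytes += size
        return evicted
```

Each access moves a file to the most recently used end of the ordering, so the file evicted first is always the one that has gone longest without being requested.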

Application Programming Interface (API): An application-programming interface (API) is a set of rules and specifications that a software program can follow to access and make use of the services and resources provided by another software program that implements the API. In this document, the term API refers to Web APIs (or web services) based on REST-style communication.

NS and A Records: An A record holds a 32-bit IPv4 address and is used to map a hostname to the IP address of the host. An NS (Name Server) record delegates a region to use a given authoritative name server.

Brief Description of the Drawings

The previous and other advantages and features will be more fully understood from the following detailed description of embodiments, with reference to the attached drawings, which must be considered in an illustrative and non-limiting manner, in which:

Figure 1 shows a typical OB setup with all components of the CDN infrastructure of the second aspect of the invention. Note that the log processing and CDN monitoring infrastructure may be operated on a separate network.

Figure 2 shows a deployment of the service provider's CDN. Each Operating Business (OB) runs a CDN that connects to the Internet backbone, according to an embodiment of the invention.

Figure 3 shows the interaction between CDN customer and the service provider's CDN infrastructure on uploading content, according to the method of the first aspect of the invention.

Figure 4 illustrates an example that shows all the interactions in a service provider's CDN infrastructure. More exactly, this figure shows all the interactions of a CDN customer, end users and interactions between the components of the CDN infrastructure, as per an embodiment of the method and infrastructure of the invention.

Figure 5 shows a DNS resolution in a service provider's CDN infrastructure as per an embodiment of the first aspect of the invention. In all, the DNS resolution entails two DNS queries and one HTTP redirect.

Detailed Description of Several Embodiments

The service provider's CDN infrastructure of the second aspect of the invention comprises several embodiments. An embodiment includes a CDN manager hosting one or more APIs for implementing the method of the first aspect of the invention. Other elements of the CDN infrastructure are: entry point (or publishing point), end point, tracker, origin server, topology server, live splitter, DNS server and log server. In the present invention, the terms content servers and end points will be used without distinction. Also, the terms entry point and publishing point or publishing server will be used without distinction.

For an embodiment, the elements of a CDN service according to the method and deployment scenario of the CDN infrastructure of the invention may be deep within an ISP network. This allows the ISP to host content servers (or end points) at the edge and as close as possible to the end users as needed. For instance, end points could be hosted at same location as the DSLAMs. The goal of a CDN is to move content from origin servers to end points that are close to requesting end users. Thus, content may be served with lowest delay to requesting end users.

The method of the first aspect of the invention further comprises different CDN elements interacting with one another according to the several embodiments of the invention for deployment of a CDN infrastructure.

For an embodiment of the invention, the CDN infrastructure of the second aspect, also described in the method of the first aspect, comprises CDN elements that are part of a typical OB deployment scenario. Other embodiments of the invention are the deployment of CDN elements in a region and a multiple-OB deployment scenario for entities of a CDN service.

Another embodiment of the method of the invention comprises using the topology server to describe the topology at an OB level for creating an abstraction of the topology in a region and/or an abstraction for connecting two regions within the same OB.

For another embodiment, the method comprises providing steps for interaction between different types of CDN elements, the CDN customer (content provider or owner) and end-users (requesting content).

The function of each of the CDN elements is described first. Next, the CDN design is described. A deployment scenario of an OB and how the deployment may be extended to multiple OBs is also detailed. The description includes detailing the actions resulting from a CDN customer uploading content to the service provider's CDN and actions resulting from an end user requesting content hosted by the service provider's CDN.

Description of individual CDN elements

Here, the function of each of the CDN elements is described.

Entry Point (Publishing point): Any CDN service provider's customer may interact with the CDN infrastructure solely via the entry point. The entry point runs a web services interface through which customers create and update their CDN accounts and create, delete and update buckets for their account.

Each bucket has associated with it meta-data of two kinds: Unix-like file-system meta-data and content distribution meta-data. The customer may define bucket-level meta-data when creating a bucket. On creating a bucket (a folder that holds content files), a CDN customer may upload content files together with file-level meta-data.

A customer may create two kinds of buckets: VoD for Video on demand content and Live to stream live content. For VoD content, a CDN customer has two options for uploading data. The customer can either upload files into the bucket or give URLs to the files that point to the CDN customer's website. The data is then downloaded for processing by the CDN publishing point. Subsequently, the downloaded file is moved to another directory for post-processing. For distributing live content, the CDN customer merely gives the URL of the live stream as a meta-data parameter to the live bucket. The CDN infrastructure is responsible for getting the original content stream and distributing it to the CDN end users.
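Purely as an illustrative sketch, the two kinds of buckets and their two levels of meta-data may be modelled as shown below. The Content Management API description later in this document states that file-level meta-data override bucket-level meta-data, and the sketch follows that rule. The class Bucket, its field names and the "source_url" key are hypothetical; only the parameter names enabled and bandwidth appear in the description.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical in-memory model of a VoD or Live bucket."""

    name: str
    kind: str                                     # "vod" or "live"
    metadata: dict = field(default_factory=dict)  # bucket-level meta-data
    files: dict = field(default_factory=dict)     # file name -> file-level meta-data

    def __post_init__(self):
        if self.kind not in ("vod", "live"):
            raise ValueError("kind must be 'vod' or 'live'")
        # A live bucket is given the URL of the live stream as meta-data
        # ("source_url" is an assumed key name).
        if self.kind == "live" and "source_url" not in self.metadata:
            raise ValueError("a live bucket needs the URL of the live stream")

    def add_file(self, file_name, **file_metadata):
        """Register an uploaded file with optional file-level meta-data."""
        if self.kind != "vod":
            raise ValueError("files are uploaded to VoD buckets only")
        self.files[file_name] = dict(file_metadata)

    def effective_metadata(self, file_name):
        """Bucket-level meta-data with file-level values taking precedence."""
        return {**self.metadata, **self.files[file_name]}


bucket = Bucket("b10", "vod", {"enabled": True, "bandwidth": 2000})
bucket.add_file("movie.mp4", bandwidth=500)
settings = bucket.effective_metadata("movie.mp4")
```

Keeping the two levels separate and merging them only on lookup means a later change to the bucket-level meta-data still reaches every file that has not overridden the parameter.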

CDN Manager: The CDN manager hosts the Content Manager API, the DNS API and the Network Topology API (all APIs are on this server). There is one instance of the CDN manager for the entire CDN. The CDN manager may reside at one of the publishing points or in a separate server. The CDN manager may reside in OB0 or any one of the OBs.

End Point: An end point is the entity that manages communication between requesting end users and the CDN infrastructure. It is essentially a custom HTTP server.

The end points maintain a list of buckets/files that are available with other end points in the same datacenter. This allows end points in the same datacenter to get content from one another when possible and go to the origin server for the requested content when necessary. The end points also have a list of all files and buckets in the OB. Thus, they are also responsible for geo-blocking requests to the CDN infrastructure, discarding frivolous content requests and performing authentication on requested content if deemed necessary by content owners. The end points also play a central role in DNS resolution, since they have both the consistent hash ring and the regions database with which to identify the region of the originating request and, if necessary, issue an HTTP redirect of the request to the appropriate region.

Tracker: The tracker is the key entity that enables intelligence and coordination of the CDN infrastructure. In order to do this, a tracker (1) maintains information about content at each end point through consistent hashing and (2) collects resource usage statistics periodically from each end point. As part of this, the tracker maintains information like number of outbound bytes, number of inbound bytes, number of active connections for each bucket, size of content being served, etc.

The tracker assists in communication between end points by identifying neighbourhoods of end points, allowing them to share data between one another rather than going to the origin server to retrieve data each time. The tracker also helps end points synchronize with other entities of the CDN infrastructure: changes in bucket meta-data and geodb with the CDN manager, the regions database with the TLD DNS server, and pidlocdb and cost-matrix with the topology server.

Origin Server: This is the element of the service provider's CDN infrastructure that contains the master copy of the data. Any end point that does not have a copy of the data can request it from the origin server. A CDN customer does not have access to the origin server. The service provider's CDN infrastructure moves data from the publishing server to the origin server after performing sanity-checks on the downloaded data (MD5 of data etc.). The origin server also maintains block-level integrity of the downloaded data.

The origin server is a property of buckets in the service provider's CDN. Every file in a bucket has an origin server URL associated with it. Files from the CDN customer are first uploaded to the publishing point and then moved to the origin server after post-processing.
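The sanity-check and block-level integrity mentioned above may, for example, be sketched with MD5 digests as follows. This is an assumption-laden illustration rather than the described implementation: the function names, the choice of MD5 for the per-block digests and the block size are all hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """MD5 digest of the data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def verify_upload(data: bytes, expected_md5: str) -> bool:
    """Sanity-check a downloaded file against the digest supplied with it."""
    return md5_hex(data) == expected_md5.lower()


def block_digests(data: bytes, block_size: int = 1 << 20) -> list[str]:
    """One digest per fixed-size block (block size is an assumed value)."""
    return [md5_hex(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def corrupted_blocks(data: bytes, expected: list[str], block_size: int) -> list[int]:
    """Indices of blocks whose digest is missing or differs from the expected one."""
    actual = block_digests(data, block_size)
    count = max(len(actual), len(expected))
    return [i for i in range(count)
            if i >= len(actual) or i >= len(expected) or actual[i] != expected[i]]
```

Per-block digests allow an end point to re-fetch only the damaged blocks of a large file rather than the whole file.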

Topology Server: The Topology server provides a topology aware service that has information about the topology of the OB in which it operates. Every OB can specify their routing policies between IP prefixes at the topology server. The topology server maintains information about the transit networks and cost of traversing links as well.

The topology server helps the tracker select the path of least cost (as defined by the OB policy) in the event that there are multiple paths between a content source and a requesting end user.
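As an example only, selecting the path of least cost between two partitions may be sketched with Dijkstra's algorithm. The description does not state which algorithm the topology server uses, so the algorithm, the function name least_cost_path and the edge representation are assumptions of this sketch.

```python
import heapq


def least_cost_path(edges, source, target):
    """Return (cost, path) for the cheapest route from source to target.

    edges maps (from_pid, to_pid) -> non-negative cost.
    Returns (inf, []) when the target cannot be reached.
    """
    graph = {}
    for (u, v), cost in edges.items():
        graph.setdefault(u, []).append((v, cost))

    best = {source: 0}   # cheapest known cost to each partition
    prev = {}            # predecessor on the cheapest known path
    heap = [(0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue     # stale heap entry
        for nxt, cost in graph.get(node, []):
            cand = dist + cost
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (cand, nxt))

    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return best[target], path
```

The edge costs stand for whatever the OB policy assigns to each link; the sketch only assumes that they are non-negative.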

Live Splitter: When the customer creates a bucket for a "live" channel (or event), it is passed to the live splitter. The live splitter serves as an origin server for a live stream. The live splitter is responsible for getting a live-stream, splitting a live stream and using a segmenter to create a playlist. The playlist generated is sent to end points that can serve live content. The live-splitter is limited only by the available bandwidth.

DNS Server: The DNS server provides name resolution within the CDN infrastructure. The root DNS forwards a query for t-cdn.net to the DNS server that is authoritative for the top-level domain (TLD). A subsequent query to the TLD DNS server resolves to the DNS server authoritative for the second-level domain. These servers, authoritative for the second-level domain, are within each OB.

Log Server: HTTP log files from all end points are sent to this server. These log files are then processed by the processing infrastructure at the log server for real-time display and accounting for the CDN customers.

Next, the entities that are part of the CDN deployment are described. First, the elements that are part of a typical single OB deployment are identified. Next, an example with multiple OBs is detailed. The global entities in the CDN are also identified and detailed.

Single OB Deployment:

Figure 1 shows a typical deployment of an OB, according to an embodiment of the invention. This deployment assumes one region in the OB. The deployment consists of a tracker, a DNS server that is authoritative for an OB, a publishing point (comprising both the CDN manager and the Content Publisher), a live splitter to handle live content, a topology server, an origin server to hold content uploaded by the CDN customers, a log server to hold logs from all end points and a number of end points that serve content to the requesting end users. The log server feeds into the reporting infrastructure of the OB. The log processing/reporting may occur in a separate network and is part of an OB deployment. In addition, the TLD for the CDN may also reside within an OB. The requirements for the deployment of an OB are summarized as follows:

(1) The TLD DNS subnets are independent. They are not considered as part of the OB's topology and never subject to geo-blocking.

(2) An OB must have exactly one topology server.

(3) An OB may deploy the CDN in more than one region (within the same OB).

(4) A region must have exactly one tracker and DNS server authoritative for that region.

(5) Two OBs may have the same region ID. The CDN administrator at an OB assigns region IDs within each OB (an OB may have several regions). Region IDs are not globally unique.

(6) A region may contain more than one subnet.

(7) A subnet belongs to exactly one region.

(8) If an administrator defines two subnets and one subnet fully includes the other (a subnet cannot be only partially included; two subnets are either nested or disjoint), then the two subnets can belong to different regions and OBs. The smaller subnet takes precedence in determining which region an address belongs to. That is, if subnet A contains subnet B, and A and B belong to regions R_A and R_B respectively, then any IP address from subnet B belongs only to region R_B, and not to region R_A.

(9) A default region 0 is defined for every OB. An IP address not in any IP subnet of the OB is assigned to region 0. The TLD DNS server is responsible for forwarding end user requests to the appropriate OB DNS server.

Thus, as explained above, an OB may have as many trackers and DNS servers as the number of regions in an OB. The DNS servers of an OB are authoritative only in the region in which they operate (one DNS server per region).
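Requirements (7) to (9) above amount to a most-specific-subnet lookup with a default. A minimal sketch, assuming the regions database is available as a mapping from subnets in CIDR notation to region IDs (the function name region_for and this representation are hypothetical):

```python
import ipaddress

DEFAULT_REGION = 0  # region 0, used when no subnet of the OB matches


def region_for(ip, subnet_regions):
    """Return the region ID for an IP address.

    subnet_regions maps CIDR strings to region IDs. When nested subnets
    both contain the address, the smaller (more specific) subnet wins.
    """
    addr = ipaddress.ip_address(ip)
    best_net, best_region = None, DEFAULT_REGION
    for cidr, region in subnet_regions.items():
        net = ipaddress.ip_network(cidr)
        if addr.version != net.version or addr not in net:
            continue
        if best_net is None or net.prefixlen > best_net.prefixlen:
            best_net, best_region = net, region
    return best_region


regions = {"10.0.0.0/8": 1, "10.1.0.0/16": 2}
region_for("10.1.2.3", regions)   # inside both subnets: region 2
region_for("10.2.0.1", regions)   # only inside 10.0.0.0/8: region 1
region_for("192.0.2.1", regions)  # in no subnet: default region 0
```

The linear scan keeps the sketch short; a deployment with many subnets would more likely use a prefix tree, which the description neither requires nor excludes.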

Deployment across multiple OBs:

Figure 2 shows a deployment of the CDN with three OBs. Each deployment runs as an independent CDN. Links marked 0 are backbone links that connect the OBs with large, core routers in the Internet. The labels on OB1 are explained as follows: Label 1 is the publishing point (also called the publishing server). This is where the CDN customer creates buckets and uploads content. The content is moved to the origin server (label 1a) after post-processing and sanity checks. When an end user requests content, the DNS server (after one HTTP redirection) forwards the request to the tracker at the OB (label 1b). After executing a consistent hash on the request, the tracker forwards the request to the end point (label 1c) that can best serve the request. When the request for content arrives directly from the end user, if the end point has the content, it will serve the content. If not, the end point will fetch the content from the origin server (label 1d) before serving the request.

The CDN infrastructure also contains a global element, the TLD DNS server. It can either sit independently or it can reside within one of the OBs. The CDN Manager resides with the publishing server in one of the OBs (since there is one instance of the CDN Manager for the CDN).

Description of region zero (global entities)

Every OB must set a default region 0. An OB administrator is responsible for defining the range of IPs under administration of each of its regions. Since no OB will define the whole range of IPs, the remaining IP prefixes are assigned to region 0 (region zero).

Region zero is a default global region. It has a tracker to coordinate all of the end points deployed in this region. Any request from an IP that is not part of any of the regions will be served by an end point from the default region. This global region is a separate entity altogether. This global region is dimensioned so that it can handle requests and has enough capacity to serve content requests from all parts of the world where regions are not properly defined.

API Definitions

As stated above, the method and deployment scenario of the CDN infrastructure of the invention provides for the use of a CDN manager hosting all the APIs for the CDN. This allows a CDN service provider for an OB to use the APIs for content management, for DNS service when defining a region, and for Network Topology when defining the underlying network. The CDN manager provides Web APIs (or web services) based on REST-style communication with defined request and response messages. Said APIs are described next for some embodiments.

Content Management API

This API provides all necessary methods to manage content that the CDN will deliver. The delivery is based on the concept of an intelligent bucket, a logical container with appropriate meta-data. The meta-data for a bucket is composed of two parts: file-system level meta-data (as in a Unix file-system) and content distribution meta-data.

Two kinds of buckets are defined: (1) VoD bucket for video streaming or static file delivery and (2) Live bucket for live streaming.

The VoD bucket defines meta-data for authentication type, distribution bitrate, geo-blocking, content pre-positioning, whitelist / blacklist, etc.

The Live stream on the other hand defines meta-data for source URL of the live stream, the live-splitter, geo-blocking, whitelist / blacklist, etc.

The API defines methods for creating, deleting, retrieving and updating bucket meta-data.

In addition, the content management API defines file-level meta-data updates for each file in a bucket. The file-level meta-data override the bucket-level meta-data. The API defines methods for listing files in a bucket, adding files to and deleting files from a bucket. Some of the meta-data parameters that may be overwritten are: enabled, startdate (valid from), enddate (valid until), bandwidth (delivery rate), geoloc (for geo-blocking), etc.

DNS API

The DNS entry point is part of the CDN Manager in the form of a DNS manager. The DNS manager provides a DNS API for communication between a CDN / OB administrator and the CDN infrastructure. There is one name space for the CDN: t-cdn.net. The namespace is divided into OBs. An OB may have multiple regions. The regions are each assigned a region ID (an integer). Once an OB or region administrator adds a CDN component in an existing region or adds a DNS server in a new region via DNS API calls, the DNS update is sent to the authoritative DNS server of the region. The DNS API is used to assign names and regions, to perform geo-location and to translate names in the name space.

The TLD DNS server is responsible for geo-localization, translating names in the CDN name-space: (a) Bucket names will always be geo-localized using the regions database. For example, b10.t-cdn.net will become b10.111.t-cdn.net for a user in region 111. (b) Named hosts (e.g., webcache1) will be geo-localized if they are present in the names database. If the named host is not present in the names database, the name will be resolved as an alias for region 0. The TLD DNS is in the default region 0. IP address space that does not belong to any region in any OB defaults to region 0. The TLD DNS is responsible for forwarding end user requests to the appropriate OB DNS servers.
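The name translation in the example above (b10.t-cdn.net becoming b10.111.t-cdn.net for region 111) may be sketched as follows. The function names and the representation of the names database as a set of host labels are hypothetical, and mapping an unknown host to a name under region 0 is only one possible reading of "resolved as an alias for region 0".

```python
CDN_DOMAIN = "t-cdn.net"


def _label(name, domain=CDN_DOMAIN):
    """Strip the CDN domain from a name, e.g. 'b10.t-cdn.net' -> 'b10'."""
    suffix = "." + domain
    if not name.endswith(suffix):
        raise ValueError(f"{name!r} is not in the {domain} name space")
    return name[:-len(suffix)]


def geo_localize(name, region_id, domain=CDN_DOMAIN):
    """Insert the region ID between the bucket label and the CDN domain."""
    return f"{_label(name, domain)}.{region_id}.{domain}"


def localize_host(name, region_id, names_db, domain=CDN_DOMAIN):
    """Geo-localize a named host if it is in the names database.

    Hosts missing from the names database fall back to region 0
    (an assumed interpretation, as noted above).
    """
    label = _label(name, domain)
    region = region_id if label in names_db else 0
    return f"{label}.{region}.{domain}"


geo_localize("b10.t-cdn.net", 111)  # "b10.111.t-cdn.net"
```

The region ID passed in would come from looking up the requesting IP address in the regions database.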

The API defines a method for creating entities in an OB or a region in an OB, and a GET on the names method to list all the entities in an OB. The API also defines methods that use a regions code to retrieve, update and delete information about a network element (tracker, DNS server, topology server etc.).

The API is useful in creating a region, retrieving the list of current regions, retrieving information about the region, updating region information (e.g. subnets that belong to a region) and deleting a region.

The API also defines methods to retrieve the regions database (all IP prefixes in a region from a regional DNS server) and the entire database of regions (all IP prefixes in all regions) from the TLD DNS server. The tracker executes the last two API calls.

Network Topology API

The network topology API allows the specification of a logical network topology graph from the point of view of a network operator, by acting as an interface with the Topology Server. The API allows the operators to abstract physical connections into logical partitions containing subnets. The partitions are linked to one another with edges that have weights associated with them. These weights indicate the cost of transferring data on the link between the two partitions.

Subnets: A subnet object has the following attributes: subnet and mask. The API defines methods to create a new subnet, delete an existing subnet and retrieve a collection of subnets. The API also defines methods to add subnets to a partition ID (PID), delete subnets from a PID, and retrieve all subnets that belong to a PID. A method to retrieve the entire partition - subnet pairs list is also defined.

Partitions: The API defines methods to create a partition object that returns a PID, return a collection of partitions, retrieve information about (subnets in) a PID, update / create a PID and delete a PID.
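The subnet and partition operations listed above can be illustrated with a small in-memory model. This is a sketch under assumptions: the class name `TopologyStore`, its method names and the sequential PID allocation are hypothetical, and the actual API is a Web API whose request and response formats are not reproduced here.

```python
import ipaddress
import itertools


class TopologyStore:
    """In-memory stand-in for partition and subnet records (hypothetical)."""

    def __init__(self):
        self._next_pid = itertools.count(1)
        self._partitions = {}  # PID -> set of ip_network objects

    def create_partition(self):
        """Create an empty partition and return its PID."""
        pid = next(self._next_pid)
        self._partitions[pid] = set()
        return pid

    def delete_partition(self, pid):
        del self._partitions[pid]

    def add_subnet(self, pid, subnet, mask):
        """Attach subnet/mask (the two subnet attributes) to a PID."""
        self._partitions[pid].add(ipaddress.ip_network(f"{subnet}/{mask}"))

    def delete_subnet(self, pid, subnet, mask):
        self._partitions[pid].discard(ipaddress.ip_network(f"{subnet}/{mask}"))

    def subnets_of(self, pid):
        """Retrieve all subnets that belong to a PID."""
        return sorted(str(net) for net in self._partitions[pid])

    def pairs(self):
        """Retrieve the entire partition-subnet pairs list."""
        return sorted(
            (pid, str(net))
            for pid, nets in self._partitions.items()
            for net in nets
        )
```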

Edges: A network edge defines the cost of transporting one unit of data between two PIDs (or simply, the weight of an edge). The API defines methods to get the costmatrix (a matrix that defines the cost between every partition pair), and to create, update and retrieve the cost between two PIDs. As a default, the cost between two PIDs is set to a large value. The API also defines methods to retrieve all edges with a given origin PID and to retrieve all edges to a destination PID.

The topology described at a Topology Server can extend far beyond its OB boundaries. Only the OB point of view of the network topology is described. As an example, an OB in Spain creates a partition that defines all the subnets in the USA while another OB in Argentina creates partitions that define all of the subnets of the USA.
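The edge operations described above, including the large default cost, can be sketched as below. The class `CostMatrix`, the value chosen for `DEFAULT_COST` and the treatment of edges as directed (origin to destination) are assumptions made for illustration.

```python
DEFAULT_COST = 10**9  # assumed stand-in for the "large value" default


class CostMatrix:
    """Directed edge costs between partition IDs (illustrative only)."""

    def __init__(self, default=DEFAULT_COST):
        self._default = default
        self._costs = {}  # (origin PID, destination PID) -> cost

    def set_cost(self, origin, dest, cost):
        """Create or update the cost of the edge origin -> dest."""
        self._costs[(origin, dest)] = cost

    def cost(self, origin, dest):
        """Retrieve the cost between two PIDs, or the default if unset."""
        return self._costs.get((origin, dest), self._default)

    def edges_from(self, origin):
        """All explicitly defined edges with the given origin PID."""
        return {d: c for (o, d), c in self._costs.items() if o == origin}

    def edges_to(self, dest):
        """All explicitly defined edges to the given destination PID."""
        return {o: c for (o, d), c in self._costs.items() if d == dest}

    def matrix(self, pids):
        """Cost between every ordered pair of distinct partitions."""
        return {o: {d: self.cost(o, d) for d in pids if d != o} for o in pids}
```

Pairs without an explicit edge keep the large default, so a path-selection step that minimizes cost would avoid them unless no cheaper alternative exists.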

Method for adding/deleting a region in an OB:

For the next described embodiment, the APIs described above that are hosted at the CDN manager are used to add and delete regions in an OB.

The CDN manager also hosts the DNS manager that is the entry point to the TLD DNS server. This allows the DNS manager to remotely manage the SQL database at the TLD DNS server.

An OB administrator assigns a region ID for a new region in an OB. Once all the elements of a region within an OB are configured, the DNS API is invoked. This call to the DNS API associates the new region with the TLD DNS server. This allows the TLD DNS server to recursively resolve a request destined to a region in an OB.

When a region within an OB is deleted, the DNS API is invoked. This has the effect of (i) deleting the region from the TLD DNS server, (ii) deleting all the NS records and A records of the region just removed and (iii) deleting all IP address ranges that belong to the deleted region.
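Since the TLD DNS server is described as backed by an SQL database, the effects of adding and deleting a region can be sketched with an in-memory SQLite database. The table names, columns and function names below are hypothetical; the actual schema is not given in this description.

```python
import sqlite3


def open_db():
    """Create an in-memory database with an assumed, illustrative schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE regions (region_id INTEGER PRIMARY KEY, ob TEXT);
        CREATE TABLE records (region_id INTEGER, rtype TEXT, name TEXT, value TEXT);
        CREATE TABLE prefixes (region_id INTEGER, cidr TEXT);
        """
    )
    return db


def add_region(db, region_id, ob, ns_name, ns_ip, cidrs):
    """Associate a new region with the TLD DNS data: NS/A records and prefixes."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO regions VALUES (?, ?)", (region_id, ob))
        db.execute(
            "INSERT INTO records VALUES (?, 'NS', ?, ?)",
            (region_id, f"{region_id}.t-cdn.net", ns_name),
        )
        db.execute(
            "INSERT INTO records VALUES (?, 'A', ?, ?)", (region_id, ns_name, ns_ip)
        )
        db.executemany(
            "INSERT INTO prefixes VALUES (?, ?)", [(region_id, c) for c in cidrs]
        )


def delete_region(db, region_id):
    """Apply effects (i) to (iii): drop the region, its NS/A records, its ranges."""
    with db:
        db.execute("DELETE FROM regions WHERE region_id = ?", (region_id,))
        db.execute(
            "DELETE FROM records WHERE region_id = ? AND rtype IN ('NS', 'A')",
            (region_id,),
        )
        db.execute("DELETE FROM prefixes WHERE region_id = ?", (region_id,))
```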

Description of the interaction with service provider's CDN:

Next, the interactions for the service provider's CDN are described. The interactions consist of three parts: (a) a CDN customer's interaction with the service provider's CDN when adding content, (b) the interaction between all of the components of the CDN infrastructure and (c) how the CDN serves content when an end user makes a request for content hosted by the CDN.

- Customer interaction with a service provider's CDN infrastructure:

The CDN customer interactions with the service provider's CDN are described next. Figure 3 shows the sequence diagram of interactions between the CDN customer and the CDN infrastructure, and between various elements of the CDN infrastructure, on uploading content.

The entry point is the sole point of communication for a CDN customer with a service provider's CDN infrastructure. A CDN customer can use her account to create, remove and edit buckets (a directory or folder that holds the CDN customer's content files).

A CDN customer must first create a content bucket within the CDN infrastructure. The bucket has a globally unique id that helps the CDN service provider to identify a CDN customer. The CDN manager as shown in Figure 3 provides this unique bucket id. When customers create a bucket, associate appropriate meta-data to the bucket and upload content to the bucket, a requests.xml file is created. This file (requests.xml) contains a list of files that need to be placed in a bucket as well as meta-data associated with each file.

A CDN customer may add content by any one of two methods: upload content or give the IP address of the CDN customer's origin server from where the CDN infrastructure can download the content. The content is downloaded at the publishing point. Before the content is transferred to the origin server, the integrity of content is checked. The customer can either upload the files themselves to the publishing server or point to the files as being in a server at the customer's site. In this case, the CDN infrastructure will pull the content from the CDN customer's server. Since the XML also contains meta-data for each file in the bucket and is set by the customer, it ensures that the customer has full control over the content.

If the content is uploaded at the publishing server, the CDN customer may also assign some file-level meta-data and / or override some of the bucket meta-data. This has the effect of creating a requests.xml file. The requests.xml file defines meta-data of the uploaded file(s) in the bucket. At the publishing point, a monitoring process looks for the existence of the requests.xml file. Once it detects that all files referenced in the XML are present (and have the right size), the files are moved to another directory for post-processing. At this time, the content publisher is invoked.
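The completeness check performed by the monitoring process can be sketched as follows. The layout of requests.xml is not specified in this description, so the `<file name="..." size="..."/>` elements used here, as well as the function name `upload_complete`, are purely hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def upload_complete(directory):
    """Return True once requests.xml exists and every file it references
    is present in the directory with the declared size.

    Assumes a hypothetical layout such as:
        <requests><file name="video01.flv" size="1048576"/></requests>
    """
    manifest = os.path.join(directory, "requests.xml")
    if not os.path.isfile(manifest):
        return False
    for entry in ET.parse(manifest).getroot().iter("file"):
        path = os.path.join(directory, entry.get("name"))
        if not os.path.isfile(path):
            return False  # a referenced file has not arrived yet
        if os.path.getsize(path) != int(entry.get("size")):
            return False  # the upload is still in progress or truncated
    return True
```

A monitoring loop could call such a function periodically and move the files to the post-processing directory only once it returns True.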

The content publisher checks the uploaded files for content integrity. Once the integrity of the content is checked against the meta-data defined by a CDN customer, a block-level checksum for each uploaded file is created. This ensures that the master copy of the content residing in the origin server of the CDN is always valid.
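A block-level checksum of the kind mentioned above can be sketched as below. The block size and the hash algorithm are not stated in this description; the 64 KiB block size, the use of SHA-256 and the function names are assumptions chosen only for illustration.

```python
import hashlib
from typing import List

BLOCK_SIZE = 64 * 1024  # assumed block size; not specified in the text


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one hex digest per fixed-size block of data (SHA-256 assumed)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def corrupted_blocks(
    data: bytes, expected: List[str], block_size: int = BLOCK_SIZE
) -> List[int]:
    """Return the indexes of blocks that are missing, extra or mismatched."""
    actual = block_checksums(data, block_size)
    count = max(len(actual), len(expected))
    return [
        i
        for i in range(count)
        if i >= len(actual) or i >= len(expected) or actual[i] != expected[i]
    ]
```

With per-block digests, a receiver can identify exactly which blocks of a downloaded file are invalid instead of rejecting the whole file.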

Once the requests.xml file is detected at the publishing point, it is read, and if the media files are uploaded, they are moved to the origin server after checking for their integrity. On the other hand, if the URLs of the media files point to a server in the customer's premises, these files are downloaded to the publishing point and then to the origin server of the service provider's CDN infrastructure. All files are now under the control of the CDN infrastructure but all meta-data control over individual files and buckets is with the CDN customer. This ensures that the CDN customer has full control over the duration for which content is valid and also the geographic location in which the content may be shown. The meta-data allows a CDN customer to add, remove and update content file(s).

After processing the files, the monitoring process generates a responses.xml file in the same directory. This notifies the CDN customer if the content uploaded to the publishing point was successfully ingested. In addition, the CDN customer gets the CDN URL(s) of the uploaded content.

Next, the content publisher moves the uploaded files together with the block level integrity check to the origin server. Any end point requesting content will make the request to the origin server. The same holds if the content is downloaded from the origin server by any of the CDN end points that serve the content to a requesting CDN customer.

The end points have a list of buckets and files together with block level checksum information from the CDN manager (via the tracker). When end users request a content file, the end point contacts the origin server. The origin server (or other end points in the same datacenter) serves the requested content. The receiving end point verifies the block level integrity of the downloaded file. Once verified, the end point serves the content to the requesting end user(s).

- CDN Interaction among its core components:

The CDN infrastructure consists of a variety of components - tracker, topology server, end points, publishing server, origin server and entry point. A CDN customer can interact with the service provider's CDN infrastructure only at the publishing server.

The communication between each of the CDN entities occurs via RPCs. The RPCs may take any of the following formats: XML, binary, json, REST API call etc. with HTTP as a transport mechanism.

In this section, the interaction between the entities of the CDN infrastructure is described.

Interaction between the CDN manager and the tracker: Any change in meta-data of a bucket (or the file in a bucket) by a CDN customer is reflected at the CDN Manager immediately. Since the tracker synchronizes the buckets with the CDN Manager periodically, any change in the bucket meta-data is reflected at the tracker in a matter of seconds. The tracker also synchronizes with the end points frequently. So, any change in the meta-data of the bucket (or any file in a bucket) at the content manager is propagated to the end points in a very short time (tens of seconds).

The geo-location database, geodb (and any changes in the geodb) are propagated to the tracker periodically. The geodb information is sent to the end points to exercise geo-blocking.

Interaction between tracker and topology server: The trackers keep an open connection with the topology server. The tracker fetches information about (1) partitions (or subnets), called pidlocdb and (2) cost-matrix (called costmatrix) between partitions (or subnets). The topology server has a global view across geographically distributed trackers in an OB. ISPs can define their routing policies via the topology server. The topology server can recommend certain routes to the tracker over others depending on cost of sending data across the transit networks, IGP or BGP policies and the current load on the links. This information is contained in the cost-matrix between partitions.

Interaction between the CDN end points and the log server: Periodically, all content access logs in every end point of an OB are forwarded to the log server. These log files are processed for maintaining accounting for the CDN customers and to monitor the end point traffic in the CDN.

Interaction between tracker and TLD DNS Server: The tracker receives a file that contains information about regions (called regionsdb) from the TLD DNS server. This regions information is passed to the end point periodically or when updated. This information is useful for an end point for (1) performing geo-blocking and (2) determining the region of an originating request.

If the region of the originating request is not the same as that of the end point, the end point returns an HTTP 302 while encoding the region as part of the URL (using regionsdb obtained from the TLD DNS server).
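The region check and redirect described above can be sketched as follows. The representation of regionsdb as a list of (prefix, region ID) pairs, the function names and the exact shape of the redirect URL (modelled on names such as b10.111.t-cdn.net) are assumptions; the description does not define them.

```python
import ipaddress

NAMESPACE = "t-cdn.net"


def region_of(client_ip, regionsdb, default=0):
    """Look up the region of an address; regionsdb is an assumed list of
    (CIDR prefix, region ID) pairs. Unmatched addresses map to region 0."""
    addr = ipaddress.ip_address(client_ip)
    for cidr, region_id in regionsdb:
        if addr in ipaddress.ip_network(cidr):
            return region_id
    return default


def route(client_ip, bucket, path, own_region, regionsdb):
    """Serve locally (200) or redirect (302) with the region in the URL."""
    region = region_of(client_ip, regionsdb)
    if region == own_region:
        return 200, {}
    location = f"http://{bucket}.{region}.{NAMESPACE}/{path}"
    return 302, {"Location": location}
```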

Interaction between tracker and end points: The tracker synchronizes with the end points regularly. The tracker updates the end points with the following information regularly: (1) list of all buckets meta-data, (2) pidlocdb, a table of PIDs, (3) regionsdb, information about regions in the CDN, (4) geodb, the geolocation database, (5) costmatrix, the cost of transferring data between two PIDs and (6) neighbourhood IP list. Any change in a customer's bucket meta-data is propagated to the tracker. The change in bucket meta-data is also propagated to the end points in a matter of seconds. This ensures that any change a CDN customer makes to her files in a content bucket has an almost immediate effect on its distribution. All end points keep an open connection to the tracker. The tracker collects detailed usage (cpu, disk, connection count) and file statistics (bytes transferred inbound/outbound) at each end point. It uses this information to load-balance connections across end points after determining the end points that have the requested content upon executing consistent hashing on the requested file.

The tracker also helps the end point keep an updated list of all its neighbours. An end point can request the tracker to update its list of neighbours. In response, the tracker sends a list of neighbouring IP addresses and their file statistics. This allows an end point to determine which of the end points in its neighbourhood is best positioned to send the content requested (rather than go to the origin server to retrieve the content) by an end user.

Interaction between end points in the same datacentre: End points in the same datacenter inform their neighbours of the availability of newly downloaded content. Thus, other end points download the content (if needed) from end point(s) in the same datacenter that have the content thereby reducing the load on the origin server. This is referred to as peer-to-peer communication (since end points in the same datacenter are involved).

In Figure 4, label 1 corresponds to the CDN customer's interaction with service provider's CDN infrastructure. Labels 1a and 1b correspond to data being downloaded to the origin server of the CDN infrastructure.

- End-user interaction with service provider's CDN infrastructure:

Next, it is described how the Domain Name System (DNS) works for a service provider's CDN. This is the first step for a CDN in identifying an end point that is best positioned to serve the requested content to an end user. Once the end point is identified, it gets the requested content from either the origin server or another end point in the same datacenter. The entire process is described in more detail below:

The end users in a home do not communicate directly with a DNS resolver. Instead, the DNS resolution occurs transparently regardless of the application program a user runs. DNS resolution occurs in the following steps: (1) The requesting program sends an address resolution request to the DNS resolver in the local operating system. If the local cache contains the answer, the resolver returns the value in the cache to the requesting program. (2) If the cache does not contain the answer, the resolver will send the request to the designated DNS server of the organization. For most home users, the ISP to which the machine connects provides the DNS server (or resolver). (3) The DNS resolver of the ISP will attempt to resolve the IP address first in its local cache or by querying the root DNS server. So, the IP address received by any TLD DNS server is not the IP address of the client, but that of the DNS resolver of the ISP.

Any DNS resolver is configured with the known addresses of the root servers. A query to one of the root servers will return the server authoritative for the top-level domain (TLD). In the case of the proposed invention, the root DNS server will return the server authoritative for the .net domain. As a next step, the TLD DNS server is queried for the server authoritative for the second-level domain (in the case of the invention, this is the t-cdn.net domain). The subsequent labelled interaction of Figure 5 is explained below.

(0) The user makes a request for a video bucket_id.t-cdn.net/bucket_id/video01.flv. For simplicity of explanation, let's set bucket_id = 87. The request will now look like 87.t-cdn.net/87/video01.flv.

(1) The ISP DNS resolver queries the TLD DNS server for the .net domain to resolve t-cdn.net domain.

(2) The .net nameserver responds with the IP address of server authoritative for t-cdn.net domain.

(3) The DNS server authoritative for the second-level domain, t-cdn.net, first infers 87.t-cdn.net to be an alias for 87.g.t-cdn.net. So, a query for the g.t-cdn.net domain will be performed.

(4) To resolve the sub-zone, it uses its geo-IP database.

(5) The authoritative DNS server for the t-cdn.net domain resolves the sub-zone of the ISP DNS resolver. In the example of this invention, the DNS server authoritative for the second-level domain responds with the sub-zone of the client as "es" (for Spain). The DNS server resolves the IP address in the sub-zone to be es.t-cdn.net.

(6) The DNS server es.t-cdn.net is the authoritative DNS server for the "es" sub-zone. The ISP DNS resolver will then attempt to resolve 87.es.t-cdn.net by querying the authoritative DNS server in the sub-zone.

(7) The authoritative DNS server in the sub-zone has a list of all the end points. It forwards the request to any one of the end points using either a round-robin scheme or a best effort geo-location scheme that tries to match the request with the closest end point from {End Point 1, End Point 2, End Point 3, ... , End Point N}.

(8) End point 2 receives the request. The end point performs a fine-grained geo-location on the requested IP address and identifies the location as BCN. This implies that an end point in BCN datacenter may be best suited to serve content to the requesting end user.

(9) End point 2 performs a consistent hash on the requested URL (here, "87/video01.flv"). Next, End point 2 returns HTTP 302, Moved Location abf8.bcn.es.t-cdn.net where abf8 = sub-string(MD5(URL)). The HTTP response is returned to the End User.

(10) The End User sends an address resolution request for abf8.bcn.es.t-cdn.net.

(11) The ISP DNS resolver forwards the address resolution request from the client to the DNS resolver in the "es" sub-zone.

(12) The datacenter / PoP is identified (in BCN). The address resolution request is now sent to the tracker.

(13) The tracker performs consistent hashing on abf8 to obtain {End Point 3, End Point 2, and End Point N} as end points that can best serve the request.

(14) The tracker takes into account current load at the end points and identifies end point 3 as being best suited to serve content. The response from the tracker identifying end point 3 is returned to the authoritative DNS resolver in the "es" sub-zone.

(15) The "es" sub-zone DNS resolver forwards the response to ISP DNS resolver.

(16) The ISP DNS resolver returns the response to the End User.

(17) The End User will now attempt to directly connect to end point 3.
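The host name rewriting of step (9) can be sketched as follows. The description only states that the label is a sub-string of MD5(URL); taking the first four hexadecimal characters of the digest, and the function name `redirect_host`, are assumptions made for this example, so the label produced here need not equal the abf8 of the walk-through.

```python
import hashlib

NAMESPACE = "t-cdn.net"


def redirect_host(url_path, pop, subzone, length=4):
    """Build <hash>.<pop>.<subzone>.t-cdn.net for an HTTP 302 redirect.

    The hash label is assumed to be the first `length` hex characters
    of MD5(url_path); the source does not say which sub-string is used.
    """
    digest = hashlib.md5(url_path.encode("utf-8")).hexdigest()
    return f"{digest[:length]}.{pop}.{subzone}.{NAMESPACE}"


# Example: the end point in step (9) redirecting a request towards the BCN PoP.
host = redirect_host("87/video01.flv", pop="bcn", subzone="es")
```

Because the label depends only on the requested URL, every end point that receives a request for the same file produces the same redirect host name.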

If end point 3 does not have the requested content, it will attempt to get the content from any one of the other neighbourhood end points in the same datacenter (if the content already exists in one of the end points). Alternatively, end point 3 will revert to the origin server to get the requested content and serve the end user. Before serving the content, the end point may also communicate with an authentication server at the content owner if specified in the meta-data of the bucket.

In summary, the whole resolution process entails two queries. The first is a query to the DNS server of the second-level domain t-cdn.net to identify the datacenter closest to the requesting client. This step involves URL rewriting and an HTTP redirect message from the end point in the CDN infrastructure that is chosen to address the query. The second is a DNS query to identify the end point in the datacenter closest to the client that can best serve content. As part of this procedure, the DNS request is forwarded to the tracker that uses consistent hashing to identify the content and then uses statistical knowledge of the end points to return a set of end points (in order, with the first end point best positioned and so on) to best serve the content to the requesting end user.
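The tracker's selection of end points can be sketched with a conventional consistent-hash ring followed by ordering on load. This is one common way to realise consistent hashing and is not taken from the description: the virtual-node count, the use of MD5 for ring positions, the single scalar load value per end point and the class name `Tracker` are all assumptions.

```python
import bisect
import hashlib


def _position(value):
    """Map a string to a point on the hash ring (MD5 assumed)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Tracker:
    """Toy tracker: consistent hashing first, then ordering by load."""

    def __init__(self, end_points, vnodes=64):
        # Each end point appears at several ring positions (virtual nodes).
        self._ring = sorted(
            (_position(f"{ep}#{i}"), ep) for ep in end_points for i in range(vnodes)
        )
        self._keys = [pos for pos, _ in self._ring]
        self.load = {ep: 0.0 for ep in end_points}  # hypothetical load metric

    def candidates(self, key, count=3):
        """Walk clockwise from the key and collect distinct end points."""
        start = bisect.bisect(self._keys, _position(key))
        found = []
        for offset in range(len(self._ring)):
            ep = self._ring[(start + offset) % len(self._ring)][1]
            if ep not in found:
                found.append(ep)
                if len(found) == count:
                    break
        return found

    def resolve(self, key, count=3):
        """Return the candidates ordered from least to most loaded."""
        return sorted(self.candidates(key, count), key=lambda ep: self.load[ep])
```

In this sketch the hash decides which end points are candidates for a given file, and the load figures only decide their order, which mirrors the two stages described in steps (13) and (14).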

Note that the DNS server authoritative for the t-cdn.net domain may also be authoritative for the sub-zone "es". In such an instance, the DNS server detects that and will attempt to send the DNS query to one of the end points to first resolve the closest datacenter that may best serve the request.

Advantages of the Invention:

This invention provides the following features:

• The solution is designed to run on commodity hardware. In addition, the solution obviates the need for expensive load balancers and any high-end network elements.

• DNS resolution and load balancing for content distribution are de-coupled. This makes it easy to distinguish between an overloaded content server and an overloaded DNS server.

• The invention allows the CDN service provider to add trackers and end points as needed, making the CDN infrastructure of the invention very scalable.

• The invention shows how the CDN infrastructure may add additional OBs easily.

• The CDN architecture has a small number of global entities. A number of other entities are deployed on a per-OB basis.

• The global entity, the CDN manager, is responsible for the following: a. Assigning a globally unique bucket id. A CDN customer may have as many buckets as desired. b. Assigning globally unique OB ids. Thus, the CDN manager is invoked when creating a new OB.

• The CDN incorporates the latest network cost in identifying end points that are best located to serve content to requesting end users based on least network cost.

• The CDN infrastructure of this invention does not need very large data centres for hosting. The end points may be added in small datacenters as needed. Depending on how deep in their network the ISP wants to host the end points, the datacenter can be at a block level or city level or at the region level. Unlike traditional CDNs, the CDN of this invention is designed for ISPs.

• The end points may be placed at the same location as the DSLAMs. The physical location of DSLAMs makes it easy for content to be located close to the end users.

• The tracker behaves as a file-level load-balancer for the entry points obviating the need for other hardware to behave as a load-balancer.

A person skilled in the art could introduce changes and modifications in the embodiments described without departing from the scope of the invention as it is defined in the attached claims.

ACRONYMS

ADSL Asymmetric Digital Subscriber Line

CDN Content Distribution Network

DNS Domain Name System

TLD DNS Top Level Domain DNS

PoP Point of Presence

MD5 Message-Digest algorithm 5

OB Operating Business

PID Partition ID

TLD Top Level Domain

URL Uniform Resource Locator

HTTP Hypertext Transfer Protocol

VoD Video on Demand

ISP Internet Service Provider

FTP File Transfer Protocol

TTL Time To Live

LRU Least Recently Used

API Application Programming Interface

REFERENCES

[1] Content delivery network. At http://en.wikipedia.org/wiki/Content_delivery_network

[2] C. Huang, J. Li, Angela Wang and K.W. Ross, Understanding Hybrid CDN-P2P: Why Limelight Needs its Own Red Swoosh, NOSSDAV, Braunschweig, Germany, 2008

Claims

1.- Method for managing the infrastructure of a Content Distribution Network service in an ISP network, characterised in that it comprises using a CDN manager to administer one or more regions in an operating business or OB by means of one or more Application Programming Interfaces APIs hosted by the CDN manager of the Content Delivery Network.

2.- Method as per claim 1, comprising using said CDN manager to manage geographically distributed operating businesses.

3.- Method as per claim 1 or 2, comprising using said APIs by a CDN service provider for at least adding and/or deleting regions in an OB.

4. - Method as per claim 3, wherein said APIs are at least one of:

- a content management API allowing said CDN service provider to provide the content owners the tools to manage their content,

- a DNS API that allows for the configuration of DNS servers in new regions, and names of CDN components both in the same region as said DNS server and at OB level, and

- a Network Topology API allowing the CDN service provider to define its view of Network Topology, either in detail or as a high level abstraction of the underlying network.

5. - Method as per claim 3 or 4, wherein said APIs are Web APIs based on REST style communication with defined request and response messages.

6. - Method as per claim 4, wherein said content management API provides all necessary procedures to manage content to be delivered through the CDN in the form of buckets with associated meta-data, including procedures for creating, deleting, retrieving and updating said bucket meta-data and/or procedures for listing files in a bucket, adding files to and deleting files from a bucket and procedures for defining and updating meta-data for content distribution.

7. - Method as per claim 4, 5 or 6, wherein said DNS API is provided by a DNS manager hosted by the CDN manager, said DNS manager being the entry point of a TLD DNS server and region DNS servers.

8. - Method as per claim 7, wherein said DNS manager configures TLD DNS server and region DNS servers, by at least adding names and regions.

9. - Method as per claim 8, comprising an OB administrator or region administrator calling said DNS API for adding a CDN component in an existing region or adding a DNS server in a new region.

10. - Method as per claim 8, comprising an OB administrator assigning a region ID for a new region in an OB and, once all the elements of a region within an OB are configured, calling said DNS API to associate new region with the TLD DNS server to allow the latter to resolve a request destined to a region in an OB.

11. - Method as per claim 10, comprising, when a region is deleted by calling the DNS API, the DNS manager:

- deletes the region in the TLD DNS server,

- removes all the NS and A records of the region just removed,

- deletes all IP address ranges that belong to the deleted region; and

- deletes all the Names in a region at the regions DNS server.

12. - Method as per claim 8 or 9, comprising using said DNS API for creating, modifying or deleting names and regions in a service provider's CDN.

13. - Method as per claim 12, comprising using said DNS API for performing at least one of:

- creating entities in an OB or a region in an OB;

- a GET on the names procedure to list all the entities in an OB;

- procedures that use a regions ID to retrieve, update and delete information about a network entity;

- creating a region, retrieving the list of current regions, retrieving information about the region, updating region information and deleting a region; and

- procedures to retrieve the regions database, including all IP ranges in a region from a regional DNS server, and the entire database of regions, including all IP prefixes in all regions from the TLD DNS server.

14. - Method as per any of claims 4 to 13, comprising using said Network Topology API to specify a logical network topology graph from the point of view of a network operator, by abstracting physical connections into logical partitions containing subnets.

15. - Method as per claim 14, wherein said partitions are linked to one another with edges that have weights associated with them, said weights indicating the cost of transferring data on the link connecting the two partitions.

16. - Method as per claim 14 or 15, comprising using said Network Topology API to create a new subnet, delete an existing subnet and retrieve a collection of subnets.

17. - Method as per claim 14, 15 or 16, comprising using said Network Topology API for at least one of:

- creating a partition object that returns a partition ID, or PID that has associated with it, a description and a name or label,

- adding subnets to a PID,

- deleting subnets from a PID,

- retrieving all subnets that belong to a PID,

- retrieving the entire partition-subnet pairs list,

- returning a collection of partitions,

- updating a PID, and

- deleting a PID.

18. - Method as per claim 15, comprising using said Network Topology API for at least one of:

- getting the entire costmatrix that defines the cost between every partition pair that is connected by a link,

- creating, updating and retrieving the cost between two PIDs;

- retrieving all edges with a given origin PID, and

- retrieving all edges to a destination PID.

19. - Method as per any of the previous claims, comprising using said CDN manager for providing a unique ID to a CDN customer bucket when a content bucket is created by the CDN customer.

20. - Method as per any of the previous claims, comprising using said CDN manager to interact with a CDN tracker to periodically synchronize buckets such that any change in the meta-data of a bucket or a file in a bucket is reflected at the tracker within seconds, and vice versa.

21. - Method as per claim 4 or any of claims 14 to 18, comprising said CDN manager using said Network Topology API as an interface with a Topology Server of said OB to define either a region and/or OB level view of network topology, either in detail or as an abstraction of the underlying network.

22.- Method as per claim 21, comprising using said Topology server via said Network Topology API at the CDN manager, to describe the topology at an OB level and/or creating an abstraction for the topology in a region and/or creating an abstraction for connecting two regions within the same OB and/or creating an abstraction of the topology connecting multiple OBs.

23. - Method as per claim 21 or 22, wherein said Topology server provides a topology aware service that has information about the network topology as seen from the OB in which it operates and/or information about OB routing policies between IP prefixes and/or maintains information about the transit networks and cost of traversing links.

24. - Method as per any of claims 21 to 23, comprising using said Topology server, via said CDN manager Network Topology API, to help a CDN tracker to select the path of least cost, as defined by one of said OB routing policies, in the event that there are multiple paths between a content source and a requesting end user.

25.- Infrastructure of a Content Distribution Network service in an ISP network, characterised in that it comprises a CDN manager hosting one or more APIs for implementing the method as per any of claims 1 to 24.