
How Xervo Works

Last Updated: Jan 03, 2017 11:12AM EST

Xervo is a Platform as a Service (PaaS). Our goal is to abstract the difficulties of managing infrastructure, environments and runtimes, and allow you to focus entirely on developing software. The Xervo platform handles everything required to deploy, manage, and monitor your product in a production ready, scalable, and secure environment.

The simplicity of using Xervo hides a tremendous amount of activity happening behind the scenes. This article describes the technical details of how Xervo works and what happens between you deploying source code and the source code running in production.

Xervo has a lot of features, but the core of the platform boils down to three major pieces:
    1. Deploying - pushing new revisions of your product and getting them into production.
    2. Scaling - once a product is deployed, scaling it up to meet any amount of demand.
    3. Load balancing - ensuring even distribution of traffic across all of your application instances.

In each of these pieces, Xervo makes use of Docker for its containerization technology. All customer applications are run inside Docker containers to ensure proper resource and security isolation without requiring the overhead of traditional virtual machines. 


Deploying

Deploying is typically the most common activity. Each time you revise your product, the new revision must be deployed into production. Each deploy goes through four major steps: upload, build, provision, and run.

[Diagram: Upload → Build → Provision → Run]


Upload

Uploading a new revision of a product is done by simply running the "xervo deploy" command using our CLI or by uploading a zip through the web interface. Many developers prefer the command line tool because it fits nicely into normal development workflows.

The uploaded bundle is inspected and then stored into our internal storage system. What you deploy depends on the runtime you've chosen for your application. The uploaded bundle could be Node.js source code, PHP source code, a WAR file describing a Java application, or simply static content.


Build

Xervo makes use of a distributed build system that takes your application, does whatever build process is required, and outputs a bundle. The bundle is then stored in our storage system to be used later in the process.

The Xervo build system makes use of ephemeral Docker containers that live only long enough to perform the build. When a build is invoked, the build system pulls the appropriate image from our internal registry, runs the build process, zips up the output, and then destroys the Docker container.

Each supported runtime has its own build image. The Xervo build images are open sourced so you can see exactly what steps your source code will go through.
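The ephemeral-build flow described above can be sketched in Python. This is a minimal illustration, not Xervo's actual build code: a temporary directory stands in for the build container, and a caller-supplied `build_step` function stands in for the runtime-specific build image.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

def ephemeral_build(source_dir: str, output_zip: str, build_step) -> str:
    """Run a build in a throwaway workspace, zip the result, destroy the workspace."""
    workspace = Path(tempfile.mkdtemp(prefix="build-"))  # stands in for the build container
    try:
        shutil.copytree(source_dir, workspace / "src")   # bring the source into the workspace
        build_step(workspace / "src")                    # runtime-specific build (npm install, mvn package, ...)
        with zipfile.ZipFile(output_zip, "w") as bundle: # zip up the build output
            for path in (workspace / "src").rglob("*"):
                if path.is_file():
                    bundle.write(path, path.relative_to(workspace / "src"))
    finally:
        shutil.rmtree(workspace)                         # the "container" is destroyed
    return output_zip
```

Whatever the build step does, the workspace is always torn down afterward, mirroring the throwaway nature of the build containers.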

[Diagram: Source Code → Build Server → Docker]


Provision

Provisioning is the act of allocating a Docker container, which we call a servo, for your application. Provisioning is one of the largest and most fundamental benefits of using the Xervo platform. The Xervo orchestration software converts regular infrastructure into a pool of container capacity ready to run production applications.

Provisioning only occurs if the project does not already have the needed containers. This is usually during an initial deploy or when scaling up. If the project already has all needed containers, subsequent deploys are sent to the existing, already provisioned containers.

[Diagram: Provisioner → Application Host]

The Xervo container runtime allows physical servers (e.g., AWS EC2 instances, DigitalOcean droplets) to run any number of Docker containers of varying sizes. A layer of intelligent capacity planning ensures application hosts run the optimal number and size of containers for the best possible performance.

When provisioned, each container receives a 2GB volume that can be read from and written to as needed by the application. When the container is deprovisioned, anything written to this volume is destroyed. For persistent storage we recommend using a Xervo MongoDB instance or an object storage solution like Amazon S3 or Joyent Manta.

Where containers are provisioned depends on the project's scale options. Xervo supports multiple infrastructure providers and multiple regions that can be provisioned simultaneously for truly global and highly available applications. Xervo automatically handles load balancing and failover for provisioned Docker containers.
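Scale options might look something like the following fragment. The field names here are illustrative, not Xervo's actual schema:

```json
{
  "instances": [
    { "provider": "AWS", "region": "us-east-1a", "count": 2 },
    { "provider": "Joyent", "region": "eu-ams-1", "count": 1 }
  ],
  "servoSize": 512
}
```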


Run

After the provisioner has allocated the appropriate runtime image, we can send the compiled bundle into that image. In most cases the provision step is skipped and new revisions of the app are deployed directly into the already provisioned containers. Since the bundle is already built, the switchover from the old version of your application to the new one is very quick. The process is:

    1. Send graceful shutdown request to old application. 
    2. Stop the old application instance. 
    3. Remove the old application instance. 
    4. Extract new application instance. 
    5. Start the new application instance. 

Under normal conditions, the actual switchover takes a few milliseconds. The amount of time before the application starts serving requests then depends on how long your application takes to start up.
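The five switchover steps above can be sketched as a simple sequence. The servo method names here are illustrative stand-ins, not Xervo's real internal API:

```python
import zipfile

def switchover(servo, bundle_path: str):
    """Replace the running application in a servo with a newly built bundle."""
    servo.signal_shutdown()                       # 1. ask the old app to shut down gracefully
    servo.stop()                                  # 2. stop the old application instance
    servo.remove_app_files()                      # 3. remove the old application instance
    with zipfile.ZipFile(bundle_path) as bundle:  # 4. extract the new application instance
        bundle.extractall(servo.app_dir)
    servo.start()                                 # 5. start the new application instance
```

Because the bundle arrives pre-built, the only work done inside the servo is an unzip and a process restart.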


Scaling

After a project is deployed to Xervo and starts growing, scaling it up is an important next step. Xervo supports two types of scaling: vertical and horizontal. Vertical scaling adds more memory to a servo; horizontal scaling adds more servos. In modern, large-scale applications, horizontal scaling is the preferred approach, because there is a practical limit to how much memory a single application instance can use, whereas in theory there is no limit to the number of servos.

When a project is scaled up it goes through a provisioning step to acquire more servos based on the scale options you supplied. Once the new capacity is available, the compiled bundle is sent into the container exactly as if you deployed a new version. Since we store the compiled output of your last deploy, the process of spinning up new capacity is very quick.

Xervo also supports auto-scaling for times of unexpected traffic or high utilization. Auto-scaling is based on rules you define, triggered by the statistics we track for running applications.
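An auto-scaling rule of the kind described above can be illustrated with a short sketch. The metric name and thresholds are hypothetical, not Xervo's actual rule format:

```python
def evaluate_scale_rule(metrics: dict, servo_count: int,
                        cpu_high: float = 80.0, cpu_low: float = 20.0,
                        min_servos: int = 1, max_servos: int = 10) -> int:
    """Return the new servo count after applying a simple CPU-based rule."""
    cpu = metrics["avg_cpu_percent"]
    if cpu > cpu_high and servo_count < max_servos:
        return servo_count + 1   # scale out under high utilization
    if cpu < cpu_low and servo_count > min_servos:
        return servo_count - 1   # scale in when traffic subsides
    return servo_count           # within the comfortable band: no change
```

A rule engine like this runs periodically against tracked statistics; when the returned count differs from the current one, a normal provisioning (or deprovisioning) step follows.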

Load Balancing

After an application is deployed and scaled, traffic can be sent to the application instances. The Xervo load balancer is a custom implementation written in Go to more easily support the dynamic nature of a PaaS environment. In a platform with thousands of containers constantly deploying, scaling, starting, and stopping, we required a balancer that could be reconfigured instantly, with zero downtime, thousands of times per day.

In order to receive network traffic to your application instance, your product must listen on port 8080. Since many frameworks support the PORT environment variable, we automatically inject a PORT environment variable set to 8080 for convenience. Our provisioning layer redirects port 8080 inside the container to a random port outside. The random port is then stored in the balancer cache using Redis so our balancer knows how to reach your instance. 
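For example, a minimal app that honors the injected PORT variable (shown in Python for brevity; any supported runtime follows the same pattern) reads the variable and falls back to 8080:

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"hello from a servo\n")

def make_server() -> HTTPServer:
    # Listen on the injected PORT, defaulting to 8080 as the platform expects.
    port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("0.0.0.0", port), Handler)
```

Calling `make_server().serve_forever()` starts the app; the platform maps port 8080 inside the container to a random external port, so the application itself never needs to know its public address.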

[Diagram: Request → Balancer → Servos]

The above diagram shows a typical request to an application that has two servos. When the request hits the balancer, it checks the cache for the locations of all servos based on the Host header of the HTTP request. Once the list of servos is retrieved, the balancer directs traffic to the appropriate one, keeping session affinity in mind. The balancer supports real-time failover in the event that it is unable to reach a servo.

The balancer will immediately try all servos until it is able to successfully proxy a request. This allows applications to survive failed application hosts, failed servos, or simply unhandled exceptions in their source code, without the customer seeing an error. 
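The lookup-and-failover behavior can be sketched as follows. This is an illustration of the idea, not Xervo's Go balancer: a plain dict stands in for the Redis cache, and session affinity is omitted for brevity.

```python
def proxy_request(host: str, cache: dict, send):
    """Route a request to one of the host's servos, failing over until one succeeds.

    `cache` maps a Host header to its list of servo addresses (a stand-in for
    the Redis balancer cache); `send` attempts to proxy the request to one
    servo address and raises ConnectionError on failure.
    """
    servos = cache.get(host)
    if not servos:
        raise LookupError(f"no servos registered for {host}")
    last_error = None
    for servo in servos:            # try every servo before giving up
        try:
            return send(servo)
        except ConnectionError as err:
            last_error = err        # this servo is unreachable; fail over
    raise last_error
```

Because every servo is tried before an error is surfaced, a single failed host or crashed instance is invisible to the user as long as at least one healthy servo remains.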


Xervo has seamless support for multiple infrastructure providers and regions. All balancers in all regions have a synchronized cache so they know what and where all servos reside. If a request hits a balancer that does not have an application instance to serve that request in its region, it will automatically proxy the request to a balancer that does have instances. 

In the above example, a request hits an AWS us-east-1a load balancer, but the project does not have any servos in that region. The balancer detects this and automatically proxies the request to Joyent eu-ams-1, where an instance does exist. This is entirely transparent to the user making the request.

For global applications that are scaled to multiple regions, Xervo recommends making use of a managed DNS service like Dyn or Route53, so that users are always sent to the load balancer closest to them.
