Introduction

In today’s world, server infrastructure machines sit in on-premise data centers, private data centers, or public cloud data centers. These machines are physical bare-metal machines, virtual machines (VMs) running on hypervisors, or small containers such as Docker containers on top of physical or virtual machines. Physically, the machines might be in a local lab. In a private data center scenario, your own procured hosts are placed in a shared physical space in a third-party data center and accessed remotely. In public data centers such as AWS, GCP, Azure, and OCI, the machines are either reserved or created on demand for highly scalable needs, and are also accessed remotely. Each of these has its own advantages with respect to scalability, security, reliability, management, and the costs associated with the infrastructure.

Product development teams might need many servers during the SDLC process. Suppose a team has chosen a private data center with its own physical machines running Xen Server. The challenge then is how to manage the VM lifecycle (provisioning and termination) the way cloud environments do, with lean and agile processes.

 

This document aims to provide the basic infrastructure model, architecture, minimum APIs, and sample code snippets so that one can easily build dynamic infrastructure environments.

 

Benefits

First, let us understand the typical sequence of steps followed in this server infrastructure process. You may recall it as below.

    1. Procurement of the new machines by IT
    2. Host virtualization – Install Xen Server and Create VM Templates by IT
    3. Requests for static VMs by dev and test teams via tickets (say, JIRA) to IT
    4. Maintain the received VM IPs in a database, in a static file, hardcoded in configuration files, or in CI tools such as Jenkins config.xml
    5. Monitor the VMs with health checks to make sure they are healthy before using them to install the products
    6. Cleanup or uninstall before or after Server installations
    7. Windows might need some registry cleanup before installing the product
    8. Fixed allocation of VMs to an area or a team or dedicated to an engineer might have been done

Now, how can you make this process more lean and agile? Can you eliminate most of the above steps with simple automation?

Yes. In our environment, we had more than 1000 VMs and mainly tried to achieve the below.

“Disposable VMs on-demand as required during tests execution. Solve Windows cleanup issues with regular test cycles.”

As you see below, with the dynamic VM server manager API service, 6 out of the 8 steps can be eliminated, and the entire product team gets a view of seemingly unlimited infrastructure. Only the first 2 steps (procurement and host virtualization) are still needed. In effect, this saves both time and cost!

Typical flow to get infrastructure


Dynamic Infrastructure model

The picture below shows our proposed infrastructure for a typical server product environment: 80% Docker containers, 15% dynamic VMs, and 5% static pooled VMs for special cases. This distribution can be adjusted based on what works best for your environment.

Infrastructure model


From here on, we will discuss the dynamic VM server manager part in more detail.

 

Dynamic Server Manager architecture

The dynamic VM server manager is a simple API service that exposes the REST APIs below, which can be used anywhere in an automated process. As the tech stack shows, Python 3 and the Python-based Xen APIs are used for the actual creation of VMs on the XenServer host. Flask is used to create the REST service layer. The OS can be any of your product's supported platforms, such as windows2016, centos7, centos8, debian10, ubuntu18, oel8, or suse15.

Dynamic VMs server manager architecture


Save the history of the VMs so that usage and the time taken to provision or terminate can be tracked and analyzed further. To store the JSON documents, Couchbase Enterprise Server, a NoSQL document database, can be used.

 

Simple REST APIs

 

Each entry below lists the method and URI(s), followed by the purpose.

GET /showall
    Purpose: Lists all VMs in JSON format.

GET /getavailablecount/<os>
    Purpose: Gets the count of available VMs for the given <os>.

GET /getservers/<name>?os=<os>
GET /getservers/<name>?os=<os>&count=<count>
GET /getservers/<name>?os=<os>&count=<count>&cpus=<cpus>&mem=<memsize>
GET /getservers/<name>?os=<os>&expiresin=<minutes>
    Purpose: Provisions the given <count> of VMs of <os>. The cpus count and mem size can also be specified. The expiresin parameter, in minutes, sets the expiry (auto termination) of the VMs.

GET /releaseservers/<name>?os=<os>
GET /releaseservers/<name>?os=<os>&count=<count>
    Purpose: Terminates the given <count> of VMs of <os>.

Prerequisites for Xen Hosts targeted for dynamic VMs

  • Identify the Xen Hosts targeted for dynamic VMs
  • Copy/create the VM templates
  • Move these Xen Hosts to a separate VLAN/subnet (work with IT) so that IPs can be recycled

Implementation

At a high level:

  1. Create a function for each REST API
  2. Call a common service to perform the different REST actions
  3. Understand Xen session creation, getting the records, cloning a VM from a template, attaching the right disk, and waiting for the VM creation and the IP to be received; also deletion of VMs and deletion of disks
  4. Start a thread for automatic expiry of VMs
  5. Read the common configuration from a file, such as one in .ini format
  6. Understand working with the Couchbase database and saving documents
  7. Test all APIs with the required OSes and parameters
  8. Fix issues, if any
  9. Perform a POC with a few Xen Hosts
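Step 4 above, automatic expiry, can be sketched with the standard library's `threading.Timer`. This is a minimal sketch rather than the service's actual code: `schedule_expiry` and the `release` callback are hypothetical names, and a real service would pass in its own VM termination routine.

```python
import threading


def schedule_expiry(vm_name, expires_in_minutes, release):
    """Start a daemon timer that calls release(vm_name) after the given minutes.

    `release` is a hypothetical callback standing in for the service's
    own VM termination routine.
    """
    timer = threading.Timer(expires_in_minutes * 60, release, args=(vm_name,))
    timer.daemon = True  # do not keep the service alive just for pending timers
    timer.start()
    return timer  # the caller can cancel() it if the VM is released earlier
```

Returning the timer lets an explicit /releaseservers call cancel the pending expiry, so a VM is not terminated twice.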

The code snippets below can help you understand this even better.

APIs creation
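As an illustration of "one function per REST API, all calling a common service", here is a minimal sketch in plain Python. It is an assumption about the shape of the code, not the original implementation: `parse_request`, `perform_service`, and the `handlers` mapping are hypothetical names, and in the real service each Flask route function would pass its URL name and query arguments (for example `request.args`) into something like this.

```python
def parse_request(name, args):
    """Normalize the query parameters of /getservers or /releaseservers.

    `args` is any mapping of query-string values; the defaults used here
    (count=1, no expiry) are illustrative assumptions.
    """
    return {
        "name": name,
        "os": args.get("os", ""),
        "count": int(args.get("count", 1)),
        "cpus": args.get("cpus"),
        "mem": args.get("mem"),
        "expiresin": int(args["expiresin"]) if "expiresin" in args else None,
    }


def perform_service(action, params, handlers):
    """Common entry point: route one REST action to its handler function."""
    if action not in handlers:
        raise ValueError(f"unsupported action: {action}")
    # Provisioning needs to know which OS template to clone.
    if action == "getservers" and not params["os"]:
        raise ValueError("the os parameter is required for getservers")
    return handlers[action](params)
```

Keeping the parsing and dispatch separate from the web framework makes the actions easy to test without starting the REST service.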

 

Creating a Xen session
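A minimal sketch of opening and closing a session with the XenAPI Python bindings follows. The helper names, the `session_factory` parameter (added only so the function can be tried without a real host), and the version and originator strings are assumptions for illustration.

```python
def open_xen_session(host_url, user, password, session_factory=None):
    """Log in to a Xen Host and return the session object.

    session_factory defaults to XenAPI.Session from the XenAPI package.
    """
    if session_factory is None:
        import XenAPI  # Python bindings for the Xen management API
        session_factory = XenAPI.Session
    session = session_factory(host_url)
    # "1.0" (client API version) and "dynvmservice" (originator) are
    # illustrative values, not taken from the original service.
    session.xenapi.login_with_password(user, password, "1.0", "dynvmservice")
    return session


def close_xen_session(session):
    """Log out so the host does not accumulate stale sessions."""
    session.xenapi.session.logout()
```

A caller would typically wrap the work between these two calls in `try`/`finally` so the logout always happens.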

 

List VMs
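Listing the VMs of one host could look like the sketch below. It assumes `session` is a logged-in XenAPI session; the function name and the shape of the returned dictionaries are hypothetical.

```python
def list_vms(session):
    """Return name, UUID and power state of every guest VM on the host."""
    vms = []
    for ref, rec in session.xenapi.VM.get_all_records().items():
        # Skip templates and the control domain (dom0); they are not guests.
        if rec["is_a_template"] or rec["is_control_domain"]:
            continue
        vms.append({
            "ref": ref,
            "name": rec["name_label"],
            "uuid": rec["uuid"],
            "power_state": rec["power_state"],
        })
    return sorted(vms, key=lambda vm: vm["name"])
```

A /showall handler could call this for each configured Xen Host and return the combined list as JSON.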

 

Create VM
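The sketch below shows one way to clone a VM from a template, start it, and wait for its IP, assuming `session` is a logged-in XenAPI session. The function name and the timeout and poll defaults are assumptions, and disk attachment and CPU/memory sizing are left out to keep the example short.

```python
import time


def create_vm(session, template_name, vm_name, timeout=300, poll=5):
    """Clone a template into a new VM, start it, and return (vm_ref, ip)."""
    templates = session.xenapi.VM.get_by_name_label(template_name)
    if not templates:
        raise LookupError(f"template not found: {template_name}")
    vm_ref = session.xenapi.VM.clone(templates[0], vm_name)
    session.xenapi.VM.provision(vm_ref)           # turn the clone into a real VM
    session.xenapi.VM.start(vm_ref, False, True)  # start_paused=False, force=True

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        metrics_ref = session.xenapi.VM.get_guest_metrics(vm_ref)
        # The guest metrics reference stays null until the guest tools report in.
        if metrics_ref != "OpaqueRef:NULL":
            networks = session.xenapi.VM_guest_metrics.get_networks(metrics_ref)
            if "0/ip" in networks:
                return vm_ref, networks["0/ip"]
        time.sleep(poll)
    raise TimeoutError(f"{vm_name} did not report an IP within {timeout}s")
```

On timeout, a real service would probably also delete the half-created VM so that it does not consume host capacity.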

Delete VM
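Deleting a VM together with its disks might be sketched as follows, again assuming a logged-in XenAPI `session`. The function name and return value are hypothetical, and error handling is omitted.

```python
def delete_vm(session, vm_ref):
    """Shut down a VM, destroy its disks, then destroy the VM record.

    Returns the number of disks destroyed.
    """
    if session.xenapi.VM.get_power_state(vm_ref) != "Halted":
        session.xenapi.VM.hard_shutdown(vm_ref)
    # Collect only real disks; CD drives (type "CD") point at shared ISOs.
    vdis = []
    for vbd_ref in session.xenapi.VM.get_VBDs(vm_ref):
        if session.xenapi.VBD.get_type(vbd_ref) == "Disk":
            vdis.append(session.xenapi.VBD.get_VDI(vbd_ref))
    for vdi_ref in vdis:
        session.xenapi.VDI.destroy(vdi_ref)
    session.xenapi.VM.destroy(vm_ref)
    return len(vdis)
```

Destroying the disks explicitly matters here because destroying only the VM record would leave its virtual disks behind on the storage repository.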

 

Historic Usage of VMs

It is better to maintain the history of all the VMs created and terminated, along with other useful data. Here is an example of the JSON document stored in Couchbase, a free NoSQL database server. Insert a new document, using the Xen opaque reference UUID as the key, whenever a new VM is provisioned, and update the same document whenever the VM is terminated. Track the live usage time of the VM and also which user performed each provisioning and termination.
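The original sample document is not reproduced here, so the following is only a sketch of what such a history record could contain. Every field name is an assumption, as are the two helper functions. With the Couchbase Python SDK, the resulting dictionary could be written with the collection's `upsert(key, document)` call, using the VM's UUID as the key.

```python
import time


def new_history_doc(vm_uuid, name, os_name, user, xen_host, now=None):
    """Build the document stored when a VM is provisioned (key: vm_uuid)."""
    return {
        "vm_uuid": vm_uuid,
        "name": name,
        "os": os_name,
        "xen_host": xen_host,
        "created_by": user,
        "created_time": now if now is not None else time.time(),
        "state": "available",
    }


def mark_terminated(doc, user, now=None):
    """Return an updated copy recording who terminated the VM and how long it lived."""
    deleted_time = now if now is not None else time.time()
    updated = dict(doc)  # leave the caller's document untouched
    updated["deleted_by"] = user
    updated["deleted_time"] = deleted_time
    updated["live_duration_secs"] = deleted_time - doc["created_time"]
    updated["state"] = "deleted"
    return updated
```

Storing the duration at termination time keeps later usage queries simple, since they do not have to recompute it from two timestamps.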

 

 

Configuration

The Dynamic VM server manager service configuration, such as the Couchbase server, Xen Host servers, template details, default expiry, and network timeout values, can be maintained in a simple .ini format. When a new Xen Host is received, just add it as a separate section. The config is loaded dynamically, without restarting the Dynamic VM SM service.

Sample config file: .dynvmservice.ini
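The actual file contents are not shown here, so the layout below is a hypothetical example of the "one section per Xen Host" idea, read with the standard library's `configparser`. All section names, keys, host names, and values are assumptions.

```python
import configparser

# Hypothetical layout: shared settings plus one section per Xen Host.
SAMPLE_CONFIG = """
[couchbase]
host = cb.example.com
bucket = vm_history

[common]
default_expiry_minutes = 1440
network_timeout_secs = 300

[xenhost1]
url = http://xenhost1.example.com
user = root
password = CHANGE_ME
storage_name = Local storage
template.centos7 = tmpl-centos7

[xenhost2]
url = http://xenhost2.example.com
user = root
password = CHANGE_ME
storage_name = Local storage 2
template.centos7 = tmpl-centos7
template.windows2016 = tmpl-win2016
"""


def load_config(text):
    """Parse the .ini text; re-parsing on each request picks up new hosts."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    return cfg


def xen_hosts(cfg):
    """Names of all Xen Host sections, in file order."""
    return [s for s in cfg.sections() if s.startswith("xenhost")]


def hosts_with_template(cfg, os_name):
    """Xen Hosts that have a template configured for the given OS."""
    return [h for h in xen_hosts(cfg) if cfg.has_option(h, f"template.{os_name}")]
```

Re-reading the file for every request is one simple way to get the "no restart" behavior. Listing templates per host also covers the case where a template exists only on some hosts.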

 

Examples

Sample REST API calls using curl
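The original curl commands are not reproduced here. As a Python equivalent, the sketch below builds the request URLs for the REST APIs listed earlier. The `api_url` helper, the base URL, and the port are assumptions.

```python
from urllib.parse import quote, urlencode


def api_url(base, action, name=None, **params):
    """Build a request URL such as <base>/getservers/<name>?os=<os>&count=<count>."""
    url = f"{base.rstrip('/')}/{action}"
    if name is not None:
        url += "/" + quote(str(name), safe="")
    # Drop parameters that were not supplied.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{url}?{query}" if query else url


BASE = "http://dynvm.example.com:5000"  # hypothetical service address

show_all = api_url(BASE, "showall")
available = api_url(BASE, "getavailablecount", "centos7")
provision = api_url(BASE, "getservers", "job42", os="centos7", count=2, expiresin=120)
release = api_url(BASE, "releaseservers", "job42", os="centos7", count=2)
```

Each URL can then be fetched with a plain GET, for example through `urllib.request.urlopen(provision)` or by passing the same string to curl.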

 

Jenkins jobs with single VM

 

Jenkins jobs with multiple VMs needed

 

Key considerations

Here are a few observations noted during the process. It is better to handle them to make the service more reliable.

  1. Handle storage names/IDs that differ among Xen Hosts.
    • Keep track of the VM storage device name in the service input config file.
  2. Handle templates that are available on only some Xen Hosts while provisioning.
  3. When network IPs are not available, the Xen APIs return the default 169.254.xx.yy address on Windows. Wait until a non-169 address is received or a timeout occurs.
  4. Releasing servers should ignore the OS template, as some of the templates might not be present on all Xen Hosts.
  5. Support provisioning on a specific, given Xen Host reference.
  6. Handle the case where no IPs are available or some of the created VMs do not get network IPs.
    • Plan to have a different subnet for the Xen Hosts targeted for dynamic VMs. The default network DHCP IP lease expiry might be in days (say 7 days), during which no new IPs are provided.
  7. Make the capacity check count in-progress requests as reserved IPs, so that it shows a lower count than the full capacity at that moment. Otherwise, both in-progress and incoming requests might have issues. One or two VMs' worth of capacity (CPUs/memory/disk size) can be kept as a buffer while creating VMs and checking for parallel requests.
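Observation 3 above, waiting past the 169.254.xx.yy placeholder address, can be sketched with the standard `ipaddress` module. The function name, the `get_ip` callback, and the default timings are assumptions. In the service, `get_ip` would wrap the Xen API call that reads the guest's reported IP.

```python
import ipaddress
import time


def wait_for_routable_ip(get_ip, timeout=300, poll=5):
    """Poll get_ip() until it returns a non link-local address, or time out.

    get_ip is a hypothetical callable returning the VM's current IP as a
    string, or None while no address has been reported yet.
    """
    deadline = time.monotonic() + timeout
    while True:
        ip = get_ip()
        # 169.254.0.0/16 is the link-local range used when DHCP gave no lease.
        if ip and not ipaddress.ip_address(ip).is_link_local:
            return ip
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no routable IP within {timeout}s (last seen: {ip})")
        time.sleep(poll)
```

Raising on timeout lets the caller delete the VM and retry on another host instead of handing out an unusable address.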

References

Some of the key references that helped while creating the dynamic VM server manager service:

  1. https://www.couchbase.com/downloads
  2. https://wiki.xenproject.org/wiki/XAPI_Command_Line_Interface
  3. https://xapi-project.github.io/xen-api/
  4. https://docs.citrix.com/en-us/citrix-hypervisor/command-line-interface.html
  5. https://github.com/xapi-project/xen-api-sdk/tree/master/python/samples
  6. https://www.citrix.com/community/citrix-developer/citrix-hypervisor-developer/citrix-hypervisor-developing-products/citrix-hypervisor-staticip.html
  7. https://docs.ansible.com/ansible/latest/modules/xenserver_guest_module.html
  8. https://github.com/terra-farm/terraform-provider-xenserver
  9. https://github.com/xapi-project/xen-api/blob/master/scripts/examples/python/renameif.py
  10. https://xen-orchestra.com/forum/topic/191/single-device-not-reporting-ip-on-dashboard/14
  11. https://xen-orchestra.com/blog/xen-orchestra-from-the-cli/
  12. https://support.citrix.com/article/CTX235403

Hope you had a good reading time!

Disclaimer: Please view this as a reference if you are dealing with Xen Hosts. Feel free to share if you learned something new that can help us. Your positive feedback is appreciated!


Thanks to Raju Suravarjjala, Ritam Sharma, Wayne Siu, Tom Thrush, James Lee for their help during the process.

Author

Posted by Jagadesh Munta, Principal Software Engineer, Couchbase

Jagadesh Munta is a Principal Software Engineer at Couchbase Inc., USA. Prior to this, he spent a combined 19 years at Sun Microsystems and Oracle. Jagadesh holds a Master's in Software Engineering from San Jose State University, USA, and a B.Tech. in Computer Science and Engineering from JNTU, India. He is the author of "Software Quality and Java Automation Engineer Survival Guide", written to help software developers and quality automation engineers.
