Category Archives: Virtualization


Unifi Video on CentOS Via Docker


I recently got a UniFi Video Camera Dome to evaluate. It can be used standalone, but to get the most out of the camera it's a good idea to run the UniFi Video server, which provides management for the cameras and recordings. Unfortunately, my home NAS is EL-based (CentOS 6), and Ubiquiti only provides Debian and Ubuntu .deb packages. I found some instructions online involving alien to convert the .deb to an .rpm, and others involving extracting the .deb, installing the Java files, rewriting the init script, and so on. Instead, I decided to use Docker to create an image with the necessary Ubuntu dependencies. This is simpler than building a VM, and it gives the container direct access to my NAS filesystem.

Prerequisites:

  • Any OS with a Docker daemon. Setting up your system for Docker is outside the scope of this post, but there are plenty of easy-to-find instructions for a wide array of operating systems. Oh, one other thing: the kernel needs to support SO_REUSEPORT, due to a dependency the UniFi software has. For Linux this means kernel 3.9+, which you'll probably already have if you're running Docker (there's a one-line check just after this list). For CentOS 6, see https://wiki.centos.org/Cloud/Docker. If you need a newer kernel, ELRepo is always a good source.
  • A filesystem capable of holding at least a few tens of gigs for videos. For my example I use a fictitious path “/local/video/data”.
  • The URL to the UniFi Video .deb package for Ubuntu 14.04. This can be found on the ubnt.com support site.
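
As a quick sanity check of the kernel requirement in the first bullet, one command on the Docker host is enough. It doesn't prove SO_REUSEPORT actually works, only that the kernel is new enough to have it:

uname -r # should report 3.9 or newer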

First, we pull the prerequisite Ubuntu image, download the .deb to the video path so we’ll have it in the container, and launch the Ubuntu container. When we launch the container we’re going to do so in interactive mode, linking the host’s network to the container, and mapping the host’s selected video filesystem to the container’s unifi-video path:

docker pull ubuntu:14.04

wget -O /local/video/data/unifi-video.deb http://dl.ubnt.com/firmwares/unifi-video/3.1.2/unifi-video_3.1.2-Ubuntu14.04_amd64.deb

docker run -t -i --net=host -v /local/video/data:/var/lib/unifi-video/videos ubuntu:14.04

At this point we should be at a prompt inside the container. We're simply going to install the .deb and its dependencies:

apt-get update

dpkg -i /var/lib/unifi-video/videos/unifi-video.deb # dependency errors here are OK

apt-get install -f # pull in and fix the missing dependencies

exit

Now we should be back out of the container. We're going to commit our changes as a new image, then run an instance of that image, starting the UniFi Video service and tailing /dev/null to keep the container running.

docker ps -a # find container id

docker commit <container id> unifi-video:1

docker run -d --privileged=true --net=host -v /local/video/data:/var/lib/unifi-video/videos unifi-video:1 /bin/bash -c "/etc/init.d/unifi-video start; tail -f /dev/null"
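
Before hunting for the web interface, it's worth a quick check that the service actually came up inside the new container; standard Docker commands are enough for that:

docker ps # the unifi-video:1 container should show as Up

docker logs <new container id> # look for the init script's startup messages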


That's it. At this point you should be able to go to http://<server ip>:7080 (or https on port 7443) and see the UniFi Video interface.

While this is fairly easy and straightforward, it's also a fairly naive way to deploy software in a container. Normally you'd want to include just the necessary Java components and launch the Java process directly, rather than dragging along a whole pseudo-OS, but it gets the job done quickly.
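
For something more repeatable than the interactive install-and-commit above, the same steps can be baked into an image with a Dockerfile. The following is only a rough sketch of that idea rather than something I've tested; it assumes the .deb has been downloaded next to the Dockerfile and installs without prompting:

cat > Dockerfile <<'EOF'
# Sketch: the same manual steps as above, captured in an image build
FROM ubuntu:14.04
ENV DEBIAN_FRONTEND noninteractive
COPY unifi-video.deb /tmp/unifi-video.deb
# dpkg will complain about missing dependencies; apt-get install -f pulls them in
RUN apt-get update && (dpkg -i /tmp/unifi-video.deb || true) && apt-get install -f -y
CMD ["/bin/bash", "-c", "/etc/init.d/unifi-video start && tail -f /dev/null"]
EOF

docker build -t unifi-video:1 .

docker run -d --privileged=true --net=host -v /local/video/data:/var/lib/unifi-video/videos unifi-video:1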


CloudStack vs OpenStack! Who will win?


I’ve been spending a lot of time working on cloud solutions recently, and have run across this question countless times.  Some are worried about betting on the wrong horse, others already have a stake in it or are linked somehow to those who do, and some simply want to know which is best.  After having gotten to know both solutions, I think this is a short-sighted question, but I’d like to talk a little bit about them before explaining why. If you feel you already know the two well, then skip these sections for the tl;dr version.

CloudStack is the older of the two, and is undoubtedly more mature as of this writing. It is a full-stack solution: the download provides management, storage, a host agent, EC2 API compatibility, and everything else you need to manage a cloud infrastructure. From a development standpoint, there is a CloudStack API for remote control, as well as the ability to develop plugins that modify or add functionality to CloudStack.

It supports KVM, Xen, VMware, and Oracle VM. It has support for bridging, VLAN management, and direct network management, as well as recently added support for Nicira STT and plain GRE isolation for networking (both through Open vSwitch bridging). For VM disk storage, it allows NFS, clustered LVM, RADOS Block Device (Ceph), local storage, or any other storage that can be mounted as a 'shared filesystem', such as GFS or OCFS2. For backups, ISO images, VM templates, snapshots, and the like, there is 'secondary storage', which is primarily NFS or OpenStack's Swift object store.

It recently underwent a licensing change and is now under the Apache Software Foundation. With that change, some components now need to be downloaded separately, such as VMware plugins and other pieces that aren't license-compatible. The developers are currently working through that migration, and in the future you may see 'oss' and 'non-oss' builds for download, or be directed in the documentation to fetch a specific package to enable a particular feature. It of course has the backing of Citrix, and is used by everyone from large companies like SoftLayer and GoDaddy down to small businesses.

OpenStack started out as a project whose main contributors were NASA and Rackspace. It is younger, but has recently had a lot of news buzz, and is even referred to by some as 'the media darling', if an open source project can be described as such. It takes a more modular approach: it's a group of services (storage, compute, networking, and a web dashboard). As such, installation and management are a bit more complicated, since each service is set up and configured independently, but that's also a potential bonus in that each service can be swapped out independently for alternative solutions. Each service also has its own API and could be deployed on its own or incorporated into a separate project.

It supports xapi- and libvirt-based virtualization solutions, meaning Xen, KVM, LXC, and VMware to the extent that libvirt is capable. The current networking service supports Linux bridging and VLAN isolation, and a new service called Quantum is in the works that will support SDN-based Open vSwitch isolation such as Nicira. For VM volume storage it relies largely on iSCSI, and the storage service can manage many iSCSI targets, along with Ceph, local, and NFS support for VM volumes. There is also an object storage service called Swift, which can be used for generic data storage and is even run and sold as a separate service by some.

OpenStack is largely backed and used by Rackspace. It recently lost NASA's involvement, but has more than made up for that in good publicity. It has yet to gain a wide install base as of this writing, though there are plenty of contributors and people experimenting with it, if the mailing lists and blogs are any indication.

So back to the original question: who is going to win? My prediction is that neither will "win"; this simply isn't a one-winner game. Who won between Xen and VMware? Who won between Red Hat and Ubuntu? For all of the posturing that some might do, and the holy-war attitudes that grow up around technologies, in the end there is usually room for more than one product. CloudStack seems to be maintaining the lead at the moment; tomorrow maybe OpenStack will pull ahead, and there may even end up being a 70/30 split, but the companies involved and the scale of the market indicate that there will be active development and support for both platforms. I've seen evidence of this in just the past few months, with companies scrambling to provide support for their products on both platforms. Companies like Brocade, Nicira/VMware, and NetApp are actively supporting and developing code for both in hopes of driving cloud customers to their products.

Even if it were a one-winner contest, depending on your size you may be able to influence the race. If you see a tug-of-war in progress, with ten men on either side, and you've got a team of five behind you, do you need to stop and consider which team to support in order to back the winner? Some of the people I've heard asking this question should really be asking "which technology do I want to help win?" instead of assuming they're a passive bystander in the equation.

In all of the back and forth, and daily news about who is building what with the backing of which companies, one thing is certain. Cloud infrastructure as a service is going to be a hit.


VM benchmarking, trickier than one might expect


I've always had a focus on storage throughout my career. I've managed large enterprise vSANs with FC switches and commercial NAS filers, deployed iSCSI over Ethernet, and managed ESX with both FC and NFS backends. I've been entrusted to build very large storage servers, up to 32U, from Linux and off-the-shelf components. Needless to say, I feel comfortable claiming that I know a little more than the average systems guy about storage, and particularly about how Linux handles I/O. So when I turned my attention to benchmarking virtual machine disk performance, I found some interesting behaviors that anyone seeking to measure such things should probably be aware of, at least in order to interpret results, if they can't otherwise be compensated for.

One of the primary issues is how the Linux caching mechanisms can throw a wrench into things if you don't think through what you're doing. You need to be aware of which caches are in effect during each test. For example, it's common to test with datasets larger than the system's memory in order to stretch the system beyond its ability to cache. However, consider a 4GB virtual guest on a physical server with 32GB of RAM. Guests are usually run with at least write-through caching from the host's perspective (speaking in general terms; this can obviously be controlled by the end user on at least some virtualization platforms). So while the experimenter might think that an 8GB dataset is sufficient on the guest, or that issuing a drop_caches request between tests on the guest will suffice, that dataset is likely to be kept in its entirety in the host's read cache as it passes through to the underlying storage, artificially boosting the results. Similarly, performing a write test on the guest and comparing it to the same write test on the host is almost certainly going to give the host an unfair advantage unless the experimenter accounts for the much larger pool of dirty memory available on the host, which is usually specified as a percentage of physical memory.

On top of that, there's the complexity of testing some number of virtual machines and summarizing how they all perform simultaneously on a physical host. There are some pretty standard methods for doing this, such as putting some sort of load on each guest and then benchmarking one while the others run their dummy loads. But again, one must be careful, particularly with the dummy loads, that they're not just looping over tests small enough to fit in cache, unless, of course, that's the real-world behavior of the application, which brings me to my point.
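
One way to keep a dummy load honest is to force it onto a dataset larger than the guest can cache, or to bypass the cache outright. Something along the lines of the following fio invocation would do it; the filename and sizes are made up for the example:

fio --name=dummyload --filename=/data/dummyload.bin --rw=randrw --bs=4k --size=16g --direct=1 --time_based --runtime=600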

It's kind of a complex beast, trying to get meaningful results, and especially to share them with others who may have different expectations. One has to determine a goal in disk benchmarking, and it's usually one of two things: testing raw disk performance, or attempting to measure the real-world performance of an application or a given I/O pattern. The former involves disabling any and all caches, while the latter strives to use the caches as they normally would be used. The challenge in all of this, as mentioned, is that some people will value one set of numbers while others will value the other. Raw disk performance will tell you a lot about just how good the setup is, for example whether one should go with that RAID 6 setup or do RAID 50 instead. On the other hand, does it really matter how well the disks perform without caches? Don't we want to know how it's actually going to run?
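
To make the two goals concrete, here is a rough sketch of each flavor of write test using plain dd (the target paths are placeholders): the first bypasses the page cache with O_DIRECT, while the second uses the cache but at least forces the data to disk before reporting a number:

dd if=/dev/zero of=/mnt/test/rawtest bs=1M count=4096 oflag=direct # raw-ish, uncached write

dd if=/dev/zero of=/mnt/test/cachedtest bs=1M count=4096 conv=fsync # cached write, flushed before dd exits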

No matter how it's done, the most important thing of all is to frame your data properly: "This was the goal or purpose, these are the tests, this is the setup, here are the results." I've been running some tests that I'll share shortly, but I wanted to get some of these considerations down first, as I've rarely seen anyone address them in published benchmarks, which frankly has made much of the data I've seen on VM performance largely useless.

Finally, lest this post be all rambling and not provide anything of concrete usefulness to individuals out there, the following are some mechanisms for controlling Linux caching.

Flush caches (page cache, dentries, inodes): echo 3 > /proc/sys/vm/drop_caches

The above won't do anything for dirty memory, which can be flushed with a sync. However, that won't have much bearing on the write test you run afterward; for that you'll need to know a little more about how dirty memory works. It would be naive to compare a system with 32GB of memory, 3.2GB of which can absorb pending writes, against a 4GB system that only has 400MB with which to cache writes.
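
Put together, a typical between-runs reset looks something like this (run on the host and in the guest, as appropriate); the meminfo check just confirms nothing is still waiting to be written back:

sync

echo 3 > /proc/sys/vm/drop_caches

grep -E 'Dirty|Writeback' /proc/meminfo # both should be at or near zero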

In particular, two values are of importance: /proc/sys/vm/dirty_ratio and /proc/sys/vm/dirty_background_ratio. Both are specified as percentages. dirty_background_ratio is how large dirty memory can get before pdflush kicks in and starts writing it out to disk. dirty_ratio is always higher (the code actually rewrites dirty_background_ratio to half of dirty_ratio if it's set higher), and is the point where applications skip dirty memory and are forced to write directly to disk. Usually hitting it means that pdflush isn't keeping up with your writes and the system is potentially in trouble, but it could also just mean you've set it very low because you don't want to cache writes. You might want that, for example, if you know you'll be doing monster writes for extended periods; there's no sense in bloating up a huge amount of dirty memory only to have the processes forced into synchronous writes AND contend with pdflush threads trying to do writeback. On the flip side, increasing these values can give you a nice cache to absorb large, intermittent writes.
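
Both knobs can be read and changed on the fly; the numbers below are only an example of tightening the write cache, not a recommendation:

cat /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio # commonly 10 and 20 by default

echo 5 > /proc/sys/vm/dirty_background_ratio # start background writeback sooner

echo 10 > /proc/sys/vm/dirty_ratio # force writers to go synchronous sooner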

Both of these have time-based counterparts, dirty_expire_centisecs and dirty_writeback_centisecs, so that pdflush will kick in and start doing writeback based on age, regardless of how much data is there. For example, it might do writeback at 500MB OR when data has sat in dirty memory for longer than 15 seconds. Newer kernels also allow specifying an actual byte count rather than a percentage, via dirty_bytes and dirty_background_bytes.
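
Again purely as an illustration (the byte value is arbitrary, and note that writing one of the *_bytes files zeroes out its *_ratio counterpart):

cat /proc/sys/vm/dirty_expire_centisecs # age at which dirty data becomes eligible for writeback, commonly 3000 (30 seconds)

cat /proc/sys/vm/dirty_writeback_centisecs # how often the writeback threads wake up, commonly 500 (5 seconds)

echo 536870912 > /proc/sys/vm/dirty_bytes # cap dirty memory at 512MB instead of a percentage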

There are quite a few more things I could share, but I'll leave off with just one more: /proc/sys/vm/vfs_cache_pressure. This is usually set to 100 by default. Increasing it makes the system tend to clean up/minimize the directory and inode read caches (the stuff that's cleaned up by drop_caches), while decreasing it makes the system hoard more of them.
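
One last quick example, with an arbitrary value:

cat /proc/sys/vm/vfs_cache_pressure # typically 100

echo 200 > /proc/sys/vm/vfs_cache_pressure # reclaim dentry/inode caches more aggressively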

Stay tuned for some benchmarks of KVM virtio and IDE with no cache, writethrough, and writeback, compared to VMware ESX paravirtualized disks.