Tag Archive for cluster

Scale-Out Doesn’t Just Mean Applications

Scale Out

A couple of months ago I wrote a post entitled Scale-Out. Distributed. Whatever the Name, it’s the Future of Computing.

Taking the concept a step further, I recently started thinking about other elements in IT which are moving in that direction; not just applications and storage, but underlying infrastructure and management elements too.

Then it dawned on me that this really is not a new thing… we’ve been taking this approach for years! Technologies like VMware vSphere, have enabled us to become trusting, almost presumptuous, that we can add resources as we need them; increasing the shared pool transparently and enabling us to continue to service requirements, whilst eliminating downtime. (You can even use them to scale up on-the-fly if you really have to!)

The current breed of infrastructure engineers and startups have grown up in this era and the great thing is that this has now become part of their DNA! Typically, no longer are solutions designed from scratch to be scale-up in nature; hitting some artificial limit in capacity or having to scale specific elements of a solution to avoid nasty bottlenecks.

Instead, infrastructure is being designed to scale-out natively; distributed architectures, balancing workloads and metadata evenly across platforms. This has the added benefit, of course, of making them more resilient to failure of individual components.Distributed Systems

Backup isn’t Sexy, but it’s Necessary

One great example of this new architecture paradigm (drink!), is Rubrik, a startup in the backup space who we met at Tech Field Day 12. Their home-grown distributed file system, distributed metadata, built in off-site replication and global namespace, provide a massively scalable and resilient backup system.

All of the roles from a traditional backup solution (such as backup proxies/media servers/metadata servers, etc) are now rolled into a single, scale-out platform. As I seem to find myself saying more and more often these days, KISS personified!kiss - Keep it simple stupid EFS

With shrinking IT teams, I commonly find that companies are willing to trade budget for time savings. Utilising a simple, policy-driven management interface and enabling off-site replication to be done over-the-wire, has a lot of benefits to operational time!

As an added bonus, it can even replicate out to S3, Blob and NFS targets, to give even more options for off-site replication. Of course, a big fat pipe to the internet will cost you more each month; though you’re probably investing in that anyway, to meet your employee’s peak lunchtime demand for facebook and youtube! 🙂

Much like any complex machine, under the hood, Rubrik is pretty impressive. There is a masterless cluster management solution, multi-tier flash and disk for performance, and a clever redirect-on-write snapshot chain algorithm, which minimises capacity utilisation whilst providing very granular restores.

The key thing here, though, is we don’t really care; we are a consumer society who just wants things to work, as we have more exciting things than backup to worry about!

rubrik

TLDR;

We have enough complexity in IT these days without having to worry about backup. I would say that the simple to manage, scale-out solution from Rubrik is certainly worth considering as part of any PoC or RFP! 🙂

Further Info

You can catch the full Rubrik session at the link below:
Rubrik Presents at Tech Field Day 12

Further Reading

Some of the other TFD delegates had their own takes on the presentation we saw. Check them out here:

Disclaimer: My flights, accommodation, meals, etc at Tech Field Day 12 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors products or services.

Assigning vCenter Permissions and Roles for DRS Affinity Rules

Today I was looking at a permissions element for a solution. The requirement was to provide a customer with sufficient permissions to be able to configure host and virtual machine affinity / anti-affinity groups in the vCentre console themselves, without providing any more permissions than absolutely necessary.

After spending some time trawling through vCentre roles and permissions, I couldn’t immediately find the appropriate setting; certainly nothing specifically relating to DRS permissions. A bit of Googling and Twittering also yielded nothing concrete. I finally found that the key permission required to be able to allow users to create and modify affinity groups is the “Host \ Inventory \ Modify Cluster” privilege. Unfortunately the use of this permission is a bit like using a sledgehammer to crack a nut!

roles

By providing the Modify Cluster permission, this will also provide sufficient permissions to be able to enable, Configure and disable HA, modify EVC settings, and change pretty much anything you like within DRS. all of these settings are relatively safe to modify without risking uptime (though they do present some risk in the event of unexpected downtime); what is a far more concerning is that these permissions and allow you to enable, configure and disable DPM! It doesn’t take a great deal of imagination to come up with scenario where for example a junior administrator accidentally enables DPM on your cluster, a large percentage of your estate unexpectedly shuts down overnight without the appropriate config to boot back up, and all hell breaks loose at 9am!

The next question then becomes, how do you ensure that this scenario is at least partly mitigated? Well it turns out that DPM can be controlled via vCenter Scheduled Tasks. Based on that, the potential workaround for this solution is to enable the Modify Cluster privilege for your users in question, then set a scheduled task to auto-disable DPM on a regular basis (such as hourly). This should at least minimise any risk, without necessarily eradicating it. Not ideal, but it would work. I’m not convinced as to whether this would be such a great idea for use on a critical production system. Certainly a bit of key training before letting anyone loose in vCenter, even with “limited” permissions, is always a good idea!

I have tested this in my homelab on vSphere 5.5 and it seems to work pretty well. I don’t have vSphere 6 set up in my homelab at the moment, so can’t confirm if the same configuration options are available, however it seems likely. I’ll test this again once I have upgraded my lab.

It would be great to see VMware provide more granular permissions in this area, as even basic affinity rules such as VM-VM anti-affinity are absolutely critical in many application solutions to ensure resilience and availability of services such as Active Directory, Exchange, web services, etc. To allow VM administrators achieve this, it should not be necessary to start handing out sledgehammers to all and sundry! 🙂

If anyone has any other suggested solutions or workarounds to this, I would be very interested to hear them? Fire me a message via Twitter, and I will happily update this post with any other suggested alternatives. Unfortunately due to inundation with spam, I removed the ability to post comments from my site back in 2014. sigh

 

%d bloggers like this: