Archive for 24th November 2015

Where and why is my data growing?…

I’ve written recently about issues of data gravity and data inertia, and about how important analytics are to managing your data “stockpile”, but one thing I haven’t gone into is the constant challenge of actually understanding your data composition, i.e. what the hell am I actually storing?!

Looking back to my days as a Windows admin maintaining what for the time were some massive, multi-terabyte (ooer – it was 10 years ago to be fair) filers and shared document storage systems, we had little to tell us what the DNA of those file shares was: how much of it was documents and other business-related content, and how much of it was actually people storing their entire MP3 collections and “family photos” on their work shared drives (yes, 100% true!).

Back then our only method of combating these issues was to run TreeSize to see who was using the most space, then do Windows searches for specific file types and manually clear out the crud; an unenviable task which turned up a few surprising finds I won’t go into just now (ooer for the second time)! The problem was that we just didn’t know what we had!
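For anyone still doing this by hand, the "search by file type" chore can at least be scripted. The following is a minimal sketch only, assuming you have read access to the share as a mounted path; the function names are my own and it is no substitute for a proper analytics tool.

```python
import os
from collections import Counter


def bytes_by_extension(root):
    """Walk `root` and total the file sizes per lower-cased extension."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Files without an extension are grouped under "<none>".
            ext = os.path.splitext(name)[1].lower() or "<none>"
            totals[ext] += size
    return totals


def report(root, top=10):
    """Return the `top` extensions by total size, largest first."""
    return bytes_by_extension(root).most_common(top)
```

Calling `report` on a share path returns a list of (extension, bytes) pairs. Note that this walks every file on every run, which is exactly the cost that becomes painful at scale.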

Ten years later I have spoken to customers who are consuming data at very significant rates, but don’t have a grip on where it’s all going…

With that in mind, I was really interested in what the chaps at Qumulo had come up with when they presented at SFD8 recently. As they said at the time, the management of storage is getting easier, but the management of data is getting very much harder! Their primary vision is therefore quite succinctly described as “Build visible data and make storage invisible”.

Their “Data Aware” scale-out NAS solution is based around providing near-realtime analytics on the metadata, and was designed to meet the requirements of the 600 companies and individuals they interviewed before they even came up with their idea!

The product is designed to be software only and subscription-based, though they also provide scale-out physical 1U / 4U appliances. I guess the main concept there is “have it your way”; there are still plenty of customers out there who want to buy a software solution which is pre-qualified and supported on specific hardware (which sounds like an oxymoron but each to their own I say)! Most of Qumulo’s customers today actually buy the appliances.

The coolest thing about their solution is definitely their unique file system (QSFS – Qumulo Scalable File System). It uses a very clever, proprietary method to track changes within the filesystem based on the aggregate of child attributes in the tree (see their SFD8 presentation for more info). As a result, you don’t necessarily need to walk the entire tree to get an answer to a query (it should be noted that the query would need to be one specifically catered for by Qumulo though), and the system can present statistics based on those attributes in near-realtime.
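To illustrate the general idea of aggregating child attributes (and this is only my own toy sketch, not Qumulo's proprietary implementation), imagine every directory keeping running totals for everything beneath it. Each change is pushed up to the ancestors at write time, so a question like "how big is this subtree?" is answered from a single node rather than by walking the tree. The class and attribute names here are hypothetical.

```python
class Node:
    """A file or directory that caches aggregate totals for its subtree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.total_bytes = 0  # sum of all file sizes at or below this node
        self.total_files = 0  # count of all files at or below this node

    def child(self, name):
        """Return the named child directory, creating it if needed."""
        if name not in self.children:
            self.children[name] = Node(name, parent=self)
        return self.children[name]

    def add_file(self, name, size):
        """Add a new file here and push the change up to every ancestor.

        Sketch only: overwriting an existing name is not handled.
        """
        leaf = Node(name, parent=self)
        leaf.total_bytes = size
        leaf.total_files = 1
        self.children[name] = leaf
        node = self
        while node is not None:  # cost is the depth of the tree, not its size
            node.total_bytes += size
            node.total_files += 1
            node = node.parent
```

With this in place, reading `total_bytes` on any directory node is a constant-time lookup. The trade-off is a little extra work on every write, in exchange for cheap reads.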

Whiteboard Dude approves!

I would have killed for this level and speed of insight back in my admin days, and frankly I have a few customers right now who would really benefit!

Taking this a step further, the analytics can also provide performance statistics based on file path and type, so for example it could show you where the hotspots are in your filesystem, and which clients are generating them.

Who’s using my storage?

Stuff I would like to see in future versions (though I know they don’t chase the Service Provider market) would be things like the ability to present storage to more than one Active Directory domain, straightforward RBAC (Role Based Access Control) at the management layer, and more of the standard data services you see from most vendors (the RFP tick box features). Being able to mix and match the physical appliance types would also be useful as you scale and your requirements change over time, but I guess if you need flexibility, go with the software-only solution.

At a non-feature level, it would be sensible if they could rename their aggregate terminology as I think it just confuses people (aggregates typically mean something else to most storage bods).

Capacity Visualisation

Overall though I think the Qumulo system is impressive, as are the founders’ credentials. Their CEO/CTO team of Peter Godman and Aaron Passey, with whom we had a good chinwag outside of the SFD8 arena, both played a big part in building the Isilon storage system. As an organisation they already regularly work with customers with over 10 billion files today and up to 4PB of storage.

If their system is capable of handling this kind of scalability having only come out of stealth 8 months ago, they’re definitely one to watch…

Further Reading
Some of the other SFD8 delegates have their own takes on the presentation we saw. Check them out here:

Dan Frith – Qumulo – Storage for people who care about their data

Scott D. Lowe – Data Awareness Is Increasingly Popular in the Storage Biz

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 8 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors’ products or services and I was not compensated in any way for my time at the event.


NanoLab – Part 10 – Your NUCs are nice and cool, but what about your stick?

I have been running a variety of Intel NUC nodes in my vSphere homelab over the past 3 years now, including the D34010WYKH, DC3217IYE & DC53427HYE.

In that time I have unfortunately seen more than my fair share of USB drive failures and corruptions, generally with an error which looks something like this:

Error loading /k.b00
Fatal error: 33 (Inconsistent data)

These are not cheap and nasty, or freebie USB drives, so I would not normally expect to see this rate of failures. The error only occurs when you reboot the host, and the startup bombs out at the start of the hypervisor launch. I have often managed to recover the stick by copying back corrupted files from another instance, but generally I needed to rebuild and restore the image. An unnecessary pain in the rear!
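If you have a known-good copy of the image to hand (another stick, or a backup folder), finding which files have gone bad is easy to script. This is a rough sketch under that assumption; the function names are my own, and bear in mind that some files (host-specific configuration, for example) will legitimately differ between two hosts.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged(good_root, suspect_root):
    """Return relative paths that are missing, unreadable or different."""
    damaged = []
    for dirpath, _dirnames, filenames in os.walk(good_root):
        for name in filenames:
            good = os.path.join(dirpath, name)
            rel = os.path.relpath(good, good_root)
            suspect = os.path.join(suspect_root, rel)
            try:
                same = sha256_of(good) == sha256_of(suspect)
            except OSError:
                same = False  # missing or unreadable counts as damaged
            if not same:
                damaged.append(rel)
    return sorted(damaged)
```

Anything returned by `find_damaged` is a candidate for copying back from the good copy before you resort to a full rebuild.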

The Root Cause
The NUC case can become quite warm during normal operation with or without the fans spinning up, and I have come to believe that the main reason for the corruptions is that the USB stick itself is getting too hot and therefore eventually failing. Whenever I have pulled a USB stick out of a recently shut down node, it has been really quite hot to the touch. You don’t actually see the symptom / failure until a reboot because the ESXi image actually runs in memory, so is only loaded from the USB stick at boot time.

The Solution
As for the solution, it’s really quite simple. I purchased a number of 12cm (5 inch) USB 2.0 extender cables on eBay for just 99p each (including delivery!).

These keep the USB stick indirectly attached to the NUC chassis, and as such the heat does not transfer into the flash drive. Since doing this I have not seen any further issues with the corruptions. Job done!

Keeping things cool: USB extender on Intel NUC

Software Defined Storage Virtualisation – How useful is that then?

Ignoring the buzzword bingo post title, storage virtualisation is not a new thing (and for my American cousins, yes, it should be spelt with an s! 🙂 ).

NetApp have for example been doing a V-Series controller for many years which could virtualise pretty much any storage you stick in the back of it. It would then present it as NFS and layer on all of the standard ONTAP features.

The big advantage then was that you could use features which might otherwise be missing from your primary or secondary storage tiers, as well as being able to mix and match different tiers of storage from the same platform.

In a previous role, we had an annual process to fully back up and restore a 65TB Oracle database from one site to another over a rather slow link, using an ageing VTL that could just about cope with incrementals and not much more on a day to day basis. End to end this process took a month!

Then one year we came up with a plan to use virtualised NFS storage to do compressed RMAN backups, replicate the data using SnapMirror and restore on the other side. It took us 3 days; an order of magnitude improvement!

That was 4 years ago, when the quantity of data globally was about a quarter of what it is now; the problem of data inertia is only going to get worse as the world’s storage consumption doubles roughly every two years!
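The four-fold growth over four years is just compound doubling. A tiny sketch, assuming the rough two-year doubling period quoted above (the function name is purely illustrative):

```python
def growth_factor(years, doubling_period_years=2.0):
    """How many times larger the data set is after `years` of doubling."""
    return 2 ** (years / doubling_period_years)


print(growth_factor(4))   # 4.0  -> four years at this rate means 4x the data
print(growth_factor(10))  # 32.0 -> a decade at this rate means 32x the data
```

In other words, a transfer that takes 3 days today would take nearly two weeks in four years' time if the link and tooling stood still.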

What businesses need is the flexibility to use a heterogeneous pool of storage of different tiers and vendors in different locations, and to move their data around as required to meet their current IT strategy, without having to change paths to data or take downtime (especially on non-virtualised workloads which don’t have the benefits of Storage vMotion etc). These tiers need to provide the consistent performance defined by individual application requirements.

It’s for this reason that I was really interested in the presentation from Primary Data at Storage Field Day 8. They were founded just two years ago, came out of stealth at VMworld 2015, and plan to go GA with their first product in less than a month’s time. They also have some big technical guns in the form of their Chief Scientist, the inimitable Steve Wozniak!

One of the limitations of the system I used in the past was that it was ultimately a physical appliance, with all the usual drawbacks thereof. Primary Data are providing the power to abstract data services based on software only, presented in the most appropriate format for the workload at hand (e.g. for vSphere, Windows, Linux etc), so issues with data gravity and inertia are effectively mitigated. I immediately see three big benefits:

  • Not only can we decouple the physical location of the data from its logical representation and therefore move that data at will, we can also very quickly take advantage of emerging storage technologies such as VVOLs.
    Some companies who shall remain nameless (and happen to have just been bought by a four letter competitor) won’t have support for VVOLs for up to another 12 months on some of their products, but with the “shim” layer of storage virtualisation from Primary Data, we could do it today on virtually any storage platform whether it is VVOL compliant or not. Now that is cool!
  • By virtualising the data plane and effectively using the underlying storage as object storage / chains of blocks, they enable additional data services which may either not be included with the current storage, or may be an expensive add-on license. A perfect example of this is sync and async replication between heterogeneous devices.
    Perhaps then you could spend the bulk of your budget on fast and expensive storage in your primary DC from vendor A, then replicate to your DR site asynchronously onto cheaper storage from vendor B, or even a hyper-converged storage environment using all local server media. The possibilities are broad to say the least!
  • The inclusion of policy based Quality of Service from day one. In Primary Data parlance, they call them SLOs – Service Level Objectives for applications with specific IOPS, latency etc.
    QoS does not even exist as a concept on many recent storage devices, much to the chagrin of many service providers for example, so being able to retrofit it would protect the ROI on existing spend whilst keeping the platform services up to date.
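To make the SLO idea a little more concrete, here is a purely hypothetical sketch of policy-based placement: describe what each tier can deliver, describe what the application needs, and pick the cheapest tier that satisfies the objective. None of these class names or fields come from Primary Data; they are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """What a storage tier can deliver (hypothetical fields)."""
    name: str
    max_iops: int
    latency_ms: float
    cost_per_gb: float


@dataclass(frozen=True)
class SLO:
    """What an application requires (hypothetical fields)."""
    min_iops: int
    max_latency_ms: float


def cheapest_tier_for(slo, tiers):
    """Return the cheapest tier that satisfies the SLO, or None."""
    candidates = [
        tier for tier in tiers
        if tier.max_iops >= slo.min_iops
        and tier.latency_ms <= slo.max_latency_ms
    ]
    return min(candidates, key=lambda tier: tier.cost_per_gb, default=None)
```

The point of doing this in a virtualisation layer is that the policy follows the application, so the data can be moved to a different tier or vendor when the objective or the price changes.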

There are however still a few elements which to me are not yet perfect. Access to SMB requires a filter driver in Windows in front of the SMB client, so the client thinks it’s talking to an SMB server but it’s actually going via the control plane to route the data to the physical block chains. A bit of a pain to retrofit to any large legacy environment.

vSphere appears to be a first class tenant in the Primary Data solution, with VASA and NFS-VAAI supported out of the “virtual” box, however it would be nice to have Primary Data as a VASA Client too, so it could read and then surface all capabilities from the underlying storage straight through to the vSphere hosts.

You will still have to do some basic administration on your storage back end to present it through to Primary Data before you can start carving it up in their “Single Pane of Glass”. If they were to create array plugins which would allow you to remote manage many common arrays this would really make that SPoG shine! (Yes, I have a feverish unwavering objection to saying that acronym!)

I will certainly be keeping an eye on Primary Data as they come to market. Their initial offering would have solved a number of issues for me in previous roles if it had been available a few years earlier, and I can definitely see opportunities where it would work well in my current infrastructure. I guess it is now up to the market to decide whether they see the benefits too!

Further Reading
Some of the other SFD8 delegates have their own takes on the presentation we saw. Check them out here:

Ray Lucchesi – Primary data’s path to better data storage presented at SFD8

Dan Frith – Primary Data – Because we all want our storage to do well

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 8 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors’ products or services and I was not compensated in any way for my time at the event.

NanoLab – Part 9 – Installing VMware vSphere ESXi 5.5 on Intel NUC

I successfully ran my VMware vSphere ESXi 5.1 Nanolab for 18 months on my pair of Intel NUC DC3217IYE hosts. Early this year I got around to upgrading to 5.5. I had experienced some issues with my vCenter Server Appliance so ended up just rebuilding the lab from scratch and reattaching my old data stores. Having written all of this up, I then promptly forgot to post it! So for the sake of continuity (before I do the same for 6.0 shortly), this article covers the process.

In addition I purchased a 3rd node for my lab, the 4th Gen D34010WYKH model (also with a Core i3), on which I was able to test and prove the process as it uses the same NIC chipset.

The following are updated instructions for installing vSphere 5.5 on Intel NUC (any model with the Intel® 82579V or Intel® I218V onboard NIC should work).

Before you start, I recommend you upgrade the NUC to the latest firmware to avoid any potential bugs (of which there were a few when they were first released). Copy the latest firmware image onto a USB stick, boot the NUC, hit F7 at the BIOS, find your firmware on the USB stick and let it do its thing:

Intel NUC Firmware Upgrade

vSphere 5.5 Install Requirements

  • A USB stick. This should work on anything over 1-2GB, but personally I am using 8GB PNY Micro Sleek Attache & 16GB Kingston DataTraveler Micro drives as they’re tiny, so less likely to catch on anything as they stick out the back of the NUC box, and they cost less than £5 each.
  • A copy of VMware Workstation 8 / Fusion 6 or newer.
  • ESXi-Customizer 2.7.2 (created by Andreas Peetz) for adding VIBs to your image. NOTE: This can also be done with PowerShell, but I like the GUI as it’s easy!
  • The ESXi driver for the Intel® 82579V Gigabit Ethernet Controller (e.g. for the original models using ESXi 5.5):
  • OR The ESXi driver for the Intel® I218V Gigabit Ethernet Controller (e.g. for the Haswell based D34010U models):
  • (AND) The ESXi AHCI driver for the SATA controller (if you want to use local drives in the Haswell based D34010U models):
    • sata-xahci-1.10-1.x86_64
    • If you do choose to add this in as well to your image, simply run the customizer twice, once for the network VIB, then a second time for the SATA VIB, using the interim image as your source for the final image.

Process Overview

  • Create a customised ISO with the additional Intel driver.
  • Install ESXi to your USB stick using VMware Workstation / VMware Fusion and the customised ISO you will create below.
  • Plug in your NUC, insert the USB stick, boot and go!

Part One – Create the Custom ISO

  1. Run the ESXi-Customizer-v2.7.2.exe (latest version at time of writing).
  2. This will extract the customizer to the directory of your choosing.
  3. Navigate to the new directory.
  4. Run the ESXi-Customizer.cmd batch file. This will open up the GUI, where you can configure the following options:
  • Path to your ESXi Installer
  • Path to the Intel driver downloaded previously
  • Path where you want the new ISO to be saved
  5. Ensure you tick the Create (U)EFI-bootable ISO checkbox.
ESXi-Customizer with 2.3.2 vib

This will output a new custom ESXi installer ISO called ESXi-5.x-Custom.iso or similar, in the path defined above.

Part Two – Install bootable ESXi to the USB stick.
I stress that this is my preferred way of doing this as an alternative is simply to burn your customised ISO to a CD/DVD and boot using a USB DVD-ROM. That would however be a whole lot slower, and waste a blank CD!

  1. Plug your chosen USB stick into your PC.
  2. Open VMware Workstation (8 or above), VMware Fusion, or whatever you use, ideally supporting the Virtualize Intel VT-x/EPT or AMD-V/RVI option (allowing you to nest 64-bit VMs).
  3. Create a new VM. You can use any spec you like really, as ESXi always checks on boot, but I created one with similar specs to my intended host: single socket, 2 vCPU cores. RAM doesn’t really matter either but I use at least 4GB normally. This does not require a virtual hard disk.
  4. Once the VM is created, and before you boot it, edit the CPU settings and tick the Virtualize Intel VT-x/EPT or AMD-V/RVI checkbox. This will reduce errors when installing ESXi (which checks to ensure it can virtualise 64-bit operating systems).

VMware Workstation Nesting

VMware Fusion Nesting

  5. Set the CD/DVD (IDE) configuration to Use ISO image file, and point this to the customised ISO created earlier.
  6. Once the above settings have been configured, power on the VM.
  7. As soon as the VM is powered on, in the bottom right of the screen, right click on the flash disk icon, and click Connect (Disconnect from Host).

Attach USB in VMware Workstation

Attach USB in VMware Fusion

  8. This will mount the USB stick inside the VM, and allow you to do a standard ESXi installation onto the stick.
ESXi Install

  9. At the end of the installation, disconnect the stick from the VM, then un-mount and unplug it.
Install Complete

Part Three – Boot and go!
This is the easy bit, assuming you don’t have any of the HDMI issues I mentioned in the first post!

  1. Plug your newly installed USB stick into the back of the NUC.
  2. Don’t forget to plug in a network cable (duh!) and keyboard for the initial configuration. If you wish to modify any BIOS settings (optional), you will also ideally need a mouse as the NUC runs Visual BIOS.
  3. Power on the NUC…
  4. Have fun!

That’s it!

Any questions/comments, please feel free to hit me up on Twitter as I have recently disabled comments on my blog due to the insane volumes of spam bots they were attracting!
