Tag Archive for SFD9

Storage Field Day 9 – Behind the Curtain

Tech Field Day cheese

Tech Field Day is an awesome experience for all of the delegates! We get to spend an entire week unabashedly geeking out, as well as hanging out with the founders, senior folk and engineers at some of the most innovative companies in the world!

For those people who always wondered what goes on behind the scenes in the Tech Field Day experience, I took a few pano shots at the Storage Field Day 9 event this week.

Here they are, along with most of my favourite tweets and photos of the week… it was a blast!

Panos

Pre-Event Meeting

Pre-Event Meeting & Plexistor

NetApp & SolidFire

NetApp & SolidFire

Violin Memory

Violin Memory

Intel

Intel

Cohesity

Cohesity

VMware

VMware

The rest of the event…

Until next time… 🙂

You had me at Tiered Non-Volatile Memory!

Memory isn’t cheap! Despite the falling costs and increasing sizes of DRAM DIMMS, it’s still damned expensive compared to most non-volatile media at a price per GB. What’s more frustrating is that often you buy all of this expensive RAM, assign it to your applications, and find later through detailed monitoring, that only a relatively small percentage is actually being actively used.

For many years, we have had technologies such as paging, which allow you to maximise the use of your physical RAM, by writing out the least used pages to disk, freeing up RAM for services with current memory demand. The problem with paging is that it is sometimes unreliable, and when you do actually need to get that page back, it can be multiple orders of magnitude slower returning it from disk.

Worse still, if you are running a workload such as virtual machines and the underlying host becomes memory constrained, a hypervisor may often not have sufficient visibility of the underlying memory utilisation, and as such will simply swap out random memory pages to a swap file. This can obviously have significant impact on virtual machine performance.

More and more applications are being built to run in memory these days, from Redis to Varnish, Hortonworks to MongDB. Even Microsoft got on the bandwagon with SQL 2014 in-memory OLTP.

One of the companies we saw at Storage Field Day ,  Plexistor, told us that can offer both tiered posix storage and tiered non-volatile memory through a single software stack.

The posix option could effectively be thought of a bit like a non-volatile, tiered RAM disk. Pretty cool, but not massively unique as RAM disks have been around for years.

The element which really interested me was the latter option; effectively a tiered memory driver which can present RAM to the OS, but in reality tier it between NVDIMMs, SSD and HDDs depending on how hot / cold the pages are! They will also be able to take advantage of newer bit addressable technologies such as 3D XPoint as they come on the market, making it even more awesome!

PlexistorArch.jpg

Plexistor Architecture

All of this is done through the simple addition of their NVM file system (i.e. device driver) on top of the pmem and bio drivers and this is compatible with most versions of Linux running reasonably up to date kernel versions.

It’s primarily designed to work with some of the Linux based memory intensive apps mentioned above, but will also work with more traditional workloads as well, such as MySQL and the KVM hypervisor.

Plexistor define their product as “Software Defined Memory” aka SDM. An interesting term which is jumping on the SDX bandwagon, but I kind of get where they’re going with it…

SDM_vs_SDS2.png

Software Defined Memory!

One thing to note with Plexistor is that they actually have two flavours of this product; one which is based on the use of NVRAM to provide a persistent store, and one which is non-persistent, but can be run on cloud infrastructures, such as AWS. If you need data persistence for the latter, you will have to do it at the application layer, or risk losing data.

If you want to find out a bit more about them, you can find their Storage Field Day presentation here:
Plexistor Presents at Storage Field Day 9

Musings…
As a standalone product, I have a sneaking suspicion that Plexistor may not have the longevity and scope which they might gain if they were procured by a large vendor and integrated into existing products. Sharon Azulai has already sold one startup in relatively early stages (Tonian, which they sold to Primary Data), so I suspect he would not be averse to this concept.

Although the code has been written specifically for the Linux kernel, they have already indicated that it would be possible to develop the same driver for VMware! As such, I think it would be a really interesting idea for VMware to consider acquiring them and integrating the technology into ESXi. It’s generally recognised as a universal truth that you run out of memory before CPU on most vSphere solutions. Moreover, when looking in the vSphere console we often see that although a significant amount of memory is allocated to VMs, often only a small amount is actually active RAM.

The use of Plexistor technology with vSphere would enable VMware to both provide an almost infinite pool of RAM per host for customers, as well as being able to significantly improve upon the current vswp process by ensuring hot memory blocks always stay on RAM and cold blocks are tiered out to flash.

plexistorvmware

The homelab nerd in me also imagines an Intel NUC with 160GB+ of addressable RAM per node! 🙂

Of course the current licensing models for retail customers favour the “run out of RAM first” approach as it sells more per-CPU licenses, however, I think in the long term VMware will likely move to a subscription based model, probably similar to that used by service providers (i.e. based on RAM). If this ends up being the approach, then VMware could offer a product which saves their customers further hardware costs whilst maintaining their ESXi revenues. Win-Win!

Further Reading
One of the other SFD9 delegates had their own take on the presentation we saw. Check it out here:

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 9 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors products or services and I was not compensated in any way for my time at the event.

Violin Memory have cool technology, but do they have a future? I hope so!

Since visiting Violin Memory at Storage Field Day 8, it has taken me a while to get around to writing this post, and I guess the reason is because I am both frustrated, and a little bit sad.

They were the first guys who (for me at least) truly tamed the beast that is flash storage, packed it up into a blisteringly fast product with insanely low latencies, and released it into the big wide world.

An organisation I worked at took that product and threw thousands of desktop users at it, a number of very busy SQL TempDBs, and some other frankly evil workloads. It didn’t even blink an eye! With Violin Storage on the back, I have seen RDS desktop customers stress test the platform with up to 100 users in a single VM!

The thing that makes me sad is that to date, Violin have not yet turned a profit.

This is not some upstart company out of the boondocks! This is a mature company, founded in 2005, who provide their product to some of the biggest enterprises in the world. Yet so far they have not managed to make a penny!

Anyway, there are other guys in the industry who are far better at financial analysis than me (for example Justin Warren did a post on this very subject just last week). So I will leave it to them try to work out why this is the case, because I like to talk tech!

Violin Memory

Looking at the latest incarnation, once again Violin have “evolutionized” [sic] something which is technically very impressive.

  • The original Violin OS has been given the boot and has been replaced by something they call Concerto OS, which blends elements of the original software with some updated features. In development is also support for a multi-controller configuration beyond the current dual, though this is obviously not GA yet.
  • The new dual controller Flash Storage Platform supports FC, iSCSI and Infiniband, as well as RDMA and ROCE.
  • It is all packaged in a 3U appliance with up to 64 redundant flash modules.
  • Thanks to their own custom backplane design, it is capable of 10-12GB/sec of throughput!
  • At 100% sustained writes they have measured 400,000 IOPSat RAID5, which is more than many of their competitors can achieve with 100% reads!It should be noted that this was on the performance model, which does not support dedupe.

All in all the new solution just screams FAST! In fact, I’m surprised they didn’t paint a red stripe down the side of the chassis!

low latency

So why on earth are Violin not ruling the AFA world right now? As a frickin cool technology with hyper speed storage, they deserve to be up there at the very least!

If I had to hazard a guess, it’s because for most companies, good is good enough.

With the smorgasbord of All Flash Arrays available today, if you don’t need latency measured in microseconds and massive IOPS/bandwidth, then you have a huge array of choices (pardon the pun). At that point features, price and support become more important than straight line drag racer performance.

If Violin want to compete with the general market whilst servicing their high-speed clients, then they need to concentrate on continuing to developing a wider range of data services and providing entry-level options to consume their products. The last thing you want to do is lose business just because you were missing a check box in an RFP…

If Violin can stop burning cash and break even, then perhaps they have a future. I for one, hope so!

If you want to catch Violin’s presentation from SFD8, check them out here:
http://techfieldday.com/appearance/violin-memory-presents-at-storage-field-day-8/

They will also be presenting again at SFD9 this week, and I’m looking forward to finding out what they plan to do next!

Further Reading
Some of the other SFD8 delegates have their own takes on the presentation we saw. Check them out here:

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 8 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors products or services and I was not compensated in any way for my time at the event.

Secondary can be just as important as Primary

There can be little doubt these days, that the future of the storage industry for primary transactional workloads is All Flash. Finito, that ship has sailed, the door is closed, the game is over, [Insert your preferred analogy here].

Now I can talk about the awesomeness of All Flash until the cows come home, but the truth is that flash is not now, and may never be as inexpensive for bulk storage as spinning rust! I say may as technologies like 3D NAND are changing the economics for flash systems. Either way, I think it will still be a long time before an 8TB flash device is cheaper than 8TB of spindle. This is especially true for storing content which does not easily dedupe or compress, such as the two key types of unstructured data which are exponentially driving global storage capacities through the roof year on year; images and video.

With that in mind, what do we do with all of our secondary data? It is still critical to our businesses from a durability and often availability standpoint, but it doesn’t usually have the same performance characteristics as primary storage. Typically it’s also the data which consumes the vast majority of our capacity!

AFA Backups

Accounting needs to hold onto at leat 7 years of their data, nobody in the world ever really deletes emails these days (whether you realise or not, your sysadmin is probably archiving all of yours in case you do something naughty, tut tut!), and woe betide you if you try to delete any of the old marketing content which has been filling up your arrays for years! A number of my customers are also seeing this data growing at exponential rates, often far exceeding business forecasts.

Looking at the secondary storage market from my personal perspective, I would probably break it down into a few broad groups of requirements:

  • Lower performance “primary” data
  • Dev/test data
  • Backup and archive data

As planning for capacity is becoming harder, and business needs are changing almost by the day, I am definitely leaning more towards scale-out solutions for all three of these use cases nowadays. Upfront costs are reduced and I have the ability to pay as I grow, whilst increasing performance linearly with capacity. To me, this is a key for any secondary storage platform.

One of the vendors we visited at SFD8, Cohesity, actually targets both of these workload types with their solution, and I believe they are a prime example of where the non-AFA part of the storage industry will move in the long term.

The company came out of stealth last summer and was founded by Mohit Aron, a rather clever chap with a background in distributed file systems. Part of the team who wrote the Google File System, he went on to co-found Nutanix as well, so his CV doesn’t read too bad at all!

Their scale-out solution utilises the now ubiquitous 2u, 4-node rack appliance physical model, with 96TB of HDDs and a quite reasonable 6TB of SSD, for which you can expect to pay an all-in price of about $80-100k after discount. It can all be managed via the console, or a REST API.

Cohesity CS2000 Series

2u or not 2u? That is the question…

That stuff is all a bit blah blah blah though of course! What really interested me is that Cohesity aim to make their platform infinitely and incrementally scalable; quite a bold vision and statement indeed! They do some very clever work around distributing data across their system, whilst achieving a shared-nothing architecture with a strongly consistent (as opposed to eventually consistent), 2-phase commit file system. Performance is achieved by first caching data on the SSD tier, then de-staging this sequentially to HDD.

I suspect the solution being infinitely scalable will be difficult to achieve, if only because you will almost certainly end up bottlenecking at the networking tier (cue boos and jeers from my wet string-loving colleagues). In reality most customers don’t need infinite as this just creates one massive fault domain. Perhaps a better aim would be to be able to scale massively, but cluster into large pods (perhaps by layer 2 domain) and be able to intelligently spread or replicate data across these fault domains for customers with extreme durability requirements?

Lastly they have a load of built-in data protection features in the initial release, including instant restore, and file level restore which is achieved by cracking open VMDKs for you and extracting the data you need. Mature features, such as SQL or Exchange object level integration, will come later.

Cohesity Architecture

Cohesity Architecture

As you might have guessed, Cohesity’s initial release appeared to be just that; an early release with a reasonable number of features on day one. Not yet the polished article, but plenty of potential! They have already begun to build on this with the second release of their OASIS software (Open Architecture for Scalable Intelligent Storage), and I am pleased to say that next week we get to go back and visit Cohesity at Storage Field Day 9 to discuss all of the new bells and whistles!

Watch this space! 🙂

To catch the presentations from Cohesity as SFD8, you can find them here:
http://techfieldday.com/companies/cohesity/

Further Reading
I would say that more than any other session at SFD8, the Cohesity session generated quite a bit of debate and interest among the guys. Check out some of their posts here:

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 8 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors products or services and I was not compensated in any way for my time at the event.

%d bloggers like this: