Tag Archive for Amazon

Cohesity Announces Cloud Integration Services

With the release of v2.0 of their OASIS platform, as presented at Storage Field Day 9 recently, Cohesity’s development team have continued to churn out new features and data services at a significant rate. It seems that they are now accelerating towards the cloud (or should that be The Cloud?) with a raft of cloud integration features announced today!

There are three key new features included as part of this, called CloudArchive, CloudTier and CloudReplicate respectively, all of which pretty much do exactly what it says on the tin!

CloudArchive is a feature which allows you to archive datasets to the cloud (duh!), specifically onto Google Nearline, Azure, and Amazon S3. This would be most useful for things like long-term retention of backups without taking up space on your primary platform.

[Image: Cohesity cloud integration features]

CloudTier extends on-premises storage, allowing you to use cloud storage as a cold tier, moving your least-used blocks out. If you are like me, you like to understand how these things work down deep in the guts! Mohit Aron, Founder & CEO of Cohesity, kindly provided Tekhead.it with this easy-to-understand explanation of their file and tiering system:

NFS/SMB files are mapped to objects in our system – which we call blobs. Each blob consists though of small pieces – which we call chunks. Chunks are variable sized – approximately ranging from 8K-16K. The variable size is due to deduplication – we do variable length deduplication.

The storage of the chunks [is] done by a completely different component. We group chunks together into what we call a chunkfile – which is approximately 8MB in size. When we store a chunkfile on-prem, it is a file on Linux. But when we put it in the cloud, it becomes an S3 object.

Chunkfiles are the units of tiering – we’ll move around chunkfiles based on their hotness.

So there you have it folks; chunkfile hotness is the key to Cohesity’s very cool new tiering technology! I love it!

[Image: chunkfile hotness]

With chunkfiles set at 8MB, this seems like a sensible size for moving large quantities of data back and forth to the cloud with minimal overhead. With a reasonable internet connection in place, it should still be possible to recall a “cool” chunk without too much additional latency, even if your application does require it in a hurry.
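To make the blob / chunk / chunkfile relationship a little more concrete, below is a rough conceptual sketch in Python of how variable-sized chunks might be packed into ~8MB chunkfiles, with the coldest chunkfiles selected for tiering out to the cloud. This is purely my own illustration based on Mohit’s description above, not Cohesity’s actual implementation, and the function names, sizes and thresholds are assumptions.

```python
# Conceptual sketch only -- NOT Cohesity's code. Sizes and thresholds are
# illustrative assumptions based on the description quoted above.
CHUNKFILE_TARGET_BYTES = 8 * 1024 * 1024  # ~8MB chunkfile: the unit of tiering

def pack_chunks_into_chunkfiles(chunks):
    """Group variable-sized chunks (each ~8-16KB of deduplicated data) into ~8MB chunkfiles."""
    chunkfiles, current, current_size = [], [], 0
    for chunk in chunks:
        if current and current_size + len(chunk) > CHUNKFILE_TARGET_BYTES:
            chunkfiles.append(b"".join(current))
            current, current_size = [], 0
        current.append(chunk)
        current_size += len(chunk)
    if current:
        chunkfiles.append(b"".join(current))
    return chunkfiles

def select_chunkfiles_to_tier(access_counts, cold_threshold=2):
    """Pick the 'cold' chunkfiles (by hotness, i.e. access count) to move to cloud storage."""
    return [cf_id for cf_id, hits in access_counts.items() if hits <= cold_threshold]
```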

You can find more information about these two services in a new video they have just published to their YouTube channel.

The final feature, and the one of most interest to me, is called CloudReplicate, though this is not yet ready for release and I am keen to find out more as information becomes available. With CloudReplicate, Cohesity has made the bold decision to allow customers to run a software-only edition of their solution in the cloud of their choice, with native replication from their on-premises appliances, paving the way to true hybrid cloud, or even simply providing a very clean DR strategy.

This solution is based on their native on-premises replication technology, and as such will support multiple replication topologies, e.g. 1-to-many, many-to-1, many-to-many, etc, providing numerous simple or complex DR and replication strategies to meet multiple use cases.

[Image: Cohesity CloudReplicate]

It could be argued that the new solution potentially provides their customers with an easy on-ramp to the cloud in a few years’ time. I would say that anyone making an investment in Cohesity today is likely to continue to use their products for some time, and between now and then Cohesity will have the time to significantly grow their customer base and market share, even if it means enabling a few customers to move away from on-prem down the line.

I have to say that once again Cohesity have impressed with their vision and speedy development efforts. If they can back this with increased sales to match, their future certainly looks rosy!

Disclaimer/Disclosure: My flights, accommodation, meals, etc., at Storage Field Day 9 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors’ products or services, and I was not compensated in any way for my time at the event.

Amazon AWS Tips and Gotchas – Part 4 – Direct Connect & Public / Private VIFs

Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS, specific to Direct Connect.

For the first post in this series with a bit of background on where it all originated from, see here:
Amazon #AWS Tips and Gotchas – Part 1

For more posts in this series, see here:
Index of AWS Tips and Gotchas

Tips and Gotchas – Part 4
10. VPC Private / Public Access Considerations

If you have gone out and bought a shiny new Direct Connect to your AWS platform, you might reasonably assume that all of the users and applications on your MPLS will automatically start using this for accessing S3 content and other AWS endpoints. Unfortunately, this is not so simple!

At a high level, here is a diagram showing the two primary Direct Connect configurations, Public and Private:

[Image: AWS Direct Connect Public and Private VIF]

More info on Direct Connect here:
AWS Direct Connect by Camil Samaha

A key point to note about Direct Connect is that it supports multiple VIFs per 1Gbps or 10Gbps link:

[Image: AWS Direct Connect VIFs]

If you are not a giant enterprise and don’t need this kind of bandwidth, you can buy single VIFs from your preferred network provider, but you will pay on a per-VIF basis, so Direct Connect access to multiple VPCs and to public endpoints will bump up your costs a bit.
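As an aside, once the underlying Direct Connect connection exists, VIFs can be provisioned programmatically as well as through the console. Below is a hedged sketch using boto3’s Direct Connect API to create one private VIF (towards a VPC’s virtual private gateway) and one public VIF (towards AWS public endpoints such as S3) on the same connection. All connection IDs, VLANs, ASNs, addresses and prefixes are hypothetical placeholders rather than values from a real environment.

```python
import boto3

dx = boto3.client("directconnect", region_name="eu-west-1")

# Private VIF: terminates on a virtual private gateway attached to one VPC
private_vif = dx.create_private_virtual_interface(
    connectionId="dxcon-EXAMPLE",                  # hypothetical connection ID
    newPrivateVirtualInterface={
        "virtualInterfaceName": "vif-to-my-vpc",
        "vlan": 101,
        "asn": 65000,                              # customer-side BGP ASN
        "virtualGatewayId": "vgw-EXAMPLE",         # VGW attached to the target VPC
    },
)

# Public VIF: peers with AWS public address space (S3 and other public endpoints)
public_vif = dx.create_public_virtual_interface(
    connectionId="dxcon-EXAMPLE",
    newPublicVirtualInterface={
        "virtualInterfaceName": "vif-to-aws-public",
        "vlan": 102,
        "asn": 65000,
        "amazonAddress": "198.51.100.1/30",        # public peering addresses are
        "customerAddress": "198.51.100.2/30",      # required for a public VIF
        "routeFilterPrefixes": [{"cidr": "203.0.113.0/24"}],  # prefixes you advertise
    },
)
```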

The question therefore becomes, what is the cost-effective and simple solution to access service endpoints (such as S3 in the examples below), when you also want to access your private resources in your own VPCs?

This is not always a straightforward answer if you are on a tight budget.

Accessing S3 via your Direct Connect

As I understand it, the S3 endpoint acts very much like VPC peering, only it is from your VPC to S3, and it is therefore subject to similar restrictions. Specifically, the S3 endpoint documentation makes a key statement:

“Endpoint connections cannot be extended out of a VPC. Resources on the other side of a VPN connection, a VPC peering connection, an AWS Direct Connect connection, or a ClassicLink connection in your VPC cannot use the endpoint to communicate with resources in the endpoint service.”

Basically this means that for every VPC you want to communicate with directly from your MPLS, you need another VIF, and hence another connection from your service provider. If you want to access S3 and other AWS public endpoints directly, you will also need an additional connection dedicated to that. This assumes your requirements are not enough to justify buying a 1Gbps / 10Gbps pipe for your sole use, and that you are using a partner to deliver it. If you can buy 1Gbps or above, then you can subdivide your pipe into multiple VIFs for little or no extra cost.
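For reference, creating the (non-transitive) S3 endpoint itself is trivial with boto3; a minimal sketch is below, with all IDs being hypothetical placeholders. It gives instances inside the VPC a private route to S3, but as per the quote above it does nothing for resources sitting on the far side of a Direct Connect or VPN.

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Gateway-style VPC endpoint for S3: adds an S3 prefix-list route to the
# specified route tables, so traffic from the VPC to S3 stays on the AWS network
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-EXAMPLE",                          # hypothetical VPC ID
    ServiceName="com.amazonaws.eu-west-1.s3",
    RouteTableIds=["rtb-EXAMPLE"],                # hypothetical route table ID
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```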

Here are four example solutions for different use cases, but note that they are definitely NOT all recommended or supported.

  • Assuming you are using a Private VIF, then by default the content in S3 is actually accessed over the internet (e.g. using HTTPS if your bucket is configured as such).
    This may come as a surprise to people, as you would expect to buy a connection and be able to access any AWS service.
    [Image: AWS Direct Connect Private VIF]
  • If you have a Direct Connect from your MPLS into Amazon as a Public connection / VIF, you can then route to the content over your Direct Connect; however, this means you are bypassing your VPC and going straight into Amazon.
    This is a bit like having a private internet connection, so accessing VPCs etc. securely would still require you to run an IPsec VPN over the top of your “public” connection. This will work fine, lets you maximise the utilisation of the bandwidth on your Direct Connect, and reduces your Direct Connect costs by sharing one connection between all VPCs. This is OK, but frankly not brilliant, as you are ultimately still depending on VPNs to secure your data. If you want very secure, private access to your VPCs, you should really just spend the money! 🙂
    [Image: AWS Direct Connect Public VIF]
  • If you have a Direct Connect from your MPLS into Amazon as a Private connection / VIF, you could proxy the connectivity to S3 via an EC2 instance (a rough sketch of this follows the list below). The content is requested by your instance using the standard S3 API and forwarded back to your clients. This means your EC2 instance is now a bottleneck to your S3 storage, and if you want to avoid it becoming a SPoF, you need at least a couple of them.
    It is worth specifically noting that although technically possible, this method would be strictly against all support and recommendations from AWS! S3 endpoints and VPC peers are for accessing content from your VPCs; they are NOT meant to be transitive.
    [Image: AWS Direct Connect Private VIF]
  • Lastly, Amazon’s primary recommended method is to run multiple VIFs, mixing both public and private. The biggest downside here is that each VIF will likely have a specific amount of bandwidth associated with it, and you will have to procure multiple connections from your provider (unless you are big enough to need to buy a minimum of 1Gbps!).
    [Image: AWS Direct Connect Public and Private VIFs]
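Purely to illustrate the third option above, here is a minimal sketch of the kind of EC2-hosted proxy described: a tiny HTTP server that fetches objects from S3 via boto3 and streams them back to clients coming in over the private VIF. To reiterate, this pattern is not supported or recommended by AWS; the bucket name and port are assumptions, and a production build would at least need authentication, caching and more than one instance behind a load balancer.

```python
import boto3
from http.server import BaseHTTPRequestHandler, HTTPServer

BUCKET = "example-archive-bucket"  # hypothetical bucket name
s3 = boto3.client("s3")

class S3ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        key = self.path.lstrip("/")  # map the request path onto an S3 object key
        try:
            obj = s3.get_object(Bucket=BUCKET, Key=key)
        except s3.exceptions.NoSuchKey:
            self.send_error(404, "Object not found")
            return
        self.send_response(200)
        self.send_header("Content-Type", obj.get("ContentType", "application/octet-stream"))
        self.send_header("Content-Length", str(obj["ContentLength"]))
        self.end_headers()
        self.wfile.write(obj["Body"].read())  # forward the object back to the client

if __name__ == "__main__":
    # The proxy instance reaches S3 via the VPC's S3 endpoint; MPLS clients
    # reach the proxy over the private VIF
    HTTPServer(("0.0.0.0", 8080), S3ProxyHandler).serve_forever()
```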

As this scales to many accounts, many VPCs and many VIFs, things also start to get a bit complex when it comes to routing (especially if you want many or all of the VPCs in question to be able to route to each other), and I will cover that in the next post.

Until then…

[Image: AWS Direct Connect VIF networking]

Find more posts in this series here:
http://www.tekhead.org/tag/awsgotchas/

Amazon AWS Tips and Gotchas – Part 5 – Managing Multiple VPCs

AWS Certified Solutions Architect Associate Exam Study Guide & Resources

After about 5 weeks of steeping myself in the AWS ecosystem and platform, labbing like crazy, and attending a compressed AWS training course, I finally sat the AWS Certified Solutions Architect Associate exam last week and passed.

I’ve described my experience and thoughts on the exam itself here:
#AWS Certified Solutions Architect Associate Exam Prep & Experience

Study Materials

In preparation for the exam, I used the following study materials:

Best of luck with your exams!!! 🙂

AWS Certified Solutions Architect Associate Exam Prep & Experience


Historically I have been well aware of AWS and understood the key services at a high level, but recently this has become a key strategic focus for my employer, and I was asked to get down and dirty with the platform. So after about 5 weeks of steeping myself in the AWS ecosystem and platform, labbing like crazy, and attending a compressed AWS Solutions Architect training course, I finally sat the AWS Certified Solutions Architect Associate exam this week, and am happy to say I passed!

It has been a pretty intense number of weeks, and my wife has been less than impressed with hardly seeing me for a month, but it has certainly been worthwhile!

TL;DR: Loads of exam resources coming in the follow-up post. Learn to speed read! ACloud.Guru and official QA AWS courses are both good. The exam itself was reasonably tricky for an intro-level exam, but not too bad. The list of prep materials is here:
http://tekhead.it/blog/2016/03/aws-certified-solution-architect-associate-exam-study-guide-resources/

AWS Solutions Architect Exam Prep Process

I will post a follow-up list of resources shortly but for now, I will concentrate on the process!

My exam prep and training were largely centred around the ACloud.Guru and official QA AWS Accelerated courses, with a load of additional reading preceding and following them.

I am also a copious note taker and I spend significant amounts of time labbing to make sure that whatever I am designing for a customer, or whatever I am being tested on, I have generally done it at least once! More detail on these in the study materials post.

7 days before the AWS exam

Having spent several weeks labbing, I spent my last week predominantly reading through the recommended whitepapers and the AWS FAQ documents, along with a number of articles from the AWS documentation site.

2 days before the AWS exam

I spent this time solidly doing practice questions, reading AWS documentation to fill in any blanks from the practice questions, and reading through my notes from the two courses.

I found the sample exam and practice questions very useful. The same goes for the practice tests in the ACloud.Guru course. Whenever I came across a question I was not 100% confident on, again I hit the AWS documentation site to fill in the blanks.

1 day before the AWS exam

One thing I did the night before the exam was to read through all of my ACloud.Guru notes, specifically concentrating on the “Exam Tips” which Ryan had noted throughout the course, as well as all of the end-of-section summaries.

Similarly, during the QA course, every time the trainer mentioned something which was a likely exam topic, I made a specific note of it. I took some time to review this list prior to the exam and to look up AWS documentation and articles on the relevant features.


AWS Solutions Architect Exam Experience

The exam itself is obviously under NDA, so I can’t go into any detail about the content. Amazon also provide an FAQ about the exam which is worth reading.

The exam centre I used was not one I had used before for Prometric or Pearson Vue exams. It certainly looked the part, very modern etc., but in reality it was actually quite subpar. I was lucky enough to be sitting on the opposite side of a paper-thin wall from a very noisy chap in a meeting room! Fortunately, the exam centre did provide earplugs. I can’t say I have ever felt the need to wear earplugs in an exam before, but there’s a first time for everything!

I felt the time allocation was reasonable. I finished after roughly 75-80% of my allotted time, which is very similar to a number of other entry-to-mid-level industry exams I have taken in the past.

In terms of difficulty, I would equate the Solutions Architect Associate exam to being of a similar level to a reasonably tricky VCP / MCP, but definitely not as hard as a VCAP. I passed reasonably comfortably, but had to really think hard about quite a few of the questions. I was really glad I managed to get a bit of time to read some of the FAQ documents in the days before the exam, which were not originally on my resource list, but turned out to be very good exam prep!

Every time I hit next, there was a very long pause until the next question was displayed. I can only guess the questions were being requested on the fly as you progress, as the pause was so long I can’t think of any other reasonable explanation! I would guess I lost at least 3-5 minutes over the course of the exam staring at the next question loading. Not ideal if you are pushed for time, and had I been, I may have found this more frustrating.

The submit button (which ends the exam) is frankly stupid! It appears on every single page of the exam. Do they believe people are going to answer the first 3 questions then hit submit?!? This is just asking for trouble IMHO. The test system vendor they use feels dated and clunky compared to other systems I have used recently, e.g. for Microsoft and VMware exams on Pearson Vue, which are pretty dated in and of themselves!

As this post is now getting rather long, I shall end it here and provide a second post with a rather sizable list of my study materials!

In the meantime…

AWS Solution Architect Associate Exam Prep and Experience

 

AWS Certified Solutions Architect Associate Exam Study Guide & Resources

 
