
Am I getting overcharged for using the cloud?

When you put data into the cloud, what questions come to mind?

1. How long is it going to take?
2. How safe is it?
3. How much will I save?
4. Will I be able to get this data back?

The safety aspect of the data is covered in another blog post of mine; here I am sharing my thoughts on validating the cost savings.

These are a few ways cloud providers charge for backup or restore (data retrieval) operations:
1. Amount of storage used for the backup
2. Number of transactions, i.e. the number of GET and PUT operations
3. No charge for backup, but a charge for data retrieval

For example, pricing for Amazon S3: https://aws.amazon.com/s3/pricing/
For IBM SoftLayer: http://www.softlayer.com/info/pricing
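
To make the comparison concrete, here is a minimal sketch of the three charging models above, in Python. Every price in it is a made-up placeholder, not any provider's actual tariff; plug in the rates from the pricing pages linked above.

```python
# Generic cloud-backup cost models -- all rates are illustrative placeholders.

def storage_cost(gb_months, price_per_gb_month):
    """Model 1: pay for the capacity held (per GB-month)."""
    return gb_months * price_per_gb_month

def transaction_cost(puts, gets, put_price_per_1k, get_price_per_1k):
    """Model 2: pay per PUT/GET request, typically priced per 1,000 requests."""
    return (puts / 1000) * put_price_per_1k + (gets / 1000) * get_price_per_1k

def retrieval_cost(gb_restored, price_per_gb):
    """Model 3: backup is free; only data retrieval is charged."""
    return gb_restored * price_per_gb

# Example: 500 GB held for a month, 100k PUTs, 20k GETs, one 50 GB restore.
total = (storage_cost(500, 0.023)
         + transaction_cost(100_000, 20_000, 0.005, 0.0004)
         + retrieval_cost(50, 0.09))
print(f"estimated bill: ~${total:.2f}")
```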

IBM Spectrum Virtualize 7.8, which supports cloud backup to Amazon S3, IBM SoftLayer and OpenStack Swift through the recently introduced Transparent Cloud Tiering (TCT), provides the following data points for you to validate or estimate your data consumption:

  • Amount of data uploaded
  • Number of successful PUT and GET operations
  • Total bytes and blocks downloaded from and uploaded to the cloud
  • Total number of blocks uploaded


All these parameters are recorded in the usual Vdisk (Nv*) and Node (Nn*) statistics files on the configuration node of your cluster and can be retrieved from there for reference.
Many other parameters, such as failed-operation counters, backup and download latencies, retry counters and so on, are also recorded in these files.
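
As a rough illustration, a script can walk one of these copied-off stats XML files and total up the cloud counters. The file name and the attribute names below ("cloud_up_bytes" and friends) are hypothetical placeholders; check the statistics documentation for your release for the real element and attribute names.

```python
# Sketch: total cloud-upload bytes from a Spectrum Virtualize Vdisk stats file.
# File and attribute names here are hypothetical -- verify against your release.
import xml.etree.ElementTree as ET

STATS_FILE = "Nv_stats_node1_170115_120000"   # illustrative file name

uploaded_bytes = 0
for _, elem in ET.iterparse(STATS_FILE):
    if elem.tag.endswith("vdsk"):             # per-volume statistics element
        uploaded_bytes += int(elem.get("cloud_up_bytes", 0))  # hypothetical attr
print(f"bytes uploaded to cloud this interval: {uploaded_bytes}")
```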

A few other newly introduced CLI commands also show cloud usage:

  • svcinfo lscloudaccountusage – shows the total cloud account usage for the cluster
  • svcinfo lsvolumebackup – shows how much backup data is in the cloud for each volume
  • svcinfo lsvolumebackupprogress – shows the backup time and current progress of the backup for all volumes
  • svcinfo lsvolumerestoreprogress – shows the restore progress for all volumes

This is not an exhaustive list of the new CLI commands; many more are described in the Knowledge Center.
With careful analysis and a small script (a sketch follows below), monthly usage can be calculated and the bill from your particular service provider can be estimated.
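
Here is one way such a script could look. It assumes SSH access to the configuration node and uses the -delim option for parseable output; the column names it indexes into ("used_capacity", "put_operations") are hypothetical, so adjust them to whatever your firmware actually reports.

```python
# Rough monthly bill estimate from lscloudaccountusage output.
import subprocess

CLUSTER = "superuser@cluster-ip"     # placeholder address
PRICE_PER_GB_MONTH = 0.023           # placeholder rate; use your provider's
PUT_PRICE_PER_1K = 0.005             # placeholder rate

def svcinfo(cmd):
    """Run an svcinfo command on the config node and return its output lines."""
    out = subprocess.run(["ssh", CLUSTER, "svcinfo " + cmd + " -delim :"],
                         check=True, capture_output=True, text=True)
    return out.stdout.strip().splitlines()

header, *rows = [line.split(":") for line in svcinfo("lscloudaccountusage")]
for row in rows:
    usage = dict(zip(header, row))
    used_gb = float(usage["used_capacity"]) / 2**30   # hypothetical field, bytes
    puts = int(usage.get("put_operations", 0))        # hypothetical field
    bill = used_gb * PRICE_PER_GB_MONTH + puts / 1000 * PUT_PRICE_PER_1K
    print("account %s: ~$%.2f/month" % (usage.get("name", "?"), bill))
```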

PS: These are my personal thoughts and may or may not match those of my employer.

Is my data secure in the cloud with Spectrum Virtualize?

With the Spectrum Virtualize 7.8 release, the feature to put snapshots in the cloud is supported. Details on how to set it up are here.
These snapshots are stored in the form of objects. Objects contain both metadata and data, and for a large volume (sizes up to 256 TB are supported) there could be millions of blobs in the cloud; for instance, at an assumed object size of a few tens of MB, a fully allocated 256 TB volume would map to millions of objects.

What if I don’t have encryption on SVC?

These objects are created by the Spectrum Virtualize code and are uploaded to the cloud through an internal gateway. They are not in a human-readable format, so even if a cloud account is compromised (not a trivial thing, though), the snapshot data is not directly usable by a rogue user. Any restoration of the data requires Spectrum Virtualize code running on a system, along with a few other mandatory parameters.
This implies that even if a customer opts for an on-premises cloud using OpenStack Swift and does not want encryption, the data in the cloud is still well protected, though IBM strongly recommends that encryption be ON.

Advantage with Encryption ON:

When encryption is enabled on an SVC or Storwize cluster, data is encrypted first and then put in the cloud (public or private), with encryption keys at various levels: cluster, cloud-account layer, volume layer, snapshot-generation level and so on.
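
To illustrate the general idea of layered keys, here is a conceptual Python sketch of envelope wrapping using the cryptography package. This is not the actual Spectrum Virtualize implementation, just the pattern: each layer's key is wrapped by the key of the layer above it.

```python
# Conceptual envelope encryption -- each layer's key is wrapped by the one above.
from cryptography.fernet import Fernet

cluster_key = Fernet(Fernet.generate_key())        # cluster-level master key
account_key = Fernet.generate_key()                # cloud-account layer key
wrapped_account_key = cluster_key.encrypt(account_key)

volume_key = Fernet.generate_key()                 # volume layer key
wrapped_volume_key = Fernet(account_key).encrypt(volume_key)

# Snapshot data is encrypted with the innermost key before upload.
blob = Fernet(volume_key).encrypt(b"snapshot block data")

# Restore path: unwrap the chain outside-in, then decrypt the data.
acct = cluster_key.decrypt(wrapped_account_key)
vol = Fernet(acct).decrypt(wrapped_volume_key)
assert Fernet(vol).decrypt(blob) == b"snapshot block data"
```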

Without access to the correct keys, data can never be restored by an unauthorised user.

What if all my keys are compromised?

Spectrum Virtualize provides a re-keying mechanism that can swiftly change all the keys of the system, including those relevant to the cloud accounts. It is a two-step process and benefits from SVC fault tolerance as well.
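
As a sketch of that two-step flow: the chencryption syntax below is my recollection of the USB rekey procedure, so verify it against the Knowledge Center for your release before running anything.

```python
# Two-step USB rekey, driven over SSH -- verify the command syntax first.
import subprocess

def svctask(cmd):
    subprocess.run(["ssh", "superuser@cluster-ip", "svctask " + cmd], check=True)

svctask("chencryption -usb newkey -key prepare")   # step 1: stage the new keys
# ...confirm the new key material is safely written to the USB drives...
svctask("chencryption -usb newkey -key commit")    # step 2: commit the rekey
```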

USB vs SKLM encryption?

With USB-mode encryption, up to three USB drives can be enabled with the master keys that are critical for accessing data in the cloud.
In case of a site failure, physical access to these keys is required to regenerate the data.

With SKLM-based keys, the keys are stored on IBM SKLM servers. SKLM stands for Security Key Lifecycle Manager, and it acts as a secure central repository for all the keys in your data centre, whether from servers or storage. To regenerate cloud data at a site, network access to the SKLM server is required.

PS: All these thoughts are mine and do not necessarily reflect those of my employer.