Techworld - the UK's infrastructure and network knowledge centre

June 09, 2006

Isilon on other approaches

A terrible kluge

By Chris Mellor

Rob Anderson is Isilon's consultancy manager for EMEA. He came to this position after helping to write Isilon's operating software, and holds a couple of software patents as a result of that work. He knows, as we might say, whereof he speaks.

Isilon's idea of a highly-performant NAS system is to cluster NAS nodes together using Infiniband and to have nodes function as peers in producing a virtualised and automatically load-balancing single pool of NAS storage and processing resource.

The company reckons it has built the first industrial-strength clustered storage system in existence. In comparison, it says, IBM's GFS is slow, lacks industrial-strength reliability and is hardly ever seen in the marketplace.

Isilon puts some of its performance advantage down to its use of Infiniband for clustering nodes together. All back-end traffic between the nodes runs over Infiniband, which has one-twentieth the latency of Ethernet, and Infiniband's roadmap will deliver bandwidth increases much faster than Isilon nodes (currently maxed out at 88 per cluster) can soak up.

Clearly inter-nodal traffic could choke Infiniband if there were enough nodes in an Isilon cluster. A single file request from an accessing server is handled by one node, but files and their associated parity data are striped across nodes. So, to deliver a file, the receiving node has to fetch the pieces, the stripes, from the other nodes. Have 88 nodes each fetching stripes from up to nine nodes simultaneously and you have a lot of Infiniband traffic. But, Rob Anderson says, nowhere near enough to even begin to throttle performance. Isilon clusters, he asserts, could handle "up to 300,000 concurrent dial-up connections."
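The fetch-and-reassemble flow described above can be sketched as a toy model. The function names and round-robin placement are hypothetical, for illustration only; this is not Isilon's OneFS code:

```python
# Toy model of striping: each file is split into stripe units spread
# across several nodes, and the node that receives a read request
# gathers the units back and reassembles them in order.

def write_striped(data: bytes, nodes: list[dict], stripe_size: int = 4):
    """Round-robin the file's stripe units across the nodes."""
    for i in range(0, len(data), stripe_size):
        unit_index = i // stripe_size
        node = nodes[unit_index % len(nodes)]
        node[unit_index] = data[i:i + stripe_size]

def read_striped(nodes: list[dict]) -> bytes:
    """Gather stripe units from every node and reassemble in order."""
    units = {}
    for node in nodes:
        units.update(node)
    return b"".join(units[i] for i in sorted(units))

nodes = [{} for _ in range(3)]          # three storage nodes
write_striped(b"ABCDEFGHIJKL", nodes)   # 12 bytes -> 3 units of 4
print(read_striped(nodes))              # b'ABCDEFGHIJKL'
```

The point of the model is the traffic pattern: every read fans out over the back-end interconnect, which is why the cluster interconnect's latency and bandwidth matter so much.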

"The Infiniband bandwidth roadmap is orders of magnitude bigger than what you would get from an EMC SAN or whatever."

Gigabit Ethernet clustering just doesn't give you the performance and future headroom that Infiniband does.

Anderson says: "NetApp uses Infiniband to cluster two nodes. When NetApp bought Spinnaker it made a mistake. It tried to add features from the Spinnaker product into ONTAP. But clustering can't be done that way; it has to be in the DNA of the system. NetApp's approach didn't work. Two years ago NetApp reversed direction. Dave Hitz (NetApp founder) announced that Data ONTAP GX is a Spinnaker foundation with NetApp features added to it."

Anderson added: "(Data ONTAP GX) is namespace organisation. It's not clustering. It's RAID behind the veil and can still take eight hours to rebuild a disk. There'll be performance problems downstream. It's a bandaid. It's a total kluge."

With Isilon, file data and parity data are striped across up to nine nodes. A failed disk can be rebuilt in 30 minutes to an hour. In effect, Isilon's striping technology renders RAID redundant.
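The rebuild idea can be illustrated with the simplest possible parity scheme, XOR across stripe units, which protects against one failure. This is a minimal sketch under that assumption; Isilon's actual protection scheme is richer than plain XOR parity:

```python
# Single-failure parity sketch: the parity unit is the XOR of the data
# units, so any one missing unit can be recomputed from the survivors.

from functools import reduce

def xor_parity(units: list[bytes]) -> bytes:
    """XOR a list of equal-length byte strings together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), units)

data_units = [b"AAAA", b"BBBB", b"CCCC"]   # stripes on three nodes
parity = xor_parity(data_units)            # stored on a fourth node

# The node holding unit 1 fails; rebuild it from survivors plus parity.
survivors = [data_units[0], data_units[2], parity]
rebuilt = xor_parity(survivors)
print(rebuilt == data_units[1])  # True
```

Because only the failed disk's stripe units need recomputing, and every surviving node can contribute in parallel, rebuild time scales with the amount of live data rather than raw disk size, which is the argument behind the 30-minutes-to-an-hour figure.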

Anderson says suppliers like Acopia "do it in the switch layer. It's not rich, it's lightweight." Again, he says, there will be performance problems downstream.

A virtualised pool of NAS resource requires the NAS nodes to be clustered for smooth performance scaling. It also requires N + 2 protection so that the system can recover from two failed disks and not just one. (NetApp's RAID DP provides protection against two disk failures.)

Isilon is working on N + 3 and N + 4 protection. The N + 1 and N + 2 protection schemes can apply to nodes, to folders, even to individual files.
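The claim that protection levels can apply to nodes, folders or individual files amounts to a per-path policy lookup, where the most specific setting wins. The following sketch is hypothetical; the paths, defaults and policy names are illustrative, not Isilon's actual configuration interface:

```python
# Per-path protection policy resolution (hypothetical model): look for
# a setting on the file itself, then walk up the folder tree, falling
# back to a cluster-wide default at the root.

import posixpath

policies = {
    "/": "N+2",                # cluster-wide default
    "/archive": "N+1",         # folder override
    "/archive/keep.db": "N+2", # per-file override
}

def protection_for(path: str) -> str:
    """Return the most specific protection level covering `path`."""
    while True:
        if path in policies:
            return policies[path]
        if path == "/":
            return "N+1"       # illustrative fallback
        path = posixpath.dirname(path)

print(protection_for("/archive/old.log"))   # N+1 (folder rule)
print(protection_for("/archive/keep.db"))   # N+2 (file rule)
print(protection_for("/home/user/a.txt"))   # N+2 (cluster default)
```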

In the Isilon scheme we can conceive of the nodes as water tanks connected by a pipe (the Infiniband link). When a new (and empty) node is added to the cluster, the water finds a fresh level across all the tanks. In the same way, the data stored on the existing nodes is spread out across the now-expanded cluster so that all nodes have the same data occupancy: automatic load-balancing of data storage.
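The water-tank analogy can be reduced to a few lines. This is a sketch of the levelling arithmetic only, not of how data actually migrates between nodes:

```python
# Water-tank rebalance sketch: when an empty node joins, data migrates
# until every node holds (as near as possible) the same amount.

def rebalance(occupancy: list[int]) -> list[int]:
    """Level the stored units across nodes, like water finding its level."""
    total = sum(occupancy)
    n = len(occupancy)
    base, extra = divmod(total, n)
    # the first `extra` nodes hold one more unit than the rest
    return [base + 1 if i < extra else base for i in range(n)]

cluster = [90, 90, 90]
cluster.append(0)           # add a new, empty node
print(rebalance(cluster))   # [68, 68, 67, 67]
```

Note that the total stays constant; only its distribution changes, which is what keeps per-node utilisation even as the cluster grows.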

Isilon also says that adding nodes increases I/O performance in the same way as adding lanes to a motorway or runways to an airport does. For added processing performance you can add processor-only nodes. For added capacity scaling you can add disk expansion units.

In Isilon's view it is a simple, clean design that is correspondingly simple to manage, and very reliable and performant in use.
