VT: Success or failure?
The Virginia Tech supercomputer experiment generated a lot of excitement when it was announced... and it has generated a lot of controversy too. There's no question that with pizza-fueled student labour and 1100 dual Power Mac desktops (not servers), VT got a great Linpack benchmarketing score for the buck. For a mere US$7 million, they got a machine that scored in the top 3 in the world supercomputer rankings and became the first educational cluster to top 10 Tflop/s. And they did it with a new platform (Macs), with a new OS (OS X 10.2 Jaguar), and with high-performance Infiniband interconnects that didn't even have OS X drivers until then. That's a remarkable achievement.
However, it has been plagued by problems. First of all, it doesn't seem that System X (aka Big Mac) has actually been used for anything other than benchmarking and basic testing. There have been rumours of ongoing instability issues, and one wonders whether they might be related to the lack of both ECC memory and real hardware monitoring. And of course, the Power Mac based cluster no longer exists. It has long since been dismantled and sold off as high-priced refurbished equipment at various retailers. Indeed, the cluster had to be removed from the latest Top 500 supercomputer list for this reason.
Well, as of just a few days ago, the VT cluster is back. It has already been reassembled, this time with dual G5 Xserves. Again they're at 2.0 GHz, and again there are 1100 of them. This cluster solves several key problems with its support for ECC memory and hardware monitoring. It also offers significantly lower power consumption and, of course, space savings. It is undergoing testing now, and benchmarking (again) will hopefully start next week, which gives them plenty of time to get on the new supercomputer list coming in a few months. If you're in the neighbourhood, they're even going to start offering tours of the facility in a couple of weeks.
COLSA does one (or 466) better
With a healthy dose of cash from the US Army, COLSA is building a G5 supercomputer using 1566 dual 2.0 GHz Xserves, which makes it 42% larger than the VT cluster. As with the VT cluster, the G5 Xserve's biggest draws were its price/performance ratio and the fact that it's a Unix-based system. Some may note that the Linpack benchmark speeds may not be all that great, because it's using the built-in Gigabit Ethernet as the backbone, not Infiniband cards. However, it's going to be used for computational fluid dynamics and similar work, which they say is less dependent on network speed.
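For anyone who wants to check the "one (or 466) better" and "42% larger" figures, here is a minimal sketch of the arithmetic. It uses only the node counts quoted above; the function and variable names are just illustrative.

```python
# Node counts as quoted in the text (dual-processor machines per cluster).
VT_NODES = 1100
COLSA_NODES = 1566


def size_difference(base, other):
    """Return (extra nodes, percent increase) of `other` relative to `base`."""
    extra = other - base
    percent = 100.0 * extra / base
    return extra, percent


extra_nodes, percent_larger = size_difference(VT_NODES, COLSA_NODES)
print(f"{extra_nodes} more nodes, {percent_larger:.0f}% larger")
```

Note that this only compares node counts. It says nothing about relative Linpack scores, which will also depend on the interconnect.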
UCLA gets in on the action
For about a million bux, UCLA's Plasma Physics Group is reported to be getting a nice "little" cluster of 256 dual G5 Xserves. Unlike COLSA and VT, however, these guys do have some experience with Xserve clusters, since they've worked with NASA to create clusters based on the G4 Xserve. Before anyone asks why anyone would use a G4 Xserve for a cluster in the era of fast Xeons: CFD and other code may be amenable to Altivec acceleration, so despite its lousy FP performance, the G4 is often quite nice for this type of work. Similarly, Genentech seemed to love its G4 clusters for bioinformatics work, precisely because of Altivec. In fact, Apple now has a G5 Xserve based Workgroup Cluster specifically geared to bioinformatics types, complete with pre-installed Altivec-accelerated bioinformatics applications.
The University of Maine, too
The University of Maine is also creating a cluster of 256 dual G5 Xserves, also with funding ($680,000) from the US Army. It seems the US military has just as much money to throw around these days as ever, if not more. Assembly of this system has already begun, and you can follow its progress in their gallery.
Xserve sales unprecedented
This has been a remarkable year for the server group at Apple. The G5 Xserve is hugely popular, being aggressively priced, well-packaged, and apparently quite easy to use. Apple does not publish separate Xserve sales numbers, but it has long been rumoured that the G4 Xserve's unit sales were only in the few-thousand range per quarter. These few G5 supercomputers alone might just equal those sales numbers, and that's not including the large numbers of other Xserves that are being sold to small labs and small businesses. (It's rumoured that large clusters represent the minority of Xserve sales.) Too bad it has taken so long. The G5 Xserve was announced in January, but because of IBM's 90 nm growing pains, it has taken until June/July for the G5 Xserve to ship in volume. This is right at the transition from one quarter to the next, which will affect the G5 sales numbers to be announced today. Still, it looks like IBM and Apple are over the hump, and the coming quarter looks very positive for the Xserve. I would not be surprised to see five-digit unit shipments in this quarter. As for the future, Xserve sales will probably just get better in 2005. That's when OS X 10.4 Tiger comes out, which will allow true 64-bit support (with memory addressing over 4 GB per process). Plus, by then, people will have been able to see functioning G5 supercomputers in action, doing real work.
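As a rough sanity check on the idea that these few clusters could match a rumoured "few thousand" units per quarter, here is a small sketch that totals the Xserve counts mentioned in this post. The cluster sizes are the ones quoted above; the dictionary and variable names are only illustrative, and the rumoured G4 sales range itself is not something this can verify.

```python
# Dual G5 Xserve counts for each cluster, as quoted in this post.
CLUSTER_NODES = {
    "Virginia Tech": 1100,
    "COLSA": 1566,
    "UCLA": 256,
    "University of Maine": 256,
}

# Each Xserve in these clusters is a dual-processor machine.
CPUS_PER_NODE = 2

total_nodes = sum(CLUSTER_NODES.values())
total_cpus = CPUS_PER_NODE * total_nodes

print(f"Total Xserves: {total_nodes}")
print(f"Total G5 processors: {total_cpus}")
```

The total comes to a little over three thousand Xserves, which does sit in the few-thousand range, though these orders are spread over more than one quarter.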
Oh, and one more thing...
While it's not a Mac supercomputer, it's still a G5 based one. IBM has been commissioned to build a system for Spain based on 2282 dual G5 blade servers, using 2.2 GHz 970FX chips, which don't exist on the Mac side. I wonder how long it will take for 2.2 GHz G5s to make their way into Power Macs and/or Xserves. If IBM can put 2.2 GHz chips into blades, then Apple should have no problem putting them into Xserves.