About this blog series - This is the first post in a series describing the experiences of engineers who test the performance of HPC servers and server clusters at HP.
My name is Dave Field. I lead an engineering group at HP - we measure the performance of new HP servers. In addition to the common industry-standard benchmarks, we concentrate on the performance of real HPC ISV applications. In the 20+ years we have done this work, we have seen many server architectures. These days, HPC clusters of servers using multi-core processors occupy most of our energy.
We evaluate the performance of new server products, so receiving a new server model is a common occurrence. This has been an especially rich year for new products - this is the 14th new HP server we've tested this year, with at least one more to go before the year is over. HP servers for HPC span the range of industry-standard processors - Intel Xeon, Intel Itanium 2, and AMD Opteron. (In HP terminology, the processor is the physical component which plugs into the system board. A processor contains one or more cores, or CPUs.) And for each processor type, there are specific models with different architectural features.
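To make the processor/core distinction concrete, here is a minimal Python sketch that counts processors (sockets) and cores from text in the format of Linux's /proc/cpuinfo. It is only an illustration, not a tool our group uses: the function name and sample data are made up for this example, and it assumes the x86-style "physical id" and "core id" fields, which not every architecture or kernel reports.

```python
def count_processors_and_cores(cpuinfo_text):
    """Return (processors, cores) from /proc/cpuinfo-style text.

    Assumes each logical-CPU stanza lists "physical id" before "core id",
    as x86 Linux kernels typically do. Hypothetical helper for illustration.
    """
    sockets = set()
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between stanzas
        key = key.strip()
        value = value.strip()
        if key == "physical id":
            physical_id = value
            sockets.add(value)
        elif key == "core id":
            # A core is identified by its socket plus its id within that socket.
            cores.add((physical_id, value))
    return len(sockets), len(cores)


# Made-up sample: two processors with two cores each.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 1
core id : 0

processor : 3
physical id : 1
core id : 1
"""

if __name__ == "__main__":
    processors, cores = count_processors_and_cores(SAMPLE)
    print(f"{processors} processors, {cores} cores")
```

On a running Linux system, the same function could be fed the contents of /proc/cpuinfo instead of the sample text, provided the kernel reports those two fields.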
Since we test pre-production, or prototype, computers, it's not quite true that I received a server - we usually receive new product kits. Testing new products can be very interesting, but to get to the interesting part, there are inevitably a number of problems to solve. We need to turn the kit into a working computer, then ensure that the performance meets the product specs, before we can do meaningful performance evaluation. These initial steps are lessons in patience and expectation-setting, during which I always meet some new people who will help in problem-solving.
The new server kit usually contains the server enclosure, system board, and processors. To turn the kit into a computer, we need three more layers - supporting hardware (the right DIMMs, network interfaces, and disks), firmware, and an operating system.
Firmware is in flux during the pre-production period, and each version of pre-production firmware changes the server's performance. Usually the processors are pre-production versions, tied to specific firmware revs. Most of the performance data collected on these early versions will be discarded. But if we don't get some measurements now, we can't influence the product. Sometimes we identify performance issues which can be fixed before production release - so this is a very satisfying part of the job.
These days, the current versions of the major Linux distributions work out-of-the-box on new server models.
When the operating system boots, we can begin to measure performance!
Posted 09-23-2008 7:16 PM by d-field