<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://www.communities.hp.com/online/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>HPC Cluster Edge Blog - All Comments</title><link>http://www.communities.hp.com/online/blogs/hpcclusteredge/default.aspx</link><description>Read the HPC Cluster Edge Blog and learn more about high performance and cluster computing at HP Communities.</description><dc:language>en</dc:language><generator>CommunityServer 2008.5 SP1 (Build: 31106.3070)</generator><item><title>re: SLURM: a Simple Resource Manager no more!</title><link>http://www.communities.hp.com/online/blogs/hpcclusteredge/archive/2009/06/08/slurm-a-simple-resource-manager-no-more.aspx#115457</link><pubDate>Mon, 21 Sep 2009 09:05:11 GMT</pubDate><guid isPermaLink="false">964d1d0f-bea0-4201-a2aa-8aa369a35a46:115457</guid><dc:creator>jeux</dc:creator><description>&lt;p&gt;SLURM is an open-source resource manager designed for Linux clusters of all sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.communities.hp.com/online/aggbug.aspx?PostID=115457" width="1" height="1"&gt;</description></item><item><title>re: Interconnect technologies:  InfiniBand, 1G Ethernet, 10G Ethernet</title><link>http://www.communities.hp.com/online/blogs/hpcclusteredge/archive/2009/07/02/interconnect-technologies-infiniband-1g-ethernet-10g-ethernet.aspx#115227</link><pubDate>Sat, 19 Sep 2009 10:11:14 GMT</pubDate><guid isPermaLink="false">964d1d0f-bea0-4201-a2aa-8aa369a35a46:115227</guid><dc:creator>disque dure externe</dc:creator><description>&lt;p&gt;Of late, a vast number of interconnect technologies such as InfiniBand, Myrinet and Quadrics have been introduced into the System-Area Network (SAN) environment, where the primary driving requirements are high performance and a feature-rich interface. Ethernet, on the other hand, is already the ubiquitous technology for Wide-Area Network (WAN) environments. Traditionally, SAN technologies have been shut off from the WAN environment because they are incompatible with the existing Ethernet-compatible infrastructure. Similarly, Ethernet has traditionally not been considered a SAN interconnect because of its close to an order-of-magnitude performance gap compared to other SAN interconnects such as InfiniBand, Myrinet and Quadrics (informally called Ethernot networks).&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.communities.hp.com/online/aggbug.aspx?PostID=115227" width="1" height="1"&gt;</description></item><item><title>re: Do you suffer from cluster monitoring schizophrenia?</title><link>http://www.communities.hp.com/online/blogs/hpcclusteredge/archive/2009/05/28/do-you-suffer-from-cluster-monitoring-schizophrenia.aspx#91896</link><pubDate>Fri, 29 May 2009 10:50:17 GMT</pubDate><guid isPermaLink="false">964d1d0f-bea0-4201-a2aa-8aa369a35a46:91896</guid><dc:creator>mark.seger@hp.com</dc:creator><description>&lt;p&gt;Just to follow up on my original post, I realized that by focusing on central vs local data collection I missed raising an even higher-level decision that must be made: the purpose of the monitoring. &amp;nbsp;Many people are simply looking for a high-level view of the overall state of their cluster&amp;#39;s health, such as how many systems might be down, or whether any systems are idle or perhaps not running close to 100% CPU utilization - remember that in HPC, 100% is typically a good thing. &amp;nbsp;There are certainly other things that can be identified in this manner as well, such as memory utilization, or networks or storage systems running at near capacity. This type of monitoring might also be useful for longer-term capacity planning. &amp;nbsp;All worthwhile causes, and probably all met with a central monitoring approach.&lt;/p&gt;
&lt;p&gt;But my next question then becomes: what do you do when your central management system shows systems crashing or not running at their peak? &amp;nbsp;Or any of a multitude of other problems? &amp;nbsp;Since you&amp;#39;ve been collecting all this data in a central location, surely the answer must be in there and all you&amp;#39;d need to do is look for it. In fact, there is often enough data in there to help identify the problem.&lt;/p&gt;
&lt;p&gt;However, I would claim that in many cases there isn&amp;#39;t enough data since you had to trade off the level of detail being collected in favor of central collection. &amp;nbsp;This is where I believe local data collection can come to your rescue.&lt;/p&gt;
&lt;p&gt;-mark&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.communities.hp.com/online/aggbug.aspx?PostID=91896" width="1" height="1"&gt;</description></item><item><title>re: HPC: The Innovator’s Great Divide?</title><link>http://www.communities.hp.com/online/blogs/hpcclusteredge/archive/2009/05/24/hpc-the-innovator-s-great-divide.aspx#91844</link><pubDate>Wed, 27 May 2009 19:11:09 GMT</pubDate><guid isPermaLink="false">964d1d0f-bea0-4201-a2aa-8aa369a35a46:91844</guid><dc:creator>guodong.zhang@hp.com</dc:creator><description>&lt;p&gt;Speaking of how High Performance Computing (HPC) helps the competitiveness of corporations, there is a recent article on Forbes.com titled &amp;quot;American Business&amp;#39;s Secret Competitive Weapon: HPC&amp;quot; by Matthew Faraci that tells compelling stories of how companies can take advantage of HPC to cut costs, improve product designs, and reduce time to market. &amp;nbsp;A very interesting article, and highly recommended:&lt;/p&gt;
&lt;p&gt;&lt;a rel="nofollow" target="_new" href="http://www.forbes.com/2009/05/22/high-performance-computing-leadership-managing-hpc.html"&gt;www.forbes.com/.../high-performance-computing-leadership-managing-hpc.html&lt;/a&gt;&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://www.communities.hp.com/online/aggbug.aspx?PostID=91844" width="1" height="1"&gt;</description></item></channel></rss>