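直立行走的日子 (Days of Walking Upright; target: 5%)

At two degrees north latitude, hustling for a living; even with an occasional small success, still cautious in word and deed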

Tag Archives: Cluster

Interactive jobs in SGE

Posted on 2007-09-12 by puzzlebird

SGE supports interactive jobs through qsh, qrsh, and qlogin. While qsh focuses on X11 sessions, qrsh and qlogin are used to execute interactive commands (qlogin is mostly used to start an interactive shell, whereas qrsh runs the given command and exits).
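As a rough illustration of that split, here is a minimal Python sketch that maps a use case to one of the three commands. The function name and the mode labels ("x11", "shell", "command") are my own hypothetical names, not part of SGE, and the sketch only builds the argument list; it does not contact a cluster.

```python
import shlex


def interactive_command(mode, command=None):
    """Return the argv list for an SGE interactive session.

    The mode labels are hypothetical names used only for this sketch:
      "x11"     -> qsh    (X11 terminal session)
      "shell"   -> qlogin (interactive login shell)
      "command" -> qrsh   (run one command, then exit)
    """
    if mode == "x11":
        return ["qsh"]
    if mode == "shell":
        return ["qlogin"]
    if mode == "command":
        if not command:
            raise ValueError("command mode needs a command to run")
        # Split the command string the way a POSIX shell would.
        return ["qrsh"] + shlex.split(command)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Print the command lines instead of running them.
    for mode, cmd in [("x11", None), ("shell", None), ("command", "uname -a")]:
        print(" ".join(interactive_command(mode, cmd)))
```

On a real cluster the resulting list could be handed to `subprocess.run`, assuming the SGE client tools are on the PATH.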

Posted in Technical hands-on | Tagged Cluster

Dell cluster headnode down

Posted on 2006-10-02 by puzzlebird

This is the first time the Dell cluster has gone down since I took over the Unix/Linux side. At about 10:20, I got a call from a colleague saying that the cluster head node was not accessible and did not even respond to ping. Consequently, the entire cluster was temporarily unavailable to the business.

Posted in Technical hands-on | Tagged Cluster, Linux

How Ganglia monitors and collects node info

Posted on 2006-09-25 by puzzlebird

Ganglia is one of the most widely used open source cluster monitoring tools. Its documentation page has detailed information on how to configure gmond and gmetad, but lacks an architectural overview of how Ganglia collects information from each monitored node.

Posted in Technical hands-on | Tagged Cluster, PHP

Fixed the problem where cexec in C3 hangs

Posted on 2006-08-30 by puzzlebird

We use C3 to submit the same command for execution on all the nodes. This week I found that the “cexec” command no longer works. It simply hangs after the command is issued and times out only after a very long wait.

Posted in Technical hands-on | Tagged Cluster

Again, to work with machines

Posted on 2006-08-21 by puzzlebird

Since the former Unix admin left the company, I have had to take responsibility for these Unix/Linux machines. While the company expects me to do more high-level things than work with physical machines and environments, I still need to do some basic work as long as there is nobody else to do it.

Posted in Technical hands-on | Tagged Cluster, Linux, Mac, Storage

Clarified a few confusions about LAM-MPI

Posted on 2006-04-20 by puzzlebird

I had a few misunderstandings about LAM-MPI, which were cleared up today while working on the cluster project. Referring back to my last post on MPI, the conclusion I drew there was wrong. The daemon I started can only be used by me. Every user has to start their own lamd before running a program with the R-SNOW or Rmpi package.

Posted in Technical hands-on | Tagged Cluster, Mac, PHP

Rmpi/SNOW runs jobs only on the head node, fixed

Posted on 2006-04-17 by puzzlebird

In our recent demo of the Rmpi and SNOW packages for parallel computing, the developer reported that jobs were extremely slow. The system administrator found that all the MPI jobs appeared to be running only on the head node.

Posted in Technical hands-on | Tagged Cluster, PHP

About puzzlebird

Posted on 2006-03-22 by puzzlebird

So, here is something about me.

Expertise:
HPC architecture, performance tuning (system/program profiling),
Linux/Unix, Mac OS X, Oracle, MySQL, Perl, C, shell, PHP, XML, Joomla

Posted in MY Life | Tagged Cluster, HPC, Joomla, Linux, Mac, MySQL, Oracle, Perl, PHP, SAN, XML

Prototype code to run batch jobs with SGE

Posted on 2006-03-19 by puzzlebird

A user once ran thousands of “agrep” jobs on our 20-CPU Solaris machine and took up all the CPUs. I am leading the effort to introduce cluster computing to the bioinformatics group, so this is a good chance to convert these scripts to run in an SGE cluster environment.
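One common way to keep thousands of small jobs from flooding a scheduler is to bundle them into a single SGE array job. The Python sketch below illustrates that idea only; it is not the prototype from the post. The script name `run_agrep_chunk.sh`, the job name, and the chunk size are hypothetical. The `qsub` options used (`-cwd`, `-N`, `-t`) are standard SGE options, and the sketch builds the command without submitting anything.

```python
import math


def plan_array_job(n_inputs, chunk_size,
                   script="run_agrep_chunk.sh",  # hypothetical worker script
                   name="agrep_batch"):          # hypothetical job name
    """Build a qsub argv that covers n_inputs with one array task per chunk."""
    if n_inputs <= 0 or chunk_size <= 0:
        raise ValueError("n_inputs and chunk_size must be positive")
    n_tasks = math.ceil(n_inputs / chunk_size)
    # -t 1-N asks SGE for an array job with task ids 1..N.
    return ["qsub", "-cwd", "-N", name, "-t", f"1-{n_tasks}", script]


def chunk_bounds(task_id, n_inputs, chunk_size):
    """Half-open [start, end) range of inputs for a 1-based array task id."""
    start = (task_id - 1) * chunk_size
    end = min(start + chunk_size, n_inputs)
    return start, end


if __name__ == "__main__":
    # Example: 5000 inputs in chunks of 100 inputs per task.
    print(" ".join(plan_array_job(5000, 100)))
    print(chunk_bounds(1, 5000, 100))
```

Inside the worker script, each task would read its id from the `SGE_TASK_ID` environment variable and process only its own slice of the input list.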

Posted in Technical hands-on | Tagged Cluster, Linux, Mac, Perl, SAN, Solaris

Fix “NIS account” issue using nscd

Posted on 2006-03-09 by puzzlebird

We encountered a problem with the current cluster: about 5-8% of jobs fail with the error “can not find password entry for user ‘xxx’. User may not exists or NIS error”. The error hits submitted jobs at random and only affects NIS users (local accounts work perfectly well).
 

Posted in Technical hands-on | Tagged Cluster, Mambo

About Puzzlebird

Puzzlebird manages enterprise IT for a multinational pharmaceutical company in Singapore, specializing in service management, team management, and system integration, with 12 years of working experience.

