Kamran Agayev's Oracle Blog

Oracle Certified Master

CPU usage raised to 100% because of dbresp.pl

Posted by Kamran Agayev A. on January 11th, 2011

Today I got a call from a friend who reported performance degradation on one of his production databases. When connecting with SQL*Plus or RMAN I noticed a delay, so I ran the “top” command and checked the running processes on the system. When I ran ps -ef, I saw hundreds of perl processes running on the system:

[sourcecode]oracle   15560     1  3 Jan11 ?        05:50:07 /opt/oracle/product/10.2/db_1/perl/bin/perl /opt/oracle/product/10.2/db_1/sysman/admin/scripts/db/dbresp.pl
oracle   16309     1  3 Jan11 ?        05:44:53 /opt/oracle/product/10.2/db_1/perl/bin/perl /opt/oracle/product/10.2/db_1/sysman/admin/scripts/db/dbresp.pl
…..
…..[/sourcecode]
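To put a number on "hundreds", a quick count helps. This is only a minimal sketch: count_dbresp is a name I made up for illustration, and it reads a ps -ef style listing from standard input so it can also be tried on saved output.

```shell
#!/bin/bash
# Hypothetical helper (not part of Oracle or EM): count the lines of a
# `ps -ef` style listing on stdin that mention dbresp.pl.
count_dbresp() {
    # grep -c prints the number of matching lines; "|| true" keeps a
    # count of zero from being reported as a failure.
    grep -c 'dbresp\.pl' || true
}

# Live usage (the extra grep drops the grep processes from the listing):
#   ps -ef | grep -v grep | count_dbresp
```

A count that keeps growing between two runs is a good hint that the processes are never finishing.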

As the dbresp.pl file is located under the sysman folder, I concluded that it was related to EM, so I checked the EM agent trace file:

[sourcecode]tail -50 emagent.trc | more

2011-01-11 08:51:37 Thread-4096777120 ERROR fetchlets.oslinetok: Metric execution timed out in 600 seconds
2011-01-11 08:51:37 Thread-4096777120 ERROR command: failed to kill process 24963 running perl: (errno=3: No such process)
2011-01-11 08:51:37 Thread-4096777120 ERROR engine: [oracle_database,prod_db,Response] : nmeegd_GetMetricData failed : Metric execution timed out in 600 seconds
2011-01-11 09:06:37 Thread-4113513376 ERROR fetchlets.oslinetok: Metric execution timed out in 600 seconds
2011-01-11 09:06:37 Thread-4113513376 ERROR command: failed to kill process 25393 running perl: (errno=3: No such process)
2011-01-11 09:06:37 Thread-4113513376 ERROR engine: [oracle_database,prod_db,Response] : nmeegd_GetMetricData failed : Metric execution timed out in 600 seconds
2011-01-11 09:21:37 Thread-4096777120 ERROR fetchlets.oslinetok: Metric execution timed out in 600 seconds
2011-01-11 09:21:37 Thread-4096777120 ERROR command: failed to kill process 26068 running perl: (errno=3: No such process)
2011-01-11 09:21:37 Thread-4096777120 ERROR engine: [oracle_database,prod_db,Response] : nmeegd_GetMetricData failed : Metric execution timed out in 600 seconds
2011-01-11 09:36:37 Thread-4099926944 ERROR fetchlets.oslinetok: Metric execution timed out in 600 seconds[/sourcecode]
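When the trace file is long, it can help to count the failures per target and metric instead of reading them one by one. The following is a rough sketch that assumes the "ERROR engine" lines look exactly like the ones in the excerpt above; timeouts_by_metric is a hypothetical name, not an EM utility.

```shell
#!/bin/bash
# Hypothetical helper: summarise nmeegd_GetMetricData failures in an agent
# trace file, grouped by the [target_type,target_name,metric] tag.
# Assumes the "ERROR engine" line format shown in the trace excerpt.
timeouts_by_metric() {
    # Keep only the bracketed tag of each failed collection, then count
    # how often each tag occurs. The final awk trims uniq's padding.
    sed -n 's/.*ERROR engine: \(\[[^]]*\]\) : nmeegd_GetMetricData failed.*/\1/p' "$1" \
        | sort | uniq -c | awk '{print $1, $2}'
}

# Usage:
#   timeouts_by_metric emagent.trc
```

Each output line is a count followed by the tag, so a single metric that times out again and again stands out immediately.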

Wow… Interesting output. I checked Metalink and found the following note: Server Has 100% Of Cpu Because Of Dbresp.pl [ID 764140.1]

Unfortunately, as a solution the note advised me to refer to another Metalink note, “Ext/Mod Problem Performance Agent High CPU Consumption Gen”, which says to change the alert.log file name to solve the issue. That wasn’t a real solution, so I decided to take down EM and kill all the processes:

[sourcecode]emctl stop dbconsole[/sourcecode]

Then I ran the following command to list all the dbresp.pl processes and generate a script that kills them all :)

[sourcecode]ps -ef | grep dbresp.pl | awk '{print "kill -9 " $2}' > kill.sh

more kill.sh
kill -9 23989
kill -9 24569
kill -9 25145
kill -9 25723
…..
…..[/sourcecode]

Next, I made it executable and ran it:

[sourcecode]oracle@host:~> chmod 755 kill.sh
oracle@host:~> ./kill.sh
oracle@host:~>
oracle@host:~> ps -ef | grep dbresp
oracle   32454 29520  0 10:48 pts/0    00:00:00 grep dbresp [/sourcecode]
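One detail of the pipeline above is that the grep process itself matches "dbresp.pl", so its PID also ends up in kill.sh (harmless, since that process is gone by the time the script runs). A slightly tidier variant could look like the sketch below; dbresp_pids is a name I invented for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read a `ps -ef` style listing on stdin and print
# the PIDs (second column) of dbresp.pl processes, skipping any grep line.
dbresp_pids() {
    awk '/dbresp\.pl/ && !/grep/ {print $2}'
}

# Live usage, killing each PID in turn without a temporary script:
#   ps -ef | dbresp_pids | while read -r pid; do kill -9 "$pid"; done
```

On systems that ship the procps tools, pgrep -f dbresp.pl should give a similar PID list.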

After killing all unnecessary processes, CPU usage went down.

To deal with this bug, you can set up a cron job that checks the number of running dbresp.pl processes and, when there are too many, takes down EM, kills all of them and starts EM again.
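Such a cron script could be sketched roughly as follows. Treat it as an untested outline rather than a finished tool: the threshold of 20 and all the function names are my own assumptions, and it expects emctl to be on the PATH of the oracle user with the usual environment set.

```shell
#!/bin/bash
# Hypothetical limit: how many dbresp.pl processes are tolerated.
MAX_DBRESP=${MAX_DBRESP:-20}

# Succeeds (exit status 0) when the observed count exceeds the limit.
too_many() {
    [ "$1" -gt "$2" ]
}

# Count running dbresp.pl processes; the [d] keeps grep from matching itself.
count_running() {
    ps -ef | grep -c '[d]bresp\.pl' || true
}

# Same steps as in the post: stop EM, kill the leftovers, start EM again.
restart_em() {
    emctl stop dbconsole
    ps -ef | grep '[d]bresp\.pl' | awk '{print $2}' \
        | while read -r pid; do kill -9 "$pid"; done
    emctl start dbconsole
}

main() {
    if too_many "$(count_running)" "$MAX_DBRESP"; then
        restart_em
    fi
}

# Uncomment when saving this as the script that cron runs:
# main
```

Keeping the check in its own small function makes it easy to try the threshold logic without touching EM at all.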

If you have another solution, please let me know :)

9 Responses to “CPU usage raised to 100% because of dbresp.pl”

  1. Shamil Mehdi Says:

    Very helpful post thank you!

  2. orawiss Says:

    Hi Kamran,
    I faced the same problem a few months ago in a 10gR2 database. I did the same: I killed all the EM processes after shutting down EM.
    This is the only solution I found,
    Cheers,
    Wissem

  3. chford Says:

    I, too, had this issue in a 10gR2 db. Found the same exact metalink document and did exactly what you did Kamran. But thanks for reminding me of that day! 😉

  4. Ivan Kartik Says:

    I’m just curious what kill.sh is for, as you can use ps -ef | grep dbresp.pl | awk '{print "kill -9 " $2}' | bash (or another shell) to do the same job 😉

  5. Kamran Agayev A. Says:

    Haha, you’re right Ivan :) That’s what being a Linux guru is 😉

  6. Ahamed Says:

    Thanks!

  7. Javeed Says:

    It didn’t work for me on Windows. I stopped emctl and killed all the EM processes, but Oracle is still using 100% CPU. I also created a new alert.log, but that was no use; it is still the same.

  8. PAS Says:

    For us the 100% CPU from dbresp.pl was caused by the TNS Listener hanging. See: http://arjudba.blogspot.ie/2009/01/listener-hangs-child-listener-process.html
    To resolve: kill process (dbresp.pl), stop listener, start listener.

  9. Oracle 10g Linux – 100% CPU par des scripts perl | Rakams Blog Info Says:

    […] CPU usage raised to 100% because of dbresp.pl Oracle – 100% CPU usage on Linux caused by Perl This entry was posted in Uncategorized. You can bookmark it with this permalink. ← Bookmark – Networks and Telecom […]
