
 

Phantom Job Processor Reference Manual

Notes for System Administrators

 

Typically, the PJP software takes care of itself.  Sometimes, however, situations arise where the PJP administrator needs to do some troubleshooting.  The purpose of this page is to assist the administrator who is faced with such a situation.

 

·       What if a phantom is hung?

·       Audit trail items in the PHANTOM-LOG file

·       UniVerse’s &PH& file

 

 

What if a phantom is hung?

Emails notifying the admin of a possibly hung phantom are launched by the PHANTOM.MONITOR program.  This program is run hourly by the system scheduling tool (crontab or Task Scheduler), or by the Phantom Console program on initialization and each time the (R)efresh option is chosen.

Is it really hung?

It often happens that a process looks hung temporarily, but it subsequently recovers.  Reasons for this are explored in this section from the FAQ page.  Historically, we’ve encountered 2 different situations where phantoms are really hung:

-          The phantom port has been terminated.  This can happen either because the program was demanding user input which wasn’t forthcoming from a stacked data command, or because of an environmental issue (e.g. the port was manually terminated, or the machine went down and was then re-booted).

-          A programming problem has left the process in an endless loop.

Has the phantom port terminated?

The first step in the troubleshooting process is to determine if the phantom port has terminated or if it is currently executing.  The process for this is dependent upon a combination of the operating system (unix or Windows), the database system (UniVerse or jBase), and how they are configured on your individual server.  Sometimes, you have to improvise.  Here are some hints:

 

From the unix shell prompt, the command   ps -ef | grep PHANTOM   will list all currently running processes whose command line contains the literal “PHANTOM”.  If you can’t find your hung phantom in the listing, it has terminated.
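The same check can be scripted.  What follows is a minimal sketch, not part of PJP: the function names are hypothetical, and it assumes a unix host where ps -ef is available.  Filtering the listing in Python rather than with grep also avoids the grep process matching itself.

```python
import subprocess


def find_phantom_lines(ps_output, needle="PHANTOM"):
    """Return the lines of a `ps -ef` listing that contain `needle`."""
    return [line for line in ps_output.splitlines() if needle in line]


def list_running_phantoms(needle="PHANTOM"):
    """Run `ps -ef` (unix only) and return the lines that mention `needle`."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return find_phantom_lines(result.stdout, needle)
```

An empty list from list_running_phantoms() means that no running process mentions “PHANTOM” on its command line.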

 

In the PHANTOM-STATUS file, items with the id LOGON*PHANTOM_NAME have the port in attribute 9 and the pid in attribute 10.  Depending upon the output that your system provides for the LISTU command from TCL, you may be able to locate the suspect phantom as still executing.  If you can’t find it, it’s probably safe to say that it has terminated.
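If you have exported a PHANTOM-STATUS item as raw text, the port and pid can be pulled out by position.  This is only a sketch under stated assumptions: it assumes the item arrives as a string whose attributes are separated by the attribute mark, character 254, and the function name is hypothetical.  How the item is read out of UniVerse or jBase is not shown here.

```python
AM = chr(254)  # attribute mark (assumed separator in the exported item)


def port_and_pid(record):
    """Return (port, pid) from attributes 9 and 10 of a raw item string."""
    attrs = record.split(AM)
    # Attributes are numbered from 1, so attribute 9 is index 8.
    # A missing attribute is returned as an empty string.
    port = attrs[8] if len(attrs) > 8 else ""
    pid = attrs[9] if len(attrs) > 9 else ""
    return port, pid
```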

Killing the phantom:

Once a phantom has been labeled as hung, the (K)ill option becomes available for that phantom on the main screen of the phantom console program.  The internal protocol and capabilities of the kill process are dependent upon your host’s operating system / database system configuration. 

 

At the very least, killing a phantom will clear the internal PJP database and allow for the phantom to be re-launched.  Sometimes, if the phantom was still executing, killing it from the phantom console will terminate the process at the operating system level.  However, this is not a certainty.  Therefore, if the process was still executing, it is important to go back and make sure that it is no longer executing prior to re-launching the phantom.  Depending upon which programs are being executed by the phantom, it is sometimes essential that there not be 2 iterations of the same phantom executing concurrently.  Please don’t allow this to happen!
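One way to confirm that the old process is really gone before re-launching is to probe its pid.  The sketch below is an illustration only, with a hypothetical function name.  It is for unix hosts only: on Windows, signal number 0 has a different meaning, so do not use it there.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this pid currently exists (unix only)."""
    try:
        # Signal 0 delivers nothing; the call only checks that the pid exists.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True
```

If pid_is_running() still returns True for the pid recorded against the phantom, terminate that process at the operating system level before re-launching.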

Why was it hung?

If, in post-mortem, you haven’t been able to identify a system environmental reason for the phantom to have become hung, you must suspect a problem with the program which was being executed.  The easiest way to troubleshoot that is to disable the process temporarily (with a cycle of “X”) and run it from TCL.  More often than not, the problem becomes immediately evident using this strategy.

 

Audit trail items in the PHANTOM-LOG file:

There are 3 special items in the PHANTOM-LOG file, providing an audit trail of the last 499 events.  They are arranged in reverse chronological order.  Each time a new event is recorded, the 500th event is deleted, keeping each item from growing inordinately large.
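The newest-first, capped behaviour can be pictured with a short sketch.  This is not how PJP stores the log (PJP keeps the events inside the item itself); it only illustrates the rule that a new event goes to the front and anything beyond the 499th is dropped.  The names are hypothetical.

```python
from collections import deque

MAX_EVENTS = 499  # number of events retained, as described above


def new_log():
    """Create an empty audit trail that keeps at most MAX_EVENTS entries."""
    return deque(maxlen=MAX_EVENTS)


def record_event(log, event):
    """Insert the newest event at the front; the oldest falls off the end."""
    log.appendleft(event)
```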

EMAIL-LOG:

The program PHANTOM.SEND.EMAIL is our generic program for issuing a system command to send an email.  It is possible that during PJP installation we found that we needed a customized version of that program to comply with your unique email system requirements, in which case you will find a program named PHANTOM.SEND.EMAIL.CUSTOM in the PL-CCI file.  In either case, each time an email command is sent to your system, the time and date, the content of the email, and the text of the system command will be combined and inserted into attribute 1 of the EMAIL-LOG item.

 

One of the most common support questions is, “Why didn’t I get an email?”  Using this audit trail, you can determine whether or not PJP sent the email to your operating system software.

MONITOR-LOG:

The program PHANTOM.MONITOR is run every hour by your operating system scheduler software, and it is also run each time the Phantom Console program is initialized and each time the (R)efresh option is executed.  Each time the PHANTOM.MONITOR program is executed, it records the time of execution.  Additionally, if it recognizes a phantom as being hung, it records that event along with a brief explanation of why it considered the phantom to be hung.

 

When you are investigating a hung phantom, it is sometimes helpful to refer to the MONITOR-LOG item to find out why it was identified as being hung.

RESUME-LOG:

The program PHANTOM.RESUME is run every hour by your operating system scheduler software, once per logon.  Each time it is run, as part of its initialization, it updates this item with the date, time, and logon for which it is being run.  If it performs its intended function of launching a phantom which had been stopped as part of a planned stoppage, it will also note that event.

 

UniVerse’s &PH& file:

UniVerse has a file called &PH& which stores all terminal output from phantom processes.  In theory, it would be a great tool to find error messages which may have led to a phantom becoming hung.  Unfortunately, most systems have a garbage collecting process in place which deletes &PH& items after they have become a couple of days old.  In practice, most PJP phantoms run longer than their &PH& items are allowed to survive by the clean-up process.  The item-id of the &PH& file is command_time_date.  For PJP phantoms, the &PH& item-id will look like this: PHANTOM.DRIVER_time_date.  The first line of this item will contain the readable time/date, the logon/phantom name, and the message “Phantom starting”.
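When scanning &PH& for a particular phantom run, it can help to turn the item-id into a readable start time.  The sketch below is an illustration with a hypothetical function name.  It assumes that the time and date portions of the id are in internal format, that is, seconds since midnight and days since 31 December 1967.  Verify this against a known item on your own system before relying on it.

```python
from datetime import date, datetime, time, timedelta

DAY_ZERO = date(1967, 12, 31)  # assumed day 0 of the internal date format


def parse_ph_id(item_id):
    """Split a command_time_date item-id into (command, start datetime)."""
    # Split from the right, in case the command itself contains underscores.
    command, time_part, date_part = item_id.rsplit("_", 2)
    started_on = DAY_ZERO + timedelta(days=int(date_part))
    seconds = int(time_part)
    started_at = time(seconds // 3600, (seconds % 3600) // 60, seconds % 60)
    return command, datetime.combine(started_on, started_at)
```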

 

 

 


 

 

Copyright 2008, Cubs Consulting, Inc.