5 January 2002: Using SRE2003 daemons

A description of SRE2003 procedures for implementing daemons and data
structures.

Contents:
   1. Introduction
   2. Overview
      2.1. Synopsis of SRE_DMN_ procedures (including typical return)
   3. Detailed Description of the SRE_DMN procedures
      3.1. SRE_DMN_LAUNCH
      3.2. SRE_DMN_OWN
      3.3. SRE_DMN_ASK
      3.4. SRE_DMN_LISTEN
      3.5. SRE_DMN_TIMELEFT
      3.6. SRE_DMN_RESPOND
      3.7. SRE_DMN_COMMAND
      3.8. SRE_DMN_KILL
   4.0. Flags and Queues
      4.1. SRE2003 Flags
      4.2. SRE2003 Queues

                ------------------
1. Introduction

SRE2003 makes extensive use of "daemons". Daemons are threads running in
parallel to the main program. Some daemons are transient (such as request
and transaction daemons), while others are permanent (such as daemons that
handle caching and auditing).

In all cases, SRE2003 uses the SRE2003 daemon manager procedures to handle
the creation, completion, and (most importantly) the communications between
daemons. This document describes these procedures.

In addition to "daemons", SRE supports "flags" and "queues". These are
described in section 4.

                ------------------
2. Overview

The SRE_DMN procedures:
   SRE_DMN_launch   -- launch a daemon
   SRE_DMN_own      -- create identifier information
   SRE_DMN_ask      -- used to send a request to the daemon
   SRE_DMN_listen   -- used by a daemon to wait for a request
   SRE_DMN_respond  -- used by the daemon to reply to a request
   SRE_DMN_timeleft -- used by a daemon to determine max wait time remaining
   SRE_DMN_command  -- send a command to a daemon

All of these are written in REXX, and all of them are meant to be used as
procedures called by REXX programs and procedures.

The basic strategy is as follows:

Let's assume you have a "daemon" called MONITOR1 (say, a daemon that
monitors the status of some resource). You also have your main program.

 1) In your main program, launch the MONITOR1 daemon using SRE_DMN_LAUNCH.

 2) After whatever initializations it needs to do, MONITOR1 should start
    "listening" for requests.
    This is done using calls to SRE_DMN_LISTEN. Typically, this is done in
    a loop, so that after a finite wait with no request arriving (say,
    1 minute), MONITOR1 might do some housekeeping and then listen again.

 3) The main program "requests" information by using SRE_DMN_ASK to send a
    request to the daemon. Typically, a finite waiting time is given, so
    that if the daemon does not respond, some other action can be taken.
    The response is returned as the value of the SRE_DMN_ASK function.

    Thus, FROM THE POINT OF VIEW OF THE MAIN PROGRAM, these daemons are
    "called" through the intermediary of SRE_DMN_ASK.

    There is one extra step: the main program should set up an "own_id"
    that is used in calls to SRE_DMN_ASK. This is done by a call to
    SRE_DMN_OWN.

 4) When a request is issued (when the main program calls SRE_DMN_ASK), the
    SRE_DMN_LISTEN function will return the request info to the daemon.
    The daemon then uses this info to form a response, just as if the
    daemon were called as a procedure. In a sense, the information returned
    by SRE_DMN_LISTEN is used instead of the value of a "parse arg".

    Instead of using "return" to send this response back to the main
    program, the daemon uses SRE_DMN_RESPOND.

    There is one optional step: the daemon can call SRE_DMN_TIMELEFT to be
    sure that the client is still waiting for a response.
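The four steps above can be sketched as a pair of REXX programs. This is an
illustration only, not code taken from SRE2003: the file name monitor1.cmd,
the 'STATUS' request string, the MAIN_1 id, and the housekeeping routine are
all hypothetical, and the calls simply follow the argument order given in the
synopsis (section 2.1). The main program might look like this:

```rexx
/* MAIN.CMD -- main program (sketch; names are hypothetical) */
launched = sre_dmn_launch('MONITOR1', 'monitor1.cmd')
if abbrev(launched, 'error') then do
   say 'Could not launch MONITOR1:' launched
   exit 1
end
parse var launched thread_id ',' daemon_id

own_id = sre_dmn_own('MAIN_1')           /* MAIN_1 is an assumed extra1 id */
if own_id = '' then do
   say 'Could not create an own_id'
   exit 1
end

/* wait up to 2 seconds; the "E" asks for a two-part response */
reply = sre_dmn_ask(daemon_id, 'STATUS', '2000 E', own_id)
select
   when abbrev(reply, 'error') then say 'Request failed:' reply
   when reply = '' then say 'No response from MONITOR1'
   otherwise do
      parse var reply len ',' answer
      say 'MONITOR1 says:' answer
   end
end
```

And the daemon, assuming it only needs to echo the request back:

```rexx
/* MONITOR1.CMD -- daemon (sketch; names are hypothetical) */
parse arg daemon_id, p1                  /* daemon_id is the first argument */

do forever
   req = sre_dmn_listen(daemon_id, 60000)     /* listen for up to 1 minute */
   if req = '' then do                        /* timed out, no request     */
      call housekeeping
      iterate
   end
   if abbrev(req, 'error') then leave
   parse var req astamp ',' param
   if strip(astamp) = 'EXIT' then leave       /* asked to exit asap        */

   answer = 'OK' param                        /* form a response           */
   if sre_dmn_timeleft(astamp) = '' then iterate   /* client gave up       */
   ok = sre_dmn_respond(astamp, answer)
   if ok <> 1 then say 'Respond failed:' ok
end
exit 0

housekeeping: procedure
   /* periodic work would go here */
   return
```

The timeout handling in two-part mode is an assumption: this document only
spells out the error and success returns, so the sketch treats a blank reply
as "no response".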
Note: the SRE_DMN_COMMAND procedure is for communicating control information
      between SRE_DMN functions.

                ------------------
2.1 Synopsis of SRE_DMN_ procedures (including typical return)

   STUF=SRE_DMN_Launch(daemon_name,daemon_file,[verify,p1,...,p12])
        returns thread_id ',' daemon_id
        or, if verify=1, returns thread_id ',' daemon_id ',' verify_result

   STUF=SRE_DMN_own([extra1,justsay])
        returns client_id

   STUF=SRE_DMN_ask(daemon_name,param,waitmsec,[ownid,syscommand])
        returns length_response ',' response

   STUF=SRE_DMN_listen(daemon_id,max_semwait)
        if request found, returns astamp ',' param
        otherwise ' '

   STUF=SRE_DMN_timeleft(astamp)
        returns ' ' if no time left, else 1/100th seconds left

   STUF=SRE_DMN_respond(astamp,results)
        returns 1, or an 'error ' message

   STUF=SRE_DMN_command(daemon_name,cmd_name,msec)
        used for inter-daemon communication

                ------------------
3. Detailed Description of the SRE_DMN procedures

3.1 SRE_DMN_launch:

Launch a daemon in a separate thread, and return a daemon id.

   stuf=SRE_DMN_Launch(daemon_name,daemon_file,[verify,p1,...,p12])

   daemon_name -- a colloquial name to use for this daemon.
                  If it starts with / or \, then the daemon is "global".
                  Otherwise, it's "process specific".

   daemon_file -- file containing the REXX code.
                  Alternatively, if the first character of daemon_file is
                  ':', then the rest of the string identifies a macrospace
                  procedure to be used (as the daemon).

   verify      -- Optional.
                  If verify>0: after launching the daemon, wait verify
                  milliseconds to see if the daemon successfully launched.
                  If verify=1, then wait 10 seconds for verification.

                  Deprecated (use SRE_DMN_COMMAND for an alternative):
                  If verify='?', then just check to see if the daemon exists
                  (daemon_file is ignored, p1 .. p12 are ignored).
                  This check is against the daemon's semaphore.

   p1 .. p12   -- Optional.
                  Up to 12 parameters the daemon code can read.
                  The daemon should read these using
                      parse arg dmn_id,p1,..,p12

Stuf contains either:

   Normal call:
        thread_id ',' daemon_id
           The thread_id is the thread the daemon is running under.
           The daemon_id is used by other SRE_DMN procedures such as
           SRE_DMN_ASK (daemon_id is a slightly more efficient alternative
           to the daemon_name).
      or
        'error (launch) 'error_message
           if an error occurred.

   Normal call, with verify>0:
        thread_id ',' daemon_id ',' verify_result
           verify_result is the result of a verify call -- if it begins
           with 'error', then an error occurred.
      or
        an error message (as above)

   Verify='?' call:
        0 or 1
           0=daemon does not exist, 1=daemon does exist

Notes:
   * the launched daemon should always start with
          parse arg daemon_id,param1,...,param12
     The daemon_id is used in sre_dmn_listen and sre_dmn_timeleft.
   * the daemon_id structure is:
          queue_name' 'sem_name' 'daemon_name

                -----------------------
3.2. SRE_DMN_OWN

Create an "own_id" for use by a client. The own_id is typically used with
calls to sre_dmn_ask.

Technically, SRE_DMN_OWN creates (or clears) a "thread specific" queue and
semaphore, and returns a structured string containing this information.

Call as:
   STUF=sre_dmn_own([extra1,justsay])

Where:
   extra1  (optional) Extra "id" information. This should be a single word
           (no embedded spaces), typically unique to the caller. Thus, if
           several threads may call a daemon, each thread should use its own
           "extra1" id.
           This is optional -- its use can save a fraction of a second.

   justsay (optional) Controls what is done:
           not specified -- create the own_id
           1 -- return, but do not create, the client_id.
                This is used when you know that the own_id was created
                (by an earlier call to SRE_DMN_own).
           2 -- return a "success flag" along with the client_id

STUF is equal to:
   If justsay<>2:
        If success: client_id
        If failure: ' '
   If justsay=2:
        success_flag ',' client_id
           success_flag is 0 if the client_id could not be created (that
           is, if an error occurred), and 1 if it could.
Notes:
   * error messages are written to pmprintf
   * the client_id structure is:
          own_queue' 'own_sem' 'extra1
     where extra1 may be ' ' (if it was not specified)

Examples:
   id=sre_dmn_own()
   id=sre_dmn_own('THREAD_10')
   parse value sre_dmn_own('THREADXX',2) with status ',' id

                -----------------------
3.3. SRE_DMN_ASK

Make a request to a daemon, and wait for the response.

Usage:
   STUF=SRE_DMN_ask(daemon_name,param,waitmsec,ownid,syscommand)

where:
   daemon_name  The daemon's name (or its daemon_id, as returned by
                SRE_DMN_LAUNCH). If you have it, use the daemon_id (it
                speeds things up a bit).

   param        An arbitrarily long string (not necessarily a text string)
                containing information to be passed to the daemon.
                For now, only one parameter can be passed. If you want to
                send multiple values, you'll have to add your own separator
                (such as a comma), and have the daemon parse them out.

   waitmsec     Number of milliseconds to wait for a reply.
                If not specified, 90 seconds.
                If 0, no wait (SRE_DMN_ASK will immediately return ' ').
                If -1, infinite wait.
                If -2, quick no wait (same as 0, with extra shortcuts).

                Waitmsec can also have a second word: "E". If this is
                present, then a "two part" response is returned
                (status,value). Otherwise, just the value is returned.
                For example:
                   '2000 E' means "wait 2 seconds, return a two part response"
                   2000     means "wait 2 seconds, just return the value"

   ownid        Optional. Own-id (as created by sre_dmn_own).
                Specifying an own-id can speed things up a bit.
                If ownid=0, then do NOT wait for a response (this is
                exactly the same as using waitmsec=-2).

   syscommand   Optional. Used in special cases (to talk to other sre_dmn_
                procedures).
                Not recommended for normal use (see SRE_DMN_COMMAND instead).

STUF is:
   If waitmsec=-2 or waitmsec=0:
        ' '
   If two-part mode:
        'error (ask) 'error_message   -- if an error occurred
        length_response ',' response  -- value returned by the daemon
   Otherwise (if not two-part mode):
        ' '       : error
        response  : the value returned by the daemon (which might be ' ')

Notes:
   * sre_dmn_ask assumes that the daemon uses sre_dmn_listen!

                -----------------------
3.4. SRE_DMN_LISTEN

Used by daemons to "listen for requests" submitted by SRE_DMN_ASK.

Usage:
   STUF=SRE_DMN_listen(daemon_id,max_semwait)

Where:
   daemon_id    The daemon's "id". This is the first argument sent to the
                daemon by SRE_DMN_LAUNCH.
   max_semwait  Milliseconds to listen for. If 0, check for a query, and
                immediately return.

STUF equals:
   No pending request (i.e. timed out with no request detected):
        ' '
   Error occurred:
        'error 'error_message
   A request was made:
        request_stamp ',' param
     where:
        request_stamp  contains "client id info". This is used when the
                       daemon calls sre_dmn_timeleft and sre_dmn_respond.
        param          the parameter provided in the call to sre_dmn_ask.
                       Note that only 1 parameter can be sent. If you need
                       to send more than one variable, it is up to you to
                       figure out how to combine them (say, by using commas
                       as delimiters).

Notes:
   * the request_stamp has the structure:
          queue semid id timedone syscommand
     where timedone has the structure
          julian_day:seconds.hsec
   * sre_dmn_listen looks for "syscommands" that are used by
     SRE_DMN_COMMAND. For some syscommands (such as TID), a response is sent
     back (to SRE_DMN_ASK), and the daemon is not notified.
     For the EXIT syscommand, SRE_DMN_LISTEN will cut the connection (kill
     the queue and semaphore used for inter-daemon communication), tell the
     daemon to EXIT, and exit -- it will NOT respond to SRE_DMN_ASK!
     Note that properly coded daemons should look for a request_stamp of
     "EXIT" -- this means "exit asap".

                -----------------------
3.5. SRE_DMN_TIMELEFT

Time remaining for this request (the request must have been submitted by
sre_dmn_ask).

Usage:
   STUF=sre_dmn_timeleft(astamp)

where:
   astamp   the request_stamp provided by sre_dmn_listen

STUF equals:
   If a timeout occurred (given the value of waitmsec in SRE_DMN_ASK):
        ' '
   Otherwise:
        the time remaining, as seconds.hsec

SRE_DMN_TIMELEFT can be called as many times as needed -- a daemon can use
this to know when time is running out (say, that it's a good idea to send a
partial response).

                -----------------------
3.6. SRE_DMN_RESPOND

Used by a daemon to return a response to a client (assuming the client used
sre_dmn_ask, and the daemon read the request using sre_dmn_listen).

Usage:
   STUF=SRE_DMN_respond(astamp,results)

where:
   astamp   the request-stamp returned by sre_dmn_listen
   results  the response

STUF equals:
   If success:  1
   Otherwise:   'error 'error_message

Note: if astamp equals 0, then SRE_DMN_RESPOND does nothing (i.e., astamp
      will equal 0 when SRE_DMN_ASK used a 0 wait time).
      If astamp equals 'EXIT', SRE_DMN_RESPOND will immediately exit. This
      is actually a condition that should not occur -- a properly programmed
      daemon will look for a request-stamp of 'EXIT', and exit ASAP if it is
      found.

                -----------------------
3.7. SRE_DMN_COMMAND

Low level daemon manipulation commands (kill, disconnect, etc.)

Usage:
   STUF=SRE_DMN_command(daemon_name,cmd_name,msec)

Where:
   daemon_name  the daemon's name, or its daemon_id (as provided by
                sre_dmn_launch)
                OR
                for the KILL command, the argument returned when the daemon
                was launched by SRE_DMN_LAUNCH (this is:
                thread_id,daemon_id).
   cmd_name     the command to perform
   msec         milliseconds to wait

STUF depends on cmd_name.

Valid commands are:

   CONNECTED  -- if the daemon is connected (is able to receive requests),
                 return 1.
                 Else return 0.
                 This works by checking the daemon's semaphore.

   DISCONNECT -- disconnect the daemon (kill its queue and semaphore).
                 This is a more certain way of disconnecting than the DIE
                 command, since DIE requires that the daemon be listening
                 (with SRE_DMN_LISTEN).

   VERIFY     -- return 1 (if the daemon is still running). Otherwise, ' '.
                 VERIFY is a more exacting test than CONNECTED.

   TID        -- return the thread the daemon is running under

   DIE or EXIT -- disconnect, and "exit" (proper operation requires that
                 the daemon watch for an EXIT "request-stamp")

   KILL       -- DEPRECATED. We recommend using SRE_DMN_KILL instead.
                 Kill the daemon -- disconnect and stop its thread.
                 This is more drastic than EXIT, since it does not depend on
                 the daemon using SRE_DMN_LISTEN. However, it is not as
                 clean -- it just tries to kill the thread without
                 attempting to close things cleanly.
                 Note that to use KILL, the first argument must be the
                 result returned by SRE_DMN_LAUNCH, and NOT the daemon_name.

                ------------------
3.8. SRE_DMN_KILL

An SRE_DMN command to kill a daemon. This is a harsh command:
   -- it does NOT ask the daemon to shut down gracefully!
   -- instead, it kills the queue and semaphore, and attempts to kill the
      thread (that the daemon is running under)
   ** Sometimes the thread won't die -- in which case the next time the
      daemon tries to use its daemon-id (say, in a SRE_DMN_LISTEN call), a
      "missing queue and semaphore" error will occur.

Usage:
   astat=sre_dmn_kill(augmented_daemon_id)

where:
   augmented_daemon_id  the value of the argument returned by
                        SRE_DMN_LAUNCH -- it includes the "thread_id,"
                        (without the "s) as a prefix to the normal
                        "daemon-id".

Astat will be set to:
   'error n_id)message'  if an error occurs
   '1'                   on success

Notes:
   * all other SRE_DMN procedures use the "normal" daemon-id. If other
     SRE_DMN procedures are given an "augmented" daemon-id, they can detect
     (and remove) the "thread_id," prefix.
   * if a "thread_id," (without the "s) is not present, SRE_DMN_KILL returns
     an error message
   * the normal daemon-id contains semaphore, queue, and name fields.

                ------------------
4.0 Flags and Queues

4.1: SRE2003 Flags

Flags are essentially semaphores. They are a useful means for threads to
signal the existence of a condition. For example, SRE2003 uses flags to
signal a shutdown -- the various daemons check a SHUTDOWN flag on a regular
basis, and exit if it's set equal to 1.

There are three kinds of flags supported by SRE2003.

   Normal: these are meant to be read by a set of threads in a single
           process. However, they can be read by threads in other
           processes. They are also serialized, hence can be set and read
           by several different threads.

   Global: Same as normal, but open to all processes. That is, global flags
           are readily set and read by different processes, whereas normal
           flags are designed to be specific to a process.

   Local:  Can only be read by threads in the same process. Not serialized,
           hence should only be used in "set by one, read by many"
           circumstances. The biggest advantage of local flags is that they
           are more quickly read (they use the OS/2 environment, rather
           than event semaphores).

Three procedures are used to work with flags: SRE_FLAG_OPEN, SRE_FLAG_SET,
and SRE_FLAG_CLOSE.

SRE_FLAG_OPEN: Create a flag.

   Syntax:
      stat=SRE_FLAG_OPEN(flagname,state,pid)
   where
      flagname: The name of the flag. You will use this name whenever you
                check the flag's status.
                Flagnames that begin with a / or a \ are GLOBAL flags.
                Flagnames that begin with a # are LOCAL flags.
                All other flagnames are treated as NORMAL flags.
      state:    Optional. 0 or 1. The flag is set to this state. The
                default is 0.
      pid:      Optional. The current process ID. This is used to speed
                things up a tad. It is also used to set/read normal flags
                running under a different process.
                If you are content to use normal flags strictly for
                intra-process communication, and a few milliseconds here
                and there doesn't bother you, then you need not bother
                specifying this parameter.

   The returned value is either 1 (for success), or 'error 'error_message.

SRE_FLAG_SET: Read and set a flag

   Syntax:
      vv=sre_flag_set(flagname,state,pid,acum)
   where:
      flagname: name of the flag (that was used in the sre_flag_open call)
      state:    optional. If specified, either a 0 or a 1.
                If not specified: READ the flag's value.
                If specified: set the current value.
                One can think of 0 as "off", and 1 as "on".
      pid:      optional. The process ID this flag refers to. See
                SRE_FLAG_OPEN for the details.

   The returned value is either a 0 or 1, or 'error 'error_message.
   If state is specified, the return value should be the same as the value
   of "state".

SRE_FLAG_CLOSE: close a flag

   Syntax:
      stat=sre_flag_close(flagname,pid)
   where flagname and pid are as described above.

   The returned value is either 1 (for success), or 'error 'error_message.

Notes on SRE_FLAG procedures:

   * If specifying a GLOBAL flag, be sure to include the preceding / or \.
     Similarly, for LOCAL flags, be sure to include the preceding #.
     Basically, the following is completely legal:
         stat1=sre_flag_open('MY_FLAG')
         stat2=sre_flag_open('\MY_FLAG')
         stat3=sre_flag_open('#MY_FLAG')
     In this example, three separate flags will be created, each referred
     to by its name (MY_FLAG, \MY_FLAG, or #MY_FLAG).

   * if a process stops, the "flags" are destroyed. Thus, "GLOBAL" flags
     will disappear when the process that opened them dies.

   * Flagnames are case insensitive.

   * Flagnames should NOT start with a number. They must NOT contain
     spaces. Basically, start with an alphabetical character, and use _,
     digits, and alphabetical characters elsewhere.

   * If you sre_flag_open a currently existing flag, the currently existing
     flag will be overwritten.

4.2: SRE2003 Queues

Queues are essentially FIFO queues -- they are very similar to OS/2 REXX
queues.
They are a useful means of storing unique values, such as pointers to
available resources. They can also be used as dynamic and serialized
counters.

As with flags, SRE2003 supports both "normal" and "global" queues -- with
global queues signified by names that start with a \ (or a /).

The queue procedures are:

SRE_Q_OPEN: create a simple queue

   Syntax:
      stat=sre_q_open(qname,pid)
   where:
      qname: the name of the queue. As noted, queue names that begin with a
             \ (or a /) are "global" queues.
      pid:   process id. Used just like the "pid" in SRE_FLAG_OPEN.

   Returns a 1 for success, or 'error 'error_message

SRE_Q_CLOSE: close a queue

   Syntax:
      stat=sre_q_close(qname,pid)
   where qname and pid are as noted under SRE_Q_OPEN.

   Returns a 1 for success, or 'error 'error_message

SRE_Q_SET: add and read contents of a queue

   Syntax:
      vv=sre_q_set(qname,action,value,pid)
   where:
      qname:  as described above
      action: what to do. For example: POP, PUSH, QUEUE, CT, EXISTS, POPALL.
              See below for the details.
      value:  the value to add, or a value to return on certain "empty
              queue" conditions
      pid:    as described above

   The return depends on the "action".

   The supported actions are:

      CT:     Returns a count of the number of elements in the queue.
              If the queue is not defined, returns 'error missing queue'.

      EXISTS: Check to see if a queue exists. Returns a 0 if no, 1 if yes.

      POP:    Read the top element of the queue.
              Returns its value.
              If the queue is empty, the "value" argument is returned.
              If "value" is not specified, returns ' '.
              If an error occurs, returns ''.

      POPALL: Read all elements in the queue.
              Returns these elements (in the order read), using "value" as
              the delimiter.
              If the queue is empty, the "value" argument is returned.
              If "value" is not specified, ' ' is used as the delimiter.
              If an error occurs, returns ''.

      PUSH:   Push the "value" onto the top of the queue. If a POP
              immediately follows a PUSH, the PUSHED value will be returned.
              Returns a 1 for success, or 'error 'error_message

      QUEUE:  Queue a "value" to the bottom of the queue.
              Returns a 1 for success, or 'error 'error_message

Notes:

   * Value can be any string (where numbers are treated as strings). This
     includes arbitrarily long strings, which can contain any character.

   * Sorry, there is no "POP_FROM_BOTTOM" action.

   * Queue names are case insensitive.

   * Queue names should NOT start with a number. They must NOT contain
     spaces. Basically, start with an alphabetical character, and use _,
     digits, and alphabetical characters elsewhere.

   * If you sre_q_open a currently existing queue, the currently existing
     queue will be overwritten.
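
As an illustration of the "pointers to available resources" use mentioned
above, here is a minimal sketch of a queue used as a pool of free slots. The
queue name FREE_SLOTS and the SLOT_n values are hypothetical, the optional
pid argument is left out, and the sketch assumes that POP removes the element
it returns (as a pull from an OS/2 REXX queue does).

```rexx
/* Pool of free "slots" kept in an SRE2003 queue (sketch) */
qname = 'FREE_SLOTS'                       /* hypothetical queue name */

stat = sre_q_open(qname)
if stat <> 1 then do
   say 'Could not open queue:' stat
   exit 1
end

do i = 1 to 3                              /* add entries to the bottom */
   stat = sre_q_set(qname, 'QUEUE', 'SLOT_'i)
   if stat <> 1 then say 'Could not queue SLOT_'i':' stat
end
say 'Entries in queue:' sre_q_set(qname, 'CT')

/* 'NONE' is handed back if the queue turns out to be empty */
slot = sre_q_set(qname, 'POP', 'NONE')
select
   when slot == '' then say 'Queue error'
   when slot = 'NONE' then say 'No free slot'
   otherwise do
      say 'Using' slot
      /* ... work with the resource here ... */
      call sre_q_set qname, 'QUEUE', slot  /* return it to the bottom */
   end
end

call sre_q_close qname
```

Passing a marker such as 'NONE' as the "value" argument makes an empty queue
easy to tell apart from an error, which is reported as ''.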