COBOL TIPS #30
by
Shawn M. Gordon
President S.M.Gordon & Associates

Well, we are back with our continuing saga comparing the C and COBOL languages. Last month I covered the pre-processor and how it can be used to do conditional compilation, among other things. This month is a fairly in-depth look at how to declare simple and complex variables, along with various techniques for initializing them. So read on, and prepare to be amazed.

VARIABLE DECLARATION

OK, the most obvious next step is how to define variables in C. This is where C really has it over COBOL in some respects, because you can define global AND local variables. The COBOL equivalent would be something like a state validation paragraph that declared all the variables it needed at the top of the paragraph, and those variables went away when the paragraph returned to the calling process. Imagine, no more global variables that get declared and then used only once, or not at all.
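
As a rough sketch of what global and local scope look like in practice (the names 'total_calls' and 'validate_state', and the short list of state codes, are made up for illustration):

```c
#include <string.h>

/* Global: visible to every function in this source file. */
long total_calls = 0;

/* Returns 1 if 'state' is one of the sample codes below, 0 otherwise. */
int validate_state(const char *state)
{
    /* Locals: these exist only while validate_state is running. */
    const char *valid[3] = { "CA", "NV", "OR" };
    int i;

    total_calls++;
    for (i = 0; i < 3; i++) {
        if (strcmp(state, valid[i]) == 0)
            return 1;
    }
    return 0;
}
```

Nothing outside validate_state can see 'valid' or 'i', while any function in the file can read or change 'total_calls'.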

C differs significantly from COBOL in the way that variables are defined. The easiest way to illustrate the differences for integer types is to compare them to the IMAGE integer types and then declare them in each language. Since I, J, and K are basically the same, and J is the only one commonly used in COBOL, I will use that.

IMAGE   COBOL                  C
---------------------------------------------------------
J       PIC S9(4) COMP         short
J2      PIC S9(9) COMP         long
J4      PIC S9(18) COMP        extended  /* CCS specific */
Z       PIC 9                  char

Real numbers can't be used directly in COBOL, but C can use them. COBOL typically uses an implied decimal in a J2 type field.

R2                             float
R4                             double

P4      PIC S9(3) COMP-3       char[3]
P8      PIC S9(7) COMP-3       char[7]
P12     PIC S9(11) COMP-3      char[11]
P16     PIC S9(15) COMP-3      char[15]
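
To make the implied decimal idea concrete, here is a minimal sketch that keeps an amount as a whole number of cents in a 'long' and only converts to a real number when one is needed. The function names are my own, and two decimal places is just an assumption for the example:

```c
/* Adds two amounts that each carry two implied decimal places,
   the way a COBOL field with a V99 in its picture would. */
long add_implied(long a_cents, long b_cents)
{
    return a_cents + b_cents;
}

/* Converts an implied-decimal amount to a C real number. */
double to_real(long cents)
{
    return cents / 100.0;
}
```

Keeping the arithmetic in whole cents avoids the rounding surprises that come with 'float' and 'double'.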

An important note on declaring simple integers in C: a standard declaration would be 'int my_counter'. The length of my_counter depends on the native architecture of the machine it is compiled for. On a Classic MPE V machine, 'int' declares a 16 bit integer, or PIC S9(4) COMP. The problem is that if you move that code to a Spectrum machine, the same declaration suddenly becomes 32 bits, or PIC S9(9) COMP. We get around that by declaring integers as either a 'short' int, which is 16 bits, or a 'long' int, which is 32 bits on these machines.
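
If you want to see what a given compiler actually does, a sketch along these lines reports the widths. The helper names are hypothetical. The fixed-width types at the bottom come from <stdint.h>, which arrived with the later C99 standard, so they are an assumption about your compiler rather than something the compilers discussed here necessarily offer:

```c
#include <limits.h>
#include <stdint.h>

/* Width in bits of each type as this particular compiler defines it. */
int bits_in_short(void) { return (int)(sizeof(short) * CHAR_BIT); }
int bits_in_int(void)   { return (int)(sizeof(int) * CHAR_BIT); }
int bits_in_long(void)  { return (int)(sizeof(long) * CHAR_BIT); }

/* Fixed-width alternatives (C99 and later). */
int16_t j_field  = 0;   /* always 16 bits, like PIC S9(4) COMP */
int32_t j2_field = 0;   /* always 32 bits, like PIC S9(9) COMP */
```

The language itself only promises that 'short' is at least 16 bits and 'long' is at least 32, so on some newer machines a 'long' can turn out wider than 32 bits.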

Our next variable type is the string, or character array. C deals with strings in an extremely annoying way. Because the basic 'char' type holds only a single character, you have to define a string as an array of characters, and that is also how you have to reference it. Here is how you would declare one in C, how you would do it in COBOL, and how you would have to do it in COBOL if it worked like C:

char name[8]; /* a variable called name that is eight characters */

01 NAME PIC X(08).

or, if it were defined the way C would have you do it

01 NAME-ARRAY.
   05 NAME PIC X OCCURS 8 TIMES.

Actually, C strings are null terminated, and the terminator takes up one element of the array. So if you need to hold an eight character string, you have to declare the array as nine characters to leave room for the null character at the end.
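
Here is a small sketch of that extra byte at work. The function 'set_name' is a hypothetical helper, not a library routine; it copies at most eight characters and always leaves room for the null:

```c
#include <string.h>

/* Copies up to 8 characters of 'src' into an 8 character field that is
   declared as 9 bytes, so the terminating '\0' always fits. */
void set_name(char name[9], const char *src)
{
    strncpy(name, src, 8);   /* copies at most 8 characters */
    name[8] = '\0';          /* guarantees the terminator is present */
}
```

After 'set_name(name, "GORDON")' the call 'strlen(name)' gives 6, the length of the data, while 'sizeof' on a 'char name[9]' variable still gives 9.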

C has no figurative constant like SPACES, so to fill a character array with spaces you either assign a space to each element of the array yourself or call a library function to do it. Here are your choices in the two languages:

01 NAME PIC X(08) VALUE SPACES.
or
MOVE SPACES TO NAME.
or
INITIALIZE NAME.

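int i;
char name[9];
for (i = 0; i < 8; i++)      /* array subscripts start at 0, not 1 */
    name[i] = ' ';
name[8] = '\0';              /* keep the string null terminated */
or

/* memset and strlen are declared in <string.h> */
memset(&name[strlen(name)], ' ', sizeof(name)-strlen(name));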

Pretty nasty, huh? The first C example uses a little 'for' loop to assign a space to each character position in the array 'name'. The second example is really nasty: it moves spaces into the unused portion of a string and gets rid of the null character. It uses the memory function 'memset' to copy the space character to every position starting at the null that ends the string. The first parameter is the address of the null position in the string, the second parameter is the character to copy, and the third parameter is the number of characters to copy. The 'sizeof' operator tells us the entire length of the variable 'name', and the 'strlen' function tells us the length of the data within the string, so by subtracting the length of the data from the length of the variable we get the number of characters to copy. Keep in mind that once the null is gone, 'name' is a fixed-length field and can no longer be passed to the string functions.

This actually illustrates a nice feature of C, which is the ability to nest function calls inside the argument list of another function call. In COBOL you would have had to write multiple statements, with intermediate variables to hold the results. Of course, in COBOL you don't have to jump through these kinds of hoops very often.
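
If you need the blank-padding trick in more than one place, it can be wrapped in a small helper so the arithmetic is only written once. This is a sketch under my own naming ('pad_with_spaces' is not a standard function), and it assumes the field already holds a null-terminated string:

```c
#include <string.h>

/* Blank-pads the string in 'field' out to 'size' bytes in total,
   overwriting the terminating '\0' just as the memset example does.
   Afterwards 'field' is a fixed-width field, NOT a C string. */
void pad_with_spaces(char *field, size_t size)
{
    size_t used = strlen(field);   /* length of the data */

    if (used < size)
        memset(field + used, ' ', size - used);
}
```

You would call it as 'pad_with_spaces(name, sizeof(name))', which is about as close as C gets to moving a short literal into a longer PIC X field.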

C also differentiates between a character and a string. Since the 'char' type only declares a single character, a 'char' doesn't need to be null terminated, so these two declarations are different:

char switch1;
char switch2[1];

I have called the first one 'switch1' because plain 'switch' is a reserved word in C. You would actually need to make 'switch2' an array of two characters for it to hold a one character string, because of the null terminator on strings. A null is written in C as \0, and all of the string manipulation functions rely on the proper placement of the null character to give you the expected output.

The distinction between a character and a string in C takes a little getting used to. For example, if you want to initialize our character variable switch1 to Y, you would enclose it in single quotes, i.e., 'Y'. Single quotes denote a single character, whereas for switch2 you would use double quotes to enclose the string, i.e., "Y". It is VITAL that you keep this straight. Some compilers won't complain if you get this wrong, and you could get some really unpredictable results.
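
A minimal sketch of the two forms side by side may help. The variable and function names here are hypothetical ones chosen for the example:

```c
#include <string.h>

char answer = 'Y';          /* one character, one byte, no terminator */
char answer_str[2] = "Y";   /* one character plus the '\0' terminator */

/* Returns 1 when both variables hold the answer Y. A single char is
   compared with ==, while a string has to go through strcmp. */
int both_yes(void)
{
    return answer == 'Y' && strcmp(answer_str, "Y") == 0;
}
```

Note that the string version takes two bytes of storage even though it only carries one character of data.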

Now that we know how to declare simple variables in C, how would we declare a record structure analogous to the 01 level declaration in COBOL? C has what is known as the 'struct' for this exact purpose, although the implementation is a little bit odd. First let's declare a simple layout in COBOL, then I'll do the same in C:

01 CUST-MAST.
   03 CM-NUMBER   PIC X(06).
   03 CM-NAME     PIC X(30).
   03 CM-AMT-OWE  PIC S9(9) COMP.
   03 CM-YTD-BAL  PIC S9(9) COMP OCCURS 12.
   03 CM-PHONE    PIC X(10).

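struct customer_master {
    char cm_number[7];     /* 6 characters plus the null */
    char cm_name[31];      /* 30 characters plus the null */
    long cm_amt_owe;
    long cm_ytd_bal[12];
    char cm_phone[11];     /* 10 characters plus the null */
};
struct customer_master cust_mast;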

The 'struct' keyword declares a template for the record type that you are concerned with. Once the template is declared, you can declare a variable of that structure type. So the line 'struct customer_master cust_mast' declares a variable 'cust_mast' of type 'struct customer_master'. You then reference the members of the structure by specifying the variable name, a dot, and the member name, i.e., 'cust_mast.cm_name'.

This can be especially handy if you are going to reuse a structure for a different purpose. The drawback is that there is no convenient way to set every member of a structure to a sensible starting value, such as spaces for the character members, without addressing each member individually. COBOL has the very handy INITIALIZE verb to do this. You could, however, write a general purpose initialization function in C that would serve the same purpose.
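
One way such an initialization function might look is sketched below. It repeats the customer structure (with cm_phone sized for ten characters plus the null) so that it stands alone, and the function names 'blank_field' and 'init_customer' are my own inventions. It also uses a pointer and the '->' operator, which is how you reach a member through a pointer to a structure:

```c
#include <string.h>

struct customer_master {
    char cm_number[7];
    char cm_name[31];
    long cm_amt_owe;
    long cm_ytd_bal[12];
    char cm_phone[11];
};

/* Fills a character member with spaces and null-terminates it. */
static void blank_field(char *field, size_t size)
{
    memset(field, ' ', size - 1);
    field[size - 1] = '\0';
}

/* Rough stand-in for INITIALIZE: numeric members are set to zero
   and character members are set to spaces. */
void init_customer(struct customer_master *cm)
{
    memset(cm, 0, sizeof *cm);   /* zero every byte, numerics included */
    blank_field(cm->cm_number, sizeof cm->cm_number);
    blank_field(cm->cm_name, sizeof cm->cm_name);
    blank_field(cm->cm_phone, sizeof cm->cm_phone);
}
```

You would call it as 'init_customer(&cust_mast)'. Unlike INITIALIZE, it has to be kept in step with the structure by hand whenever a member is added.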

You can declare a variable at the same time as you declare the structure if you don't want to reuse the template. After the } and before the ; just put any old variable name that you want it to have.
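
A short sketch of that combined form follows. The date layout and the names 'today' and 'set_today' are made up for the example:

```c
/* There is no tag after 'struct', so this layout cannot be reused
   elsewhere; 'today' is the only variable of this type. */
struct {
    short year;
    short month;
    short day;
} today;

/* Stores a date in the one and only variable declared above. */
void set_today(short year, short month, short day)
{
    today.year = year;
    today.month = month;
    today.day = day;
}
```

If you later find you do need a second variable with the same layout, add a tag name after 'struct' and declare it the long way.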

The last common verb used in the Working-Storage Section is REDEFINES. At first I didn't think there was an equivalent, but I was wrong: it is the 'union' keyword. REDEFINES is mostly handy for working with a variable as either alpha or numeric. Since reference modification (byte referencing) was introduced in COBOL-85, you hardly ever see a REDEFINES used to get at various substrings within a variable anymore. Now let's look at how you would declare a REDEFINES and a 'union'.

01 CUST-MASTER.
   03 CM-DAYS-LATE PIC X(03).
   03 CM-DL-REDEFINE REDEFINES CM-DAYS-LATE.
      05 CM-DL-NUM PIC 999.

union redef {
    char days_late[4];
    int dl_num;
};
union redef days_late_test;

The setup and use of unions is very similar to structs. You can even put a union inside a struct, which is where you would want to use it most of the time anyway. We made 'days_late' a character array of 4 because we have to remember to account for the null character. Be aware that 'dl_num' looks at the same bytes as a binary integer; unlike the COBOL PIC 999, it does not turn the characters "123" into the number 123. You can do all sorts of strange things with unions if you care to, but that is really all I am going to touch on.
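
Here is a sketch of a union sitting inside a struct, along with one way to get the numeric value that REDEFINES gives you in COBOL. The record layout and the function name are hypothetical, and 'atoi' from the standard library is my choice for the conversion:

```c
#include <stdlib.h>

struct late_record {
    char cm_number[7];
    union {
        char days_late[4];   /* display digits such as "045" plus '\0' */
        long dl_num;         /* the SAME bytes viewed as a binary integer */
    } dl;
};

/* Converts the display digits to a number. Reading dl.dl_num instead
   would only reinterpret the character bytes, not convert them. */
int days_late_as_number(const struct late_record *rec)
{
    return atoi(rec->dl.days_late);
}
```

The members are reached by chaining the names together, for example 'rec.dl.days_late' for a variable or 'rec->dl.days_late' through a pointer.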

One other type that I want to touch on is the enumerated type. By using the 'enum' keyword, we can create a new "type" and specify the values it may have. (Actually, the values of an 'enum' are of type 'int', so we really create new names for values of an existing type.) The purpose of the enumerated type is to help make a program more readable, like the COBOL 88 level. The syntax is similar to that used for structures:

enum ranges {min = 10, max = 100, mid = 55};
enum ranges tester;

tester = mid;
if (tester > min)

The if statement would be true because tester would have a value of 55, which is greater than the 10 assigned to min. I suggest that if you want to use enumerated types, you read up on them a heck of a lot more than what I just touched on here.
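
For completeness, the same test can be sketched as a small function you can compile and call. The function name 'above_min' is hypothetical:

```c
enum ranges { min = 10, max = 100, mid = 55 };

/* Returns 1 when 'value' is greater than the minimum, 0 otherwise. */
int above_min(enum ranges value)
{
    return value > min;
}
```

Calling 'above_min(mid)' compares 55 with 10 and gives 1, while 'above_min(min)' gives 0.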

The last point I want to make about variable declaration is that C has very little facility for applying edit masks compared to COBOL. This makes it a less than convenient language for writing reports and such where date and dollar edit masks are used extensively.
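
The closest thing C offers is the format string of the printf family, which covers the simpler cases. A minimal sketch follows; the function names are my own, 'snprintf' assumes a C99 or later library, and neither function does zero suppression, comma insertion, or floating dollar signs the way a real edit mask would:

```c
#include <stdio.h>

/* Formats a non-negative amount held in cents, so 123456 becomes
   the text 1234.56 in 'out'. */
void edit_amount(long cents, char *out, size_t size)
{
    snprintf(out, size, "%ld.%02ld", cents / 100, cents % 100);
}

/* Formats a date as MM/DD/YY, much like a PIC 99/99/99 mask. */
void edit_date(int month, int day, int year, char *out, size_t size)
{
    snprintf(out, size, "%02d/%02d/%02d", month, day, year % 100);
}
```

Anything fancier than this has to be coded by hand, which is exactly the inconvenience for report writing.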

Next month I am going to cover operators for assignment and logical testing, as well as looping constructs.