Import Fortran unformatted binary - import

I have an unformatted binary file generated using the Compaq Visual Fortran compiler (big endian).
Here's what the little bit of documentation states about it:
The binary file is written in a general format consisting of data arrays, headed by a descriptor record:
An 8-character keyword which identifies the data in the block.
A 4-byte signed integer defining the number of elements in the block.
A 4-character keyword defining the type of data. (INTE, REAL, LOGI, DOUB, or CHAR)
The header items are read in as a single record. The data follows the descriptor on a new record. Numerical arrays are divided into block of up to 1000 items. The physical record size is the same as the block size.
Attempts to read such data
module modbin
type rectype
character(len=8)::key
integer::data_count
character(len=4)::data_type
logical::is_int
integer, allocatable:: idata(:)
real(kind=8), allocatable::rdata(:)
end type
contains
subroutine rec_read(in_file, out_rec)
integer, intent(in):: in_file
type (rectype), intent(inout):: out_rec
!
! You need to play around with this figure. It may not be
! entirely accurate - 1000 seems to work, 1024 does not
integer, parameter:: bsize = 1000
integer:: bb, ii, iimax
! read the header
out_rec%data_count = 0
out_rec%data_type = ' '
read(in_file, end = 20) out_rec%key, out_rec%data_count,
out_rec%data_type
! what type is it?
select case (out_rec%data_type)
case ('INTE')
out_rec%is_int = .true.
allocate(out_rec%idata(out_rec%data_count))
case ('DOUB')
out_rec%is_int = .false.
allocate(out_rec%rdata(out_rec%data_count))
end select
! read the data in blocks of bsize
bb = 1
do while (bb .lt. out_rec%data_count)
iimax = bb + bsize - 1
if (iimax .gt. out_rec%data_count) iimax = out_rec%data_count
if (out_rec%is_int) then
read(in_file) (out_rec%idata(ii), ii = bb, iimax)
else
read(in_file) (out_rec%rdata(ii), ii = bb, iimax)
end if
bb = iimax + 1
end do
20 continue
end subroutine rec_read
subroutine rec_print(in_recnum, in_rec)
integer, intent(in):: in_recnum
type (rectype), intent(in):: in_rec
print *, in_recnum, in_rec%key, in_rec%data_count, in_rec%data_type
! print out data
open(unit=12, file='reader.data' , status='old')
write(12,*)key
!write(*,'(i5')GEOMINDX
!write(*,'(i5')ID_BEG
!write(*,'(i5')ID_END
!write(*,'(i5')ID_CELL
!write(*,'(i5')TIME_BEG
!write(*,'(i5')SWAT
!format('i5')
!end do
close(12)
end subroutine rec_print
end module modbin
program main
use modbin
integer, parameter:: infile=11
! fixed size for now - should really be allocatable
integer, parameter:: rrmax = 500
type (rectype):: rec(rrmax)
integer:: rr, rlast
open(unit=infile, file='TEST1603.SLN0001', form='UNFORMATTED',
status='OLD', convert='BIG_ENDIAN')
rlast = 0
do rr = 1, rrmax
call rec_read(infile, rec(rr))
if (rec(rr)%data_type .eq. ' ') exit
rlast = rr
call rec_print(rr, rec(rr))
end do
close(infile)
end program main
This code compiles and runs smoothly showing
and produces no errors but this is written in the output file
shows me no useful numerical values
The file in question is available here
And the right WRITE statement should produce a file like this one here
Is my WRITE STATEMENT to output this file type wrong? , and if so, what is the best way?
thank you

The comments above are trying to direct you to one of (at least) two problems in your code. In the subroutine rec_print you have write(12,*)key where you meant to write write(12,*)in_rec%key (at least I think that's what you wanted.)
The other problem I spotted is that rec_print opens reader.data with status='old' and then closes it after writing key. (The use of old here suggests that the file already exists.) Each time rec_print is called, the file is opened, the first record is overwritten, and the file is closed. One solution to this would be to use status='unknown'. position='append', though it would be more efficient to open the file once in the main program and just let the subroutine write to it.
If I make these changes, I get in the data file:
INTEHEAD
GEOMETRY
GEOMINDX
ID_BEG
ID_END
ID_CELL
TIME_BEG
SWAT
A side-comment about CONVERT= and derived types: Your program isn't affected by this, but there are compiler differences with how reading a derived type record with CONVERT= is handled. I think gfortran converts each component according to its type, but I know that Intel Fortran doesn't convert reads (nor writes) of an entire derived type. You are reading individual components, which works in both compilers, so that's fine, but I thought it was worth mentioning.
If you're wondering why Intel Fortran does it this way, it's due to the VAX FORTRAN (where CONVERT= came from) heritage with STRUCTURE/RECORD and the possible use of UNION/MAP (not available in standard Fortran). With unions, there's no way to know how a particular component should be converted, so it just transfers the bytes. I had suggested to the Intel team that this could be relaxed if no UNIONs were present, but that I'm sure is very low priority.

Related

What is contained in the "function workspace" field in .mat file?

I'm working with .mat files which are saved at the end of a program. The command is save foo.mat so everything is saved. I'm hoping to determine if the program changes by inspecting the .mat files. I see that from run to run, most of the .mat file is the same, but the field labeled __function_workspace__ changes somewhat.
(I am inspecting the .mat files via scipy.io.loadmat -- just loading the files and printing them out as plain text and then comparing the text. I found that save -ascii in Matlab doesn't put string labels on things, so going through Python is roundabout, but I get labels and that's useful.)
I am trying to determine from where these changes originate. Can anyone explain what __function_workspace__ contains? Why would it not be the same from one run of a given program to the next?
The variables I am really interested in are the same, but I worry that I might be overlooking some changes that might come back to bite me. Thanks in advance for any light you can shed on this problem.
EDIT: As I mentioned in a comment, the value of __function_workspace__ is an array of integers. I looked at the elements of the array and it appears that these numbers are ASCII or non-ASCII character codes. I see runs of characters which look like names of variables or functions, so that makes sense. But there are also some characters (non-ASCII) which don't seem to be part of a name, and there are a lot of null (zero) characters too. So aside from seeing names of things in __function_workspace__, I'm not sure what that stuff is exactly.
SECOND EDIT: I found that after commenting out calls to plotting functions, the content of __function_workspace__ is the same from one run of the program to the next, so that's great. At this point the only difference from one run to the next is that there is a __header__ field which contains a timestamp for the time at which the .mat file was created, which changes from run to run.
THIRD EDIT: I found an article, http://nbviewer.jupyter.org/gist/mbauman/9121961 "Parsing MAT files with class objects in them", about reverse-engineering the __function_workspace__ field. Thanks to Matt Bauman for this very enlightening article and thanks to #mpaskov for the pointer. It appears that __function_workspace__ is an undocumented catch-all for various stuff, only one part of which is actually a "function workspace".
1) Diffing .mat files
You may want to take a look at DiffPlug. It can do diffs of MAT files and I believe there is a command line interface for it as well.
2) Contents of function_workspace
SciPy's __function_workspace__ refers to a special variable at the end of a MAT file that contains extra data needed for reference types (e.g. table, string, handle, etc.) and various other stuff that is not covered by the official documentation. The name is misleading as it really refers to the "Subsystem" (briefly mentioned in the official spec as an offset in the header).
For example, if you save a reference type, e.g., emptyString = "", the resulting .mat will contain the following two entries:
(1) The variable itself. It looks sort of like a UInt32 matrix, but is actually an Opaque MCOS Reference (MATLAB Class Object System) to a string object at some location in the subsystem.
[0] Compressed (81 bytes, position = 128)
[0] Matrix (144 bytes, position = 0)
[0] UInt32[2] = [17, 0] // Opaque
[1] Int8[11] = ['emptyString'] // Variable Name
[2] Int8[4] = ['MCOS'] // Object Type
[3] Int8[6] = ['string'] // Class Name
[4] Matrix (72 bytes, position = 72)
[0] UInt32[2] = [13, 0] // UInt32
[1] Int32[2] = [6, 1] // Dimensions
[2] Int8[0] = [''] // Variable Name (not needed)
[3] UInt32[6] = [-587202560, 2, 1, 1, 1, 1] // Data (Reference Target)
(2) A UInt8 matrix without name (SciPy renamed this to __function_workspace__) at the end of the file. Aside from the missing name it looks like a standard matrix, but the data is actually another MAT file (with a reduced header) that contains the real data.
[1] Compressed (251 bytes, position = 217)
[0] Matrix (968 bytes, position = 0)
[0] UInt32[2] = [9, 0] // UInt8
[1] Int32[2] = [1, 920] // Dimensions
[2] Int8[0] = [''] // Variable Name
[3] ... 920 bytes ... // Data (Nested MAT File)
The format of the data is unfortunately completely undocumented and somewhat of a mess. I could post the contents of the Subsystem, but it gets somewhat overwhelming even for such a simple case. It's essentially a MAT file that contains a struct that contains a special variable (MCOS FileWrapper__) that contains a cell array with various values, including one that magically encodes various Object Properties.
Matt Bauman has done some great reverse engineering efforts (Parsing MAT files with class objects in them) that I believe all supporting implementations are based on. The MFL Java library contains a full (read-only) implementation of this (see McosFileWrapper.java).
Some updates on Matt Bauman's post that we found are:
The MCOS reference can refer to an array of handle objects and may have more than 6 values. It contains sizing information followed by an array of indices (see McosReference.java).
The Object Id field looks like a unique id, but the order seems random and sometimes doesn't match. I don't know what this value is, but completely ignoring it seems to work well :)
I've seen Segment 5 populated in .fig files, but I haven't been able to narrow down what's in there yet.
Edit: Fyi, once the string object is correctly parsed and all properties are filled in, the actual string value is encoded in yet another undocumented format (see testDoubleQuoteString)

Find category of MATLAB mlint warning ID

I'm using the checkcode function in MATLAB to give me a struct of all error messages in a supplied filename along with their McCabe complexity and ID associated with that error. i.e;
info = checkcode(fileName, '-cyc','-id');
In MATLAB's preferences, there is a list of all possible errors, and they are broken down into categories. Such as "Aesthetics and Readability", "Syntax Errors", "Discouraged Function Usage" etc.
Is there a way to access these categories using the error ID gained from the above line of code?
I tossed around different ideas in my head for this question and was finally able to come up with a mostly elegant solution for how to handle this.
The Solution
The critical component of this solution is the undocumented -allmsg flag of checkcode (or mlint). If you supply this argument, then a full list of mlint IDs, severity codes, and descriptions are printed. More importantly, the categories are also printed in this list and all mlint IDs are listed underneath their respective mlint category.
The Execution
Now we can't simply call checkcode (or mlint) with only the -allmsg flag because that would be too easy. Instead, it requires an actual file to try to parse and check for errors. You can pass any valid m-file, but I have opted to pass the built-in sum.m because the actual file itself only contains help information (as it's real implementation is likely C++) and mlint is therefore able to parse it very rapidly with no warnings.
checkcode('sum.m', '-allmsg');
An excerpt of the output printed to the command window is:
INTER ========== Internal Message Fragments ==========
MSHHH 7 this is used for %#ok and should never be seen!
BAIL 7 done with run due to error
INTRN ========== Serious Internal Errors and Assertions ==========
NOLHS 3 Left side of an assignment is empty.
TMMSG 3 More than 50,000 Code Analyzer messages were generated, leading to some being deleted.
MXASET 4 Expression is too complex for code analysis to complete.
LIN2L 3 A source file line is too long for Code Analyzer.
QUIT 4 Earlier syntax errors confused Code Analyzer (or a possible Code Analyzer bug).
FILER ========== File Errors ==========
NOSPC 4 File <FILE> is too large or complex to analyze.
MBIG 4 File <FILE> is too big for Code Analyzer to handle.
NOFIL 4 File <FILE> cannot be opened for reading.
MDOTM 4 Filename <FILE> must be a valid MATLAB code file.
BDFIL 4 Filename <FILE> is not formed from a valid MATLAB identifier.
RDERR 4 Unable to read file <FILE>.
MCDIR 2 Class name <name> and #directory name do not agree: <FILE>.
MCFIL 2 Class name <name> and file name do not agree: <file>.
CFERR 1 Cannot open or read the Code Analyzer settings from file <FILE>. Using default settings instead.
...
MCLL 1 MCC does not allow C++ files to be read directly using LOADLIBRARY.
MCWBF 1 MCC requires that the first argument of WEBFIGURE not come from FIGURE(n).
MCWFL 1 MCC requires that the first argument of WEBFIGURE not come from FIGURE(n) (line <line #>).
NITS ========== Aesthetics and Readability ==========
DSPS 1 DISP(SPRINTF(...)) can usually be replaced by FPRINTF(...).
SEPEX 0 For better readability, use newline, semicolon, or comma before this statement.
NBRAK 0 Use of brackets [] is unnecessary. Use parentheses to group, if needed.
...
The first column is clearly the mlint ID, the second column is actually a severity number (0 = mostly harmless, 1 = warning, 2 = error, 4-7 = more serious internal issues), and the third column is the message that is displayed.
As you can see, all categories also have an identifier but no severity, and their message format is ===== Category Name =====.
So now we can just parse this information and create some data structure that allows us to easily look up the severity and category for a given mlint ID.
Again, though, it can't always be so easy. Unfortunately, checkcode (or mlint) simply prints this information out to the command window and doesn't assign it to any of our output variables. Because of this, it is necessary to use evalc (shudder) to capture the output and store it as a string. We can then easily parse this string to get the category and severity associated with each mlint ID.
An Example Parser
I have put all of the pieces I discussed previously together into a little function which will generate a struct where all of the fields are the mlint IDs. Within each field you will receive the following information:
warnings = mlintCatalog();
warnings.DWVRD
id: 'DWVRD'
severity: 2
message: 'WAVREAD has been removed. Use AUDIOREAD instead.'
category: 'Discouraged Function Usage'
category_id: 17
And here's the little function if you're interested.
function [warnings, categories] = mlintCatalog()
% Get a list of all categories, mlint IDs, and severity rankings
output = evalc('checkcode sum.m -allmsg');
% Break each line into it's components
lines = regexp(output, '\n', 'split').';
pattern = '^\s*(?<id>[^\s]*)\s*(?<severity>\d*)\s*(?<message>.*?\s*$)';
warnings = regexp(lines, pattern, 'names');
warnings = cat(1, warnings{:});
% Determine which ones are category names
isCategory = cellfun(#isempty, {warnings.severity});
categories = warnings(isCategory);
% Fix up the category names
pattern = '(^\s*=*\s*|\s*=*\s*$)';
messages = {categories.message};
categoryNames = cellfun(#(x)regexprep(x, pattern, ''), messages, 'uni', 0);
[categories.message] = categoryNames{:};
% Now pair each mlint ID with it's category
comp = bsxfun(#gt, 1:numel(warnings), find(isCategory).');
[category_id, ~] = find(diff(comp, [], 1) == -1);
category_id(end+1:numel(warnings)) = numel(categories);
% Assign a category field to each mlint ID
[warnings.category] = categoryNames{category_id};
category_id = num2cell(category_id);
[warnings.category_id] = category_id{:};
% Remove the categories from the warnings list
warnings = warnings(~isCategory);
% Convert warning severity to a number
severity = num2cell(str2double({warnings.severity}));
[warnings.severity] = severity{:};
% Save just the categories
categories = rmfield(categories, 'severity');
% Convert array of structs to a struct where the MLINT ID is the field
warnings = orderfields(cell2struct(num2cell(warnings), {warnings.id}));
end
Summary
This is a completely undocumented but fairly robust way of getting the category and severity associated with a given mlint ID. This functionality existed in 2010 and maybe even before that, so it should work with any version of MATLAB that you have to deal with. This approach is also a lot more flexible than simply noting what categories a given mlint ID is in because the category (and severity) will change from release to release as new functions are added and old functions are deprecated.
Thanks for asking this challenging question, and I hope that this answer provides a little help and insight!
Just to close this issue off. I've managed to extract the data from a few different places and piece it together. I now have an excel spreadsheet of all matlab's warnings and errors with columns for their corresponding ID codes, category, and severity (ie, if it is a warning or error). I can now read this file in, look up ID codes I get from using the 'checkcode' function and draw out any information required. This can now be used to create analysis tools to look at the quality of written scripts/classes etc.
If anyone would like a copy of this file then drop me a message and I'll be happy to provide it.
Darren.

Assigning a whole DataStructure its nullind array

Some context before the question.
Imagine file FileA having around 50 fields of different types. Instead of all programs using the file, I tried having a service program, so the file could only be accessed by that service program. The programs calling the service would then receive a DataStructure based on the file structure, as an ExtName. I use SQL to recover the information, so, basically, the procedure would go like this :
Datastructure shared by service program :
D FileADS E DS ExtName(FileA) Qualified
Procedure called by programs :
P getFileADS B Export
D PI N
D PI_IDKey 9B 0 Const
D PO_DS LikeDS(FileADS)
D LocalDS E DS ExtName(FileA) Qualified
D NullInd S 5i 0 Array(50) <-- Since 50 fields in fileA
//Code
Clear LocalDS;
Clear PO_DS;
exec sql
SELECT *
INTO :LocalDS :nullind
FROM FileA
WHERE FileA.ID = :PI_IDKey;
If SqlCod <> 0;
Return *Off;
EndIf;
PO_DS = LocalDS;
Return *On;
P getFileADS E
So, that procedure will return a datastructure filled with a record from FileA if it finds it.
Now my question : Is there any way I can assign the %nullind(field) = *On without specifying EACH 50 fields of my file?
Something like a loop
i = 1;
DoW (i <= 50);
if nullind(i) = -1;
%nullind(datastructure.field) = *On;
endif;
i++;
EndDo;
Cause let's face it, it'd be a pain to look each fields of each file every time.
I know a simple chain(n) could do the trick
chain(n) PI_IDKey FileA FileADS;
but I really was looking to do it with SQL.
Thank you for your advices!
OS Version : 7.1
First, you'll be better off in the long run by eliminating SELECT * and supplying a SELECT list of the 50 field names.
Next, consider these two web pages -- Meaningful Names for Null Indicators and Embedded SQL and null indicators. The first shows an example of assigning names to each null indicator to match the associated field names. It's just a matter of declaring a based DS with names, based on the address of your null indicator array. The second points out how a null indicator array can be larger than needed, so future database changes won't affect results. (Bear in mind that the page shows a null array of 1000 elements, and the memory is actually relatively tiny even at that size. You can declare it smaller if you think it's necessary for some reason.)
You're creating a proc that you'll only write once. It's not worth saving the effort of listing the 50 fields. Maybe if you had many programs using this proc and you had to create the list each time it'd be a slight help to use SELECT *, but even then it's not a great idea.
A matching template DS for the 50 data fields can be defined in the /COPY member that will hold the proc prototype. The template DS will be available in any program that brings the proc prototype in. Any program that needs to call the proc can simply specify LIKEDS referencing the template to define its version in memory. The template DS should probably include the QUALIFIED keyword, and programs would then use their own DS names as the qualifying prefix. The null indicator array can be handled similarly.
However, it's not completely clear what your actual question is. You show an example loop and ask if it'll work, but you don't say if you had a problem with it. It's an array, so a loop can be used much like you show. But it depends on what you're actually trying to accomplish with it.
for old school rpg just include the nulls in the data structure populated with the select statement.
select col1, ifnull(col1), col2, ifnull(col2), etc. into :dsfilewithnull where f.id = :id;
for old school rpg that can't handle nulls remove them with the select statement.
select coalesce(col1,0), coalesce(col2,' '), coalesce(col3, :lowdate) into :dsfile where f.id = :id;
The second method would be easier to use in a legacy environment.
pass the key by value to the procedure so you can use it like a built in function.
One answer to your question would be to make the array part of a data structure, and assign *all'0' to the data structure.
dcl-ds nullIndDs;
nullInd Ind Dim(50);
end-ds;
nullIndDs = *all'0';
The answer by jmarkmurphy is an example of assigning all zeros to an array of indicators. For the example that you show in your question, you can do it this way:
D NullInd S 5i 0 dim(50)
/free
NullInd(*) = 1 ;
Nullind(*) = 0 ;
*inlr = *on ;
return ;
/end-free
That's a complete program that you can compile and test. Run it in debug and stop at the first statement. Display NullInd to see the initial value of its elements. Step through the first statement and display it again to see how the elements changed. Step through the next statement to see how things changed again.
As for "how to do it in SQL", that part doesn't make sense. SQL sets the values automatically when you FETCH a row. Other than that, the array is used by the host language (RPG in this case) to communicate values back to SQL. When a SQL statement runs, it again automatically uses whatever values were set. So, it either is used automatically by SQL for input or output, or is set by your host language statements. There is nothing useful that you can do 'in SQL' with that array.

stuck with my small code in matlab

I have problem with the following code:
.
.
.
a=zeros(1000,ctimes);
a1=zeros(1000,ctimes);
hold all
for i=num1:num2;
colors=Lines(i);
switch phantom
case 1
path=['E:\filename\'];
path1=['E:\filename2\'];
n=['S',num2str(emt),'_',num2str(i),'.m'];
d=load([path,name]);
a(:,i)=complex(d(:,2),d(:,3)));
n1=['S',num2str(emt),'_',num2str(i),'.m'];
d1=load([path1,name1]);
a1(:,i)=complex(d1(:,2),d1(:,3)));
the problem is that a(:,i) can not be defined. while there is no problem or with complex(d1(:,2),d1(:,3))) , can any expert body help me plz?!
thank you ...
Are you sure you are forming your file name correctly? You are doing something to create a variable n, but using a variable name when you form the path. Here are some recommended debugging steps:
1) make sure the file path is formed correctly:
filePath = fullfile(path, name);
disp(filePath);
The fullfile function concatenates elements of a file path & name, and takes care of using the right file path separator (good for portable code, stops you having to remember to add a / or \ to the end of a file path, etc).
2) check that d is loaded correctly:
clear d;
d = load(filePath);
disp(size(d));
3) check the size of the complex quantity you compute before assigning it to a(:,i):
temp = complex(d(:,2), d(:,3));
disp(size(temp));
By the time you have done these things, you should have found your problem (the dimensions of temp should be [1000 1] to match the size of a(:,i), of course).
As an aside, you should avoid using i as a variable name, especially when you are using complex numbers, since its built-in value is sqrt(-1). Thus, c = a + i * b; would create a complex number (a,b) and put it into c - until you change the meaning of i. A simple solution is to use ii. The same is true for j, by the way. It is one of the unfortunate design decisions in Matlab that you can overwrite built in values like that...

COBOL add 0 to a Variable in COMPUTE

I ran into a strange statement when working on a COBOL program from $WORK.
We have a paragraph that is opening a cursor (from DB2), and the looping over it until it hits an EOT (in pseudo code):
... working storage ...
01 I PIC S9(9) COMP VALUE ZEROS.
01 WS-SUB PIC S9(4) COMP VALUE 0.
... code area ...
PARA-ONE.
PERFORM OPEN-CURSOR
PERFORM FETCH-CURSOR
PERFORM VARYING I FROM 1 BY 1 UNTIL SQLCODE = DB2EOT
do stuff here...
END-PERFORM
COMPUTE WS-SUB = I + 0
PERFORM CLOSE-CURSOR
... do another loop using WS-SUB ...
I'm wondering why that COMPUTE WS-SUB = I + 0 line is there. My understanding is that I will always at least be 1, because of the perform block above it (i.e., even if there is an EOT to start with, I will be set to one on that initial iteration).
Is that COMPUTE line even needed? Is it doing some implicit casting that I'm not aware of? Why would it be there? Why wouldn't you just MOVE I TO WS-SUB?
Call it stupid, but with some compilers (with the correct options in effect), given
01 SIGNED-NUMBER PIC S99 COMP-5 VALUE -1.
01 UNSIGNED-NUMBER PIC 99 COMP-5.
...
MOVE SIGNED-NUMBER TO UNSIGNED-NUMBER
DISPLAY UNSIGNED-NUMBER
results in: 255. But...
COMPUTE UNSIGNED-NUMBER = SIGNED-NUMBER + ZERO
results in: 1 (unsigned)
So to answer your question, this could be classified as a technique used cast signed numbers into unsigned numbers. However, in the code example you gave it makes no sense at all.
Note that the definition of "I" was (likely) coded by one programmer and of WS-SUB by another (naming is different, VALUE clause is different for same purpose).
Programmer 2 looks like "old school": PIC S9(4), signed and taking up all the digits which "fit" in a half-word. The S9(9) is probably "far over the top" as per range of possible values, but such things concern Programmer 1 not at all.
Probably Programmer 2 had concerns about using an S9(9) COMP for something requiring (perhaps many) fewer than 9999 "things". "I'll be 'efficient' without changing the existing code". It seems to me unlikely that the field was ever defined as unsigned.
A COMP/COMP-4 with nine digits does have a performance penalty when used for calculations. Try "ADD 1" to a 9(9) and a 9(8) and a 9(10) and compare the generated code. If you can have nine digits, define with 9(10), otherwise 9(8), if you need a fullword.
Programmer 2 knows something of this.
The COMPUTE with + 0 is probably deliberate. Why did Programmer 2 use the COMPUTE like that (the original question)?
Now it is going to get complicated.
There are two "types" of "binary" fields on the Mainframe: those which will contain values limited by the PICture clause (USAGE BINARY, COMP and COMP-4); those which contain values limited by the field size (USAGE COMP-5).
With BINARY/COMP/COMP-4, the size of the field is determined from the PICture, and so are the values that can be held. PIC 9(4) is a halfword, with a maxiumum value of 9999. PIC S9(4) a halfword with values -9999 through +9999.
With COMP-5 (Native Binary), the PICture just determines the size of the field, all the bits of the field are relevant for the value of the field. PIC 9(1) to 9(4) define halfwords, pic 9(5) to 9(9) define fullwords, and 9(10) to 9(18) define doublewords. PIC 9(1) can hold a maximum of 65535, S9(1) -32,768 through +32,767.
All well and good. Then there is compiler option TRUNC. This has three options. STD, the default, BIN and OPT.
BIN can be considered to have the most far-reaching affect. BIN makes BINARY/COMP/COMP-4 behave like COMP-5. Everything becomes, in effect, COMP-5. PICtures for binary fields are ignored, except in determining the size of the field (and, curiously, with ON SIZE ERROR, which "errors" when the maxima according to the PICture are exceeded). Native Binary, in IBM Enterprise Cobol, generates, in the main, though not exclusively, the "slowest" code. Truncation is to field size (halfword, fullword, doubleword).
STD, the default, is "standard" truncation. This truncates to "PICture". It is therefore a "decimal" truncation.
OPT is for "performance". With OPT, the compiler truncates in whatever way is the most "performant" for a particular "code sequence". This can mean intermediate values and final values may have "bits set" which are "outside of the range" of the PICture. However, when used as a source, a binary field will always only reflect the value specified by the PICture, even if there are "excess" bits set.
It is important when using OPT that all binary fields "conform to PICture" meaning that code must never rely on bits which are set outside the PICture definition.
Note: Even though OPT has been used, the OPTimizer (OPT(STD) or OPT(FULL)) can still provide further optimisations.
This is all well and good.
However, a "pickle" can readily ensue if you "mix" TRUNC options, or if the binary definition in a CALLing program is not the same as in the CALLed program. The "mix" can occur if modules within the same run-unit are compiled with different TRUNC options, or if a binary field on a file is written with one TRUNC option and later read with another.
Now, I suspect Programmer 2 encountered something like this: Either, with TRUNC(OPT) they noticed "excess bits" in a field and thought there was a need to deal with them, or, through the "mix" of options in a run-unit or "across file usage" they noticed "excess bits" where there would be a need to do something about it (which was to "remove the mix").
Programmer 2 developed the COMPUTE A = B + 0 to "deal" with a particular problem (perceived or actual) and then applied it generally to their work.
This is a "guess", or, better, a "rationalisation" which works with the known information.
It is a "fake" fix. There was either no problem (the normal way that TRUNC(OPT) works) or the correct resolution was "normalisation" of the TRUNC option across modules/file use.
I do not want loads of people now rushing off and putting COMPUTE A = B + 0 in their code. For a start, they don't know why they are doing it. For a continuation it is the wrong thing to do.
Of course, do not just remove the "+ 0" from any of these that you find. If there is a "mix" of TRUNCs, a program may stop "working".
There is one situation in which I have used "ADD ZERO" for a BINARY/COMP/COMP-4. This is in a "Mickey Mouse" program, a program with no purpose but to try something out. Here I've used it as a method to "trick" the optimizer, as otherwise the optimizer could see unchanging values so would generate code to use literal results as all values were known at compile time. (A perhaps "neater" and more flexible way to do this which I picked up from PhilinOxford, is to use ACCEPT for the field). This is not the case, for certain, with the code in question.
I wonder if a testing version of the sources ever had
COMPUTE WS-SUB = I + 0
ON SIZE ERROR
DISPLAY "WS-SUB overflow"
STOP RUN
END-COMPUTE
with the range test discarded when the developer was satisfied and cleaning up? MOVE doesn't allow declarative SIZE statements. That's as much of a reason as I could see. Or perhaps developer habit of using COMPUTE to move, as a subtle reminder to question the need for defensive code at every step? And perhaps not knowing, as Joe pointed out, the SIZE clause would be just as effective without the + 0? Or a maintainer struggled with off by one errors and there was a corrective change from 1 to 0 after testing?