Need to replace use of strcpy_s in C code used in iPhone project

I have a C SDK I need to use in an iPhone project, and the example code was written for use with Visual Studio. It includes use of strcpy_s, which is a Microsoft-only string function.
file_header.header_size = FIT_FILE_HDR_SIZE;
strcpy_s((FIT_UINT8 *)&file_header.data_type, sizeof(".FIT"), ".FIT"); << problem!
I've tried changing it to strcpy and strncpy, like so:
strncpy((FIT_UINT8 *)&file_header.data_type, ".FIT", sizeof(".FIT"));
But I get these warnings:
warning: pointer targets in passing argument 1 of '__builtin___strncpy_chk' differ in signedness
warning: pointer targets in passing argument 1 of '__inline_strncpy_chk' differ in signedness
warning: call to __builtin___strncpy_chk will always overflow destination buffer
The struct file_header is this:
typedef struct
{
    FIT_UINT8  header_size;      // FIT_FILE_HDR_SIZE (size of this structure)
    FIT_UINT8  protocol_version; // FIT_PROTOCOL_VERSION
    FIT_UINT16 profile_version;  // FIT_PROFILE_VERSION
    FIT_UINT32 data_size;        // Does not include file header or crc. Little endian format.
    FIT_UINT8  data_type[4];     // ".FIT"
} FIT_FILE_HDR;
FIT_UINT8 is typedef'd as unsigned char.
So we can see that data_type is given a length of 4 in the typedef, and the strcpy_s takes data_type by reference and copies ".FIT" to it. Where am I going wrong with strncpy? If you haven't guessed by now, I'm not much of a C programmer :)
Edit: this does not give me an error, but is it correct?
strncpy((void *)&file_header.data_type, ".FIT", sizeof(file_header.data_type));

With any "safe string" operations, the size should almost always be the size of the destination buffer; if you use the size of the source string, you might as well call memcpy.
If you want C99 conformance:
strncpy(file_header.data_type, ".FIT", sizeof file_header.data_type);
However, strlcpy (a BSDism, available in iOS) is preferred by many, because it guarantees that the destination will be nul-terminated:
strlcpy(file_header.data_type, ".FIT", sizeof file_header.data_type);
Note, however, that the nul-terminated string ".FIT" doesn't actually fit in the allotted space, as it requires 5 characters (1 for the trailing nul). If you use strlcpy, you will see that the resulting string is just ".FI", because strlcpy guarantees nul-termination and truncates your string if necessary.
If you require nul-termination, then you probably want to increase the size of the data_type array to 5. As caf correctly points out, this looks like a file header, in which case nul-termination is probably not required; in that case strncpy is preferred. I might even use memcpy, and avoid giving a future developer the idea that the field is a string.
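To make the memcpy suggestion concrete, here is a minimal sketch; the trimmed-down demo_hdr struct is invented for the example, standing in for FIT_FILE_HDR. Because memcpy takes void pointers, the signedness warning goes away as well.

#include <stdio.h>
#include <string.h>

typedef unsigned char FIT_UINT8;

/* Trimmed-down stand-in for FIT_FILE_HDR, for illustration only. */
typedef struct {
    FIT_UINT8 data_type[4];   /* ".FIT" -- a 4-byte tag, no trailing nul */
} demo_hdr;

int main(void) {
    demo_hdr h;
    /* Copy exactly the 4 tag bytes; nothing suggests the field is a string. */
    memcpy(h.data_type, ".FIT", sizeof h.data_type);
    printf("%.4s\n", (const char *)h.data_type);   /* prints .FIT */
    return 0;
}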

Don't use
sizeof(".FIT")
use
strlen(".FIT")

Related

Parasoft / Misra C++ 2008 : Expression of 'signed' type should not be cast to 'unsigned' type

I am coding in C++14. The company requires all code to pass Parasoft / Misra C++ 2008 checks.
I receive a string which ends in a digit from 1 to N and I need to convert it to uint8_t and subtract 1 to use it as an array index.
// NumberString is guaranteed to contain a single digit as a std::string
const uint8_t Number = static_cast<uint8_t>(std::stoi(NumberString) - 1);
causes Parasoft to report
Expression of 'signed' type should not be cast to 'unsigned' type
I have tried many ways to rewrite it, but to no avail. How can I get rid of that Parasoft message?
I am at my wit's end and am even considering stuffing an extra (unused) element zero at the front of that array. Surely there must be a way to avoid that?
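No answer is quoted above, so purely as an illustration (not from the thread, and whether it satisfies the Parasoft rule is an assumption): one direction is to keep the whole conversion unsigned so that no signed value is ever cast, sketched here in C with strtoul; std::stoul would be the C++14 counterpart. The number_string variable is a hypothetical stand-in for NumberString.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int main(void) {
    const char *number_string = "3";   /* guaranteed single digit 1..N */
    /* strtoul never yields a signed value, so the narrowing below is
       unsigned-to-unsigned rather than signed-to-unsigned. */
    uint8_t number = (uint8_t)(strtoul(number_string, NULL, 10) - 1u);
    printf("%u\n", number);            /* 2, usable as an array index */
    return 0;
}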

Import Fortran unformatted binary

I have an unformatted binary file generated using the Compaq Visual Fortran compiler (big endian).
Here's what the little bit of documentation states about it:
The binary file is written in a general format consisting of data arrays, headed by a descriptor record:
An 8-character keyword which identifies the data in the block.
A 4-byte signed integer defining the number of elements in the block.
A 4-character keyword defining the type of data. (INTE, REAL, LOGI, DOUB, or CHAR)
The header items are read in as a single record. The data follows the descriptor on a new record. Numerical arrays are divided into blocks of up to 1000 items. The physical record size is the same as the block size.
Here are my attempts to read such data:
module modbin
  type rectype
    character(len=8) :: key
    integer :: data_count
    character(len=4) :: data_type
    logical :: is_int
    integer, allocatable :: idata(:)
    real(kind=8), allocatable :: rdata(:)
  end type
contains
  subroutine rec_read(in_file, out_rec)
    integer, intent(in) :: in_file
    type (rectype), intent(inout) :: out_rec
    !
    ! You need to play around with this figure. It may not be
    ! entirely accurate - 1000 seems to work, 1024 does not
    integer, parameter :: bsize = 1000
    integer :: bb, ii, iimax
    ! read the header
    out_rec%data_count = 0
    out_rec%data_type = ' '
    read(in_file, end = 20) out_rec%key, out_rec%data_count, &
        out_rec%data_type
    ! what type is it?
    select case (out_rec%data_type)
    case ('INTE')
      out_rec%is_int = .true.
      allocate(out_rec%idata(out_rec%data_count))
    case ('DOUB')
      out_rec%is_int = .false.
      allocate(out_rec%rdata(out_rec%data_count))
    end select
    ! read the data in blocks of bsize
    bb = 1
    do while (bb .lt. out_rec%data_count)
      iimax = bb + bsize - 1
      if (iimax .gt. out_rec%data_count) iimax = out_rec%data_count
      if (out_rec%is_int) then
        read(in_file) (out_rec%idata(ii), ii = bb, iimax)
      else
        read(in_file) (out_rec%rdata(ii), ii = bb, iimax)
      end if
      bb = iimax + 1
    end do
20  continue
  end subroutine rec_read

  subroutine rec_print(in_recnum, in_rec)
    integer, intent(in) :: in_recnum
    type (rectype), intent(in) :: in_rec
    print *, in_recnum, in_rec%key, in_rec%data_count, in_rec%data_type
    ! print out data
    open(unit=12, file='reader.data', status='old')
    write(12,*)key
    !write(*,'(i5')GEOMINDX
    !write(*,'(i5')ID_BEG
    !write(*,'(i5')ID_END
    !write(*,'(i5')ID_CELL
    !write(*,'(i5')TIME_BEG
    !write(*,'(i5')SWAT
    !format('i5')
    !end do
    close(12)
  end subroutine rec_print
end module modbin
program main
  use modbin
  integer, parameter :: infile = 11
  ! fixed size for now - should really be allocatable
  integer, parameter :: rrmax = 500
  type (rectype) :: rec(rrmax)
  integer :: rr, rlast
  open(unit=infile, file='TEST1603.SLN0001', form='UNFORMATTED', &
       status='OLD', convert='BIG_ENDIAN')
  rlast = 0
  do rr = 1, rrmax
    call rec_read(infile, rec(rr))
    if (rec(rr)%data_type .eq. ' ') exit
    rlast = rr
    call rec_print(rr, rec(rr))
  end do
  close(infile)
end program main
This code compiles and runs smoothly and produces no errors, but what is written in the output file shows me no useful numerical values.
The file in question is available here
And the right WRITE statement should produce a file like this one here
Is my WRITE statement for outputting this file type wrong? If so, what is the best way to write it?
Thank you.
The comments above are trying to direct you to one of (at least) two problems in your code. In the subroutine rec_print you have write(12,*)key where you meant to write write(12,*)in_rec%key (at least I think that's what you wanted.)
The other problem I spotted is that rec_print opens reader.data with status='old' and then closes it after writing key. (The use of old here suggests that the file already exists.) Each time rec_print is called, the file is opened, the first record is overwritten, and the file is closed. One solution to this would be to use status='unknown', position='append', though it would be more efficient to open the file once in the main program and just let the subroutine write to it.
If I make these changes, I get in the data file:
INTEHEAD
GEOMETRY
GEOMINDX
ID_BEG
ID_END
ID_CELL
TIME_BEG
SWAT
A side-comment about CONVERT= and derived types: Your program isn't affected by this, but there are compiler differences with how reading a derived type record with CONVERT= is handled. I think gfortran converts each component according to its type, but I know that Intel Fortran doesn't convert reads (nor writes) of an entire derived type. You are reading individual components, which works in both compilers, so that's fine, but I thought it was worth mentioning.
If you're wondering why Intel Fortran does it this way, it's due to the VAX FORTRAN (where CONVERT= came from) heritage with STRUCTURE/RECORD and the possible use of UNION/MAP (not available in standard Fortran). With unions, there's no way to know how a particular component should be converted, so it just transfers the bytes. I had suggested to the Intel team that this could be relaxed if no UNIONs were present, but that I'm sure is very low priority.
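As a companion to the CONVERT= discussion, here is a hedged C sketch of what reading one descriptor record involves when the byte swapping is done by hand. It assumes the common convention that each sequential unformatted record is wrapped in 4-byte length markers; the file name is taken from the question.

#include <stdio.h>
#include <stdint.h>

/* Big-endian bytes -> host integer, the manual equivalent of CONVERT=. */
static uint32_t be32(const unsigned char b[4]) {
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

int main(void) {
    FILE *f = fopen("TEST1603.SLN0001", "rb");
    unsigned char marker[4], count_bytes[4];
    char key[9] = {0}, type[5] = {0};
    if (!f) return 1;
    /* One descriptor record: leading length marker, 8-char keyword,
       4-byte big-endian count, 4-char type keyword, trailing marker. */
    fread(marker, 1, 4, f);
    fread(key, 1, 8, f);
    fread(count_bytes, 1, 4, f);
    fread(type, 1, 4, f);
    fread(marker, 1, 4, f);
    printf("key=%s count=%ld type=%s\n",
           key, (long)(int32_t)be32(count_bytes), type);
    fclose(f);
    return 0;
}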

What is contained in the "function workspace" field in .mat file?

I'm working with .mat files which are saved at the end of a program. The command is save foo.mat so everything is saved. I'm hoping to determine if the program changes by inspecting the .mat files. I see that from run to run, most of the .mat file is the same, but the field labeled __function_workspace__ changes somewhat.
(I am inspecting the .mat files via scipy.io.loadmat -- just loading the files and printing them out as plain text and then comparing the text. I found that save -ascii in Matlab doesn't put string labels on things, so going through Python is roundabout, but I get labels and that's useful.)
I am trying to determine from where these changes originate. Can anyone explain what __function_workspace__ contains? Why would it not be the same from one run of a given program to the next?
The variables I am really interested in are the same, but I worry that I might be overlooking some changes that might come back to bite me. Thanks in advance for any light you can shed on this problem.
EDIT: As I mentioned in a comment, the value of __function_workspace__ is an array of integers. I looked at the elements of the array and it appears that these numbers are ASCII or non-ASCII character codes. I see runs of characters which look like names of variables or functions, so that makes sense. But there are also some characters (non-ASCII) which don't seem to be part of a name, and there are a lot of null (zero) characters too. So aside from seeing names of things in __function_workspace__, I'm not sure what that stuff is exactly.
SECOND EDIT: I found that after commenting out calls to plotting functions, the content of __function_workspace__ is the same from one run of the program to the next, so that's great. At this point the only difference from one run to the next is that there is a __header__ field which contains a timestamp for the time at which the .mat file was created, which changes from run to run.
THIRD EDIT: I found an article, http://nbviewer.jupyter.org/gist/mbauman/9121961 "Parsing MAT files with class objects in them", about reverse-engineering the __function_workspace__ field. Thanks to Matt Bauman for this very enlightening article, and thanks to @mpaskov for the pointer. It appears that __function_workspace__ is an undocumented catch-all for various stuff, only one part of which is actually a "function workspace".
1) Diffing .mat files
You may want to take a look at DiffPlug. It can do diffs of MAT files and I believe there is a command line interface for it as well.
2) Contents of __function_workspace__
SciPy's __function_workspace__ refers to a special variable at the end of a MAT file that contains extra data needed for reference types (e.g. table, string, handle, etc.) and various other stuff that is not covered by the official documentation. The name is misleading as it really refers to the "Subsystem" (briefly mentioned in the official spec as an offset in the header).
For example, if you save a reference type, e.g., emptyString = "", the resulting .mat will contain the following two entries:
(1) The variable itself. It looks sort of like a UInt32 matrix, but is actually an Opaque MCOS Reference (MATLAB Class Object System) to a string object at some location in the subsystem.
[0] Compressed (81 bytes, position = 128)
[0] Matrix (144 bytes, position = 0)
[0] UInt32[2] = [17, 0] // Opaque
[1] Int8[11] = ['emptyString'] // Variable Name
[2] Int8[4] = ['MCOS'] // Object Type
[3] Int8[6] = ['string'] // Class Name
[4] Matrix (72 bytes, position = 72)
[0] UInt32[2] = [13, 0] // UInt32
[1] Int32[2] = [6, 1] // Dimensions
[2] Int8[0] = [''] // Variable Name (not needed)
[3] UInt32[6] = [-587202560, 2, 1, 1, 1, 1] // Data (Reference Target)
(2) A UInt8 matrix without a name (SciPy renamed this to __function_workspace__) at the end of the file. Aside from the missing name it looks like a standard matrix, but the data is actually another MAT file (with a reduced header) that contains the real data.
[1] Compressed (251 bytes, position = 217)
[0] Matrix (968 bytes, position = 0)
[0] UInt32[2] = [9, 0] // UInt8
[1] Int32[2] = [1, 920] // Dimensions
[2] Int8[0] = [''] // Variable Name
[3] ... 920 bytes ... // Data (Nested MAT File)
The format of the data is unfortunately completely undocumented and somewhat of a mess. I could post the contents of the Subsystem, but it gets somewhat overwhelming even for such a simple case. It's essentially a MAT file that contains a struct that contains a special variable (MCOS FileWrapper__) that contains a cell array with various values, including one that magically encodes various Object Properties.
Matt Bauman has done some great reverse engineering efforts (Parsing MAT files with class objects in them) that I believe all supporting implementations are based on. The MFL Java library contains a full (read-only) implementation of this (see McosFileWrapper.java).
Some updates on Matt Bauman's post that we found are:
The MCOS reference can refer to an array of handle objects and may have more than 6 values. It contains sizing information followed by an array of indices (see McosReference.java).
The Object Id field looks like a unique id, but the order seems random and sometimes doesn't match. I don't know what this value is, but completely ignoring it seems to work well :)
I've seen Segment 5 populated in .fig files, but I haven't been able to narrow down what's in there yet.
Edit: FYI, once the string object is correctly parsed and all properties are filled in, the actual string value is encoded in yet another undocumented format (see testDoubleQuoteString).

WSOCK32.DLL htons function

In a Visual FoxPro app using sockets, we are using wsock32.dll and use the htons() function to convert a port number to TCP/IP network byte order. It should return an unsigned short between 0 and 65535. When testing this with port 63333 it returns 26103, but after installing the Windows Fall Creators Update it returns a bigger value: 16213495.
Sample FoxPro program:
DECLARE INTEGER htons IN "wsock32.dll" INTEGER hostshort
LOCAL portNumber, htonsNumber
portNumber = 63333
htonsNumber = htons( portNumber )
? htonsNumber
The resulting value should go into a "sockaddr" structure used by the connect() function but there is only space for 2 bytes for the port.
Does anyone know what has happened to the wsock32 functions in this Windows update, and/or have a suggestion for solving this?
I compared the Windows 10 FCU function with the Windows 8 version: Windows has reordered the register usage and saved one AND instruction. This is most likely a compiler optimization and not a source code change. Because the left-shifted half is not masked, you get garbage in bits 16-23, but these bits should be ignored. The function is still correct for anyone who follows the Windows ABI.
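To make the garbage bits concrete, here is a small C sketch of an unmasked 16-bit byte swap computed in a 32-bit register; it reproduces both values from the question, though the exact instruction sequence Windows uses is not shown.

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t port = 63333;                        /* 0x0000F765 */
    /* Swap the two low bytes without masking the left-shifted half. */
    uint32_t swapped = (port >> 8) | (port << 8);
    printf("unmasked: %u\n", swapped);            /* 16213495 = 0x00F765F7 */
    printf("masked:   %u\n", swapped & 0xFFFF);   /* 26103    = 0x000065F7 */
    return 0;
}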
The best solution is to update the function declaration so it uses a 16-bit integer type. If that is not possible you can cast the number to a 16-bit type in languages that support casting. The final option is to truncate the value yourself by ANDing with 0xffff:
htonsNumber = BitAnd(htons(portNumber), 0xffff)
SHORT is listed as a valid return type so that should work as well:
DECLARE SHORT htons IN "wsock32.dll" INTEGER

COBOL add 0 to a Variable in COMPUTE

I ran into a strange statement when working on a COBOL program from $WORK.
We have a paragraph that is opening a cursor (from DB2), and then looping over it until it hits an EOT (in pseudocode):
... working storage ...
01 I PIC S9(9) COMP VALUE ZEROS.
01 WS-SUB PIC S9(4) COMP VALUE 0.
... code area ...
PARA-ONE.
    PERFORM OPEN-CURSOR
    PERFORM FETCH-CURSOR
    PERFORM VARYING I FROM 1 BY 1 UNTIL SQLCODE = DB2EOT
        do stuff here...
    END-PERFORM
    COMPUTE WS-SUB = I + 0
    PERFORM CLOSE-CURSOR
    ... do another loop using WS-SUB ...
I'm wondering why that COMPUTE WS-SUB = I + 0 line is there. My understanding is that I will always at least be 1, because of the perform block above it (i.e., even if there is an EOT to start with, I will be set to one on that initial iteration).
Is that COMPUTE line even needed? Is it doing some implicit casting that I'm not aware of? Why would it be there? Why wouldn't you just MOVE I TO WS-SUB?
Call it stupid, but with some compilers (with the correct options in effect), given
01 SIGNED-NUMBER PIC S99 COMP-5 VALUE -1.
01 UNSIGNED-NUMBER PIC 99 COMP-5.
...
MOVE SIGNED-NUMBER TO UNSIGNED-NUMBER
DISPLAY UNSIGNED-NUMBER
results in: 255. But...
COMPUTE UNSIGNED-NUMBER = SIGNED-NUMBER + ZERO
results in: 1 (unsigned)
So to answer your question, this could be classified as a technique used to cast signed numbers to unsigned numbers. However, in the code example you gave it makes no sense at all.
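For readers more at home in C, the MOVE half of that example behaves like a plain narrowing conversion, which keeps the bit pattern. This is a hedged analogy only: the COMPUTE result of 1 follows from COBOL storing the absolute value of an arithmetic result into an unsigned receiver, which has no direct C equivalent.

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int8_t  signed_number   = -1;
    /* Narrowing keeps the bit pattern 0xFF, like the MOVE above. */
    uint8_t unsigned_number = (uint8_t)signed_number;
    printf("%u\n", unsigned_number);   /* 255 */
    return 0;
}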
Note that the definition of "I" was (likely) coded by one programmer and of WS-SUB by another (naming is different, VALUE clause is different for same purpose).
Programmer 2 looks like "old school": PIC S9(4), signed and taking up all the digits which "fit" in a half-word. The S9(9) is probably "far over the top" as per range of possible values, but such things concern Programmer 1 not at all.
Probably Programmer 2 had concerns about using an S9(9) COMP for something requiring (perhaps many) fewer than 9999 "things". "I'll be 'efficient' without changing the existing code". It seems to me unlikely that the field was ever defined as unsigned.
A COMP/COMP-4 with nine digits does have a performance penalty when used for calculations. Try "ADD 1" to a 9(9) and a 9(8) and a 9(10) and compare the generated code. If you can have nine digits, define with 9(10), otherwise 9(8), if you need a fullword.
Programmer 2 knows something of this.
The COMPUTE with + 0 is probably deliberate. Why did Programmer 2 use the COMPUTE like that (the original question)?
Now it is going to get complicated.
There are two "types" of "binary" fields on the Mainframe: those which will contain values limited by the PICture clause (USAGE BINARY, COMP and COMP-4); those which contain values limited by the field size (USAGE COMP-5).
With BINARY/COMP/COMP-4, the size of the field is determined from the PICture, and so are the values that can be held. PIC 9(4) is a halfword, with a maximum value of 9999. PIC S9(4) is a halfword with values -9999 through +9999.
With COMP-5 (Native Binary), the PICture just determines the size of the field; all the bits of the field are relevant for the value of the field. PIC 9(1) to 9(4) define halfwords, PIC 9(5) to 9(9) define fullwords, and PIC 9(10) to 9(18) define doublewords. PIC 9(1) can hold a maximum of 65535, S9(1) -32,768 through +32,767.
All well and good. Then there is the compiler option TRUNC. This has three options: STD (the default), BIN, and OPT.
BIN can be considered to have the most far-reaching effect. BIN makes BINARY/COMP/COMP-4 behave like COMP-5. Everything becomes, in effect, COMP-5. PICtures for binary fields are ignored, except in determining the size of the field (and, curiously, with ON SIZE ERROR, which "errors" when the maxima according to the PICture are exceeded). Native Binary, in IBM Enterprise COBOL, generates, in the main though not exclusively, the "slowest" code. Truncation is to field size (halfword, fullword, doubleword).
STD, the default, is "standard" truncation. This truncates to "PICture". It is therefore a "decimal" truncation.
OPT is for "performance". With OPT, the compiler truncates in whatever way is the most "performant" for a particular "code sequence". This can mean intermediate values and final values may have "bits set" which are "outside of the range" of the PICture. However, when used as a source, a binary field will always only reflect the value specified by the PICture, even if there are "excess" bits set.
It is important when using OPT that all binary fields "conform to PICture" meaning that code must never rely on bits which are set outside the PICture definition.
Note: Even though OPT has been used, the OPTimizer (OPT(STD) or OPT(FULL)) can still provide further optimisations.
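A hedged C analogy of the two truncation styles for a PIC 9(4) halfword (the compiler's actual generated code is more involved than this):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t value = 123456;
    /* TRUNC(BIN)-style: truncate to the field size, i.e. modulo 65536. */
    uint16_t bin_style = (uint16_t)value;            /* 57920 */
    /* TRUNC(STD)-style: truncate to the PICture, i.e. modulo 10000. */
    uint16_t std_style = (uint16_t)(value % 10000);  /* 3456 */
    printf("BIN-style: %u\nSTD-style: %u\n", bin_style, std_style);
    return 0;
}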
This is all well and good.
However, a "pickle" can readily ensue if you "mix" TRUNC options, or if the binary definition in a CALLing program is not the same as in the CALLed program. The "mix" can occur if modules within the same run-unit are compiled with different TRUNC options, or if a binary field on a file is written with one TRUNC option and later read with another.
Now, I suspect Programmer 2 encountered something like this: Either, with TRUNC(OPT) they noticed "excess bits" in a field and thought there was a need to deal with them, or, through the "mix" of options in a run-unit or "across file usage" they noticed "excess bits" where there would be a need to do something about it (which was to "remove the mix").
Programmer 2 developed the COMPUTE A = B + 0 to "deal" with a particular problem (perceived or actual) and then applied it generally to their work.
This is a "guess", or, better, a "rationalisation" which works with the known information.
It is a "fake" fix. There was either no problem (the normal way that TRUNC(OPT) works) or the correct resolution was "normalisation" of the TRUNC option across modules/file use.
I do not want loads of people now rushing off and putting COMPUTE A = B + 0 in their code. For a start, they don't know why they are doing it. For a continuation it is the wrong thing to do.
Of course, do not just remove the "+ 0" from any of these that you find. If there is a "mix" of TRUNCs, a program may stop "working".
There is one situation in which I have used "ADD ZERO" for a BINARY/COMP/COMP-4. This is in a "Mickey Mouse" program, a program with no purpose but to try something out. Here I've used it as a method to "trick" the optimizer, as otherwise the optimizer could see unchanging values so would generate code to use literal results as all values were known at compile time. (A perhaps "neater" and more flexible way to do this which I picked up from PhilinOxford, is to use ACCEPT for the field). This is not the case, for certain, with the code in question.
I wonder if a testing version of the sources ever had
COMPUTE WS-SUB = I + 0
    ON SIZE ERROR
        DISPLAY "WS-SUB overflow"
        STOP RUN
END-COMPUTE
with the range test discarded when the developer was satisfied and cleaning up? MOVE doesn't allow declarative SIZE statements. That's as much of a reason as I could see. Or perhaps a developer habit of using COMPUTE to move, as a subtle reminder to question the need for defensive code at every step? And perhaps not knowing, as Joe pointed out, that the SIZE clause would be just as effective without the + 0? Or a maintainer struggled with off-by-one errors and there was a corrective change from 1 to 0 after testing?