What is the difference between a COBOL nested program and a subroutine on z/OS?

In COBOL, you may package a subroutine as a nested program or as a stand-alone module. I want to know what the differences are between the two approaches in terms of execution speed and memory usage, and whether both methods are allowed in CICS. Any references would be great. The run environment is z/OS.
Thanks.

Both methods are allowed in CICS.
The difference in memory usage, if any, is going to be negligible. The compiler will generate reentrant code and thus your Working-Storage will be dynamically allocated on first execution per CICS transaction and your Local-Storage dynamically allocated per execution. The Language Environment memory allocation algorithm is designed to be speedy. Only one copy of your executable code will be present in the CICS region.
Packaging your subroutine as a nested program or statically linking your modules together at bind time avoids the overhead of the LOAD when the subroutine is called.
Packaging your subroutine as a nested program prevents it from being called by other programs unless you package the nested program as a copybook and use the COPY compiler directive to bring it into your program. This technique can lead to interesting issues: a change to the nested-program copybook should probably trigger recompilation of all programs that use the copybook so they pick up the new version, though this depends on your source code management system. Statically linking the subroutine has similar issues.
If you package your subroutine as a separate module, you have the option of executing it via EXEC CICS LINK or COBOL dynamic CALL. The former causes the creation of a new Language Environment enclave, and thus the latter is more efficient, particularly on the second and subsequent CALLs and if you specify the Language Environment runtime option CBLPSHPOP(OFF).
Much of the above was gleaned from SHARE presentations over the years.
Some tuning information is available in a SHARE presentation from 2002, S8213TR.PDF (the information is still valid). Note that there are many tuning opportunities relative to Language Environment runtime options related to storage allocation. There exist a number of different mechanisms to set Language Environment options. Your CICS Systems Programmer likely has an opinion on the matter, and there may be shop standards regarding Language Environment runtime options.
Generally speaking, mainframe CICS COBOL application tuning has more to do with using efficient algorithms, variable definitions, compile options, and Language Environment runtime options than it does with application packaging.

In addition to the things mentioned by cschneid...
A contained program can reference items declared with the GLOBAL attribute in the Data Division of the containing program. The contained program does not need to declare the GLOBAL items in order to reference them.
Contained programs cannot be declared with the RECURSIVE attribute.

Related

In multi-stage compilation, should we use a standard serialisation method to ship objects through stages?

This question is formulated in Scala 3/Dotty but should generalise to any language NOT in the MetaML family.
The Scala 3 macro tutorial:
https://docs.scala-lang.org/scala3/reference/metaprogramming/macros.html
starts with the Phase Consistency Principle, which explicitly states that free variables defined in one compilation stage CANNOT be used by the next stage, because their bindings cannot be persisted to a different compiler process:
... Hence, the result of the program will need to persist the program state itself as one of its parts. We don’t want to do this, hence this situation should be made illegal
This should be considered a solved problem, given that many distributed computing frameworks demand a similar capability to persist objects across multiple computers. The most common kind of solution (as observed in Apache Spark) uses standard serialisation/pickling (Java standard serialization, Twitter Kryo/Chill) to create snapshots of the bound objects, which can be saved on disk/off-heap memory or sent over the network.
The tutorial itself also suggests the possibility twice:
One difference is that MetaML does not have an equivalent of the PCP - quoted code in MetaML can access variables in its immediately enclosing environment, with some restrictions and caveats since such accesses involve serialization. However, this does not constitute a fundamental gain in expressiveness.
In the end, ToExpr resembles very much a serialization framework
Instead, both Scala 2 and Scala 3 (and their respective ecosystems) largely ignore these out-of-the-box solutions and only provide default instances for primitive types (Liftable in Scala 2, ToExpr in Scala 3). In addition, existing libraries that use macros rely heavily on manual definition of quasiquotes/quotes for this trivial task, making source code much longer and harder to maintain, while not making anything faster (JVM object serialisation is a highly optimised language component).
What's the cause of this status quo? How do we improve it?

Paging and binding schemes in memory management

With which binding schemes can the concept of paging in memory management be used?
By binding, I mean "mapping logical addresses to physical addresses". To my knowledge there are three types of binding schemes: compile-time, load-time and execution-time binding.
Paging is not involved in compiling, so we can rule that out.
Load time can have two meanings - combining the object modules of a program and libraries to produce an executable image (program) with no unresolved symbols (the Unix definition) OR transferring a program into memory so it may execute (non-Unix).
What Unix calls loading, some other systems call link editing.
Unix loading/link-editing is really part of compiling, so it doesn't involve paging at all. This operation does need to know the valid program addresses it can assign, which will permit the program to load. Conventionally these run from 0 to a very large number like 2^31 or 2^47.
Transferring an image to memory and executing it can be considered either two phases of the same thing or, in demand-loading environments, exactly the same thing. Either way, the bit of the system that prepares the program address space has to fill out a set of tables which relate a program address to a physical address.
The program address of main() might be 0x12345, which might be viewed as offset 0x345 into page 0x12. The operating system might attach that page to physical page 0x100, meaning that main() might temporarily be at 0x100345. Temporarily, because the operating system is free to change this relation (conventionally called a mapping) at any time.
The dynamic nature of these mappings is a positive attribute of paging, as it permits the system to reformulate its use of physical memory to meet changing demands.
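To make the arithmetic above concrete, here is a minimal C sketch, assuming 4 KiB pages (a 12-bit offset) and using the numbers from the example; the single hard-coded page-to-frame mapping stands in for the page table the operating system actually maintains.

```c
#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT 12                     /* 4 KiB pages => 12-bit offset */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

int main(void)
{
    uint64_t virt = 0x12345;              /* program address of main() in the example */

    uint64_t page   = virt >> PAGE_SHIFT;        /* 0x12  */
    uint64_t offset = virt & (PAGE_SIZE - 1);    /* 0x345 */

    /* Pretend the OS currently maps page 0x12 to physical frame 0x100;
       the OS is free to change this mapping at any time. */
    uint64_t frame = 0x100;
    uint64_t phys  = (frame << PAGE_SHIFT) | offset;   /* 0x100345 */

    printf("virtual 0x%llx -> page 0x%llx, offset 0x%llx -> physical 0x%llx\n",
           (unsigned long long)virt, (unsigned long long)page,
           (unsigned long long)offset, (unsigned long long)phys);
    return 0;
}
```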

Lock-free shared variable in Swift? (functioning volatile)

The use of locks and mutexes is illegal in hard real-time callbacks. Lock-free variables can be read and written from different threads. In C, the language definition may or may not be broken, but most compilers spit out usable assembly code when a variable is declared volatile (the reader thread treats the variable as a hardware register and thus actually issues load instructions before using the variable, which works well enough on most cache-coherent multiprocessor systems).
Can this type of variable access be expressed in Swift? Or would in-line assembly language or data-cache flush/invalidate hints need to be added to the Swift language instead?
Added: Will the use of calls to OSMemoryBarrier() (from OSAtomic.h) before and after each use or update of any potentially inter-thread variables (such as "lock-free" fifo/buffer status counters, etc.) in Swift enforce sufficiently ordered memory load and store instructions (even on ARM processors)?
As you already mentioned, volatile only guarantees that the variable will not get cached in a register (it is itself treated like a hardware register). That alone does not make it lock-free for reads and writes. It doesn't even guarantee atomicity, at least not in a consistent, cross-platform way.
Why? Instruction pipelining and oversizing (e.g. using Float64 on a platform that has 32-bit, or smaller, floating-point registers) come to mind first.
That being said, have you considered using OSAtomic?
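Following up on the OSAtomic suggestion, here is a minimal C11 sketch of the kind of lock-free status counter the question describes, using <stdatomic.h>, which on Apple platforms has effectively superseded the older OSAtomic/OSMemoryBarrier calls and can be reached from Swift through a small C shim. This is an illustration of acquire/release ordering for one producer and one consumer, not a drop-in for any particular FIFO; the explicit orderings are what provide the cross-thread visibility that volatile alone does not, including on ARM.

```c
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

/* Write counter of a hypothetical single-producer/single-consumer FIFO. */
static _Atomic unsigned long write_count = 0;

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        /* Release: anything written to the buffer before this store is
           visible to a reader that observes the incremented count. */
        atomic_fetch_add_explicit(&write_count, 1, memory_order_release);
    }
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    unsigned long seen = 0;
    while (seen < 1000000) {
        /* Acquire: pairs with the release above; no locks, no barrier calls. */
        seen = atomic_load_explicit(&write_count, memory_order_acquire);
    }
    printf("consumer saw %lu\n", seen);
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```

A single counter like this is about as far as volatile-style reasoning can safely be stretched; anything more elaborate usually wants a proper lock-free queue or an actual lock.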

How can two threads share memory declared as variables in an object?

C programs can use global variables to share memory between functions executed in a parent and a child thread, but a Java program with several classes of objects doesn’t have such global variables. How can two threads share memory declared as variables in an object?
The practical answer to this depends on the language in which you are working.
In theory, a process is an address space having one or more threads. A thread is a stream of execution within a process address space.
Because all the threads in the process share the same address space they can access each other's variables with no restrictions (for good or bad). Absolutely everything is shared.
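As an illustration of that point, here is a minimal C sketch (the question's own point of comparison): two threads touching one global variable in the shared address space. The mutex is there only to keep the example correct; removing it demonstrates the "for good or bad" part as a data race.

```c
#include <pthread.h>
#include <stdio.h>

/* One global: visible to every thread, because all threads in the
   process share the same address space. */
static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   /* without this, the update is a data race */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);  /* 200000 with the lock in place */
    return 0;
}
```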
Some programming languages, such as Ada, have wonderful support for threads (tasks in Ada). Java has minimal support for threads. Classic C and C++ have no language support at all.
In a language like Ada, with real thread support, there are protected mechanisms for exchanging data among tasks. (But your Ada task could call an assembly language routine that can circumvent all that protection.) In C/C++ you create a task train wreck unless you explicitly plan to avoid one.
In Java, you can use static members (including static member functions) to simulate a global variable that you can unsafely access.

Importance of knowing if a standard library function is executing a system call

Is it actually important for a programmer to know whether the standard library function he/she is using executes a system call? If so, why?
Intuitively I'm guessing the only importance is in knowing whether the given standard function is a library function or a system call itself. Beyond that, I'm guessing there isn't much need to know whether a library function internally uses a system call?
It is not always possible to know (for sure) if a library function wraps a system call. But in one way or another, this knowledge can help improve the portability and (or) efficiency of your program. At least in the following two cases, knowing the syscall-level behaviours of your program is helpful.
When your program is time critical. Some system calls are expensive, and the library functions that wrap them are even more expensive. Thus time-critical tasks may need to switch to equivalent functions that do not enter kernel space at all.
It is also worth noting the vsyscall (or vDSO) mechanism of Linux, which accelerates some system calls (e.g. gettimeofday) by mapping their implementations into user-space memory; a small sketch of the difference follows at the end of this answer.
When your program needs to be deployed to restricted environments with system call auditing. In order for your program to survive such environments, it may be necessary to profile it for potential policy violations; this is much less painful if you were aware of the restrictions when you wrote the program.
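A minimal sketch of the vDSO point above, assuming Linux: the ordinary gettimeofday() library call is usually serviced in user space via the vDSO, while routing the same request through syscall(2) forces a kernel entry every time. Timing both loops makes the difference visible; the exact numbers depend on kernel and hardware.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/time.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    const int iterations = 1000000;
    struct timeval tv, start, end;

    /* Library call: usually dispatched through the vDSO, no kernel entry. */
    gettimeofday(&start, NULL);
    for (int i = 0; i < iterations; i++)
        gettimeofday(&tv, NULL);
    gettimeofday(&end, NULL);
    long vdso_us = (long)((end.tv_sec - start.tv_sec) * 1000000L
                          + (end.tv_usec - start.tv_usec));

    /* Raw system call: forces a kernel entry on every iteration. */
    gettimeofday(&start, NULL);
    for (int i = 0; i < iterations; i++)
        syscall(SYS_gettimeofday, &tv, NULL);
    gettimeofday(&end, NULL);
    long raw_us = (long)((end.tv_sec - start.tv_sec) * 1000000L
                         + (end.tv_usec - start.tv_usec));

    printf("library gettimeofday: %ld us, raw syscall: %ld us\n", vdso_us, raw_us);
    return 0;
}
```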
Sometimes it might be important, and sometimes it isn't. I don't think there's any universal answer to this question. Reasons I can think of why it might matter in some contexts: if the system call requires user permissions that the user might not have; if, in performance-critical code, the system call is too heavyweight; if you're writing a signal handler, where most system calls are forbidden; if it might use some system resource (e.g. reading from /dev/random for every random number could use up the whole entropy pool - you'd want to know if that's going to happen every time you call rand()).