Is there a way to keep perl test data in memory between test runs? - perl

When I run my test cases via prove, it takes about 20 - 25 seconds to load my test data before the first test runs. Is there a way I could have a separate, persistent process load the test data in memory, and prove just retrieve a copy instantly each time I run my tests?
I know I could have a separate process return JSON/XML, but then it would have to be parsed. I'm wondering if I could have another process that returns a reference to a data structure.

If you're talking about separate program invocations - no, there's no way to do that directly. Memory is owned by the process that allocated it and is released when that process exits.
The closest you can get to what you're trying to do is:
Have a parent process 'load' the data.
fork child processes from it
This means the 'loaded' data is retained in memory: fork uses copy-on-write for each of the children, so if the data doesn't change, you won't use any more memory at all, and each child has access to the same 'memory space'. See the sketch below.
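A rough sketch of that shape (not integrated with prove/Test::Harness; load_test_data and run_test_file are hypothetical stand-ins for your own loading and test code):

# Parent loads the expensive data once; children share it via copy-on-write.
my $test_data = load_test_data();            # the slow 20-25 second step

for my $test_file (@test_files) {
    defined(my $pid = fork()) or die "fork failed: $!";
    if ($pid == 0) {                         # child: sees $test_data for free
        run_test_file($test_data, $test_file);
        exit 0;
    }
}
1 while wait() != -1;                        # parent waits for all children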
Alternatively, you could use Storable (see the example below) and either:
write your data structure to disk (or /tmp) using store and retrieve, or
pass the data between processes as a scalar using freeze and thaw.
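For example, a minimal sketch of the Storable route; build_test_data is a placeholder for your slow loading code, the cache path is arbitrary, and $data is assumed to be a reference (hashref/arrayref):

use Storable qw(store retrieve freeze thaw);

my $cache = '/tmp/test_data.storable';
my $data;
if (-e $cache) {
    $data = retrieve($cache);       # fast binary load on later runs
}
else {
    $data = build_test_data();      # the slow 20-25 second step
    store($data, $cache);           # cache it for next time
}

# Or serialise to a scalar to hand to another process:
my $frozen = freeze($data);
my $copy   = thaw($frozen);         # deep copy of the original structure

Retrieving a Storable image is generally much faster than rebuilding the structure or re-parsing JSON/XML, which is the instant-copy behaviour you are after.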

Related

mimic test_start test_stop events in distributed mode worker

In my locustfile I defined test_on_start and test_on_stop events to read a file needed for the test and to write detailed statistics to a CSV at the end of the test. When running in distributed mode, these events occur on the master, not the worker. I am assembling a list of detailed stats for each task in a task sequence, and at the end of the test I write a CSV file when the test stops. I found this stackoverflow question which references a setup and teardown. I added these to my class User(HttpUser): but they appear not to be executed.
How can I mimic these events when the test is running on a worker in distributed mode?
Is there a better way?
I am already using User on_start and on_stop - my on_start calls a function to select a random user from a list which was created when the @events.test_start.add_listener fires, which only happens on the master and not on the workers, so the worker doesn't have any user login data.
It seems counterproductive to open the file, read it, select a user at random and close it every time the User on_start method is called. User on_start also sets up the iteration list [], which is where I store the times per task.
When the task sequence is done, meaning the last task has executed, I do a self.interrupt(), which runs on_stop; that is where I take the iteration times and put them into a second list, which is later written out using the CSV module. Maybe it would be better to just write the data to the CSV during on_stop.
The setup/teardown for individual Users has been removed (because they were confusing: they ran on the first instance of that User class, and people who set properties on that instance were very confused when later instances didn't get them). Tbh, I wish they had just been replaced by class methods...
The User still has on_start/stop methods though, and if you combine that with a flag it may be able to do what you want. Something like this:
class MyUser(HttpUser):
    stopped = False
    ...

    def on_stop(self):
        if not MyUser.stopped:
            MyUser.stopped = True
            # write your csv
            # this doesn't guarantee that all your Users are finished though
https://docs.locust.io/en/stable/writing-a-locustfile.html#on-start-and-on-stop-methods

Itemwriter output using same order that itemreader used to read file

We have a Spring Batch job that reads a file (FlatFileItemReader), processes it and writes the data to a queue (JmsItemWriter).
We have another job that reads the queue (JmsItemReader) and writes a file (FlatFileItemWriter). It is an asynchronous process (in between the execution of the two jobs, some manual processing must be performed).
The flat file content doesn't have a line identifier, and we use a multi-threaded approach when reading the file ("throttle-limit"). So the queued messages do not keep the order they had in the flat file.
The problem is that we need to generate an output file respecting the original order, so line 33 of the incoming file should be line 33 of the outgoing file (with the contents of the original line, plus some data).
Does Spring Batch natively provide a way to order the output, respecting the original read order? I say "natively" because one solution we thought of is to create an additional step just to add a line number to the file and use it at the end, but I was wondering if this "reinvents the wheel"...
We are using SB 3.0.3
TIA,
Bob
The use case you are describing asks that you maintain order across multiple jobs, which is not supported. In theory (though it is not guaranteed), a single, single-threaded step would retain the order of the input file.
Since you are reading in a multithreaded manner, there really isn't a good way to guarantee the order of the items as they are being read. The best you could do is synchronize the read method and add an id as the items are being read. If the bottleneck you're attempting to address with multithreading is in the processor or writer, this may not be a bad option.

Using Mojo Event Loop for a long-running script processing huge text files?

I have a script implemented as a Mojo::Command.
It reads a huge text file and extracts data from it. The file contains simple tab-separated (CSV/TSV) records, one record per line.
How can I use the Mojo event loop to store those records in small files - one file per record - so my script does not wait for each record to be stored but continues to the next record?
Here is a stripped down example:
package My::task;
use Mojo::Base 'Mojolicious::Command';

# in My::task::run
# use Text::CSV to open and read the file
while (!$csv->eof()) {
    my $row = $csv->getline($fh);
    do_something_time_consuming_and_store_the_record_somewhere($row);
}
I was thinking the Mojo event loop could be used to avoid forking/threading.
I have successfully used Parallel::Forker in the past, but I was thinking Mojo might have something to offer to speed up the execution.
Is that possible? How?
It depends on the nature of do_something_time_consuming. If it is something that keeps your process CPU-busy, then you're looking for parallelism, which an event loop doesn't try to give you. In that case you might want to feed each row to Redis (via Mojo::Redis) and have worker processes consume, process and store each record. Throughput then comes down to how many parallel workers you can run.
On the other hand, if do_something_time_consuming involves a lot of waiting, e.g. posting to a web service and waiting for results, then an event loop (including Mojo's) can be a big win and handle the concurrency you want. It's hard to guess which of the non-blocking UserAgent examples is closest to your scenario, since you're short on detail. The gist is to create a callback that does what you want (e.g. store_the_record_somewhere) when it gets a response back from the remote service.
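For the waiting-on-I/O case, here is a minimal non-blocking sketch of that idea. The service URL and the store_result helper are hypothetical stand-ins, and for simplicity this queues every row before starting the loop, with no concurrency limit:

use Mojo::UserAgent;
use Mojo::IOLoop;

my $ua = Mojo::UserAgent->new;

while (!$csv->eof()) {
    my $row = $csv->getline($fh) or next;
    # Non-blocking request: pass a callback instead of waiting for the response.
    $ua->post(
        'http://service.example/process' => json => { row => $row } => sub {
            my ($ua, $tx) = @_;
            store_result($row, $tx->res);    # hypothetical handler
        }
    );
}

# Run the event loop so the queued responses can be processed.
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;

In practice you would cap the number of requests in flight (for example by only reading more rows as responses come back), since queueing a huge file's worth of transactions at once will exhaust memory.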

Feeding multiple blocks via callbacks

I created a library for a battery. My model contains multiple instances of this battery, and each one is supposed to be configured with variables from an XLS file.
The problem is that I can't seem to find a way to call the same function from the preload callback and write local variables to every battery from that function. If I write into the "base" workspace, the values are overwritten every time I call the function.
Can anybody help?

Explain the steps for db2-cobol's execution process if both are db2-cobol programs

How to run two sub programs from a main program if both are db2-cobol programs?
My main program, 'Mainpgm1', calls my subprograms 'subpgm1' and 'subpgm2' (called programs), and I prefer static calls only.
Actually, I am now using a package instead of a plan, and one member, both in 'db2bind' (the bind program), along with one DBRMLIB which has a DSN name.
The main problem is: what changes are needed in 'db2bind' when I bind both DB2-COBOL programs?
And similarly in 'DB2RUN' (the run program).
Each program (or subprogram) that contains SQL needs to be pre-processed to create a DBRM. The DBRM is then bound into a PLAN that is accessed by a LOAD module at run time to obtain the correct DB/2 access paths for the SQL statements it contains.
You have gone from having all of your SQL in one program to several sub-programs. The basic process remains the same - you need a PLAN to run the program.
DBAs often suggest that if you have several sub-programs containing SQL, you create PACKAGES for them and then bind the PACKAGES into a PLAN. What was once a one-step process is now two (sketched below):
Bind each DBRM into a PACKAGE
Bind the PACKAGES into a PLAN
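As a rough illustration only (the subsystem, collection and plan names DB2A, MYCOLL and MYPLAN are placeholders, and the exact options and JCL vary by installation), the two steps as DSN subcommands might look like this:

DSN SYSTEM(DB2A)
  BIND PACKAGE(MYCOLL) MEMBER(SUBPGM1) ACTION(REPLACE) ISOLATION(CS)
  BIND PACKAGE(MYCOLL) MEMBER(SUBPGM2) ACTION(REPLACE) ISOLATION(CS)
  BIND PLAN(MYPLAN) PKLIST(MYCOLL.*) ACTION(REPLACE)
END

The DBRM members here come from the DBRMLIB data set produced by the DB2 precompiler.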
What is the big deal with PACKAGES?
Suppose you have 50 sub-programs containing SQL. If you create a DBRM for each of them and then bind all 50 into a PLAN as a single operation, it is going to take a lot of resources to build the PLAN, because every SQL statement in every program needs to be analyzed and access paths created for it. This isn't so bad when all 50 sub-programs are new or have been changed. However, if you have a relatively stable system and want to change 1 sub-program, you end up re-BINDing all 50 DBRMs to create the PLAN - even though 49 of the 50 have not changed and will end up using exactly the same access paths. This isn't a very good approach. It is analogous to compiling all 50 sub-programs every time you make a change to any one of them.
However, if you create a PACKAGE for each sub-program, the PACKAGE is what takes the real work to build. It contains all the access paths for its associated DBRM. Now if you change just 1 sub-program, you only have to rebuild its PACKAGE by rebinding a single DBRM into the PACKAGE collection and then re-BIND the PLAN. Binding a set of PACKAGES (a collection) into a PLAN is a whole lot less resource-intensive than binding all the DBRMs in the system.
Once you have a PLAN containing all of the access paths used in your program, just use it. It doesn't matter whether the SQL being executed is from subprogram1 or subprogram2. As long as you have associated the PLAN with the LOAD module that is being run, it should all work out.
Every installation has its own naming conventions and standards for setting up PACKAGES, COLLECTIONS and PLANS. You should review these with your Database Administrator before going much further.
Here is some background information concerning program preparation in a DB/2 environment:
Developing your Application