I have an expensive function
mainfunc:{[batch;savepath]
  / ...compute large table depending on batch and save it to savepath...
  }
which is copied to the slave processes; each process handles a different batch of input to the function.
Say I need to execute the function on input:til 1000. Then process 1 executes mainfunc[input[til 50];savepath], process 2 executes mainfunc[input[50+til 50];savepath], and so on. Since they are processes and not threads, saving to file is possible.
The reason I am trying to avoid threads is that the output of mainfunc is a large table, and there would be a lot of serialization overhead if the table were sent back to the main process and saved from there.
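For illustration only, here is the shape of that fan-out as a Python sketch (multiprocessing stands in for the slave processes; all names, paths and the chunk size are made up):
import os
from multiprocessing import Pool

def mainfunc(args):
    batch, savepath = args
    result = [x * x for x in batch]                     # stand-in for the expensive computation
    out = os.path.join(savepath, "batch_%d.txt" % batch[0])
    with open(out, "w") as f:                           # each process writes its own output file
        f.write("\n".join(map(str, result)))
    return out                                          # only the short path string travels back

if __name__ == "__main__":
    savepath = "out"
    os.makedirs(savepath, exist_ok=True)
    inp = list(range(1000))                             # the analogue of input:til 1000
    chunks = [inp[i:i + 50] for i in range(0, 1000, 50)]
    with Pool(4) as pool:
        print(pool.map(mainfunc, [(c, savepath) for c in chunks]))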
Now I am running into a limitation: I want to splay the large table in each process. However, splaying a table with symbol columns requires modifying the sym file kept for enumeration, and I get an exception as soon as two processes try to access that sym file at the same time. I use something like
splaydirname set .Q.en[hsym `$savepath] largetable; / .Q.en locks the sym file while the enumeration is done
How can I work around that?
Thanks for the help
Related
I accidentally removed the tplogs on the box. How can I create the log file again and connect to it in the tickerplant without affecting the data in memory?
I know .u.L shows the path and .u.i is the message count for the tplog.
It is weird that .u.i still gives me the count when the file has already been removed.
If I create and connect to the tplog again, will it be the only tplog that gets written to?
.u.i counts the messages as they flow to the log file; it has no awareness of you deleting the log file, so it wouldn't reflect that. When a tp starts up it calls .u.ld, which counts the messages in the log file to set .u.i/j if the log file already exists on startup.
https://github.com/KxSystems/kdb-tick/blob/afc67eda6dfbb2ca89322f702db23ee68c2c7be3/tick.q#L29
You could call it yourself to open a new log file and reset .u.i/j/l/L: .u.ld .z.D
*** If this is production where this has happened, I would break qa/uat in the same way and attempt this there first.
To answer your last question: if you re-create the tplog, only the messages written to the new log from that point on will be saved. If your end of day depends on reading the log file then you will need to figure out an alternative using the rdb. If you are using an rdb/wdb for end of day and nothing goes wrong, the messages will be retained. If the rdb dies there will be no log to replay and data will be lost. A wdb will have most data already written to disk, but you would need to be careful with its startup if it died: if by default it removes the intra-day db and replays the log, it would delete the data and be unrecoverable.
This is somewhat related to your question, but it is possible to retroactively re-create a tickerplant log either from an in-memory table (if you still had it living in an RDB) or from an on-disk table (if your system did manage to write from the RDB to the HDB). However, recreating it exactly as it would have been in the live situation could be tricky, especially if you have a lot of tables that would be in the log.
In-mem table
.[`:/path/to/myTPlog;();:;()];                              / initialise an empty log file on disk
l:hopen`:/path/to/myTPlog;                                  / open a handle to the new log
{l enlist(`upd;`tabName;value x)}each select from tabName;  / write one upd message per row
One issue here is that this would do one table in a full sequence, whereas in realtime you more likely had multiple tables intertwined based on the live timestamps. You could try to piece the log together chronologically across various tables in a lock-step time sequence, but that would require a bit more work and memory.
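To make that lock-step idea concrete, here is a rough sketch in Python (not q; the data layout is hypothetical) of merging several per-table streams into one time-ordered stream, which is the order the upd messages would then be written to the rebuilt log:
import heapq

def tagged(name, rows):
    # tag each (time, row) pair with its table name
    for t, row in rows:
        yield (t, name, row)

def interleave(tables):
    # tables: dict of table name -> list of (time, row) pairs, each already sorted by time
    streams = [tagged(name, rows) for name, rows in tables.items()]
    return heapq.merge(*streams, key=lambda msg: msg[0])   # yields messages lazily in global time order

sample = {
    "trade": [(1, ("AAPL", 100.0, 200)), (3, ("AAPL", 100.1, 300))],
    "quote": [(2, ("AAPL", 99.9, 100.2))],
}
for t, name, row in interleave(sample):
    print(t, name, row)   # replay/write each message to the log handle in this order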
If you were happy to write entire tables to the log one-at-a-time then you could even write the entire table as one upd:
l enlist(`upd;`tabName;value flip select from tabName); / a single upd message containing the whole table as a list of column values
On-disk table
unenum:{@[x; where type'[flip x] within 20 77h; value]};  / apply value to enumerated (and mapped) columns to turn them back into plain lists
{l enlist(`upd;`tabName;value x)}each delete date from unenum[select from tabName where date=.z.D-1];  / replay yesterday's on-disk partition one row at a time
I have a processing pipeline that generates two streams of data, then joins the two streams to produce a third stream. Each stream of data is a timeseries over 1 year of 30-minute intervals (so 17520 rows). Both the generated streams and the joined stream are written into a single table, keyed by a unique stream id and the timestamp of each point in the timeseries.
In abstract terms, the c and g series are generated by plpgsql functions which insert into the timeseries table from data stored elsewhere in the database (e.g. with a select) and then return the unique identifiers of the newly created series. The n series is generated by the calculate_n() function, which joins the timeseries identified by c_id and g_id and returns the id of the new n series.
To illustrate with some pseudo code:
-- generation transaction
begin;
c_id = select generate_c(p_c);
g_id = select generate_g(p_g);
end transaction;
-- calculation transaction
begin;
n_id = select calculate_n(c_id, g_id);
end transaction;
I observe that generate_c() and generate_g() typically run in well under a second; however, the first time calculate_n() runs, it typically takes about 1 minute.
If I run calculate_n() a second time with exactly the same parameters as the first run, it runs in less than a second. (calculate_n() generates a completely new series each time it runs - it is not reading or re-writing any data calculated by the first execution.)
If I stop the database server, restart it, then run calculate_n() on the c_id and g_id calculated previously, that execution of calculate_n() also takes less than a second.
This is very confusing to me. I could understand the second run of calculate_n() taking only a second if, somehow, the first run had warmed a cache, but if that were so, then why does the third run (after a server restart) still run quickly, when any such cache would have been cleared?
It appears to me that perhaps some kind of write cache, populated by the first generation transaction, is (unexpectedly) impeding the first execution of calculate_n(), but that once calculate_n() completes, the cache is purged so it doesn't get in the way of subsequent executions. I have had a look at the activity of the shared buffer cache via pg_buffercache but didn't see any strong evidence that this was happening, although there was certainly evidence of cache activity across executions of calculate_n().
I may be completely off base about this being the result of an interaction with a write cache populated by the first transaction, but I am struggling to understand why the performance of calculate_n() is so poor immediately after the first transaction completes but not at other times, such as immediately after the first attempt or after the database server is restarted.
I am using postgres 11.6.
What explains this behaviour?
update:
So further on this. Running vacuum analyze between the two generate steps and the calculate step did improve the performance of the calculate step, but I found that if I repeated the sequence, I needed to run vacuum analyze between the generate steps and the calculate step every single time, which doesn't seem like a particularly practical thing to do (since you can't call vacuum analyze in a function or a procedure). I understand the need to run vacuum analyze at least once with a reasonable number of rows in the table, but do I really need to do it every time I insert 34000 more rows?
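For reference, a rough sketch of how I could drive the sequence from the client so that the analyze sits between the two transactions (psycopg2; the connection string, table name and parameter values below are placeholders, not my real ones):
import psycopg2

conn = psycopg2.connect("dbname=mydb")                 # placeholder connection string
p_c = p_g = 42                                         # placeholder parameters

with conn:                                             # generation transaction
    with conn.cursor() as cur:
        cur.execute("select generate_c(%s)", (p_c,))
        c_id = cur.fetchone()[0]
        cur.execute("select generate_g(%s)", (p_g,))
        g_id = cur.fetchone()[0]

conn.autocommit = True                                 # VACUUM cannot run inside a transaction block
with conn.cursor() as cur:
    cur.execute("vacuum analyze timeseries")           # 'timeseries' is a placeholder table name
conn.autocommit = False

with conn:                                             # calculation transaction
    with conn.cursor() as cur:
        cur.execute("select calculate_n(%s, %s)", (c_id, g_id))
        n_id = cur.fetchone()[0]                       # id of the new n series

conn.close()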
I have a huge database of about 800GB. When I tried to run a query that groups certain variables and aggregates the result, it stopped after running for a couple of hours: Postgres threw a message that the disk space was full. After looking at the statistics I realized that the database has about 400GB of temporary files. I believe these temp files were created while I was running the query. My question is: how do I delete these temp files? Also, how do I avoid such problems - should I use cursors or for-loops so as not to process all the data at once? Thanks.
I'm using Postgres 9.2
The temporary files that get created in base/pgsql_tmp during query execution will get deleted when the query is done. You should not delete them by hand.
These files have nothing to do with temporary tables; they are used to store data for large hash or sort operations that would not fit in work_mem.
Make sure that the query is finished or canceled, try running CHECKPOINT twice in a row and see if the files are still there. If yes, that's a bug; did the PostgreSQL server crash when it ran out of disk space?
If you really have old files in base/pgsql_tmp that do not get deleted automatically, I think it is safe to delete them manually. But I'd file a bug with PostgreSQL in that case.
There is no way to avoid large temporary files if your execution plan needs to sort large result sets or needs to create large hashes. Cursors won't help you there. I guess that with for-loops you mean moving processing from the database to application code – doing that is usually a mistake and will only move the problem from the database to another place where processing is less efficient.
Change your query so that it doesn't have to sort or hash large result sets (check with EXPLAIN). I know that does not sound very helpful, but there's no better way. You'll probably have to do that anyway, or is a runtime of several hours acceptable for you?
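As a sketch of that check (psycopg2, and the query below is just a placeholder for yours), plain EXPLAIN is cheap because nothing is actually executed:
import psycopg2

conn = psycopg2.connect("dbname=mydb")                          # placeholder connection string
query = "select col1, count(*) from big_table group by col1"    # placeholder for your real query
with conn.cursor() as cur:
    cur.execute("EXPLAIN " + query)
    for (line,) in cur.fetchall():
        # sort and hash nodes are the usual sources of large pgsql_tmp files
        if "Sort" in line or "Hash" in line:
            print(line)
conn.close()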
I have 2 independent Matlab workers, with FIRST getting/saving data and SECOND reading it (and doing some calculations etc).
FIRST saves data as a .mat file on the hard disk while SECOND reads it from there. It takes ~20 seconds to SAVE this data as .mat and ~8 milliseconds to DELETE it. Before SAVING the data, FIRST deletes the old file and then saves a newer version.
How can SECOND verify that the data exists and is not corrupt? I can use exist, but that doesn't tell me whether the data is corrupt or not. For example, if SECOND tries to read the data exactly when FIRST is saving it, exist passes but LOAD gives an error saying the data is corrupt.
Thanks.
You can't, without some synchronization mechanism - by the time SECOND completes its check and starts to read the file, FIRST might have started writing it again. You need some sort of lock or mutex.
Two options for base Matlab.
If this is on a local filesystem, you could use a separate lock file sitting next to the data file to manage concurrent access to the data file. Use Java's NIO FileChannel and FileLock objects from within Matlab to lock the first byte of the lock file and use that as a semaphore to control access to the data file, so the reader waits until the writer is finished and vice versa. (If this is on a network filesystem, don't try this - file locking may seem to work but usually is not officially supported and in my experience is unreliable.)
Or you could just put a try/catch around your load() call and have it pause a few seconds and retry if you get a corrupt file error. The .mat file format is such that you won't get a partial read if the writer is still writing it; you'll get that corrupt file error. So you could use this as a lazy sort of collision detection and backoff. This is what I usually do.
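The same retry-and-back-off idea sketched in Python, in case it helps to see it outside Matlab (scipy's loadmat standing in for load(); all names are made up):
import time
from scipy.io import loadmat

def load_with_retry(path, attempts=10, delay=3):
    for _ in range(attempts):
        try:
            return loadmat(path)          # raises if the writer hasn't finished and the file is partial
        except Exception:
            time.sleep(delay)             # writer is probably still busy; back off and retry
    raise RuntimeError("gave up waiting for a readable file: " + path)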
To reduce the window of contention, consider having FIRST write to a temporary file in the same directory, and then use a rename to move it to its final destination. That way the file is only unavailable during a quick filesystem move operation, not the 20 seconds of data writing. If you have multiple writers, stick the PID and hostname in the temp file name to avoid collisions.
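A minimal sketch of that write-to-temp-then-rename pattern, again in Python purely for illustration (file names are hypothetical):
import os
import socket

def publish(payload_bytes, final_path):
    # the slow write happens under a unique temp name, so readers never see a half-written file
    tmp_path = "%s.%s.%d.tmp" % (final_path, socket.gethostname(), os.getpid())
    with open(tmp_path, "wb") as f:
        f.write(payload_bytes)
        f.flush()
        os.fsync(f.fileno())              # make sure the bytes hit the disk before the rename
    os.replace(tmp_path, final_path)      # atomic when temp and target are on the same filesystem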
Sounds like a classic resource-sharing problem between 2 threads (reader-writer).
In short, you should find a method of safe inter-worker communication. Check this out.
Also, try to type
showdemo('paralleldemo_communic_prof')
in Matlab
I have been running into some very weird data corruption trouble recently.
Basically what I do is:
Transfer some large data (50 files, each around 8GB) from one server to an HPCC (high performance computing cluster) using scp.
Process each line of the input files, and then append/write the modified lines to output files. I do this on the HPCC with "qsub -t 1-1000 xxx.sh", i.e. submitting all 1000 jobs at the same time. These 1000 jobs use on average 4GB of memory each.
The basic format of my script is:
f = open(file)
for line in f:
    # process lines
or
f = open(file).readlines()
# process lines
However, the weird part is: from time to time, I see data corruption in some parts of my data.
First, I found that some of my input data was corrupted (not all of it); then I suspected it was a problem with scp. I asked some computer guys, and also posted here, but it seems there is very little chance that scp could corrupt the data.
So I ran scp again to transfer my data to the HPCC, and this time the input data was fine. Weird, right?
So this propels me to think: is it possible that the input data was corrupted by being used to run memory/CPU-intensive programs?
If the input data is corrupted, it is natural that the output is also corrupted. OK, so I transferred the input data to the HPCC again and checked that all of it was in good shape. I then ran the programs (I should point out: 1000 jobs run together), and the output files... most of them were good; however, very surprisingly, some portion of just one file was corrupted! So I re-ran the program for that specific file on its own, and got good output without any corruption!!
I'm so confused... After seeing so many weird things, my only conclusion is: maybe running many memory-intensive jobs at the same time can harm the data? (But I used to run lots of such jobs, and it seemed OK.)
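One thing I will do next time to pin down where the corruption happens: checksum each file on both ends right after the scp, and again before and after the jobs run, then compare. A rough sketch (the path is a placeholder):
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    # stream the file in chunks so an 8GB input doesn't have to fit in memory
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

print(sha256_of("/path/to/input_file.fastq"))   # compare with the digest computed on the source server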
And by data corruption, I mean:
Something like this:
CTTGTTACCCAGTTCCAAAG9583gfg1131CCGGATGCTGAATGGCACGTTTACAATCCTTTAGCTAGACACAAAAGTTCTCCAAGTCCCCACCAGATTAGCTAGACACAGAGGGCTGGTTGGTGCATCT0/1
gfgggfgggggggggggggg9583gfg1131CCGGAfffffffaedeffdfffeffff`fffffffffcafffeedffbfbb[aUdb\``ce]aafeeee\_dcdcWe[eeffd\ebaM_cYKU]\a\Wcc0/1
CTTGTTACCCAGTTCCAAAG9667gfg1137CCGGATCTTAAAACCATGCTGAGGGTTACAAA1AGAAAGTTAACGGGATGCTGATGTGGACTGTGCAAATCGTTAACATACTGAAAACCTCT0/1
gfgggfgggggggggggggg9667gfg1137CCGGAeeeeeeeaeeb`ed`dadddeebeeedY_dSeeecee_eaeaeeeeeZeedceadeeXbd`RcJdcbc^c^e`cQ]a_]Z_Z^ZZT^0/1
However, it should look like this:
#HWI-ST150_0140:6:2204:16666:85719#0/1
TGGGCTAAAAGGATAAGGGAGGGTGAAGAGAGGATCTGGGTGAACACACAAGAGGCTTAAAGCATTTTATCAAATCCCAATTCTGTTTACTAGCTGTGTGA
+HWI-ST150_0140:6:2204:16666:85719#0/1
gggggggggggggggggfgggggZgeffffgggeeggegg^ggegeggggaeededecegffbYdeedffgggdedffc_ffcffeedeffccdffafdfe
#HWI-ST150_0140:6:2204:16743:85724#0/1
GCCCCCAGCACAAAGCCTGAGCTCAGGGGTCTAGGAGTAGGATGGGTGGTCTCAGATTCCCCATGACCCTGGAGCTCAGAACCAATTCTTTGCTTTTCTGT
+HWI-ST150_0140:6:2204:16743:85724#0/1
ffgggggggfgeggfefggeegfggggggeffefeegcgggeeeeebddZggeeeaeed[ffe^eTaedddc^Oacccccggge\edde_abcaMcccbaf
#HWI-ST150_0140:6:2204:16627:85726#0/1
CCCCCATAGTAGATGGGCTGGGAGCAGTAGGGCCACATGTAGGGACACTCAGTCAGATCTATGTAGCTGGGGCTCAAACTGAAATAAAGAATACAGTGGTA