I would like to run the following q code in python:
table: ("ISI"; enlist ",") 0:`data.csv
I am starting by exploring qpython as it's easier to use on Windows for now (compared to pyq) and would like to do the following:
q = qconnection.QConnection(host = 'localhost', port = 5000)
q.sync('table: ("ISI"; enlist ",") 0:`data.csv')
Is something like this possible, or do I need to use pyq in the future when it's stable for Windows? The examples I have seen for q.sync are queries and functions that take a list of parameters rather than code run directly in the q environment. I would like to make sure I am not missing some other functionality that I can use for my current task.
When trying to access a file you have to use its file handle, which is of the form `:data.csv (note the colon at the start), rather than a plain symbol, which is what you are using. You can use hsym to turn a symbol into a file handle.
You should also check that the file is in the same working directory as the q process (using \dir in the q process on Windows); otherwise you will need to adapt your file handle to point to the correct location.
q)hsym `data.csv
`:data.csv
With a file data.csv that has contents:
id,sym,val
1,APPL,50
2,GOOG,100
Running the same command that you did but using the file handle:
In: q.sync('table: ("ISI"; enlist ",") 0: `:data.csv')
or
In: q.sync('table: ("ISI"; enlist ",") 0: hsym `data.csv')
Checking the resulting variable using qpython:
In: q.sync('table')
Out: rec.array([(1, b'APPL', 50), (2, b'GOOG', 100)],
dtype=[('id', '<i4'), ('sym', 'S4'), ('val', '<i4')])
Checking in the q process:
q)table
id sym val
-----------
1 APPL 50
2 GOOG 100
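For reference, the whole round trip can be driven from Python with qpython alone; here is a minimal sketch (assuming a q process is already listening on localhost:5000 and data.csv is in its working directory):

from qpython import qconnection

# connect to a q process assumed to be listening on localhost:5000
q = qconnection.QConnection(host='localhost', port=5000)
q.open()
try:
    # run arbitrary q code: load the CSV via its file handle `:data.csv
    q.sync('table: ("ISI"; enlist ",") 0: `:data.csv')
    # pull the resulting table back into Python as a numpy record array
    print(q.sync('table'))
finally:
    q.close()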
Thanks to this great answer from @Przemyslaw Szufel, we can easily redirect stdio output into a file by providing its path.
I wanted to know if it is possible to both redirect stdio output to a file and still print it in the terminal. That is probably an easy feature to achieve but I could not find it!
And obviously not running the code twice :)
You need a Tee Stream such as https://github.com/fredrikekre/TeeStreams.jl/.
Unfortunately the interface of redirect_stdio does not accept an abstract IO for its sink. It requires an IOStream, which is a concrete Julia type.
So it seems there is no good way to do this with redirect_stdio via streams.
However, redirect_stdio does accept sockets and named pipes, so this is an approach you could consider (though too complex for this simple task; maybe someone will find a more elegant solution).
First you need to create a pipe server that will run asynchronously in your REPL session (this code is for Windows; on Linux you will probably need to remove \\.\pipe\ from the pipe name):
using Sockets
srv = Sockets.listen(raw"\\.\pipe\TeePipe")
mystdout = stdout
@async begin
    open("log.txt", "w") do f
        sock = accept(srv)
        while !eof(sock)
            line = readline(sock)
            println(mystdout, line)
            flush(mystdout)
            println(f, line)
            flush(f)
        end
    end
end
Now, in the same REPL session, you create the client:
teelogger = Sockets.connect(raw"\\.\pipe\TeePipe")
And here is a sample REPL session showing that this works:
julia> redirect_stdio(stdout=teelogger) do
           println("Line1")
           println("Line2")
           @show 3+4
       end
Line1
Line2
3 + 4 = 7
7
shell> more log.txt
Line1
Line2
3 + 4 = 7
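When you are finished, close both ends so the @async task sees EOF and log.txt is flushed and released; a small sketch continuing the session above:

julia> close(teelogger)   # the @async task sees EOF and the do-block closes log.txt

julia> close(srv)         # stop accepting further connections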
Is it possible to run a single macro for all xls/xlsx files, and if so, how? The macro shown below scales the Excel file to fit on a single page, which is necessary because the number of columns is 19, and the file then needs to be converted to PDF using the LibreOffice CLI.
LibreOffice version: 6.0.6
The macro was recorded with LibreOffice and can be seen below:
REM ***** BASIC *****
sub Main
rem ----------------------------------------------------------------------
rem define variables
dim document as object
dim dispatcher as object
rem ----------------------------------------------------------------------
rem get access to the document
document = ThisComponent.CurrentController.Frame
dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")
rem ----------------------------------------------------------------------
dispatcher.executeDispatch(document, ".uno:PageFormatDialog", "", 0, Array())
end sub
Please let me know if any info is needed regarding the tests.
I got the answer from one of the developers at LibreOffice and it works like a charm, so I am sharing it here.
Mike's Solution
First: your recorded macro wouldn't work: it doesn't apply changes, it just opens a dialog. Please always test recorded macros :-)
You may use this macro instead:
Sub FitToPage
    Dim document As Object, pageStyles As Object
    document = ThisComponent
    pageStyles = document.StyleFamilies.getByName("PageStyles")
    For i = 0 To document.Sheets.Count - 1
        Dim sheet As Object, style As Object
        sheet = document.Sheets(i)
        style = pageStyles.getByName(sheet.PageStyle)
        style.ScaleToPagesX = 1
    Next
    On Error Resume Next
    document.storeSelf(Array())
    document.close(true)
End Sub
It operates on the current document, and after setting the scale, it saves (overwrites!) and closes the document.
To use this macro from the command line, you need to save it to one of the libraries, e.g. Standard. In my example below, I use Module1 to store it.
You may use this macro on a single document like this:
'path/to/LibreOffice/program/soffice' path/to/excelfile.ext macro:///Standard.Module1.FitToPage
To use it on multiple documents, you need to run this in a loop. (Passing multiple filenames as arguments to a single soffice invocation, as with shell globbing on Linux using *, will not work: it only runs the macro for the last document, keeping the others open and unmodified.) A loop for Windows could look like this:
for %f in (*.xls) do start /wait "" "C:\Program Files\LibreOffice\program\soffice.exe" "%f" macro:///Standard.Module1.FitToPage
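A similar loop on Linux might look like this (a sketch, assuming soffice is on the PATH; it likewise processes the files one at a time):

for f in *.xls; do
    soffice "$f" macro:///Standard.Module1.FitToPage
done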
I am using the following line:
`:c:/dir/ set .Q.en[`:c:/dir; tablename]
Everything is OK if I don't exit kdb, but if I do and then try to load the table using
get `dir
all the symbol columns come back as integers. I would really appreciate your help in understanding why this happens.
It looks like you forgot to repeat the table name on the l.h.s. of set.
Try
q)`:c:/dir/tablename/ set .Q.en[`:c:/dir; tablename]
This will correctly save the table columns in the c:/dir/tablename subdirectory and place the sym file alongside. Now you should be able to load both your table and the sym file by using the \l command or by specifying c:/dir on the command line when you restart q:
q c:/dir
or
q
q)\l c:/dir
(no backticks or leading :'s in either of those commands)
If you want to use get on this table, you will have to load sym separately:
q)load`:c:/dir/sym
q)get`:c:/dir/tablename/
(note the leading : in the path specs)
Finally, you may want to take a look at the rsave command, which will save your table without you having to write tablename twice.
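Putting it together, a minimal round trip might look like this (a sketch using a toy table t with a symbol column):

q)t:([] sym:`ibm`msft`ibm; val:1 2 3)
q)`:c:/dir/t/ set .Q.en[`:c:/dir; t]   / enumerate symbols and save splayed
q)\l c:/dir                            / on restart: maps t and loads the sym file
q)t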
.Q.en takes 2 params: a file handle and the table data.
Your first param isn't an hsym; it should be a backtick, then a colon, then the path to your DB root.
Also, set takes 2 params; the first, in this case, should be the path to where you want to save, like dir/splayedTableName/.
I keep running into an error when trying to add variables from one SPSS file to another. File 1 has 1,800,000 cases [payments]; File 2 has 800,000 cases [recipients]. They both have an ID number to match cases on.
For every payment in File 1 I want to add the recipient from File 2. The recipients should thus be able to match multiple payments.
These are the two pieces of code I have been trying, which don't work:
code using IN
DATASET ACTIVATE DataSet1.
SORT CASES BY recipientid(A).
DATASET ACTIVATE DataSet2.
SORT CASES BY recipientid(A).
Match Files /File=DataSet1
/In=DataSet2
/BY globalrecipientid.
execute.
When I use /In I don't get any errors, but the files don't match properly, since it doesn't add any variables.
code using TABLE
DATASET ACTIVATE DataSet1.
SORT CASES BY recipientid(A).
DATASET ACTIVATE DataSet2.
SORT CASES BY recipientid(A).
Match Files /File=DataSet1
/TABLE=DataSet2
/BY globalrecipientid.
execute.
When I use /TABLE I get the following error:
Warning # 5132
Undefined error #5132 - Cannot open text file 'S:\Progra~1\spss\IBM\SPSS\STATIS~1\20\lang\en\spss.err": No such file or directory
I have run out of tricks, wouldn't dare try this in Ruby, and Excel sadly is too small to handle this. Any thoughts?
Your first solution is wrong because you are using the IN subcommand incorrectly. In other words, you are matching DataSet1 with nothing.
IN creates a new variable in the resulting file that indicates whether
a case came from the input file named on the preceding FILE
subcommand.
Your second solution: you are sorting the datasets by the variable recipientid, but the match is done by the variable globalrecipientid. Why do you sort by one variable but match by another? This could be a problem. Also, dataset names should be in quotes.
Solution 1:
DATASET ACTIVATE DataSet1.
SORT CASES BY recipientid (A).
DATASET ACTIVATE DataSet2.
SORT CASES BY recipientid (A).
Match Files
/File = "DataSet1"
/TABLE = "DataSet2"
/BY recipientid.
execute.
Solution 2: I never liked the implementation of datasets in SPSS and did not trust them. Another solution is to save the datasets as files and match the files.
get "file1.sav".
SORT CASES BY recipientid (A).
save out "file1s.sav".
get "file2.sav".
SORT CASES BY recipientid (A).
save out "file2s.sav".
Match Files
/File = "file1s.sav"
/TABLE = "file2s.sav"
/BY recipientid.
execute.
My syntax looks somewhat different:
DATASET ACTIVATE DatenSet1.
MATCH FILES /FILE=*
/FILE='DatenSet2'
/RENAME VarsToRename
/BY ID
/DROP= Vars
EXECUTE.
Maybe this helps?
I hope I'm doing something wrong, but it seems like kdb can't read data from named pipes (at least on Solaris). It blocks until they're written to but then returns none of the data that was written.
I can create a text file:
$ echo Mary had a little lamb > lamb.txt
and kdb will happily read it:
q) read0 `:/tmp/lamb.txt
enlist "Mary had a little lamb"
I can create a named pipe:
$ mkfifo lamb.pipe
and trying to read from it:
q) read0 `:/tmp/lamb.pipe
will cause kdb to block. Writing to the pipe:
$ cat lamb.txt > lamb.pipe
will cause kdb to return the empty list:
()
Can kdb read from named pipes? Should I just give up? I don't think it's a permissions thing (I tried setting -m 777 on my mkfifo command but that made no difference).
As of release kdb+ v3.4, q has support for named pipes. Depending on whether you want to implement a streaming algorithm or just read from the pipe, use either .Q.fps or read1 on a fifo pipe.
To implement streaming you can do something like:
q).Q.fps[0N!]`:lamb.pipe
Then $ cat lamb.txt > lamb.pipe
will print
,"Mary had a little lamb"
in your q session. More meaningful algorithms can be implemented by replacing 0N! with an appropriate function.
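For instance, here is a sketch that parses each chunk of lines as CSV and appends it to an in-memory table (the table and column names are illustrative, and it assumes id,sym,val records are being written to the pipe):

q)data:([] id:`int$(); sym:`symbol$(); val:`int$())
q).Q.fps[{`data insert ("ISI";",") 0: x}] `:lamb.pipe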
To read the contents of your file into a variable, do:
q)h:hopen`:fifo://lamb.pipe
q)myText: `char$read1(h)
q)myText
"Mary had a little lamb\n"
See more about named pipes here.
When read0 fails, you can frequently fake it with system"cat ...". (I found this originally when trying to read stuff from /proc, which also doesn't cooperate with read0.)
q)system"cat /tmp/lamb.pipe"
<blocks until you cat into the pipe in the other window>
"Mary had a little lamb"
q)
Just be aware there's a reasonably high overhead (as such things go in q) for invoking system: it spawns a whole shell process just to run whatever your command is.
You might also be able to do it directly with a custom C extension, probably calling read(2) directly.
The source for read0 is not available to see what it is doing under the hood, but as far as I can tell it expects a finite stream rather than a continuous one, so it will block until it receives an EOF signal.
Streaming from a pipe is supported from v3.4.
Detailed steps:
Remove any existing pipe file:
rm -f /path/dataPipeFileName
Create the named pipe:
mkfifo /path/dataPipeFileName
Feed data into the pipe by running the producer in the background (from a shell, or via a system call from q; the command that fetches the data is a placeholder):
commandToFetchData > /path/dataPipeFileName &
Connect to the pipe using kdb's .Q.fps:
q).Q.fps[0N!]`$":/path/",dataPipeFileName;
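Put together, a minimal session might look like this (the paths and producer command are placeholders):

$ rm -f /tmp/dataPipe
$ mkfifo /tmp/dataPipe
$ myFeedCommand > /tmp/dataPipe &
q).Q.fps[0N!]`:/tmp/dataPipe   / prints each chunk of lines as it arrives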
Reference:
.Q.fps (streaming algorithm)
Syntax: .Q.fps[x;y], where x is a unary function and y is a filepath.
.Q.fs for pipes. (Since V3.4) Reads conveniently sized lumps of complete "\n" delimited records from a pipe and applies a function to each record. This enables you to implement a streaming algorithm to convert a large CSV file into an on-disk kdb+ database without holding the data in memory all at once.