How to multiplex a named pipe / FIFO

Say we have a named pipe:
mkfifo my_named_pipe
say there are multiple writers writing to this named pipe:
node x.js > ${my_named_pipe} &
node y.js > ${my_named_pipe} &
node z.js > ${my_named_pipe} &
Something like that. Is there a reliable way to multiplex it, so that one whole message gets through each time, or can a named pipe reliably be read from only one writer?
It leads me to wonder how we multiplex ports/sockets etc.; I don't know how it's done.

This might be a bit of a naïve answer, but this works for me.
If you have multiple writers to a single FIFO and you don't want their output to be mangled, you can use stdbuf to force line buffering, but only if the output is line-based. Whole multi-line messages will still be interleaved.
stdbuf -oL node x.js > ${my_named_pipe} &
stdbuf -oL node y.js > ${my_named_pipe} &
stdbuf -oL node z.js > ${my_named_pipe} &
man stdbuf: stdbuf - Run COMMAND, with modified buffering operations for its standard streams.
This will only work when your original program does not adjust the buffering of its standard output stream itself.
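The reason line buffering helps: POSIX guarantees that a write of at most PIPE_BUF bytes (at least 512 bytes; 4096 on Linux) to a pipe or FIFO is atomic, and line buffering turns each complete line into a single write, so lines from different writers cannot be intermixed. A minimal reader sketch to go with the writers above:
while IFS= read -r line; do
    printf '%s\n' "$line"    # each line arrives intact from exactly one writer
done < my_named_pipe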


Variable not being recognized after "read"

Edit: Resolved. See answer.
Background:
I'm writing a shell script that will perform some extra actions required on our system when someone resizes a database.
The script is written in ksh (a requirement); the OS is Solaris 5.10.
The problem is with one of the checks, which verifies there's enough free space on the underlying OS.
Problem:
The check reads the df -k line for root, which is what I check in this step, and prints it to a file. I then "read" the contents into variables which I use in calculations.
Unfortunately, when I try to run an arithmetic operation on one of the variables, I get an error indicating it is null, and a debug output line I've placed after that point confirms that it is null... It lost its value...
I've tried every method of doing this I could find online; they work when I run them manually, but not inside the script file.
(* The file does have #!/usr/bin/ksh)
Code:
df -k | grep "rpool/ROOT" > dftest.out
RPOOL_NAME=""; declare -i TOTAL_SIZE=0; USED_SPACE=0; AVAILABLE_SPACE=0; AVAILABLE_PERCENT=0; RSIGN=""
read RPOOL_NAME TOTAL_SIZE USED_SPACE AVAILABLE_SPACE AVAILABLE_PERCENT RSIGN < dftest.out
\rm dftest.out
echo $RPOOL_NAME $TOTAL_SIZE $USED_SPACE $AVAILABLE_SPACE $AVAILABLE_PERCENT $RSIGN
((TOTAL_SIZE=$TOTAL_SIZE/1024))
This is the result:
DBResize.sh[11]: TOTAL_SIZE=/1024: syntax error
I'm pulling hairs at this point, any help would be appreciated.
The code you posted cannot produce the output you posted. Most obviously, the error is signalled at line 11 but you posted fewer than 11 lines of code. The previous lines may matter. Always post complete code when you ask for help.
More concretely, the declare command doesn't exist in ksh; it's a bash thing. You can achieve the same result with typeset (declare is a bash equivalent of typeset, but not all options are the same). Either you're executing this script with bash, or there's another error message about declare, or you've defined some additional commands, including declare, which may change the behavior of this code.
None of this should have an impact on the particular problem that you're posting about, however. The variables created by read remain assigned until the end of the subshell, i.e. until the code hits a ), the end of a pipe (left-hand side of the pipe only in ksh), etc.
About the use of declare or typeset, note that you're only declaring TOTAL_SIZE as an integer. For the other variables, you're just assigning a value which happens to consist exclusively of digits. It doesn't matter for the code you posted, but it's probably not what you meant.
One thing that may be happening is that grep matches nothing, and therefore read reads an empty line. You should check for errors. Use set -e in scripts to exit at the first error. (There are cases where set -e doesn't catch errors, but it's a good start.)
Another thing that may be happening is that df is splitting its output onto multiple lines because the first column containing the filesystem name is too large. To prevent this splitting, pass the option -P.
Using a temporary file is fragile: the code may be executed in a read-only directory, another process may want to access the same file at the same time... Here a temporary file is useless. Just pipe directly into read. In ksh (unlike most other sh variants including bash), the right-hand side of a pipe runs in the main shell, so assignments to variables in the right-hand side of a pipe remain available in the following commands.
It doesn't matter in this particular script, but you can use a variable without $ in an arithmetic expression. Using $ substitutes a string, which can have confusing results: with a='1+2', $(($a*3)) expands to 7 (the substituted text 1+2*3 is evaluated), whereas $((a*3)) uses the numerical value and expands to 9 in ksh (in some sh implementations you get an error because a's value is not numeric).
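A quick way to see the difference from the command line (a throwaway demo, not part of the script):
ksh -c 'a="1+2"; echo $(($a*3)); echo $((a*3))'
# prints 7 (string substitution yields 1+2*3), then 9 (numeric value of a, times 3)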
#!/usr/bin/ksh
set -e
typeset -i TOTAL_SIZE=0 USED_SPACE=0 AVAILABLE_SPACE=0 AVAILABLE_PERCENT=0
df -Pk | grep "rpool/ROOT" | read RPOOL_NAME TOTAL_SIZE USED_SPACE AVAILABLE_SPACE AVAILABLE_PERCENT RSIGN
echo $RPOOL_NAME $TOTAL_SIZE $USED_SPACE $AVAILABLE_SPACE $AVAILABLE_PERCENT $RSIGN
((TOTAL_SIZE=TOTAL_SIZE/1024))
Strange... when I get rid of your "declare" line, your original code seems to work perfectly well (at least with ksh on Linux).
The code :
#!/bin/ksh
df -k | grep "/home" > dftest.out
read RPOOL_NAME TOTAL_SIZE USED_SPACE AVAILABLE_SPACE AVAILABLE_PERCENT RSIGN < dftest.out
\rm dftest.out
echo $RPOOL_NAME $TOTAL_SIZE $USED_SPACE $AVAILABLE_SPACE $AVAILABLE_PERCENT $RSIGN
((TOTAL_SIZE=$TOTAL_SIZE/1024))
print $TOTAL_SIZE
The result :
32962416 5732492 25552588 19% /home
5598
Which are the values a simple df -k returns. The variables seem to last.
For those interested, I have figured out that it is not possible to use "read" the way I was using it.
The variable values assigned by "read" simply "do not last".
To remedy this, I have applied the less-than-ideal solution of using the standard "while read" format and, inside the loop, echoing selected variables into a variable file.
Once that file was created, I just "loaded" it.
(pseudo code:)
LOOP START
echo "VAR_A="$VAR_A"; VAR_B="$VAR_B";" > somefile.out
LOOP END
. somefile.out
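For what it's worth, values assigned by read do persist in ksh; they disappear only when read runs in a subshell, e.g. on the right-hand side of a pipe in bash. A quick test to tell the two behaviours apart (a hypothetical snippet, not from the original script):
echo hello | read v
echo "v=$v"    # ksh: prints v=hello; bash: prints v= (read ran in a subshell)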

Perl script to copy logs with timestamp for every hour and paste into different file

First of all, I'm very new to programming, so I need your help writing a Perl script to do the following on Windows.
I have a big (1 GB) log file with timestamps, and it's difficult to read because it takes a long time to open. My requirement is to copy the last one hour of logs from the big log file into another file, then copy the next hour of data into a different file (so we will have 24 files for a day). The next day, the data in these files needs to be overwritten, or deleted and created anew.
Sample log :
09092016-00:02:00,..................
09092016-00:02:08,..................
09092016-00:02:15,..................
09092016-00:02:18,..................
Please help me with this. Thanks for your help in advance.
A simpler solution would be to use the split command to break the file into manageable sizes:
split -l 1000 logfile logpart.
will split your logfile into smaller files of 1000 lines each, named logpart.aa, logpart.ab, and so on (the logpart. prefix is just illustrative; without it, split names the pieces xaa, xab, ...).
You can then use grep -l to find which pieces contain the day you need:
grep -l 09092016 logpart.*
for example:
logfile="./log"
while read -r d m y h; do
grep "^$d$m$y-$h" "$logfile" > "partial-${y}${m}{$d}-${h}.log"
done < <(sed -n 's/\(..\)\(..\)\(....\)-\(..\)\(.*\)/\1 \2 \3 \4/p' "$logfile" | sort -u)
Easy, but not efficient: it reads the whole big logfile 25 times (once to gather the existing ddmmyyyy-hh prefixes in the log, and once more for every distinct date-hour found).
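A single-pass alternative (a sketch; the partial- filenames are illustrative) is to let awk route each line to a file named after its date-hour prefix:
awk -F- '{ out = "partial-" $1 "-" substr($2, 1, 2) ".log"; print > out }' ./log
This reads the log once, writing each line to the file for its hour, e.g. partial-09092016-00.log.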

perl memory usage when processing a file inline

I have a CGI script that's used by our employees to fetch logs from servers that they don't have direct access to. For reasons I won't go into, after a recent update to our app some of these logs now have characters like linefeeds, tabs, backslashes, etc. translated into their text equivalents. As such, I've modified the CGI script to invoke the following to convert these back to their original values:
perl -i -pe 's/\\r/\r/g && s/\\n/\n/g && s/\\t/\t/g && s/\\\//\//g' $filename
I was just informed that some people are now getting out of memory errors when they try to fetch logs that are fairly large (a few hundred MB).
My question: How does perl manage memory when an inline command like this is invoked? Is it reading the whole file in, processing it, then writing it out, or is it creating a temporary file, processing the lines from the input file one at a time then replacing the file once complete?
This is using perl 5.10.1 on a 64-bit Amazon linux instance.
The -p switch creates a while(<>){...; print} loop to iterate on each “line” in your input file.
If all of your newlines have been converted into "\\n", then your file would just be a single very long line. Therefore, your command would be loading the entire file into memory to perform your fix.
To avoid that, you'll have to intentionally buffer the file using either sysread or $/.
It would probably be easiest to create an actual script instead of a one-liner to do the work. However, if you know that all of your newlines are converted, then one simple fix would be to use $/ = "\\n"
As a secondary note, your one-liner is flawed: it chains the s/// substitutions with the && short-circuit operator, so if one of the earlier substitutions doesn't match on a particular line, none of the later ones are attempted. You should instead use simple semicolons to separate your substitutions:
's/\\r/\r/g; s/\\n/\n/g; s/\\t/\t/g; s|\\/|/|g'
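Putting the two fixes together ($/ set to the literal two-character sequence \n, and semicolons between the substitutions), the one-liner might look like this (a sketch; $filename is the variable from the question):
perl -i -pe 'BEGIN { $/ = "\\n" } s/\\r/\r/g; s/\\n/\n/g; s/\\t/\t/g; s|\\/|/|g' "$filename"
Each "line" perl reads now ends at a literal \n sequence, so memory use is bounded by the record length rather than the file size.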

Can kdb read from a named pipe?

I hope I'm doing something wrong, but it seems like kdb can't read data from named pipes (at least on Solaris). It blocks until they're written to but then returns none of the data that was written.
I can create a text file:
$ echo Mary had a little lamb > lamb.txt
and kdb will happily read it:
q) read0 `:/tmp/lamb.txt
enlist "Mary had a little lamb"
I can create a named pipe:
$ mkfifo lamb.pipe
and trying to read from it:
q) read0 `:/tmp/lamb.pipe
will cause kdb to block. Writing to the pipe:
$ cat lamb.txt > lamb.pipe
will cause kdb to return the empty list:
()
Can kdb read from named pipes? Should I just give up? I don't think it's a permissions thing (I tried setting -m 777 on my mkfifo command but that made no difference).
As of release kdb+ v3.4, q has support for named pipes. Depending on whether you want to implement a streaming algorithm or just read from the pipe, use either .Q.fps or read1 on a fifo pipe.
To implement streaming you can do something like:
q).Q.fps[0N!]`:lamb.pipe
Then $ cat lamb.txt > lamb.pipe
will print
,"Mary had a little lamb"
in your q session. More meaningful algorithms can be implemented by replacing 0N! with an appropriate function.
To read the contents of your file into a variable, do:
q)h:hopen`:fifo://lamb.pipe
q)myText: `char$read1(h)
q)myText
"Mary had a little lamb\n"
See the kdb+ documentation for more about named pipes.
When read0 fails, you can frequently fake it with system"cat ...". (I found this originally when trying to read stuff from /proc that also doesn't cooperate with read0.)
q)system"cat /tmp/lamb.pipe"
<blocks until you cat into the pipe in the other window>
"Mary had a little lamb"
q)
Just be aware there's a reasonably high overhead (as such things go in q) for invoking system: it spawns a whole shell process just to run whatever your command is.
You might also be able to do it directly with a custom C extension, probably calling read(2) directly…
The algorithm for read0 is not available to see what it is doing under the hood but, as far as I can tell, it expects a finite stream rather than a continuous one, so it will block until it receives an EOF.
Streaming from a pipe is supported from v3.4.
Detailed steps:
Remove any pre-existing pipe file:
rm -f /path/dataPipeFileName
Create the named pipe:
mkfifo /path/dataPipeFileName
Feed data into the pipe in the background ($1 here is a placeholder for the command that fetches the data):
q).util.system[$1]; / $1 = command to fetch data > /path/dataPipeFileName &
Connect to the pipe using kdb's .Q.fps:
q).Q.fps[0N!]`$":/path/",dataPipeFileName;
Reference:
.Q.fps (streaming algorithm)
Syntax: .Q.fps[x;y], where x is a unary function and y is a filepath.
.Q.fps is .Q.fs for pipes (since V3.4): it reads conveniently sized lumps of complete "\n"-delimited records from a pipe and applies a function to each. This enables you to implement a streaming algorithm, for example to convert a large CSV file into an on-disk kdb+ database without holding the data in memory all at once.
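From the shell side, the feed half of this pipeline might look like the following (a sketch; /tmp/data.pipe and gen_data.sh are hypothetical placeholders):
rm -f /tmp/data.pipe                # remove any stale pipe
mkfifo /tmp/data.pipe               # create the FIFO
./gen_data.sh > /tmp/data.pipe &    # hypothetical producer of "\n"-delimited records
with .Q.fps[{...}]`$":/tmp/data.pipe" then consuming the records inside the q session.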

zsh filename globbing/substitution

I am trying to create my first zsh completion script, in this case for the command netcfg.
Lame as it may sound, I am stuck on the first hurdle. Disclaimer: I know how to do this crudely; however, I seek the "ZSH WAY" to do it.
I need to list the files in /etc/network.d, but only the files, not the directory component, so I do the following:
echo $(ls /etc/network.d/*(.))
/etc/network.d/ethernet-dhcp /etc/network.d/wireless-wpa-config
What I wanted was:
ethernet-dhcp wireless-wpa-config
So I try (excuse my naivety):
echo ${(s/*\/)$(ls /etc/network.d/*(.))}
/etc/network.d/ethernet-dhcp /etc/network.d/wireless-wpa-config
It seems that this doesn't work. I'm sure there must be some clever way of doing this by splitting into an array and getting the last part, but as I say, I'm a complete noob at this.
Any advice gratefully received.
General note: there is no need to use ls to generate the filenames; you might as well use echo some*glob. But if you want to protect possible embedded newline characters, even that is a bad idea. The first example below globs directly into an array to protect embedded newlines. The second one uses printf to generate NUL-terminated data, accomplishing the same thing without using a variable.
It is easy to do if you are willing to use a variable:
typeset -a entries
entries=(/etc/network.d/*(.)) # generate the list
echo ${entries#/etc/network.d/} # strip the prefix from each one
You can also do it without a variable, but the extra stuff to isolate individual entries is a bit ugly:
# From the inside, to the outside:
# * glob the entries
# * NUL terminate them into a single string
# * split at NUL
# * strip the prefix from each one
echo ${${(0)"$(printf '%s\0' /etc/network.d/*(.))"}#/etc/network.d/}
Or, if you are going to use a subshell anyway (i.e. the command substitution in the previous example), just cd to the directory so it is not part of the glob expansion (plus, you do not have to repeat the directory name):
echo ${(0)"$(cd /etc/network.d && printf '%s\0' *(.))"}
Chris Johnsen's answer is full of useful information about zsh, however it doesn't mention the much simpler solution that works in this particular case:
echo /etc/network.d/*(:t)
This is using the t history modifier as a glob qualifier.
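For example, combining it with the plain-files qualifier from the question (assuming the same two files):
% echo /etc/network.d/*(.:t)
ethernet-dhcp wireless-wpa-config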
Thanks for your suggestions guys, having done yet more reading of ZSH and coming back to the problem a couple of days later, I think I've got a very terse solution which I would like to share for your benefit.
echo ${$(print /etc/network.d/*(.)):t}
I'm used to seeing basename(1) strip off directory components; also, you can use echo /etc/network/* to get the file listing without running the external ls program. (Running external programs can slow down completion more than you'd like; I didn't find a zsh builtin for basename, but that doesn't mean there isn't one.)
Here's something I hope will help:
haig% for f in /etc/network/* ; do basename $f ; done
if-down.d
if-post-down.d
if-pre-up.d
if-up.d
interfaces
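For completeness, the same loop can avoid the external basename by using zsh's :t modifier on the loop variable (same output as above):
haig% for f in /etc/network/* ; do print ${f:t} ; done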