Mosquito clear retained mojibake topic - unicode

I've published an mojibake topic because of my crash program, the ��� part is some random byte codes, e.g.
test/���������������/yoyoyo qqq
And if I want to clear it, I type something like
mosquitto_pub -t test/���������������/yoyoyo -r -n
But this didn't work, because these � character is not the original �, these codec is mismatch in ascii and unicode so present as � character.
How can I clear these retain message without delete the whole retain file?
Thanks!

If you can't work out what the actual char you can capture them and play them back into mosquitto_pub with something like this:
mosquitto_sub -v -C 1 -N -t 'test/+/yoyoyo' > file.txt
This will save to the file the topic and the payload of the first message that matches the pattern. There will be a space between the topic and paylaod.
You can then edit the file to remove the payload and leave just the topic (on a line on it's own with no new line at the end) and save this as edited_file.txt
You can then feed this back into mosquitto_pub, add -n flag (null message) and -r flag (retain message)
mosquitto_pub -t `cat edited_file.txt` -r -n

Related

Why does "sed -n -i" delete existing file contents?

Running Fedora 25 server edition. sed --version gives me sed (GNU sed) 4.2.2 along with the usual copyright and contact info. I've create a text file sudo vi ./potential_sed_bug. Vi shows the contents of this file (with :set list enabled) as:
don't$
delete$
me$
please$
I then run the following command:
sudo sed -n -i.bak /please/a\testing ./potential_sed_bug
Before we discuss the results; here is what the sed man page says:
-n, --quiet, --silent
suppress automatic printing of pattern space
and
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if extension supplied). The default operation mode is to break symbolic and hard links. This can be changed with --follow-symlinks and --copy.
I've also looked other sed command references to learn how to append with sed. Based on my understanding from the research I've done; the resulting file content should be:
don't
delete
me
please
testing
However, running sudo cat ./potential_sed_bug gives me the following output:
testing
In light of this discrepancy, is my understanding of the command I ran incorrect or is there a bug with sed/the environment?
tl;dr
Don't use -n with -i: unless you use explicit output commands in your sed script, nothing will be written to your file.
Using -i produces no stdout (terminal) output, so there's nothing extra you need to do to make your command quiet.
By default, sed automatically prints the (possibly modified) input lines to whatever its output target is, whether implied or explicitly specified: by default, to stdout (the terminal, unless redirected); with -i, to a temporary file that ultimately replaces the input file.
In both cases, -n suppresses this automatic printing, so that - unless you use explicit output functions such as p or, in your case, a - nothing gets printed to stdout / written to the temporary file.
Note that the automatic printing applies to the so-called pattern space, which is where the (possibly modified) input is held; explicit output functions such as p, a, i and c do not print to the pattern space (for potential subsequent modification), they print directly to the target stream / file, which is why a\testing was able to produce output, despite the use of -n.
Note that with -i, sed's implicit printing / explicit output commands only print to the temporary file, and not also to stdout, so a command using -i is invariably quiet with respect to stdout (terminal) output - there's nothing extra you need to do.
To give a concrete example (GNU sed syntax).
Since the use of -i is incidental to the question, I've omitted it for simplicity. Note that -i prints to a temporary file first, which, on completion, replaces the original. This comes with pitfalls, notably the potential destruction of symlinks; see the lower half of this answer of mine.
# Print input (by default), and append literal 'testing' after
# lines that contain 'please'.
$ sed '/please/ a testing' <<<$'yes\nplease\nmore'
yes
please
testing
more
# Adding `-n` suppresses the default printing, so only `testing` is printed.
# Note that the sequence of processing is exactly the same as without `-n`:
# If and when a line with 'please' is found, 'testing' is appended *at that time*.
$ sed -n '/please/ a testing' <<<$'yes\nplease\nmore'
testing
# Adding an unconditional `p` (print) call undoes the effect of `-n`.
$ sed -n 'p; /please/ a testing' <<<$'yes\nplease\nmore'
yes
please
testing
more

How are perl's -T and -B implemented?

What does perl's -T function really do? From the man page on perlfunc:
-T File is an ASCII text file (heuristic guess).
-B File is a "binary" file (opposite of -T).
Is the -B option simply equivalent to ! -T, or is it simply an inversion of the heuristic, such that some of the time, a file may be true for both -B and -T. Does the heuristic have, say, a threshold for control characters? Does it ignore tabs, EOLs, EOFs and NULs?
From the same page:
The -T and -B switches work as follows.
The first block or so of the file is examined to see if it is valid UTF-8 that includes non-ASCII characters. If, so it's a -T file. Otherwise, that same portion of the file is examined for odd characters such as strange control codes or characters with the high bit set. If more than a third of the characters are strange, it's a -B file; otherwise it's a -T file. Also, any file containing a zero byte in the examined portion is considered a binary file. (If executed within the scope of a use locale which includes LC_CTYPE , odd characters are anything that isn't a printable nor space in the current locale.) If -T or -B is used on a filehandle, the current IO buffer is examined rather than the first block. Both -T and -B return true on an empty file, or a file at EOF when testing a filehandle. Because you have to read a file to do the -T test, on most occasions you want to use a -f against the file first, as in next unless -f $file && -T $file .

mitmproxy record to outfile utf8 encoding error

I am using mitmproxy and want to record every request and reponse to file,so I use "-w" option just as following:
mitmproxy -b 192.168.1.107 -p 9527 -w ~/Desktop/aaa.txt
but when I open the 'aaa.txt',it display unreadable content which is just as following:
[x§‡:ÖáHi4GÐL¿¤Ìé4Îæyùͧq¼<µYÂ&É‹¶Mñ+GÒ‡i8
avÅÆdT£<_‰»ÚÀ—æÏÂÓSòo“çˆ$B6KƒßÛVÚ¼rq{”2w.®NÉRhÔ…x)¥qÕ¾0‡8éÙOøóŸüÍ—òÛ_þãnñ—‡"Ä‚NqiŠ¬#JÔî"œE§"CJ&0‡Í*NCBé r:G£O1yùè“æRQB4
I also try the script:https://github.com/mitmproxy/mitmproxy/blob/master/examples/flowwriter.py
it still doesn't work, so is there some encoding error?
mitmproxy -w writes a serialized (not primarily human-readable) dump file that can be read again using -r. If the content of a message are e.g. gzip-encoded, you'll see gzip-encoded data in the dumpfile. If you want human-readable output to a text file, I'd suggest running
mitmdump -r ~/Desktop/aaa.txt -n -dd
Explanation:
-r: Read an existing dump file
-n: Do not start a proxy server
-d: increase output details/verbosity (-ddd if you don't want contents to be cut off)

Extracting the contents between two different strings using bash or perl

I have tried to scan through the other posts in stack overflow for this, but couldn't get my code work, hence I am posting a new question.
Below is the content of file temp.
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/<env:Body><dp:response xmlns:dp="http://www.datapower.com/schemas/management"><dp:timestamp>2015-01-
22T13:38:04Z</dp:timestamp><dp:file name="temporary://test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file><dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file></dp:response></env:Body></env:Envelope>
This file contains the base64 encoded contents of two files names test.txt and test1.txt. I want to extract the base64 encoded content of each file to seperate files test.txt and text1.txt respectively.
To achieve this, I have to remove the xml tags around the base64 contents. I am trying below commands to achieve this. However, it is not working as expected.
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g' > test.txt
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g' > test1.txt
Below command:
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g'
produces output:
XJzLXJlc3VsdHMtYWN0aW9uX18i
<dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:response> </env:Body></env:Envelope>`
Howeveer, in the output I am expecting only first line XJzLXJlc3VsdHMtYWN0aW9uX18i. Where I am commiting mistake?
When i run below command, I am getting expected output:
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g'
It produces below string
lc3VsdHMtYWN0aW9uX18i
I can then easily route this to test1.txt file.
UPDATE
I have edited the question by updating the source file content. The source file doesn't contain any newline character. The current solution will not work in that case, I have tried it and failed. wc -l temp must output to 1.
OS: solaris 10
Shell: bash
sed -n 's_<dp:file name="\([^"]*\)">\([^<]*\).*_\1 -> \2_p' temp
I add \1 -> to show link from file name to content but for content only, just remove this part
posix version so on GNU sed use --posix
assuming that base64 encoded contents is on the same line as the tag around (and not spread on several lines, that need some modification in this case)
Thanks to JID for full explaination below
How it works
sed -n
The -n means no printing so unless explicitly told to print, then there will be no output from sed
's_
This is to substitute the following regex using _ to separate regex from the replacement.
<dp:file name=
Regular text
"\([^"]*\)"
The brackets are a capture group and must be escaped unless the -r option is used( -r is not available on posix). Everything inside the brackets is captured. [^"]* means 0 or more occurrences of any character that is not a quote. So really this just captures anything between the two quotes.
>\([^<]*\)<
Again uses the capture group this time to capture everything between the > and <
.*
Everything else on the line
_\1 -> \2
This is the replacement, so replace everything in the regex before with the first capture group then a -> and then the second capture group.
_p
Means print the line
Resources
http://unixhelp.ed.ac.uk/CGI/man-cgi?sed
http://www.grymoire.com/Unix/Sed.html
/usr/xpg4/bin/sed works well here.
/usr/bin/sed is not working as expected in case if the file contains just 1 line.
below command works for a file containing only single line.
/usr/xpg4/bin/sed -n 's_<env:Envelope\(.*\)<dp:file name="temporary://BackUpDir/backupmanifest.xml">\([^>]*\)</dp:file>\(.*\)_\2_p' securebackup.xml 2>/dev/null
Without 2>/dev/null this sed command outputs the warning sed: Missing newline at end of file.
This because of the below reason:
Solaris default sed ignores the last line not to break existing scripts because a line was required to be terminated by a new line in the original Unix implementation.
GNU sed has a more relaxed behavior and the POSIX implementation accept the fact but outputs a warning.

How to search and replace in text files only?

I have a directory containing a bunch of files, some text some binary, with no consistent naming. I want to search and replace a string in text files only. So I went with:
perl -i -pne 's#/some/text/to/replace#/replacement/text#' *
Remove the -i option and you will see that binary files get caught. How do I modify this one-liner to skip binary files?
ack -n --text --sort -f . | xargs perl -i -pne 's…'
Abusing ack goes much quicker than writing your own solution with -T.
Well, this is all based on what your definition of a text file is. Perl 5 has the -T filetest operator that will tell you if a filename or filehandle is a text file (using Perl 5's definition):
perl -i -pne 'BEGIN{#ARGV=grep-T,#ARGV}s#regex#replacement#' *
The BEGIN block will filter out any files that don't pass the -T test, so they won't even be read (except for their first block because that is what -T uses to determine if they are text).
From perldoc -f -X
The -T and -B switches work as follows. The first block or so of the file is examined for odd characters such as strange control codes or characters with the high bit set. If too many strange characters (>30%) are found, it's a -B file; otherwise it's a -T file. Also, any file containing a zero byte in the first block is considered a binary file. If -T or -B is used on a filehandle, the current IO buffer is examined rather than the first block. Both -T and -B return true on an empty file, or a file at EOF when testing a filehandle. Because you have to read a file to do the -T test, on most occasions you want to use a -f against the file first, as in next unless -f $file && -T $file .