Understanding ibase and obase used - command-line

I'm trying to solve the following exercise:
Write a command line that takes numbers from variables FT_NBR1, in '\"?! base, and FT_NBR2, in mrdoc base, and displays the sum of both in gtaio luSnemf base.
I know the solution is:
echo $FT_NBR1 + $FT_NBR2 | sed 's/\\/1/g' | sed 's/?/3/g' | sed 's/!/4/g' | sed "s/\'/0/g" | sed "s/\"/2/g" | tr "mrdoc" "01234" | xargs echo "ibase=5; obase=23;" | bc | tr "0123456789ABC" "gtaio luSnemf"
I don't understand why ibase=5 and obase=23.
I read about ibase and obase, and I understand this is a base conversion from base 5 to base 23. Can anyone explain why 5 and 23? Thank you.

The exercise description is a bit weird. A better one would be
Write a command line that takes numbers from variables FT_NBR1, with numbers represented by the letters "'\"?!", and FT_NBR2, represented by "mrdoc", and displays the sum of both with numbers represented by "gtaio luSnemf".
A shorter answer would be
echo $FT_NBR1 + $FT_NBR2 | tr "'"'\\"?!' "01234" | tr "mrdoc" "01234" | xargs echo "ibase=5; obase=23;" | bc | tr "0123456789ABC" "gtaio luSnemf"
Let's take it from the beginning:
echo $FT_NBR1 + $FT_NBR2 creates the expression using the input strings
tr "'"'\\"?!' "01234" translates the first input alphabet into digits; the set is written as two adjacent quoted pieces ("'" for the apostrophe, '\\"?!' for the rest), and tr reads \\ as a single backslash
tr "mrdoc" "01234" translates the second input alphabet into digits
xargs echo "ibase=5; obase=23;" prepends the number base information. Each input alphabet has 5 symbols, so the input base is 5; the output alphabet "gtaio luSnemf" has 13 symbols (the space counts), so the output base is 13. Because ibase is set first, bc reads the obase value in base 5, and 13 written in base 5 is 23 (2*5 + 3).
bc does the actual calculation
tr "0123456789ABC" "gtaio luSnemf" translates the result into the output alphabet.

Related

Multiple mathematical operations on a file containing numbers

I have extracted the following data using 'grep' & 'sed' pipes from a file and now I want to perform a mathematical equation on the last two numbers, delete them and replace them with a single number.
Mathematical operations
Add the numbers together
divide by 2
multiply by 141
ROUNDUP to whole number
File Data
AJ29 IO_0_VRN_10 77.234 78.011
AJ30 IO_L1P_T0_100M 89.886 90.789
AJ31 IO_L1N_T0_100S 101.388 102.406
AK29 IO_L2P_T0_101M 66.163 66.828
AL29 IO_L2N_T0_101S 63.626 64.266
So the line starting AJ29 should appear as:
AJ29 IO_0_VRN_10 10945
I could put it in MS excel / Open Office calc and do this but want to avoid MS and keep it in a single linux script if it is possible. Hope you can help. The script I have so far is below and ideally I'd like to add a few more pipes to achieve this.
grep IOB xc7vx690tffg1930.pkg | sed 's/pin//g' | sed 's/IOB_[A-Za-z0-9]*//g' | sed 's/ /-/g' | sed 's/\t//g' | sed 's/^[-]*//g' | sed 's/-/ /g' | sed 's/ [0-9][0-9] //g' | sed 's/[[:space:]]\+/,/g' | sed 's/,X[0-9A-Z]*,//g' | sed 's/,[0-9]*[A-Z],//g' | sed 's/N\.A\.,/,/g' | sed 's/,$//g' | sed 's/,/ /g'
For calculations, use awk!
$ awk '{$(NF-1)=sprintf("%.0f", ($(NF-1) + $NF)/2 * 141); NF--}1' file
AJ29 IO_0_VRN_10 10945
AJ30 IO_L1P_T0_100M 12738
AJ31 IO_L1N_T0_100S 14367
AK29 IO_L2P_T0_101M 9376
AL29 IO_L2N_T0_101S 9016
This replaces the penultimate field with the result of (penultimate + last)/2 * 141. To round, we use the %.0f format, which rounds to the nearest whole number, as indicated in "Awk printf number in width and round it up".
Also, it looks to me that you are piping way too many things: I counted one call to grep and 13 (!) to sed. You can probably use sed -e 'first block' -e 'second block' ... instead.
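As a minimal sketch of that suggestion, several substitutions can share a single sed process. The sample input line and the function name clean_line are made up for illustration; they are not taken from the .pkg file or the original script.

```shell
# One sed invocation with several -e expressions instead of a chain of pipes.
clean_line() {
    sed -e 's/pin//g' -e 's/^-*//' -e 's/,$//' -e 's/,\{1,\}/ /g'
}

printf 'pin-A1,,B2,\n' | clean_line    # prints: A1 B2
```

The expressions run in order on each line, exactly as the separate piped sed calls would, but only one process is started.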
Explanation
In awk, NF refers to the number of fields on the current line. Since $n refers to field number n, $(NF-1) refers to the penultimate field.
{...}1 does the work inside the braces and then prints the resulting line. 1 evaluates as True, and anything True triggers awk to perform its default action, which is to print the current line.
($(NF-1) + $NF)/2 * 141 performs the calculation (penultimate + last) / 2 * 141.
$(NF-1)=sprintf( ... ) assigns the result of the previous calculation to the penultimate field. Using sprintf with %.0f we make sure the rounding is performed, as described above.
{...; NF--} once the calculation is done, we have its result in the penultimate field. To remove the last column, we just say "hey, decrease the number of fields" so that the last one gets "removed".
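Note that %.0f rounds to the nearest integer, so the AJ31 row (14367.477) comes out as 14367. If a strict ROUNDUP is wanted, as in the question, a ceiling can be computed by hand. The following is a sketch under that assumption; the function name roundup_rows is hypothetical, and the output line is rebuilt field by field rather than with NF--.

```shell
# Sketch: replace the last two fields with ceil((a + b) / 2 * 141).
roundup_rows() {
    awk '{
        v = ($(NF-1) + $NF) / 2 * 141
        r = int(v)                 # int() truncates toward zero
        if (r < v) r++             # bump up when a fractional part was dropped
        out = $1
        for (i = 2; i <= NF - 2; i++) out = out " " $i
        print out, r
    }'
}

printf '%s\n' 'AJ29 IO_0_VRN_10 77.234 78.011' \
              'AJ31 IO_L1N_T0_100S 101.388 102.406' | roundup_rows
# prints:
# AJ29 IO_0_VRN_10 10945
# AJ31 IO_L1N_T0_100S 14368
```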

How to escape minus in regular expression with sed?

I need to free a string from unwanted characters. In this example I want to filter all +'s and all -'s from b and write the result to c. So if b is +fdd-dfdf+, c should be +-+.
read b
c=$(echo $b | sed 's/[^(\+|\-)]//g')
But when I run the script, the console says:
sed: -e expression #1, char 15: Invalid range end
The reason is the \- in my regular expression. How can I solve this problem and say, that I want to filter all -'s?
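Are you looking for this? Inside a bracket expression a backslash does not escape anything, and ( | ) are ordinary characters, so put the - first or last in the set to have it taken literally: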
kent$ echo 'a + b + c - d - e'|sed 's/[^-+]//g'
++--
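Applied to the example from the question, a minimal sketch looks like this. The function name keep_signs is made up for illustration.

```shell
# Keep only + and - characters; the - comes last in the set, so it is literal.
keep_signs() {
    printf '%s\n' "$1" | sed 's/[^+-]//g'
}

keep_signs '+fdd-dfdf+'    # prints +-+
```

In the script from the question this would become c=$(keep_signs "$b"), with the variable quoted so that the shell does not split or glob its contents.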

Produce table with text and number with fprintf in Matlab

I need to produce a table whose first 2 columns have text, and the remaining 2 have numbers. Something like this:
| Ford | Mustang | 1975 | 35 |
| Chev | Camaro | 1976 | 38 |
I have the string in a cell, and the numeric variables in a matrix. I've tried with fprintf but can't make it work. I have no problems doing it in xlswrite, but I don't want to go that way. Any ideas please?
Thanks!
You could use fprintf in a loop like this:
for i = 1:numel(company)
    fprintf(1, '| %8s | %8s | %4d | %2d |\n', ...
        company{i}, model{i}, year(i), otherNumber(i));
end
to write to stdout. You can also change the field widths in the format string (the 8 in %8s, for example) if you want different spacing in your table, or pass a different file identifier as the first argument.

Add leading 0 in sed substitution

I have input data:
foo 24
foobar 5 bar
bar foo 125
and I'd like to have output:
foo 024
foobar 005 bar
bar foo 125
So I can use this sed substitutions:
s,\([a-z ]\+\)\([0-9]\)\([a-z ]*\),\100\2\3,
s,\([a-z ]\+\)\([0-9][0-9]\)\([a-z ]*\),\10\2\3,
But, can I make one substitution, that will do the same? Something like:
if (one digit) then two leading 0
elif (two digits) then one leading 0
Regards.
I doubt that the "if - else" logic can be incorporated in one substitution command without saving the intermediate data (length of the match for instance). It doesn't mean you can't do it easily, though. For instance:
$ N=5
$ sed -r ":r;s/\b[0-9]{1,$(($N-1))}\b/0&/g;tr" infile
foo 00024
foobar 00005 bar
bar foo 00125
It uses a loop: each pass adds one zero to every number that is shorter than $N digits, and the loop ends when no more substitutions can be made. :r defines a label named r, and the t command (written tr here) jumps back to r whenever the preceding substitution changed something. The sed documentation on flow control has more details.
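With N=3 the same loop produces the output asked for in the question. This is a sketch that assumes GNU sed (for \b and for a label terminated by a semicolon); the function name pad_numbers is made up for illustration.

```shell
# Pad every whole number on stdin with leading zeros up to width $1.
pad_numbers() {
    n=$1
    # :r marks the loop start; "tr" jumps back to r while s/// keeps matching.
    sed -E ":r;s/\b[0-9]{1,$((n - 1))}\b/0&/g;tr"
}

printf 'foo 24\nfoobar 5 bar\nbar foo 125\n' | pad_numbers 3
# prints:
# foo 024
# foobar 005 bar
# bar foo 125
```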
Use two substitute commands: the first one searches for a single digit and inserts two zeroes just before it, and the second one searches for a two-digit number and inserts one zero just before it. GNU sed is needed because of the word-boundary escape (\b) used to delimit the numbers.
sed -e 's/\b[0-9]\b/00&/g; s/\b[0-9]\{2\}\b/0&/g' infile
EDIT to add a test:
Content of infile:
foo 24 9
foo 645 bar 5 bar
bar foo 125
Run previous command with following output:
foo 024 009
foo 645 bar 005 bar
bar foo 125
Prepend the maximum number of zeros that could be needed first, then keep only the last N characters (here N = 8):
echo 55 | sed -e 's:^:0000000:' -e 's:0\+\(.\{8\}\)$:\1:'
00000055
You seem to have the sed options covered, so here's one way with awk (GNU awk is needed, since it relies on a regular-expression RS and on RT):
BEGIN { RS="[ \n]"; ORS=OFS="" }
/^[0-9]+$/ { $0 = sprintf("%03d", $0) }
{ print $0, RT }
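Where GNU awk is not available, a field loop gives a similar result. This is a sketch, not the answer's method: the function name pad3_fields is hypothetical, and unlike the RS/RT version it collapses runs of whitespace to single spaces on any line where a field is rewritten.

```shell
# Zero-pad every all-digit field to width 3, using only POSIX awk features.
pad3_fields() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+$/) $i = sprintf("%03d", $i)
        print
    }'
}

printf 'foo 24\nfoobar 5 bar\nbar foo 125\n' | pad3_fields
# prints:
# foo 024
# foobar 005 bar
# bar foo 125
```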
I find the following sed approach to pad an integer with zeroes to 5 (n) digits quite straightforward:
sed -e "s/\<\([0-9]\{1,4\}\)\>/0000\1/; s/\<0*\([0-9]\{5\}\)\>/\1/"
If there are at least 1 and at most 4 (n-1) digits, add 4 (n-1) zeroes in front.
If, after the first transformation, there is any number of zeroes followed by 5 (n) digits, keep just those last 5 (n) digits.
When there happen to be more than 5 (n) digits, this approach behaves the usual way: nothing is padded or trimmed.
Input:
0
1
12
123
1234
12345
123456
1234567
Output:
00000
00001
00012
00123
01234
12345
123456
1234567
This might work for you (GNU sed):
echo '1.23 12,345 1 12 123 1234 1' |
sed 's/\(^\|\s\)\([0-9]\(\s\|$\)\)/\100\2/g;s/\(^\|\s\)\([0-9][0-9]\(\s\|$\)\)/\10\2/g'
1.23 12,345 001 012 123 1234 001
or perhaps a little easier on the eye:
sed -r 's/(^|\s)([0-9](\s|$))/\100\2/g;s/(^|\s)([0-9][0-9](\s|$))/\10\2/g'

Wordnet synsets using perl

I installed WordNet::Similarity and WordNet::QueryData as an easy way to calculate the information content score and probability that come with these modules. But I'm stuck at this basic problem: given a word, print n words similar to it, which should be no more difficult than iterating through the synsets and doing a join.
Using the wn command and piping it through a whole lot of tr, sort | uniq, I can get all the words:
wn cat -synsn | grep -v Sense | tr '=' ' ' | tr '>' ' ' | tr '\t' ' ' | tr ',' '\n' | sort | uniq
OUTPUT
8 senses of cat
adult female
adult male
African tea
Arabian tea
big cat
bozo
cat
cat
CAT
Caterpillar
cat-o'-nine-tails
computed axial tomography
computed tomography
computerized axial tomography
computerized tomography
CT
excitant
felid
feline
gossip
gossiper
gossipmonger
guy
hombre
kat
khat
man
newsmonger
qat
quat
rumormonger
rumourmonger
stimulant
stimulant drug
Synonyms/Hypernyms (Ordered by Estimated Frequency) of noun cat
tracked vehicle
true cat
whip
woman
X-radiation
X-raying
But it's kind of nasty and needs further cleanup.
What my script looks like is below, and what I want to get is all the words in cat#n1...8.
SCRIPT
use WordNet::QueryData;
my $wn = WordNet::QueryData->new( noload => 1);
print "Senses: ", join(", ", $wn->querySense("cat#n")), "\n";
print "Synset: ", join(", ", $wn->querySense("cat", "syns")), "\n";
print "Hyponyms: ", join(", ", $wn->querySense("cat#n#1", "hypo")), "\n";
OUTPUT:
Senses: cat#n#1, cat#n#2, cat#n#3, cat#n#4, cat#n#5, cat#n#6, cat#n#7, cat#n#8
Synset: cat#n, cat#v
Hyponyms: domestic_cat#n#1, wildcat#n#3
SCRIPT
use WordNet::QueryData;
my $wn = WordNet::QueryData->new;
foreach $word (qw/cat#n/) {
    @senses = $wn->querySense($word);
    foreach $wps (@senses) {
        @gloss = $wn->querySense($wps, "syns");
        print "$wps : @gloss\n";
    }
}
OUTPUT:
cat#n#1 : cat#n#1 true_cat#n#1
cat#n#2 : guy#n#1 cat#n#2 hombre#n#1 bozo#n#2
cat#n#3 : cat#n#3
cat#n#4 : kat#n#1 khat#n#1 qat#n#1 quat#n#1 cat#n#4 Arabian_tea#n#1 African_tea#n#1
cat#n#5 : cat-o'-nine-tails#n#1 cat#n#5
cat#n#6 : Caterpillar#n#2 cat#n#6
cat#n#7 : big_cat#n#1 cat#n#7
cat#n#8 : computerized_tomography#n#1 computed_tomography#n#1 CT#n#2 computerized_axial_tomography#n#1 computed_axial_tomography#n#1 CAT#n#8
P.S.
I have never written perl before, but have been looking into perl scripts since morning - and can now understand the basic stuff. Just need to know if there is cleaner way to do this using the api docs - couldn't figure out from the api or usergroup archives.
Update:
I think I'll settle with:
wn cat -synsn | sed '1,6d' |sed 's/Sense [[:digit:]]//g' | sed 's/[[:space:]]*=> //' | sed '/^$/d'
sed rocks!
I think you'll find the following helpful...
http://marimba.d.umn.edu/WordNet-Pairs/
What are the N most similar words to X, according to WordNet?
This data seeks to answer that question, where similarity is based on
measures from WordNet::Similarity. http://wn-similarity.sourceforge.net
-------------- verb data
These files were created with WordNet::Similarity version 2.05 using
WordNet 3.0. They show all the pairwise verb-verb similarities found
in WordNet according to the path, wup, lch, lin, res, and jcn measures.
The path, wup, and lch are path-based, while res, lin, and jcn are based
on information content.
As of March 15, 2011 pairwise measures for all verbs using the six
measures above are availble, each in their own .tar file. Each *.tar
file is named as WordNet-verb-verb-MEASURE-pairs.tar, and is approx
2.0 - 2.4 GB compressed. In each of these .tar files you will find
25,047 files, one for each verb sense. Each file consists of 25,048 lines,
where each line (except the first) contains a WordNet verb sense and the
similarity to the sense featured in that particular file. Doing
the math here, you find that each .tar file contains about 625,000,000
pairwise similarity values. Note that these are symmetric (sim (A,B)
= sim (B,A)) so you have a bit more than 300 million unique values.
-------------- noun data
As of August 19, 2011 pairwise measures for all nouns using the path
measure are available. This file is named WordNet-noun-noun-path-pairs.tar.
It is approximately 120 GB compressed. In this file you will find
146,312 files, one for each noun sense. Each file consists of
146,313 lines, where each line (except the first) contains a WordNet
noun sense and the similarity to the sense featured in that particular
file. Doing the math here, you find that each .tar file contains
about 21,000,000,000 pairwise similarity values. Note that these
are symmetric (sim (A,B) = sim (B,A)) so you have around 10 billion
unique values.
We are currently running wup, res, and lesk, but do not have an
estimated date of availability yet.
Put this in a script, say synonym.sh
wn "$1" -synsn | sed '1,6d' |sed 's/Sense [[:digit:]]//g' | sed 's/[[:space:]]*=> //' | sed '/^$/d' | sed 's/ //g' | grep -iv "$1" | tr '\n' ','
wn "$1" -synsv | sed '1,6d' |sed 's/Sense [[:digit:]]//g' | sed 's/[[:space:]]*=> //' | sed '/^$/d' | sed 's/ //g' | grep -iv "$1" | tr '\n' ',';echo
From your perl script
system("/path/synonym.sh","kittens");
system("/path/synonym.sh","cats");