How to merge two XML files (Web.config)?

I have two XML files. If I use Microsoft's XmlDiffPatch, it deletes elements, but I don't want to delete nodes, only replace values or add nodes. How can I do that? For example:
web1.config is like:
<PartPriceInfo xmlns:ns2="http://www.aa.com">
<ns2:Subaru model="Outback">
<ns2:Muffler> 600 </ns2:Muffler>
<ns2:Bumper> 150 </ns2:Bumper>
<ns2:Floormat> 75 </ns2:Floormat>
<ns2:WindShieldWipers> 25 </ns2:WindShieldWipers>
</ns2:Subaru>
</PartPriceInfo>
and web2.config is like this:
<PartPriceInfo xmlns:ns2="http://www.aa.com">
<ns2:Subaru model="Outback">
<ns2:Muffler> 700 </ns2:Muffler>
<ns2:Bumper> 150 </ns2:Bumper>
</ns2:Subaru>
</PartPriceInfo>
I want the result to look like this:
<PartPriceInfo xmlns:ns2="http://www.aa.com">
<ns2:Subaru model="Outback">
<ns2:Muffler> 700</ns2:Muffler>
<ns2:Bumper> 150 </ns2:Bumper>
<ns2:Floormat> 75 </ns2:Floormat>
<ns2:WindShieldWipers> 25 </ns2:WindShieldWipers>
</ns2:Subaru>
</PartPriceInfo>

Finally I found the Microsoft XML Diff and Patch Tool, and it got the job done. Good luck, everyone.
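If you'd rather do the merge in code, here is a minimal Python sketch of an add/replace-only merge using the standard library (an illustration, not the XmlDiffPatch API; it assumes each child tag occurs at most once per parent, as in the example above):

import xml.etree.ElementTree as ET

# Keep the ns2 prefix in the output instead of an auto-generated one.
ET.register_namespace("ns2", "http://www.aa.com")

def merge(base, overlay):
    # Replace values and add missing nodes from overlay into base; never delete.
    existing = {child.tag: child for child in base}
    for child in overlay:
        if child.tag in existing:
            existing[child.tag].text = child.text  # replace the value
            merge(existing[child.tag], child)      # recurse into nested elements
        else:
            base.append(child)                     # add the node that base lacks

tree = ET.parse("web1.config")
merge(tree.getroot(), ET.parse("web2.config").getroot())
tree.write("merged.config")

Merging web2.config into web1.config this way updates Muffler to 700 while keeping Floormat and WindShieldWipers, which is the result shown above.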

Related

How to get the difference between two columns using Talend

I have two Excel sheets that I want to compare using a Talend job.
The first Excel sheet is named Compare_Me_1:
PN          STT  Designation
AY73101000  20   RC0402FR-0743K2L
AY73101000  22   RK73H1ETTP4322F
AY73101000  22   ERJ-2RKF4322X
Ac2566      70   CRCW040243K2FKED
The second Excel sheet is named Compare_Me_2:
PN          STT  Designation
AY73101000  20   RC0402FR-0743K2L
AY73101000  22   RK73H1ETTP4322F
AY73101000  21   ERJ-2RKF4322X
Ac2566      70   CRCW040243K2FKED
What I want to achieve is this output:
PN1         STT1  STT2  STT_OK_Ko  Designation1      Designation2      Designation_Ok_Ko
AY73101000  20    20    ok         RC0402FR-0743K2L  RC0402FR-0743K2L  ok
AY73101000  22    22    ok         RK73H1ETTP4322F   RK73H1ETTP4322F   ok
AY73101000  22    21    ko         ERJ-2RKF4322X     ERJ-2RKF4322X     ok
Ac2566      70    70    ok         CRCW040243K2FKED  CRCW040243K2FKED  ok
So to achieve this I developed a Talend job that looks like below:
In my tMap I linked PN with a left outer join and "All Matches" correspondence.
And to get, for example, STT_Ok_KO, I used the code below to compare my two inputs:
(!Relational.ISNULL(row14.STT) && !Relational.ISNULL(row13.STT) &&
row14.STT.equals(row13.STT) ) ||
(Relational.ISNULL(row14.STT) && Relational.ISNULL(row13.STT))
?"ok":"ko"
Is this the correct way to achieve my output? If not, please recommend another method.
Any suggestion is welcome.
You probably need to follow the long steps below:
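Independent of the Talend job itself, you can sanity-check the comparison logic with a minimal Python sketch (an illustration only; it assumes the two sheets are exported to hypothetical CSV files compare_me_1.csv and compare_me_2.csv, and it pairs rows by position, which is what the expected output above does):

import csv

def load(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

left = load("compare_me_1.csv")   # hypothetical export of Compare_Me_1
right = load("compare_me_2.csv")  # hypothetical export of Compare_Me_2

# Null-safe equality, like the tMap ternary: equal (or both missing) -> ok.
def flag(a, b):
    return "ok" if (a or "") == (b or "") else "ko"

for r1, r2 in zip(left, right):
    print(r1["PN"], r1["STT"], r2["STT"], flag(r1["STT"], r2["STT"]),
          r1["Designation"], r2["Designation"],
          flag(r1["Designation"], r2["Designation"]))

If the printed ok/ko flags match the table above, the join and the expression in the Talend job are doing what you expect.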

How to extract data set from a text file?

I am quite new to the Unix field and I am currently trying to extract a data set from a text file. I tried sed, grep, and awk, but they seem to only work for extracting lines, whereas I want to extract an entire data set... Here is an example of the file from which I'd like to extract the 2 data sets (the figures after the lines "R.Time Intensity"):
[Header]
Application Name LabSolutions
Version 5.87
Data File Name C:\LabSolutions\Data\Antoine\170921_AC_FluoSpectra\069_WT3a derivatized lignin LiCl 430_GPC_FOREVER_430_049.lcd
Output Date 2017-10-12
Output Time 12:07:32
[Configuration]
Instrument Name BOTAN127-Instrument1
Instrument # 1
Line # 1
# of Detectors 3
Detector ID Detector A Detector B PDA
Detector Name Detector A Detector B PDA
# of Channels 1 1 2
[LC Chromatogram(Detector A-Ch1)]
Interval(msec) 500
# of Points 9603
Start Time(min) 0,000
End Time(min) 80,017
Intensity Units mV
Intensity Multiplier 0,001
Ex. Wavelength(nm) 405
Em. Wavelength(nm) 430
R.Time (min) Intensity
0,00000 -709779
0,00833 -709779
0,01667 17
0,02500 3
0,03333 7
0,04167 19
0,05000 9
0,05833 5
0,06667 2
0,07500 24
0,08333 48
[LC Chromatogram(Detector B-Ch1)]
Interval(msec) 500
# of Points 9603
Start Time(min) 0,000
End Time(min) 80,017
Intensity Units mV
Intensity Multiplier 0,001
R.Time (min) Intensity
0,00000 149
0,00833 149
0,01667 -1
I would greatly appreciate any idea. Thanks in advance.
Antoine
awk '/^[^0-9]/&&d{d=0} /R.Time/{d=1}d' file
Brief explanation:
Set d as a flag that decides whether to print the line.
/^[^0-9]/&&d{d=0}: if the regex ^[^0-9] matches and d==1, disable d
/R.Time/{d=1}: if the string "R.Time" is found, enable d
awk '/R.Time/,/LC/' file|grep -v -E "R.Time|LC"
The grep part removes the R.Time and LC lines that come as part of the awk output.
I think it's a job for sed.
sed '/R.Time/!d;:A;N;/\n$/!bA' infile
It deletes every line until one matches R.Time, then keeps appending the following lines until an empty line ends the block, so it assumes the data sets are terminated by blank lines.
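If you prefer something more explicit than a one-liner, here is a small Python sketch along the same lines (an illustration; it assumes each data set runs from an "R.Time" header to the next non-numeric line, such as the following [Section] header):

# Collect every block of numeric rows that follows an "R.Time ... Intensity" header.
datasets = []
current = None
with open("file") as f:
    for raw in f:
        line = raw.strip()
        if line.startswith("R.Time"):
            current = []              # a new data set starts after this header
            datasets.append(current)
        elif current is not None:
            if line and (line[0].isdigit() or line[0] == "-"):
                current.append(line)  # a data row such as "0,00000 -709779"
            else:
                current = None        # any other line ends the data set

for i, rows in enumerate(datasets, 1):
    print("data set", i, "has", len(rows), "rows")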

How to run program with variable command line arguments?

I have the following script:
For $alpha = 1 to 10
For $beta = 1 to 10
Run('"C:\Users\MyProg.exe ' & alpha/10 & ' ' & beta/10 & ' 200 2 0.5'
;some other actions follow
Next
Next
I have checked many times that the string is well-formed, so I have no idea why the script won't run the program. Could you help me, please?
Just replace the ending ' with "') and use proper variable names, including the $, i.e. $alpha instead of just alpha. The syntax check in SciTE should have told you this:
For $alpha = 1 to 10
For $beta = 1 to 10
Run('"C:\Users\MyProg.exe ' & $alpha/10 & ' ' & $beta/10 & ' 200 2 0.5"')
;some other actions follow
Next
Next

How to sum values in a column grouped by values in the other

I have a large file consisting of data in 2 columns:
100 5
100 10
100 10
101 2
101 4
102 10
102 2
I want to sum the values in the 2nd column grouped by matching values in column 1. For this example, the output I'm expecting is:
100 25
101 6
102 12
I'd prefer to do this with a bash script. Can someone explain how I can do this?
Using awk:
awk '{a[$1]+=$2}END{for(i in a){print i, a[i]}}' inputfile
For your input, it'd produce:
100 25
101 6
102 12
As a Perl one-liner:
perl -lane '$s{$F[0]} += $F[1]; END { print qq{$_ $s{$_}} for keys %s }' file.txt
(Single quotes around the program keep the shell from expanding $s and $F before Perl sees them.)
You can use an associative array. The first column is the index and the second becomes what you add to it.
#!/bin/bash
declare -A columns=()
while read -r -a line ; do
columns[${line[0]}]=$((${columns[${line[0]}]} + ${line[1]}))
done < "${1}"
for idx in "${!columns[@]}" ; do
echo "${idx} ${columns[${idx}]}"
done
Using awk while maintaining the order:
awk '!($1 in a){a[$1]=$2; b[++i]=$1;next} {a[$1]+=$2} END{for (k=1; k<=i; k++) print b[k], a[b[k]]}' file
100 25
101 6
102 12
Python is my choice:
d = {}
with open("file.txt") as f:
    for line in f:
        key, value = line.split()
        d[key] = d.get(key, 0) + int(value)  # start each key at 0, then accumulate
for key in d:
    print(key, d[key])
Why would you want a bash script?

AWK - filter file with not equal fields

I've been trying to pull a field from a row in a file, but each row may have 2 or 3 fields more or fewer than the next; the number of fields per row isn't constant.
Here is a snippet:
A orarpp 45286124 1 1 0 20 60 Nov 25 9-16:42:32 01:04:58 11176 117056 0 - oracleXXX (LOCAL=NO)
A orarpp 45351560 1 1 3 20 61 Nov 30 5-03:54:42 02:24:48 4804 110684 0 - ora_w002_XXX
A orarpp 45548236 1 1 22 20 71 Nov 26 8-19:36:28 00:56:18 10628 116508 0 - oracleXXX (LOCAL=NO)
A orarpp 45679190 1 1 0 20 60 Nov 28 6-23:42:20 00:37:59 10232 116112 0 - oracleXXX (LOCAL=NO)
A orarpp 45744808 1 1 0 20 60 10:52:19 23:08:12 00:04:58 11740 117620 0 - oracleXXX (LOCAL=NO)
A root 45810380 1 1 0 -- 39 Nov 25 9-19:54:34 00:00:00 448 448 0 - garbage
In the case of the first line, I'm interested in 9-16:42:32, and the corresponding field in each of the other rows.
I've tried to pull it using ':' as the field separator and then filtering from there; however, what I'm trying to accomplish is to do something if the number before the dash (9 in the example) is greater than one.
cat file.txt | grep oracle | awk -F: '{print substr($1, length($1)-5)}'
This is because the number of fields on either side of the field I actually need can differ from line to line.
It's definitely not the most efficient approach, but I've been trying to do this with an awk one-liner.
Hints or a direction to get me moving again would be appreciated. I'm not opposed to a better way than awk.
Thanks.
Maybe cut is the right tool for this job? For example, with your snippet:
$ cut -c 62-71 file.txt
9-16:42:32
5-03:54:42
8-19:36:28
6-23:42:20
23:08:12
9-19:54:34
The arguments tell cut to snip columns (-c) 62 through 71.
For additional processing, you can pipe it to awk.
You can also accomplish the whole thing in awk by accepting entire lines and then using substr to extract the columns you want. For example, this awk command produces the same output as the cut command above:
awk '{ print substr($0, 62, 10) }' file.txt
Whether you create a pipeline or do the processing entirely in awk is at least in part a matter of personal taste / style.
Would this do?
awk -F: '/oracle/ {print substr($0,62,10)}' file.txt
9-16:42:32
8-19:36:28
6-23:42:20
23:08:12
This searches for oracle and then prints 10 characters starting from position 62.
You can grab those identifiers with one of
grep -o '[[:digit:]]\+-[[:digit:]]\{2\}:[[:digit:]]\{2\}:[[:digit:]]\{2\}'
grep -oP '\d+-\d\d:\d\d:\d\d' # GNU grep
It sounds like you want to do something with the lines, not just find the ids. Please elaborate.
Using GNU awk:
gawk --re-interval '
/oracle/ && \
match($0, /([[:digit:]]+)-([[:digit:]]{2}:){2}[[:digit:]]{2}/, a) && \
a[1]>1 {
# do something with the matching line
print
}
' file