I have a requirement to read a text file, extract some data, and send the extracted data to another system, but I am unable to do it.
Input file:
1BoraBora Island
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
3BR 209078 BoraBora 6798989 99999
1 BR 67854 JAIHIND 789 000Y247 9898983
2 BR CR9 BoraBora 123 QK J12Y64 00010520
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Output should be:
1BoraBora Island
0000000000000000000000
1 BR 67854 JAIHIND 789 000Y247 9898983
2 BR CR9 BoraBora 123 QK J12Y64 00010520
I need to extract only the rows that have "BR" starting at the 3rd character.
Please guide me on how to achieve this in text format only.
Assuming that the input is 'text/plain', you can use a DataWeave script with the substring() function to extract a given position from each line:
%dw 2.0
import * from dw::core::Strings
output text/plain
var lines=payload splitBy "\n" // separate text into an array of lines
---
lines[0] ++"\n" ++ lines[1] ++"\n"
++ (lines[2 to -1] // use the range selector to get the remaining lines
filter (substring($,2,4)=="BR") // filter lines that have "BR" at the right position
reduce ($$++"\n"++$) // concatenate the remaining lines again into a single text file
)
Output:
1BoraBora Island
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
1 BR 67854 JAIHIND 789 000Y247 9898983
2 BR CR9 BoraBora 123 QK J12Y64 00010520
Since you are working with text, you can also use a regex with the scan function to find all lines that match your condition, then joinBy a newline character. Note that this returns only the matching lines, so the two header lines are not kept:
%dw 2.0
output text/plain
---
flatten(payload scan /(?<=^|\n).{2}BR.*/)
joinBy "\n"
(?<=^|\n).{2}BR.* Regex breakdown:
(?<= is a positive lookbehind, which means the rest of the pattern is matched only if it is preceded by the pattern specified inside it
(?<=^|\n) is a positive lookbehind for either the start of the string (^) or a newline (\n)
.{2}BR.* means any two characters, followed by the literal BR, followed by any number of any characters
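To try the "any two characters, then BR" condition outside of DataWeave, here is a minimal sketch with grep. It assumes a POSIX grep is available; the function name filter_br is hypothetical. Because grep already works one line at a time, a plain ^ anchor takes the place of the lookbehind. Like the scan version, it keeps only the matching lines.

```shell
# Hypothetical helper: keep only lines whose 3rd and 4th characters are "BR".
filter_br() {
  # ^     start of the line (replaces the lookbehind, since grep is line based)
  # .{2}  any two characters
  # BR    the literal text BR
  grep -E '^.{2}BR'
}

# Sample lines shortened from the question's input file.
printf '%s\n' '1BoraBora Island' '3BR 209078 BoraBora' '1 BR 67854 JAIHIND' '2 BR CR9 BoraBora' | filter_br
```

Only the two lines starting with "1 BR" and "2 BR" are printed; "3BR ..." is skipped because its BR starts at the 2nd character.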
I am using the following command to append a string after each AMP line. Now I want to append it only after the AMP that comes after #Set2 (that is, after line number 9). Can this command be modified to do that? And if I want to append only after the SET1 AMPs (before line number 9), what would the command be? Thanks.
$ sed -i '/AMP/a Target4' test.txt
$ cat test.txt
#SET1
AMP
Target 1
Target 2
AMP
Target 3
Target 4
Target 5
#Set2
AMP
Target 11
Target 12
Note that there are no blank lines in the text above.
Would you please try the following:
sed -i '
# from the first line that starts with "#Set2" through the last line ($), execute the {block}
/^#Set2/,${
# append the string after "AMP"; the text of the a command runs to the end of its line
/AMP/a Target4
}
' test.txt
If you want to append the string before the #Set2 line, please try:
sed -i '
1,/^#Set2/ { ;# execute the {block} from line 1 through the first line that matches /^#Set2/
/AMP/a Target4
}
' test.txt
The expression address1,address2 works like a flip-flop operator. Once
address1 (a line number, a regular expression, or another condition) is matched,
the range stays true until address2 is matched.
The command or block that follows is executed for every line from address1
through address2, inclusive.
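To see which lines a range address selects before changing anything, here is a minimal sketch that only prints the selected lines. The function name show_range is hypothetical, and the sample input is a shortened version of test.txt.

```shell
# Hypothetical helper: print only the lines selected by the range
# "first line matching ^#Set2 through the last line ($)".
show_range() {
  # -n suppresses the default output; p prints the lines inside the range
  sed -n '/^#Set2/,$p'
}

printf '%s\n' '#SET1' 'AMP' 'Target 1' '#Set2' 'AMP' 'Target 11' | show_range
```

This prints #Set2, AMP and Target 11. The first AMP is outside the range, so a command attached to the same range would not touch it.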
If you want to append after every AMP that comes after #Set2 or after line number 9,
I think it is better to process lines 1 through 8 and lines 9 onward separately.
For example:
sed '
1,8{
/^#Set2/,${
/AMP/a Target4
}
}
9,${
/AMP/a Target4
}' test.txt
How can I remove the line "name['todo']['remove'] = 3456" from a text file?
[test.txt]
name['myname']['test'] = 12
name['todo']['remove'] = 3456
name['todo']['remove']['inspection'] = 34
My current approach is not working as expected. The line is still in my file.
sed -i "name\['todo'\]\['remove'\]" test.txt
The error message is "sed: -e expression #1, char 2: extra characters after command"
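A simple grep -vF works here. With -F the pattern is matched as a fixed string, so the special regex characters need no escaping, and -v prints only the lines that do not match. The trailing space in the pattern keeps the ['inspection'] line from matching:
grep -vF "name['todo']['remove'] " test.txt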
[test.txt]
name['myname']['test'] = 12
name['todo']['remove']['inspection'] = 34
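grep only writes to standard output, so the file itself stays unchanged. Here is a minimal sketch of writing the result back, assuming mktemp is available; the function name remove_fixed is hypothetical.

```shell
# Hypothetical helper: delete every line containing a fixed string from a file.
# $1 = fixed string to look for, $2 = path of the file to rewrite
remove_fixed() {
  tmp=$(mktemp) || return 1
  # -v keeps non-matching lines, -F disables regex interpretation,
  # -- stops option parsing in case the string starts with a dash
  grep -vF -- "$1" "$2" > "$tmp"
  status=$?
  # grep returns 1 when it prints no lines at all; only 2 or more is a real error
  if [ "$status" -le 1 ]; then
    cat "$tmp" > "$2"   # overwrite the contents, keeping the original file's permissions
  fi
  rm -f "$tmp"
  [ "$status" -le 1 ]
}
```

It could then be called as remove_fixed "name['todo']['remove'] " test.txt.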
You can use
sed -i "/name\['todo']\['remove'] =/d" test.txt
Note that the pattern is wrapped with / regex delimiters, and the d means the matched line will get removed.
See an online demo:
s="[test.txt]
name['myname']['test'] = 12
name['todo']['remove'] = 3456
name['todo']['remove']['inspection'] = 34"
sed "/name\['todo']\['remove'] =/d" <<< "$s"
yielding
[test.txt]
name['myname']['test'] = 12
name['todo']['remove']['inspection'] = 34
If you want to make sure you only match a whole line with digits after =, you may use "/^name\['todo']\['remove'] = [0-9]*$/d" command with sed.
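As a quick check of the anchored whole-line pattern, here is a small sketch that reads from standard input instead of editing a file. The function name delete_exact and the third sample line are made up for illustration.

```shell
# Hypothetical helper: delete only lines that are exactly
# name['todo']['remove'] = <digits>
delete_exact() {
  # ^ and $ anchor the pattern to the whole line; [0-9]* allows only digits after "= "
  sed "/^name\['todo']\['remove'] = [0-9]*$/d"
}

printf '%s\n' "name['myname']['test'] = 12" "name['todo']['remove'] = 3456" "name['todo']['remove'] = 34 # keep" | delete_exact
```

The second line is deleted. The third line survives because the text after the digits stops the $ anchor from matching.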
As the title suggests, I need to remove a character between two characters in a string.
E.g. I want to remove the semicolon between two parentheses
<>word (word ; word) word<>
desired output:
<>(word word)<>
Your description and the desired output do not match!
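To remove the semicolon between two parentheses you can use
sed 's/\(([^);]*\);[[:space:]]*\([^)]*)\)/\1\2/'
The pattern captures the text from ( up to the semicolon and the text from the semicolon up to ), then puts the two captures back without the semicolon and the whitespace after it.
example
echo "<>word (word ; word) word<>" | sed 's/\(([^);]*\);[[:space:]]*\([^)]*)\)/\1\2/'
output
<>word (word word) word<>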
I wonder if it is possible to read a .csv file that looks like:
0,0530,0560,0730,....
90,15090,15290,157....
I should get:
0,053 0,056 0,073 0,...
90,150 90,152 90,157 90,...
When using dlmread(path, ''), MATLAB throws an error saying
Mismatch between file and Format character vector.
Trouble reading 'Numeric' field from file (row 1, field number 2) ==> ,053 0,056 0,073 ...
I also tried using "0," as the delimiter, but MATLAB does not allow this.
Thanks,
jonnyx
str= importdata('file.csv',''); %importing the data as a cell array of char
for k=1:length(str) %looping till the last line
str{k}=myfunc(str{k}); %applying the required operation
end
where
function new=myfunc(str)
old = str(1:regexp(str, ',', 'once')); %finding the characters till the first comma
%old is the pattern of the current line
new=strrep(str,old,[' ',old]); %adding a space before that pattern
new=new(2:end); %removing the space at the start
end
and file.csv :
0,0530,0560,073
90,15090,15290,157
Output:
>> str
str=
'0,053 0,056 0,073'
'90,150 90,152 90,157'
You can actually do this using textscan without any loops and using a few basic string manipulation functions:
fid = fopen('no_delim.csv', 'r');
C = textscan(fid, ['%[0123456789' 10 13 ']%[,]%3c'], 'EndOfLine', '');
fclose(fid);
C = strcat(C{:});
output = strtrim(strsplit(sprintf('%s ', C{:}), {'\n' '\r'})).';
And the output using your sample input file:
output =
2×1 cell array
'0,053 0,056 0,073'
'90,150 90,152 90,157'
How it works...
The format string specifies 3 items to read repeatedly from the file:
A string containing any number of characters from 0 through 9, newlines (ASCII code 10), or carriage returns (ASCII code 13).
A comma.
Three individual characters.
Each set of 3 items is concatenated, then all sets are printed to a single string, separated by spaces. The string is split at any newlines or carriage returns to create a cell array of strings, and any spaces on the ends are removed.
If you have access to a GNU/*NIX command line, I would suggest using sed to preprocess your data before feeding it into MATLAB. The command in this case would be: sed 's/,[0-9]\{3\}/& /g'
$ echo "90,15090,15290,157" | sed 's/,[0-9]\{3\}/& /g'
90,150 90,152 90,157
$ echo "0,0530,0560,0730,356" | sed 's/,[0-9]\{3\}/& /g'
0,053 0,056 0,073 0,356
You can also easily change the commas , to decimal points .
$ echo "0,053 0,056 0,073 0,356" | sed 's/,/./g'
0.053 0.056 0.073 0.356
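The two substitutions can also be combined into a single sed call. This is a minimal sketch, assuming every value has exactly three digits after the comma; the function name split_values is hypothetical. It also strips the trailing space that the first substitution leaves at the end of each line.

```shell
# Hypothetical helper: separate the packed values and switch to decimal points.
split_values() {
  # 1) insert a space after every ",ddd" group
  # 2) drop the space left at the end of the line
  # 3) turn the decimal commas into decimal points
  sed -e 's/,[0-9]\{3\}/& /g' -e 's/ $//' -e 's/,/./g'
}

echo "0,0530,0560,0730,356" | split_values
```

For the sample line this prints 0.053 0.056 0.073 0.356, which MATLAB can read as space-delimited numbers.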