I would like to remove the double quotes from the header row only. The data below is in a tab-delimited txt file. Thanks!
"$sedol" "$cusip" "$rbss_id"
"2877365" "22122P101" "53301020"
"B0G72D1" " " "50102020"
The desired answer is:
$sedol $cusip $rbss_id
"2877365" "22122P101" "53301020"
"B0G72D1" " " "50102020"
Try this sed:
sed '1s/"//g' file.txt
The leading 1 is a line address, so the substitution removes " on line 1 only.
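A quick way to check this, as a minimal sketch: the temporary file and its two sample rows below are assumptions made up for illustration, modelled on the data in the question.

```shell
# Build a small tab-delimited sample in a temporary file (hypothetical data).
tmp=$(mktemp)
printf '"$sedol"\t"$cusip"\t"$rbss_id"\n"2877365"\t"22122P101"\t"53301020"\n' > "$tmp"

# Strip double quotes on line 1 only; every other line passes through unchanged.
header_fixed=$(sed '1s/"//g' "$tmp")
printf '%s\n' "$header_fixed"
rm -f "$tmp"
```

Without -i, sed writes the result to standard output and leaves the file untouched, which is a safe way to preview the change first.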
I am using the following command to append a string after AMP, but now I want to append it only after the AMP lines that follow SET2 (that is, from line 9 onward). Can this command be modified to do that? And how would I append only after the SET1 AMP lines (before line 9)? Could someone help me with the command? Thanks.
$ sed -i '/AMP/a Target4' test.txt
$ cat test.txt
#SET1
AMP
Target 1
Target 2
AMP
Target 3
Target 4
Target 5
#Set2
AMP
Target 11
Target 12
Note that there are no blank lines in the text above.
Would you please try the following:
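sed -i '
# From the line that starts with "#Set2" through the last line ($), run the {block}
/^#Set2/,${
# Append "Target4" after each AMP line. The a command takes the rest of its
# line as the text to append, so no comment may follow it on the same line.
/AMP/a Target4
}
' test.txt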
If you want to append the string only after the AMP lines that come before the #Set2 line, please try:
sed -i '
1,/^#Set2/ { ;# execute the {block} from line 1 until a line matches the pattern /^#Set2/
/AMP/a Target4
}
' test.txt
The expression address1,address2 works like a flip-flop operator. Once a line
matches address1 (a line number, regular expression, or other condition),
the range stays active until a line matches address2.
The command or block that follows is therefore executed on every line from
address1 through address2, inclusive.
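To see which lines a given range selects, it can help to print them with -n and p before attaching an editing command. The sketch below assumes an abbreviated version of the test.txt contents from the question.

```shell
# Sample input modelled on the question's test.txt (abbreviated, for illustration).
input=$(printf '#SET1\nAMP\nTarget 1\n#Set2\nAMP\nTarget 11\n')

# Print only the lines in the range that starts at "#Set2" and runs to the last line.
from_set2=$(printf '%s\n' "$input" | sed -n '/^#Set2/,$p')

# Print only the lines from line 1 up to and including the first "#Set2" line.
upto_set2=$(printf '%s\n' "$input" | sed -n '1,/^#Set2/p')

printf '%s\n---\n%s\n' "$from_set2" "$upto_set2"
```

Note that the #Set2 line itself belongs to both ranges, because a range includes the lines matching its start and end addresses.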
If you want to append after the AMP lines that come after #Set2 or that fall on line 9 and beyond,
I think it is better to process lines 1 through 8 and lines 9 onward separately.
For example:
sed '
1,8{
/^#Set2/,${
/AMP/a Target4
}
}
9,${
/AMP/a Target4
}' test.txt
I am new to Spark. I have a huge file which has data like this:
18765967790#18765967790#T#20130629#00#31#2981546 " "18765967790#18765967790#T#20130629#19#18#3240165 " "18765967790#18765967790#T#20130629#18#18#1362836
13478756094#13478756094#T#20130629#31#26#2880701 " "13478756094#13478756094#T#20130629#19#18#1230206 " "13478756094#13478756094#T#20130629#00#00#1631440
40072066693#40072066693#T#20130629#79#18#1270246 " "40072066693#40072066693#T#20130629#79#18#3276502 " "40072066693#40072066693#T#20130629#19#07#3321860
I am trying to replace " " with a newline character so that my output looks like this:
18765967790#18765967790#T#20130629#00#31#2981546
18765967790#18765967790#T#20130629#19#18#3240165
18765967790#18765967790#T#20130629#18#18#1362836
13478756094#13478756094#T#20130629#31#26#2880701
13478756094#13478756094#T#20130629#19#18#1230206
13478756094#13478756094#T#20130629#00#00#1631440
40072066693#40072066693#T#20130629#79#18#1270246
40072066693#40072066693#T#20130629#79#18#3276502
40072066693#40072066693#T#20130629#19#07#3321860
I have tried with-
val fact1 = sc.textFile("s3://abc.txt").map(x=>x.replaceAll("\"","\n"))
But this doesn't seem to be working. Can someone tell me what I am missing?
Edit1: My final output will be a dataframe with a schema imposed after splitting on the delimiter "#".
I am getting the output below:
scala> fact1.take(5).foreach(println)
18765967790#18765967790#T#20130629#00#31#2981546
18765967790#18765967790#T#20130629#19#18#3240165
18765967790#18765967790#T#20130629#18#18#1362836
13478756094#13478756094#T#20130629#31#26#2880701
13478756094#13478756094#T#20130629#19#18#1230206
13478756094#13478756094#T#20130629#00#00#1631440
40072066693#40072066693#T#20130629#79#18#1270246
40072066693#40072066693#T#20130629#79#18#3276502
40072066693#40072066693#T#20130629#19#07#3321860
I am getting extra blank lines, which causes further trouble when I create the dataframe. This might seem simple here, but the file is huge, and the rows containing " " are long. In the question I have shown only two double quotes per separator, but a row can contain more than 40-50 of them.
There is more than one quote between the records, and replacing each quote separately creates multiple line breaks. You need to either remove the additional quotes before the replace or remove the empty lines after it:
.map(x=>x.replaceAll("\"","\n").replaceAll("(?m)^[ \t]*\r?\n", ""))
Reference: Remove all empty lines
You might be missing the implicit Encoders. You can try the code below:
spark.read.text("src/main/resources/doubleQuoteFile.txt").map(row => {
// Replace each `" "` separator (quote, space, quote) with a newline.
// To replace every single quote instead, use replace("\"", "\n").
row.getString(0).replace("\" \"", "\n")
})(org.apache.spark.sql.Encoders.STRING)
I am having difficulties replacing a string containing special characters using sed. My old and new string are shown below
oldStr = "# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity"
newStr = "# opt b3lyp/6-31g geom=connectivity"
My sed command is the following
sed -i 's/\# td\=\(nstates\=20\) cam\-b3lyp\/6\-31g geom\=connectivity/\# opt b3lyp\/6\-31g geom\=connectivity/g' myfile.txt
I don't get any errors, but there is no match. Any ideas on how to fix my pattern?
Thanks
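Try:
sed -i 's|# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity|# opt b3lyp/6-31g geom=connectivity|g' myfile.txt
You can use almost any character as the delimiter after s instead of /. As your expression contains slashes, I used | instead. The characters -, = and # don't have to be escaped (a minus is special only inside character sets [...]). In sed's default basic regular expression syntax, escaped parentheses \( \) delimit a group, while unescaped parentheses are literals, which is why your original pattern did not match.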
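The difference between escaped and unescaped parentheses can be checked on the line from the question. This is a small sketch that pipes the string through sed instead of editing myfile.txt; the variable names are only for illustration.

```shell
old_line='# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity'

# With | as the delimiter the slashes need no escaping, and in basic regex
# syntax the unescaped ( and ) match literal parentheses.
new_line=$(printf '%s\n' "$old_line" |
  sed 's|# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity|# opt b3lyp/6-31g geom=connectivity|g')
printf '%s\n' "$new_line"

# With \( and \) the parentheses form a group, so the pattern looks for
# "td=nstates=20" without parentheses and leaves this line unchanged.
unchanged=$(printf '%s\n' "$old_line" | sed 's/td=\(nstates=20\)/X/')
printf '%s\n' "$unchanged"
```

With sed -E (or -r in GNU sed) the roles are reversed: unescaped parentheses group and escaped ones are literal.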
Hi, I want to replace a string that comes between two symbols by using sed.
Example: -amystring -bxyz
I want to replace mystring with ****
The value after -a can be anything, like -amystring 123 -bxyz, -amystring 123<newline_char>, -a'mystring 123' -bxyz, -a'mystring 123'<newline_char>
I tried the following regex, but it does not work in all cases:
sed -re "s#(-w)([^\s\-]+)#\1**** #g"
Can anybody help me solve this issue?
MyString="YourStringWithoutRegExSpecialCharNotEscaped"
sed "s/-a${MyString} -b/-a**** -b/g"
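This works as long as the string contains no regex-special characters. If it may contain characters such as * . [ ] ^ $ \ or the / delimiter, escape them first with something like
MyString=$(printf '%s\n' "$MyString" | sed 's|[][\.*^$/]|\\&|g')
before using it in sed. (Piping into read -r MyString would set the variable only in a subshell in most shells, so command substitution is used instead. + and ? are left alone because they are not special in basic regular expressions.)
Otherwise, you need to define the boundary pattern more precisely.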
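As a sketch of the escaping step in use, assuming a made-up value that contains a dot and an asterisk (the value and the variable names escaped and masked are hypothetical):

```shell
# Hypothetical value containing regex-special characters.
MyString='my.str*ing'

# Escape basic-regex metacharacters and the / delimiter before interpolating.
escaped=$(printf '%s\n' "$MyString" | sed 's|[][\.*^$/]|\\&|g')

# The escaped value is now matched literally between -a and " -b".
masked=$(printf '%s\n' '-amy.str*ing -bxyz' | sed "s/-a${escaped} -b/-a**** -b/g")
printf '%s\n' "$masked"
```

Without the escaping step, the dot would match any character and the asterisk would act as a repetition operator, so the pattern could match text other than the intended string.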
Suppose I have the following string. I want to replace <b>2</b> with <b>20</b> on lines where the a tag is <a>2</a>.
<start>
<a>1</a><b>1</b>
<a>2</a><b>2</b>
.
.
<a>10</a><b>10</b>
<a>2</a><b>2</b>
</start>
New string should look like this
<start>
<a>1</a><b>1</b>
<a>2</a><b>20</b>
.
.
<a>10</a><b>10</b>
<a>2</a><b>20</b>
</start>
Can I do this using sed?
You can start with this:
sed '/<start>/,/<\/start>/s!\(<a>2</a><b>2\)</b>!\10</b>!' input
and relax the expression as required, for example allow spaces in tag a:
sed '/<start>/,/<\/start>/{/<a>[ ]*2[ ]*<\/a>/s!<b>2<!<b>20<!}' input
This will replace the first occurrence of <b>2</b> with <b>20</b> on every line that contains <a>2</a>:
sed '/<a>2<\/a>/s/<b>2<\/b>/<b>20<\/b>/' input
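A quick check of that last form against the sample from the question, as a sketch: the input is fed through a pipe instead of a file named input, and the closing tag is written as <\/b> so that the slash does not end the pattern.

```shell
# Sample modelled on the question (the "." filler lines are omitted).
input=$(printf '<start>\n<a>1</a><b>1</b>\n<a>2</a><b>2</b>\n<a>10</a><b>10</b>\n<a>2</a><b>2</b>\n</start>\n')

# On lines containing <a>2</a>, change the first <b>2</b> to <b>20</b>.
result=$(printf '%s\n' "$input" | sed '/<a>2<\/a>/s/<b>2<\/b>/<b>20<\/b>/')
printf '%s\n' "$result"
```

The <a>10</a> line is left alone because the address requires the exact text <a>2</a>.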