sed replace special characters followed by newline

I have the following file:
"user_id1": "184767",
"timeStamp": "2017-03-08 19:55:25.000"
},
{
"user_id1": "146364",
"timeStamp": "2017-03-12 23:48:48.000"
},
]
I want to replace every }, that is followed by ] on the next line with }]

Try this:
sed '/},/N;s/,\n *\]/]/' file
When a line containing }, is found, N appends the next line to the pattern space. The substitution then replaces the comma, the newline, and any spaces before ] with a single ].
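To see the effect, here is a minimal sketch that runs the command on input modeled on the question. It assumes GNU sed; the temporary file and the variable names are only for illustration.

```shell
# Sample input modeled on the question (assumes GNU sed).
input=$(mktemp)
cat > "$input" <<'EOF'
{
"user_id1": "184767",
"timeStamp": "2017-03-08 19:55:25.000"
},
{
"user_id1": "146364",
"timeStamp": "2017-03-12 23:48:48.000"
},
]
EOF

# On each line containing "},", N appends the next line to the pattern space.
# The substitution only fires when that appended line is "]" (optionally indented).
result=$(sed '/},/N;s/,\n *\]/]/' "$input")
printf '%s\n' "$result"
rm -f "$input"
```

The first }, is followed by {, so that pair is printed unchanged; only the final }, and ] are joined into }].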

Nested patterns in tmLanguage.json - how to match first keyword occurrence only?

I'm confused about how nested patterns work in tmLanguage.json.
I have the following grammar:
{
"$schema": "https://raw.githubusercontent.com/martinring/tmlanguage/master/tmlanguage.json",
"name": "myexample",
"scopeName": "source.myexample",
"patterns": [
{
"begin": "^(foo):",
"end": "$",
"beginCaptures": {
"1" : {"name": "myfoo"}
},
"patterns": [
{
"match": "^(bar)",
"captures": {
"1" : {"name": "mybar"}
}
}
]
}
]
}
I'd expect it to match this text:
foo:bar:bar
and capture the word foo and the first bar, but not the second bar.
However, it only matches foo and nothing else.
If I remove ^ from ^(bar), it matches both occurrences of bar, but if I replace ^ with :, it matches only the second bar.
So how do I make sure it captures only the first occurrence of bar in the line, and only if it follows foo: or is at the beginning of the line?
foo:bar:bar -> should match
bar:bar -> should match
blah:bar:bar -> should not match
The foo: prefix is optional; it may or may not be present, and I'd like to avoid duplicating the pattern with something like ^(?!foo:)bar.
As I understand from the TextMate grammar manual, nested patterns are applied to the text matched between the begin and end of their parent pattern, so I don't understand why ^ doesn't work, or why : captures only the second occurrence.

Sed to add quotes around json text following a specific json key

I have the malformed JSON file below. I want to quote the value of email, i.e. "sampleemail#sampledoman.co.org". How do I go about it? I tried the following, but it doesn't work:
sed -e 's/"email":\[\(.*\)\]/"email":["\1"]/g' sample.json
where sample.json looks like this:
{
"supplementaryData": [
{
"xmlResponse": {
"id": "100001",
"externalid": "200001",
"info": {
"from": "C1022929291",
"phone": "000963586",
"emailadresses": {
"email": [sampleemail#sampledoman.co.org
]
}
},
"status": "OK",
"channel": "mobile"
}
}
]
}
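Your code does not work for the following reasons:
A [ that is not escaped is not treated as a literal; it opens a bracket expression, so it must be written as \[.
Without -E, sed uses BRE, where capturing groups must be written as \( and \). With -E you get ERE, where plain ( and ) capture.
The line does not end with ]; the closing bracket is on the next line, and sed matches one line at a time.
Your pattern has no space after "email":, but the input does, so there is no match and hence no replacement.
For your code to work, you can use: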
$ sed -E 's/"email": \[(.*)/"email": ["\1"/' sample.json
or
$ sed -E '/\<email\>/s/[a-z#.]+$/"&"/' sample.json
{
"supplementaryData": [
{
"xmlResponse": {
"id": "100001",
"externalid": "200001",
"info": {
"from": "C1022929291",
"phone": "000963586",
"emailadresses": {
"email": ["sampleemail#sampledoman.co.org"
]
}
},
"status": "OK",
"channel": "mobile"
}
}
]
}
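As a quick check, the sketch below runs both commands on a cut-down copy of the affected lines and confirms that they give the same result. It assumes GNU sed (for \< and \>); the temporary file and variable names are hypothetical.

```shell
# Minimal stand-in for the malformed lines of sample.json (assumes GNU sed).
sample=$(mktemp)
cat > "$sample" <<'EOF'
"emailadresses": {
"email": [sampleemail#sampledoman.co.org
]
}
EOF

# Variant 1: capture everything after '"email": [' and wrap it in quotes.
quoted_capture=$(sed -E 's/"email": \[(.*)/"email": ["\1"/' "$sample")

# Variant 2: on lines containing the whole word email, quote the trailing
# run of characters from the set [a-z#.].
quoted_word=$(sed -E '/\<email\>/s/[a-z#.]+$/"&"/' "$sample")

printf '%s\n' "$quoted_capture"
rm -f "$sample"
```

Neither variant touches the "emailadresses" line: the first needs the literal text "email": [, and the second requires email as a whole word. Note that variant 2 only handles addresses made of lowercase letters, # and dots, so it would need a wider character set for other input.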
With your shown samples, please try the following awk code, written and tested in GNU awk. Setting RS to the empty string enables paragraph mode, so a file without blank lines is read as a single record. The match function is then called with the regex (.*)(\n[[:space:]]+"emailadresses": {\n[[:space:]]+"email": \[)([^\n]+)(.*), whose 4 capturing groups are saved into the array arr (the array argument of match is a GNU awk extension). Finally, the groups are printed with " added before and after arr[3], which holds the email address.
awk -v RS= '
match($0,/(.*)(\n[[:space:]]+"emailadresses": {\n[[:space:]]+"email": \[)([^\n]+)(.*)/,arr){
print arr[1] arr[2] "\"" arr[3] "\"" arr[4]
}
' Input_file

matching begin and pattern without end with tmLanguage

I'm trying to define a language using tmLanguage for syntax highlighting in vscode. I have the following rule.
"sexp": {
"name": "entity.sexp",
"patterns": [
{"include": "#list_of_sexp"},
{"include": "#atom"}
]
}
Is it possible to have a comment rule that matches sexp prefixed with a ";"? I'm not sure what to put in "end".
"comment": {
"name": "comment.sexp",
"begin": ";",
"end": ??,
"patterns": [{ "include": "#sexp" }]
}
I ended up solving this with a positive lookahead regex in "end".

How can I highlight or change the color of the pipe character in VSCode?

I thought maybe the Highlight extension might work (https://marketplace.visualstudio.com/items?itemName=fabiospampinato.vscode-highlight), but I can't figure out how to create the regular expression.
"highlight.regexes": {
"(\\|)": [
{
"color":"red"
}
]
}
Try this:
"highlight.regexes": {
"([^|]*)(\\|)": {
"regexFlags": "gi",
// "filterLanguageRegex": "markdown",
"decorations": [
{},
{
"color": "red",
},
]
}
}
From the readme:
All characters of the matched string must be wrapped in a capturing group.
Your (\\|) would only work if the pipe was the first character on the line.
Also, a reload is sometimes required to get the highlighting to work properly.

Sed for parsing

I have this file:
"data_personnel": [
{
"id": "1",
"name": "Mathieu"
}
],
"struct_hospital": [
{
"id": "9",
"geo": "chamb",
"nb": ""
},
{
"id": "",
"geo": "jsj",
"nb": "SMITH"
},
{
"id": "10",
"geo": "",
"nb": "12"
},
{
"id": "2",
"geo": "marqui",
"nb": "20"
},
{
"id": "4",
"geo": "oliwo",
"nb": "1"
},
{
"id": "1",
"geo": "par",
"nb": "5"
}
]
How can I use sed to get all the values of geo in struct_hospital? (chamb, jsj, , marqui, oliwo, etc.)
The file can be in any form: with tabs, everything on one line, etc.
As pointed out by Sundeep, it makes more sense to use a proper JSON parser.
But if you are looking for a one-time quick and dirty solution, then this might do (GNU sed, because of \s and \?):
sed -n '/^"struct_hospital"/,/^]/s/^.*"geo"\s*:\s*"\([^"]*\)"\s*,\?.*$/\1/p' input.txt
Sample output (the empty third line is the empty geo value):
chamb
jsj

marqui
oliwo
par
Explanation:
/^"struct_hospital"/,/^]/ - only consider lines between struct_hospital and the closing bracket.
s/.../\1/p - search and replace; print only the first capturing subpattern of every matching line.
^.*"geo"\s*:\s*"\([^"]*\)"\s*,\?.*$ - matches the geo lines; captures the value following the colon.
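The sketch below shows the address range doing its job on a shortened input. It assumes GNU sed. The "geo": "ignored" entry under data_personnel is a hypothetical addition, included only to show that lines outside the range are skipped.

```shell
# Shortened, hypothetical version of the input (assumes GNU sed).
hospital=$(mktemp)
cat > "$hospital" <<'EOF'
"data_personnel": [
{
"id": "1",
"geo": "ignored"
}
],
"struct_hospital": [
{
"id": "9",
"geo": "chamb",
"nb": ""
},
{
"id": "10",
"geo": "",
"nb": "12"
},
{
"id": "2",
"geo": "marqui",
"nb": "20"
}
]
EOF

# Only lines from "struct_hospital" up to the next line starting with ] are
# considered; within that range, each geo line is reduced to its value.
geo_values=$(sed -n '/^"struct_hospital"/,/^]/s/^.*"geo"\s*:\s*"\([^"]*\)"\s*,\?.*$/\1/p' "$hospital")
printf '%s\n' "$geo_values"
rm -f "$hospital"
```

An empty geo value still matches, because [^"]* accepts zero characters, so it shows up as an empty line in the output.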
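If the input is all on a single line, you can use another sed invocation as a preprocessor to insert a line break before every ] and , (GNU sed):
sed 's/]\|,/\n&/g'
This makes the full command below. The range start is left unanchored here, because after preprocessing the line containing "struct_hospital" begins with a comma:
sed 's/]\|,/\n&/g' input.txt | sed -n '/"struct_hospital"/,/^]/s/^.*"geo"\s*:\s*"\([^"]*\)"\s*,\?.*$/\1/p'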
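A minimal sketch of the single-line case, assuming GNU sed and a hypothetical one-line input. One adjustment here is mine rather than the answer's: the range start is written as /"struct_hospital"/ without ^, because the preprocessor puts the preceding comma at the start of that line, so an anchored pattern would not match it.

```shell
# Hypothetical one-line version of the input (assumes GNU sed for \n, \|, \s and \?).
one_line='"data_personnel": [{"id": "1","name": "Mathieu"}],"struct_hospital": [{"id": "9","geo": "chamb","nb": ""},{"id": "","geo": "jsj","nb": "SMITH"}]'

# Step 1: start a new line before every "]" and ",".
# Step 2: extract the geo values inside the struct_hospital range.
geo_one_line=$(printf '%s\n' "$one_line" |
  sed 's/]\|,/\n&/g' |
  sed -n '/"struct_hospital"/,/^]/s/^.*"geo"\s*:\s*"\([^"]*\)"\s*,\?.*$/\1/p')
printf '%s\n' "$geo_one_line"
```

The preprocessor splits on every comma, including any inside string values, so this is only suitable for quick one-off extraction.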