Why does sed delete the wrong range of lines

I'm trying to write a script that deletes a certain range of lines using sed. I first printed the lines:
sed -n '482,486p' original.json
{
"vlans": "ALL",
"hostname": "hostname",
"interface": "interface"
},
After confirming that those were the right lines, I ran the following:
sed '482,486d' original.json > new.json
When I ran diff, I got the following results:
diff original.json new.json
484,488d483
< "hostname": "hostname",
< "interface": "interface"
< },
< {
< "vlans": "ALL",
My question is: why does diff report lines 484-488 as deleted when I told sed to delete lines 482-486, and how do I fix it? Any help would be appreciated.

You haven't given us enough information to reproduce the error, but I'll go out on a limb.
My guess is that lines 487 and 488 of the original file are
{
"vlans": "ALL",
So if we put the original and the new side by side, it looks like this:
481 ...         481 ...
482 {           487 {
483 vlans       488 vlans
484 host        489 ...
485 int
486 }
487 {
488 vlans
489 ...
So lines 482-486 were excised from original, but diff sees everything matching up through line 483, then five lines 484-488 in original not appearing in new, and then all the rest matching.
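To see why both descriptions are equally valid, here is a minimal Python sketch (not part of the original answer). It assumes a made-up nine-line miniature of the file in which lines 2-6 stand in for lines 482-486. The helper name delete_range and the sample content of line 9 are hypothetical.

```python
def delete_range(lines, first, last):
    """Return a copy of lines without the 1-indexed, inclusive range first..last."""
    return lines[:first - 1] + lines[last:]


# Hypothetical miniature of original.json: two consecutive records that
# begin with the same two lines, as guessed in the answer above.
original = [
    "...",                        # 1
    "{",                          # 2  (stands in for line 482)
    '"vlans": "ALL",',            # 3
    '"hostname": "hostname",',    # 4
    '"interface": "interface"',   # 5
    "},",                         # 6  (stands in for line 486)
    "{",                          # 7
    '"vlans": "ALL",',            # 8
    '"hostname": "other",',       # 9  (invented content)
]

as_requested = delete_range(original, 2, 6)  # what sed '2,6d' removes
as_reported = delete_range(original, 4, 8)   # the range diff would describe

# Both deletions leave exactly the same file, so diff cannot tell them apart.
print(as_requested == as_reported)  # True
```

If this is what happened, nothing needs fixing: sed removed the lines it was asked to remove, and diff simply chose a different but equivalent way to describe the change.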

Related

How to comment on a specific line number on a PR on github

I am trying to write a small script that can comment on GitHub PRs using eslint output.
The problem is that eslint gives me absolute line numbers for each error, but the GitHub API wants the line number relative to the diff.
From the GitHub API docs: https://developer.github.com/v3/pulls/comments/#create-a-comment
To comment on a specific line in a file, you will need to first
determine the position in the diff. GitHub offers a
application/vnd.github.v3.diff media type which you can use in a
preceding request to view the pull request's diff. The diff needs to
be interpreted to translate from the line in the file to a position in
the diff. The position value is the number of lines down from the
first "@@" hunk header in the file you would like to comment on.
The line just below the "@@" line is position 1, the next line is
position 2, and so on. The position in the file's diff continues to
increase through lines of whitespace and additional hunks until a new
file is reached.
So if I want to add a comment on new line number 5 in the above image, I would need to pass 12 to the API.
My question is: how can I easily map the new line numbers that eslint gives in its error messages to the relative line numbers required by the GitHub API?
What I have tried so far
I am using parse-diff to convert the diff provided by the GitHub API into a JSON object:
[{
  "chunks": [{
    "content": "@@ -,OLD_TOTAL_LINES +NEW_STARTING_LINE_NUMBER,NEW_TOTAL_LINES @@",
    "changes": [
      {
        "type": STRING("normal"|"add"|"del"),
        "normal": BOOLEAN,
        "add": BOOLEAN,
        "del": BOOLEAN,
        "ln1": OLD_LINE_NUMBER,
        "ln2": NEW_LINE_NUMBER,
        "content": STRING,
        "oldStart": NUMBER,
        "oldLines": NUMBER,
        "newStart": NUMBER,
        "newLines": NUMBER
      }
    ]
  }]
}]
I am thinking of the following algorithm:
make an array of new line numbers running from NEW_STARTING_LINE_NUMBER to NEW_STARTING_LINE_NUMBER+NEW_TOTAL_LINES for each file
subtract newStart from each number to get another array, relativeLineNumbers
traverse the array and, for each deleted line (type==='del'), increment the corresponding remaining relativeLineNumbers
for each further hunk (a line containing @@), decrement the corresponding remaining relativeLineNumbers

I have found a solution. I didn't post it at first because it involves simple looping and nothing special, but I'm answering now to help others.
I have opened a pull request that recreates the situation shown in the question:
https://github.com/harryi3t/5134/pull/7/files
Using the GitHub API, one can get the diff data:
diff --git a/test.js b/test.js
index 2aa9a08..066fc99 100644
--- a/test.js
+++ b/test.js
@@ -2,14 +2,7 @@
var hello = require('./hello.js');
-var names = [
- 'harry',
- 'barry',
- 'garry',
- 'harry',
- 'barry',
- 'marry',
-];
+var names = ['harry', 'barry', 'garry', 'harry', 'barry', 'marry'];
var names2 = [
'harry',
@@ -23,9 +16,7 @@ var names2 = [
// after this line new chunk will be created
var names3 = [
'harry',
- 'barry',
- 'garry',
'harry',
'barry',
- 'marry',
+ 'marry', 'garry',
];
Now just pass this data to the parse-diff module and do the computation.
var parseDiff = require('parse-diff');

// `data` is the raw diff text returned by the GitHub API
var parsedFiles = parseDiff(data);
parsedFiles.forEach(function (file) {
  var relativeLine = 0;
  file.chunks.forEach(function (chunk, index) {
    // The relative line number also increments once for each chunk header
    // except the first one (see rel-line 16 in the image)
    if (index !== 0) {
      relativeLine++;
    }
    chunk.changes.forEach(function (change) {
      relativeLine++;
      console.log(
        change.type,
        change.ln1 ? change.ln1 : '-',
        change.ln2 ? change.ln2 : '-',
        change.ln ? change.ln : '-',
        relativeLine
      );
    });
  });
});
This would print
type    (ln1) old line  (ln2) new line  (ln) added/deleted line  relative line
normal  2               2               -                        1
normal  3               3               -                        2
normal  4               4               -                        3
del     -               -               5                        4
del     -               -               6                        5
del     -               -               7                        6
del     -               -               8                        7
del     -               -               9                        8
del     -               -               10                       9
del     -               -               11                       10
del     -               -               12                       11
add     -               -               5                        12
normal  13              6               -                        13
normal  14              7               -                        14
normal  15              8               -                        15
normal  23              16              -                        17
normal  24              17              -                        18
normal  25              18              -                        19
del     -               -               26                       20
del     -               -               27                       21
normal  28              19              -                        22
normal  29              20              -                        23
del     -               -               30                       24
add     -               -               21                       25
normal  31              22              -                        26
Now you can use the relative line number to post a comment using the GitHub API.
For my purpose I only needed the relative line numbers for the newly added lines, but using the table above one can get them for deleted lines as well.
Here's the link to the linting project in which I used this: https://github.com/harryi3t/lint-github-pr
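For readers who would rather work from the raw diff text than from a parser's output, the same counting rule can be sketched in Python. This is only an illustration under stated assumptions: the function name new_line_positions and the sample diff are made up, the input must be the diff of a single file, and "\ No newline at end of file" markers are not handled.

```python
import re

# Captures the new-file starting line from a hunk header such as "@@ -2,4 +2,3 @@"
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def new_line_positions(file_diff):
    """Map new-file line numbers to diff positions for ONE file's unified diff."""
    positions = {}
    position = 0      # the line just below the first "@@" header is position 1
    new_line = None   # stays None until the first hunk header is seen
    for raw in file_diff.splitlines():
        header = HUNK_HEADER.match(raw)
        if header:
            if new_line is not None:
                position += 1   # every later hunk header takes up a position
            new_line = int(header.group(1))
            continue
        if new_line is None:
            continue            # "diff --git", "index", "---", "+++" preamble
        position += 1
        if raw.startswith("-"):
            continue            # deleted lines have no new-file line number
        positions[new_line] = position   # added or unchanged line
        new_line += 1
    return positions


# Made-up two-hunk diff, only loosely modelled on the one above.
sample = "\n".join([
    "diff --git a/test.js b/test.js",
    "--- a/test.js",
    "+++ b/test.js",
    "@@ -2,4 +2,3 @@",
    " var hello = require('./hello.js');",
    "-var names = [",
    "-];",
    "+var names = [];",
    " var names2 = [",
    "@@ -20,3 +19,4 @@ var names2 = [",
    " var names3 = [",
    "+  'garry',",
    "   'harry',",
    " ];",
])

positions = new_line_positions(sample)
print(positions[3])   # 4: the added line in the first hunk
print(positions[20])  # 8: the added line in the second hunk
```

A linter reports new-file line numbers, so a lookup such as positions.get(line) returns either the position to send to the API or None when the line is not part of the diff.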

Using sed to separate pattern from streams

I have a file that has log entries on each line like:
vert.x-worker-thread-5:606-5 [28281755664384/companyOfflineCaEnricherRSS] [oiq.contentdigestion.PipelineProcessorLink] - CertainClassifierPipelineProcessorInternal COMPLETE [75ms]: http://www.cadc.uscourts.gov/recordings/recordings.nsf/uscadcoralarguments.xml
vert.x-worker-thread-6:524-7 [28281755664384/companyWebAndEventWorkerMultiPass][oiq.contentdigestion.PipelineProcessorLink] - CertainClassifierPipelineProcessorInternal COMPLETE [54ms]: http://a1851.g.akamaitech.net/f/1851/2996/24h/cache.xerox.com/downloads/usa/en/c/CEO_Commitment.pdf
Could anyone help me with a sed command that turns those lines into the following format?
companyOfflineCaEnricherRSS : CertainClassifierPipelineProcessorInternal 75
companyWebAndEventWorkerMultiPass : CertainClassifierPipelineProcessorInternal 54

This sed command produces the expected output:
sed -r 's/[^\/]+.([^]]+).*- ([^ ]+)[^[]+.([^a-z]+).*/\1 : \2 \3/' FileName
-r, --regexp-extended
use extended regular expressions in the script.
Output:
companyOfflineCaEnricherRSS : CertainClassifierPipelineProcessorInternal 75
companyWebAndEventWorkerMultiPass : CertainClassifierPipelineProcessorInternal 54
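If sed is not a hard requirement, the same extraction can be sketched in Python with named groups, which some people find easier to read. This is an alternative to the sed command above, not a translation of it. The group names queue, stage and millis and the function name summarize are hypothetical labels chosen for this example.

```python
import re

LOG_LINE = re.compile(
    r"/(?P<queue>[^\]]+)\]"        # text between the first "/" and the next "]"
    r".*? - (?P<stage>\S+)"        # first word after " - "
    r".*?\[(?P<millis>\d+)ms\]"    # digits inside "[NNms]"
)


def summarize(line):
    """Return 'queue : stage millis' for a log line, or None if it does not match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return "{queue} : {stage} {millis}".format(**match.groupdict())


line = (
    "vert.x-worker-thread-5:606-5 [28281755664384/companyOfflineCaEnricherRSS] "
    "[oiq.contentdigestion.PipelineProcessorLink] - "
    "CertainClassifierPipelineProcessorInternal COMPLETE [75ms]: "
    "http://www.cadc.uscourts.gov/recordings/recordings.nsf/uscadcoralarguments.xml"
)
print(summarize(line))
# companyOfflineCaEnricherRSS : CertainClassifierPipelineProcessorInternal 75
```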

Displaying human-readable text in perl Log::Report stack traces

A library that I'm using, XML::Compile::Translate::Reader, calls Log::Report's error method:
or error __x"data for element or block starting with `{tag}' missing at {path}"
, tag => $label, path => $path, _class => 'misfit';
As I've got Log::Report set to debug mode, it returns a stack trace for an error.
[11 07 2014 22:17:39] [2804] error: data for element or block starting with `MSISDN' missing at {http://www.sigvalue.com/acc}TA
at /usr/local/share/perl5/XML/Compile/Translate/Reader.pm line 476
Log::Report::error("Log::Report::Message=HASH(0x2871cf8)") at /usr/local/share/perl5/XML/Compile/Translate/Reader.pm line 476
<snip>
XML::Compile::SOAP::Daemon::LWPutil::lwp_run_request("HTTP::Request=HASH(0x2882858)", "CODE(0x231ba38)", "HTTP::Daemon::ClientConn::SSL=GLOB(0x231b9c0)", undef) at /usr/local/share/perl5/XML/Compile/SOAP/Daemon/LWPutil.pm line 95
Any::Daemon::run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "child_task", "CODE(0x2548128)", "max_childs", 36, "background", 1) at /usr/local/share/perl5/XML/Compile/SOAP/Daemon/AnyDaemon.pm line 75
XML::Compile::SOAP::Daemon::AnyDaemon::_run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "HASH(0x18dda00)") at /usr/local/share/perl5/XML/Compile/SOAP/Daemon.pm line 99
(eval)("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "HASH(0x18dda00)") at /usr/local/share/perl5/XML/Compile/SOAP/Daemon.pm line 94
XML::Compile::SOAP::Daemon::run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "name", "rizserver.pl", "background", 1, "max_childs", 36, "socket", [7 more]) at ./rizserver.pl line 95
There is a lot of useful data in those HASH, SCALAR, GLOB, and other elements that I want logged, because we are having trouble logging the original request when it doesn't match.
I've explored using
Some leads that I don't know how to use are using Log::Dispatch, or some sort of Filter on Log::Report; but in the end, all I really want is to apply Data::Dumper to those elements.

AWK - filter file with not equal fields

I've been trying to pull one field from each row of a file, but the rows don't all have the same number of fields: a row may have two or three more or fewer than its neighbours.
Here is a snippet:
A orarpp 45286124 1 1 0 20 60 Nov 25 9-16:42:32 01:04:58 11176 117056 0 - oracleXXX (LOCAL=NO)
A orarpp 45351560 1 1 3 20 61 Nov 30 5-03:54:42 02:24:48 4804 110684 0 - ora_w002_XXX
A orarpp 45548236 1 1 22 20 71 Nov 26 8-19:36:28 00:56:18 10628 116508 0 - oracleXXX (LOCAL=NO)
A orarpp 45679190 1 1 0 20 60 Nov 28 6-23:42:20 00:37:59 10232 116112 0 - oracleXXX (LOCAL=NO)
A orarpp 45744808 1 1 0 20 60 10:52:19 23:08:12 00:04:58 11740 117620 0 - oracleXXX (LOCAL=NO)
A root 45810380 1 1 0 -- 39 Nov 25 9-19:54:34 00:00:00 448 448 0 - garbage
In the case of the first line, I'm interested in 9-16:42:32 and the similar fields for each row.
I've tried to pull it out by using ':' as the field separator and filtering from there. What I ultimately want is to do something when the number before the dash (9 in the example) is greater than one.
cat file.txt | grep oracle | awk -F: '{print substr($1, length($1)-5)}'
I split on ':' because the number of fields on either side of the field I need can differ from line to line.
Definitely not the most efficient but I've been trying to do this with an awk one liner.
Hints or a direction would be appreciated to get me moving again. I am not opposed to doing in a better way than awk.
Thanks.

Maybe cut is the right tool for this job? For example, with your snippet:
$ cut -c 62-71 file.txt
9-16:42:32
5-03:54:42
8-19:36:28
6-23:42:20
23:08:12
9-19:54:34
The arguments tell cut to keep only character positions (-c) 62 through 71 of each line.
For additional processing, you can pipe it to awk.
You can also accomplish the whole thing in awk by accepting entire lines and then using substr to extract the columns you want. For example, this awk command produces the same output as the cut command above:
awk '{ print substr($0, 62, 10) }' file.txt
Whether you create a pipeline or do the processing entirely in awk is at least in part a matter of personal taste / style.

Would this do?
awk -F: '/oracle/ {print substr($0,62,10)}' file.txt
9-16:42:32
8-19:36:28
6-23:42:20
23:08:12
This searches for lines containing oracle and then prints 10 characters starting at position 62.

You can grab those identifiers with one of
grep -o '[[:digit:]]\+-[[:digit:]]\{2\}:[[:digit:]]\{2\}:[[:digit:]]\{2\}'
grep -oP '\d+-\d\d:\d\d:\d\d' # GNU grep
It sounds like you want to do something with the lines, not just find the ids. Please elaborate.
Using GNU awk:
gawk --re-interval '
/oracle/ && \
match($0, /([[:digit:]]+)-([[:digit:]]{2}:){2}[[:digit:]]{2}/, a) && \
a[1]>1 {
# do something with the matching line
print
}
' file
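Since the question is open to tools other than awk, here is a minimal Python sketch of the same filter. It assumes the elapsed-time field always has the form days-hh:mm:ss when a day count is present. The function name long_running and its min_days parameter are hypothetical.

```python
import re

# days-hh:mm:ss, for example "9-16:42:32"; only the day count is captured
ELAPSED = re.compile(r"\b(\d+)-\d{2}:\d{2}:\d{2}\b")


def long_running(lines, min_days=1):
    """Yield (days, line) for oracle lines whose day count is greater than min_days."""
    for line in lines:
        if "oracle" not in line:
            continue
        match = ELAPSED.search(line)
        if match and int(match.group(1)) > min_days:
            yield int(match.group(1)), line


sample = [
    "A orarpp 45286124 1 1 0 20 60 Nov 25 9-16:42:32 01:04:58 11176 117056 0 - oracleXXX (LOCAL=NO)",
    "A orarpp 45351560 1 1 3 20 61 Nov 30 5-03:54:42 02:24:48 4804 110684 0 - ora_w002_XXX",
    "A orarpp 45744808 1 1 0 20 60 10:52:19 23:08:12 00:04:58 11740 117620 0 - oracleXXX (LOCAL=NO)",
    "A root 45810380 1 1 0 -- 39 Nov 25 9-19:54:34 00:00:00 448 448 0 - garbage",
]
for days, line in long_running(sample):
    print(days, line.split()[2])  # 9 45286124
```

Matching on the shape of the field rather than on its position sidesteps the varying field counts, and lines with no day count (such as the 23:08:12 row) are skipped.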

How can I fill out blank spaces in a text file with the word from the line above?

I have a large report file (about 20MB) that looks like this:
586 700006207 8,622.09 896
9,882.82 896
777 68607099 900.00 896
587 800006207 7,059.22 896
959.02 896
697.87 896
7 280667985 .00 899
The 1st and 2nd columns are left blank when their values are the same as on the line above. I need help with a grep/sed/powershell one-liner to fill in the empty spaces, so that it looks like this:
586 700006207 8,622.09 896
586 700006207 9,882.82 896
777 68607099 900.00 896
587 800006207 7,059.22 896
587 800006207 959.02 896
587 800006207 697.87 896
7 280667985 .00 899
Thanks.

This might work for you:
sed ':a;$!N;/^\(\( [0-9]\+ *[0-9]\+\).*\n\)\( \{15\}\)/{s//\1\2/;ta};P;D' file
From the data you have provided the line always begins with a space. If this is not the case then:
sed ':a;$!N;/^\(\([0-9]\+ *[0-9]\+\).*\n\)\( \{14\}\)/{s//\1\2/;ta};P;D' file

Assuming there are no blank lines and the inter-column delimiters are spaces, the following works (tested in an Ubuntu bash shell, using GNU sed):
sed -r "/^ {15}/{G; s/^ {15}(.*)\n(.{15}).*/\2\1/};h" "report"

Assuming you need to keep the spacing and the first 2 columns are the first 15 chars:
awk '
NF==2 {print fill substr($0, 16); next}
{print; fill = substr($0, 1, 15)}
'
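The same fill-down idea can be sketched in Python, under the same assumption as the awk answer that the first two columns occupy the first 15 characters of each line. The function name fill_down and the sample rows are made up, because the exact spacing of the report is not visible in the question.

```python
def fill_down(lines, width=15):
    """Copy the first `width` characters of the last full row into rows where they are blank."""
    prefix = " " * width
    for line in lines:
        head, rest = line[:width], line[width:]
        if head.strip():
            prefix = head          # a full row: remember its leading columns
            yield line
        elif rest.strip():
            yield prefix + rest    # blank leading columns: reuse the remembered ones
        else:
            yield line             # empty line: pass through unchanged


# Hypothetical rows laid out so that columns 1 and 2 fill exactly 15 characters.
report = [
    " 586 700006207 " + "8,622.09 896",
    " " * 15 + "9,882.82 896",
    " 777  68607099 " + "  900.00 896",
]
for row in fill_down(report):
    print(row)
```

Because fill_down is a generator that handles one line at a time, it can be fed an open file object directly, so a 20MB report does not have to be loaded into memory at once.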
