sublime text / ms word delete misc line breaks in code - ms-word

I have a CSV file with random line breaks throughout. (They are probably load errors from when the file was created, where the loader somehow put a carriage return into a field.)
How do I remove all carriage returns / line breaks where the last character on the line is not "?
I have Word and Sublime Text available as text editors.
I have tried find and replace with ^p preceded by a letter, but for some reason that doesn't work for some of the lines.
Example
"3203","Shelving Units
",".033"
instead of
"3203","Shelving Units",".033"
and
"3206","Broom
","1.00"
instead of
"3206","Broom","1.00"

Menu > Find > Replace... or Ctrl+H
Select "Regular Expression" (probably a .* icon in the bottom left, depending on your theme).
Use \n to select newlines (LF) or \r\n (CRLF).

As @GerardRoche said, you can use search and replace in Sublime Text. Open it with Ctrl+H and press Alt+R to enable regex mode. (You may want to back up your file before making such changes.)
Search for (?<=[^"\n])\n+, leave the replacement empty, and press Replace All or Ctrl+Alt+Enter.
The regex means: match at least one (+) newline (\n) that is preceded by something other than a quotation mark or a newline ((?<=[^"\n])).
You don't need to worry about carriage returns, because ST only uses them when reading and writing the file, not in the editor.
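For anyone who would rather script this than do it in an editor, here is a minimal Python sketch of the same idea. It reuses the regex from the answer above. The function name join_broken_rows and the sample string are only illustrative, and it assumes every complete row ends with a closing quote, as in the question.

```python
import re

# One or more newlines preceded by a character that is neither a
# double quote nor a newline (same pattern as in the answer above).
STRAY_NEWLINES = re.compile(r'(?<=[^"\n])\n+')


def join_broken_rows(text: str) -> str:
    """Remove line breaks that do not directly follow a closing quote."""
    # Normalise CRLF to LF first so the pattern only has to handle \n.
    normalised = text.replace("\r\n", "\n")
    return STRAY_NEWLINES.sub("", normalised)


# Hypothetical sample built from the rows shown in the question.
sample = '"3203","Shelving Units\n",".033"\n"3206","Broom\n","1.00"\n'
print(join_broken_rows(sample))
```

A field that legitimately ends with a quote right before a stray break would not be repaired by this, so it is worth checking the result against the expected column count.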

Related

Specify anything in brackets - vscode

I have a file that contains texts, and inside each text there is a number in parentheses.
Is there a way in VS Code to select all the numbers in parentheses and delete or replace them easily? I can't do it manually because the scripts exceed 5000.
You can use the search function in VS Code.
Click the magnifying glass in the sidebar.
Toggle the search details by clicking the three dots in the search pane.
Enter your file name in "files to include".
Enable regular expressions.
Use a regular expression that fits your case, e.g. \(\d+\).
Enter the replacement text in the second input (or leave it empty to remove the matches) and click Replace All.
You could also use sed to do that.
sed -E 's/\([0-9]+\)/(something else)/g' file.txt > newfile.txt
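The same substitution can also be sketched in Python, in case sed is not available. The pattern is the \(\d+\) suggested above. The function name strip_paren_numbers and the sample text are made up for illustration.

```python
import re

# A literal "(", one or more digits, and a literal ")".
PAREN_NUMBER = re.compile(r"\(\d+\)")


def strip_paren_numbers(text: str, replacement: str = "") -> str:
    """Replace every parenthesised number with `replacement`."""
    return PAREN_NUMBER.sub(replacement, text)


# Hypothetical sample; parentheses that do not hold a number are kept.
sample = "first text (12), second text (345), untouched (abc)"
print(strip_paren_numbers(sample))
print(strip_paren_numbers(sample, "(something else)"))
```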

Use BBedit to replace CR with LF

I have a text file where I would like to change all the Carriage Return to Line Feed. I am working on a Mac, and it seems like BBedit should be able to easily do this. However, in the search function it does not appear to differentiate between CR (\r) and LF (\n). Searching for either character gets the same hits, and a search for "\r" to replace with "\n" does not work.
Is there some other way to represent CR and LF, so that BBedit can differentiate between them?
A possibly simpler way to do this is to make sure you enable "Line Break Types" in BBedit's Preferences -> Appearance.
This will add a Line Types popup menu at the bottom of the window.
Not only does it show you what the current format is, but it makes it trivial to switch between them.
(I also turn on "Text Encoding", in case I need to switch between UTF-8, UTF-16, and others.)
For a single file:
Text -> Apply Text Transform
Select Change Line Endings
Click the Configure button and choose your desired line endings
For multiple files:
File -> New -> Text Factory
Select Change Line Endings
Click the Options button and choose your desired line endings
Save the Text Factory and apply it to the files you want changed through the Choose and Options buttons
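If you ever need the same conversion outside BBEdit, a rough Python sketch could look like this. It works on raw bytes so that no line-ending translation happens behind your back. The function name to_lf is an assumption for illustration, not anything BBEdit provides.

```python
def to_lf(data: bytes) -> bytes:
    """Convert CR and CRLF line endings to LF."""
    # Handle CRLF first so that a CR/LF pair does not turn into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Hypothetical sample mixing classic Mac (CR), Windows (CRLF) and Unix (LF).
sample = b"one\rtwo\r\nthree\n"
print(to_lf(sample))
```

To apply it to a file, read it with open(path, "rb") and write the result back with open(path, "wb").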

How to remove the leading and trailing space from each line of a file using shell script?

I have a file with some leading and trailing spaces. Here is the file
val1=22
val2=23
val4=34
How can I remove both the leading and trailing white space from it? The white space could be a 'tab' too. Is there a single command to do it?
You have a very wide variety of options to achieve the desired result. One of them is simply to use Notepad++, one of the best text editors around:
Open your file in Notepad++,
Press Ctrl+H to open the "Replace" dialog box,
Insert the (^[\s\t]+)|([\s\t]+$) in the "Find what" text box,
Leave the "Replace with" text box blank,
Select the "Regular expression" in the "Search Mode" group at the bottom of the dialog box.
Press the "Replace All" button and you're done.
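The expression in the "Find what" text box is a regular expression that matches leading (^[\s\t]+) or trailing ([\s\t]+$) whitespace. Note that \s already includes tabs, and it also matches line-break characters, so use [ \t] instead if blank lines must be preserved.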
Here is the command I found to remove both the leading and trailing spaces and tabs from every line of a file. It works for me.
sed -i 's/^[ \t]*//;s/[ \t]*$//' "filename"
Where
s/ : Substitute command; replaces the text matched by the pattern on each addressed line
^[ \t]* : Search pattern (^ – start of the line; [ \t]* matches zero or more spaces or tabs)
// : Empty replacement, so the matched text is deleted
Credit goes to this link
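For comparison, here is a small Python sketch of the same trimming. It assumes LF line endings and only treats spaces and tabs as white space, as the sed command does. The name trim_lines and the sample are illustrative.

```python
import re

# Spaces/tabs at the start of a line, or spaces/tabs at the end of a line.
# MULTILINE makes ^ and $ apply to every line, not just the whole string.
EDGE_BLANKS = re.compile(r"^[ \t]+|[ \t]+$", re.MULTILINE)


def trim_lines(text: str) -> str:
    """Strip leading and trailing spaces and tabs from each line."""
    return EDGE_BLANKS.sub("", text)


# Hypothetical sample with mixed spaces and tabs around the values.
sample = "  val1=22\t\n\tval2=23  \nval4=34\n"
print(trim_lines(sample))
```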

Copy Lines including carriage return linefeed in notepad++

Using Notepad++, I want to select individual, non-contiguous lines, copy them, and paste them with the CR/LF at the end of each. Preferably, I would hold Ctrl, click the line numbers I want, then press Ctrl+C or right-click and select Copy; however, doing this selects all the text (which is frustrating and doesn't make much sense). Furthermore, selecting just a single line also partially includes the line below it, so that if I press Ctrl+Shift+Up (or Down) the line below it also moves up or down.
In summary, I want to copy non-contiguous lines and paste them with their respective EOL characters.
Use Ctrl+F2 to mark the desired lines.
Menu Search > Bookmarks > Copy bookmarked lines will copy these lines to the clipboard.
As an alternative to step 1, you can use the Mark tab of the Find dialog (Ctrl+F) with the Mark Line option checked.

Notepad++ newline in regex

Suppose you have this file:
x
a
b
c
x
x
a
b
c
x
x
and you want to find the sequence abc (and select the whole 3 lines) with Notepad++ . How to express the newline in regex, please?
Notepad++ can do that comfortably; you don't even need regexes.
In the Find dialog box, look in the bottom left and switch your search mode to Extended, which allows \n etc.
Odds are you're working on a file in Windows format, so you'll be looking for \r\n (carriage return, line feed):
a\r\nb\r\nc
Will find the pattern over three lines
Update 18th June 2012
With the new Notepad++ v6, you can indeed search for newlines with regexes. So you can just use
a\r\nb\r\nc
even with regular expressions to accomplish what you want. Note that \r\n is the Windows encoding of line breaks. In Unix files, it's just \n.
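To see the difference between the two line-break encodings outside Notepad++, here is a small Python sketch. It makes the \r optional so the same pattern finds the a/b/c run in both Windows and Unix text. The function name find_abc is hypothetical, and Python's re module is used here only as a stand-in for the editor's regex engine.

```python
import re

# "a", "b", "c" on three consecutive lines; \r? tolerates CRLF as well as LF.
# MULTILINE lets ^ and $ anchor at every line, not only the whole string.
ABC_RUN = re.compile(r"^a\r?\nb\r?\nc\r?$", re.MULTILINE)


def find_abc(text: str) -> list[int]:
    """Return the start offset of every a/b/c run in `text`."""
    return [match.start() for match in ABC_RUN.finditer(text)]


# The sample file from the question, once with LF and once with CRLF.
unix_text = "x\na\nb\nc\nx\nx\na\nb\nc\nx\nx\n"
windows_text = unix_text.replace("\n", "\r\n")
print(find_abc(unix_text), find_abc(windows_text))
```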
Unfortunately, you can't do that in Notepad++ when using regex search. Notepad++ is based on the Scintilla editor component, which doesn't handle newlines in regex.
You can use extended search for newline searching, but I don't think that will help you search for 3 lines.
More info here.
Update: Robb and StartClass0830 were right about extended search. It does work, but not when using regular expressions search.
^a\x0D\x0Ab\x0D\x0Ac
This will work. \x0D is carriage return (ASCII 13) and \x0A is line feed (ASCII 10). The assumption is that each line in your file ends with ASCII 13 followed by ASCII 10.
I found a workaround for this.
Simply, in Extended mode, replace all \r\n with a string that doesn't occur anywhere else in the document, e.g. ,,,newline,,, (watch out for special regexp characters like $, &, and *).
Then switch to Regexp mode and do your replacements (a newline is now ,,,newline,,,).
Finally, switch back to Extended mode and replace all ,,,newline,,, with \r\n.
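The three steps of this workaround can be sketched in Python to make the idea concrete. This is only an illustration of the placeholder trick; Python's own regexes handle \r\n directly, so you would not need it there. The placeholder string comes from the answer, and the function name replace_across_lines is an assumption.

```python
import re

PLACEHOLDER = ",,,newline,,,"


def replace_across_lines(text: str, pattern: str, replacement: str) -> str:
    """Apply a regex replacement to text whose CRLFs are temporarily hidden."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the document")
    # Step 1: swap every CRLF for the placeholder.
    flat = text.replace("\r\n", PLACEHOLDER)
    # Step 2: run the regex; a line break now appears as the placeholder.
    flat = re.sub(pattern, replacement, flat)
    # Step 3: restore the real line breaks.
    return flat.replace(PLACEHOLDER, "\r\n")


# Join the a/b/c lines into one; re.escape guards any special characters.
nl = re.escape(PLACEHOLDER)
pattern = "a" + nl + "b" + nl + "c"
print(repr(replace_across_lines("x\r\na\r\nb\r\nc\r\nx\r\n", pattern, "abc")))
```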
For Notepad++ 6 and beyond, do this as a regular expression:
Select Search Mode > Regular expression (w/o . matches newline)
And in the Find what Textbox : a[\r\n]+b[\r\n]+c[\r\n]+
Or, if you know whether the file's line breaks are \r\n (Windows) or \n (Unix), you may find it easier to use Extended Mode:
Select Search Mode > Extended (\n, \r, \t, \0, \x...)
And in the Find what Textbox for Windows: a\r\nb\r\nc\r\n
Or in the Find what Textbox for Unix: a\nb\nc\n
It wasn't clear whether the OP intended to select the trailing line break (after the 'c') as well, as would be necessary to remove the lines.
To leave the trailing line break unselected, as is appropriate when replacing with a non-empty string, simply remove the final line break from the search expression.
Note that if the match is on the last line of the file and that line has no trailing line break, the match fails.
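The effect of including or omitting the trailing line break can be checked with a short Python sketch. The two pattern names are made up for illustration, and Python's re module stands in for the editor's search.

```python
import re

# Same three-line search, with and without the trailing line break.
WITH_TRAILING = re.compile(r"a\r?\nb\r?\nc\r?\n")
WITHOUT_TRAILING = re.compile(r"a\r?\nb\r?\nc")

# Removing the lines: the trailing break must be part of the match,
# otherwise an empty line is left behind.
removed = WITH_TRAILING.sub("", "x\na\nb\nc\nx\n")
left_blank = WITHOUT_TRAILING.sub("", "x\na\nb\nc\nx\n")

# File whose last line is "c" with no final line break: only the
# pattern without the trailing break still matches.
last_line = "x\na\nb\nc"
found_with = WITH_TRAILING.search(last_line) is not None
found_without = WITHOUT_TRAILING.search(last_line) is not None

print(repr(removed), repr(left_blank), found_with, found_without)
```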
a\r\nb\r\nc works for me, but not ^a\x0D\x0Ab\x0D\x0Ac
Hmm, too bad that newline is not working with regular expressions. Now I have to go back to Textpad again. :(
Select the search mode
Extended (\n, \r, \t, \0, \x...)
in which \n is a newline, and so on.
This is Manuel
Find: "(^a.*$)\r\n(b.*)\r\n^(c.*)$" - picks up 3 whole lines, storing only the data
Replace with: "\1\2\3" - puts down (replays) the data
Works fine in regex mode with Notepad++ v7.9.5.
Placeholders: ^ (start of line) and $ (end of line) can be inside or outside the () capture groups as shown, though they are clearly not necessary in the given example. Note that "[^x]" is different: there "^" means "NOT".
The advantage of storing and replaying is that it allows much more complicated pattern matches without having to retype what you want to end up with, and you can even change the replay order: "\2\3\1" for "bca".
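The store-and-replay idea carries over to other regex engines. Here is a minimal Python sketch of it, assuming LF line endings to keep the pattern short. The function name replay and the sample lines are hypothetical.

```python
import re

# Three capture groups, one per line; the line breaks sit outside the
# groups, so only the line contents are stored.
THREE_LINES = re.compile(r"^(a.*)\n(b.*)\n(c.*)$", re.MULTILINE)


def replay(text: str, order: str) -> str:
    """Rewrite each a/b/c run using a replacement such as r'\\1\\2\\3'."""
    return THREE_LINES.sub(order, text)


# Hypothetical sample where each line carries some extra data.
sample = "x\na1\nb2\nc3\nx\n"
print(replay(sample, r"\1\2\3"))  # lines joined in the original order
print(replay(sample, r"\2\3\1"))  # same data replayed as b, c, a
```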
I have run across this little issue when the document uses Windows CR/LF line endings.
If you check the box for ". matches newline", you need .. to match CR/LF, so if you have
<blah><blah>",
"<more><blah>
you need to use ",.." to match a string, a comma, CR/LF, and another string.
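The two-dots behaviour can be reproduced in Python, where the re.DOTALL flag plays roughly the role of the ". matches newline" checkbox (that correspondence is my assumption, not something Notepad++ documents). A minimal sketch:

```python
import re

# Sample with a Windows (CR/LF) line break between the two strings.
text = '<blah><blah>",\r\n"<more><blah>'

# With DOTALL, each "." may match "\r" or "\n", so ".." spans the CR/LF pair.
dotall_match = re.search(r'",.."', text, flags=re.DOTALL)

# Without DOTALL, "." does not match "\n", so the same pattern finds nothing.
plain_match = re.search(r'",.."', text)

print(dotall_match is not None, plain_match is None)
```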
In Notepad++ you can also try highlighting the desired part of the text and then pressing Ctrl+J.
That joins the selected lines into one, removing the line endings between them.