I need to create and export an Excel file in my iPhone app. Unfortunately, Excel won't read it if the line endings are LF (the Unix default when I write the file) instead of CRLF (the Windows standard). Is there any way to write a file using CRLF line breaks?
I can tell this is the issue because if I open the file in TextWrangler after outputting it and change the line breaks to CRLF, Excel opens it fine.
Thanks,
Toby
If you're using printf or fprintf in C you typically terminate lines like this:
printf( "this is a line of text.\n" );
The \n outputs a linefeed. You can output a carriage return with \r, so to get a CRLF, you just write:
printf( "this is a CRLF terminated line.\r\n" );
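The same idea can be sketched outside C. The following is a minimal Python illustration, not the asker's iPhone code: it assumes the export is a comma-separated text file, and the helper name `write_crlf` and the file name `export.csv` are made up for the example. It relies on the `newline` argument of `open`, which translates every "\n" written into the given terminator.

```python
import os
import tempfile

def write_crlf(path, rows):
    """Write each row as one comma-separated line terminated by CRLF."""
    # newline="\r\n" makes Python translate every "\n" written into "\r\n".
    with open(path, "w", newline="\r\n", encoding="utf-8") as fh:
        for row in rows:
            fh.write(",".join(row) + "\n")

# Hypothetical sample data, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "export.csv")
write_crlf(path, [["name", "qty"], ["widget", "3"]])

# Read the bytes back to confirm what actually landed on disk.
with open(path, "rb") as fh:
    raw = fh.read()
```

Reading the file back in binary mode is the reliable way to check the terminators, since text mode may translate them again.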
Related
I have a file like this
{CRLF
sum: 21.46,CRLF
first: 99.10,CRLF
last: 57.71 CRLF
}CRLF
{CRLF
sum: 159.32,CRLF
first: 456.71,CRLF
last: 89.27 CRLF
}CRLF
...
P.S. CRLF here stands for the Windows line break; it is not literal text in the file.
I want to add a comma at the end of every line containing "last:".
I used the following command
sed '/last/ s/$/,/' old.txt >new.txt
but I got a weird result
{CRLF
sum: 21.46,CRLF
first: 99.10,CRLF
last: 57.71 CR
,CRLF
}CRLF
{CRLF
sum: 159.32,CRLF
first: 456.71,CRLF
last: 89.27 CR
,CRLF
}CRLF
...
The comma isn't appended at the end of the line. Instead, it ends up on a new line. Any ideas would be greatly appreciated. Thanks.
Your data file has DOS-style (Windows-style) CRLF line endings. sed inserts the comma between the CR and the LF (because it doesn't know about CRLF line endings and CR is just another character before the end of line).
Edit your file to remove the DOS line endings: see How to convert DOS/Windows newline to Unix newline in bash script for information on how to do that. Or, as Beta pointed out, you can do that at the same time that you add the comma:
sed -e 's/.$//' -e '/last:/s/$/,/'
This is mildly dangerous if applied to a file with Unix line endings; it will remove the last character on those lines too. It might be better to embed the CR in the script:
sed -e $'s/\r$//' -e '/last:/s/$/,/'
which uses bash's ANSI-C Quoting mechanism to embed a CR into the command string.
You're not completely consistent in your question. You say 'lines containing "last:"' but your code handles lines containing "last" (missing out the colon). Your choice.
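For comparison, here is a rough Python sketch of the same fix that works on raw bytes. Unlike the sed commands above, it keeps the CR and puts the comma in front of it, so the file stays in DOS format. The function name `add_comma` and the default marker `last:` are assumptions chosen for the illustration.

```python
def add_comma(data, marker=b"last:"):
    """Append a comma to every line containing `marker`, before any CR."""
    out = []
    for line in data.split(b"\n"):
        # A trailing CR belongs to the line ending, not to the content.
        has_cr = line.endswith(b"\r")
        body = line[:-1] if has_cr else line
        if marker in body:
            body += b","
        out.append(body + (b"\r" if has_cr else b""))
    return b"\n".join(out)

dos_sample = b"{\r\nfirst: 99.10,\r\nlast: 57.71 \r\n}\r\n"
unix_sample = b"{\nlast: 57.71 \n}\n"
dos_result = add_comma(dos_sample)
unix_result = add_comma(unix_sample)
```

Because the CR is only stripped when it is actually present, the same function is safe on files with Unix line endings.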
I have a directory on a Linux host with several property files which I want to edit by replacing hardcoded values with placeholder tags. My goal was to have a perl script which reads a delimited file that contains entries for each of the property files listing the hardcoded value, the placeholder value and the name of the file to edit.
For example, in file.prop I have these values set
<connection targetHostUrl="99.99.99.99"
targetHostPort="9999"
And I want to replace the values with tags as shown below
<connection targetHostUrl="TARGETHOST"
targetHostPort="PORT"
There will be several entries similar to this so I have to match on the unique combination of IP and PORT so I need a multiline match.
To do this I wrote the following script to take the input of the delimited filename, which is delimited with ||. I go get that file from the config directory and read in the values to get the hardcoded value, tag, and filename to edit. Then I read in that property file, do the substitution and then write it out again.
#!/usr/bin/perl
my $config = $ARGV[0];
chomp $config;
my $filename = '/config/' . $config;
my ($hard,$tagg,$prop);
open(DATAFILE, $filename) or die "Could not open DATAFILE $filename.";
while(<DATAFILE>)
{
    chomp $_;
    ($hard,$tagg,$prop) = split('\|\|', $_);
    $*=1;
    open(INPUT,"</properties/$prop") or die "Could not open INPUT $prop.";
    @input_array=<INPUT>;
    close(INPUT);
    $input_scalar=join("",@input_array);
    $input_scalar =~ s/$hard/$tagg/;
    open(OUTPUT,">/properties/$prop") or die "Could not open OUTPUT $prop.";
    print(OUTPUT $input_scalar);
    close(OUTPUT);
}
close DATAFILE;
Inside the config file I have the following entry
<connection targetHostUrl="99.99.999.99"(.|\n)*?targetHostPort="9999"||<connection targetHostUrl="TARGETHOST1"\n targetHostPort="PORT"||file.prop
My output is as shown below. It puts what I hoped to be a newline as a literal \n
<connection targetHostUrl="TARGETHOST"\n targetHostPort="PORT"
I can't find a way to get the \n taken as a newline. At first I thought, no problem, I'll just do a 2nd substitution like
perl -i -pe 's/\\n/\n/o' $prop
and although this works, for some reason it puts ^M characters at the end of every line except the one I did the replacement on. I don't want to do a 3rd replace to strip them out.
I've searched and found other ways of doing the multiline search/replace but they all interpret the \n literally.
Any suggestions?
> My output is as shown below. It puts what I hoped to be a newline as a literal \n
Why would it insert a newline when the string doesn't contain one?
> I can't find a way to get the \n taken as a newline.
There isn't any. If you want to substitute a newline, you need to provide a newline.
If you used a proper CSV parser like Text::CSV_XS, you could put a newline in your data file.
Otherwise, you'll have to write some code to handle the escape sequences you want your code to handle.
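As an illustration of what "handling the escape sequences" could look like, here is a small Python sketch (Python rather than the asker's Perl). The set of supported escapes and the helper names `expand_escapes` and `apply_entry` are assumptions made for the example; the sample pattern also escapes the dots in the IP address, which the original config entry does not.

```python
import re

# Escape sequences this sketch chooses to understand (an assumption).
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\"}

def expand_escapes(text):
    """Turn backslash escapes such as \\n into the characters they name."""
    def repl(match):
        # Unknown escapes are left exactly as written.
        return _ESCAPES.get(match.group(1), match.group(0))
    return re.sub(r"\\(.)", repl, text)

def apply_entry(content, hard, tag):
    """Replace the first match of the regex `hard` with the expanded `tag`."""
    replacement = expand_escapes(tag)
    # A function replacement keeps re.sub from reinterpreting backslashes.
    return re.sub(hard, lambda m: replacement, content, count=1)

content = '<connection targetHostUrl="99.99.99.99"\n   targetHostPort="9999"'
hard = r'<connection targetHostUrl="99\.99\.99\.99"(.|\n)*?targetHostPort="9999"'
tag = r'<connection targetHostUrl="TARGETHOST"\n targetHostPort="PORT"'
result = apply_entry(content, hard, tag)
```

The point is that the two-character sequence backslash-n in the data file only becomes a newline because the code explicitly converts it before substituting.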
> for some reason it puts ^M characters at the end of every line except the one I did the replacement on.
Quite the opposite. It removes it from the one line you did the replacement on.
^M is how some programs represent a carriage return. You have a file with CR LF line endings. You could use dos2unix to convert it, or you could leave it as is because XML doesn't care.
I want to read an input file line by line, but this file uses an unknown end-of-line character.
Vim does not recognize it either: it displays the character as ^A and continues straight on with the characters of the next line. Perl behaves the same way: it tries to load all the lines at once, because it does not treat this strange character as an end of line.
How can I set this character as the end of line for Perl? I don't want to use any special module for it (because of our strict system); I just want to define the end-of-line character (maybe as a hex code).
Another option is to convert the file into a new one with a proper end-of-line character (by replacing them). Can I do that in some easy way (something like sed on the input file)? Everything needs to be done in Perl, though.
Is it possible?
Now, my reading part looks like:
open (IN, $in_file);
$event=<IN>; # read one line
The ^A character you mention is the "start of heading" character. You can set the special Perl variable $/ to this character. Although, if you want your code to be readable and editable by the guy who comes after you (and uses another editor), I would do something like this:
use English;
local $INPUT_RECORD_SEPARATOR = "\cA"; # 'start of heading' character
while (<>)
{
    chomp; # remove the unwanted 'start of heading' character
    print $_ . "\n";
}
From Perldoc:
$INPUT_RECORD_SEPARATOR
$/
The input record separator, newline by default. This influences Perl's idea of what a "line" is.
More on special character escaping on PerlMonks.
Oh and if you want, you can enter the "start of heading" character in VI, in insert mode, by pressing CTRL+V, then CTRL+A.
edit: added local per Drt's suggestion
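For readers who want to see the record-separator idea spelled out, here is a minimal Python sketch of what setting $/ to "\cA" amounts to: read the input in chunks and split on byte 0x01, so the whole file never has to sit in memory. The generator name `read_records` and the chunk size are assumptions for the illustration.

```python
import io

def read_records(stream, sep=b"\x01", chunk_size=65536):
    """Yield records from a binary stream, split on `sep` (default ^A)."""
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Everything before the last separator is complete; keep the tail.
        *complete, buf = buf.split(sep)
        yield from complete
    if buf:
        # Final record that was not followed by a separator.
        yield buf

# A tiny chunk size forces records to span several reads.
sample = io.BytesIO(b"event one\x01event two\x01event three")
records = list(read_records(sample, chunk_size=4))
```

Each yielded record already has the separator removed, which corresponds to the chomp in the Perl loop.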
I have a script that reads a large file line by line. The record separator ($/) that I would like to use is \n. The only problem is that the data on each line can contain CRLF sequences (\r\n), which the program should not treat as the end of a line.
For example, here is a sample data file (with the newlines and CRLFs written out):
line1contents\n
line2contents\n
line3\r\ncontents\n
line4contents\n
If I set $/ = "\n", then it splits the third line into two lines. Ideally, I could just set $/ to a regex that matches \n and not \r\n, but I don't think that's possible. Another possibility is to read in the whole file, then use the split function to split on said regex. The only problem is that the file is too large to load into memory.
Any suggestions?
For this particular task, it sounds pretty straightforward to check your line ending and append the next line as necessary:
$/ = "\n";
...
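while(<$input>) {
    # A chunk ending in "\r\n" is not a complete line yet, so append the next one.
    # The eof check stops the loop if the file itself ends with "\r\n".
    while( substr($_,-2) eq "\r\n" && !eof($input) ) {
        $_ .= <$input>;
    }
    ...
}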
This is the same logic used to support line continuation in a number of different programming contexts.
You are right that you can't set $/ to a regular expression.
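The line-continuation logic can also be sketched in Python, which may make the control flow easier to follow. This is only an illustration under the assumptions of the question (records end in a bare "\n", and an embedded "\r\n" is data); the generator name `logical_lines` is made up. Iterating over a binary stream yields chunks ending in "\n", so only one logical line is held in memory at a time.

```python
import io

def logical_lines(stream):
    """Yield lines ending in a bare LF; an embedded CRLF does not end a line."""
    pending = b""
    for chunk in stream:  # binary streams iterate in pieces ending with b"\n"
        pending += chunk
        if pending.endswith(b"\r\n"):
            # CRLF is part of the data, so keep accumulating.
            continue
        yield pending
        pending = b""
    if pending:
        # Leftover data when the input ends with CRLF or without a newline.
        yield pending

sample = io.BytesIO(
    b"line1contents\nline2contents\nline3\r\ncontents\nline4contents\n"
)
lines = list(logical_lines(sample))
```

The final `if pending` branch covers the case where the input ends in the middle of a continued line.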
dos2unix would put a UNIX newline character in for the "\r\n" and so wouldn't really solve the problem. I would use a regex that replaces all instances of "\r\n" with a space or tab character and save the results to a different file (since you don't want to split the line at those points). Then I would run your script on the new file.
Try using dos2unix on the file first, and then read in as normal.
I am using Perl to read UTF-16LE files in Windows 7.
If I read in an ASCII file with following code then each "\r\n" in file will be converted into a "\n" in memory:
open CUR_FILE, "<", $asciiFile;
If I read in a UTF-16LE (Windows code page 1200) file with the following code, "\r\n" is left unchanged:
open CUR_FILE, "<:encoding(UTF-16LE)", $utf16leFile;
This inconsistency causes problems when I try to match lines with line breaks using a regexp.
Update:
For each line of a UTF-16LE file:
$line =~ /(.*)$/
Then the string matched in $1 will include a "\r" at the end...
What version of Perl are you using? UTF-16 and CRLF handling did not mix properly before 5.8.9 (Unicode changes in 5.8.9). I'm not sure about 5.10.0, but it works in 5.10.1 and 5.8.9. You might need to use "<:encoding(UTF-16LE):crlf" when opening the file.
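The order of the two steps (decode first, then translate line endings) is the heart of the matter. As a rough analogy in Python, not a statement about Perl's layer internals, the text-mode `newline` argument is applied to the decoded text, so CRLF translation works for UTF-16LE input. The file name and sample text below are invented for the illustration.

```python
import os
import tempfile

# Create a small UTF-16LE file with CRLF line endings (hypothetical sample).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as fh:
    fh.write("first\r\nsecond\r\n".encode("utf-16-le"))

# newline=None (the default) decodes first, then maps "\r\n" to "\n".
with open(path, encoding="utf-16-le", newline=None) as fh:
    translated = fh.read()

# newline="" decodes but leaves the line endings untouched.
with open(path, encoding="utf-16-le", newline="") as fh:
    untranslated = fh.read()
```

The second read shows the situation described in the question: the text is decoded correctly, but every line still carries its "\r".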
That is Windows text mode performing that magic for you. If you specify a UTF encoding, it is the equivalent of opening the file in binary mode rather than text mode.
Newer versions of Perl (5.10 and later) have \R, a generic newline that matches \r\n as a unit as well as \n and other line terminators, and \v, which matches any single vertical whitespace character (\n, \r, form feed, vertical tab, and the Unicode line and paragraph separators).
Does your regex logic allow using \R instead of \n?