I have written a server program using select, and connected a client to it using telnet. The connection completed successfully.
If I send input that is 6 characters long including the newline, the server displays the length as 7 characters. How is that possible?
Server side:
The client is sending \r\n instead of \n, which would account for the extra character. You can translate it back to just a newline with a simple regex:
# $data holds the input line from the client.
$data =~ s/\r\n/\n/g; # Search for \r\n, replace it with \n
Client side:
Assuming you're using Net::Telnet, you're probably sending 2 characters for the newline, \r and \n, as specified by the Telnet RFC.
The documentation I linked to says this,
In the input stream, each sequence of
carriage return and line feed (i.e.
"\015\012" or CR LF) is converted to
"\n". In the output stream, each
occurrence of "\n" is converted to a
sequence of CR LF. See binmode() to
change the behavior. TCP protocols
typically use the ASCII sequence,
carriage return and line feed to
designate a newline.
And the default is not binary mode (binmode), meaning that all instances of \n in your client data will be replaced by \r\n before it gets sent to the server.
The default Binmode is 0, which means
do newline translation.
You can stop the module from replacing your newlines by calling binmode on your file descriptor, or in the case of Net::Telnet, call binmode on your object and pass 1.
# Do not translate newlines.
$obj->binmode(1);
Or on the server you can search for \r\n on the input data and replace it with \n.
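The server-side normalization can be sketched as follows. This is a minimal illustration, not the asker's server code: the string "hello\r\n" is an assumed stand-in for a line typed into telnet, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical line as a telnet client would send it: 5 letters plus CR LF.
my $data = "hello\r\n";
my $raw_length = length $data;       # 7: counts both \r and \n

# Normalize CR LF to a bare newline on a copy of the input.
(my $clean = $data) =~ s/\r\n/\n/g;
my $clean_length = length $clean;    # 6

print "raw=$raw_length clean=$clean_length\n";
```

Working on a copy keeps the raw bytes available in case they are needed for logging.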
I am a newcomer to Perl, however not to programming in general. I have been looking for any hints how to escape from open() in Perl, but have not been lucky, and that is why I am asking here.
I have a:
$mailprog = '/usr/lib/sendmail';
open(MAIL,"|$mailprog -t");
read(STDIN, $buffer, 18);
print MAIL "To: xxx\@xxx.xxx\n";
print MAIL "From: xxx\@xxx.xxx\n";
print MAIL "Subject: xxx\n";
print MAIL $buffer;
close (MAIL);
Is there any way I can shape the input read into $buffer so as to escape from sendmail? The buffer input length is arbitrary. The input is totally under my control. Thanks a lot for any ideas!
man sendmail says:
By default, Postfix sendmail(1) reads a message from standard input
until EOF or until it reads a line with only a . character, and
arranges for delivery. Postfix sendmail(1) relies on the postdrop(1)
command to create a queue file in the maildrop directory.
So you would want your input to contain the sequence "\n.\n" somewhere.
Only one sequence is special to sendmail once it starts reading the body: A line containing a single . signals the end of the input. (EOF does the same.)
That means that if your input contains a line that contains nothing but ., you need to escape it. The default transfer encoding doesn't provide a means of escape, so you will need to specify a Content-Transfer-Encoding that avoids the issue (e.g. base64) or allows you to escape the period (e.g. quoted-printable), and encode the content accordingly.
This brings us to the restrictions that the content transfer encoding you choose imposes.
The default content transfer encoding, 7bit, requires lines of no more than 998 octets terminated by CRLF. Those lines may only contain octets in [1,127], and octets 10 and 13 may only appear as part of the line terminator.
If the content transfer encoding you chose isn't suitable to encode your input, you will need to choose a different one.
You really should be using something like Email::Sender instead of working at such a low level.
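If you do stay at this low level, the base64 approach could look roughly like the sketch below. It only builds the message text and does not call sendmail. The helper name build_message, the addresses, the subject and the charset are placeholders assumed for illustration; MIME::Base64 ships with Perl.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: returns the full text that would be piped to
# "sendmail -t". Addresses, subject and charset are placeholders.
sub build_message {
    my ($body) = @_;
    return join '',
        "To: recipient\@example.com\n",
        "From: sender\@example.com\n",
        "Subject: test\n",
        "MIME-Version: 1.0\n",
        "Content-Type: text/plain; charset=UTF-8\n",
        "Content-Transfer-Encoding: base64\n",
        "\n",                     # a blank line ends the headers
        encode_base64($body);     # base64 output never contains '.'
}

# The body contains a line holding only '.', which would end the input
# early if it were sent unencoded.
my $message = build_message("first line\n.\nthis line would be cut off\n");

# Count the lines that consist of a single '.'.
my $has_lone_dot = grep { $_ eq '.' } split /\n/, $message;
print $has_lone_dot ? "unsafe\n" : "safe\n";
```

Because the base64 alphabet has no period, no encoded line can be mistaken for the end-of-input marker.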
Get-Content $user| Foreach-Object{
$user = $_.Split('=')
New-Variable -Name $user[0] -Value $user[1]}
I'm trying to write a script that splits a text file into an array, with one element per line.
What should I change the "=" sign to?
It depends on the exact encoding of the textfile, but [Environment]::NewLine usually does the trick.
"This is `r`na string.".Split([Environment]::NewLine)
Output:
This is
a string.
The problem with the String.Split method is that it splits on each character in the given string. Hence, if the text file has CRLF line separators, you will get empty elements.
Better solution, using the -Split operator.
"This is `r`na string." -Split "`r`n" #[Environment]::NewLine, if you prefer
You can use the String.Split method to split on CRLF and not end up with the empty elements by using the Split(String[], StringSplitOptions) method overload.
There are a couple different ways you can use this method to do it.
Option 1
$input.Split([string[]]"`r`n", [StringSplitOptions]::None)
This will split on the combined CRLF (Carriage Return and Line Feed) string represented by `r`n. The [StringSplitOptions]::None option will allow the Split method to return empty elements in the array, but there should not be any if all the lines end with a CRLF.
Option 2
$input.Split([Environment]::NewLine, [StringSplitOptions]::RemoveEmptyEntries)
This will split on either a Carriage Return or a Line Feed. So the array will end up with empty elements interspersed with the actual strings. The [StringSplitOptions]::RemoveEmptyEntries option instructs the Split method to not include empty elements.
The answers given so far consider only Windows as the running environment. If your script needs to run in a variety of environments (Linux, Mac and Windows), consider using the following snippet:
$lines = $input.Split(
    @("`r`n", "`r", "`n"),
[StringSplitOptions]::None)
There is a simple and unusual way to do this.
$lines = [string[]]$input
This will split $input like:
$input.Split(@("`r`n", "`n"))
This is undocumented at least in docs for Conversions.
Beware, this will not remove empty entries.
And it doesn't work for Carriage Return (\r) line ending at least on Windows.
Experimented in Powershell 7.2.
This article also explains a lot about how it works with carriage return and line ends. https://virot.eu/powershell-and-newlines/
Having had some issues with additional empty lines and the like, I found that this helped me understand the problem. Excerpt from virot.eu:
So what makes up a new line. Here comes the tricky part, it depends.
To understand this we need to go to the line feed the character.
Line feed is the ASCII character 10. It in most programming languages
escaped by writing \n, but in powershell it is `n. But Windows is not
content with just one character, Windows also uses carriage return
which is ASCII character 13. Escaped \r. So what is the difference?
Line feed advances the pointer down one row and carriage return
returns it to the left side again. If you store a file in Windows by
default are linebreaks are stored as first a carriage return and then
a line feed (\r\n). When we aren’t using any parameters for the
split() command it will split on all white-space characters, that is
both carriage return, linefeed, tabs and a few more. This is why we
are getting 5 results when there is both carriage return and line
feeds.
Apologies from the outset. I cannot give code that I am using.
I am querying a database via DBI and using Perl to print the output via fetchrow_array and print $variable.
But the fields in the database contain \0, \t, \r etc. as part of the normal text.
When these fields are printed via the variable and the print command, the \t, \r and \0 sequences are mistakenly printed as a tab, a newline and a hex character. I see no way to tell print to leave character sequences like these alone.
Any ideas?
Thanks.
Neither fetching data using DBI nor printing will convert \t into a tab. The only time Perl converts \t is if it's found in a double-quoted string literal[1], which is to say in a file passed to perl, do, require or use, or in a string passed to perl -e or eval EXPR.
If you have a tab, you are taking steps to convert \t to a tab, or it's actually a tab in the database.
[1] This includes qx and the replacement expression of a substitution without /e.
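A small sketch of the difference, using assumed sample strings rather than real DBI output: data that holds a backslash followed by t stays that way when printed, while a double-quoted literal holds a real tab. If the database turns out to contain real tabs, one option is to make them visible again before printing, as the last step does.

```perl
use strict;
use warnings;

# Single quotes: backslash and 't' stay two literal characters,
# just as they would in a value fetched from the database.
my $from_data = 'a\tb';

# Double quotes: Perl turns \t into one tab character in the literal.
my $from_literal = "a\tb";

printf "%d %d\n", length $from_data, length $from_literal;   # prints "4 3"

# Turn a real tab back into the two characters backslash and 't'.
(my $visible = $from_literal) =~ s/\t/\\t/g;
print "$visible\n";
```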
I want to read an input file line by line, but this file uses an unknown end-of-line character.
Vim does not recognize it either: it displays the character as ^A and continues immediately with the characters of the next line. The same goes for Perl, which tries to load all the lines at once because it does not treat this strange character as an end of line.
How can I set this character as the end of line for Perl? I don't want to use any special module for it (because of our strict system); I just want to define the end-of-line character (maybe as a hex code).
Another option is to convert the file into another one with a proper end-of-line character (replacing the strange ones). Can I do that in some easy way (something like sed on the input file)? Everything needs to be done in Perl, though.
Is it possible?
Now, my reading part looks like:
open (IN, $in_file);
$event=<IN>; # read one line
The ^A character you mention is the "start of heading" character. You can set the special Perl variable $/ to this character. Although, if you want your code to be readable and editable by the guy who comes after you (and uses another editor), I would do something like this:
use English;
local $INPUT_RECORD_SEPARATOR = "\cA"; # 'start of heading' character
while (<>)
{
chomp; # remove the unwanted 'start of heading' character
print $_ . "\n";
}
From Perldoc:
$INPUT_RECORD_SEPARATOR
$/
The input record separator, newline by default. This influences Perl's idea of what a "line" is.
More on special character escaping on PerlMonks.
Oh and if you want, you can enter the "start of heading" character in VI, in insert mode, by pressing CTRL+V, then CTRL+A.
edit: added local per Drt's suggestion
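For anyone who wants to try this without the original file, here is a self-contained sketch that uses no extra modules. It reads from an in-memory string instead of a real file; the sample content and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumed sample content: three records, each ended by ^A (\cA, 0x01).
my $content = "first\cAsecond\cAthird\cA";

# Open the string as an in-memory file so the sketch needs no real file.
open my $in, '<', \$content or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = "\cA";               # record separator, limited to this block
    while (my $record = <$in>) {
        chomp $record;              # chomp removes whatever $/ is set to
        push @records, $record;
    }
}
close $in;

print join("\n", @records), "\n";
```

Printing the records joined with "\n" also covers the conversion idea from the question: the output uses ordinary newlines.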
I am using Perl to read UTF-16LE files in Windows 7.
If I read in an ASCII file with the following code, each "\r\n" in the file is converted into "\n" in memory:
open CUR_FILE, "<", $asciiFile;
If I read in a UTF-16LE (Windows code page 1200) file with the following code, "\r\n" is left unchanged:
open CUR_FILE, "<:encoding(UTF-16LE)", $utf16leFile;
This inconsistency causes problems when I try to match lines that have line breaks with a regexp.
Update:
For each line of a UTF-16LE file:
$line =~ /(.*)$/
Then the string matched in $1 will include a "\r" at the end...
What version of Perl are you using? UTF-16 and CRLF handling did not mix properly before 5.8.9 (Unicode changes in 5.8.9). I'm not sure about 5.10.0, but it works in 5.10.1 and 5.8.9. You might need to use "<:encoding(UTF-16LE):crlf" when opening the file.
That is Windows performing that magic for you. Specifying a UTF encoding is the equivalent of opening the file in binary mode instead of text mode.
Newer versions of Perl (5.10 and later) have \R, a generic newline that matches \r\n as a unit as well as \n and other line terminators, and \v, which matches any single vertical whitespace character (\n, \r, form feed, vertical tab and the Unicode line and paragraph separators).
Does your regex logic allow using \R instead of \n?
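As a rough sketch of the \R idea, assuming Perl 5.10 or later and using made-up sample lines instead of a real UTF-16LE file, a trailing \R strips either style of line ending:

```perl
use strict;
use warnings;
use 5.010;    # \R needs Perl 5.10 or later

# Assumed sample lines: one with a Unix ending, one with a Windows ending.
my @lines = ("unix line\n", "windows line\r\n");

# Remove one generic newline from the end of a copy of each line.
my @stripped = map { (my $copy = $_) =~ s/\R\z//; $copy } @lines;

print "[$_]\n" for @stripped;
```

The same pattern can stand in for chomp when the data may mix both line-ending styles.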