Is EOF hidden in a txt file?

I have made a .exe file (echo_eof.exe), written in C.
The code goes like this:
#include <stdio.h>

int main(void)
{
    int ch;

    while ((ch = getchar()) != EOF)
        putchar(ch);
}
Then I typed echo_eof < words.txt in the Windows cmd prompt, where words.txt contains
Hello world!
The command output is
Hello world!
I never typed EOF in the text file, but it seems as if EOF is hidden in it. Is this true? If it is, is there a way to see the hidden EOF in the text file?

If your reading function is at the end of the file and can't get another character, it is told that you have reached EOF.
EOF is not stored in the file; it is a signal from the I/O library. The operating system knows the file's size, so when nothing is left to read, getchar() returns the special out-of-band value EOF instead of a character.
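To see that there is nothing to find in the file itself, note that EOF is just an ordinary negative integer constant from <stdio.h> (commonly -1, though the exact value is implementation-defined). A minimal sketch:

#include <stdio.h>

int main(void)
{
    /* EOF is a negative int; no byte read from a file can equal it,
       which is also why getchar() returns int rather than char. */
    printf("On this system, EOF expands to %d\n", EOF);
    return 0;
}

At an interactive Windows prompt you signal end-of-input with Ctrl+Z, but that key combination is consumed by the console; it does not put an EOF character into any file.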


Garbage characters printed by vscode

Every time I use the terminal to print a string or any kind of character, it automatically prints a "%" at the end of each line. This happens every time I try to print something from C++ or PHP; I haven't tried other languages yet. I think it might be something with vscode, and I have no idea where it came from or how to fix it.
#include <iostream>
using namespace std;

int test = 2;

int main()
{
    if (test < 9999) {
        test = 1;
    }
    cout << test;
}
Output:
musti#my-mbp clus % g++ main.cpp -o tests && ./tests
1%
Also, changing cout << test; to cout << test << endl; removes the % from the output.
Are you using zsh? A line without endl (i.e. without a trailing newline) is considered a "partial line", so zsh shows a color-inverted % and then goes to the next line.
When a partial line is preserved, by default you will see an inverse+bold character at the end of the partial line: a ‘%’ for a normal user or a ‘#’ for root. If set, the shell parameter PROMPT_EOL_MARK can be used to customize how the end of partial lines are shown.
More information is available in their docs.
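If you would rather keep the fresh prompt line but hide the marker, the PROMPT_EOL_MARK parameter quoted above can be set to an empty string. A minimal sketch, assuming zsh (put it in ~/.zshrc to make it permanent):

# show nothing instead of the inverse-video % on partial lines
PROMPT_EOL_MARK=''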

Lex program to count the number of words

I made the following lex program to count the number of words in a text file. A 'word' for me is any string that starts with a letter and is followed by zero or more letters, digits, or underscores.
%{
int words;
%}
%%
[a-zA-Z][a-zA-Z0-9_]*   { words++; printf("%s %d\n", yytext, words); }
.                       ;
%%
int main(int argc, char* argv[])
{
    if (argc == 2)
    {
        yyin = fopen(argv[1], "r");
        yylex();
        printf("No. of Words : %d\n", words);
        fclose(yyin);
    }
    else
        printf("Invalid No. of Arguments\n");
    return 0;
}
The problem is that for the following text file I am getting No. of Words : 13. I tried printing yytext, and it shows that 'manav' from '9manav' is taken as a word, even though that does not match my definition of a word.
I also tried including [0-9][a-zA-Z0-9_]* ; in my code, but it still shows the same output. I want to know why this is happening and possible ways to avoid it.
Text file:
the quick brown fox jumps right over the lazy dog cout for
9manav
-99-7-5 32 69 99 +1
First, manav perfectly matches your definition of a word; the 9 in front of it is consumed by the . rule. Remember that whitespace is not special in lex.
You had the right idea by adding another rule [0-9][a-zA-Z0-9_]* ;. When more than one rule can match, lex takes the longest match and only uses rule order to break ties between matches of the same length, so this rule (which matches all six characters of 9manav, versus the single character matched by .) should swallow the whole token wherever you put it. It has been a while since I worked with lex, but putting it before the word rule, as sketched below, keeps the intent clear.
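A hedged sketch of the rules section with the extra rule added (relying on the longest-match behaviour described above):

%%
[a-zA-Z][a-zA-Z0-9_]*   { words++; printf("%s %d\n", yytext, words); }
[0-9][a-zA-Z0-9_]*      ;   /* swallows all of 9manav: 6 chars beat the 1 char of . */
.                       ;
%%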

Weird-looking symbols in a .dat file?

I'm learning to program in C. I used a textbook to learn about writing data randomly to a random-access file. It seems like the textbook code works OK. However, the output in the Notepad file is: Jones Errol Ÿru ”#e©A Jones Raphael Ÿru €€“Ü´A. This can't be correct, yeah? Do you know why the numbers don't show?
I have no idea how to format code properly; someone always tells me it is bad. I use Ctrl+K, and in my compiler I follow the book exactly. I'm sorry if it isn't correct. Maybe you can tell me how? Thanks
Here is the code:
#include <stdio.h>

// clientData structure definition
struct clientData {
    int acctNum;
    char lastName[15];
    char firstName[10];
    double balance;
};

int main(void)
{
    FILE *cfPtr; // credit.dat file pointer

    // create clientData with default information
    struct clientData client = {0, "", "", 0.0};

    if ((cfPtr = fopen("c:\\Users\\raphaeljones\\Desktop\\clients4.dat", "rb+")) == NULL) {
        printf("The file could not be opened\n");
    }
    else {
        // require user to specify account number
        printf("Enter account number"
               " (1 to 100, 0 to end input)\n");
        scanf("%d", &client.acctNum);

        // user enters information which is copied into the file
        while (client.acctNum != 0) {
            // user enters last name, first name and balance
            printf("Enter lastname, firstname, balance\n");

            // set record lastName, firstName and balance values
            fscanf(stdin, "%s%s%lf", client.lastName,
                   client.firstName, &client.balance);

            // seek position in file to the user-specified record
            fseek(cfPtr,
                  (client.acctNum - 1) * sizeof(struct clientData),
                  SEEK_SET);

            // write the user-specified information to the file
            fwrite(&client, sizeof(struct clientData), 1, cfPtr);

            // enable user to input another account number
            printf("Enter account number\n");
            scanf("%d", &client.acctNum);
        }
        fclose(cfPtr);
    }
    return 0;
}
You have created a structure clientData which contains an integer, two character arrays and a double. You open the file in binary mode and you use fwrite() to write the whole structure to it.
This means you are writing the integer and the double in binary, not as character strings, so what you see is logically correct: Notepad displays the raw bytes of those fields as whatever characters they happen to decode to. You could read the file back into a structure with fread() and then print it out.
If you want to create a text file, you should use fprintf(). You can specify the field widths for integer and double values, so you can still create a fixed-length record (which is essential for random access).
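A hedged sketch of both suggestions, reusing struct clientData from the question (the relative file names and field widths below are illustrative assumptions):

#include <stdio.h>

struct clientData {
    int acctNum;
    char lastName[15];
    char firstName[10];
    double balance;
};

int main(void)
{
    struct clientData client;

    FILE *fp = fopen("clients4.dat", "rb");
    if (fp == NULL)
        return 1;

    // Read the first binary record back and print it as text.
    if (fread(&client, sizeof client, 1, fp) == 1) {
        printf("%d %s %s %.2f\n", client.acctNum,
               client.lastName, client.firstName, client.balance);

        // Alternatively, write the same record as fixed-width text that
        // Notepad can display; constant widths keep every record the
        // same length, so random access with fseek() still works.
        FILE *txt = fopen("clients4.txt", "w");
        if (txt != NULL) {
            fprintf(txt, "%4d %-14s %-9s %12.2f\n", client.acctNum,
                    client.lastName, client.firstName, client.balance);
            fclose(txt);
        }
    }
    fclose(fp);
    return 0;
}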

What is the fastest way to autobreak a line of gigabytes separated by keywords using bash shell?

For example, given a line a11b12c22d322 e..., where the break fields are the numbers or spaces, we want to transform it into
a
b
c
d
e
...
sed needs to read the whole line into memory; with gigabytes in a single line that is not efficient, and the job cannot be done at all if we don't have sufficient memory.
EDIT:
Could anyone please explain how grep, tr, awk, perl, and python manage memory when reading a large file? What, and how much, do they read into memory at a time?
If you use gawk (which is the default awk on Linux, I believe), you can use the RS variable to specify that multi-digit numbers or runs of spaces are recognized as record separators instead of a newline.
awk '{print}' RS="[[:digit:]]+| +" file.txt
As to your second question, all of these programs read some fixed number of bytes into an internal buffer and search it for their idea of a line separator, to simulate reading a single line at a time. To prevent them from buffering too much data while searching for the end of the line, you change the program's idea of what terminates a line.
Most languages let you do this, but only with a single character. gawk makes it easy by letting you specify a regular expression that recognizes the end of a record. This saves you from having to implement the fixed-size buffer and end-of-line search yourself.
Fastest... You can do it with the help of gcc. Here's a version which reads data from the given file name if one is given, otherwise from stdin. If this is still too slow, you can see whether replacing getchar() and putchar() (which may be macros and should optimize very well) with your own buffering code makes it faster. If we want to get ridiculous, for even more speed you could use three threads, so the kernel can copy the next block of data in on one core while another core does the processing and a third copies the processed output back to the kernel.
#!/bin/bash
set -e

BINNAME=$(mktemp)

gcc -xc -O3 -o $BINNAME - <<"EOF"
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int sep = 0;

    /* speed is a requirement, so let's reduce io overhead */
    const int bufsize = 1024*1024;
    setvbuf(stdin, malloc(bufsize), _IOFBF, bufsize);
    setvbuf(stdout, malloc(bufsize), _IOFBF, bufsize);
    /* above buffers intentionally not freed, it doesn't really matter here */

    int ch;
    while ((ch = getc(stdin)) >= 0) {
        if (isdigit(ch) || isspace(ch)) {
            if (!sep) {
                if (putc('\n', stdout) == EOF) break;
                sep = 1;
            }
        } else {
            sep = 0;
            if (putc(ch, stdout) == EOF) break;
        }
    }

    /* flush should happen by on-exit handler, as buffer is not freed,
       but this will detect write errors, for program exit code */
    fflush(stdout);
    return ferror(stdin) || ferror(stdout);
}
EOF

if [ -z "$1" ] ; then
    $BINNAME <&0
else
    $BINNAME <"$1"
fi
Edit: I happened to look at GNU/Linux stdio.h; some notes: putchar/getchar are not macros, but putc/getc are, so using those instead might be a slight optimization, probably avoiding one function call. I changed the code to reflect this, and also added checking of putc's return code while at it.
With grep:
$ grep -o '[^0-9 ]' <<< "a11b12c22d322 e"
a
b
c
d
e
With sed:
$ sed 's/[0-9 ]\+/\n/g' <<< "a11b12c22d322 e"
a
b
c
d
e
With awk:
$ awk 'gsub(/[0-9 ]+/,"\n")' <<< "a11b12c22d322 e"
a
b
c
d
e
I'll let you benchmark.
Try with tr:
tr -s '[:digit:][:space:]' '\n' <<< "a11b12c22d322e"
That yields:
a
b
c
d
e

Is there any Progress 4GL statement for editing ASCII files?

Is there any 4GL statement for editing an ASCII file on disk? If so, how?
Editing involves reading a file, probably using IMPORT, then manipulating the text using string functions like REPLACE() and finally writing the result probably using PUT. Something like this:
define stream inFile.
define stream outFile.

define variable txtLine as character no-undo.

input stream inFile from "input.txt".
output stream outFile to "output.txt".

repeat:
    import stream inFile unformatted txtLine.
    txtLine = replace( txtLine, "abc", "123" ). /* edit something */
    put stream outFile unformatted txtLine skip.
end.

input stream inFile close.
output stream outFile close.
Yes, there is. You can use a STREAM to do so.
/* Define a new named stream */
DEF STREAM myStream.
/* Define the output location of the stream */
OUTPUT STREAM myStream TO VALUE("c:\text.txt").
/* Write some text into the file */
PUT STREAM myStream UNFORMATTED "Does this work?".
/* Close the stream now that we're done with it */
OUTPUT STREAM myStream CLOSE.
Progress can call an operating system editor:
OS-COMMAND("vi /tmp/yoyo.txt").
You could use COPY-LOB to read and write the file:
DEF VAR lContents AS LONGCHAR NO-UNDO.
/* read file */
COPY-LOB FROM FILE "ascii.txt" TO lContents.
/* change Contents, e.g. all capital letters */
lContents = CAPS(lContents).
/* save file */
COPY-LOB lContents TO FILE "ascii.txt".
I think that by "editing" you mean being able to read the file, show it on screen, and then manipulate it?
If so, here you have an easy one; of course, the size of the file can't be bigger than the maximum capacity of a character variable:
def var fileline as char format "x(250)". /* or shorter or longer, up to you */
def var filedit as char.

/* quote the file so each line can be read into the char variable */
unix silent quoter kk.txt > kk.quoted.

input from kk.quoted no-echo.
repeat:
    set fileline.
    filedit = filedit + fileline + chr(13) + chr(10).
end.
input close.

update filedit view-as editor size 65 by 10.
Surely you can manage to save the file once it's edited ;-)
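For completeness, a hedged sketch of that save step (assuming the edited text is still in filedit and that overwriting kk.txt is acceptable):

/* write the edited buffer back to disk */
output to kk.txt.
put unformatted filedit.
output close.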