Read each character from file and store in array - scanf

I need to use Verilog to store each character from input file into an array.
I'm able to use $fgets to read each line, but I'm not sure how to break it down to each character.
Input file:
foo
bar
joe
stack
main:
c = $fgetc(infile);
while(c != `EOF) begin
r = $ungetc(c,infile);
r = $fgets(str,infile);
c = $fgetc(infile);
$display("%0s",str);
end
I want to store it into str so that str[0][0] = f and so on.

You can use $fscanf to read one line at a time, discarding the newline character. Since each line only has one word, store the word into a temporary string variable word. Then push each word into a queue of strings.
module tb;
int fd;
string word;
string str [$];
initial begin
fd = $fopen("data.txt", "r");
while (! $feof(fd)) begin
void'($fscanf(fd, "%s\n", word));
str.push_back(word);
end
$display("%s", str[0][0]);
$display("%s", str[0][1]);
$display("%s", str[0][2]);
end
endmodule
Prints:
f
o
o
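For comparison, here is the same idea as a minimal Python sketch: collect one word per line into a list, then index individual characters. The input is inlined with `io.StringIO` instead of opening `data.txt`; that substitution is an assumption made only to keep the example self-contained.

```python
import io

# Stand-in for the input file from the question (assumption: inlined
# here so the sketch runs without data.txt on disk).
infile = io.StringIO("foo\nbar\njoe\nstack\n")

# One word per line; split() also drops the trailing newline.
words = [line.split()[0] for line in infile if line.strip()]

# words[0] is the first word, words[0][0] its first character.
print(words[0][0], words[0][1], words[0][2])
```

As in the queue-of-strings answer, the first index selects the word and the second index selects the character within it.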


How to get the length of a formatted string read in using fscanf in SystemVerilog?

I am reading a text file which has string test cases and decode them to process as Verilog test constructs to simulate. The code that I use to read a file is as follows:
integer pntr,file;
string a,b,c,d;
initial
begin
pntr = $fopen(FOO, "r");
end
always
begin
if(!$feof(pntr))
begin
file = $fscanf(pntr, "%s %s %s %s \n", a,b,c,d);
end
else
$fclose(pntr);
I have tried using
integer k;
k = strlen($fscanf(pntr, "%s %s %s %s \n", a,b,c,d));
$display(k);
and the display statement outputs an "x"
I also tried using
$display(file)
but this also gives me x as the display output. The above code is just a representation of my problem; I am using a larger format string to read in more data. Each line of my test case may have a different size, so I have sized the format for the maximum number of strings that a test case line can have. Is there a way to get the length of each line that I read, or the number of strings that $fscanf read?
Note: I am using Cadence tools for this task.
Input file looks like
read reg_loc1
write regloc2 2.5V regloc3 20mA
read regloc3 regloc5 regloc7
It's hard to debug your code when you have lots of typos and incomplete code. And you also have a race condition in that pntr may not have been assigned from $fopen if the always block executes before the initial block.
But in any case, the problem with using $fscanf and the %s format is that a newline gets treated as whitespace. It's better to use $fgets to read a line at a time, and then use $sscanf to parse the line:
module top;
integer pntr,file;
string a,b,c,d, line;
initial begin
pntr = $fopen("FOO", "r");
while(!$feof(pntr))
if ((file = $fgets(line,pntr)) != 0) begin
$write("%d line: ", file, line);
file = $sscanf(line, "%s %s %s %s \n", a,b,c,d);
$display(file,,a,,b,,c,,d);
end
$fclose(pntr);
end
endmodule

Shifting a string in matlab

OK, so I have retrieved this string from a text file, and now I am supposed to shift it by a specified amount. For example, if the string I retrieved was
To be, or not to be
That is the question
and the shift number was 5 then the output should be
stionTo be, or not
to beThat is the que
I was going to use circshift, but the lines of the given string do not have matching dimensions. Also, the string I retrieve comes from a .txt file.
So here is the code I used:
S = sprintf('To be, or not to be\nThat is the question')
circshift(S,5,2)
but the output is
stionTo be, or not to be
That is the que
but I need
stionTo be, or not
to beThat is the que
By storing the locations of the new lines, removing the new lines and adding them back in later we can achieve this. This code does rely on the insertAfter function which is only available in MATLAB 2016b and later.
S = sprintf('To be, or not to be\nThat is the \n question');
newline = regexp(S,'\n');
S(newline) = '';
S = circshift(S,5,2);
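for ii = 1:numel(newline)
% ii-1 newlines were removed and ii-1 two-character '\n' markers were
% already inserted before this point, so the offset is newline(ii)+ii-2
S = insertAfter(S,newline(ii)+ii-2,'\n');
end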
S = sprintf(S);
You can do this by performing a circular shift on the indices of the non-newline characters. (The code below actually skips all control characters with ASCII code < 32.)
function T = strshift(S, k)
T = S;
c = find(S >= ' '); % shift only printable characters (ascii code >= 32)
T(c) = T(circshift(c, k, 2));
end
Sample run:
>> S = sprintf('To be, or not to be\nThat is the question')
S = To be, or not to be
That is the question
>> r = strshift(S, 5)
r = stionTo be, or not
to beThat is the que
If you want to skip only the newline characters, just change to
c = find(S ~= 10);
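The index-based approach carries over to other languages. The following Python sketch (the function name `strshift` is reused for illustration only) circularly shifts every character except the newlines, which stay at their original positions.

```python
def strshift(s, k):
    """Circularly shift every character except newlines by k positions."""
    # Positions of the characters that take part in the shift.
    idx = [i for i, ch in enumerate(s) if ch != "\n"]
    chars = [s[i] for i in idx]
    if chars:
        k %= len(chars)
    # Rotate right by k; k == 0 is handled separately because
    # chars[-0:] would select the whole list.
    shifted = chars[-k:] + chars[:-k] if k else chars
    out = list(s)
    for i, ch in zip(idx, shifted):
        out[i] = ch
    return "".join(out)


print(strshift("To be, or not to be\nThat is the question", 5))
```

The newline positions are never written to, so the line lengths of the result match those of the input.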

What is substr in SystemVerilog?

int fd;
string str;
fd = $fopen(path, "r");
Status= $fgets(str, fd);
cm = str.substr(0,1);
cm1= str.substr(0,0);
I want to know what the substr function is. What is its purpose in the code above?
The substr function returns a new string that is a substring formed by characters in position i through j of str. Very similar to examples posted here.
module test;
string str = "Test";
initial
$display(str.substr(0,1));
endmodule
The output will be:
>> Te
As you can see in section 6.16.8, IEEE SystemVerilog Standard 1800-2012.
The substr function, as its name suggests, extracts a substring, i.e. it takes a chunk out of a bigger string, in SystemVerilog.
Example:
stri0 = "my_lago";
stri1 = stri0.substr(1,5);
$display("This will give stri1 = %s" , stri1);
....
OUTPUT :- This will give stri1 = y_lag
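To make the inclusive indexing explicit, here is a small Python sketch that mimics `substr(i, j)` for in-range indices. The helper name `substr` is chosen for illustration, and out-of-range behavior is deliberately not modeled.

```python
def substr(s, i, j):
    """Return characters i through j of s, both ends inclusive."""
    # Python slices exclude the end index, hence j + 1.
    return s[i:j + 1]


print(substr("Test", 0, 1))
print(substr("my_lago", 1, 5))
print(substr("Test", 0, 0))
```

The last call shows that `substr(0, 0)` yields a single character, which is what `cm1 = str.substr(0,0)` in the question does.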
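Substring (C#): note that this describes the C# method, not SystemVerilog's substr. It takes the position of the substring as (start index, length), whereas substr(i, j) takes a start index and an end index. It returns a new string with the characters in that range.
C# program using Substring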
using System;
class Program
{
static void Main()
{
string input = "ManCatDog";
// Get Middle three characters.
string subString = input.Substring(3, 3);
Console.WriteLine("SubString: {0}", subString);
}
}
Output
SubString: Cat

Deleting all special characters from a string in progress 4GL

How can I delete all special characters from a string in Progress 4GL?
I guess this depends on your definition of special characters.
You can remove ANY character with REPLACE. Simply set the to-string part of replace to blank ("").
Syntax:
REPLACE ( source-string , from-string , to-string )
Example:
DEFINE VARIABLE cOldString AS CHARACTER NO-UNDO.
DEFINE VARIABLE cNewString AS CHARACTER NO-UNDO.
cOldString = "ABC123AACCC".
cNewString = REPLACE(cOldString, "A", "").
DISPLAY cNewString FORMAT "x(10)".
You can use REPLACE to remove a complete matching string. For example:
REPLACE("This is a text with HTML entity &amp;", "&amp;", "").
Handling "special characters" can be done in a number of ways. If you mean special "ASCII" characters like linefeed, bell and so on you can use REPLACE together with the CHR function.
Basic syntax (you could add some information about code pages as well but that's rarely needed) :
CHR( expression )
expression: An expression that yields an integer value that you want to convert to a character value (the numeric character code).
So if you want to remove every Swedish letter Ö (character code 214 in ISO 8859-1) from a text you could do:
REPLACE("ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ", "Ö", "").
or
REPLACE("ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ", CHR(214), "").
Putting this together you could build an array of unwanted characters and remove all those in the string. For example:
FUNCTION cleanString RETURNS CHARACTER (INPUT pcString AS CHARACTER):
DEFINE VARIABLE iUnwanted AS INTEGER NO-UNDO EXTENT 3.
DEFINE VARIABLE i AS INTEGER NO-UNDO.
/* Remove all capital Swedish letters ÅÄÖ */
iUnwanted[1] = 197.
iUnwanted[2] = 196.
iUnwanted[3] = 214.
DO i = 1 TO EXTENT(iUnwanted):
IF iUnwanted[i] <> 0 THEN DO:
pcString = REPLACE(pcString, CHR(iUnwanted[i]), "").
END.
END.
RETURN pcString.
END.
DEFINE VARIABLE cString AS CHARACTER NO-UNDO INIT "AANÅÅÖÖBBCVCÄÄ".
DISPLAY cleanString(cString) FORMAT "x(10)".
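The same remove-by-character-code idea can be sketched in Python. The function name `clean_string` and its default code list are illustrative assumptions that mirror the example above, and the sketch assumes the accented letters are stored as single precomposed characters.

```python
def clean_string(text, unwanted_codes=(197, 196, 214)):
    """Remove every character whose code point is in unwanted_codes."""
    # str.translate deletes characters that map to None.
    table = {code: None for code in unwanted_codes}
    return text.translate(table)


print(clean_string("AANÅÅÖÖBBCVCÄÄ"))
```

Building the table once and translating in a single pass avoids one full REPLACE-style scan per unwanted character.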
Other functions that could be useful to look into:
SUBSTRING: Returns a part of a string. Can be used to modify it as well.
ASC: Like CHR but the other way around (returns the numeric character code of a character).
INDEX: Returns the position of a character in a string.
R-INDEX: Like INDEX but searches right to left.
STRING: Converts a value of any data type into a character value.
This function replaces accented (diacritic) characters with their plain equivalents, according to the current collation.
function Dia2Plain returns character (input icTxt as character):
define variable ocTxt as character no-undo.
define variable i as integer no-undo.
define variable iAsc as integer no-undo.
define variable cDia as character no-undo.
define variable cPlain as character no-undo.
assign ocTxt = icTxt.
repeat i = 1 to length(ocTxt):
assign cDia = substring(ocTxt,i,1)
cPlain = "".
if asc(cDia) > 127
then do:
repeat iAsc = 65 to 90: /* A..Z */
if compare(cDia, "eq" , chr(iAsc), "case-sensitive")
then assign cPlain = chr(iAsc).
end.
repeat iAsc = 97 to 122: /* a..z */
if compare(cDia, "eq" , chr(iAsc), "case-sensitive")
then assign cPlain = chr(iAsc).
end.
if cPlain <> ""
then assign substring(ocTxt,i,1) = cPlain.
end.
end.
return ocTxt.
end.
/* testing */
def var c as char init "ÄëÉÖìÇ".
disp c Dia2Plain(c).
def var i as int.
def var d as char.
repeat i = 128 to 256:
assign c = chr(i) d = Dia2Plain(chr(i)).
if asc(c) <> asc(d) then disp i c d.
end.
This function will remove anything that is not a letter or number (adapt it as you wish).
/* remove any characters that are not numbers or letters */
FUNCTION alphanumeric RETURNS CHARACTER
(lch_string AS CHARACTER).
DEFINE VARIABLE lch_newstring AS CHARACTER NO-UNDO.
DEFINE VARIABLE i AS INTEGER NO-UNDO.
DO i = 1 TO LENGTH(lch_string):
/* check to see if this is a number or letter */
IF (ASC(SUBSTRING(lch_string,i,1)) GE ASC("0")
AND ASC(SUBSTRING(lch_string,i,1)) LE ASC("9"))
OR (ASC(SUBSTRING(lch_string,i,1)) GE ASC("A")
AND ASC(SUBSTRING(lch_string,i,1)) LE ASC("Z"))
OR (ASC(SUBSTRING(lch_string,i,1)) GE ASC("a")
AND ASC(SUBSTRING(lch_string,i,1)) LE ASC("z"))
THEN
/* only keep it if it is a number or letter */
lch_newstring = lch_newstring + SUBSTRING(lch_string,i,1).
END.
RETURN lch_newstring.
END FUNCTION.
Or, if .NET classes are available to your ABL session (OpenEdge on Windows), you can simply use a regex:
System.Text.RegularExpressions.Regex:Replace("Say,Hi!", "[^a-zA-Z0-9]","")
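For readers who want to try the pattern outside of OpenEdge, the same character class works in Python's `re` module. This is a minimal sketch; the function name `alphanumeric` simply mirrors the ABL function above.

```python
import re


def alphanumeric(text):
    """Keep only ASCII letters and digits, dropping everything else."""
    return re.sub(r"[^A-Za-z0-9]", "", text)


print(alphanumeric("Say,Hi!"))
```

The negated class `[^A-Za-z0-9]` matches every character that is not a letter or digit, so replacing each match with an empty string keeps only the alphanumerics, including the digit 0.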

SAS Reading multiple records from one line without Line Feed CRLF

I have only one line, with no line feed (CRLF) in it; the record separator is a string of 4 characters, in this example "#A$3". I don't need a field delimiter (dlm) for now, and I need to import it from an external file (/files/Example.txt):
JOSH 30JUL1984 1011 SPANISH#A$3RACHEL 29OCT1986 1013 MATH#A$3JOHNATHAN 05JAN1985 1015 chemistry
I need to split this line into 3 lines:
JOSH 30JUL1984 1011 SPANISH
RACHEL 29OCT1986 1013 MATH
JOHNATHAN 05JAN1985 1015 chemistry
How can I do that in SAS?
*Added: Your solutions work with this example, but I have an issue: my real line is longer than the maximum length allowed for a line (32,767 bytes).
For example, suppose the line in the exercise above contains 5,000 records.
Is it possible?
Use the DLMSTR= option on the infile statement -- this will specify "#A$3" as the delimiter. Then use @@ on the input statement to tell SAS to look for more records on the same line.
data test;
infile "/files/Example.txt" dsd dlmstr='#A$3';
informat var $255.;
input var $ @@;
run;
With your example, you will get a data set with 3 records with 1 variable containing the strings you are looking for.
Adjust the length of var as needed.
You could do something like this:
First import the file as a single row (be sure to adjust the length):
DATA WORK.IMPORTED_DATA;
INFILE "/files/Example.txt" TRUNCOVER;
LENGTH Column1 $ 255;
INPUT #1 Column1 $255.;
RUN;
Then parse imported data into variables using a data step:
data result (keep=var1-var4);
set WORK.IMPORTED_DATA;
delim = '#A$3';
end = 1;
begin = 1;
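do while (end > 0);
end = find(Column1, delim, begin);
/* the last record has no trailing delimiter: take the rest of the line */
if end > 0 then row = substr(Column1, begin, end - begin);
else row = substr(Column1, begin);
var1 = scan(row, 1);
var2 = scan(row, 2);
var3 = scan(row, 3);
var4 = scan(row, 4);
begin = end + length(delim);
output;
end;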
run;
Try this in data step by viewing #A$3 as a multi-character delimiter:
data want (keep=subject);
infile 'C:\sasdata\test.txt';
input;
length line $4500 subject $80;
line=tranwrd(_infile_,"#A$3",'!');
do i=1 by 1 while (scan(line,i,'!') ^= ' ');
subject=scan(line,i,'!');
output;
end;
run;
_infile_ gives the current row that is being read in the data step. I converted the multi-character delimiter #A$3 into a single-character delimiter; tranwrd() can replace a sub-string inside a string. I then used that single character as the delimiter in the scan() function.
Also, if you want to break the values up into separate variables, just scan some more. E.g. put something like B = scan(subject,2); into do loop and data want (keep= A B C D);. Cheers.
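When the single line is too long to hold in one buffer, the splitting can also be done in fixed-size chunks. The Python sketch below illustrates the technique only and is not a SAS solution; `iter_records` and its parameters are hypothetical names, and the input is inlined with `io.StringIO` instead of reading /files/Example.txt.

```python
import io


def iter_records(stream, delim="#A$3", chunk_size=4096):
    """Yield delimiter-separated records from a text stream, chunk by chunk."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        parts = buffer.split(delim)
        # The last piece may be an incomplete record (or end in a partial
        # delimiter), so keep it in the buffer for the next round.
        buffer = parts.pop()
        yield from parts
    if buffer:
        yield buffer


data = ("JOSH 30JUL1984 1011 SPANISH#A$3RACHEL 29OCT1986 1013 MATH"
        "#A$3JOHNATHAN 05JAN1985 1015 chemistry")

# A tiny chunk size forces delimiters to straddle chunk boundaries.
records = list(iter_records(io.StringIO(data), chunk_size=5))
for record in records:
    print(record.split())
```

Only the unfinished tail of the input is held in memory at any time, so the total line length does not matter.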