creating url with sprintf creates wrong url - matlab

I am trying to create a URL using sprintf. To open various websites, I change part of the URL using sprintf. However, the following code writes the URL several times instead of replacing part of it. Any suggestions? Many thanks!
current_stock = 'AAPL';
current_url = sprintf('http://www.finviz.com/quote.ashx?t=%d&ty=c&ta=0&p=d',current_stock)
web(current_url, '-browser')
%d should be the placeholder for AAPL. The result is:
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=80&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=76&ty=c&ta=0&p=d

I'm not sure why you're using %d for a value that is clearly a string? You should be using %s.
The reason you're seeing what you're seeing is that it appears to be giving you a copy of your format string for each character in the AAPL string.
You can see that the differences lie solely in the ?t=XX bit, with XX being, in sequence, 65, 65, 80 and 76, the ASCII codes for the four letters in your string:
                                   vv
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=80&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=76&ty=c&ta=0&p=d
                                   ^^
Whether that's a feature or a bug in MATLAB (a), I couldn't say for sure, but I suspect it'll fix itself if you just use the correct format specifier.
(a) It's probably a feature, since it does similarly intelligent stuff with other mismatches, as per the sprintf documentation:
If you apply a string conversion (%s) to integer values, MATLAB converts values that correspond to valid character codes to characters. For example, '%s' converts [65 66 67] to ABC.

I would use plain string concatenation instead, placing the ticker where the placeholder was:
current_stock = 'AAPL';
current_url = ['http://www.finviz.com/quote.ashx?t=', current_stock, '&ty=c&ta=0&p=d'];
web(current_url,'-browser')
That redirected me to a valid webpage.


Replace every non letter or number character in a string with another

Context
I am designing a code that runs a bunch of calculations, and outputs figures. At the end of the code, I want to save everything in a nice way, so my take on this is to go to a user specified Output directory, create a new folder and then run the save process.
Question(s)
My question is twofold:
1. I want my folder name to be unique. I was thinking about getting the current date and time and creating a unique name from this and the input filename. This works but it generates folder names that are a bit cryptic. Is there some good practice / convention I have not heard of to do that?
2. When I get the datetime string (tn = datestr(now);), it looks like this:
tn =
'07-Jul-2022 09:28:54'
To convert it to a nice filename, I replace the '-', ' ' and ':' characters with underscores and append the result to a shorter version of the input filename chosen by the user. I do that using strrep:
tn = strrep(tn,'-','_');
tn = strrep(tn,' ','_');
tn = strrep(tn,':','_');
This is fine but it bugs me to have to use 3 lines of code to do so. Is there a nice one liner to do that? More generally, is there a way to look for every non letter or number character in a string and replace it with a given character? I bet that's what regexp is there for but frankly I can't quite get a hold on how regexps work.
Your point (1) is opinion based so you might get a variety of answers, but I think a common convention is to at least start the name with a reverse-order date string so that sorting alphabetically is the same as sorting chronologically (i.e. yymmddHHMMSS).
To answer your main question directly, you can use the built-in makeValidName utility which is designed for making valid variable names, but works for making similarly "plain" file names.
str = '07-Jul-2022 09:28:54';
str = matlab.lang.makeValidName(str)
% str = 'x07_Jul_202209_28_54'
Because a valid variable can't start with a number, it prefixes an x - you could avoid this by manually prefixing something more descriptive first.
This option is a bit simpler than working out the regex, although that would be another option which isn't too nasty here: use regexprep on the original string and replace non-word characters with an underscore:
str = regexprep( '07-Jul-2022 09:28:54', '\W', '_' ); % \W (capital W) matches any char that is not a letter, digit or underscore
% str = '07_Jul_2022_09_28_54'
To answer indirectly with a different approach, a nice trick with datestr which gets around this issue and addresses point (1) in one hit is to use the following syntax:
str = datestr( now(), 30 );
% str = '20220707T094214'
The 30 input (from the docs) gives you an ISO standardised string to the nearest second in reverse-order:
'yyyymmddTHHMMSS' (ISO 8601)
(note the T in the middle isn't a placeholder for some time measurement, it remains a literal letter T to split the date and time parts).
I normally use your folder naming approach with a meaningful prefix, replacing ':' by something else:
folder_name = ['results_' strrep(datestr(now), ':', '.')];
As for your second question, you can use isstrprop:
folder_name(~isstrprop(folder_name, 'alphanum')) = '_';
Or if you want more control on the allowed characters you can use good old ismember:
folder_name(~ismember(folder_name, ['0':'9' 'a':'z' 'A':'Z'])) = '_';

how to remove # character from national data type in cobol

I am facing an issue while converting Unicode data into national characters.
When I convert the Unicode data into national using the NATIONAL-OF function, a junk character like # is appended after the string.
E.g.
01 Ws-unicode pic X(200).
01 Ws-national pic N(600).
-- let the value in Ws-unicode be これらの変更は, received from the Java end.
move function national-of ( Ws-unicode ,1208 ) to Ws-national.
-- after converting, the value is これらの変更は #.
I do not want the extra # character added after conversion.
Please help me find a possible solution. I have tried replacing N'#' with a space using the INSPECT statement.
It worked well but failed in some specific scenarios, such as when the user's input itself contains #. In that case the genuine # is also converted to a space.
Below is a snippet of code I used to convert EBCDIC to UTF. Before I was capturing string lengths, I was also getting # symbols:
*> The pointer must hold a valid start position before STRING runs
MOVE 1 TO WS-XML-UTF8-LENGTH
STRING
    FUNCTION DISPLAY-OF (
        FUNCTION NATIONAL-OF (
            WS-EBCDIC-STRING(1:WS-XML-EBCDIC-LENGTH)
            WS-EBCDIC-CCSID
        )
        WS-UTF8-CCSID
    )
    DELIMITED BY SIZE
    INTO WS-UTF8-STRING
    WITH POINTER WS-XML-UTF8-LENGTH
END-STRING
SUBTRACT 1 FROM WS-XML-UTF8-LENGTH
What this code does is string the UTF-8 representation of the EBCDIC string into another variable. The WITH POINTER clause captures the new length of the string + 1 (+ 1 because the pointer is left at the position just after the end of the string); the pointer must be initialised, typically to 1, before the STRING statement runs.
Using this method, you should know exactly how long the second string is and be able to use that string with its exact length.
That should remove the unwanted #s.
EDIT:
One thing I forgot to mention: in my case, the # signs were actually EBCDIC low values when viewing the actual hex on the mainframe.
Use INSPECT with REVERSE and stop after the first occurrence of #.

Checking the format of a string in Matlab

I'm reading multiple text files in MATLAB whose first column holds "times". These times are either in the format 'MM:SS.milliseconds' (sorry if that's not the proper way to express it), where for example the string '29:59.9' means (29*60)+(59)+(.9) = 1799.9 seconds, or in plain seconds.milliseconds, where '29.9' means 29.9 seconds. The format is consistent within a single file but varies across files. Since I would like all times in the second format, I want to check whether the strings match the first format. If they do, convert them; otherwise, continue. The code below is my conversion code, so my question is: how do I check the format of the string? In other words, I need a condition for an if statement that tests which format is in use.
%% Modify the textdata to convert time to seconds
timearray = textdata(2:end, 1);
if (timearray(1, 1) %{has format 'MM:SS.millisecond'}%)
    datev = datevec(timearray);
    newtime = (datev(:, 5)*60) + (datev(:, 6));
elseif (timearray(1, 1) %{has format 'SS.millisecond'}%)
    newtime = timearray;
end
You can use regular expressions to help you out. Regular expressions are methods of specifying how to search for particular patterns in strings. As such, you want to find if a string follows the formats of either:
xx:xx.x
or:
xx.x
The regular expression syntax for each of these is defined as the following:
^[0-9]+:[0-9]+\.[0-9]+
^[0-9]+\.[0-9]+
Let's step through how each of these work.
For the first one, ^[0-9]+ means that the string should start with a digit (^[0-9]), and the + means that there should be at least one digit. As such, 1, 2, ... 10, ... 20, ... etc. are all valid beginnings. This must be followed by a :, then another sequence of one or more digits, then a ., and finally one more sequence of digits. Notice how I used \. to specify the . character. Using . by itself makes it a wildcard that matches any character. This is obviously not what you want, so to match a literal . you need to prepend a \ to it.
For the second one, it's almost the same as the first one. However, there is no : delimiter, and we only have the . to work with.
To invoke regular expressions, use the regexp command in MATLAB. It is done using:
ind = regexp(str, expression);
str represents the string you want to check, and expression is one of the regular expressions discussed above. The regular expression is passed in as a string, so make sure you enclose it in single quotes. ind then returns the starting index of each match found in str. Because both patterns are anchored with ^, ind is either 1, indicating that the pattern matched at the beginning of the string, or empty ([]) if there was no match. Here's a reproducible example for you:
B = {'29:59.9', '29.9', '45:56.8', '24.5'};
for k = 1 : numel(B)
    if (regexp(B{k}, '^[0-9]+:[0-9]+\.[0-9]+') == 1)
        disp('I''m the first case!');
    elseif (regexp(B{k}, '^[0-9]+\.[0-9]+') == 1)
        disp('I''m the second case!');
    end
end
The code prints I'm the first case! for strings in the first format and I'm the second case! for strings in the second format. Running it, we get:
I'm the first case!
I'm the second case!
I'm the first case!
I'm the second case!
Without knowing how your strings are formatted, I can't do the rest of it for you, but this should be a good start for you.
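For readers who want to see the check and the conversion together, here is a minimal sketch of the same two patterns ported to Python's re module. It is a port for illustration, not MATLAB code. The function name to_seconds and the trailing $ anchor (which rejects trailing junk) are assumptions that are not part of the answer above.

```python
import re

# Same patterns as above, with an added '$' so the whole string must match
MIN_SEC = re.compile(r'^([0-9]+):([0-9]+\.[0-9]+)$')   # e.g. '29:59.9'
SEC_ONLY = re.compile(r'^[0-9]+\.[0-9]+$')             # e.g. '29.9'


def to_seconds(text):
    """Convert 'MM:SS.s' or 'SS.s' text to seconds; raise ValueError otherwise."""
    match = MIN_SEC.match(text)
    if match:
        # First case: minutes * 60 plus the seconds part
        return int(match.group(1)) * 60 + float(match.group(2))
    if SEC_ONLY.match(text):
        # Second case: already plain seconds
        return float(text)
    raise ValueError('unrecognised time format: %r' % text)


times = [to_seconds(t) for t in ['29:59.9', '29.9', '45:56.8', '24.5']]
```

Raising on anything that matches neither pattern is a design choice: a file in an unexpected third format then fails loudly instead of being silently passed through.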

ValueError when converting to int lines retrieved via linecache.getline

This is my first message here; I hope I will not make any mistakes.
I am writing a Python 2.7 script which performs comparisons between lines from a long list of lines provided as an external input file. Some of these lines contain just numbers, and on those I perform simple sums after retrieving them via linecache.getline.
My problem is that after a certain number of lines I am getting the error:
ValueError: invalid literal for int() with base 10
I understand that this has to do with converting the retrieved lines to the integer type, but according to what I read, each line should be retrieved from an in-memory cache as a string, and indeed if I print the type of the retrieved values I get str. I printed the problematic values to understand why they failed to convert to int: at first I had made some logic mistakes (I was taking some wrong lines, which contained letters, and these of course failed to convert to int), but I still get the error on purely numerical strings. On all of those numerical strings, I tried len(linecache.getline('input', line_n)) to see if any extra characters were present, but I just found '\n', which does not cause any problems when converting from str to int.
My input file is made by a series of lines, some numerical some not; here are few lines:
1
id3021-a
1
129485768
129485769
2
id2034
102
944709842
944709848
For example, line 4 here can be retrieved, but not converted to int. How can I convert str to int without getting errors?
I found the solution! Adding a '0' to the beginning of the string fixes the problem (I do not know why, the problematic lines were not empty):
int('0' + linecache.getline('input', line_n))
See here: Trouble converting string to int in Django/Python
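One possible explanation, offered as an assumption since the thread does not confirm it: linecache.getline returns an empty string rather than raising when the requested file or line cannot be read, and int('') raises exactly this ValueError, whereas int('0' + '') quietly yields 0. A minimal sketch of a stricter alternative follows; parse_int_line and the file name are hypothetical names chosen for illustration.

```python
import linecache


def parse_int_line(text):
    """Return the integer held in `text`, or None if it is blank or non-numeric."""
    stripped = text.strip()          # drops the trailing '\n' and other whitespace
    if not stripped:
        return None                  # empty string or blank line: nothing to convert
    try:
        return int(stripped)
    except ValueError:
        return None                  # e.g. 'id3021-a'


# linecache.getline does not raise for a missing file or an out-of-range
# line number; it returns '' (the file name here is hypothetical)
missing = linecache.getline('hypothetical_missing_file.txt', 4)

values = [parse_int_line(s) for s in ['129485768\n', 'id3021-a\n', '', missing]]
```

Unlike the '0' prefix, this keeps "no value" distinguishable from a genuine 0, so a wrong line number shows up as None in the results instead of silently adding zero to a sum.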

sprintf function's arguments and formats in matlab

I've read and re-read the help about the function sprintf in MATLAB, but I do not understand everything about this function and the formats it talks about.
I was asking myself the logic behind the function formats.
If I run the example
sprintf('%05d%s%02d%s%02d',546,'.',1,'.',3)
I get
00546.01.03
which is logical, since the first number (546) is written as an integer with 5 digits, the second is a character, and so on... But if I now try this
sprintf('%05d%s%02d%s%02d',546,'.',1,'.',3,4)
I get
00546.01.0300004
The first part is the same as above... But the last part of it (00004) has the format '%05d', which corresponds to the first format I entered in the function's arguments. My question is then: does the first format become the 'default' format?
By trying this
sprintf('%05d%s%02d%s%02d',546,'.',1,'.',3,4,56)
and getting this
00546.01.03000048
I think the answer is no... But why ? And what is then the logic behind those arguments?
Thanks for your help !
You are providing sprintf with more arguments than there are conversion specifiers (the % items) in the format string. Therefore, sprintf re-uses the format string from the beginning:
sprintf('%05d%s%02d%s%02d',546,'.',1,'.',3,4,56)
result:
00546.01.03000048
           ^
           the format starts anew here, printing 00004 for %05d with the value 4
The final '8' character is 56 printed with '%s': the ASCII code of the character '8' is 56.