R, strptime, No format="%m-%d-%Y"? - strptime

I have the data below, which I want to parse with strptime as I usually do, but somehow it doesn't work.
> ic.df$timeonly
[1] "‎01-28-2014" "‎01-28-2014" "‎01-28-2014"
strptime(ic.df$timeonly,format="%m-%d-%Y")
[1] NA
(hmm..)
strptime("01-28-2014",format="%m-%d-%Y")
[1] NA
(what..?)
So I swapped the year,month,day location and this works.
strptime("2014-01-28",format="%Y-%m-%d")
[1] "2014-01-28"
Can anyone explain what's going on here, or how I can convert the "m-d-y" version into a date object?

for me it works:
strptime("01-28-2014", "%m-%d-%Y")
Regards
JDaniel

The "01-28-2014" you pasted has a hidden special character before the "01". That causes the NA result, because strptime tries to parse the whole string, hidden character included.
It works when you change it to strptime("2014-01-28",format="%Y-%m-%d") because that string does not contain the hidden character. You can copy and paste the code from your original question into a code editor or RStudio's console to verify that.
As a suggestion, to convert "01-28-2014" into a date object in R, you can use the as.Date function, which handles the processing via strptime under the hood anyway:
as.Date("01-28-2014",format="%m-%d-%Y")
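To see the hidden character and remove it before parsing, here is a minimal sketch in Python rather than R. It assumes the invisible character is U+200E (LEFT-TO-RIGHT MARK), which is what typically sneaks in when copying from a file-properties dialog; the helper name strip_format_chars is made up for this illustration.

```python
import unicodedata
from datetime import datetime

def strip_format_chars(text):
    """Drop invisible Unicode format characters (category Cf), e.g. U+200E."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

# Assumed input: prints like "01-28-2014" but starts with a hidden U+200E.
raw = "\u200e01-28-2014"

# Listing the code points of the first few characters exposes the extra one.
print([hex(ord(ch)) for ch in raw[:3]])

clean = strip_format_chars(raw)
parsed = datetime.strptime(clean, "%m-%d-%Y").date()
print(parsed)
```

Parsing raw directly fails for the same reason strptime returns NA in R: the format expects the month digits first, not an invisible mark.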


how to remove # character from national data type in cobol

I am facing an issue while converting Unicode data into national characters.
When I convert the Unicode data to national using the NATIONAL-OF function, a junk character such as # is appended after the string.
E.g.
Ws-unicode pic X(200)
Ws-national pic N(600)
-- Suppose the value in Ws-unicode is これらの変更は, received from the Java end.
move function national-of ( Ws-unicode ,1208 ) to Ws-national.
-- After conversion, the value looks like これらの変更は #.
I do not want the extra # character added after conversion.
Please help me find a possible solution. I have tried replacing N'#' with a space using the INSPECT statement.
It worked well but failed in some specific scenarios, for example when the user's input itself contains #: in that case the genuine # is also converted to a space.
Below is a snippet of code I used to convert EBCDIC to UTF. Before I was capturing string lengths, I was also getting # symbols:
STRING
    FUNCTION DISPLAY-OF (
        FUNCTION NATIONAL-OF (
            WS-EBCDIC-STRING(1:WS-XML-EBCDIC-LENGTH)
            WS-EBCDIC-CCSID
        )
        WS-UTF8-CCSID
    )
    DELIMITED BY SIZE
    INTO WS-UTF8-STRING
    WITH POINTER WS-XML-UTF8-LENGTH
END-STRING
SUBTRACT 1 FROM WS-XML-UTF8-LENGTH
What this code does is string the UTF-8 representation of the EBCDIC string into another variable. The WITH POINTER clause captures the new length of the string + 1 (+ 1 because the pointer is positioned just past the end of the string).
Using this method, you should know exactly how long the second string is and can use that string with its exact length.
That should remove the unwanted #s.
EDIT:
One thing I forgot to mention, in my case, the # signs were actually EBCDIC low values when viewing the actual hex on the mainframe
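The same idea, sketched in Python rather than COBOL: keep the real data length and slice by it, instead of replacing every # after the fact. This assumes the padding bytes are low-values (0x00), as in my case; the field width mirrors PIC X(200) from the question, and pad_field is a made-up helper for the illustration.

```python
FIELD_WIDTH = 200  # mirrors PIC X(200) from the question

def pad_field(data, width=FIELD_WIDTH):
    """Simulate a fixed-width field padded with low-values (0x00)."""
    return data.ljust(width, b"\x00")

text = "これらの変更は#"            # the trailing '#' is genuine user input
payload = text.encode("utf-8")
field = pad_field(payload)

# Option 1: decode only the bytes that carry data, using the known length.
by_length = field[:len(payload)].decode("utf-8")

# Option 2: strip trailing padding only; never touch characters inside the data.
by_strip = field.rstrip(b"\x00").decode("utf-8")

print(by_length == text, by_strip == text)
```

Either way the genuine # survives, because nothing is replaced inside the data itself.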
Use INSPECT with REVERSE and stop after the first occurrence of #.

Python : Convert ascii string to unicode string

I have an ascii string, e.g.
"\u005c\u005c192.150.4.89\u005ctpa_test_python\u005c5.1\u005c\videoquality\u005crel_5.1.1Mx86\u005cblacklevelsetting\u005c\u5e8f\u5217\u5e8f\u5217.xml"
And I want to convert it into unicode and dump into a file, so that it gets dumped like:
"\\192.150.4.89\tpa\tpa_test_python\5.1\videoquality\logs\blacklevelsetting\序列序列.xml"
Please share your thoughts.
Thanks,
Abhishek
Use the unicode_escape codec. Python 3 example:
s = rb'\u005c\u005c192.150.4.89\u005ctpa_test_python\u005c5.1\u005cvideoquality\u005crel_5.1.1Mx86\u005cblacklevelsetting\u005c\u5e8f\u5217\u5e8f\u5217.xml'
s = s.decode('unicode_escape')
with open('out.txt', 'w', encoding='utf8') as f:
    f.write(s)
Output to file:
\\192.150.4.89\tpa_test_python\5.1\videoquality\rel_5.1.1Mx86\blacklevelsetting\序列序列.xml
Note: Your example string had an extra backslash before videoquality, which turned the v into a \v character (vertical tab). I removed that backslash from the string above.
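If stray backslashes like that one cannot be cleaned up by hand, one possible alternative is to replace only the \uXXXX sequences and leave every other backslash alone. This is a minimal sketch, not a drop-in replacement for the codec: the helper name decode_u_escapes and the shortened sample path (with the placeholder host name "host") are made up here, and surrogate pairs are not handled.

```python
import re

# Matches a backslash, 'u', and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def decode_u_escapes(text):
    """Replace each backslash-u escape with its character; keep other backslashes."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

# Shortened sample that still contains the stray backslash before "videoquality".
raw = r"\u005c\u005chost\u005c\videoquality\u005c\u5e8f\u5217.xml"

decoded = decode_u_escapes(raw)
print(decoded)

# For comparison, the codec turns the stray "\v" into a vertical tab.
via_codec = raw.encode("ascii").decode("unicode_escape")
print("\x0b" in via_codec)
```

The regex version keeps the stray backslash as a literal character, so the output has a doubled backslash at that spot rather than a control character.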

creating url with sprintf creates wrong url

I am trying to create a URL using sprintf. To open various websites, I change part of the URL with sprintf. However, the following code writes the URL several times instead of replacing part of it. Any suggestions? Many thanks!
current_stock = 'AAPL';
current_url = sprintf('http://www.finviz.com/quote.ashx?t=%d&ty=c&ta=0&p=d',current_stock)
web(current_url, '-browser')
%d should be the placeholder for AAPL. The result is:
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=80&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=76&ty=c&ta=0&p=d
I'm not sure why you're using %d for a value that is clearly a string? You should be using %s.
The reason you're seeing what you're seeing is that it appears to be giving you a copy of your format string for each character in the AAPL string.
You can see that the differences lie solely in the ?t=XX bit, with XX being, in sequence, 65, 65, 80 and 76, the ASCII codes for the four letters in your string:
                                   vv
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=80&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=76&ty=c&ta=0&p=d
                                   ^^
Whether that's a feature or a bug in MATLAB (a), I couldn't say for sure, but I suspect it'll fix itself if you just use the correct format specifier.
(a) It's probably a feature since it does similarly intelligent stuff with other mismatches, as per here:
If you apply a string conversion (%s) to integer values, MATLAB converts values that correspond to valid character codes to characters. For example, '%s' converts [65 66 67] to ABC.
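To double-check the numbers, here is a small sketch in Python (not MATLAB) that lists the character codes of the ticker and builds the URL with a string placeholder. Python's % operator is only a stand-in for sprintf here; the variable names are made up for the illustration.

```python
template = "http://www.finviz.com/quote.ashx?t=%s&ty=c&ta=0&p=d"
ticker = "AAPL"

# The codes that showed up in the repeated URLs: one per character.
codes = [ord(ch) for ch in ticker]
print(codes)

# With a string placeholder the ticker is inserted once, as a whole.
url = template % ticker
print(url)
```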
I would take the simpler route and concatenate the strings:
current_stock = 'AAPL';
current_url = ['http://www.finviz.com/quote.ashx?t=',current_stock,'&ty=c&ta=0&p=d'];
web(current_url,'-browser')
That redirected me to a valid webpage.

sprintf function's arguments and formats in matlab

I've read and re-read the help for the sprintf function in MATLAB, but I still do not understand everything about this function and the formats it describes.
I am wondering about the logic behind the function's formats.
If I run the example
sprintf('%05d%s%02d%s%02d',546,'.',1,'.',3)
I get
00546.01.03
which is logical, since the first number (546) is written as an integer with 5 digits, the second is a character, and so on... But if I now try this
sprintf('%05d%s%02d%s%02d',546,'.',1,'.',3,4)
I get
00546.01.0300004
The first part is the same as above, but the last part (00004) has the format '%05d', which is the first format I entered in the function's arguments. My question, then: does the first format become the 'default' format?
By trying this
sprintf('%05d%s%02d%s%02d',546,'.',1,'.',3,4,56)
and getting this
00546.01.03000048
I think the answer is no... But why ? And what is then the logic behind those arguments?
Thanks for your help !
You are providing sprintf with more arguments than there are conversion specifiers (%d, %s, ...) in the format string. Therefore, sprintf re-uses the format string from the beginning:
sprintf('%05d%s%02d%s%02d',546,'.',1,'.',3,4,56)
result:
00546.01.03000048
           ^
           starting the format anew: printing 00004 for %05d with 4
The final '8' is 56 printed with '%s': the ASCII code of the character '8' is 56.
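The recycling rule can be modelled in a few lines. This is a toy sketch in Python, not MATLAB's actual implementation: the function name recycling_sprintf is made up, it only understands simple %d and %s conversions with an optional zero-padded width, and it assumes string arguments are single characters.

```python
import re

# Matches the simple conversions used in the question, e.g. %05d, %02d, %s.
SPEC = re.compile(r"(%0?\d*[ds])")

def recycling_sprintf(fmt, *values):
    """Toy model: reuse the format string until all values are consumed."""
    pieces = [p for p in SPEC.split(fmt) if p]
    if not any(SPEC.fullmatch(p) for p in pieces):
        return fmt                       # no conversions, nothing to recycle
    pending = list(values)
    out = []
    while True:
        for piece in pieces:
            if not SPEC.fullmatch(piece):
                out.append(piece)        # literal text is copied as-is
                continue
            if not pending:
                return "".join(out)      # stop at the first conversion with no data
            value = pending.pop(0)
            if piece.endswith("d"):
                # A character given to %d is printed as its code.
                number = ord(value) if isinstance(value, str) else value
                out.append(piece % number)
            else:
                # A number given to %s is printed as the character with that code.
                text = value if isinstance(value, str) else chr(value)
                out.append(piece % text)
        if not pending:
            return "".join(out)

print(recycling_sprintf('%05d%s%02d%s%02d', 546, '.', 1, '.', 3))
print(recycling_sprintf('%05d%s%02d%s%02d', 546, '.', 1, '.', 3, 4))
print(recycling_sprintf('%05d%s%02d%s%02d', 546, '.', 1, '.', 3, 4, 56))
```

The three calls reproduce the three outputs from the question: the extra 4 restarts the format at %05d, and the extra 56 lands on %s.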

Why does the filename requested from the server start with Unicode characters?

I use FTP to list the file attributes on the server. I request the file names and put them into an array. I print the array directly like this:
NSLog(@"%@", array);
What I got is like this:
\U6587\U4ef6\U540d\Uff1afilename.txt
\U6587\U4ef6\U540d\Uff1afilename1.txt
......
When I tried to print the Unicode string "\U6587\U4ef6\U540d\Uff1a" to see what it is, I got the compile error "incomplete universal character name".
However, if I print a single name instead of the whole array, I get the name correctly, without the Unicode escapes. But I need to do something with the names in the array. I want to know why the Unicode is there, and whether it is proper to just remove it and then work with the real file name.
In C99, and therefore presumably Objective C too, there are two Unicode escapes:
\uXXXX
\UXXXXXXXX
The lower-case u is followed by 4 hex digits; the upper-case U is followed by 8 hex digits (of which the first two should be zeroes for valid Unicode, and the third should be 0 or 1, since the maximum Unicode code point is U+10FFFF).
I believe that if you replace the upper-case U's with lower-case u's, you should get the code to compile.
On my Mac OS 10.7.4 system, compiling with GCC 4.7.0 (home built), I compiled this code:
#include <stdio.h>
int main(void)
{
char array[] = "\u6587\u4ef6\u540d\uff1a";
puts(array);
return 0;
}
and got this output:
文件名：
I can't answer why the characters are there, but the colon-like character at the end suggests that the site might be preceding the actual file name with a tag of some sort (analogous to 'file:').
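Python string literals follow the same convention (\u takes 4 hex digits, \U takes 8), so the escapes can be checked quickly there. The sketch below also splits the tag from the name at the full-width colon (U+FF1A); that part assumes every entry in the listing carries the same tag, which I have not verified, and the variable names are made up for the illustration.

```python
# The same four characters written with 4-digit and 8-digit escapes.
short_form = "\u6587\u4ef6\u540d\uff1a"
long_form = "\U00006587\U00004ef6\U0000540d\U0000ff1a"
print(short_form == long_form)

# One entry as it appears in the array from the question.
listing_entry = short_form + "filename.txt"

# Split at the full-width colon: the tag goes left, the real name goes right.
prefix, _, name = listing_entry.partition("\uff1a")
print(prefix)
print(name)
```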
Adding to what Jonathan said, you might have to use stringWithUTF8String:, but I agree that the error is with the capital U rather than u.