I have the following print statement:
print("{0:0.2f}% in traing set".format((len(X_train)/(df.index))*100))
where
X_train is 70% of the sample data (the training data)
and
df.index is RangeIndex(start=0, stop=768, step=1).
When I run the print statement, I get the following error:
non-empty format string passed to object.__format__
The output of the print statements should be:
69.92 in training set
30.08 in test set
I am not able to correct this behavior.
Any help will be appreciated.
Bharat.
I get this error if the object that's supposed to be formatted is a numpy array. RangeIndex() must be producing something similar.
In [82]: "{0:0.2f}% in traing set".format((20/(np.arange(3)))*100)
...
TypeError: non-empty format string passed to object.__format__
In [83]:
Same error if I just give it a list: "{0:0.2f}% in traing set".format([1,2,3])
Formatting works fine if the argument is just a number:
In [83]: "{0:0.2f}% in traing set".format((20/3)*100)
Out[83]: '666.67% in traing set'
The only format specifier that works with an object like a list or array is the plain str/repr one:
In [102]: print("{} in traing set".format((len(X_train)/(np.arange(1,3)))*100))
[ 5000. 2500.] in traing set
===============
It's hard to read code in a comment; better to add it as an edit to the original question (clearly marked as such). Here's my guess as to what the formatting is:
value1 = len(X_train)
value2 = len(X_test)
value3 = len(df.index)
value4 = (value1/value3)*100
value5 = (value2/value3)*100
print("{0:0.2f}% in training set".format(value4))
print("{0:0.2f}% in test set".format(value5))
Yes, it is a good idea to separate the calculation of the values from the print formatting.
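As a minimal sketch of the same idea without pandas, the snippet below uses plain integers as stand-ins for len(X_train), len(X_test) and len(df.index). The value 537 is an assumption chosen so that the result matches the 69.92 quoted in the question; the names n_total, n_train and n_test are hypothetical. It also shows that a container passed to a numeric format spec raises TypeError, while a single number formats fine.

```python
# Stand-in sizes (assumed): 768 rows in total, 537 of them in the training split.
n_total = 768
n_train = 537
n_test = n_total - n_train

# A single float accepts the "0.2f" format spec.
train_line = "{0:0.2f}% in training set".format(n_train / n_total * 100)
test_line = "{0:0.2f}% in test set".format(n_test / n_total * 100)
print(train_line)
print(test_line)

# A container does not: a non-empty format spec raises TypeError.
try:
    "{0:0.2f}".format([1, 2, 3])
    raised = False
except TypeError:
    raised = True
print(raised)
```

The point is that the divisor has to be a single number such as len(df.index), not the index object itself.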
I have a PCE-VDL measurement device, which gives me measurements in the CSV format shown below. I need to import them into Octave for further investigation.
In particular, I need to import the last 3 columns, which hold the xyz acceleration data.
The file is in CSV format with a semicolon (";") as the delimiter.
I have tried:
A_1 = importdata ("file.csv", ";", 3);
but received
error: missing_idx(10): out of bound 9
The CSV file looks like this:
#PCE-VDL X - TableView series
#2020.16.11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020.28.10;16:16:32:0000;00:000;;;;0,0195;-0,0547;1,0039;
2020.28.10;16:16:32:0052;00:005;;;;0,0898;-0,0273;0,8789;
2020.28.10;16:16:32:0104;00:010;;;;0,0977;-0,0313;0,9336;
2020.28.10;16:16:32:0157;00:015;;;;0,1016;-0,0273;0,9297;
The numbers in the last 3 columns also use a decimal comma rather than a decimal point, so some conversion is probably needed as well.
Thank you very much for any help.
Regards
EDIT: 18.11.2020
Thanks for the help. I have now tried the following:
A_1_str = fileread ("file.csv");
A_1_str_m = strrep (A_1_str, ".", "-");
A_1_str_m = strrep (A_1_str_m, ",", ".");
save "A_1_str_m.csv" A_1_str_m;
A_1 = importdata ("A_1_str_m.csv", ";", 8);
and I still receive an error: file_content(140): out of bound 139
There is probably some problem with the time format in the first columns, which I do not want to read. I just need the last three columns.
After my conversion, the file looks like this:
# Created by Octave 5.1.0, Wed Nov 18 21:40:52 2020 CET <zdenek#ASUS-F5V>
# name: A_1_str_m
# type: sq_string
# elements: 1
# length: 7849
#PCE-VDL X - TableView series
#2020-16-11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020-28-10;16:16:32:0000;00:000;;;;0.0195;-0.0547;1.0039;
2020-28-10;16:16:32:0052;00:005;;;;0.0898;-0.0273;0.8789;
2020-28-10;16:16:32:0104;00:010;;;;0.0977;-0.0313;0.9336;
Thanks for support!
You can first read the data with fileread, which stores the data as a string. Then you can manipulate the string like this:
new_string = strrep(string, ",", ".");
strrep replaces all occurrences of a pattern within a string. Afterwards you save this data as a separate file or you overwrite the existing file with the manipulated data. When this is done you proceed as you have tried before.
EDIT: 19.11.2020
To avoid the additional heading lines in the new file, you can save it like this:
fid = fopen("A_1_str_m.csv", "w");
fputs(fid, A_1_str_m);
fclose(fid);
fputs will just write the string to the file.
Then you can read the new file with dlmread.
A1_buf = dlmread("A_1_str_m.csv", ";");
A1_buf = real(A1_buf); # get the real value of the complex number
A1_buf(1:3, :) = []; # remove the header lines
A1 = A1_buf(:, end-3:end-1); # get only the 3 columns you're looking for
This will give you the three columns you're looking for, but the date and time data will be ignored.
EDIT 20.11.2020
Replaced abs with real, so the sign of the value will be kept.
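For comparison, here is a rough Python sketch of the same steps (skip the '#' header lines, split on ';', convert the decimal comma, keep the last three data columns). It is not Octave code and not part of the answer above; the helper name read_acceleration is hypothetical, and the sample text is copied from the question rather than read from a file.

```python
import csv
import io

# Sample copied from the question; a real script would open the file instead.
raw = """#PCE-VDL X - TableView series
#2020.16.11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020.28.10;16:16:32:0000;00:000;;;;0,0195;-0,0547;1,0039;
2020.28.10;16:16:32:0052;00:005;;;;0,0898;-0,0273;0,8789;
"""


def read_acceleration(text):
    """Return a list of [aX, aY, aZ] rows as floats."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=";"):
        # Skip blank lines and the '#' header lines.
        if not fields or fields[0].startswith("#"):
            continue
        # Fields 6..8 hold aX, aY, aZ; the trailing ';' adds an empty 10th field.
        rows.append([float(v.replace(",", ".")) for v in fields[6:9]])
    return rows


accel = read_acceleration(raw)
print(accel[0])
```

Converting the comma only in the three numeric fields leaves the dots in the date column untouched, so no second replacement is needed.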
Use csv2cell from the io package.
I am running python 3.5, I have imported pandas. My csv file (payinfo.csv) looks like:
"01 DEC",1234.45,2344,11,1212.66
"01 NOV", 9898.33, 2343,12,1009.33
When I run the following:
dateparse = lambda x: pd.datetime.strptime(x,"%d %b")
pay_data = pd.read_csv('payinfo.csv', parse_dates = ['Date'], date_parse
I always get
"ValueError: time data '“01 DEC”' does not match format '%d %b'
I am a new programmer to python, and would appreciate any help.
I think it was just the double quotes around the string that caused that error. Try stripping away any hardcoded (not 'python generated') single or double quote marks with .strip('"')
Example:
a = '"01 DEC"'
# Gives an error
#a = pd.datetime.strptime(a,"%d %b")
# string without unnecessary quote marks
a = pd.datetime.strptime(a.strip('"'),"%d %b")
print(a)
Output:
1900-12-01 00:00:00
You haven't included the headers in the question. But this works:
import io
import pandas as pd
a = io.StringIO(u""""01 DEC",1234.45,2344,11,1212.66
"01 NOV", 9898.33, 2343,12,1009.33""")
dateparse = lambda x: pd.datetime.strptime(x,"%d %b")
df = pd.read_csv(a,header=None, parse_dates=[0], date_parser=dateparse)
print(df)
You can prepend a custom year to x before converting it to datetime:
.strptime(year + x,"%Y%d %b")
Output:
0 1 2 3 4
0 1900-12-01 1234.45 2344 11 1212.66
1 1900-11-01 9898.33 2343 12 1009.33
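The two suggestions above (strip the quote marks, supply a year) can be combined in one small parser. This is a sketch using only the standard library datetime module rather than pandas; the function name parse_day_month and its year parameter are hypothetical, and %b assumes an English locale for month abbreviations such as DEC.

```python
from datetime import datetime


def parse_day_month(text, year=1900):
    """Parse strings like '"01 DEC"' into a datetime in the given year."""
    # Remove surrounding whitespace first, then any literal double quotes.
    cleaned = text.strip().strip('"')
    # Put the year in front so strptime does not fall back to 1900.
    return datetime.strptime("{} {}".format(year, cleaned), "%Y %d %b")


print(parse_day_month('"01 DEC"'))
print(parse_day_month('"01 NOV"', year=2016))
```

A function like this could be passed wherever a per-value date parser is expected, instead of the lambda.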
Thank you both for your input. From your answers I modified the csv file to remove the quotes around the date entry, then things worked fine! I am puzzled because I have used the read_csv method before on similar data that looked like this:
"12/31/2016","The UPS Store","THE UPS STORE 031","10.74","debit","Business Services","Interest Checking","",""
"12/31/2016","Hospice of The East Bay","HOSPICE OF THE EAST","14.00","debit","Clara","Interest Checking","",""
and had no problems – in fact I didn't need to parse the data at all and the reader was able to correctly identify the date. Huh! I guess the real issue was that the date was stored in an unconventional format. In any case, I have the answer and thank you both for your answers.
I also tried the following:
datestr('19-01-2004','dd-mm-yyyy')
ans =
26-06-0024
I am new to MATLAB, so I am not sure what else to check.
In the function datestr(), the 2nd parameter specifies what the output should look like. It doesn't say anything about the input.
Essentially, you try to perform 2 steps: parse a string and then format the parsed date again.
So you can do
n = datenum('19-01-2004','dd-mm-yyyy')
datestr(n, 'yyyy-mm-dd')
and you'll get an n of 731965 and a final output of 2004-01-19.
You can as well do
v = datevec('19-01-2004','dd-mm-yyyy')
datestr(v, 'yyyy-mm-dd')
and your v becomes [2004 1 19 0 0 0].
So remember: step 1 - parsing of input with the appropriate format string, step 2 - formatting of output with the wanted format string.
If you want to give the date in a "clean" and readable format, you could just do
v = [2004 1 19 0 0 0]
datestr(v, 'yyyy-mm-dd')
datestr(v, 'dd.mm.yyyy')
datestr(v, 'mm/dd/yyyy')
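The same two-step idea (parse with an input format, then format with an output format) looks like this in Python, shown only as an analogy to the MATLAB calls above. strptime plays the role of datenum/datevec and strftime the role of datestr. The offset of 366 between Python's ordinal and MATLAB's serial date number is stated here as an aside, to tie the result back to the 731965 mentioned earlier.

```python
from datetime import datetime

# Step 1: parse the input with a format string that describes the input.
parsed = datetime.strptime("19-01-2004", "%d-%m-%Y")

# Step 2: format the parsed date with whatever output format is wanted.
iso = parsed.strftime("%Y-%m-%d")
dotted = parsed.strftime("%d.%m.%Y")
us = parsed.strftime("%m/%d/%Y")
print(iso, dotted, us)

# Python ordinals start at 0001-01-01 = 1; adding 366 gives the serial
# number that datenum reports for the same day.
serial = parsed.toordinal() + 366
print(serial)
```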
When using datestr to convert a date string from one form to another, the format of the input date string is limited to those listed here. The format of your input '19-01-2004' is 'dd-mm-yyyy', which is not one of the supported formats.
If we change the input string to '01/19/2004', which is the supported format 'mm/dd/yyyy', we get the correct output:
>> datestr('01/19/2004','dd-mm-yyyy')
ans =
19-01-2004
To circumvent the limited number of supported input formats, the documentation recommends using datenum first. So you can map your original input onto itself like:
>> datestr(datenum('19-01-2004','dd-mm-yyyy'),'dd-mm-yyyy')
ans =
19-01-2004
Why MATLAB returns the date it does has to do with how it handles the unknown format.
I suspect that whatever method it uses to finally decide upon a format results in a really small date number, hence the year 24 in the output.
I am writing a script that takes two strings of the format 'HH:MM' as inputs. These strings are times in hours (HH) and minutes (MM). I would like to display an error message if the user inputs the wrong format for a time, such as 'HH:MM:SS' if they think the script can interpret seconds as well. I have it set up to accept negative times, so an input like '-HH:MM' will be interpreted correctly. An input like 'HHH:MMM' with variable hour and minute sizes is also OK, actually any input of the form %s:%s should be accepted since errors like '5:30 AM' are dealt with later.
What I need is to test that the inputs are of the form "string colon string" before reading, is this possible? To make the problem clearer, here is code explaining how I read the inputs time1 and time2:
[hour1, min1] = strread(time1, '%s%s', 'delimiter', ':');
[hour2, min2] = strread(time2, '%s%s', 'delimiter', ':');
If time1 and time2 are formatted wrong, strread throws an unhelpful error. I want to display my own error first to explain what the problem was. How can I check the formats of time1 and time2 before actually reading them?
Ideas:
formatSpec = '%s : %s';
input = textscan(time1,formatSpec);
%Compare input to formatSpec somehow to see if they match?
if (no_match)
error('time1 must be formatted as HH:MM');
end
You can try something like this:
time1 = '10:21';
if isempty(regexp(time1,'^\d{2}:\d{2}'))
disp('the format is wrong') %won't display because the format is ok
end
And to check another format:
time1 = '100:21';
if isempty(regexp(time1,'^\d{2}:\d{2}'))
disp('the format is wrong') %will display because the format is wrong
end
EDIT
If you want to accept 'HHH:MMM' and other cases use:
regexp(time1,'^\d+:\d+')
And for the negative case ('-HHH:MMM' or other negative cases) use:
regexp(time1,'^-\d+:\d+')
Second edit
And if you want to test it in only one line:
regexp(time1,'^-?\d+:\d+$') % however, this one doesn't support 'HH:MM AM'
regexp(time1,'^-?\d+:\d+.*$') % now supports 'HH:MM AM'
I tested it and it returns 1 for every case you mentioned.
It looks like you accept any numbers as long as there is only one : sign. In other words, perhaps you wanted to detect the more-than-one-colon case? You could count the number of : signs and generate errors for those cases first, before processing the string.
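Both suggestions (an anchored pattern with an optional minus sign, and counting the colons) can be sketched in Python as follows. This is an illustration of the checks, not MATLAB code; the names TIME_PATTERN, is_hh_mm and has_single_colon are hypothetical.

```python
import re

# Optional minus sign, one or more digits, a colon, one or more digits.
TIME_PATTERN = re.compile(r"-?\d+:\d+")


def is_hh_mm(text):
    """Strict check: the whole string must look like [-]digits:digits."""
    return TIME_PATTERN.fullmatch(text) is not None


def has_single_colon(text):
    """Loose check: exactly one ':' with something on both sides."""
    left, sep, right = text.partition(":")
    return bool(sep) and ":" not in right and left != "" and right != ""


for candidate in ["10:21", "-5:30", "100:21", "10:21:30", "5:30 AM"]:
    print(candidate, is_hh_mm(candidate), has_single_colon(candidate))
```

The loose check accepts '5:30 AM' so that a later stage can report that problem, while both checks reject 'HH:MM:SS'.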
How do I break a long hexadecimal value in CoffeeScript so that it spans multiple lines?
authKey = 0xe6b86ae8bdf696009c90e0e650a92c63d52a4b3232cca36e0ff2f5911e93bd0067df904dc21ba87d29c32bf17dc88da3cc20ba65c6c63f21eaab5bdb29036b83
to something like
authKey = 0xe6b86ae8bdf696009c90e0e650a92c63d52a4b323\
2cca36e0ff2f5911e93bd0067df904dc21ba87d29c3\
2bf17dc88da3cc20ba65c6c63f21eaab5bdb29036b83
Using \ results in an Unexpected 'NUMBER' error, and
using a line break results in an Unexpected 'INDENT' error.
There's actually no point in doing this in CoffeeScript because numbers are stored as 64-bit IEEE 754 values and you have too many bits of precision for the value to be stored as a number.
If you write
authKey = 0xe6b86ae8bdf696009c90e0e650a92c63d52a4b3232cca36e0ff2f5911e93bd0067df904dc21ba87d29c32bf17dc88da3cc20ba65c6c63f21eaab5bdb29036b83
console.log(authKey)
then the value logged is
1.2083806867379407e+154
You want to store your authKey as a string or byte array, both of which are trivial to write across multiple lines.
Like others have said, this doesn't really make a whole lot of sense to be stored in a number, as opposed to a string; however, I decided to throw something together to allow it anyway:
stringToNumber = ( str ) -> parseInt( str.replace( /\n/g, '' ) )
authKey = stringToNumber """
0xe6b86ae8bdf696009c90e0e650a92c63d52a4b323
2cca36e0ff2f5911e93bd0067df904dc21ba87d29c3
2bf17dc88da3cc20ba65c6c63f21eaab5bdb29036b83
"""
Like Ray said, this will just result in:
1.2083806867379407e+154
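To illustrate the advice to keep the key as a string or byte array, here is a sketch in Python rather than CoffeeScript. Python integers are arbitrary-precision, unlike CoffeeScript numbers, so the precision loss is reproduced here by converting to a float, which is a 64-bit IEEE 754 double. The variable names are hypothetical.

```python
# Adjacent string literals are concatenated, so the key can span several lines.
auth_key_hex = (
    "e6b86ae8bdf696009c90e0e650a92c63d52a4b323"
    "2cca36e0ff2f5911e93bd0067df904dc21ba87d29c3"
    "2bf17dc88da3cc20ba65c6c63f21eaab5bdb29036b83"
)

# As a byte array: two hex digits per byte.
auth_key_bytes = bytes.fromhex(auth_key_hex)
print(len(auth_key_hex), len(auth_key_bytes))

# As a double: only about 53 bits of the 512-bit value survive.
exact = int(auth_key_hex, 16)
as_double = float(exact)
lossy = int(as_double) != exact
print(lossy)
```

The byte form round-trips exactly, while the double cannot be turned back into the original key.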