I am quite new to data science and have just started a project.
I have to convert Excel files into a data frame I can use for this project. I would rather not use VBA to convert the Excel files; I would love to do this in R.
I have five different types of input files, and four of them work great. One of them has negative numbers.
When I read it with readxl, numbers like -6.75 are automatically changed to -6.7548000000000004.
I've tried changing col_types or digits, but maybe not in the right way. I would like to bring them back to 2 decimals.
Can anyone help me with this?
This is probably right: R is most likely showing the value actually stored in the file, and Excel is just displaying it rounded to two decimal places. If you want to round the numbers after reading them into R, you can use round():
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/Round
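The difference between rounding the stored number and only rounding what is displayed can be sketched in a few lines. This is written in Python purely as an illustration (R's round() and format() play the same roles); the sample values are hypothetical apart from -6.7548, which is the number from the question.

```python
# Values as they might be stored in the spreadsheet (hypothetical sample data,
# except -6.7548, which comes from the question).
stored = [-6.7548, 12.3049, 0.1]

# Option 1: round the numbers themselves to two decimals.
rounded = [round(x, 2) for x in stored]

# Option 2: keep full precision and only format the values for display.
displayed = [f"{x:.2f}" for x in stored]

print(rounded)
print(displayed)
```

Option 2 is usually safer when the values feed later calculations, because nothing is lost until the final display step.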
Related
I am very new to MATLAB. I am sorry if my question is basic. I am using the "printmat" function to show some matrices in the command console. For example, printmat(A) and printmat(B), where A = 2.79 and B = 0.45e-7 are scalars (for the sake of simplicity).
How do I increase the precision arbitrarily, say to seven decimals? For example, so that my output looks like A = 2.7943234 and B = 0.00000004563432.
How do I add a currency (say dollar) figure to the output of printmat?
How do I add a percentage figure (%) to the output of printmat?
Note: The reason I use printmat is that I can name my rows and columns. If you know a better function that can do all of the above, I would be glad to know.
Regards
Mariam
From what I understand, you would like to display the numbers at full precision. I am also a newbie, but if I may contribute: you can convert the numeric data to a string (for display purposes) with the sprintf function.
I am using the variable A=2.7943234 as an example. By default this value is not displayed at full precision; it is shown as 2.7943. To show all the decimal places, you can first convert it to a string with
a = sprintf('%0.8f',A);
This sets a to the string '2.79432340'. The %0.8f means you want 8 decimal places. For this example, %0.7f is of course sufficient.
Another example: for A=0.00000004563432, use %0.14f.
A=0.00000004563432;
a=sprintf('%0.14f $ or %%',A);
The output should be '0.00000004563432 $ or %'.
You can read more at https://www.mathworks.com/help/matlab/ref/sprintf.html
Try this first. If it does not get you where you want, I would appreciate some input. Thanks.
The printmat function is obsolete now. I think table objects are its intended successor (along with functions such as array2table to convert a matrix to a table of data). Tables allow you to add row and column names and format the columns in different ways. I don't think there's a way to add $ or % to each number, but you can specify the units of each column.
In general, you can also set the display precision using format. Something like this may be what you want:
format long
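The combination asked for (named rows and columns, a chosen number of decimals, and a $ or % attached to each number) comes down to formatting each cell as a string before printing. Here is a minimal sketch of that idea in Python, used only as an illustration; format_table and its arguments are hypothetical names, not part of MATLAB or printmat.

```python
def format_table(values, row_names, col_names, col_formats):
    """Return a labeled text table; col_formats holds one format string per column."""
    # Turn every number into a string using its column's format.
    cells = [[fmt.format(v) for v, fmt in zip(row, col_formats)] for row in values]
    label_w = max(len(name) for name in row_names)
    # Each column is as wide as its widest entry, header included.
    widths = [
        max([len(col_names[j])] + [len(row[j]) for row in cells])
        for j in range(len(col_names))
    ]
    header = " " * label_w + "  " + "  ".join(
        name.rjust(w) for name, w in zip(col_names, widths)
    )
    lines = [header]
    for name, row in zip(row_names, cells):
        body = "  ".join(cell.rjust(w) for cell, w in zip(row, widths))
        lines.append(name.ljust(label_w) + "  " + body)
    return "\n".join(lines)


table = format_table(
    values=[[2.7943234, 600, 0.45], [0.00000004563432, 12.5, 0.07]],
    row_names=["first", "second"],
    col_names=["value", "price", "share"],
    col_formats=["{:.14f}", "${:,.2f}", "{:.1%}"],
)
print(table)
```

The same approach works in MATLAB by building each cell with sprintf and then displaying the resulting strings.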
I was wondering if somebody could help me with this one. It's probably so simple, but it's got me at a loss.
I have an Excel spreadsheet looking like this
Basically it's a simple IF statement: if A1 = 10 then B1 displays 600; if not, it displays 0. It works, but I cannot format the number in B1. I right-click, choose Format Cells, Number, and select Currency with 2 decimal places. No change. Colour, alignment, border and fill all work; it's just the number type that will not. B1 feeds other cells that work out formulas, so I need it in a currency format.
Does anyone know the answer to this.
Thanks in advance
David
Use numbers instead of strings. Excel interprets values written in quotes, such as "600", as text and not as numbers, so number formats have no effect on them. Change your strings to numbers like this:
=IF(A1=10,600,0)
This should work, and the cell should right-align again, as numbers do by default.
I'm reading in a csv file that is about 80MB - data_O3. It's about 250,000 x 5 in size. I created E, which is a little bit larger because it has all the days (data_O3 is missing some days). I want to compare the two so that if the date (saved in variable d3) and siteID (d4) are the same, the data point (column 5) is placed in E.
for j = 1:size(data_O3,1)
E(strcmp(d3,data_O3{j,3})&d4 == data_O3{j,4},5) = data_O3(j,5);
end
This script works fine, but for some reason, running it takes longer than expected. I've run the same code for other data that were only slightly smaller with no problem. Is this an issue with the strcmp code or something else?
The script and files used can be found here: https://www.dropbox.com/sh/7bzq3m1ixfeuhu6/i4oOvxHPkn
There are certainly a number of ways to speed this up significantly.
First of all, read all numeric data in as numbers. MATLAB is not optimized for working with strings, and even cells should generally be avoided as much as possible. If you want to keep everything as strings, use another language (Python or Perl).
Once you have the state, county and site read in as numbers, build a numeric siteID instead of a string. One approach is the formula:
siteID = siteNum + 1e4*countyCode + 1e7*stateCode
That would generate unique siteIDs for all sites, as long as every site number is below 1e4 and every county code is below 1e3.
Use datenum to convert the date field into a number.
You are now in a position where the data_O3 defined on line 79 can be a purely numeric array (no cells!), as can your E matrix. That alone will make the process many times faster.
You also might want to initialize E with something other than NaN. Maybe give it values of -1.
There may be more optimizations you can do in the comparison, but do the above first and I expect you will see a huge improvement.
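The steps above can be sketched as follows. This is an illustration in Python, not the original MATLAB script: site_id uses the formula from this answer, date.toordinal() stands in for datenum, and the dictionary lookup that replaces the strcmp loop is an addition of this sketch rather than something the answer prescribes. The function names and sample rows are hypothetical.

```python
from datetime import date


def site_id(state_code, county_code, site_num):
    # Formula from the answer; assumes site_num < 1e4 and county_code < 1e3.
    return site_num + 10_000 * county_code + 10_000_000 * state_code


def fill_values(full_rows, observed_rows, missing=-1.0):
    """full_rows: (date, site_id) for every day; observed_rows: (date, site_id, value)."""
    # Map each numeric (day, site) key to its row position once, up front.
    index = {(d.toordinal(), s): i for i, (d, s) in enumerate(full_rows)}
    values = [missing] * len(full_rows)
    # One pass over the observations, with a constant-time lookup per row.
    for d, s, v in observed_rows:
        i = index.get((d.toordinal(), s))
        if i is not None:
            values[i] = v
    return values


sid = site_id(6, 37, 2)
full = [(date(2010, 1, day), sid) for day in (1, 2, 3)]
observed = [(date(2010, 1, 1), sid, 0.031), (date(2010, 1, 3), sid, 0.027)]
print(fill_values(full, observed))
```

Because every row is matched through a single lookup instead of a comparison against the whole date column, the work grows roughly linearly with the number of rows.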
I need to generate a random number between .0000001 and 1. I have been using rand(1), but this only gives me 4 decimal places. Is there any other way to do this generation?
Thanks!
From the Octave docs:
By default, Octave displays 5 significant digits in a human readable form (option ‘short’ paired with ‘loose’ format for matrices).
So it's probably an issue with the way you're printing the value rather than the value itself.
That same page shows the other output formats in addition to short; the one you may want to look into is long, which gives 15 significant digits.
There is also output_precision, which can be set like this:
old_val = output_precision (7)
disp (whatever)
old_val = output_precision (old_val)
Set the output_precision to 7 and it should be ok :)
Setting the output precision won't help on its own, though, because the number can in theory still be less than .0000001; you would only be displaying its first 7 digits. The simplest way is:
req=0;
while (req<.0000001)
req=rand(1);
end
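In theory this loop can repeat, but a draw below .0000001 is so unlikely (about one in ten million per draw) that it almost always finishes on the first pass, and it always produces a number in the right range. To display all the decimals you can also use the following command: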
format long
This will show you 15 decimal places. To switch back, use:
format short
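For comparison, here is a small Python sketch of two ways to draw the number. The first mirrors the retry loop above; the second, scaling a uniform draw into the interval, is an alternative added here and not something the answers proposed. The names LOWER, draw_by_rejection and draw_by_scaling are hypothetical.

```python
import random

LOWER = 1e-7  # lower bound from the question


def draw_by_rejection(rng):
    # Same idea as the while loop above: redraw until the value is large enough.
    x = 0.0
    while x < LOWER:
        x = rng.random()
    return x


def draw_by_scaling(rng):
    # Map a uniform draw from [0, 1) onto [LOWER, 1) so that no retry is needed.
    return LOWER + (1.0 - LOWER) * rng.random()


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = draw_by_scaling(rng)
# Print with 15 decimals; the stored value has this precision regardless of display.
print(f"{sample:.15f}")
```

Either way, the number of decimals shown is purely a display setting, as with format long.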
I'm having some difficulties processing some numbers. The results I get look something like:
0.000093145+1.6437e-011i
0.00009235+4.5068e-009i
I've already tried format long and, as an alternative, converting to a string and then using str2num, also with no good results. I haven't been able to convert them the way I want (e.g. to a number with 9 decimals). If nobody is able to help me, I would at least appreciate it if someone could tell me what the i means.
You are talking about the imaginary unit i. If you only need real numbers, you can discard the imaginary part (it is very small here). Try:
real(0.000093145+1.6437e-011i)
After taking real() you can also control the decimal place formatting by sprintf:
sprintf('%0.2f', pi)
This results in:
'3.14'
Use a 9 instead of the 2 for 9 decimal places.
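Put together, the two steps look like this. The sketch is in Python for illustration only (MATLAB's real() and sprintf do the same jobs); the helper real_if_negligible and its tolerance of 1e-8 are assumptions of this sketch, added so that a genuinely complex result is not silently truncated.

```python
def real_if_negligible(value, tol=1e-8):
    # Drop the imaginary part only when it is small enough to be numerical noise.
    if abs(value.imag) > tol:
        raise ValueError("imaginary part is not negligible")
    return value.real


# The two results from the question (Python writes the imaginary unit as j).
results = [0.000093145 + 1.6437e-11j, 0.00009235 + 4.5068e-9j]

# Format each real part with 9 decimal places.
formatted = [f"{real_if_negligible(z):.9f}" for z in results]
print(formatted)
```

If the imaginary part turns out to be larger than expected, that usually points to a step earlier in the calculation that is worth checking.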