How to save financial data to a file in mathematica? - import

I want to save the financial data I get from the following code:
data = FinancialData["GE","OHLCV", "Jan. 1, 2000"];
The format is:
{{yy, mm, dd}, {O, H, L, C, V}}
I want 2 columns, one for the {date}, other for the {O, H, L, C, V} but inside the second column I want to treat each individual value (like a list?)
I have tried:
Export[dir <> filename <> ".csv", data];
data1 = Import[dir <> filename <> ".csv", "Table"];
And also with other formats, "List", etc.
The problem is that I have a running program to backtest the data and it works fine when I get it from FinancialData but I just can't find a way to export and import like if I did the FinancialData...
For example I can't do thinks like:
C = Table[data1[[i]][[2]][[4]], {i, 1, n}];
(Of course everything works if I put data, instead of data1)
Any ideas?

Why do you export to csv when you want to keep the Mathematica list structure intact? Please try the following
data = FinancialData["GE", "OHLCV", "Jan. 1, 2000"];
Export["tmp/test.m", data]
data2 = Import["tmp/test.m"];
and you will see that
data2 == data
gives True

Its not clear what you are trying to accomplish, presumably some external program needs to read the data??
you might do
Export["file.csv", Flatten[data,{2}]]
then partition it when you read it back:
{#[[;;3]],#[[4;;]]}&/#Import["file.csv"]

Related

Import csv as MATLAB struct

I have a large csv log file. Here is a simplified sample:
ts,a.b.c,a.b.d,a.b.e,b.c,b.d,c.d.e,c.d.f,c.g
2021-03-29 06:38:39,1.0000,2,3,28.20,1,2,3,4
2021-03-29 06:38:40,1.0000,2,3,28.20,1,2,3,0.000000
I am using MATLAB's Import Data tool to import it, but, unfortunately, it removes all dots from the header and imports all variables as, e.g.: abc, abd, abe etc.
What is an efficient way to import a csv like the one above as structs?
It am looking for a way to have data imported as structs: a, b and c for this particular log file, so that I can easily access variables as a.b.c or c.d.f.
Here is what I came up with, by simply using readtable.
function res = log_import(logfile)
log_table = readtable(logfile);
res = [];
for i = 1:width(log_table)
str_path = log_table.Properties.VariableDescriptions{i};
fields = strsplit(str_path,'.');
res = setfield(res, fields{1:end}, log_table{:, i});
end
end

Matlab: read multiple files

my matlab script read several wav files contained in a folder.
Each read signal is saved in cell "mat" and each signal is saved in array. For example,
I have 3 wav files, I read these files and these signals are saved in arrays "a,b and c".
I want apply another function that has as input each signal (a, b and c) and the name of corresponding
file.
dirMask = '\myfolder\*.wav';
fileRoot = fileparts(dirMask);
Files=dir(dirMask);
N = natsortfiles({Files.name});
C = cell(size(N));
D = cell(size(N));
for k = 1:numel(N)
str =fullfile(fileRoot, Files(k).name);
[C{k},D{k}] = audioread(str);
mat = [C(:)];
fs = [D(:)];
a=mat{1};
b=mat{2};
c=mat{3};
myfunction(a,Files(1).name);
myfunction(b,Files(2).name);
myfunction(c,Files(3).name);
end
My script doesn't work because myfunction considers only the last Wav file contained in the folder, although
arrays a, b and c cointain the three different signal.
If I read only one wav file, the script works well. What's wrong in the for loop?
Like Cris noticed, you have some issues with how you structured your for loop. You are trying to use 'b', and 'c' before they are even given any data (in the second and third times through the loop). Assuming that you have a reason for structuring your program the way you do (I would rewrite the loop so that you do not use 'a','b', or 'c'. And just send 'myfunction' the appropriate index of 'mat') The following should work:
dirMask = '\myfolder\*.wav';
fileRoot = fileparts(dirMask);
Files=dir(dirMask);
N = natsortfiles({Files.name});
C = cell(size(N));
D = cell(size(N));
a = {};
b = {};
c = {};
for k = 1:numel(N)
str =fullfile(fileRoot, Files(k).name);
[C{k},D{k}] = audioread(str);
mat = [C(:)];
fs = [D(:)];
a=mat{1};
b=mat{2};
c=mat{3};
end
myfunction(a,Files(1).name);
myfunction(b,Files(2).name);
myfunction(c,Files(3).name);
EDIT
I wanted to take a moment to clarify what I meant by saying that I would not use the a, b, or c variables. Please note that I could be missing something in what you were asking so I might be explaining things you already know.
In a certain scenarios like this it is possible to articulate exactly how many variables you will be using. In your case, you know that you have exactly 3 audio files that you are going to process. So, variables a, b, and c can come out. Great, but what if you have to throw another audio file in? Now you need to go back in and add a 'd' variable and another call to 'myfunction'. There is a better way, that not only reduces complexity but also extends functionality to the program. See the following code:
%same as your code
dirMask = '\myfolder\*.wav';
fileRoot = fileparts(dirMask);
Files = dir(dirMask);
%slight variable name change, k->idx, slightly more meaningful.
%also removed N, simplifying things a little.
for idx = 1:numel(Files)
%meaningful variable name change str -> filepath.
filepath = fullfile(fileRoot, Files(idx).name);
%It was unclear if you were actually using the Fs component returned
%from the 'audioread' call. I wanted to make sure that we kept access
%to that data. Note that we have removed 'mat' and 'fs'. We can hold
%all of that data inside one variable, 'audio', which simplifies the
%program.
[audio{idx}.('data'), audio{idx}.('rate')] = audioread(filepath);
%this function call sends exactly the same data that your version did
%but note that we have to unpack it a little by adding the .('data').
myfunction(audio{idx}.('data'), Files(idx).name);
end

Read specific portions of an excel file based on string values in MATLAB

I have an excel file and I need to read it based on string values in the 4th column. I have written the following but it does not work properly:
[num,txt,raw] = xlsread('Coordinates','Centerville');
zn={};
ctr=0;
for i = 3:size(raw,1)
tf = strcmp(char(raw{i,4}),char(raw{i-1,4}));
if tf == 0
ctr = ctr+1;
end
zn{ctr}=raw{i,4};
end
data=zeros(1,10); % 10 corresponds to the number of columns I want to read (herein, columns 'J' to 'S')
ctr=0;
for j = 1:length(zn)
for i=3:size(raw,1)
tf=strcmp(char(raw{i,4}),char(zn{j}));
if tf==1
ctr=ctr+1;
data(ctr,:,j)=num(i-2,10:19);
end
end
end
It gives me a "15129x10x22 double" thing and when I try to open it I get the message "Cannot display summaries of variables with more than 524288 elements". It might be obvious but what I am trying to get as the output is 'N = length(zn)' number of matrices which represent the data for different strings in the 4th column (so I probably need a struct; I just don't know how to make it work). Any ideas on how I could fix this? Thanks!
Did not test it, but this should help you get going:
EDIT: corrected wrong indexing into raw vector. Also, depending on the format you might want to restrict also the rows of the raw matrix. From your question, I assume something like selector = raw(3:end,4); and data = raw(3:end,10:19); should be correct.
[~,~,raw] = xlsread('Coordinates','Centerville');
selector = raw(:,4);
data = raw(:,10:19);
[selector,~,grpidx] = unique(selector);
nGrp = numel(selector);
out = cell(nGrp,1);
for i=1:nGrp
idx = grpidx==i;
out{i} = cell2mat(data(idx,:));
end
out is the output variable. The key here is the variable grpidx that is an output of the unique function and allows you to trace back the unique values to their position in the original vector. Note that unique as I used it may change the order of the string values. If that is an issue for you, use the setOrderparameter of the unique function and set it to 'stable'

Does h5py offer a neat simple way to create or use a group or dataset?

I'm fixing a python script using h5py. It contains code like this:
hdf = h5py.File(hdf5_filename, 'a')
...
g = hdf.create_group('foo')
g.create_dataset('bar', ...whatever...)
Sometimes this runs on a file which already has a group named 'foo', in which case I see "ValueError: Unable to create group (Name already exists)"
One way to fix this is to replace the one simple line with create_group with four lines, like this:
if 'foo' in hdf.keys():
g = hdf['foo']
else:
g = hdf.create_group['foo']
g.create_dataset(...etc...)
Is there a neater way to do this, maybe in only one line? Like how with files in the standard C library, 'a' mode will either append to an existing file, or create a file if it's not already there.
Same goes for datasets - I have
create_dataset('bar', ...)
but should check first:
if 'bar' in g.keys():
d = g['bar']
else:
d = g.create_dataset('bar')
My wish: to find h5py has methods named create_or_use_group() and create_or_use_dataset(). What actually exists?
Yes: require_group and require_dataset:
with h5py.File("/tmp/so_hdf5/test2.h5", 'w') as f:
a = f.create_dataset('a',data=np.random.random((10, 10)))
with h5py.File("/tmp/so_hdf5/test2.h5", 'r+') as f:
a = f.require_dataset('a', shape=(10, 10), dtype='float64')
d = f.require_dataset('d', shape=(20, 20), dtype='float64')
g = f.require_group('foo')
print(a)
print(d)
print(g)
Note that you do need to know the shape and dtype of the dataset, otherwise require_dataset throws a TypeError. In that case, you could do something like:
try:
a = f.require_dataset('a', shape=(10, 10), dtype='float64')
except TypeError:
a = f['a']
If you don't already know the shape and dtype, I don't think there's much advantage to require_dataset over using try ... except ...

How to export data from Matlab to excel for a loop?

I have a code for "for loop"
for i=1:4
statement...
y=sim(net, I);
end
now i need to export the value of y to excel sheet. for that i used..
xlswrite('output_data.xls', y, 'output_data', 'A1')
but my problem is that the ID of excel i.e. "A1" should change according to each iteration... in my case for iteration 1-> A1, iteration-> A2 and so on..
anybody please help me out ..
thanks in advance. for any assistance.. or suggestion..
You can store sim outputs in a vector (y(ii)) and save in the sheet with a single write. This is also more efficient since you perform a single bulk-write instead of many small writes.
Specify the first cell and y will be written starting from there.
last = someNumber;
for i=1:last statement... y(i)=sim(net, I); end
xlswrite('output_data.xls', y', 'output_data', 'A1');
If you prefer specify the range write ['A1:A',num2str(last)] instead of A1.
If you really want to write within the loop try:
for ii=1:last
...
y=sim(net, I);
xlswrite('output_data.xls', y, 'output_data', sprintf('A%d',ii));
end
You can also do for yourself what xlswrite does internally, which is interact using COM. I prefer to do this when I have a frequently used excel template or data file, because it allows for more control (albeit with more lines of code).
Excel = actxserver('Excel.Application');
Workbook = Excel.Workbooks.Open('myExcelFile.xlsx');
MySheet = Excel.ActiveWorkBook.Sheets.Item(1);
set( get(MySheet,'Range','A1:A10'), 'Value', yourValues);
...
invoke(Workbook, 'Save');
invoke(Excel, 'Quit');
delete(Excel);
This would allow you to save new data to new ranges without re-opening excel each time.
Even better would be to define an oncleanup function (as does xlswrite) to prevent lost file locks (especially when you're doing things like exiting out of debug mode):
...
myWorkbook = Excel.Workbooks.Open(filename,0,true);
cleanUp = onCleanup(#()xlsCleanup(Excel, filename));
function xlsCleanup(Excel,filepath)
try
Excel.DisplayAlerts = 0; %// Turn off dialog boxes
[~,n,e] = fileparts(filepath); %// Excel API expects just the filename
fileName = [n,e];
Excel.Workbooks.Item(fileName).Close(false);
end
Excel.Quit;
end
You can put xlswrite after for loop.You just want to do is save you result in a matrix.This function can write a matrix.
also,you can use [] to combine string to change the range.
>> for i=1:4
Range=['A' num2str(i)]
end
Range =
A1
Range =
A2
Range =
A3
Range =
A4
But,this is a bad way.You should open and write Excel file every time.