Adjusting STATA id and t columns with a user-friendly matlab function - matlab

Here is a piece of MATLAB code that may look rather peculiar. To save time, generating the id and t columns of the data up front makes working in Stata faster than entering them by hand in Excel.
function output = stataenterance(N,T)
% generating id and t columns in stata
% helping stata data entrance
time=(1:T)';
t=repmat(time,N,1);
for i = 1:N
x=i*ones(T,1);
end
idc=cell(N,1);
for i=1:N
for j=1:N
idc{j} = repmat(j,T,1);
end
end
id=cell2mat(idc);
output=[id t];
Do you know of any alternative that is faster than this code? Thanks a lot :D
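One possible loop-free version is sketched below. It assumes a MATLAB release that has repelem (R2015a or later); on older releases kron((1:N)', ones(T,1)) should give the same id column. The sizes N = 3 and T = 4 are only example values.

```matlab
% Example sizes (assumed for illustration): 3 panel units, 4 time periods
N = 3;
T = 4;

% id: each unit number repeated T times in a row -> 1 1 1 1 2 2 2 2 3 3 3 3
id = repelem((1:N)', T);

% t: the time index 1..T stacked N times -> 1 2 3 4 1 2 3 4 1 2 3 4
t = repmat((1:T)', N, 1);

% Same N*T-by-2 layout as the original function's output
output = [id t];
```

Both columns are built in one call each, so no cell array and no cell2mat are needed.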

Related

Matlab code Sha-1 hashing password

clc;clear all;close all;
fileID = fopen('H:\dictionary.txt');
S = textscan(fileID,'%s','Delimiter','\n') ;
fclose(fileID);
S = S{1} ;
% remove empty cells
S = S(~cellfun('isempty',S));
n=length(S);
k=0;
for i=1:n
for j=1:n
k=k+1;
y(k,1)=strcat(S(i),S(j))
end
end
This is my code for SHA-1 hashing. My problem is in the for loop that generates all possible combinations, at the line
y(k,1)=strcat(S(i),S(j)).
It runs properly, but it takes too long. I have been running this code for 2 days and it still has not finished, as my dictionary contains over 5000 words. Please suggest some good ideas to make it faster, and a better way to improve it and crack the password.
Since you did not provide some data to test the code, I created my own test data, which is a cell array containing 400 words:
% create cell array with a lot of words
S = repmat({'lalala','bebebebe','ccececeece','ddededde'},1,100);
Here is the code with some small changes but with huge impact on the performance.
Note that the variable 'y' is here named 'yy' so that you can just copy and paste the code to compare it with your existing code:
% Preallocate memory by specifying variable and cell size
n = numel(S);
yy = cell(n^2,1);
% Measure time
tic
k = 0;
for i=1:n
for j=1:n
k=k+1;
% Replace strcat() with [] and index cell content with S{i} instead
% of indexing the cell itself with S(i)
yy{k}=[S{i},S{j}];
end
end
% Stop and output time measurement
toc
With my example data, your code took 7.78 s to run and the proposed, improved code took 0.23 s on my computer.
I would recommend reading the MATLAB docs on preallocation.
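The double loop can probably be avoided altogether. The sketch below is an assumption-laden alternative, not a measured improvement: it builds all index pairs with ndgrid and lets strcat concatenate the two cell arrays element by element. The three-word S is a stand-in for the real dictionary.

```matlab
% Small stand-in dictionary (assumed for illustration)
S = {'a','b','c'};
n = numel(S);

% All index pairs; jj varies fastest, matching the inner loop over j
[jj, ii] = ndgrid(1:n, 1:n);

% Element-wise concatenation of two equally sized cell arrays
yy = strcat(S(ii(:)), S(jj(:)));

% Return the result as an n^2-by-1 column, like the loop version
yy = yy(:);
```

Note that for 5000 words the result holds 25 million strings, so memory may become the limiting factor either way.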

How to merge outputs that are produced in for-loops separately in Matlab

I have some code that is executed in a for loop at the moment, but I will eventually use parfor. That is why I need to save the output for each loop separately:
for Year = 2008:2016
for PartOfYear = 1:12
% some code that produces numerical values, vectors and strings
end
end
I want to save the outputs for each loop separately and merge them at the end, so that all the outputs are vertically concatenated, starting with Year = 2008, PartOfYear = 1 in the first row, then Year = 2008, PartOfYear = 2, and so on. I am stuck on how to write this code. I looked into tables, cells, eval and sprintf but couldn't make them work for my case.
You can use a cell array (that's what I mostly use).
Check out the code:
a=1; %some random const
OParray=cell(1);
idx=1;colforYear=1;colforPart=2;colforA=3;
for Year = 2008:2016
for PartOfYear = 1:12
str1='monday';
a=a+1; %some random operation
outPut=strcat(str1,num2str(a));
OParray{idx,colforYear}=Year;
OParray{idx,colforPart}=PartOfYear;
OParray{idx,colforA}=outPut;
idx=idx+1;
end
end
Steer clear of eval: it makes code very difficult to debug and interpret, and creating dynamic variables isn't considered good practice in MATLAB anyway. Also, always index starting from 1 and going upwards, because it makes your life easier in data handling.
You're best off creating a structure and saving each output as a value in that structure that is indexed with the same value as the one in your for loop. Something like:
Years= [2008:1:2016]
for Year = 1:length(Years)
for PartofYear= 1:12
Monthly_Out{PartofYear}= %whatever code generates your output
end
Yearly_Out{Year}= vertcat(Monthly_Out{:,:});
end
Total_Output= vertcat(Yearly_Out{:,:});
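Since the loop is meant to become a parfor eventually, a single flat loop in which every iteration writes only its own cell may be easier to convert. The following is a rough sketch under that assumption; the value and label outputs are made-up placeholders for whatever the real code produces.

```matlab
Years  = 2008:2016;
nParts = 12;
nRuns  = numel(Years) * nParts;

% One cell per (Year, PartOfYear) combination, filled independently
rows = cell(nRuns, 1);

for k = 1:nRuns   % could be changed to parfor: iteration k only writes rows{k}
    % Recover the two loop counters from the flat index (part varies fastest)
    [p, y] = ind2sub([nParts, numel(Years)], k);

    value = Years(y) + p / 100;                   % placeholder numeric output
    label = sprintf('run_%d_%02d', Years(y), p);  % placeholder string output

    rows{k} = {Years(y), p, value, label};
end

% Vertically concatenate: one row per iteration, in loop order
result = vertcat(rows{:});
```

The rows come out ordered as Year = 2008, PartOfYear = 1, then 2008/2, and so on, regardless of the order in which the iterations actually ran.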

How to create array data-structures in MATLAB?

I basically have a large data set file and I want to write a MATLAB script that creates a data structure for it. I have tried to read about using structure arrays in MATLAB, but I haven't found a solution for how to do this. I don't really have a lot of experience in writing MATLAB scripts.
Edited: My data set is a large list of items with, say, 10 different characteristics written down for each item. So for example, say 100,000 listings of houses, and the characteristics given could be price, county, state, date when sold, etc. This file is in .txt, .xls, or any format you like to play with.
I would like to write a MATLAB script that creates a data structure of it say in the format:
house(i).price
house(i).county
house(i).state
house(i).date
etc
Any suggestions to the right direction or examples of teaching how to do this would be greatly appreciated.
This seems like a very reasonable question, and one that can be easily addressed.
The format of the file really makes this problem easy or hard. I don't much like .xls files for this kind of work myself, but I realize you get what you get. Let's assume it's a tab-delimited text file like:
Price County State Date
100000 Sherlock London 2001-10-01
134000 Holmes Dartmoor 2011-12-30
123456 Watson Boston 2003-04-15
I would just read the whole thing in, parse the field name row, and use dynamic field names to make the array of structures.
fid = fopen('data.txt','r');
tline = fgetl(fid);
flds = regexp(strtrim(tline),'\s+','split');
% initialize the first prototype struct
data = struct();
for ii=1:length(flds)
data.(flds{ii}) = [];
end
ii = 1;
% get the first line of data
tline = fgetl(fid);
while ischar(tline)
% parse the data
rowData = regexp(strtrim(tline),'\s+','split');
% we're assuming no missing data, etc
% populate the structure
for jj=1:length(flds)
data(ii).(flds{jj}) = rowData{jj};
end
% since we don't know how many lines we have
% we could figure that out, but we won't now
% we'll just use the size extending feature of
% matlab arrays, even though it's slow, just
% to show how we would do it
tline = fgetl(fid);
ii = ii + 1;
end
fclose(fid)
Hope this gets you started!
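As an alternative to parsing by hand, readtable followed by table2struct may get to the same house(i).field layout in two calls. This is only a sketch: it assumes a tab-delimited file and a MATLAB release that ships readtable, and it writes a small temporary file so that it runs on its own. How the Date column is typed depends on the release, so it is not relied on here.

```matlab
% Write a small tab-delimited sample file (stand-in for the real data file)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Price\tCounty\tState\tDate\n');
fprintf(fid, '100000\tSherlock\tLondon\t2001-10-01\n');
fprintf(fid, '134000\tHolmes\tDartmoor\t2011-12-30\n');
fclose(fid);

% Read the file into a table; the header row supplies the variable names
T = readtable(fname, 'Delimiter', '\t');

% Convert to a struct array: one element per row, one field per column
house = table2struct(T);

delete(fname);   % clean up the temporary file
```

After this, house(2).County holds 'Holmes' and house(1).Price is numeric rather than a string.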

Trying to save datenum

I have a small problem with saving the datenum in Matlab.
I have a sensor that reads data in real time. I then add the time at which the reading was received by the computer.
I am constructing a matrix whose first column is the time returned by the function now and whose second column is the data. This is done in real time in MATLAB. Everything works perfectly until I have to save the data.
When saving the data, the date is rounded automatically. If I plot my time (the da variable), I get a function which increases.
However, if I plot mam(1,:), I get a flat line.
I have tried many things but with same result.
Do you know, how can I save the matrix (ma) in Matlab in such a way to preserve all the decimals from date?
Here is a small script simulating my problem:
s=0;
j=1;
for i=1:10
s(j)=s(end)+i;
da(j)=now;
pause(1);
j=j+1;
end
ma= [da; s];
dlmwrite('mam.dat',ma);
The code you provided works fine. This can be verified by looking at the difference between the first two timestamps, ma(1,1) - ma(1,2), which doesn't return 0.
The rounding occurs when the data is displayed. By default MATLAB displays only 4 digits after the decimal point (format short). The command format('long') will cause about 15 significant digits to be displayed.
Style note:
The logic in your loop is a little odd; here is a more MATLAB-like way to do what you've written above:
nSample = 10;
s = nan(1,nSample); % preallocate row vectors, much faster for big arrays
da = nan(1,nSample); % rows, so that [da; s] is 2-by-nSample as in the original
for i = 1:nSample
if i==1
s(i) = 1;
else
s(i) = s(i-1) + i;
end
da(i) = now;
end
ma = [da; s];
dlmwrite('mam.dat', ma);
If you want to save the data with as much precision as is stored in the variables, export to a binary MAT-file instead of a text file:
save mam.mat ma
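If a text file is still required, the flat line may come from the file itself rather than from the display, since dlmwrite writes only a limited number of significant digits unless told otherwise. A sketch using its 'precision' option is shown below; the simulated timestamps and the temporary file name are assumptions made so the example runs without a sensor or a pause.

```matlab
% Ten simulated timestamps one second apart (instead of pausing in a loop)
da = now + (0:9) / 86400;
s  = cumsum(1:10);
ma = [da; s];

% Write with 10 digits after the decimal point instead of the default format
fname = [tempname '.dat'];
dlmwrite(fname, ma, 'precision', '%.10f');

% Read the file back and compare with the values in memory
mam = dlmread(fname);
maxErr = max(abs(mam(1,:) - da));

delete(fname);   % clean up the temporary file
```

One second is roughly 1.16e-5 days, so ten decimal places should be more than enough for the timestamps to stay distinct after the round trip.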

MATLAB query about for loop, reading in data and plotting

I am a complete novice at using MATLAB and am trying to work out whether there is a way of optimising my code. Essentially I have data from model outputs and I need to plot them using MATLAB. In addition I have reference data (with 95% confidence intervals) which I plot on the same graph to get a visual idea of how close the model outputs and the reference data are.
The model outputs are in several thousand files (numbered sequentially), which I open in a loop and plot. My question is whether I can preprocess the data first and plot later, to save time. The issue I seem to have when I try this is that the legend either does not appear or is inaccurate.
My code (apologies if it is not elegant):
fn= xlsread(['tbobserved' '.xls']);
time= fn(:,1);
totalreference=fn(:,4);
totalreferencelowerci=fn(:,6);
totalreferenceupperci=fn(:,7);
figure
plot(time,totalreference,'-', time, totalreferencelowerci,'--', time, totalreferenceupperci,'--');
xlabel('Year');
ylabel('Reference incidence per 100,000 population');
title ('Total');
clickableLegend('Observed reference data', 'Totalreferencelowerci', 'Totalreferenceupperci','Location','BestOutside');
xlim([1910 1970]);
hold on
start_sim=10000;
end_sim=10005;
h = zeros (1,1000);
for i=start_sim:end_sim %is there any way of doing this earlier to save time?
a=int2str(i);
incidenceFile =strcat('result_', 'Sim', '_', a, 'I_byCal_total.xls');
est_tot=importdata(incidenceFile, '\t', 1);
cal_tot=est_tot.data;
magnitude=1;
t1=cal_tot(:,1)+1750;
totalmodel=cal_tot(:,3)+cal_tot(:,5);
h(a)=plot(t1,totalmodel);
xlim([1910 1970]);
ylim([0 500]);
hold all
clickableLegend(h(a),a,'Location','BestOutside')
end
Essentially I was hoping to have a way of reading in the data and then plot later - ie. optimise the code.
I hope you might be able to help.
Thanks.
mp
Regarding your issue concerning "I have a legend which either does not appear or is inaccurate", have a look at the following extracts from your code.
...
h = zeros (1,1000);
...
a=int2str(i);
...
h(a)=plot(t1,totalmodel);
...
You are using a character array as an index. Instead of h(a) you should use h(i). MATLAB casts the character array a to double (its character codes), as shown in the following example with i = 10:
>> double(int2str(10))
ans = 49 48
Instead of h(10) the plot handle will be assigned to h([49 48]) which is not your intention.
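On the question of reading first and plotting later, one option is to collect the curves and their labels in cell arrays, then plot everything and call legend once at the end. The sketch below uses the built-in legend rather than clickableLegend and replaces the importdata call with synthetic data, so the numbers are purely illustrative. It also indexes the handles by position (1, 2, ...) rather than by simulation number, which keeps the handle array short.

```matlab
simIds = 10000:10005;
nSim   = numel(simIds);

% Step 1: read (here: fake) all data first, no plotting yet
curves = cell(1, nSim);
labels = cell(1, nSim);
for k = 1:nSim
    t1 = (1910:1970)';                        % stand-in for cal_tot(:,1)+1750
    totalmodel = 100 + 10 * k * ones(size(t1)); % stand-in for the model output
    curves{k} = [t1, totalmodel];
    labels{k} = int2str(simIds(k));
end

% Step 2: plot everything, keeping one handle per curve
fig = figure('Visible', 'off');
hold on
h = gobjects(1, nSim);
for k = 1:nSim
    h(k) = plot(curves{k}(:,1), curves{k}(:,2));
end
xlim([1910 1970]);

% Step 3: a single legend call with matching handles and labels
legend(h, labels, 'Location', 'BestOutside');
```

Because handles and labels are built in the same order, each legend entry should line up with its curve.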