Filtering by Column in Matlab for a list or variety of values - matlab

I've produced a code which separates data within a text file into the required format, filters the data and averages the output (in this case, the value in the fourth column)
I am trying to filter the data in column one for a list of values at the same time, with no strict pattern for the values. e.g 1001, 1007, 1048, 1192, 1200 ....
Currently my code only filters by a certain value (1001) is there a way of incorporating a list of values into this function?
C_f = C(C(:,1) == 1001 , :);
Any help would be much appreciated!

See if this is what you want,
val = [1000 1001];
ind = ismember(C(:,1),val);
C_f = C(ind,:)

Related

MATLAB drop observations from a timetable not contained in another timetable

I have two timetables, each of them have 4 columns, where the first 2 columns are of my particular interest. The first column is a date and the second is an hour.
How can I know which observations (by date an hour) are in the timetable 1 but not in the timetable 2 and, therefore, drop those observations from my timetable 1?
So for example, just by looking I realized that timetable1 included the day 25/05/2015 with hours 1 and 2, but the timetable 2 did not include them, therefore I would like to drop those observations from timetable 1.
I tried using the command groups_timetable1 = findgroups(timetable1.Date,timetable1.Hour);but unfortunately this command does not tell you a lot how to distinguish between observations.
Thank you!
call ismember to find one set of data in another.
to find multiple records as a group in another composite records, you call ismember(..., 'rows').
for example
baseline=[
100, 2.1
200, 7.5
120, 11.0
];
isin=ismember(baseline,[200, 7.5],'rows');
pos=find(isin)
if you have time date strings or datetime objects, please convert those to numerical values, such as by calling datenum or posixtime first.
You can use the timetable method innerjoin to do this. Like so:
% Fabricate some data
dates1 = datetime(2015, 5, ones(10,1));
hours1 = (1:10)';
timetable1 = timetable(dates1(:), hours1, rand(10,1), rand(10,1), ...
'VariableNames', {'Hour', 'Price', 'Volume'});
% Subselect a few rows for timetable2
timetable2 = timetable1([1:3, 6:10],:);
% Use innerjoin to pick rows where Time & Hour intersect:
innerjoin(timetable1, timetable2, 'Keys', {'Time', 'Hour'})
By default, the result of innerjoin contains the table variables from both input tables - that may or may not be what you want.

How to insert a structure within a structure

I have a 1x1 structure called imu_data.txyzrxyz1. It has one field called txyzrxyz1 and the value is 4877x7 double. I just want to "copy and paste" row 62 into row 63 (double up that row) so that the structure now becomes a 4878x7 structure. I've tried the following, with other versions without success:
extra_63 = imu_data.txyzrxyz1(63,:);
imu_data2.txyzrxyz1 = [{imu_data.txyzrxyz1(1:62,:) extra_63 imu_data.txyzrxyz1(63:end,:)}]
Thanks
You can index the row to duplicate twice while matrix indexing:
row_to_duplicate = 63;
yourdata = rand(100,10);
yourstruct.data = yourdata;
yourstruct.data = yourstruct.data([1:row_to_duplicate, row_to_duplicate:end],:)
So in case of 63, 1:row_to_duplicate will create a column vector from 1:63, and row_to_duplicate:end will create a column vector from 63:100 in this example. When combining these, 63 will occur twice, hence that row is duplicated.
You were almost there, you only had to get rid of the {}'s and put the data in the right orientation by using ; instead of a space between matrix entries to vertically concatenate instead of horizontally:
extra_63 = imu_data.txyzrxyz1(63,:);
imu_data2.txyzrxyz1 = [imu_data.txyzrxyz1(1:62,:); extra_63; imu_data.txyzrxyz1(63:end,:)]

Organising large datasets in Matlab

I have a problem I hope you can help me with.
I have imported a large dataset (200000 x 5 cell) in Matlab that has the following structure:
'Year' 'Country' 'X' 'Y' 'Value'
Columns 1 and 5 contain numeric values, while columns 2 to 4 contain strings.
I would like to arrange all this information into a variable that would have the following structure:
NewVariable{Country_1 : Country_n , Year_1 : Year_n}(Y_1 : Y_n , X_1 : X_n)
All I can think of is to loop through the whole dataset to find matches between the names of the Country, Year, X and Y variables combining the if and strcmp functions, but this seems to be the most ineffective way of achieving what I am trying to do.
Can anyone help me out?
Thanks in advance.
As mentioned in the comments you can use categorical array:
% some arbitrary data:
country = repmat('ca',10,1);
country = [country; repmat('cb',10,1)];
country = [country; repmat('cc',10,1)];
T = table(repmat((2001:2005)',6,1),cellstr(country),...
cellstr(repmat(['x1'; 'x2'; 'x3'],10,1)),...
cellstr(repmat(['y1'; 'y2'; 'y3'],10,1)),...
randperm(30)','VariableNames',{'Year','Country','X','Y','Value'});
% convert all non-number data to categorical arrays:
T.Country = categorical(T.Country);
T.X = categorical(T.X);
T.Y = categorical(T.Y);
% here is an example for using categorical array:
newVar = T(T.Country=='cb' & T.Year==2004,:);
The table class is made for such things, and very convenient. Just expand the logic statement in the last line T.Country=='cb' & T.Year==2004 to match your needs.
Tell me if this helps ;)

Skip multiple points of 'x' and store and variable 'newdata in matlab

I have data stored in variable data.
data =
[43.98272955 39.55809471;
-49.51656799 28.57164726;
-9.475861028 -44.31264255;
27.14884251 2.603921223;
-2.914496888 7.864022006;
4.093025860 4.816211687;
-12.11007479 5.797539648;
-1.653535904 -12.49864642;
5.978990391 1.229984916;
0.9837133282 -2.001124423;
5.674977844 6.323209942;
-9.574459589 3.696791663;
0.3410452503 -7.338955191]
but need use only data corresponding to multiple numbers of x.
Example:
if x = 3,
want store only multiple rows of 3, so
newdata = [-9.475861028 -44.31264255;
4.093025860 4.816211687;
5.978990391 1.229984916;
-9.574459589 3.696791663]
how do I do that?
P.S I would use the command textscan.
this is straightforward with indexing:
newData = data(3:3:end,:)
If I understood the question correctly:
data(x:x:length(data),:)
You could just do just scan it row by row using the mod (modulo) function to extract rows corresponding to the desired multiples. For example:
x=3;
newdata=[];
for k=1:size(data,1)
if mod(k,x)==0
newdata=[newdata; data(k,:)];
end
end

How to read data in chunks from notepad file in Matlab?

My data is in following format:
TABLE NUMBER 1
FILE: name_1
name_2
TIME name_3
day name_4
-0.01 0
364.99 35368.4
729.99 29307
1094.99 27309.5
1460.99 26058.8
1825.99 25100.4
2190.99 24364
2555.99 23757.1
2921.99 23240.8
3286.99 22785
3651.99 22376.8
4016.99 22006.1
4382.99 21664.7
4747.99 21348.3
5112.99 21052.5
5477.99 20774.1
5843.99 20509.9
6208.99 20259.7
6573.99 20021.3
6938.99 19793.5
7304.99 19576.6
TABLE NUMBER 2
FILE: name_1
name_5
TIME name_6
day name_7
-0.01 0
364.99 43110.4
729.99 37974.1
1094.99 36175.9
1460.99 34957.9
1825.99 34036.3
2190.99 33293.3
2555.99 32665.8
2921.99 32118.7
3286.99 31626.4
3651.99 31175.1
4016.99 30758
4382.99 30368.5
4747.99 30005.1
5112.99 29663
5477.99 29340
5843.99 29035.2
6208.99 28752.4
6573.99 28489.7
6938.99 28244.2
7304.99 28012.9
TABLE NUMBER 3
Till now I was splitting this data and reading the variables (time and name_i) from each file in following way:
[TIME(:,j), name_i(:,j)]=textread('filename','%f\t%f','headerlines',5);
But now I am producing the data of those files into 1 file as shown in beginning. For example I want to read and store TIME data in vectors TIME1, TIME2, TIME3, TIME4, TIME5 for name_3, name_6, _9 respectively, and similarly for others.
First of all, I suggest you don't use variable names such as TIME1,TIME2 etc, since that gets messy quickly. Instead, you can e.g. use a cell array with five rows (one for each well), and one or two columns. In the sample code below, wellData{2,1} is the time for the second well, wellData{2,2} is the corresponding Oil Rate SC - Yearly.
There might be more elegant ways to do the reading; here's something quick:
%# open the file
fid = fopen('Reportq.rwo');
%# read it into one big array, row by row
fileContents = textscan(fid,'%s','Delimiter','\n');
fileContents = fileContents{1};
fclose(fid); %# don't forget to close the file again
%# find rows containing TABLE NUMBER
wellStarts = strmatch('TABLE NUMBER',fileContents);
nWells = length(wellStarts);
%# loop through the wells and read the numeric data
wellData = cell(nWells,2);
wellStarts = [wellStarts;length(fileContents)];
for w = 1:nWells
%# read lines containing numbers
tmp = fileContents(wellStarts(w)+5:wellStarts(w+1)-1);
%# convert strings to numbers
tmp = cellfun(#str2num,tmp,'uniformOutput',false);
%# catenate array
tmp = cat(1,tmp{:});
%# assign output
wellData(w,:) = mat2cell(tmp,size(tmp,1),[1,1]);
end