I have a vector, stdclock, which holds values that follow this pattern:
stdclock=[13 25 38 50 63 75 88 100 113 125 138 150 163 175 188 200 213 2517 2529 2542 2554 2567 2579 2592 2604 2617 2629 2642 2654 2667 2679 2692 2704 2717]
This data is generated through an encoding of 17 values that come 12 or 13 numbers apart (e.g. 25-13 = 12, 38-25 = 13, etc.). You'll see that the first 17 values follow this pattern. Each group of 17 values encodes an object, which we'll call an 'item', and is independent of the subsequent 17 values. Then, between values 17 and 18, there's a much larger difference than 12 or 13; it could be any number higher than, say, 15. This difference represents a qualitative separation in the data, such that the first 17 values encode one item, the next 17 values encode another item, and so on. The difference between the 17th and 18th value will never be as small as 12 or 13. Therefore, I can check for any differences >= 15 and be sure that I can separate my data in this way. Alternatively, I can reshape the vector as a 17-by-(length(stdclock)/17) matrix.
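For data with no dropped values, the reshape idea can be sketched as follows. The names itemLen, items and gaps are chosen here for illustration only, and the sketch assumes the vector length is an exact multiple of 17, which is precisely what fails once the hardware drops a value.

```matlab
% Uncorrupted example vector from the question (two items of 17 values each)
stdclock = [13 25 38 50 63 75 88 100 113 125 138 150 163 175 188 200 213 ...
            2517 2529 2542 2554 2567 2579 2592 2604 2617 2629 2642 2654 2667 2679 2692 2704 2717];

itemLen = 17;                            % values per item (from the question)
items = reshape(stdclock, itemLen, []);  % each column holds one item
gaps = diff(items, 1, 1);                % spacing between neighbours within each item

% Every within-item spacing should be 12 or 13
allSpacingsOk = all(gaps(:) == 12 | gaps(:) == 13);
```

If numel(stdclock) is not a multiple of itemLen, reshape raises an error, so this only serves as a sanity check on clean recordings.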
So far so good. The problem is that this vector is generated by hardware that sometimes has errors, such that one or more values are simply dropped and not recorded. I want to figure out an algorithm that will detect that values are missing from an 'item' and then remove all remaining values from that item.
I can't quite wrap my head around how to do this in a way that will work for all patterns of errors (e.g. if an item can have missing numbers anywhere, in any pattern, and neighboring items may also have missing numbers anywhere in any pattern, or nowhere).
Any help would be appreciated. An example of a 'corrupted' item would be like this
stdclock=[13 25 38 50 63 75 88 100 113 125 138 150 163 175 188 200 213 2529 2542 2554 2567 2579 2592 2604 2642 2654 2679 2692 2704]
where this stdclock is the same as the one on top, but I went through in the second item and randomly removed numbers, including the first and last numbers.
If you can assume that the difference between consecutive groups is always larger than some threshold, you can use the approach below: identify consecutive groups, and throw out all groups with fewer than 17 values. It turns out that the threshold for a new group can be set as low as 15. A data point missing from the middle of an item leaves a gap of about 25 (two steps of 12 or 13), which splits the group of 17 into two shorter groups that are then both removed. A missing first or last value simply leaves a group that is too short.
stdclock=[13 25 38 50 63 75 88 100 113 125 138 150 163 175 188 200 213 2529 2542 2554 2567 2579 2592 2604 2642 2654 2679 2692 2704];
%# a difference of more than groupDelta indicates a new (pseudo-)group
groupDelta = 15;
groupJump = [1 diff(stdclock) > groupDelta];
%# number the groups
groupNumber = cumsum(groupJump);
%# count, for each group, the numbers.
groupCounts = hist(groupNumber,1:groupNumber(end));
%# if a group contains fewer than 17 entries, throw it out
badGroup = find(groupCounts < 17);
stdclock(ismember(groupNumber,badGroup)) = [];
stdclock =
13 25 38 50 63 75 88 100 113 125 138 150 163 175 188 200 213
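The same idea can be written with accumarray instead of hist, which avoids depending on histogram binning. This is only a sketch of a variant, not the answer's original code; itemLen, isStart, keep and cleaned are names introduced here for illustration.

```matlab
% Corrupted example vector from the question
stdclock = [13 25 38 50 63 75 88 100 113 125 138 150 163 175 188 200 213 ...
            2529 2542 2554 2567 2579 2592 2604 2642 2654 2679 2692 2704];

itemLen = 17;      % values per intact item
groupDelta = 15;   % any larger difference starts a new (pseudo-)group

isStart = [true, diff(stdclock) > groupDelta];  % first value of each group
groupNumber = cumsum(isStart);                  % group label for every value
groupCounts = accumarray(groupNumber(:), 1);    % number of values per group

% Keep only the values whose group has exactly itemLen members
keep = groupCounts(groupNumber) == itemLen;
cleaned = stdclock(keep(:).');
```

With this input the second item breaks into pseudo-groups of 7, 2 and 3 values, so only the first item survives.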
I have a numeric variable within my data.
sample(d$timedelta, 20)
[1] 601 561 44 162 554 443 604 68 140 446 178 506 348 402 401 700 127 717 669 68
My target is a binary variable (Popularity = 1/0)
I want to drop the variable if there is no statistically significant difference in timedelta between the two groups.
pop.1.time = d$timedelta[d$Popularity==1]
pop.0.time = d$timedelta[d$Popularity==0]
t.test(pop.1.time,pop.0.time, var.equal = F, paired = F)
Can I drop Timedelta altogether if the above test shows that there is no difference among the two groups?
Is that a valid approach? Or am I misinterpreting the meaning of a T-test?
This image is the result of computing a GLCM. What do the black horizontal and vertical lines in the GLCM image mean? Are they a problem?
N = numel(unique(img)); % img is uint8
glcm = graycomatrix(img, 'NumLevels', N);
imshow(glcm)
I suspect this is the problem: you have supplied graycomatrix with a 'NumLevels' argument that is larger than the number of unique gray levels in your image. For instance, an 8-bit image has at most 256 gray levels. Asking for 1000 levels in the output means at least 744 levels will have no data. So yes, this is a problem. You can check how many gray levels your image has using numel(unique(I)).
p.s. In the future, please attach the code you used to generate the problem.
graycomatrix calculates the GLCM from a scaled version of the image. Due to round-off in the scaling process, the number of different intensity levels in the scaled image may be smaller than the number in the original image.
Consider the following sample image:
img = uint8([ 48 161 209 64 133 240 166 227;
184 54 181 33 107 252 242 255
217 191 125 112 204 252 135 201
163 222 66 125 229 140 38 97
252 214 201 191 10 102 242 74
191 74 77 8 163 51 189 186]);
From the documentation (emphasis mine):
[glcms,SI] = graycomatrix(___) returns the scaled image, SI, used to calculate the gray-level co-occurrence matrix. The values in SI are between 1 and NumLevels.
If you set NumLevels to the number of different intensity levels (which in this example is 39)
N = numel(unique(img))
[glcm_scaled, img_scaled] = graycomatrix(img, 'NumLevels', N);
the returned GLCM has 39*39 elements. The issue is that the scaled image has only 28 different intensity levels:
>> img_scaled
img_scaled =
8 25 32 10 21 37 26 35
29 9 28 6 17 39 38 39
34 30 20 18 32 39 21 31
25 34 11 20 36 22 6 15
39 33 31 30 2 16 38 12
30 12 12 2 25 8 29 29
>> numel(unique(img_scaled))
ans =
28
As a consequence, the GLCM will have 11 rows and 11 columns in which all the entries are zero (black lines).
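The count of empty rows and columns can be checked directly from the scaled image, without calling graycomatrix again. The sketch below copies the img_scaled values printed above; usedLevels and unusedLevels are names introduced here for illustration.

```matlab
% Scaled image as printed above (values lie between 1 and NumLevels = 39)
img_scaled = [ 8 25 32 10 21 37 26 35;
              29  9 28  6 17 39 38 39;
              34 30 20 18 32 39 21 31;
              25 34 11 20 36 22  6 15;
              39 33 31 30  2 16 38 12;
              30 12 12  2 25  8 29 29];
N = 39;

usedLevels = unique(img_scaled(:)).';     % levels that occur in the scaled image
unusedLevels = setdiff(1:N, usedLevels);  % levels that never occur

% Each unused level corresponds to an all-zero row and column in the GLCM
numUnused = numel(unusedLevels);
```

Here unusedLevels lists the row and column indices where the black lines appear.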
If you do not wish this to happen, you can map the intensity levels through a lookup table:
levels = unique(img);
N = numel(levels);
lut = zeros(256, 1);
for i=1:N;
index = uint16(levels(i)) + 1;
lut(index) = i;
end
img_lut = lut(uint16(img) + 1);
[glcm_mapped, img_mapped] = graycomatrix(img_lut, 'NumLevels', N, 'GrayLimits', []);
By doing so, img_mapped is exactly the same as img_lut and there are no black lines in the GLCM. Notice that by specifying empty brackets for the GrayLimits parameter, graycomatrix uses the minimum and maximum grayscale values in the input image as limits.
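The same mapping can probably be obtained without an explicit loop, because the third output of unique already gives, for each pixel, the rank of its intensity among the distinct levels. This is a sketch of an alternative, not part of the original answer; it stops before the graycomatrix call, which would be the same as above.

```matlab
% Sample image from the answer above
img = uint8([ 48 161 209  64 133 240 166 227;
             184  54 181  33 107 252 242 255;
             217 191 125 112 204 252 135 201;
             163 222  66 125 229 140  38  97;
             252 214 201 191  10 102 242  74;
             191  74  77   8 163  51 189 186]);

% levels: sorted distinct intensities; idx: position of each pixel's level in levels
[levels, ~, idx] = unique(img);
N = numel(levels);
img_lut = reshape(idx, size(img));   % values 1..N, every level used at least once
```

Since every value from 1 to N occurs in img_lut, passing it to graycomatrix with 'NumLevels', N and 'GrayLimits', [] should leave no level unused.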
I was seeing the same behaviour with my own GLCM implementation.
The issue was in how I implemented the histogram equalization for a given number of gray levels.
I now compute the discretization of the image first, before dividing, and then check whether any row or column contains only zeros.
Chemical composition of a certain material
Hi,
I am trying to import the below mentioned data in CSV format in matlab, which is [1000x10] in dimensions.
HCL;H2SO4;CH4; SULPHUR;CHLORINE;S2O3;SO2;NH3;CO2;O2
144 2 3 141 140 6 7 137 136 10 11 133
13 131 130 16 17 127 126 20 21 123 122 24
25 119 118 28 29 115 114 32 33 111 110 36
108 38 39 105 104 42 43 101 100 46 47 97
96 50 51 93 92 54 55 89 88 58 59 85
61 83 82 64 65 79 78 68 69 75 74 72
73 71 70 76 77 67 66 80 81 63 62 84
60 86 87 57 56 90 91 53 52 94 95 49
48 98 99 45 44 102 103 41 40 106 107 37
109 35 34 112 113 31 30 116 117 27 26 120
121 23 22 124 125 19 18 128 129 15 14 132
12 134 135 9 8 138 139 5 4 142 143 1
I am able to import this data through my code
fid = fopen(uigetfile('.csv'),'rt');
FileName = fopen(fid);
headers = fgets(fid); %get first line
headers = textscan(headers,'%s','delimiter',';'); %read first line
format = repmat('%f',1,size(headers{1,1},1)); %count columns n makeformat string
data = textscan(fid,format,'delimiter',';'); %read rest of the file
data = [data{:}];
I am getting data in matrix form in variable data [1000x10] and name of all the components like HCL, H2SO4 in a cell array named headers{1x1}.
Now I have two questions. With MATLAB's built-in import feature you have the flexibility to import data as separate column vectors, a numeric matrix, a cell array, or a table. Is it possible to do the same through code, so that after the import I get column vectors in my workspace named after the headers: HCL as [1000x1], H2SO4 as [1000x1], and so on for all the columns?
If yes, then please help me.
If that is not possible, there is an alternative: I already have the names of the column vectors in the headers cell array. How can I extract those names and use them as variable names through code, so that I can assign the data from the data matrix [1000x10] to column vectors with the corresponding names?
like if i say
x = headers {1*1}{1*1}; i will get x = "HCL"
x = genvarname(x); I will get x= x0x22HCL0x2 BUT
I want that x get replaced with HCL.and then I assign
HCL = data(:,1) and same like this other variables H2SO4,SULPHUR, CHLORINE.
You can say i try to implement the import feature of column vector through my code.
Kindly help me to solve this issue. thanks
Have you tried the built-in readtable function?
You can access each column of the table by using the named column header.
If you'd like, you can use the two data types to create a table in MATLAB. I'm not terribly familiar with its use, but it seems to be well documented. I'm sure someone else can expand upon this.
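A minimal sketch of the readtable route follows. It writes a small hypothetical semicolon-delimited file first so that it is self-contained; the three column names and the values are made up for illustration, and a real file would be passed to readtable directly.

```matlab
% Create a small hypothetical semicolon-delimited file to read back
fname = [tempname '.csv'];
fid = fopen(fname, 'wt');
fprintf(fid, 'HCL;H2SO4;CH4\n');
fprintf(fid, '%d;%d;%d\n', [1 2 3; 4 5 6].');  % two data rows
fclose(fid);

% Read the file into a table; the header line supplies the variable names
T = readtable(fname, 'Delimiter', ';');

hcl = T.HCL;                           % one column, accessed by its header name
names = T.Properties.VariableNames;    % all header names as a cell array

delete(fname);
```

Each column stays attached to its name inside T, so there is no need to create separately named workspace variables.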
Edit:
After re-reading your question, I think this is closer to what you are after.
n=10;
what='HCL'; %change this to any of the strings you are interested in
numstr = repmat('%f',1,n);
hdrstr = repmat('%s',1,n);
%headers is the first line of the file, as read with fgets(fid) in the question
headers = textscan(headers,hdrstr,'delimiter',';');
headers = [headers{:}]; %flatten to a 1xn cell array of strings so strcmp can match
data = cell2mat(textscan(fid,numstr,'delimiter',';'));
datout = data(:,strcmp(headers,what)); %datout will be 1000x1 HCL data
Depending on what you want to do, you can loop through these appropriately
I know this is not what you asked for, but I would convert to a struct:
x=cell2struct(num2cell(data),headers,2)
The reason is simple: with individual variables, selecting for example the third row across all of them is not possible. With a struct, simply use x(3).
If at some point you need the vectors you originally asked for and you can't use the struct, use [x.HCL]
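The struct approach can be tried on a small hypothetical data set as below. The sketch assumes headers is a flat cell array of valid MATLAB names; the second half shows dynamic field names as another way to get one named column per header without creating workspace variables. The names S, thirdRow and hclColumn are introduced here for illustration.

```matlab
% Hypothetical stand-ins for the imported header names and data matrix
headers = {'HCL', 'H2SO4', 'CH4'};
data = [1 2 3; 4 5 6; 7 8 9];

% One struct element per row, one field per column
x = cell2struct(num2cell(data), headers, 2);
thirdRow = x(3);           % all components of the third sample
hclColumn = [x.HCL].';     % the HCL values again as a column vector

% Alternative: a scalar struct holding whole columns, via dynamic field names
S = struct();
for k = 1:numel(headers)
    S.(headers{k}) = data(:, k);
end
```

S.HCL then behaves like the named column vector asked for in the question, just namespaced inside S.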
Just started MATLAB 2 days ago and I can't figure out a non-loop method (since I read they were slow/inefficient and MATLAB has better alternatives) to perform a simple task.
I have a matrix of 5 columns and 270 rows. What I want to do is:
if the value of an element in column 5 of matrix goodM is below 90, I want to take that element and subtract it from 90.
So far I tried:
test = goodM(:,5) <= 90;
goodM(test) = 999;
It changes the goodM values in column 1, not column 5, into 999. In addition, this method doesn't allow me to perform operations on the elements below 90 in column 5. Is there an elegant solution for this?
edit: goodM(:,5)(test) = 999; doesn't seem to work either, so I have no idea how to specify the target column.
I am assuming you are looking to operate on elements that have values below 90 as your text in the question reads, rather than 'below or equal to' as represented by '<=' as used in your code. So try this -
ind = find(goodM(:,5) < 90) %// Find indices in column 5 that have values less than 90
goodM(ind,5) = 90 - goodM(ind,5) %// Operate on those elements using indices obtained from previous step
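The find call is optional here: a logical mask can be used as the row subscript directly. The reason goodM(test) = 999 touched column 1 is that a single logical index is applied along the column-major linear order of the matrix, so a 270-element mask only covers the first column. The small matrix below is hypothetical and only serves to illustrate this.

```matlab
% Hypothetical 4-by-5 matrix standing in for goodM
goodM = [ 1  2  3  4 176;
          5  6  7  8  26;
          9 10 11 12  90;
         13 14 15 16  82];

mask = goodM(:, 5) < 90;              % logical column: rows where column 5 is below 90
goodM(mask, 5) = 90 - goodM(mask, 5); % row subscript = mask, column subscript = 5
```

Only rows 2 and 4 change; the value 90 itself is left alone because the comparison is strict.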
Try this code:
b=90-a(a(:,5)<90,5);
For example:
a =
265 104 479 13 176
26 110 447 208 144
379 163 179 366 464
301 48 274 391 26
429 374 174 184 297
495 375 312 373 82
465 272 399 447 420
205 170 373 122 84
1 417 63 65 252
271 277 412 113 500
then,
b=90-a(a(:,5)<90,5);
b =
64
8
6
I would like to add a constant value of 180 to a vector of values after the maximum value is reached. That is, if H=[12 26 67 92 167 178 112 98 76 85], how do I write MATLAB code so that 180 is added to all values after 178? The answer should be H=[12 26 67 92 167 178 292 278 256 265].
This should work on earlier Matlab versions as well:
H=[12 26 67 92 167 178 112 98 76 85]
[n, n] = max(H);
H(n+1:end) = H(n+1:end) + 180
Try following:
n=find(H==max(H));
H(n+1:end)=H(n+1:end)+180;
Since the desired vector values are in increasing order, the idea here is to find the index of the maximum value and add 180 to all subsequent elements.
EDIT
Better approach for finding max index, as suggested by #LeonidBeschastny
[~,n]=max(H);
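One detail worth keeping in mind: when the maximum occurs more than once, max returns the index of its first occurrence, so the shift starts right after the first peak. If the shift should instead start after the last occurrence, find with the 'last' option is one way to do it. The tied vector below is hypothetical and only illustrates the difference.

```matlab
% Vector from the question: a single maximum at position 6
H = [12 26 67 92 167 178 112 98 76 85];
[~, n] = max(H);
H(n+1:end) = H(n+1:end) + 180;

% Hypothetical vector with a tied maximum
T = [1 5 5 2];
[~, nFirst] = max(T);                % first occurrence of the maximum
nLast = find(T == max(T), 1, 'last'); % last occurrence of the maximum

afterFirst = T;
afterFirst(nFirst+1:end) = afterFirst(nFirst+1:end) + 180;
afterLast = T;
afterLast(nLast+1:end) = afterLast(nLast+1:end) + 180;
```

Which of the two behaviours is wanted depends on the data; for a strictly single peak, as in the question, they coincide.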