Splitting up number by certain amount - matlab

I'm trying to split up numbers by a given value (4000) and have the numbers placed in an array
Example:
max value given is: 8202
So the split_array should be split by 4000 unless it gets to the end and it's less than 4000
in which case it just goes to the end.
start_pos, end_pos
0,4000
4001,8001
8002,8202
so the first row in the array would be
[0 4000]
second row would be
[4001 8001]
third row would be
[8002 8202]
Please note that the max value can change from 8202 to any other number, like 16034, but it is never a decimal.
How can I go about doing this using MATLAB / Octave?

This should produce what you want:
n = 8202;
a = [0:4001:n; [4000:4001:n-1 n]]'
which returns
a =
0 4000
4001 8001
8002 8202
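
For anyone who wants to check the boundary handling outside MATLAB, here is a minimal Python sketch of the same splitting. The name split_ranges and the chunk parameter are made up for illustration; each row spans chunk + 1 values, as in the example above.

```python
def split_ranges(n, chunk=4000):
    """Return [start, end] pairs covering 0..n, each spanning at most chunk + 1 values."""
    rows = []
    start = 0
    while start <= n:
        end = min(start + chunk, n)  # clip the final row at n
        rows.append([start, end])
        start = end + 1              # next row begins right after this one
    return rows

print(split_ranges(8202))
```

For n = 8202 this gives the same three rows as the MATLAB one-liner.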

Information on XXX.cnt obtained by an RSEM analysis

I obtained a "XXX.cnt" in a newly created "XXX.stat" directory after an RSEM-1.3.3 analysis.
Shown below is the content of the XXX.cnt.
0 2726098 0 2726098
1534055 1192043 1993977
9793897 1
0 0
1 732121
2 410181
3 513309
4 610475
5 90206
6 81551
7 63620
8 44947
9 33029
10 21745
11 22282
12 21545
13 13324
14 17247
.
.
.
What do these numbers mean?
Thank you in advance for your kindness.
The format and meaning of each field are described in "cnt_file_description.txt" in the RSEM directory.
http://deweylab.github.io/RSEM/rsem-calculate-expression.html#OUTPUT
https://github.com/bli25broad/RSEM_tutorial
Here is the content of that file:
# '#' marks the start of comments (till the end of the line)
# *.cnt file contains alignment statistics based purely on the alignment results obtained from aligners
N0 N1 N2 N_tot
# N0, number of unalignable reads; N1, number of alignable reads; N2, number of filtered reads due to too many alignments; N_tot = N0 + N1 + N2
nUnique nMulti nUncertain
# nUnique, number of reads aligned uniquely to a gene; nMulti, number of reads aligned to multiple genes; nUnique + nMulti = N1;
# nUncertain, number of reads aligned to multiple locations in the given reference sequences, which include isoform-level multi-mapping reads
nHits read_type
# nHits, number of total alignments.
# read_type: 0, single-end read, no quality score; 1, single-end read, with quality score; 2, paired-end read, no quality score; 3, paired-end read, with quality score
# The next section counts reads by the number of alignments they have. Each line contains two values separated by a TAB character. The first value is number of alignments. 'Inf' refers to reads filtered due to too many alignments. The second value is the number of reads that contain such many alignments
0 N0
...
number_of_alignments number_of_reads_with_that_many_alignments
...
Inf N2
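
To tie the description back to the numbers in the question, here is a small Python sketch that labels the first three lines of a .cnt file. The function name parse_cnt_header is hypothetical (it is not part of RSEM); the field names and the two identities it checks come from the description above.

```python
def parse_cnt_header(text):
    """Parse the first three lines of an RSEM .cnt file into named counts."""
    rows = [line.split() for line in text.strip().splitlines()[:3]]
    n0, n1, n2, n_tot = (int(v) for v in rows[0])
    n_unique, n_multi, n_uncertain = (int(v) for v in rows[1])
    n_hits, read_type = (int(v) for v in rows[2])
    return {
        "N0": n0, "N1": n1, "N2": n2, "N_tot": n_tot,
        "nUnique": n_unique, "nMulti": n_multi, "nUncertain": n_uncertain,
        "nHits": n_hits, "read_type": read_type,
    }

# First three lines of the XXX.cnt shown in the question
sample = """0 2726098 0 2726098
1534055 1192043 1993977
9793897 1"""

stats = parse_cnt_header(sample)
# The two identities stated in cnt_file_description.txt
print(stats["N0"] + stats["N1"] + stats["N2"] == stats["N_tot"])
print(stats["nUnique"] + stats["nMulti"] == stats["N1"])
```

Read this way, the file in the question reports 0 unalignable reads, 2726098 alignable reads, no reads filtered for too many alignments, and read_type 1 (single-end reads with quality scores).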

Lotto code: the previous numbers cannot appear again. How do I improve it?

I wrote this code in MATLAB, and there seems to be something wrong with the logic, but I don't know where the mistake is or how to fix it.
I want to write a lotto code with six main numbers plus a last number: the first six numbers range from 1 to 38, and the last number ranges from 1 to 8. Here is my code:
previous_number=randi([1,38],1,6)
last=randi([1,8],1,1) %produce the last number
for k =1:6
while last==previous_number(k) %while the last number equals the k-th previous number
last=randi([1,8],1,1) %draw the last number again until a different value comes up
end
end
ltto=[previous_number last]
But I found that the last number can still be the same as one of the first six numbers, for example:
"1" 2 33 55 66 10 "1"
1 "2" 33 55 66 10 "2"
Why? I have already written
while last==previous_number(k)
last=randi([1,8],1,1)
end
If I want to write the code in C or another programming language, I think I can only use if, while, loops, etc., like this basic loop; I can't use "ismember" or randperm. How can I rewrite the code?
If I rewrite it as
previous_number=randi([1,38],1,6)
last=randi([1,8],1,1) %produce the last number
for k =1:6
if last==previous_number(k) %if the last number equals the k-th previous number
last=randi([1,8],1,1) %draw the last number again
end
end
ltto=[previous_number last]
the result will still sometimes show "1" 2 21 12 13 22 "1".
This occurs because you iterate over the numbers and redraw last based only on the current iteration, without rechecking the earlier ones.
For example, in your example data, suppose last = 10 when you reach the sixth iteration: you find that last is equal to previous_number(k), which is 10, so you replace it. But the new draw can be 1, which matches the first number, and since nothing goes back to check it, both the while loop and the for loop finish.
The solution is to compare last against the whole vector instead of iterating over it:
previous_number = randi([1,38],1,6);
last = previous_number(1); % start from a value that is in the vector, so the loop runs at least once
while ismember(last, previous_number)
last = randi(8); %produce the last number
end
[Following the discussion in the comments:]
If you still want to compare each element separately, you can do it like this:
previous_number=randi([1,38],1,6)
last=randi(8)
k=0;
while k <= 5
k = k + 1;
if last == previous_number(k)
last = randi(8);
k = 0;
end
end
ltto=[previous_number last]
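
Since the question asks for something that ports to C or another language using only basic loops, here is the same restart-the-scan idea as a minimal Python sketch. The helper name draw_last and its parameters are invented for illustration; it uses only a while loop and an if, with no membership helper.

```python
import random

def draw_last(previous_numbers, low=1, high=8):
    """Draw a number in [low, high] that differs from every entry in previous_numbers.

    Assumes previous_numbers does not cover every value in [low, high];
    otherwise the loop would never end. Six numbers cannot cover 1..8.
    """
    last = random.randint(low, high)
    k = 0
    while k < len(previous_numbers):
        if last == previous_numbers[k]:
            last = random.randint(low, high)  # collision: draw again
            k = 0                             # and restart the scan from the first entry
        else:
            k += 1
    return last

previous_numbers = [random.randint(1, 38) for _ in range(6)]
lotto = previous_numbers + [draw_last(previous_numbers)]
print(lotto)
```

Resetting k after every redraw is what the original for loop was missing: a new value is only accepted once it has been compared against all six numbers.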

Perl: getting count for range of values

I want to count values by range, i.e., I have 900 values of X between 1 and 75x10^6. I need to count how many of these X's fall in ranges like 1-1000000, 1000001-2000000, 2000001-3000000 ... (750 ranges), then return the counts for these ranges.
I have the values of X stored in an array, so I could do it with a for loop and if..else, but writing 750 if-else branches is no solution, and I don't know how to use a value range as a hash key. Please help.
Thank you in advance :)
For each value, you can subtract 1, divide by 1000000, and cut off any decimals. That gives you the index of the range as a number between 0 and 749 (inclusive).
Example:
use strict;
use warnings;
my @values = (...); # filled from somewhere
my @range_count;
for my $value (@values) {
    my $x = int(($value - 1) / 1e6);
    $range_count[$x]++;
}
Now $range_count[0] contains the number of values in the first range, $range_count[1] the number of values in the second range, etc.
However, if there were no values in some range, the count will be undef, not 0. If this difference is important, define @range_count as
my @range_count = (0) x 750;
instead.
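
The index arithmetic is easy to get off by one at the range boundaries, so here is a short Python sketch of the same binning for checking it. The function name count_ranges and the sample values are made up for illustration; integer division replaces int(... / 1e6).

```python
def count_ranges(values, width=1_000_000, n_ranges=750):
    """Count how many values fall in 1..width, width+1..2*width, and so on."""
    counts = [0] * n_ranges           # pre-filled with 0, like (0) x 750
    for value in values:
        index = (value - 1) // width  # subtract 1 so that width itself lands in bin 0
        counts[index] += 1
    return counts

# Hypothetical sample values chosen to sit on range boundaries
values = [1, 1_000_000, 1_000_001, 2_500_000, 75_000_000]
counts = count_ranges(values)
print(counts[0], counts[1], counts[2], counts[74])
```

Subtracting 1 before dividing is what keeps 1000000 in the first range and 1000001 in the second.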

Rows without repetitions - MATLAB

I have a matrix (4096x4) containing all possible combinations of four values taken from a pool of 8 numbers.
...
3 63 39 3
3 63 39 19
3 63 39 23
3 63 39 39
...
I am only interested in the rows of the matrix that contain four unique values. In the above section, for example, the first and last row should be removed, giving us -
...
3 63 39 19
3 63 39 23
...
My current solution feels inelegant-- basically, I iterate across every row and add it to a result matrix if it contains four unique values:
result = [];
for row = 1:size(matrix,1)
if length(unique(matrix(row,:)))==4
result = cat(1,result,matrix(row,:));
end
end
Is there a better way ?
Approach #1
diff and sort based approach that must be pretty efficient -
sortedmatrix = sort(matrix,2)
result = matrix(all(diff(sortedmatrix,[],2)~=0,2),:)
Breaking it down into a few steps for explanation:
Sort each row (along dimension 2) so that duplicate values within a row end up next to each other. sort does this.
Take the difference between consecutive elements of each sorted row; a zero difference marks a duplicate. diff does this.
A row with at least one zero difference contains a repeated value, while a row with no zeros contains four distinct values, which is what we want in the output. all gives a logical column vector marking those rows.
Finally, use that logical vector to index into matrix and select the rows for the output.
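
The steps above can be sketched in plain Python as follows. This is only an illustration of the sort-then-compare-neighbours idea, not a replacement for the vectorized MATLAB code; the function name rows_without_repeats is made up.

```python
def rows_without_repeats(matrix):
    """Keep only the rows whose values are all distinct."""
    kept = []
    for row in matrix:
        ordered = sorted(row)  # duplicates become adjacent after sorting
        # a row qualifies when no two neighbouring sorted values are equal
        if all(a != b for a, b in zip(ordered, ordered[1:])):
            kept.append(row)
    return kept

# The four rows shown in the question
matrix = [
    [3, 63, 39, 3],
    [3, 63, 39, 19],
    [3, 63, 39, 23],
    [3, 63, 39, 39],
]
print(rows_without_repeats(matrix))
```

On the four rows from the question this keeps the second and third rows, in their original (unsorted) order.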
Approach #2
This is a more experimental bsxfun-based approach; it is not memory-efficient, since it builds an N x 4 x 4 logical array for an N x 4 matrix -
matches = bsxfun(@eq,matrix,permute(matrix,[1 3 2]))
result = matrix(all(all(sum(matches,2)==1,2),3),:)
Breaking it down into a few steps for explanation:
Find a logical array of matches for every element against all others in the same row with bsxfun.
Look for "non-duplicity" by summing those matches along dim-2 of matches: an element that matches only itself has a sum of 1. Then require that this holds for every element, using all along dim-2 and dim-3. This gives the same logical indexing array as the previous diff + sort based approach.
Use that logical indexing array to select the appropriate rows from matrix for the final output.
Approach #3
Taking help from MATLAB File-exchange's post combinator
and assuming you have the pool of 8 values in an array named pool8, you can directly get result like so -
result = pool8(combinator(8,4,'p'))
combinator(8,4,'p') basically gets us the indices for 8 elements taken 4 at once and without repetitions. We use these indices to index into the pool and get the expected output.
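
For comparison, the same "generate only the valid rows directly" idea can be sketched in Python with the standard library's itertools.permutations, which plays the role of combinator(8,4,'p') here. The contents of pool8 are placeholders, since the question does not list the full pool.

```python
from itertools import permutations

# Placeholder pool of 8 distinct values; the real pool is not given in the question
pool8 = [3, 63, 39, 19, 23, 6, 7, 8]

# Every ordered selection of 4 distinct pool entries
result = [list(p) for p in permutations(pool8, 4)]

print(len(result))  # 8 * 7 * 6 * 5 = 1680 rows
```

This skips building the full 4096-row matrix and filtering it afterwards, at the cost of producing the rows in the generator's order rather than the original matrix order.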
This will work for a pool of finite size. Create an IsUnique array, go through each number in the pool, count how many times it appears in each row, and keep IsUnique at 1 only if the number appears at most once. Then find the positions where IsUnique is still 1 and extract those rows.
matrix = [3,63,39,3;3,63,39,19;3,63,39,23;3,63,39,39;3,63,39,39;3,63,39,39];
IsUnique = ones(size(matrix,1),1);
pool = [3,63,39,19,23,6,7,8];
for NumberInPool = 1:8
Temp = sum((matrix == pool(NumberInPool))')';
IsUnique = IsUnique .* (Temp<2);
end
UniquePositions = find(IsUnique==1);
result = matrix(UniquePositions,:)

Using SUM and UNIQUE to count occurrences of value within subset of a matrix

So, presume a matrix like so:
20 2
20 2
30 2
30 1
40 1
40 1
I want to count the number of times 1 occurs for each unique value of column 1. I could do this the long way by [sum(x(1:2,2)==1)] for each value, but I think this would be the perfect use for the UNIQUE function. How could I fix it so that I could get an output like this:
20 0
30 1
40 2
Sorry if the solution seems obvious, my grasp of loops is very poor.
Indeed unique is a good option:
u=unique(x(:,1))
res=arrayfun(@(y)length(x(x(:,1)==y & x(:,2)==1)),u)
Taking apart that last line:
arrayfun(fun,array) applies fun to each element of array and returns the results in a new array.
Here fun is the anonymous function @(y)length(x(x(:,1)==y & x(:,2)==1)), which finds the length of the portion of x selected by the condition x(:,1)==y & x(:,2)==1 (this is called logical indexing). So for each unique element, it counts the rows of x whose first column is that element and whose second column is 1.
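
The same per-value count can be written as one explicit pass, shown here as a minimal Python sketch. The function name count_matches is made up; the data is the matrix from the question.

```python
def count_matches(rows, target=1):
    """For each distinct first-column value, count rows whose second column equals target."""
    counts = {}
    for key, value in rows:
        counts.setdefault(key, 0)  # register the key even if it never matches
        if value == target:
            counts[key] += 1
    return sorted(counts.items())

x = [[20, 2], [20, 2], [30, 2], [30, 1], [40, 1], [40, 1]]
print(count_matches(x))
```

Registering every key before testing the second column is what keeps 20 in the output with a count of 0.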
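Try this (as specified in this answer), where a is the matrix from the question:
>> [c,~,d] = unique(a(a(:,2)==1))
c =
30
40
d =
1
2
2
>> counts = accumarray(d(:),1,[],@sum)
counts =
1
2
>> res = [c,counts]
Note that values of column 1 that never appear with a 1 in column 2 (20 in this example) are not included in res.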
Suppose you have an array of various integers in 'array'.
The tabulate function will sort the unique values and count the occurrences.
table = tabulate(array)
Look for the count of each unique value in column 2 of table.