I have a model that can be represented by multiple linear segments, as below:
Y
| _/_________\_
| / \
| / \
| / \
|/ \
|
|________________________ X
I need to find the value of Y for a given value of X.
My initial thought was to store each segment as a relational line type {A, B, C}. However, I'm not sure what that would buy me in terms of writing a query to retrieve the Y value.
Since you are working with linear segments, you should use the lseg data type (the line data type represents a line of infinite length). Once you have your data in that format you can find the intersection of the segments with a vertical line of infinite length at the desired value of X and extract the Y value of the intersection.
CREATE TABLE segments (id int, seg lseg);
INSERT INTO segments VALUES
(1, '[(4,3), (12,15)]'), -- positively inclined line segment
(2, '[(2,19), (24,-4)]'), -- negatively inclined line segment
(3, '[(4,3), (12,3)]'), -- horizontal line segment
(4, '[(5,3), (5,15)]'), -- vertical line segment, collinear at X=5
(5, '[(4,3), (4,15)]'); -- vertical line segment, no intersection at X=5
and then:
test=# SELECT id, 5 AS x, (seg # '((5,-999999999), (5,999999999))'::lseg)[1] AS y
test-# FROM segments;
id | x | y
----+---+------------------
1 | 5 | 4.5
2 | 5 | 15.8636363636364
3 | 5 | 3
4 | 5 |
5 | 5 |
(5 rows)
As is obvious from the above, collinear line segments (i.e. vertical line segments with the same value for X) and segments without intersection return NULL for Y.
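The geometry behind that query can be sketched outside the database. The following is a minimal Python illustration, not PostgreSQL's implementation: the helper name y_at and the tuple representation of a segment are assumptions made for this example. It interpolates along the segment and returns None where the query above returns NULL.

```python
def y_at(segment, x):
    """Return Y where the segment crosses the vertical line at x, or None."""
    (x1, y1), (x2, y2) = segment
    if x1 == x2:
        # Vertical segment: either collinear with the line or no intersection,
        # so there is no single Y value to report.
        return None
    if not (min(x1, x2) <= x <= max(x1, x2)):
        return None  # x lies outside the segment's extent
    t = (x - x1) / (x2 - x1)  # fractional position along the segment
    return y1 + t * (y2 - y1)


# Same sample data as the segments table above.
segments = {
    1: ((4, 3), (12, 15)),
    2: ((2, 19), (24, -4)),
    3: ((4, 3), (12, 3)),
    4: ((5, 3), (5, 15)),
    5: ((4, 3), (4, 15)),
}

for seg_id, seg in segments.items():
    print(seg_id, y_at(seg, 5))
```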
I'm trying to figure out how the Graphite summarize function works. I've the following data points, where X-axis represents time, and Y-axis duration in ms.
+-------+------+
| X | Y |
+-------+------+
| 10:20 | 0 |
| 10:30 | 1585 |
| 10:40 | 356 |
| 10:50 | 0 |
+-------+------+
When I pick any time window on Grafana more than or equal to 2 hours (why?), and apply summarize('1h', avg, false), I get a triangle starting at (9:00, 0) and ending at (11:00, 0), with the peak at (10:00, 324).
A formula that a colleague came up with to explain the above observation is as follows.
Let:
a = Number of data points for a peak, in this case 4.
b = Number of non-zero data points, in this case 2.
Then avg = sum / (a + b). It produces (1585+356) / 6 = 324 but doesn't match with the definition of any mean I know of. What is the math behind this?
Your data is at 10-minute intervals, so there are 6 points in each 1-hour period. Graphite simply takes the sum of the non-null values in each period and divides it by their count (a standard average). If you look at the raw series, you'll likely find that there are also zero values at 10:00 and 10:10, which gives (0 + 0 + 0 + 1585 + 356 + 0) / 6 = 323.5, or about 324.
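The arithmetic can be checked with a small Python model. This is only a sketch of the averaging described above, not Graphite's code; the function name hourly_average is hypothetical, and the zero values at 10:00 and 10:10 are the assumption suggested in the answer.

```python
from collections import defaultdict

# Six 10-minute samples for the 10:00 hour. The zeros at 10:00 and 10:10
# are assumed, as suggested in the answer above.
points = [
    ("10:00", 0), ("10:10", 0), ("10:20", 0),
    ("10:30", 1585), ("10:40", 356), ("10:50", 0),
]


def hourly_average(samples):
    """Group samples by hour and average the non-null values in each hour."""
    buckets = defaultdict(list)
    for stamp, value in samples:
        if value is None:
            continue  # null points do not count towards the average
        hour = stamp.split(":")[0] + ":00"
        buckets[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in buckets.items()}


averages = hourly_average(points)
print(averages)
```

The zeros are real data points, so they count in the denominator; only nulls would be skipped.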
I need to calculate the Z-Index (Morton) of a point on a plane from its 2 coordinates x, y.
Traditionally this is solved by bit interleaving.
However I have boundaries, and I want the z-index of the point to only increase the morton count when it's inside the active area, and skip the count when outside.
To be clear, the typical z order in a 4x4 square is:
| 0 1 4 5 |
| 2 3 6 7 |
| 8 9 12 13 |
| 10 11 14 15 |
However if I have a 3x3 active area, I want the index to be calculated like this:
| 0 1 4 x |
| 2 3 5 x |
| 6 7 8 x |
| x x x x |
As you can see the 00-11 quad is full, the 02-13 is skipping the count for the 2 points that fall outside of the active area, same for 20-31, and for 22-33.
Important: I want to do this without iterating.
Is there a known solution for this problem?
I was able to find an answer to this question at https://fgiesen.wordpress.com/2009/12/13/decoding-morton-codes/
To handle rectangular regions, round each dimension up to the next power of 2, interleave the bits the two axes have in common, and pack the remaining bits of the major axis linearly.
For example, to encode the point (2,3) in a 5x4 rectangle:
Rounding 5x4 up to powers of 2 gives 8x4, i.e. 3 bits for x and 2 bits for y.
Encoding the point (2,3):
First interleave the low 2 bits of x = 0b010 and y = 0b11 to get 0b1110; the 3rd bit of the x dimension then becomes the 5th bit of the result.
Encoding the point (4,3):
x = 0b100 and y = 0b11 become 0b11010.
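The packing described above can be sketched in Python. This is an illustrative reading of the method, not the code from the linked article: the function names interleave2 and encode_rect are made up for this example, and it assumes y occupies the odd bit positions, which is what the worked value 0b1110 implies.

```python
def interleave2(x, y, bits):
    """Interleave the low `bits` bits of x (even positions) and y (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def encode_rect(x, y, x_bits, y_bits):
    """Morton-style code for a rectangle whose sides use x_bits and y_bits bits."""
    shared = min(x_bits, y_bits)
    mask = (1 << shared) - 1
    low = interleave2(x & mask, y & mask, shared)
    # The leftover high bits of the longer (major) axis are packed linearly
    # above the interleaved part.
    high = (x >> shared) if x_bits >= y_bits else (y >> shared)
    return (high << (2 * shared)) | low


print(bin(encode_rect(0b010, 0b11, 3, 2)))
print(bin(encode_rect(0b100, 0b11, 3, 2)))
```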
To find the z-order of a 3x3 region, build the inverse mapping for the 4x4 region by reversing the method above.
While generating the map, skip any points that fall outside the 3x3 region.
The mapping would look like:
(0,0) -> (0,0)
(0,1) -> (1,0)
(0,2) -> (0,1)
(0,3) -> (1,1)
(1,0) -> (2,0)
(1,2) -> (2,1)
(2,0) -> (0,2)
(2,1) -> (1,2)
(3,0) -> (2,2)
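One way to build such a map is sketched below in Python. It is not the code from the gist; the names decode2 and compact_z_index are hypothetical. It walks the full Morton order of the enclosing power-of-2 square once, numbers only the points inside the active area, and stores the result in a dictionary, so the iteration happens when the table is built and each later lookup is a single dictionary access.

```python
def decode2(code, bits):
    """Split a Morton code into (x, y): x from even bit positions, y from odd."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


def compact_z_index(width, height):
    """Map each (x, y) inside width x height to its rank in Morton order."""
    bits = max(width - 1, height - 1, 1).bit_length()
    table = {}
    for code in range(1 << (2 * bits)):
        x, y = decode2(code, bits)
        if x < width and y < height:
            table[(x, y)] = len(table)  # points outside the area are skipped
    return table


table = compact_z_index(3, 3)
for y in range(3):
    print([table[(x, y)] for x in range(3)])
```

For the 3x3 case this reproduces the layout requested in the question.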
This Python code might be useful: https://gist.github.com/kannaiah/4eb936b047a987b32555b2642a0979f7
I have two sets of data from different instruments that have common X-variables (XThompsons) but various Y-variables (YCounts) due to various experimental conditions. The data resemble the example below:
[Table1]
XThompsons | YCounts (1) | YCounts (2) | YCounts (3) | .... | ....
------------------------------------------------------------------
[Table2]
XThompsons | YCounts (1) | YCounts (2) | YCounts (3) | .... | ....
------------------------------------------------------------------
When I have two sets of data like this, I use a script that takes a single Y column from Table1 and does some math against every Y column in Table2. However, when comparing two columns, if either column has a value at or below a specific threshold (0.10), I want to delete that row. In the example below I want to delete rows 4 and 6 because one of the columns contains a value of 0.10 or less.
XThompsons | Table1.YCounts(1) | Table2.YCounts(2)
--------------------------------------------------
1 1.00 0.50
2 0.22 0.12
3 0.29 0.14
4 0.29 0.09 (delete row)
5 0.11 0.49
6 0.02 0.83 (delete row)
How can I carry this out in MATLAB? My current code is below; I convert each table column to an array first. How can I make it delete the row if Y is 0.10 or less?
datax = readtable('table1.xls'); % Instrument 1
datay = readtable('table2.xls'); % Instrument 2
SIDATA = [];
for idx = 2:width(datax)
    % Read the indexed column of datax (instrument 1), then normalize to 1
    x = table2array(datax(:,idx));
    x = x ./ max(x);
    % Loop over every Y column of datay (instrument 2)
    for idy = 2:width(datay)
        % Normalize y data to 1
        y = table2array(datay(:,idy));
        y = y ./ max(y);
        % Calculate the similarity index (SI) of the current datax column
        % against each datay column (all collision energies)
        xynum = sum(sqrt(x) .* sqrt(y));
        xyden = sqrt(sum(x) .* sum(y));
        SIDATA(idy,idx) = (xynum/xyden);
    end
end
Help would be appreciated.
Thanks!
Generally, when looping through a matrix and pruning values, you want to iterate from the last row back to the first; this way, if you delete any rows, you don't skip any. (If you delete row 2 and then advance to row 3, you skip the data that was formerly in row 3.)
To me, the easiest way to do this, if all your data is in one matrix A with columns Y1 and Y2, is:
APruned = A((A(:,1) > 0.1) & (A(:,2) > 0.1),:)
This takes the A matrix, finds the rows where Y1 > 0.1, finds the rows where Y2 > 0.1, finds the overlap, and then outputs only the rows in A where both of these are true.
You should read about logical indexing for more on this topic.
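For readers without MATLAB at hand, the same keep-only-matching-rows idea can be sketched in plain Python using the example table from the question. This is an analogue of the logical-indexing one-liner above, not a translation of any MATLAB internals; the names rows, THRESHOLD and kept are chosen for this illustration.

```python
# (XThompsons, Table1.YCounts(1), Table2.YCounts(2)) from the question's example.
rows = [
    (1, 1.00, 0.50),
    (2, 0.22, 0.12),
    (3, 0.29, 0.14),
    (4, 0.29, 0.09),
    (5, 0.11, 0.49),
    (6, 0.02, 0.83),
]

THRESHOLD = 0.10

# Build a new list instead of deleting in place, so no row is skipped.
kept = [row for row in rows if row[1] > THRESHOLD and row[2] > THRESHOLD]

for row in kept:
    print(row)
```

Because a new list is built rather than rows being removed during the loop, the skipping problem described above does not arise.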
EDIT: It looks like you could also clean up your earlier code using element-wise operations (assuming datax and datay hold numeric arrays, e.g. after table2array):
A = [datax./max(datax) datay./max(datay)];
I'm using the following code to get specgram2D from np array:
specgram2D, freq, time = mlab.specgram(samples, Fs=11025, NFFT=1024, window=mlab.window_hanning, noverlap=int(1024 * 0.5))
Then I print out specgram2D like
print(len(specgram2D))  # prints 513
I got 513 instead of the expected 512, which is half the window size.
What am I doing wrong?
Can I just ignore specgram2D[512]?
> I got 513 instead of expected 512 which is half the window size.
> What am I doing wrong?
For a real-valued signal, the frequency spectrum obtained from the Discrete Fourier Transform (DFT) is symmetric and hence only half of the spectrum is necessary to describe the entire spectrum (since the other half can be obtained from symmetry). That is probably why you are expecting the size to be exactly half the input window size of 1024.
The problem is that with even-sized inputs, the midpoint of the spectrum falls exactly on a frequency bin. As a result, that frequency bin is its own mirror image. The symmetry can be seen in the following diagram:
frequency:   0   fs/N    ...       fs/2        ...          fs
bin number:  0     1     ...   511  512  513   ...   1023  1024
             ^     ^      ^     ^         ^     ^     ^     ^
             |     |      |     |---------|     |     |     |
             |     |      |                     |     |     |
             |     |      |---------------------|     |     |
             |     |                                  |     |
             |     |----------------------------------|     |
             |                                              |
             |----------------------------------------------|
Where N is the size of the FFT (as determined by the NFFT=1024 parameter) and fs is the sampling frequency. As you can see, the spectrum is fully specified by taking bins 0 to 512, inclusive. Correspondingly, you should expect the size to be floor(N/2)+1 (simply N/2 + 1 with integer division, but I included the floor to emphasize the round-down operation), or 513 in your case.
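The symmetry and the floor(N/2)+1 count can be checked numerically with a small Python sketch. It uses a naive DFT written for illustration (the helper names dft and one_sided_bins are hypothetical, and this is not how mlab.specgram computes its output) on an arbitrary 8-sample real signal.

```python
import cmath


def dft(samples):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(samples)
    return [
        sum(s * cmath.exp(-2j * cmath.pi * k * t / n) for t, s in enumerate(samples))
        for k in range(n)
    ]


def one_sided_bins(nfft):
    """Number of bins needed to describe the spectrum of a real signal."""
    return nfft // 2 + 1


signal = [0.0, 1.0, 3.0, -2.0, 0.5, 4.0, -1.0, 2.0]  # arbitrary real samples
spectrum = dft(signal)
n = len(signal)

# Bin k mirrors bin n - k (complex conjugates); bin n/2 mirrors itself.
for k in range(1, n // 2 + 1):
    mirror = spectrum[n - k].conjugate()
    print(k, n - k, abs(spectrum[k] - mirror) < 1e-9)

print(one_sided_bins(1024))
```

With n = 8 the loop pairs bins 1-7, 2-6 and 3-5, and bin 4 with itself, so 5 bins (0 to 4) describe the whole spectrum, just as bins 0 to 512 do for N = 1024.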
> Can I just ignore specgram2D[512]?
As shown above, it is an integral part of the spectrum, but many applications do not require every single frequency bin, so whether you can ignore it depends on whether your application is mostly interested in other frequency components.
I use combnk to generate a list of combinations. How can I generate a subset of combinations that always includes particular values? For example, for combnk(1:10, 2) I only need the combinations which contain 3 and/or 5. Is there a quick way to do this?
Well, in your specific example, choosing two integers from the set {1, ..., 10} such that one of the chosen integers is 3 or 5 yields 9+9-1 = 17 known combinations, so you can just enumerate them.
In general, finding all of the n-choose-k combinations from integers {1, ..., n} that contain integer m is the same as finding the (n-1)-choose-(k-1) combinations from integers {1, ..., m-1, m+1, ..., n} and adding m to each one.
In MATLAB, the combinations of the remaining elements would be
combnk([1:m-1 m+1:n], k-1)
(This code is still valid even if m is 1 or n.)
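The same idea can be sketched in Python with itertools.combinations, including the step of adding m back to each combination. The function name combos_with is hypothetical and is used only for this illustration.

```python
from itertools import combinations


def combos_with(n, k, m):
    """All k-element combinations of 1..n that contain m."""
    others = [v for v in range(1, n + 1) if v != m]
    # Choose the remaining k-1 elements from the other n-1 values,
    # then add m to each choice.
    return [tuple(sorted(rest + (m,))) for rest in combinations(others, k - 1)]


pairs_with_3 = combos_with(10, 2, 3)
print(len(pairs_with_3))
print(pairs_with_3)
```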
For a brute force solution, you can generate all your combinations with COMBNK then use the functions ANY and ISMEMBER to find only those combinations that contain one or more of a subset of numbers. Here's how you can do it using your above example:
v = 1:10; %# Set of elements
vSub = [3 5]; %# Required elements (i.e. at least one must appear in the
%# combinations that are generated)
c = combnk(v,2); %# Find pairwise combinations of the numbers 1 through 10
rowIndex = any(ismember(c,vSub),2); %# Get row indices where 3 and/or 5 appear
c = c(rowIndex,:); %# Keep only combinations with 3 and/or 5
EDIT:
For a more elegant solution, it looks like Steve and I had a similar idea. However, I've generalized the solution so that it works for both an arbitrary number of required elements and for repeated elements in v. The function SUBCOMBNK will find all the combinations of k values taken from a set v that include at least one of the values in the set vSub:
function c = subcombnk(v,vSub,k)
%#SUBCOMBNK All combinations of the N elements in V taken K at a time and
%# with one or more of the elements in VSUB as members.

  %# Error-checking (minimal):
  if ~all(ismember(vSub,v))
    error('The values in vSub must also be in v.');
  end

  %# Initializations:
  index = ismember(v,vSub); %# Index of elements in v that are in vSub
  vSub = v(index);          %# Get elements in v that are in vSub
  v = v(~index);            %# Get elements in v that are not in vSub
  nSubset = numel(vSub);    %# Number of elements in vSub
  nElements = numel(v);     %# Number of elements in v
  c = [];                   %# Initialize combinations to empty

  %# Find combinations:
  for kSub = max(1,k-nElements):min(k,nSubset)
    M1 = combnk(vSub,kSub);
    if kSub == k
      c = [c; M1];
    else
      M2 = combnk(v,k-kSub);
      c = [c; kron(M1,ones(size(M2,1),1)) repmat(M2,size(M1,1),1)];
    end
  end

end
You can test this function against the brute force solution above to see that it returns the same output:
cSub = subcombnk(v,vSub,2);
setxor(c,sort(cSub,2),'rows') %# Returns an empty matrix if c and cSub
%# contain exactly the same rows
I further tested this function against the brute force solution using v = 1:15; and vSub = [3 5]; for values of N (the number of elements in each combination) ranging from 2 to 15. The combinations created were identical, but SUBCOMBNK was significantly faster, as shown by the average run times (in msec) displayed below:
N | brute force | SUBCOMBNK
---+-------------+----------
2 | 1.49 | 0.98
3 | 4.91 | 1.17
4 | 17.67 | 4.67
5 | 22.35 | 8.67
6 | 30.71 | 11.71
7 | 36.80 | 14.46
8 | 35.41 | 16.69
9 | 31.85 | 16.71
10 | 25.03 | 12.56
11 | 19.62 | 9.46
12 | 16.14 | 7.30
13 | 14.32 | 4.32
14 | 0.14 | 0.59* #This could probably be sped up by checking for
15 | 0.11 | 0.33* #simplified cases (i.e. all elements in v used)
Just to improve Steve's answer: in your case (you want all combinations with 3 and/or 5) it will be:
all combinations of k-1 elements from the other n-2, with 3 added
all combinations of k-1 elements from the other n-2, with 5 added
all combinations of k-2 elements from the other n-2, with 3 and 5 added
This is easily generalized to any other case of this type.
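The three groups above can be sketched in Python and compared against a brute-force filter. The function name combos_with_either and its default arguments are assumptions made for this illustration.

```python
from itertools import combinations


def combos_with_either(n, k, a=3, b=5):
    """All k-element combinations of 1..n that contain a and/or b."""
    others = [v for v in range(1, n + 1) if v not in (a, b)]
    result = []
    # Combinations containing a but not b.
    for rest in combinations(others, k - 1):
        result.append(tuple(sorted(rest + (a,))))
    # Combinations containing b but not a.
    for rest in combinations(others, k - 1):
        result.append(tuple(sorted(rest + (b,))))
    # Combinations containing both a and b (only possible when k >= 2).
    if k >= 2:
        for rest in combinations(others, k - 2):
            result.append(tuple(sorted(rest + (a, b))))
    return result


found = combos_with_either(10, 2)
brute_force = [c for c in combinations(range(1, 11), 2) if 3 in c or 5 in c]
print(len(found), len(brute_force))
```

For n = 10 and k = 2 the groups contain 8, 8 and 1 combinations, which matches the 9+9-1 = 17 count given earlier.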