Pulling microarray data by MNI coordinates - allen-sdk

How can I write a RESTful query to get human microarray data by MNI coordinate?
I would like to pull a CSV of all the microarray-measured gene expression levels within a volume of MNI space, or by structure but with the MNI coordinate each microarray sample came from.

These are provided in the zipped spreadsheets you can download here:
http://human.brain-map.org/static/download
There are mni_x, mni_y, and mni_z columns in the samples spreadsheet.
Via RMA, you can download the T1->MNI affine transform for a given donor (e.g. name H0351.2001, id 9861) as follows:
http://api.brain-map.org/api/v2/data/Specimen/query.xml?criteria=[name$eq'H0351.2001']&include=alignment3d
The return can be reshaped into a matrix like this (MATLAB syntax):
M = [ x.tvr_00 x.tvr_01 x.tvr_02 x.tvr_09;
      x.tvr_03 x.tvr_04 x.tvr_05 x.tvr_10;
      x.tvr_06 x.tvr_07 x.tvr_08 x.tvr_11 ];
Then you can take a sample's T1 (x, y, z) coordinates, append a 1 to form a homogeneous coordinate, premultiply by that 3x4 matrix, and you have coarse MNI coordinates.
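As a concrete illustration, the reshaping and premultiplication can be sketched in Python. The tvr values below are made up for the example; the real tvr_00..tvr_11 fields come from the alignment3d element in the RMA response for the donor:

```python
import numpy as np

# Hypothetical alignment3d values; real tvr_00..tvr_11 come from the RMA XML.
tvr = [1.0, 0.0, 0.0,    # tvr_00..tvr_02 (first rotation/scale row)
       0.0, 1.0, 0.0,    # tvr_03..tvr_05
       0.0, 0.0, 1.0,    # tvr_06..tvr_08
       -1.5, 2.0, 0.5]   # tvr_09..tvr_11 (translation)

# Same reshaping as the MATLAB snippet above: a 3x4 affine [R | t].
M = np.array([[tvr[0], tvr[1], tvr[2], tvr[9]],
              [tvr[3], tvr[4], tvr[5], tvr[10]],
              [tvr[6], tvr[7], tvr[8], tvr[11]]])

def t1_to_mni(xyz):
    """Append 1 for a homogeneous coordinate, then premultiply by M."""
    return M @ np.append(np.asarray(xyz, dtype=float), 1.0)

print(t1_to_mni([10.0, 20.0, 30.0]))  # -> [ 8.5 22.  30.5]
```

With an identity rotation the result is just the T1 coordinate shifted by the translation column, which makes the bookkeeping easy to check.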

Related

PySpark Feature Transformation: QuantileTransformer with uniform distribution of the output

Link to the document on scikit-learn:
link
What it essentially does is normalize the data so that each data point falls into a bucket between 0 and 1 (a percentile rank?), and I assume each of these buckets would hold an equal number of data points. This image describes what I am trying to do.
image
I would like to use this quantile transformation with PySpark. There is a QuantileDiscretizer in PySpark, but it doesn't do exactly what I am looking for. It also returns fewer buckets than requested in the input parameters: the line of code below returns only 81 distinct buckets on a data set with millions of rows, with min(col_1) of 0 and max(col_1) of 20000.
discretizer_1 = QuantileDiscretizer(numBuckets=100, inputCol="col_1", outputCol="result")
So is there a way I can uniformly normalize my data, either using QuantileDiscretizer or otherwise using PySpark?
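For what it's worth, the uniform-output transform the question describes is just a percentile rank. A minimal plain-Python sketch of the idea (in PySpark the equivalent is likely a percent_rank() window ordered by col_1, but that is an assumption, not something shown in the question):

```python
# Sketch of what a uniform-output quantile transform does:
# map each value to its rank among the data, scaled into [0, 1].
def uniform_quantile_transform(values):
    n = len(values)
    sorted_vals = sorted(values)
    # Rank of each value (first occurrence for ties), scaled to [0, 1].
    return [sorted_vals.index(v) / (n - 1) for v in values]

data = [5.0, 1.0, 3.0, 20000.0, 0.0]
print(uniform_quantile_transform(data))  # -> [0.75, 0.25, 0.5, 1.0, 0.0]
```

Note that unlike QuantileDiscretizer, which bins by approximate quantile boundaries, a rank-based mapping spreads the output uniformly regardless of how skewed the input distribution is.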

MATLAB: how to generate all possible coalition-formations within M*M matrix

Assume there are M objects aiming to form coalitions together. I need to know how to exhaustively generate all possible formations of coalitions using an M*M binary matrix, given the following properties:
1- The elements of main diagonal are set to 1 (each object is in the same coalition with itself)
2- The matrix is symmetrical (being in the same coalition for two objects is a mutual relationship)
3- If objects (i,j) are in the same coalition, and (j,k) are in the same coalition, then (i,k) are in the same coalition as well (transitivity).
A simple formation of the coalitions with 4 objects is given by this example:
You can use another data structure which is easier to generate, then convert it to the matrix you want. Use a list with the coalition ids, where the coalition id is the minimum of all object ids. For your example this would be [1 2 1 1]. Using this data structure it's easier to describe a generator.
For each object you have the choice between joining one of the existing coalitions opened by objects with a smaller id or to open a new coalition.
There is probably no vectorized solution; to implement this, use recursion or dynamic programming.
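A sketch of that generator in Python (the question's setting is MATLAB, so this is only to illustrate the recursion; the coalition-id lists are the [1 2 1 1]-style encoding described above):

```python
# Enumerate all coalition structures for M objects: each object either joins
# a coalition opened by a smaller id or opens a new coalition of its own.
def coalition_lists(M):
    """Yield lists like [1, 2, 1, 1]: entry i is the smallest object id in i's coalition."""
    def extend(partial):
        if len(partial) == M:
            yield partial
            return
        nxt = len(partial) + 1                        # 1-based id of the next object
        for cid in sorted(set(partial)) + [nxt]:      # join an open coalition, or open new
            yield from extend(partial + [cid])
    yield from extend([1])

def to_matrix(ids):
    """Convert a coalition-id list to the symmetric binary membership matrix."""
    M = len(ids)
    return [[1 if ids[i] == ids[j] else 0 for j in range(M)] for i in range(M)]

print(len(list(coalition_lists(4))))  # -> 15 (the Bell number B(4))
print(to_matrix([1, 2, 1, 1]))
```

The matrix built this way is automatically symmetric with a unit diagonal, and transitivity holds because membership is defined by a shared coalition id.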

IBM ESSL: DFT - Real to complex & Complex to real - Final array bigger than initial one

I have a real 2D double-precision array. I want to perform an FFT on it, some operations on the result, and an inverse FFT. I am using the IBM ESSL library on Blue Gene/Q.
The function DRCFT2 is doing the real to complex transform (http://www-01.ibm.com/support/knowledgecenter/SSFHY8_5.3.0/com.ibm.cluster.essl.v5r3.essl100.doc/am5gr_hsrcft2.htm?lang=en). The function DCRFT2 is doing the complex to real transform (http://www-01.ibm.com/support/knowledgecenter/SSFHY8_5.3.0/com.ibm.cluster.essl.v5r3.essl100.doc/am5gr_hscrft2.htm?lang=en).
Beginning real array size is (nx,nz). After DRCFT2, the complex array size is (nx/2+1,nz). After DCRFT2, the final real array size is (nx+2,nz).
Beginning and final real arrays have a different size, how can I compare them?
PS: If I put the first real array into a complex one and perform complex-to-complex DFTs (DCFT2), then the final result and the first one have the same size and I can compare them. Is there any way to do something similar with DRCFT2 and DCRFT2?
According to the DCRFT2 documentation you link to:
x
is the array X, containing n2 columns of data to be transformed. Due to complex conjugate symmetry, the input consists of only the first ((n1)/2)+1 rows of the array
[...]
On Return
y
[...]
is the array Y, containing n1 rows and n2 columns of results of the real discrete Fourier transform of X.
where in your case n1=nx and n2=nz. In other words, if you pass a complex array of size (nx/2+1,nz) as the input argument to DCRFT2, you should get a real array output of size (nx,nz), so you can readily compare your beginning and final real arrays.
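The size bookkeeping can be checked with NumPy's real FFTs, which use the same half-spectrum convention; this is only an analogy to the ESSL routines, not a call into them:

```python
import numpy as np

nx, nz = 8, 6
a = np.random.rand(nx, nz)   # real 2D double-precision array

# Real-to-complex along the first dimension: (nx, nz) -> (nx//2 + 1, nz).
# Only half the spectrum is stored, due to conjugate symmetry (like DRCFT2).
spec = np.fft.rfft(a, axis=0)
print(spec.shape)            # -> (5, 6)

# Complex-to-real inverse, told the original length nx (like DCRFT2 with n1=nx):
back = np.fft.irfft(spec, n=nx, axis=0)
print(back.shape)            # -> (8, 6)
print(np.allclose(a, back))  # -> True
```

The key point is that the inverse transform is told the original real length n1, so the round trip lands back on an (nx, nz) array rather than (nx+2, nz).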

Using SURF algorithm to match objects on MATLAB

The objective is to see if two images, which have one object captured in each image, matches.
The object or image I have stored; this will be used as a baseline:
item1 (this is being matched in the code)
The object/image that needs to be matched, which is also stored:
input (need to see if this matches with what is stored)
My method:
Convert images to gray-scale.
Extract SURF interest points.
Obtain features.
Match features.
Get 50 strongest features.
Match the number of strongest features with each image.
Take the ratio of: number of features matched / number of strongest features (which is 50).
If I have two images of the same object (two images taken separately on a camera), ideally the ratio should be near 1 or near 100%.
However this is not the case, the best ratio I am getting is near 0.5 or even worse, 0.3.
I am aware that SURF detectors and features can be used in neural networks or in a statistics-based approach. I believe I have taken the statistics-based approach to some extent by using the 50 strongest features.
Is there something I am missing? What do I add onto this or how do I improve it? Please provide me a point to start from.
%Clearing the workspace and all variables
clc;
clear;
%ITEM 1
item1 = imread('Loreal.jpg');           % retrieve item 1 and digitize it
item1Grey = rgb2gray(item1);            % convert to grayscale (2-D matrix)
item1KP = detectSURFFeatures(item1Grey, 'MetricThreshold', 600); % SURF detectors / interest points
strong1 = item1KP.selectStrongest(50);
[item1Features, item1Points] = extractFeatures(item1Grey, strong1, 'SURFSize', 128); % using SURFSize of 128
%INPUT: acquire image
input = imread('MakeUp1.jpg');          % retrieve input and digitize it
inputGrey = rgb2gray(input);            % convert to grayscale (2-D matrix)
inputKP = detectSURFFeatures(inputGrey, 'MetricThreshold', 600); % SURF detectors / interest points
strongInput = inputKP.selectStrongest(50);
[inputFeatures, inputPoints] = extractFeatures(inputGrey, strongInput, 'SURFSize', 128); % using SURFSize of 128
pairs = matchFeatures(item1Features, inputFeatures, 'MaxRatio', 1); % matching SURF features
totalFeatures = length(item1Features);  % baseline number of features
numPairs = length(pairs);               % the number of pairs
percentage = numPairs/50;
if percentage >= 0.49
    disp('We have this');
else
    disp('We do not have this');
    disp(percentage);
end
The baseline image
The input image
I would try not doing selectStrongest and not setting MaxRatio. Just call matchFeatures with the default options and compare the number of resulting matches.
The default behavior of matchFeatures is to use the ratio test to exclude ambiguous matches. So the number of matches it returns may be a good indicator of the presence or absence of the object in the scene.
If you want to try something more sophisticated, take a look at this example.
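For intuition, the ratio test that matchFeatures applies by default can be sketched in a few lines of NumPy. The descriptors and the 0.6 threshold here are made up for illustration; the MATLAB function handles all of this internally:

```python
import numpy as np

def ratio_test_matches(desc1, desc2, max_ratio=0.6):
    """Keep a match only if the nearest neighbor in desc2 is clearly better
    than the second nearest (Lowe's ratio test), rejecting ambiguous matches."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)   # distance to every candidate
        order = np.argsort(dists)
        best, second = dists[order[0]], dists[order[1]]
        if best < max_ratio * second:
            matches.append((i, int(order[0])))
    return matches

desc1 = np.array([[0.0, 1.0], [1.0, 0.0]])
desc2 = np.array([[0.0, 0.9], [5.0, 5.0], [0.9, 0.0]])
print(ratio_test_matches(desc1, desc2))  # -> [(0, 0), (1, 2)]
```

Because ambiguous matches are filtered out, the raw count of surviving matches is often a better presence indicator than a ratio over a fixed 50 strongest features.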

Preserving matrix columns using Matlab brush/select data tool

I'm working with matrices in Matlab which have five columns and several million rows. I'm interested in picking particular groups of this data. Currently I'm doing this using plot3() and the brush/select data tool.
I plot the first three columns of the matrix as X,Y, Z and highlight the matrix region I'm interested in. I then use the brush/select tool's "Create variable" tool to export that region as a new matrix.
The problem is that when I do that, the remaining two columns of the original, bigger matrix are dropped. I understand why- they weren't plotted and hence the figure tool doesn't know about them. I need all five columns of that subregion though in order to continue the processing pipeline.
I'm adding the appropriate 4th and 5th column values to the exported matrix using a horrible nested-loop approach: if columns 1, 2 and 3 match in both the original and exported matrices, attach columns 4/5 of the original matrix to the exported one. It's bad design and agonizingly slow. I know there has to be a Matlab function/trick for this. Can anyone help?
Thanks!
This might help:
1. I start with matrix 1 with columns X,Y,Z,A,B
2. Using the brush/select tool, I create a new (subregion) matrix 2 with columns X,Y,Z
3. I then loop through all members of matrix 2 against all members of matrix 1. If X,Y,Z match for a pair of rows, I append A and B from that row in matrix 1 to the appropriate row in matrix 2.
4. I become very sad as this takes forever and shows my ignorance of Matlab.
If I understand your situation correctly, here is a simple way to do it.
Assuming you have a matrix like so: M = [A B C D E], where each letter is an Nx1 vector.
You select a range (this part is not really clear to me), but suppose you can create logical index vectors
idxA, idxB and idxC that are 1 if the corresponding row is in the region and 0 otherwise.
Then you can simply use:
M(idxA&idxB&idxC,:)
and you will get the additional two columns as well.
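The same logical-indexing idea in a NumPy sketch, with made-up data and a hypothetical brushed region, showing that a row mask preserves every column:

```python
import numpy as np

# Five-column matrix: X, Y, Z plus two extra columns A, B that must survive selection.
M = np.array([[0.0, 0.0, 0.0, 10.0, 100.0],
              [1.0, 1.0, 1.0, 20.0, 200.0],
              [5.0, 5.0, 5.0, 30.0, 300.0]])

# Hypothetical "brushed" region: a box in X/Y/Z space.
idx = (M[:, 0] < 2) & (M[:, 1] < 2) & (M[:, 2] < 2)

# Row-select with the combined mask but keep every column,
# exactly like M(idxA&idxB&idxC,:) in MATLAB.
subregion = M[idx, :]
print(subregion)  # -> first two rows, all five columns
```

Selecting rows by a boolean mask is vectorized, so it avoids the row-by-row matching loop entirely.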