Can I use JIT from JAX with NetworkX algorithms? For instance, if I were to compute the average clustering coefficient for a NetworkX graph object, is it possible to use the @jit decorator to speed up my analysis pipeline?
No, JAX's JIT and other transforms only work with functions implemented via JAX primitives (generally operations defined in jax.lax, jax.numpy, and related submodules). They cannot be used to compile/transform arbitrary Python code.
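As a rough sketch of what does work: if the graph is first converted to a dense adjacency matrix, the same quantity can be written with jax.numpy operations and then jitted. The function name average_clustering and the matrix formulation below are my own illustration, not a NetworkX or JAX API, and the sketch assumes an undirected, unweighted graph with no self-loops.

```python
import jax
import jax.numpy as jnp


@jax.jit
def average_clustering(adj):
    """Average clustering coefficient from a dense 0/1 adjacency matrix.

    Assumes adj is symmetric with a zero diagonal (undirected, no self-loops).
    """
    deg = adj.sum(axis=1)
    # diag(A^3)[i] counts closed walks of length 3 through node i,
    # which is twice the number of triangles containing i.
    closed_walks = jnp.einsum("ij,jk,ki->i", adj, adj, adj)
    denom = deg * (deg - 1.0)
    # Nodes with degree < 2 get coefficient 0; the inner where avoids 0/0.
    safe_denom = jnp.where(denom > 0, denom, 1.0)
    coeff = jnp.where(denom > 0, closed_walks / safe_denom, 0.0)
    return coeff.mean()


# A triangle: every node's neighbours are connected to each other.
triangle = jnp.array([[0.0, 1.0, 1.0],
                      [1.0, 0.0, 1.0],
                      [1.0, 1.0, 0.0]])
print(average_clustering(triangle))
```

For a NetworkX graph G, the matrix could come from nx.to_numpy_array(G). Keep in mind that the dense matrix needs memory quadratic in the number of nodes, so this only makes sense for small to moderate graphs.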
Related
In order to use FCNs for an image processing project, I first turned to MatConvNet. However, during the preparation steps I found that MATLAB provides a new function (fcnLayers) for this purpose.
Can fcnLayers and its functionality be compared with MatConvNet? Specifically, is it possible to train models or use pre-trained ones with it?
Finally, can I achieve the same results with either of them?
I checked the descriptions of pagerank, pagerank_numpy and pagerank_scipy in the NetworkX documentation, and I can't see the difference.
pagerank(G, alpha=0.85, personalization=None, max_iter=100, tol=1e-06, nstart=None, weight='weight', dangling=None)
pagerank_numpy(G, alpha=0.85, personalization=None, weight='weight', dangling=None)
pagerank_scipy(G, alpha=0.85, personalization=None, max_iter=100, tol=1e-06, weight='weight', dangling=None)
What are the differences among them?
They all compute the same thing but with slightly different methods to compute the largest eigenvalue/eigenvector (the pagerank scores).
pagerank is a pure-Python implementation
pagerank_numpy uses the dense linear algebra subpackage of numpy
pagerank_scipy uses the sparse linear algebra subpackage of scipy
The pagerank_scipy implementation should be fastest and use the least memory for large graphs.
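To make the "same thing, different methods" point concrete, here is a minimal pure-Python power iteration in the spirit of the first variant. It is only a sketch, not NetworkX's actual code: the name pagerank_power and the dict-of-lists graph format are assumptions for illustration, and alpha, max_iter and tol simply mirror the parameters in the signatures above. Rank held by dangling nodes (no out-links) is spread uniformly here.

```python
def pagerank_power(out_links, alpha=0.85, max_iter=100, tol=1e-6):
    """Power iteration on a graph given as {node: [successor, ...]}.

    Every successor is assumed to appear as a key of out_links as well.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank sitting on dangling nodes is redistributed to all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new[w] += alpha * rank[v] / len(out_links[v])
        err = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if err < n * tol:  # stop once the L1 change is small enough
            break
    return rank


# A 3-cycle is perfectly symmetric, so all nodes get the same score.
print(pagerank_power({"a": ["b"], "b": ["c"], "c": ["a"]}))
```

The numpy and scipy variants get the same vector through library linear algebra (dense and sparse respectively) instead of Python loops, which is where the speed and memory differences come from.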
I am new to data mining and machine learning. I have been trying to compare predictive analysis and clustering analysis in RapidMiner and Weka for a college assignment.
After studying the advantages and disadvantages of both tools and starting the actual analysis, I ran into some problems. I tried clustering with K-means (SimpleKMeans in Weka) and regression with LinearRegression, and I am not satisfied with the results, since they differ significantly between the two tools. In all cases I used the same numerical datasets.
I have spent a lot of time trying to figure this out by studying how each algorithm is initialized in each tool. The interfaces are different, and some parameters exist in RapidMiner but not in Weka or vice versa, so I am a bit confused. (Is that the problem?)
Apart from that, what do you think is wrong? Is there some initialization step that I missed? Or is it because the code differs between the tools even though they use the same algorithm?
Thank you for your answer!
Weka often applies built-in normalization, at least in k-means and some other algorithms.
Make sure you have disabled this if you want to make results comparable.
Also understand that k-means is a randomized algorithm. Different results even from the same package are to be expected (and desirable).
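Both effects are easy to reproduce outside of either tool. The following is a minimal pure-Python sketch, not the Weka or RapidMiner implementation: kmeans and min_max_normalize are names made up for this example, and the seed argument stands in for whatever random-seed parameter each tool exposes.

```python
import random


def min_max_normalize(points):
    """Rescale every column to [0, 1]; constant columns become 0."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(p, lo, hi))
        for p in points
    ]


def kmeans(points, k, seed, max_iter=100):
    """Lloyd's algorithm; initial centers are k random data points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest center (squared Euclidean).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Second column has a much larger scale, so it dominates the raw distances.
data = [(1.0, 100.0), (2.0, 300.0), (9.0, 120.0), (10.0, 310.0)]
print(kmeans(data, 2, seed=0)[0])
print(kmeans(min_max_normalize(data), 2, seed=0)[0])
```

Running it with several seeds, and with and without the normalization step, shows how much the grouping can depend on those two choices alone. Fixing the seed and the preprocessing in both tools is the first thing to check before comparing their output.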
Did you use WEKA itself or RapidMiner's WEKA extension? Did you try comparing the results of WEKA with those of the RapidMiner WEKA extension?
In the Matlab Statistics toolbox there are several functions for handling Hidden Markov Models (HMM), but they all work with discrete observation symbols. Does anyone know if there are toolboxes or functions (perhaps from a third party) that can handle continuous observation variables?
We came to an acceptable solution in the comments, so I'll post it here for future reference:
WEKA has appropriate functions for handling HMMs, and since it has a Java API it is an ideal candidate for use with MATLAB. MATLAB runs on top of a Java Virtual Machine, so you can call the WEKA API directly from MATLAB code, passing and retrieving data.
Here is a MATLAB File Exchange example demonstrating the use of WEKA through MATLAB.
Here is a Java example showing how to use a generic WEKA classifier, which should be applicable to the third-party HMM.
Prof. Zoubin Ghahramani has written code for the EM algorithm:
http://mlg.eng.cam.ac.uk/zoubin/software.html
I am trying to do some text classification with SVMs in MATLAB and would really like to know whether MATLAB has any methods for feature selection (chi-squared, mutual information, ...). I want to try various methods and keep the best one, and I don't have time to implement all of them myself, which is why I am looking for such methods in MATLAB. Does anyone know?
svmtrain
MATLAB has other utilities for classification like cluster analysis, random forests, etc.
If you don't have the required toolbox for svmtrain, I recommend LIBSVM. It's free and I've used it a lot with good results.
The Statistics Toolbox has sequentialfs. See also the documentation on feature selection.
A similar approach is dimensionality reduction. In MATLAB you can easily perform PCA or Factor analysis.
Alternatively, you can take a wrapper approach to feature selection. You search through the space of features by taking a subset of features each time and evaluating that subset with any classification algorithm you choose (LDA, decision tree, SVM, ...). You can do this exhaustively or use some kind of heuristic to guide the search (greedy, GA, SA, ...).
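The greedy variant of that search can be sketched in a few lines. This is a language-neutral illustration written in Python rather than MATLAB; forward_select and the score callback are hypothetical names, and score is assumed to return something like the cross-validated accuracy of your classifier when trained on the given feature subset.

```python
def forward_select(n_features, score, max_features=None):
    """Greedy forward selection over feature indices 0..n_features-1.

    score(subset) is a user-supplied function (hypothetical here) that
    returns a quality measure for a list of feature indices; higher is better.
    """
    selected = []
    best = float("-inf")
    remaining = set(range(n_features))
    limit = max_features or n_features
    while remaining and len(selected) < limit:
        # Try adding each remaining feature and keep the best candidate.
        cand_score, cand = max(
            (score(selected + [f]), f) for f in sorted(remaining)
        )
        if cand_score <= best:  # no candidate improves the score: stop
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected, best


# Toy score: features 0 and 2 are informative, every feature has a small cost.
gain = {0: 0.5, 2: 0.3}
toy_score = lambda subset: sum(gain.get(f, 0.0) for f in subset) - 0.01 * len(subset)
print(forward_select(4, toy_score))
```

Each round costs one classifier evaluation per remaining feature, which is far cheaper than the exhaustive search over all subsets but can miss features that are only useful in combination.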
If you have access to the Bioinformatics Toolbox, it has a randfeatures function that does a similar thing. There's even a couple of cool demos of actual use cases.
Maybe this will help:
There are two ways of selecting the features in the classification:
Using fselect.py from libsvm tool directory (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/#feature_selection_tool)
Using sequentialfs from statistics toolbox.
I would recommend using fselect.py as it provides more options - like automatic grid search for optimum parameters (using grid.py). It also provides an F-score based on the discrimination ability of the features (see http://www.csie.ntu.edu.tw/~cjlin/papers/features.pdf for details of F-score).
Since fselect.py is written in Python, you can either use the Python interface or, as I prefer, have MATLAB perform a system call to Python:
system('python fselect.py <training file name>')
It's important that you have Python installed and libsvm compiled, and that you are in the tools directory of libsvm (the one containing grid.py and the other scripts).
The training file must be in libsvm format (sparse format). You can produce it with the sparse function in MATLAB followed by libsvmwrite.
xtrain_sparse = sparse(xtrain);                      % libsvmwrite expects a sparse feature matrix
libsvmwrite('filename.txt', ytrain, xtrain_sparse);  % ytrain is the label vector
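For reference, the sparse libsvm format puts one sample per line: the label first, then index:value pairs for the non-zero features, with indices starting at 1. A small Python sketch of that layout is shown here in case you want to check or generate such a file outside MATLAB. The helper name to_libsvm_lines is made up for this example and is not part of libsvm.

```python
def to_libsvm_lines(X, y):
    """Format dense rows X with labels y as libsvm text lines.

    Zero-valued features are omitted and feature indices are 1-based.
    """
    lines = []
    for row, label in zip(X, y):
        pairs = [f"{i}:{v:g}" for i, v in enumerate(row, start=1) if v != 0]
        lines.append(" ".join([f"{label:g}"] + pairs))
    return lines


for line in to_libsvm_lines([[0, 1.5, 0], [2, 0, 3]], [1, -1]):
    print(line)
# 1 2:1.5
# -1 1:2 3:3
```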
Hope this helps.
For sequentialfs with libsvm, you can see this post:
Features selection with sequentialfs with libsvm