Which algorithm/implementation is best for link prediction in an evolving heterogeneous knowledge graph? - knowledge-graph

I want to do link prediction in a heterogeneous knowledge graph which will be evolving over time. The entity types and relations will be the same. But there will be unseen nodes.
I would like to know whether GraphSAGE is the right choice for this purpose. Does PyTorch Geometric support GraphSAGE for heterogeneous graphs?

Related

How to do Hierarchical Heteroskedastic Sparse GPs in GPflow?

Is it possible to model a general trend from a population using GPflow and also have individual predictions, as in Hensman et al?
Specifically, I am trying to fit spatial data from a group of individuals from a clinical assessment. For each individual, I am dealing with approximately 20000 data points (a different number of recordings for each individual), which definitely restricts me to a sparse implementation. In addition, it seems that I need an input-dependent noise model, hence the heteroskedasticity.
I have fitted a hetero-sparse model as in this notebook example, but I am not sure how to scale it to perform the hierarchical learning. Any ideas would be welcome :)
https://github.com/mattramos/SparseHGP may be helpful. This repo gives GPflow 2 code for modelling a sparse hierarchical model. Note that there are still some rough edges in the implementation that require an expensive for loop to be constructed.

How to implement three-way clustering in python

I am relatively new to the field of data science. Recently I came across these concepts and I am really keen to implement them, i.e. the concept of multimodal clustering applications. (I got the idea from here: https://scikit-learn.org/stable/modules/biclustering.html)
I know about different clustering algorithms like DBSCAN, OPTICS, K-Means (which is very popular), etc. My understanding is that in all these algorithms, a single column of data points is considered when clustering a dataset.
Suppose someone has a dataset like: http://archive.ics.uci.edu/ml/datasets/Iris
How can one use 3 or more columns to cluster the distinct classes in this Iris dataset, i.e. in the terms defined above, how does one implement multi-modal classification on this kind of dataset?
Or, being new, I suspect I may be confusing it with multi-dimensional cube concepts.
It would be a great help if someone could clarify and explain this to me.

What is the type or family of recsys algorithms for recommending similar users based on their interests?

I am learning about recommendation systems from a Coursera MOOC. In the introductory course I see there are three main types of filtering methods:
a. Content-based filtering
b. Item-Item collaborative filtering
c. User-User collaborative filtering
Having understood this, I am not sure where recommending similar users based on their interests/preferences belongs. For example, suppose I have a User->TopicsOfInterest0..n relation. I want to recommend other similar users based on their respective TopicsOfInterest (vector).
I'm not sure that these three types are an exhaustive classification of all recommender systems.
In fact, any matrix-factorization-based algorithm (SVD, etc.) is both item-based and user-based at the same time. But the TopicsOfInterest (factors) are inferred automatically by the algorithm. For example, Apache Spark includes an implementation of the alternating least squares (ALS) algorithm. Spark's API has the userFeatures method, which returns (roughly) a matrix predicting each user's attitude to each feature.
The only thing left to do is to compute the set of users most similar to a given one (e.g. find the vectors that are closest to a given one by cosine similarity).
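As a rough illustration of that last step, here is a minimal sketch in plain Python. The user names and vectors are made up for the example; in practice each vector would be the factor (or TopicsOfInterest) vector for one user, and the helper names `cosine_similarity` and `most_similar_users` are my own, not part of any library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar_users(user_vectors, target, k=2):
    """Return the k user ids whose vectors are closest to the target's by cosine similarity."""
    target_vec = user_vectors[target]
    scored = [
        (cosine_similarity(target_vec, vec), user)
        for user, vec in user_vectors.items()
        if user != target
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [user for _, user in scored[:k]]

# Hypothetical interest/factor vectors, one per user (made-up values).
user_vectors = {
    "alice": [0.9, 0.1, 0.0],
    "bob":   [0.8, 0.2, 0.1],
    "carol": [0.0, 0.1, 0.9],
    "dave":  [0.1, 0.9, 0.2],
}

print(most_similar_users(user_vectors, "alice", k=2))
```

This brute-force scan compares the target against every other user, which is fine for small user sets; for large ones an approximate nearest-neighbour index would probably be a better fit.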

Fusion Classifier in Weka?

I have a dataset with 20 features: 10 for age and 10 for weight. I want to classify the data for each group separately, then use the results from these 2 classifiers as input to a third for the final result.
Is this possible with Weka?
Fusion of decisions is possible in WEKA (or with any two models), but not using the approach you describe.
Seeing as you're using classifiers, each model will only output a class. You could use the two labels produced as features for a third model, but the lack of diversity in your inputs would most likely prevent the third model from giving you anything interesting.
At the most basic level, you could implement a voting scheme. Give each model a "vote" and then assume that the correct class is the one with the majority of votes. While this will give a rudimentary form of fusion, if you're familiar with voting theory you know that majority rule somewhat falls apart when you have more than two classes.
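To make the weakness of plain voting concrete, here is a small sketch in Python (not WEKA code). The class labels are hypothetical, and `majority_vote` is a helper written for this example.

```python
from collections import Counter

def majority_vote(labels):
    """Return the single most-voted label, or None when the top spot is tied."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: plain voting cannot decide
    return counts[0][0]

# Three voters with a clear majority.
print(majority_vote(["low", "low", "high"]))

# Two voters (as in the question) who disagree: no majority exists.
print(majority_vote(["low", "medium"]))
```

With only two classifiers, every disagreement is a tie, so voting alone gives no answer in exactly the cases where fusion would matter.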
I recommend that you use Combinatorial Fusion to fuse the output of the two classifiers. A good paper regarding the technique is available as a free PDF here. In essence, you use the Classifier::distributionForInstance() method provided by WEKA's classifiers and then use the sum of the distributions (called "scores") to rank the classes, choosing the class with the highest rank. The paper demonstrates that this method is superior to voting alone.
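The sum-of-scores step can be sketched as follows. This is an illustration in Python rather than WEKA code, and it only covers the score-summing idea described above, not everything in the paper. The class names and probability values are made up; in WEKA each list would be what distributionForInstance() returns for one classifier.

```python
def fuse_by_score_sum(distributions, class_names):
    """Sum per-class scores across classifiers and return (best_class, summed_scores)."""
    totals = [0.0] * len(class_names)
    for dist in distributions:
        if len(dist) != len(class_names):
            raise ValueError("each distribution needs one score per class")
        for i, score in enumerate(dist):
            totals[i] += score
    # Rank classes by summed score and take the top one.
    best_index = max(range(len(class_names)), key=lambda i: totals[i])
    return class_names[best_index], totals

class_names = ["low", "medium", "high"]    # hypothetical classes
age_model_dist = [0.50, 0.45, 0.05]        # made-up output of the age-feature classifier
weight_model_dist = [0.10, 0.60, 0.30]     # made-up output of the weight-feature classifier

label, totals = fuse_by_score_sum([age_model_dist, weight_model_dist], class_names)
print(label, totals)
```

Here the two models would vote for different classes ("low" and "medium"), yet the summed scores still produce a single ranking, because the age model's near-tie between "low" and "medium" is taken into account.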

Which predictive modelling technique will be most helpful?

I have a training dataset which gives me the ranking of various cricket players (2008) on the basis of their performance in past years (2005-2007).
I have to develop a model using this data and then apply it to another dataset to predict the ranking of players (2012) using the data already given to me (2009-2011).
Which predictive modelling will be best for this? What are the pros and cons of using the different forms of regression or neural networks?
The type of model to use depends on different factors:
Amount of data: if you have very little data, you had better opt for a simple prediction model like linear regression. If you use a prediction model which is too powerful, you run the risk of over-fitting your model, with the effect that it generalizes badly to new data. Now you might ask, what is little data? That depends on the number of input dimensions and on the underlying distributions of your data.
Your experience with the model: neural networks can be quite tricky to handle if you have little experience with them. There are quite a few parameters to be optimized, like the network layer structure, the number of iterations, the learning rate and the momentum term, just to mention a few. Linear prediction is a lot easier to handle with respect to this "meta-optimization".
A pragmatic approach, if you still cannot decide on one of the methods, would be to evaluate a couple of different prediction methods. You take some of your data where you already have target values (the 2008 data), split it into training and test data (e.g. take about 10% as test data), train and test using cross-validation, and compute the error rate by comparing the predicted values with the target values you already have.
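That evaluation loop might look something like the sketch below. It uses only the Python standard library and synthetic data, since I don't have the cricket dataset; the two candidate models (a mean baseline and one-feature least squares) and the function names are chosen for illustration, and mean squared error stands in for the "error rate".

```python
import random

def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Ordinary least squares for a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def cross_val_mse(fit, xs, ys, k=5, seed=0):
    """Average held-out mean squared error over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(model(xs[i]) - ys[i]) ** 2 for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k

# Synthetic data (made up): target roughly linear in one past-performance score.
rng = random.Random(42)
xs = [rng.uniform(0, 100) for _ in range(60)]
ys = [0.5 * x + 10 + rng.gauss(0, 5) for x in xs]

for name, fit in [("mean baseline", fit_mean), ("linear regression", fit_linear)]:
    print(name, round(cross_val_mse(fit, xs, ys), 2))
```

Whichever candidate gets the lower held-out error on your real data is the one to prefer; any other model can be compared the same way by adding another `fit` function.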
One great book, which is also on the web, is Pattern Recognition and Machine Learning by C. Bishop. It has a great introductory section on prediction models.
Which predictive modelling will be best for this? What are the pros and cons of using the different forms of regression or neural networks?
"What is best" depends on the resources you have. Full Bayesian Networks (or k-Dependency Bayesian Networks) with information-theoretically learned graphs are the ultimate 'assumptionless' models, and often perform extremely well. Sophisticated Neural Networks can perform impressively well too. The problem with such models is that they can be very computationally expensive, so models that employ methods of approximation may be more appropriate. There are mathematical similarities connecting regression, neural networks and Bayesian networks.
Regression is actually a simple form of Neural Networks with some additional assumptions about the data. Neural Networks can be constructed to make fewer assumptions about the data but, as Thomas789 points out, at the cost of being considerably more difficult to understand (sometimes monumentally difficult to debug).
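One way to see the regression/neural-network connection, as a sketch under simplifying assumptions (a single input, squared-error loss, made-up data): a "network" consisting of one neuron with no activation function, trained by gradient descent, ends up at the same slope and intercept as closed-form least squares. The function names are my own.

```python
def least_squares(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=5000):
    """One neuron with identity activation, fitted by batch gradient descent on MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2.0 * sum(residuals) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Made-up data, roughly y = 2x + 1 with a little noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

print("least squares:", least_squares(xs, ys))
print("linear neuron:", train_linear_neuron(xs, ys))
```

Adding hidden layers and non-linear activations to this single neuron is what removes the linearity assumption, at the price of more parameters to tune.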
As a rule of thumb - the more assumptions and approximations in a model the easier it is to A: understand and B: find the computational power necessary, but potentially at the cost of performance or "overfitting" (this is when a model suits the training data well, but doesn't extrapolate to the general case).
Free online books:
http://www.inference.phy.cam.ac.uk/mackay/itila/
http://ciml.info/dl/v0_8/ciml-v0_8-all.pdf