Why does my Google Optimize experiment show no clear winner - ab-testing

I did a very simple text, the footer contact form on the left of the website or the right of the website. The results showed "no clear winner". But the below data shows that one has 5 conversions vs 1, which I consider to be significant (albeit low numbers). It also says there is a 95% probability that this one will be better.
What am I not understanding about this data? Is it that the numbers are too low in volume to give a reading or is it a bug or is there something I've missed?

Its probably because your AB Test did not have a lot of traffic, in each variant. So 5 conversions vs 1, is not really a big difference between the two.

Related

What to do with not enough training data?

I have a problem that I don't have enough training data for my NN. It is trying to predict the result of a soccer game given the last games which I woulf say is a regression task.
The training data are results of soccer games of the last 15 seasons (which are about 4500 games). Getting to new data would be hard and would take a lot of time.
What should I do now?
Is it good to duplicate the data?
Should I input randomized data? (Maybe noise but I'm not quite sure what that is)
If there is no way of creating more data,
I should probably turn up the learning rate right? (I have it sitting at 0.01 and the momentum at 0.9)
I am using mini batches consisting of 32 training datas in training. Since I don't have a lot of training I don't have a lot of mini batches. Should I stop using them?
To start from the beginning: This is a very theoretical question and is not directly related to programming, which I recommend (in future) to post over at the Data Science Stackexchange.
To go into your problem: 4500 samples is not as bad as it sounds, depending on the exact task at hand. Are you trying to predict the match results (i.e. which team is the winner?), are you looking for more specific predictions (across a lot of different, specific teams)?
If you can make sure that you have a reasonable amount of data per class, one can work with a number of samples lower than what you have. Simply duplicating the data will not help you much, since you are very likely to just overfit on the samples you are seeing, without much of an improvement; Or rather, you will get the same results as training over a longer period (since essentially you see every sample twice per epoch, instead of one).
Again, what usually happens after long training periods is overfitting, so nothing gained here.
Your second suggestion is generally called data augmentation. Instead of simply copying samples, you alter them enough to make it look "different" to the network. But be careful! Data augmentation works well for some inputs, like images, since the change in input is significant enough to not represent the same sample, but still contains meaningful information about the class (a horizontally mirrored image of a cat still shows a "valid cat", unlike a vertically mirrored image, which is more unrealistic in the real world).
Essentially, it depends on your input features to determine where it makes sense to add noise. If you are only changing the results of the previous game, a minor change in input (adding/subtracting one goal at random) can significantly change the prediction you make.
If you slightly scramble ELO scores by a random number, on the other hand, the input value will not be too different, "but different enough" to use it as a novel example.
Turning up the learning rate is not a good idea, since you are essentially just letting the network converge more towards the specific samples. On the contrary, I would argue that the current learning rate is still too high, and you should certainly not increase it.
Regarding mini batches, I think I have referenced this a million times now, but always consider smaller minibatches. From a theoretical point of view, you are more likely to converge to a local minimum.

Determine sample size for A/B testing, more than 2 variants

What R function should we use if we want to decide the sample size for such a test:
10 ads, we want to use a test to decide which ads has the best click through rate. We are able to count the flow and click throughs.
I don’t think the number of variant experiences makes a difference. In each, you compare a metric to the same metric in control, so in each you’ll have its own significant sample size: the smaller the difference with the control, the larger the sample size.
The point of active debate in recent years is something related: how, at run time, to optimize the traffic split between the experiences so that by the time all the variants are called, the most has gone through your winning experience. Google (Experiments) have devised something they call the Multi-Arm Bandid algorithm for that, but as far as I know it hasn't been published in a peer-reviewed journal, and probably for a reason.
Good Luck!

Grouping similar words (bad , worse )

I know there are ways to find synonyms either by using NLTK/pywordnet or Pattern package in python but it isn't solving my problem.
If there are words like
bad,worst,poor
bag,baggage
lost,lose,misplace
I am not able to capture them. Can anyone suggest me a possible way?
There have been numerous research in this area in past 20 years. Yes computers don't understand language but we can train them to find similarity or difference in two words with the help of some manual effort.
Approaches may be:
Based on manually curated datasets that contain how words in a language are related to each other.
Based on statistical or probabilistic measures of words appearing in a corpus.
Method 1:
Try Wordnet. It is a human-curated network of words which preserves the relationship between words according to human understanding. In short, it is a graph with nodes as something called 'synsets' and edges as relations between them. So any two words which are very close to each other are close in meaning. Words that fall within the same synset might mean exactly the same. Bag and Baggage are close - which you can find either by iteratively exploring node-to-node in a breadth first style - like starting with 'baggage', exploring its neighbors in an attempt to find 'baggage'. You'll have to limit this search upto a small number of iterations for any practical application. Another style is starting a random walk from a node and trying to reach the other node within a number of tries and distance. It you reach baggage from bag say, 500 times out of 1000 within 10 moves, you can be pretty sure that they are very similar to each other. Random walk is more helpful in much larger and complex graphs.
There are many other similar resources online.
Method 2:
Word2Vec. Hard to explain it here but it works by creating a vector of a user's suggested number of dimensions based on its context in the text. There has been an idea for two decades that words in similar context mean the same. e.g. I'm gonna check out my bags and I'm gonna check out my baggage both might appear in text. You can read the paper for explanation (link in the end).
So you can train a Word2Vec model over a large amount of corpus. In the end, you will be able to get 'vector' for each word. You do not need to understand the significance of this vector. You can this vector representation to find similarity or difference between words, or generate synonyms of any word. The idea is that words which are similar to each other have vectors close to each other.
Word2vec came up two years ago and immediately became the 'thing-to-use' in most of NLP applications. The quality of this approach depends on amount and quality of your data. Generally Wikipedia dump is considered good training data for training as it contains articles about almost everything that makes sense. You can easily find ready-to-use models trained on Wikipedia online.
A tiny example from Radim's website:
>>> model.most_similar(positive=['woman', 'king'], negative=['man'], topn=1)
[('queen', 0.50882536)]
>>> model.doesnt_match("breakfast cereal dinner lunch".split())
'cereal'
>>> model.similarity('woman', 'man')
0.73723527
First example tells you the closest word (topn=1) to words woman and king but meanwhile also most away from the word man. The answer is queen.. Second example is odd one out. Third one tells you how similar the two words are, in your corpus.
Easy to use tool for Word2vec :
https://radimrehurek.com/gensim/models/word2vec.html
http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf (Warning : Lots of Maths Ahead)

Training tesseract to use with iPhone

I am trying to use tesseract-2.04 in my iPhone application and just want to detect the numbers. What I am doing here is first I am cross compiling tesseract to generate lib file using this post http://robertcarlsen.net/2009/07/15/cross-compiling-for-iphone-dev-884 and then using the the demo application at http://robertcarlsen.net/2010/01/12/ocr-for-iphone-source-1080 , but the results far away than realistic.
I am not able to resolve the issue or how to train tesseract so that it comes closure for practical usage.
Please help.
Thanks,
Madhup
I get quite good results setting
TessBaseAPI::SetVariable("tessedit_char_whitelist", "0123456789");
while gently urging the user to let the numbers fit in a certain box. This makes locating the numbers easier for me, and ensures the user keeps the image steady and at a reasonable distance leading to a sharper image.
I have thought about altering valid_word() in tesseract-2.04/dict/permute.cpp, but there seems to be no need for that.
The next step will be to hardcode a minimum/maximum char size so recognition time can become way less than the 500 ms it is now. Then the next step will be to add some code that keeps track of results in time, so that reading 5 90% of the time and 8 only 10% will lead the code to remember the 5.
It all depends on the use case you have. I'm lucky in the sense that I'm allowed to just show a 200x50 box which will contain the number.

Can an artificial neural network predict the outcome of sports games? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
I was trying to find something original and fun to do with artificial neural networks (ANNs) as a personal/learning project and I though it would be cool if I could predict the results of sports games (especially NHL games).
I'm pretty sure it would be easy to evolve an ANN that can predict which team is most likely to win (usually the team with the better record). However, what I would like to do is create an ANN that would tell how likely the outcome is, similar to bookmaker odds.
Is this something an ANN can do? In the affirmative, what kind of success can I expect? I know I can't beat the bookmaker (at least not with a software solution). I want do this as a recreational project/challenge to myself. I don't expect to bet money on sports games with this project.
Way back in the days of the IBM XT I played with a shareware ANN program to try and improve my chances on the British football (soccer) pools. This is a form of betting where you try and predict which football matches will result in draws. I assigned each team a number then looked back thorough past results and from them generated a single digit for the result. From memory it was 0 from a home win , 1 for an away win and 2 for a draw. Each result went on a single line in a training file. I would then run the training file through the program and generate the ANN settings. I would then look up the following Saturdays matches and feed them into the ANN then look for matches predicted as draws.
As the weeks went on my predictions of draws did definetly become more and more accurate. However ...
1) The XT was so slow that by Christmas it was taking 24 hours to generate the ANN settings from the training data. I really had better things to do with my precious (and expensive) PC.
2) Although it was better at predicting draws it wasn't predicting enough to actually win any money. Looking back I suppose the program had just worked out that Manchester United would always beat Sheffield United. This was more football knowledge than I had but not enough to win any money.
3) Entering the results into the training data and then generating the forthcoming matches data was taking me ages and to be honest sport bores me rigid.
So I gave up and didn't become a millionaire.
These days however PC's are much faster and much of the training data could be scraped from the web. But I still doubt it is a route to a fortune but its certainly an interesting project.
Ian
A reply above stated:
I know that if the bookmakers odds could be beaten by an ANN,
bookmakers would already be using one to fix their odds.
Bookmakers don't set the line based on their analysis of the teams - they set it based on their analysis of the betting public's opinion of the teams. An ideal line for the bookie is where he has exactly the same amount bet on each side of the line - then he is guaranteed a profit = the 'juice' on the losers' bets. They move the line as game approaches to try to keep that 50/50 split. Bookie may think Home team -5 is accurate line based on game analysis, but if he expects that will draw 2x $$ on the Home team he will not set the line at -5 - he will set at -7 or -8 - to where he expects to draw equal $$ for both -5 and +5 bets.
ANNs are really good at pattern matching and prediction, so yes, odds are you could build an ANN that does what you want.
You'll need more than just team win/loss ratio to make it really effective however. Feed it stats for the players, too. For real effectiveness, try to include game-flow information... like which players are on the line for each play (for football, for example).
Ultimately, the biggest problem you'll run into (aside from the whole "writing the ANN" issue) is getting the data you need to feed it.
I've done some stock market predictions with an AI and my conclusion is that it is not very hard to make an AI that gets good results with the historical data.
Making winning transactions in the future is a different ballgame.
I have just worked on this very problem (predicting English Premier League games) for the past 10 days, and ended up with very similar results using 3 different methods: SVM, Logistic Regression, and NN.
LR and NN will give probabilities. SVM outputs 0/1 (but it can be tweaked for probas too (I haven't tried yet).
I needed a "massive" (by my standards at least) feature set though (almost 300) and a good chunk of data (13 years worth).
Re. data, I got it from the web, simply.
Conclusion: I can just about match the bookies in terms of accuracy (predicting victories in my case). If I add the pre-match odds to the feature set, I get the exact same accuracy as the bookies (as expected), but no better (surely meaning my feature set is summarized in the bookies odds, and they have a little extra knowledge on top).
I'm sure there is a way to get better accuracy, either by improving the algos, or more likely by having extremely granular data (as in which players play which games, for how many minutes, and a lot of player-level historical stats, so as to build bottom-up models of team performance).
But bottom line is I can testify NNs work quite well for that purpose. SVM is slightly better though, in my limited experience.
I think it's indeed all about data, but there's no end to what you could feed it with in order to be more accurate : winning/loosing streaks, players biorhythms, player's girlfriends mood before the game, minor/major injuries they suffered in the recent past, extra-sportive events that are bothering the players, etc, etc, etc.
But I don't think you can accurately predict which team is more likely to win, it would be just a more-or-less educated guess.
In my opinion and experience, because of the excessively large number of factors in play, designing and training the ANN will be unreasonably complex and time-consuming. ANNs are good at pattern matching, and game prediction takes much deductive reasoning rather than mere pattern matching.
But if you want to enjoy learning neural networks, it will be a good adventure. If you are successful, you might want to host your code somewhere for others to see and learn!
For game prediction, it would be much easier and faster with decision trees or a rules engine and so on. This will be no easy task either, but it will be another interesting activity.
My belief is that the unpredictability of an event is due to lack of information and understanding...If you have all the knowledge, then yes it could be done. Or, the more knowledge you have, the better it can be done.
So in theory, the answer is yes.
However, in practice, you can get a PhD and have a whole career working on this question and you still may not succeed.