NetLogo: Built-in function to calculate the expected profit

Sorry for the long post. I am a newbie to agent-based modelling, so please accept my apology in advance if my question sounds stupid. I am trying to model a scenario where a farmer (i.e. the agent) decides which type of crop should be planted in different types of fields to maximise profit. The farmer agent has a budget, i.e. the amount of money that can be spent on farming each time step, equal to $100.
The farmer operates a farm that is subdivided into nine fields, arranged in a 3x3 cellular grid. Each field is the same size. Water availability varies spatially across the fields, with a rating of either 1 (driest), 2 (moderate), or 3 (wettest); it is assigned to the fields randomly.
The farmer must choose among three crops. As initial parameter settings, the crops have the
following characteristics:
         Yield   Price   Costs   Minimum Water Req.
Crop 1     300      20      15                    3
Crop 2     200      12      10                    2
Crop 3     100       7       5                    1
Each crop requires a certain amount of water to grow. Crop yields will only be realized if the crop is
planted in a field with at least the crop’s minimum water requirement.
Now the problem is that I couldn't find any function in NetLogo that calculates the permutations or combinations of crop, field, and water requirement needed to work out the expected profit. Any help would be highly appreciated.

I believe you are describing a linear programming problem.
Useful functions for solving simplex linear programming problems are in the NumAnal extension, which does not come bundled with NetLogo but which you can get as follows:
In NetLogo, under Tools / Extensions... you can find NumAnal, probably with no green check-mark. Select it. On the right there are buttons to install it and then to add it to your code. Once you click those, it should get a green check-mark and a new line, extensions [ numanal ], should appear in your code. You can then use the extension's commands with the numanal: prefix, for example numanal:simplex.
The documentation for it is in the folder where it was installed. But where is that?
Sadly, the documentation for where extensions are downloaded is not current.
https://ccl.northwestern.edu/netlogo/docs/extensions.html#where-extensions-are-located
After an exhaustive search by date-modified, I actually found the folder on my Windows 10 laptop here: c:\Users\condor\AppData\Roaming\NetLogo\6.1\extensions (note the "\Roaming\").
That folder has a README.md text file, a PDF named "NumAnal-v3.4.0" explaining how to use it, and an examples folder with code. It is a little dense.
The basics of how to set up a linear programming problem are beyond the scope of StackOverflow, but you can find plenty of help via Google.
Here's one 8-minute video (as of 24-Nov-2019) that might help you figure out if this is what you need:
Simplex Algorithm Explanation (How to Solve a Linear Program)
https://www.youtube.com/watch?v=RO5477EKlXE
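That said, your particular example is small enough (3 crops across 9 fields) that you can also compute the expected profit by brute-force enumeration, without a solver. Below is a minimal sketch in Python rather than NetLogo, just to show the idea; the crop data come from your table, the water ratings are made up, and treating per-field profit as yield * price - cost when the water requirement is met (and -cost otherwise) is my reading of the problem, not something your post states.

from itertools import product

# Crop data from the table above: (yield, price, cost, minimum water requirement)
CROPS = {
    1: (300, 20, 15, 3),
    2: (200, 12, 10, 2),
    3: (100,  7,  5, 1),
}
WATER = [1, 2, 3, 1, 2, 3, 1, 2, 3]   # hypothetical water ratings for the nine fields
BUDGET = 100

def profit(assignment):
    # Profit of one crop-per-field assignment, or None if it exceeds the budget.
    # Assumption: a crop only yields in a field that is wet enough for it.
    cost = sum(CROPS[c][2] for c in assignment)
    if cost > BUDGET:
        return None
    revenue = sum(CROPS[c][0] * CROPS[c][1]
                  for field, c in enumerate(assignment)
                  if WATER[field] >= CROPS[c][3])
    return revenue - cost

# Enumerate all 3^9 = 19,683 assignments and keep the most profitable feasible one.
best = max((a for a in product(CROPS, repeat=9) if profit(a) is not None), key=profit)
print(best, profit(best))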


FastText window size

I'm currently working on fastText unsupervised learning. I wanted to clarify something about the context window described in the fastText documentation.
In the description of the fastText library for Python https://github.com/facebookresearch/fastText/tree/master/python there are different arguments for training a fastText model; one of them is:
ws: size of the context window
My input file contains lines with 2 - 3 tokens.
Eg.,
Senior Database Administrator
Senior DotNet programmer
Network administrator
Head Programmer (Mainframe)
The default window size is 5. In the above example, I have lines with a token count smaller than the window size. What will happen if the window size is bigger than the document length?
FastText (& related algorithms like word2vec) will simply use as much of the context window as is possible.
For example, assume a window-size of 5 and the input tokens:
['Senior', 'Database', 'Administrator']
When training with the 'center' word 'Senior', the algorithm would be ready to consult up to 5 words in either direction.
But there are 0 words preceding 'Senior' and only 2 words following it, so only those 2 following words will be considered as neighbors.
(No 'plug values' will be used as if they were blank neighbors, nor will there be any 'bleed-through' to neighboring texts.)
Two other related notes to keep in mind:
These algorithms do need neighboring words for any training to occur, so any texts with just a single word are essentially no-ops. (If there happens to be a word that only ever appears alone, you might still see a vector for it at the end of training, but in the implementations with which I am familiar, that will just be a randomly-initialized starting vector, completely untrained by real usage examples.)
Most implementations simulate a weighting of neighboring words by not always using exactly your declared window size: for each pass over a specific target center word, they choose a random effective window size from 1 up to your chosen window size. In this way, immediate neighbors are always part of training, while words further away are more often skipped.
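To see the truncation behaviour described above, here is a minimal sketch using gensim's FastText implementation (parameter names assume gensim 4.x); training runs without complaint even though every text is shorter than the window:

from gensim.models import FastText

# Tiny corpus of short job titles, each well below the window size of 5.
sentences = [
    ["Senior", "Database", "Administrator"],
    ["Senior", "DotNet", "programmer"],
    ["Network", "administrator"],
    ["Head", "Programmer", "Mainframe"],
]

# For each center word, only the neighbours that actually exist in the text
# are used, up to `window` words on each side.
model = FastText(sentences=sentences, vector_size=16, window=5, min_count=1, epochs=20)

print(model.wv.most_similar("administrator", topn=3))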

Structuring an experiment with psychtoolbox

I have a design for an experiment that I would like to code in Psychtoolbox using MATLAB. I understand the basics of how to use this toolbox; however, it would be extremely helpful if someone who has designed a similar experiment before could provide me with some code to help me carry out the following:
The experimental procedure will be composed of 80 trials divided into 5 blocks with 16 trials in each block. The experiment consists of the participant selecting a number from the screen. There is a desirable number (target number) and an associated less desirable number (lure number). I won't go into further detail about the reasoning behind this experiment as it is not relevant to my question.
I have attached an image that shows 1 block of trials (16 trials). The other 4 blocks are the same as this block.
Target and lure numbers will be presented on the screen to choose from (an example can be seen in image below).
In some of the trials, as can be seen from the trials table, only one target number and one lure number are presented for the participant to choose from (instead of two targets and two lures).
The lure(s) that appear with each target should not always be the same. I want the lure that is shown with the target to be randomly selected in each trial (as can be seen in the trials image, there is more than one possible lure). In the trials image I have attached, the trial numbers are shown just for clarity; in each block the presentation of the targets needs to be randomized.
You can use the BalanceTrials function that comes with Psychtoolbox. You pass in all the possible lures and targets, and it returns a random order of all possible combinations. You can also specify a minimum length for the list it returns, but if there are more combinations it will make the list longer so that it stays balanced. Here is an example:
numberOfTrials = 80;   % total trials (5 blocks x 16 trials per block)
targetNumbers = {'3','4','5','6','7','4 5','4 6','4 7'};   % possible target(s) per trial
lureNumbers = {'3','4','5','6','7','4 7'};                 % possible lure(s) per trial
% Returns randomized, balanced lists of target/lure combinations,
% at least numberOfTrials long.
[targets, lures] = BalanceTrials(numberOfTrials, 1, targetNumbers, lureNumbers);
You can split this up into 5 blocks, or you can call it separately for each block.

Am I using PCA in Orange in a correct way?

I am analysing whether 15 books can be grouped according to 6 variables (of the 15 books, 2 are written by one author, 6 by another, and 7 by another). I counted the number of occurrences of the variables and calculated the percentages. Then I used the Orange software to run PCA. I uploaded the file and selected the columns and rows. When it comes to PCA, the program asks me whether I want to normalize the data, but I am not sure about that because I have already calculated the percentages - is normalizing different from calculating percentages? Moreover, below the normalize button it asks me to "show only:..." and I have to choose a number between 0 and 100, but I don't really know what that is.
Could you help me understand what I should do? Thank you in advance

Grouping similar words (bad, worse)

I know there are ways to find synonyms, either by using NLTK/pywordnet or the Pattern package in Python, but that isn't solving my problem.
If there are words like
bad,worst,poor
bag,baggage
lost,lose,misplace
I am not able to capture them. Can anyone suggest a possible way?
There has been a great deal of research in this area over the past 20 years. Computers don't understand language, but we can train them to find the similarity or difference between two words with the help of some manual effort.
Approaches may be:
Based on manually curated datasets that describe how the words in a language are related to each other.
Based on statistical or probabilistic measures of words appearing in a corpus.
Method 1:
Try WordNet. It is a human-curated network of words that preserves the relationships between words according to human understanding. In short, it is a graph whose nodes are 'synsets' and whose edges are relations between them, so any two words that are close in the graph are close in meaning; words that fall within the same synset may mean exactly the same thing. Bag and baggage are close, which you can find either by exploring node-to-node in a breadth-first style - starting with 'bag' and exploring its neighbours in an attempt to reach 'baggage' (you'll have to limit this search to a small number of iterations for any practical application) - or by starting a random walk from one node and trying to reach the other within a set number of tries and distance. If you reach 'baggage' from 'bag', say, 500 times out of 1000 within 10 moves, you can be pretty sure that they are very similar to each other. Random walks are more helpful in much larger and more complex graphs.
There are many other similar resources online.
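If you want to try this quickly, here is a minimal sketch with NLTK's WordNet interface (it assumes nltk and its 'wordnet' data are installed; taking the first synset of each word is a simplification - in practice you would pick the intended sense):

from nltk.corpus import wordnet as wn

bag = wn.synsets('bag')[0]           # e.g. Synset('bag.n.01')
baggage = wn.synsets('baggage')[0]   # e.g. Synset('baggage.n.01')

# path_similarity scores how close two synsets are in the hypernym/hyponym
# graph (1.0 means the same node, values near 0 mean far apart).
print(bag.path_similarity(baggage))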
Method 2:
Word2Vec. It is hard to explain here, but it works by creating, for each word, a vector with a user-specified number of dimensions, based on the word's context in the text. There has been an idea for two decades that words appearing in similar contexts mean similar things; e.g. "I'm gonna check out my bags" and "I'm gonna check out my baggage" might both appear in text. You can read the paper for an explanation (link at the end).
So you can train a Word2Vec model over a large corpus. In the end you will be able to get a 'vector' for each word. You do not need to understand the significance of this vector; you can use this vector representation to find the similarity or difference between words, or to generate synonyms of any word. The idea is that words which are similar to each other have vectors close to each other.
Word2vec came out two years ago and immediately became the thing to use in most NLP applications. The quality of this approach depends on the amount and quality of your data. A Wikipedia dump is generally considered good training data, as it contains articles about almost everything that makes sense. You can easily find ready-to-use models trained on Wikipedia online.
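For instance, training with gensim looks roughly like this (parameter names assume gensim 4.x; a toy corpus like this only demonstrates the API - a useful model needs millions of sentences):

from gensim.models import Word2Vec

# Toy corpus: 'bags' and 'baggage' appear in the same context.
corpus = [
    ["i", "am", "gonna", "check", "out", "my", "bags"],
    ["i", "am", "gonna", "check", "out", "my", "baggage"],
]

model = Word2Vec(sentences=corpus, vector_size=50, window=5, min_count=1, epochs=50)

print(model.wv.similarity("bags", "baggage"))   # cosine similarity of the two vectors
print(model.wv.most_similar("bags", topn=2))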
A tiny example from Radim's website:
>>> model.most_similar(positive=['woman', 'king'], negative=['man'], topn=1)
[('queen', 0.50882536)]
>>> model.doesnt_match("breakfast cereal dinner lunch".split())
'cereal'
>>> model.similarity('woman', 'man')
0.73723527
The first example tells you the word closest (topn=1) to the words woman and king while also being furthest from the word man; the answer is queen. The second example is odd-one-out. The third tells you how similar the two words are, according to your corpus.
An easy-to-use tool for Word2vec:
https://radimrehurek.com/gensim/models/word2vec.html
http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf (Warning : Lots of Maths Ahead)

Newbie to Neural Networks

Just starting to play around with Neural Networks for fun after playing with some basic linear regression. I am an English teacher, so I don't have a math background, and trying to read a book on this stuff is way over my head. I thought this would be a better avenue to get some basic questions answered (even though I suspect there is no easy answer). I'm just looking for some general guidance put in layman's terms. I am using a trial version of an Excel add-in called NEURO XL. I apologize if these questions are too "elementary."
My first project is related to predicting a student's Verbal score on the SAT based on a number of test scores, GPA, practice exam scores, etc. as well as some qualitative data (gender: M=1, F=0; took SAT prep class: Y=1, N=0; plays varsity sports: Y=1, N=0).
In total, I have 21 variables that I would like to feed into the network, with the output being the actual score (200-800).
I have 9000 records of data spanning many years/students. Here are my questions:
How many records of the 9000 should I use to train the network?
1a. Should I completely randomize the selection of this training data or be more involved and make sure I include a variety of output scores and a wide range of each of the input variables?
If I split the data into equal subsets, say 9x1000 (or however many), created a network for each one, and then tested each of these 9 on the other 8 sets to see which had the lowest MSE across the samples, would this be a valid way to "choose" the best network if I wanted to predict the scores for my incoming students (not included in this data at all)?
Since the scores on the tests that I am using as inputs vary in scale (some are on 1-100, and others 1-20 for example), should I normalize all of the inputs to their respective z-scores? When is this recommended vs not recommended?
I am predicting the actual score, but in reality, I'm NOT that concerned about the exact score but more of a range. Would my network be more accurate if I grouped the output scores into buckets and then tried to predict this number instead of the actual score?
E.g.
750-800 = 10
700-740 = 9
etc.
Is there any benefit to doing this or should I just go ahead and try to predict the exact score?
What if ALL I cared about was whether or not the score was above or below 600. Would I then just make the output 0(below 600) or 1(above 600)?
5a. I read somewhere that it's not good to use 0 and 1, but instead 0.1 and 0.9 - why is that?
5b. What about -1(below 600), 0(exactly 600), 1(above 600), would this work?
5c. Would the network always output -1, 0, 1 - or would it output fractions that I would then have to roundup or rounddown to finalize the prediction?
Once I have found the "best" network from Question #3, would I then play around with the different parameters (number of epochs, number of neurons in hidden layer, momentum, learning rate, etc.) to optimize this further?
6a. What about the activation function? Will log-sigmoid do the trick, or should I try the other options my software has as well (threshold, hyperbolic tangent, zero-based log-sigmoid)?
6b. What is the difference between log-sigmoid and zero-based log-sigmoid?
Thanks!
First a little bit of meta content about the question itself (and not about the answers to your questions).
I have to laugh a little that you say 'I apologize if these questions are too "elementary."' and then proceed to ask the single most thorough and well thought out question I've seen as someone's first post on SO.
I wouldn't be too worried that you'll have people looking down their noses at you for asking this stuff.
This is a pretty big question in terms of the depth and range of knowledge required, especially the statistical knowledge needed and familiarity with Neural Networks.
You may want to try breaking this up into several questions distributed across the different StackExchange sites.
Off the top of my head, some of it definitely belongs on the statistics StackExchange, Cross Validated: https://stats.stackexchange.com/
You might also want to try out https://datascience.stackexchange.com/ , a beta site specifically targeting machine learning and related areas.
That said, there is some of this that I think I can help to answer.
Anything I haven't answered is something I don't feel qualified to help you with.
Question 1
How many records of the 9000 should I use to train the network? 1a. Should I completely randomize the selection of this training data or be more involved and make sure I include a variety of output scores and a wide range of each of the input variables?
Randomizing the selection of training data is probably not a good idea.
Keep in mind that truly random data includes clusters.
A random selection of students could happen to consist solely of those who scored above a 30 on the ACT exams, which could potentially result in a bias in your result.
Likewise, if you only select students whose SAT scores were below 700, the classifier you build won't have any capacity to distinguish between a student expected to score 720 and a student expected to score 780 -- they'll look the same to the classifier because it was trained without the relevant information.
You want to ensure a representative sample of your different inputs and your different outputs.
Because you're dealing with input variables that may be correlated, you shouldn't try to do anything too complex in selecting this data, or you could mistakenly introduce another bias in your inputs.
Namely, you don't want to select a training data set that consists largely of outliers.
I would recommend trying to ensure that your inputs cover all possible values for all of the variables you are observing, and all possible results for the output (the SAT scores), without constraining how these requirements are satisfied.
I'm sure there are algorithms out there designed to do exactly this, but I don't know them myself -- possibly a good question in and of itself for Cross Validated.
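One common way to get such a split (a hedged illustration, not something tied to NEURO XL) is a stratified split on binned output scores, e.g. with scikit-learn:

import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(9000, 21))            # placeholder for your 21 input variables
y = rng.integers(200, 801, size=9000)      # placeholder SAT verbal scores

# Stratifying on coarse score buckets makes the training set cover the full
# range of outputs instead of relying on luck.
buckets = np.digitize(y, bins=np.arange(300, 801, 100))
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=6000, stratify=buckets, random_state=0)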
Question 3
Since the scores on the tests that I am using as inputs vary in scale (some are on 1-100, and others 1-20 for example), should I normalize all of the inputs to their respective z-scores? When is this recommended vs not recommended?
My understanding is that this is not recommended as the input to a Neural Network, but I may be wrong.
The convergence of the network should handle this for you.
Every node in the network will assign a weight to its inputs, multiply them by their weights, and sum those products as a core part of its computation.
That means that every node in the network is searching for some coefficients for each of their inputs.
To do this, all inputs will be converted to numeric values -- so conditions like gender will be translated into "0=MALE,1=FEMALE" or something similar.
For example, a node's metric might look like this at a given point in time:
2*ACT_SCORE + 0*GENDER + (-5)*VARSITY_SPORTS ...
The coefficients for each value are exactly what the network is searching for as it converges.
If you change the scale of a value, like ACT_SCORE, you just change the scale of the coefficient that will be found, by the reciprocal of that scaling factor.
The result should still be the same.
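A tiny numeric illustration of that point (the feature names and weights here are made up):

import numpy as np

# One node's weighted sum over [ACT_SCORE, GENDER, VARSITY_SPORTS].
weights = np.array([2.0, 0.0, -5.0])
student = np.array([28.0, 1.0, 0.0])
print(weights @ student)                          # 56.0

# Rescale the ACT score (say, onto a 0-1 scale); the learnable weight just
# rescales by the reciprocal, and the node's output is unchanged.
scaled_student = student * np.array([1 / 36, 1.0, 1.0])
scaled_weights = weights * np.array([36.0, 1.0, 1.0])
print(scaled_weights @ scaled_student)            # 56.0 again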
There are other concerns in terms of accuracy (computers have limited capacity to represent small fractions) and speed that may enter this, but not being familiar with NEURO XL, I can't say whether or not they apply for this technology.
Question 4
I am predicting the actual score, but in reality, I'm NOT that concerned about the exact score but more of a range. Would my network be more accurate if I grouped the output scores into buckets and then tried to predict this number instead of the actual score?
This will reduce accuracy, although you should converge to a solution much faster with fewer possible outputs (scores).
Neural Networks actually describe very high-dimensional functions in their input variables.
If you reduce the granularity of that function's output space, you essentially state that you don't care about local minima and maxima in that function, especially around the borders between your output scores.
As a result, you are sacrificing information that may be an essential component of the "true" function that you are searching for.
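As a small illustration of that loss of granularity (the bin edges here are arbitrary), two students 36 points apart become indistinguishable once the scores are bucketed:

import numpy as np

scores = np.array([612, 648, 652, 745, 798])
print(np.digitize(scores, bins=np.arange(200, 801, 50)))
# -> [ 9  9 10 11 12]  (612 and 648 land in the same bucket)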
I hope this has been helpful, but you really should break this question down into its many components and ask them separately on different sites -- potentially some of them do belong here on StackOverflow as well.