Classifying chemical raw materials in Schema.org

I need to describe the products of a company that produces chemical, mineral, and agricultural raw materials.
To explain more clearly, some examples are ammonium sulfate (an agricultural product), phosphate (a mineral), and sulphuric acid (a chemical raw material).
After reviewing http://schema.org/Product, http://schema.org/Thing, https://schema.org/docs/meddocs.html, and https://schema.org/MedicalEntity, I don't know which one (or perhaps another type entirely) is the best to use.

As you said, these are products (i.e., they offer these, right?), so the Product type seems to be appropriate.
MedicalEntity (or one of its sub-types) would only be appropriate for those products that are "related to health and the practice of medicine". If that’s the case, you could add it as additional type.
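For illustration, here is a minimal sketch of what the JSON-LD markup could look like for the sulphuric acid example, using Python only to emit the JSON; the name and description values are placeholders, and the additionalType line applies only to genuinely health-related products:

import json

# Minimal JSON-LD sketch for one product; name/description are placeholders.
product = {
    "@context": "http://schema.org",
    "@type": "Product",
    "name": "Sulphuric Acid",
    "description": "Chemical raw material.",
    # Only for products "related to health and the practice of medicine":
    # "additionalType": "http://schema.org/MedicalEntity",
}

# Emit the body of a <script type="application/ld+json"> element.
print(json.dumps(product, indent=2))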

Watson Assistant: Can I define Intent using Entities in the Examples?

How do I create an #Intent that looks something like this:
How much is a #ProductType?
where #ProductType is a simple entity consisting of:
Soft Drinks: Coke, Pepsi, Sprite, Fanta
Fruits: Apple, Banana, Watermelon
I tried adding an intent with the above settings, but it doesn't seem to work. Is this natively supported in IBM Watson, or do I need to handle it manually in the dialog, using conditions and such? Please advise.
The training is based on regular language and typical sentences or phrases. So #ProductType is not what you want in the example phrase; use any of the fruits or drinks instead.
By defining the entities, Watson Assistant later learns the connection and how to identify the entities and intents.
To get started, you define the intents and entities. Both can be imported from lists. Then you add the dialog, which references the different types.
This blog post should give insight into all the ways to train an entity and how entities are used within intents.
https://medium.com/ibm-watson/all-about-entities-dictionaries-and-patterns-with-watson-assistant-part-1-5ef7254df76b
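To make the "define intents and entities, import from lists" step concrete, here is a rough sketch of how the data from the question could look in a classic Watson Assistant (v1) workspace import, expressed as a Python dict; the intent name product_price is made up, so check the field layout against the docs for your Assistant version:

import json

# Rough sketch of intent/entity definitions in the shape of a classic
# Watson Assistant (v1) workspace. Intent examples use plain natural
# language; the entity supplies the product values and synonyms.
workspace_fragment = {
    "intents": [
        {
            "intent": "product_price",  # hypothetical intent name
            "examples": [
                {"text": "How much is a Coke?"},
                {"text": "What does a banana cost?"},
                {"text": "Price of a bottle of Sprite, please"},
            ],
        }
    ],
    "entities": [
        {
            "entity": "ProductType",
            "values": [
                {"value": "Soft Drinks",
                 "synonyms": ["Coke", "Pepsi", "Sprite", "Fanta"]},
                {"value": "Fruits",
                 "synonyms": ["Apple", "Banana", "Watermelon"]},
            ],
        }
    ],
}

print(json.dumps(workspace_fragment, indent=2))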
There are a number of possible pipelines you can choose from.
1. Indirect references: this is the preferred method.
Use natural language in your intent training data. "I want to buy a pear"
Watson will automatically see the other values you have related to pear and use those as intent training as well. This will be the fastest and simplest way to manage your data (see the sketch after this list).
2. Direct references: this should only be used if absolutely necessary
Directly reference the entity in your intent data. "I want to buy a #pear"
Nothing in the UI tells you this works, but it does. It tells Watson the entity is a very important term: it increases the weight of the term and references all its synonyms with high weight. This is more effort for you, since you have to go through your entire workspace and relabel everything this way, which is why it is not recommended unless absolutely necessary. By doing this, you also tell Watson to ignore the various fruits as entities when the system sees them without the # symbol, which is not ideal.
3. Contextual entities: this is highlighting them, as in your screenshot.
Note that the UI has been updated, so there is now an annotation mode instead of just highlighting. This builds a model around the entity, and it is good for things like names or locations, but not necessary for a small list of items like crayons in a box or fruit in a store. It will ignore all of the dictionary values you've created and only look at the model. It should be used, per the blog above, when the use case is ideal.
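Here is the sketch promised under option 1: a compact, illustrative contrast of indirect and direct references, using the same notation as the answer above (the example sentences are made up):

# 1. Indirect references (preferred): plain natural language. Watson relates
#    "pear" to the other values and synonyms of the same entity on its own.
indirect_examples = [
    "I want to buy a pear",
    "How much is a watermelon?",
]

# 2. Direct references (only if necessary): marking the term tells Watson it
#    is very important, but suppresses entity detection for unmarked values.
direct_examples = [
    "I want to buy a #pear",
]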
What data_henrik answered was partially correct, but it doesn't seem like Watson Assistant "automatically" learns the preferred entity just from entering plain-text examples into the intent. That step is required, but we still need to do one more step.
After keying in the good plain-text examples into the intent, we still need to right-click on the text string of the possible entity entry and then choose (teach Watson) the correct entity name from the dropdown list that appears.
Only then does Watson start to understand that this intent uses that entity, I suppose.
Thank you data_henrik, I appreciate your hint.

Translating a State Transition System to properties LTL formulae

In the context of bounded model checking, one describes the system as a State Transition System and the properties that need to be checked.
When one needs to provide multiple system descriptions and properties to the model-checking tool, it can become tedious to write the properties by hand. In my case, I use a temporal logic (LTL).
How does one automate the process of translating/parsing the system description and deriving verifiable properties from it (the description ideally being a set of initial states, a set of transitions, and a set of states)?
For example, consider the microwave example given here. Given such a system description, how can I arrive at the specifications in an efficient manner?
There is no open-source tool that I know of that can do this. Any approaches in terms of ideas or theories are welcome.
You can't automatically derive LTL formulae from automata as you suggest, because automata are more expressive than LTL formulae.
That leaves you with mainly two options:
1. Find a verification tool that accepts specifications directly expressed as automata. (I'm not sure which ones do, but I suspect it is worth checking SPIN and NuSMV for this feature.)
2. Use a meta-specification language that makes the writing of specifications easier, for example SALT (https://www.isp.uni-luebeck.de/salt, doi: 10.1007/11901433_41) or IEEE 1850/PSL. While PSL is more a language definition for tool implementors, SALT already offers a web front-end that translates your input straight into LTL.
(By the way, I find your approach methodologically questionable: you're not supposed to derive formulae from your model but from your initial system description, since it is this very model that you're going to verify. But I am not 100% sure I understood this point of your question correctly.)
I think the properties of a system, e.g. a microwave, come from technical and common-sense expectations and requirements, not from the model. E.g., a microwave is supposed to cook the food, but it is not supposed to cook with the door open. Nevertheless, a repository of typical LTL patterns can be useful for defining properties. It also lists the patterns alongside more familiar regex and automata formulations.
If you are certain you still want to translate automata to LTL automatically, check
https://mathoverflow.net/questions/96963/translate-a-buchi-automaton-to-ltl
Kansas Specification Property Repository
http://patterns.projects.cs.ksu.edu/documentation/patterns.shtml
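For concreteness, here are two instantiated patterns for the microwave example, written in LTL (LaTeX notation); the proposition names heat, closed, and start are assumed here, not taken from any particular model:

% Invariant (safety): the oven never heats unless the door is closed.
\mathbf{G}\,(\mathit{heat} \rightarrow \mathit{closed})

% Response (liveness): starting the oven is eventually followed by heating.
\mathbf{G}\,(\mathit{start} \rightarrow \mathbf{F}\,\mathit{heat})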

Schema.org type for online encyclopedia entries - article or dataset?

I am looking for recommendations on what schema.org type to use for entries in an online encyclopedia.
My initial thought was to define entries as 'Article', but I am now leaning towards 'Dataset', as I don't feel these are articles in a news sense.
Looking for any feedback.
Article should be appropriate. It’s not only for news (for which there is the more specific NewsArticle type), but also for scientific papers, microblog/blog posts, guides/tutorials, forum posts, etc.
Dataset is for "structured information". The typical encyclopedia entry probably isn’t structured in that sense (e.g., I wouldn’t use it for Wikipedia, but maybe for DBpedia).
If you think both types could be appropriate for your case, you could use them together (i.e., each entry is an Article and a Dataset).
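If you do combine them, a minimal sketch of such an entry in JSON-LD (emitted via Python here; the name and URL are placeholders) would give "@type" an array so both types apply:

import json

# Sketch of an encyclopedia entry typed as both Article and Dataset.
entry = {
    "@context": "http://schema.org",
    "@type": ["Article", "Dataset"],  # JSON-LD allows multiple types
    "name": "Example encyclopedia entry",          # placeholder
    "url": "https://example.com/entries/example",  # placeholder
}

print(json.dumps(entry, indent=2))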

Is it possible to get the index of an exchange using Finance::Quote?

I need to get the index of an exchange like NASDAQ, rather than the price of a specific stock on that exchange. I suppose that Finance::Quote will come to the rescue, but after a quick go-through of the documentation, I find that the way one queries with the module is like:
my %info = $q->fetch("australia", "CML");
which means both the exchange and the stock should be specified in the query. The question then is: can the index itself be treated as a stock, with a symbol name that can be used in the query?
Of course, if you have another way to meet my needs without using Finance::Quote, please feel free to write down your solution.
The problem with your question is that you are assuming there is just one index for a particular exchange. Whilst there may well be a particular index that is dominant (e.g. for stocks primarily traded on the London Stock Exchange, the FTSE 100 might be considered the main index; similarly, for the NYSE it would be the Dow Jones Industrial Average), other exchanges may have a less clear leader among their associated indices (e.g. for the Australian Stock Exchange, the S&P/ASX 200 and the All Ordinaries index are both frequently quoted side by side in the evening broadcast news).
Symbology for stocks, indices, option chains, futures, etc. is quite a complicated field in financial IT. Many of the symbology standards are backed by a data vendor (e.g. Reuters, Bloomberg), and use of their standards requires a commercial license. On the other hand, there are other efforts aiming to make symbology more open (Bloomberg themselves are behind one of these efforts).
I'm not familiar with the data sources of the Finance::Quote package you reference, but if you are serious about accessing market data (i.e. prepared to pay for it) and don't need the cost/complexity/speed of a solution from Reuters, Bloomberg, etc., you could do a lot worse than check out what Xignite offers in the way of market data accessible via web services.
The symbol for the NASDAQ Composite is "^IXIC"; for the NYSE Composite it's "^NYA".
Each quote provider might use a different syntax, though.
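If you are open to an alternative outside Finance::Quote, here is a small sketch in Python using the community yfinance package (a third-party Yahoo Finance wrapper whose API may change between releases):

import yfinance as yf  # pip install yfinance

# "^IXIC" is the NASDAQ Composite; "^NYA" is the NYSE Composite.
nasdaq = yf.Ticker("^IXIC")

# Most recent daily bar; the index level is the closing value.
history = nasdaq.history(period="1d")
print(history["Close"].iloc[-1])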

How do I adapt my recommendation engine to cold starts?

I am curious what methods/approaches exist to overcome the "cold start" problem: when a new user or a new item enters the system, the lack of information about this new entity makes producing recommendations a problem.
I can think of doing some prediction-based recommendation (using attributes like gender, nationality, and so on).
You can cold-start a recommendation system.
There are two types of recommendation systems: collaborative filtering and content-based. Content-based systems use metadata about the things you are recommending; the question then is which metadata is important. Collaborative filtering doesn't care about the metadata: it just uses what people did or said about an item to make a recommendation, so you don't have to worry about which terms in the metadata are important. In fact, you don't need any metadata at all to make the recommendation. The problem with collaborative filtering is that you need data.
Before you have enough data, you can use content-based recommendations. You can also provide recommendations based on both methods: at the beginning go 100% content-based, then mix in collaborative filtering as you get more data.
That is the method I have used in the past.
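A minimal sketch of that blend, assuming content_score and collab_score come from your own scorers and that the ramp length is something you tune:

# Shift weight from content-based to collaborative filtering as data grows.
def blended_score(content_score: float, collab_score: float,
                  n_interactions: int, ramp: int = 1000) -> float:
    # alpha decays from 1.0 (no data: 100% content-based) toward 0.0
    # as the number of observed interactions approaches `ramp`.
    alpha = max(0.0, 1.0 - n_interactions / ramp)
    return alpha * content_score + (1.0 - alpha) * collab_score

# Early on (50 interactions), the content-based score dominates.
print(blended_score(content_score=0.8, collab_score=0.3, n_interactions=50))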
Another common technique is to treat the content-based portion as a simple search problem: you use the metadata as the text or body of your document, then index your documents. You can do this with Lucene & Solr without writing any code.
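In the same spirit, if you would rather stay in code than run Lucene & Solr, here is a minimal content-based sketch with scikit-learn that treats each item's metadata as a searchable document (the items are made up):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Each item's metadata is concatenated into one text "document".
items = {
    "item_a": "stainless steel water bottle 750ml outdoor",
    "item_b": "insulated steel travel mug coffee",
    "item_c": "cotton t-shirt blue casual",
}

names = list(items)
vectors = TfidfVectorizer().fit_transform(items.values())

# Rank the other items by metadata similarity to item_a.
sims = cosine_similarity(vectors[0], vectors).ravel()
ranked = sorted(zip(names, sims), key=lambda p: -p[1])[1:]
print(ranked)  # item_b (shared terms) outranks item_c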
If you want to know how basic collaborative filtering works, check out Chapter 2 of "Programming Collective Intelligence" by Toby Segaran.
Maybe there are times you just shouldn't make a recommendation? "Insufficient data" should qualify as one of those times.
I just don't see how prediction recommendations based on "gender, nationality and so on" will amount to more than stereotyping.
IIRC, places such as Amazon built up their databases for a while before rolling out recommendations. It's not the kind of thing you want to get wrong; there are lots of stories out there about inappropriate recommendations based on insufficient data.
I'm working on this problem myself, but this paper from Microsoft on Boltzmann machines looks worthwhile: http://research.microsoft.com/pubs/81783/gunawardana09__unified_approac_build_hybrid_recom_system.pdf
This has been asked several times before (naturally, I cannot find those questions now :/), but the general conclusion was that it's better to avoid such recommendations. In various parts of the world the same names belong to different sexes, and so on...
Recommendations based on "similar users liked..." clearly must wait. You can give out coupons or other incentives to survey respondents if you are absolutely committed to doing predictions based on user similarity.
There are two other ways to cold-start a recommendation engine.
1. Build a model yourself.
2. Get your suppliers to fill in key information in a skeleton model. (This may also require $ incentives.)
There are lots of potential pitfalls in all of these, which are too common-sense to mention.
As you might expect, there is no free lunch here. But think about it this way: recommendation engines are not a business plan. They merely enhance the business plan.
There are three things needed to address the Cold-Start Problem:
1. The data must have been profiled such that you have many different features (with product data, the term used for 'feature' is often 'classification facet'). If you don't properly profile data as it comes in the door, your recommendation engine will stay 'cold', as it has nothing with which to classify recommendations.
2. MOST IMPORTANT: You need a user-feedback loop with which users can review the personalization engine's suggestions. For example, a Yes/No button for 'Was this suggestion helpful?' should queue a review that moves the item from one training dataset (the 'Recommend' set) to the other (the 'Do Not Recommend' set).
3. The model used for Recommend/Do Not Recommend suggestions should never be considered a one-size-fits-all recommendation. In addition to classifying the product or service to suggest to a customer, how the firm classifies each specific customer matters too. If the engine is functioning properly, customers with different features will get different Recommend/Do Not Recommend suggestions in a given situation. That is the 'personalization' part of personalization engines.
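A minimal sketch of the feedback loop from point 2, assuming recommendations are keyed by (customer segment, item); every name here is hypothetical (Python 3.9+ for the annotations):

# A "Was this suggestion helpful?" click moves the (segment, item) pair
# into the matching training set for the next retraining run.
recommend_set: set[tuple[str, str]] = set()
do_not_recommend_set: set[tuple[str, str]] = set()

def record_feedback(segment: str, item: str, helpful: bool) -> None:
    pair = (segment, item)
    if helpful:
        do_not_recommend_set.discard(pair)
        recommend_set.add(pair)
    else:
        recommend_set.discard(pair)
        do_not_recommend_set.add(pair)

# Different customer segments can accumulate different labels for the same
# item, which is the "personalization" part of point 3.
record_feedback("frequent_buyer", "sku_123", helpful=True)
record_feedback("bargain_hunter", "sku_123", helpful=False)
print(recommend_set, do_not_recommend_set)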