Recommendation Sytem without rating data - cluster-analysis

I'm trying to figure out how to solve a task given to me by a customer.
The customer is a company that distributes books to final shops (books shop). It would like to know in which way it can give a service to its sales division starting from the assumpition that we can identify cluster of shops that are similar in terms of customers they sell to, and then similar in sub-market demand.
The data which the recommendation system relies on is data on the features of the single book (author, genre, date of pubblication, etc...) and the historical data of the past book orders made by the shops to the company.
Is it possible, with this data, to setup a recommendation system? If yes, which approach should I use?
Thanks for any helps,
Filippo.

Related

How to handle static data in ES/CQRS?

After reading dozens of articles and watching hours of videos, I don't seem to get an answer to a simple question:
Should static data be included in the events of the write/read models?
Let's take the oh-so-common "orders" example.
In all examples you'll likely see something like:
class OrderCreated(Event):
....
class LineAdded(Event):
itemID
itemCount
itemPrice
But in practice, you will also have lots of "static" data (products, locations, categories, vendors, etc).
For example, we have a STATIC products table, with their SKUs, description, etc. But in all examples, the STATIC data is never part of the event.
What I don't understand is this:
Command side: should the STATIC data be included in the event? If so, which data? Should the entire "product" record be included? But a product also has a category and a vendor. Should their data be in the event as well?
Query side: should the STATIC data be included in the model/view? Or can/should it be JOINED with the static table when an actual query is executed.
If static data is NOT part of the event, then the projector cannot add it to the read model, which implies that the query MUST use joins.
If static data IS part of the event, then let's say we change something in the products table (e.g. typo in the item description), this change will not be reflected in the read model.
So, what's the right approach to using static data with ES/CQRS?
Should static data be included in the events of the write/read models?
"It depends".
First thing to note is that ES/CQRS are a distraction from this question.
CQRS is simply the creation of two objects where there was previously only one. -- Greg Young
In other words, CQRS is a response to the idea that we want to make different trade offs when reading information out of a system than when writing information into the system.
Similarly, ES just means that the data model should be an append only sequence of immutable documents that describe changes of information.
Storing snapshots of your domain entities (be that a single document in a document store, or rows in a relational database, or whatever) has to solve the same problems with "static" data.
For data that is truly immutable (the ratio of a circle's circumference and diameter is the same today as it was a billion years ago), pretty much anything works.
When you are dealing with information that changes over time, you need to be aware of the fact that that the answer changes depending on when you ask it.
Consider:
Monday: we accept an order from a customer
Tuesday: we update the prices in the product catalog
Wednesday: we invoice the customer
Thursday: we update the prices in the product catalog
Friday: we print a report for this order
What price should appear in the report? Does the answer change if the revised prices went down rather than up?
Recommended reading: Helland 2015
Roughly, if you are going to need now's information later, then you need to either (a) write the information down now or (b) write down the information you'll need later to look up now's information (ex: id + timestamp).
Furthermore, in a distributed system, you'll need to think about the implications when part of the system is unavailable (ex: what happens if we are trying to invoice, but the product catalog is unavailable? can we cache the data ahead of time?)
Sometimes, this sort of thing can turn into a complete tangle until you discover that you are missing some domain concept (the invoice depends on a price from a quote, not the catalog price) or that you have your service boundaries drawn incorrectly (Udi Dahan talks about this often).
So the "easy" part of the answer is that you should expect time to be a concept you model in your solution. After that, it gets context sensitive very quickly, and discovering the "right" answer may involve investigating subtle questions.

Creating custom daily evalutations in Moodle

I've been looking for a solution for kindergarten teachers to submit daily student evaluations (different criteria) in Moodle. So far, the closest solution that I've found is the Attendance plugin.
Does anyone know of a plugin that allows the teacher to submit a daily evaluation?
Another option that I'm looking into is Moodle Competency, which can actually fit the need, however, it looks like competency is not cumulative ... if I can find a way to make it cumulative that will be awesome.
For example, one of the competencies we have is "able to read sentences" and the scale is "1 - non-developed", "2- being developed" and "3- fully developed". At any point, the teacher or school admin would like to know how competent the student is. In our case, if this is an indicator that is being responded daily, we should be able to take the average and be able to evaluate the student.
The competency framework (to my understanding) doesn't calculate the average, rather it relies on being rated by the teacher.
Any thoughts where I should continue to look?
Attendance could be a great solution to your needs.
It could be hidden to the ones acting like students (I'm not shure if the kindergarden kids be interested in see this, maybe their parents)
Attendance have a full compatibility with course grading.
It could be configured to have diferent percentaje of final grading, so far, you can use one attendance activity for have a registry for their personal clairliness, another to record assessment in math, one more to social assessment and so on.
Finally all users with minimun acces as teacher (or another role you defined: example: school administration, scholar control) Could have facilities to export every grading to spreadsheet.
I've several years using it in a similar way you are asking to.
I hope this helps you.

Practical usage of noSQL [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I’m starting a new web project and have to decide what database to use. I know, the question is very long but please bear with me on this.
I am very familiar with relational databases and have used frameworks like hibernate to get my data from the DB into Objects. But I have no experience with noSQL DBs. I am aware of the concepts of Document, Key-Value, etc. types.
While I do my research one question pops out every time and I don’t know how someone would handle this in noSQL DBs like MongoDB or any other Document-Typed noSQL DB where consistency takes top priority.
For example: let’s assume that we are creating a small shopping management system where customers can buy and sell stuff.
We have:
CUSTOMERs
ORDERs
PRODUCTs
A single CUSTOMER can have multiple ORDERs and an ORDER can have multiple PRODUCTs.
In a traditional RDBMS I would of course have 3 tables.
In the first version of our application, the front end for the customer should display his/her personal data, ORDERs and all the PRODUCTs he or she bought per order. Also which products are available for sale. So I guess in noSQL I would model the CUSTOMER class like this:
{
"id": 993784,
"firstname": "John",
"lastname": "Doe",
"orders": [
{
"id": 3234,
"quantity": 4,
"products": [
{
"id:" 378234,
"type": "TV",
"resolution": "1920x1080",
"screenSize":37,
"price": 999
}
]
}
],
"products": [
{
"id:" 7932,
"type": "car",
"sold": false,
"horsepower": 90
}
]
}
But later I want to extend my application to have 3 different UIs instead of only the first one:
The CUSTOMER Dashboard where a customer can view all his/her orders.
The PRODUCT Dashboard where a customer can add or remove products in his/her store.
THE SOLD Dashboard where a customer can view all sold PRODUCTs ready for shipping.
One very important thing to consider (the reason why I even bother asking this question): I want to be flexible with the classes like PRODUCT because products can have different properties. For Example: A TV has screen size and resolution while a car has horsepower and other properties. And if a user adds a new product, he or she should be able to dynamically add those properties depending on what he/she knows about it.
Now to some practical use cases of two fictional users Jane and John:
Let's say, Jane buys from John. Does that mean i have to create the PRODUCTs two times? One time as a child of Jane's ORDER and another time to stay in the "products" property of John?
Later Jane wants to view all products that are available from any user. Do i have to load every user to query the "products" property to generate a list of all products?
In version 2 of the application i want to enable John to view all outgoing orders (not orders he made but orders from other users who bought stuff from him) instead of viewing all sold products. How would this be done in noSQL? Would i now need to create an "outgoing" array of orders and duplicate them? (an outgoing order of Jane is an incoming order of John)
Some of you may say that noSQL is not right for this use case but isn’t that very common? Especially when we do not know what the future brings? If it does not fit for this use case, what use case would it fit into? Only baby applications (I guess not)? Wasn’t noSQL designed for more complex and flexible data?
Thank you very much for your advises and opinions!
EDIT 1:
Because this question was put on hold because of the unprecise question:
I made a very clear and simple example. So my question is not general about the use of noSQL but how to handle this specific example. How would a experienced noSQL user handle this use case? How to model this data? A recommendation to simply not use noSQL at all for this use case is also a valid answer to me.
I simply want to know how to use a noSQL database but still be able to manage entities and avoid redundancy.
For example: Are MongoDB's DBRefs/Manual refs a good way to achieve this? Performance issues because of multiple queries? What else to think about? I guess these questions can probably be answered quite well.
There probably isn't the one right answer to your question. But I'll make a start.
While it is technically possible in NoSQL to store some business entity together with all entities that are transitively linked with it (like Customer, Order, Product), it is't always clever to do so. The traditional reasons for separating entities, namely redundancies and therefore update and delete anomalies, don't just go away because a different platform is used.
So if you stored the product description with every customer who buys or sells this product, you will get update anomalies. If you have to change the screen size from 37 to 35, you'll have to find all customer records containing this product, which can be quite cumbersome.
Also, building up such a deep nested structure favors one direction of evaluating those structures over all other directions. If you put all orders and products into the customer document, this is very fine for getting a comprehensive view for a customer: whatever she bought throughout her lifetime. But if you want to query your database by orders (which orders need to be fulfilled tonight?) or products (who ordered product 1234?) you'll have to load tons of data that are of no interest to this query.
Similar questions are due to storing all orders with a customer. Old orders will sometimes still be of interest, so they may not be deleted. But do you want to load lots of orders everytime you load the customer?
This doesn't mean not to make use of the complex structuring made possible by a document store. As a rule of thumb, I would suggest: As long as the nested information belongs to the same business entity, put it into one document. If, e.g., the product description has some hierarchic structure, like nested sections consisting of text, pics, and videos, they may all go into one document. But entities with a totally different life cycle, like customers, orders, and suppliers, should be kept separate. Another indicator is references: A product will frequently be referenced as a whole, e.g. when it is ordered by a customer or ordered from a supplier. But the different parts of the product description may possibly never be referenced from the outside.
This rule of thumb wasn't completely precise, and it's not supposed to be. One person's business entity is another person's dumb attribute. Imagine the color of a car: For the car owner, it's just a piece of information describing a car. For the manufacturer, it's a business entity, having an availability, a price, one or more suppliers, a way of handling it, etc.
Your question also touches the aspect of dynamically adding attributes. This is often praised as one of the goodies of NoSQL, but it's no free lunch. Let's assume, as you mentioned, that the user may add attributes. That's technically possible, but how will these attributes be processed by the system? There won't be a specific view, nor specific business rules, for those attributes. So the best the system can do is offer some generic mechanism for displaying those attributes that were defined at runtime and never reflected in the program code.
This doesn't mean the feature is useless. Imagine your product description may be complex, as described above. You might build a generic mechanism to display (and edit) descriptions made up of sections, texts, images, etc., and afterwards the users may enter descriptions of unlimited width and depth. But in contrast, imagine your user will add a tiny delivery date attribute to the order. Unless the system knows specifically how to interpret this date, it will just be a dumb piece of information without any effect.
Now imagine not the user, but the developer adds new attributes. She has the opportunity to enhance the code at the same time, e.g. building some functionality around delivery dates. But this means that, although the database doesn't require it by its own, a new release of the software needs to be rolled out to make use of the new information.
The absence of a database scheme even makes the programmer's task more complicated. When a relational table has a certain column, you may be sure that each of its records has this column. If you want to make sure that it has a meaningful value, make it not null, and you may be sure that each record contains a value of the correct data type. Nothing like that is guaranteed by schemaless databases. So, when reading a record, defensive programming is needed to find out which parts are present, and whether they have the expected content. The same holds for database maintenance via administrative tools. Adding an attribute and initializing it with a default value is a 2-liner in SQL, or a couple of mouse clicks in pgadmin. For a schemaless database, you will write a short program on your own to achieve this.
This doesn't mean that I dislike NoSQL databases. But I think the "schemaless" characteristic is sometimes overestimated, and I wouldn't make it the main, or only, reason to employ such a database.

Help design a class diagram of a booking system

I got a request from my friend to write a php booking system module for his bowling club website, I am thinking to make this module as generic as possible, ie can also be used for booking pool tables etc.
So I started to draw up my UML class diagram:
I have 2 interfaces IBookingHandler(has implementation like BowlingBookingHandler) to handle different types of bookings and IPriceOption(has implementation like BowlingNormalPrice) to handle different types of prices. IBookingHandler uses IPriceOption to generate the total cost of the booking.
A data class "Booking" which represent a booking record in object
A ata parent data class "Type" and subclass "Lane" which has methods like etCurrentStock" to get instances of types for the booking.
Could anyone please review this design, and let me know what things are wrong or missing?
Much appreciated.
James Lin
You probably want a separate class for the customer. One customer could possibly have multiple bookings.
Is it wise to ha a implemenation for normal price? what's normal price? what if they want senior price during weekdays and disco bowling price during the evenening, and on new years eve they want another price. You don't want to release a new version everytime the price changes.
If you want to connect it to the bowling lane system ( there are plenty of them on the market) you probably want to have knowledge of all the players not just the one making the booking.
The more customer info you collect the better for your friend. Since he then have a cheap and easy way of advertising.

Is it possible to get the index of a exchange using Finance::Quote?

I need to get the index of a exchange like NASDAQ rather than the price of a specific stock in that exchange. I suppose that Finance::Quote will come to the rescue , but after a quick go-through of the document, I find it the way one can use the module for query is like:
%info = $q->fetch("australia","CML")
which means both the exchange and the stock should be specified in the query. then the question is: does the index itself can be treated as a stock and has a symbol name which can be used in the query?
Of course, if you have other way can meet my needs rather than using Finance::Quote, please feel free to write down your solution.
The problem with your question is that you are assuming that there is just one index for a particular exchange. Whilst there may well be a particular index that is dominant (eg. for stocks primarily traded on the London Stock Exchange, the FTSE 100 might be considered the main index; similarly for the NYSE it would be the Dow Jones Industrial Average) other exchanges may have a less clear leader in their collection of associated indicies (eg. for the Australian Stock Exchange, the S&P/ASX 200 and the All Ordinaries index are both frequently quoted side-by-side in the evening broadcast news).
Symbology of stocks, indicies, option chains, futures, etc is quite a complicated field in financial IT. Many of the symbology standards are backed by a data vendor (eg. Reuters, Bloomberg) and use of their standards requires a commercial license. On the other hand there are other efforts aiming to make symbology more open (Bloomberg themselves are behind one of these efforts).
I'm not familiar with the data sources of the Finance::Quote package you reference, but if you are serious about accessing market data (ie. prepared to pay for it) but don't need the cost/complexity/speed of a solution from Reuters, Bloomberg, etc, you could do alot worse than check out what Xignite offers in the way of market data accessible via web services.
the symbol for the nasdaq composite is "^IXIC". For nyse composite it's "^NYA".
each quote provider might have a different syntax though.