How to cluster preferences of customers? - filtering

Briefly, I have a data set of sales for the year, and I need to cluster users' preferences. I have a vector of these sales for each user. For example,
I have a vector of John's purchases, which consists of ones and zeros (bought this thing or not).
[1 ,0 ,0 ,1 ,0 ]
[product1,product2, product3,product4,product5]
This means that John bought product1 and product4. Maybe someone has seen good articles about this. I need to cluster customers and offer them different things: for example, if a customer has a nearest neighbor (at a small enough distance), then I will be able to offer him things from this neighbor.
Sorry in advance for my bad English, and thanks.
I am interested in ideas or articles!
The size of the real customer matrix is 10^8*10^6.

Use market basket analysis, not cluster analysis.
In particular, it does not assume that every customer is typical (part of a cluster) or that a customer can only be part of one.
Rules of the kind
butter, bread -> marmalade
are excellent for product offering.
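As a rough illustration of such rules, here is a minimal Python sketch that counts item pairs and derives single-antecedent rules (x -> y) with their support and confidence. The function name `pair_rules`, the thresholds, and the toy baskets are assumptions made for this example only; at the 10^8*10^6 scale mentioned in the question, a dedicated frequent-itemset implementation (Apriori or FP-growth style) would be needed instead of plain counters.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for rules x -> y."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    # Highest confidence first, then alphabetical for a stable order.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Toy data, invented for illustration.
baskets = [
    {"butter", "bread", "marmalade"},
    {"butter", "bread"},
    {"bread", "marmalade"},
    {"butter", "marmalade"},
    {"bread"},
]
rules = pair_rules(baskets)
for x, y, support, confidence in rules:
    print(f"{x} -> {y}  support={support:.2f}  confidence={confidence:.2f}")
```

A customer who bought x but not y is then a candidate for an offer of y, without having to be assigned to any cluster.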

Related

Product Classification

This is my first post on Stack Overflow, but I've consulted this website very often. I was hoping someone could help me out.
I'm trying to map a product catalogue (not sure if that's the right term) in order to simplify it in the end. What I would like to achieve is an overview of the current product structure.
Variables:
Product_Name: definition of the product, total of 27 unique products
Product_Type: either data, voice, add-on or non-services
Customer_segment: 6 distinct segments
Carrier_type: how is the product delivered? (fiber/coax etc.)
Optional variables:
Account_ID: customer number
Revenue
What am I trying to achieve?
I would like to see how the columns relate to each other. So if I have for example product A, then I would like to see which options are related to this particular product.
Product A
    Segment A: Carrier A, Carrier B
    Segment B: Carrier A, Carrier C
    Segment C: Carrier B, Carrier D
I'm looking for ways to build such decision trees in R, since I do not want to draw them up by hand. It would be even better to include the account ID and revenue somehow. I've already looked into Market Basket Analysis, but this only gives me the most common combinations of products per account (basket). Now I want to see which products belong together and whether there is any overlap between them, in order to simplify the catalogue in the end.
I'm not asking for a complete set of code but more like a push into the right direction. Does anyone know which method of analysis would be suited to do this?
Thanks!

Round-Robin between Customers

So here is the deal...
Assume that you have multiple clients that sell their items at different prices. As the owner, you decided to give a fair chance to every client that uses your platform. Therefore, you need some sort of a way to fill the purchase orders between the sellers in such a way that everyone gets their items sold.
A round-robin-like method is required! For example, here is a way to describe the data:
Ohad, items: 5, price: 1$
Daniel, items: 2, price: 1$
Jim, items: 3, price: 1$
Tim, items: 1, price: 1.05$
If a client wishes to buy 4 items in bulk, he should receive the items of Ohad -> Daniel -> Jim -> Ohad
(And the next time a bulk purchase will be executed, Daniel would start the lead)
If the client wishes to buy 11 items, first it will go around the people that share the same lowest price, and then it will add Tim's on top to match the requirement of 11 total items.
And of course, if the list of sellers was longer, the round-robin principles should still exist.
I am trying to think of an efficient way to get this done... I find most solutions to be very resource-consuming or not 100% working.
I seriously don't want to limit people in their ideas... so I would love to hear anything and we will take it from there! :/
Cheers!
Go with KISS and don't make it more complicated than necessary.
Let the law of large numbers work for you and randomly pick a seller for each order you get, then fill the whole order from that seller. If there are enough orders and you use a good enough random number generator (random() should do), it will even out in time.
This will ignore price differences, so maybe you could use
ORDER BY random() * price LIMIT 1
to pick the seller to serve an order.
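A minimal Python sketch of the same idea, mirroring the ordering key above: each seller gets the key `random() * price` and the smallest key wins, so cheaper sellers are picked slightly more often. The function name `pick_seller` and the dictionary layout are assumptions for illustration; stock limits (`items`) are deliberately ignored here, as in the SQL fragment.

```python
import random
from collections import Counter


def pick_seller(sellers, rng=random):
    """Pick the seller with the smallest random() * price key."""
    return min(sellers, key=lambda s: rng.random() * s["price"])


# Sellers from the question.
sellers = [
    {"name": "Ohad", "items": 5, "price": 1.00},
    {"name": "Daniel", "items": 2, "price": 1.00},
    {"name": "Jim", "items": 3, "price": 1.00},
    {"name": "Tim", "items": 1, "price": 1.05},
]

# Simulate many orders to see how the picks spread over the sellers.
rng = random.Random(0)
counts = Counter(pick_seller(sellers, rng)["name"] for _ in range(10_000))
print(counts)
```

Passing a seeded `random.Random` makes the simulation repeatable; in production the module-level generator would do.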

tools for visualising money flows in a complex company structure

I'm working on a project where we want to visualise money flows in complex company structures. The context is real-estate investment in Berlin, Germany, which currently faces an acute shortage of apartments.
Berlin is a highly attractive market for investors, as the market is still very affordable compared to other European capitals. But as authorities do not support building affordable housing, and don't build themselves, private investors have a relative dominance.
Among private investors, a large share use company networks spread across Europe to aggressively avoid taxes. They use every legal possibility, with parent companies in European tax havens like Jersey, Luxembourg and Cyprus. The finance minister of Berlin estimates the losses due to tax avoidance by real-estate investors at 100 to 200 million € per year.
Our project aims to show how these investors work. The goal is to make this highly technical subject understandable for non-specialists, as a first step toward raising public awareness.
We want to show in detail how money flows across company structures that can have 30+ subsidiaries spread across Europe. Here are examples: https://ibb.co/album/kPT6Jv
Example of a complex company structure: the investor Taliesin, Berlin/Jersey
You can see that, as an organigram, these structures are more like a puzzle and need a specialist to explain what they mean.
The representation should be dynamic, so that you can see the rents rising from the houses, trickling through the network, and ending at the beneficiaries. Ideally, you'd see how much money remains at each station of the network.
A user, or a tenant, could enter the amount of his rent in a field, say 1000 €, and then follow its path through the companies. At the end he'd see that, of his rent, 140 € remain in the house for upkeep, 220 € are used for the owners' management costs, 240 € go to the shell companies and the managers' companies, 380 € go to the banks as interest on the company's loans, and 20 € are paid in taxes.
This example is taken from the analysis of the investor Taliesin in Berlin, one of the most aggressive tax avoiders, and the figures are actual figures. I co-wrote an article on Taliesin for the Berlin daily Tagesspiegel, 8 Oct 2016: https://www.tagesspiegel.de/berlin/share-deals-auf-dem-berliner-immobilienmarkt-wie-investoren-den-kreuzberger-buechertisch-ausbooteten/14658204.html
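The data behind the rent field described above could be as simple as the following Python sketch. It scales the quoted per-1000 € figures to any rent; the linear scaling, the function name `split_rent` and the station labels are assumptions for illustration, not the result of the actual analysis.

```python
# Shares per 1000 EUR of rent, taken from the example figures above.
SHARES_PER_1000 = {
    "upkeep of the house": 140,
    "owners' management costs": 220,
    "shell and managers' companies": 240,
    "bank interest on company loans": 380,
    "taxes": 20,
}


def split_rent(rent_eur):
    """Split a rent proportionally across the stations (assumes linear scaling)."""
    total = sum(SHARES_PER_1000.values())
    return {
        station: rent_eur * share / total
        for station, share in SHARES_PER_1000.items()
    }


breakdown = split_rent(850)
for station, amount in breakdown.items():
    print(f"{station}: {amount:.2f} EUR")
```

A flow visualisation (for example a Sankey-style diagram) would take exactly this kind of station-to-amount mapping as input.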
In our project, "Who do I actually pay rent to", we'll instruct tenants how to research and transmit the data to an analyst-team, who'll then verify and reconstruct the money flow.
Hope this question is not too long.
So back to the actual question:
Is there a tool to visualise this type of flows? I've found a couple of blockchain/bitcoin visualisation and ready-mades, like Key lines.
Do you know developers who could help in finding and adapting the visualisation tools? There will be a couple of other visualisations to do.
The project will be funded, and a large chunk of the money will be reserved for the visualisation tool, as this is the main aspect.
If you have any questions, please ask.
Looking forward to your answers.
Best,
Adrian
Is there a tool to visualise this type of flows? I've found a couple
of blockchain/bitcoin visualisation and ready-mades, like Key lines.
You need a data visualisation app.
In short, yes: this can be built with Python, using D3.js to display the data the way you want.
Python : https://www.python.org/
D3.JS : https://d3js.org/
do you know developers who could help in finding and adapting the
visualisation tools? There will be a couple of other visualisations to
do
Yes, me.

Recommendation System without rating data

I'm trying to figure out how to solve a task given to me by a customer.
The customer is a company that distributes books to retail shops (bookshops). It would like to know how it can provide a service to its sales division, starting from the assumption that we can identify clusters of shops that are similar in terms of the customers they sell to, and therefore similar in sub-market demand.
The data the recommendation system relies on consists of the features of each book (author, genre, date of publication, etc.) and the historical data of past book orders made by the shops to the company.
Is it possible, with this data, to set up a recommendation system? If yes, which approach should I use?
Thanks for any help,
Filippo.

OpenCart: Flat Option Cost + Per Item Cost for Products

In OpenCart, I have a product for which you select printing colors.
Basically the pricing should be: each additional printing color costs a flat rate of $50, plus $0.25 for each item.
So if a person were ordering 1000 items with 2 colors, the cost would need to be BASECOST + $100 ($50 x 2) + $250 (1000 x $0.25).
Right now I'm only able to set up the cost per item. Since people are going to be ordering both large and huge quantities, there's no easy way to build the flat rate into the per-item price.
I could have sworn I saw a free extension a while ago that allowed you to set both a flat price for an option and a per-item price based on the quantity. I tried searching for everything I could think of, but the only thing I could find is for shipping (we already have a pretty complex setup for the shipping, so I can't mess with that).
Has anyone come across a solution or a simple extension for this problem? It seems like a simple thing, but I still can't find a solution for the life of me.
Thanks!
The easiest way I can see to do this would be to set fixed costs for certain price breaks, which can be done through the Discount tab of each product and can even be set per customer group if you have wholesale as well as regular customers or other customer groups.
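For reference, the pricing rule from the question can be written down as a small Python sketch (this is not OpenCart code; the function name `order_cost` and the quantities used for the price breaks are assumptions for illustration). It follows the question's example, where the $0.25 is charged once per item regardless of the number of colors, and it shows why a single per-item price cannot capture the flat fee: the effective unit price changes with quantity.

```python
def order_cost(base_cost, colors, quantity, flat_per_color=50.0, per_item=0.25):
    """Total cost: base + flat fee per color + per-item surcharge."""
    return base_cost + flat_per_color * colors + per_item * quantity


# Example from the question, with BASECOST left at 0 for clarity.
total = order_cost(base_cost=0.0, colors=2, quantity=1000)
print(total)

# Effective per-item surcharge at a few hypothetical price breaks.
for quantity in (100, 1000, 10000):
    per_unit = order_cost(0.0, 2, quantity) / quantity
    print(quantity, round(per_unit, 4))
```

Values computed this way could be entered as the per-unit prices at each quantity break in the Discount tab, with the caveat that they are only exact at the break quantities themselves.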