Most efficient voting system in Firebase - google-cloud-firestore

I'm working on an app that lets users upload pictures and rate each other's pictures (a vote of 1 to 5 stars). But I keep thinking that the database design I'm building is not efficient enough. I hope someone can give me advice.
Right now I have a database table called "images". It has the following fields:
doc_id
user_id
image
amount_of_votes
rating
Every time a user votes, amount_of_votes is incremented by 1 and the rating is recalculated as (rating * old amount of votes + new vote) / new amount of votes.
But this gives me the following problems:
If 2 users look at the same image at the same time (for example an image with 2 votes of 1 star each) and the first user votes 5 stars, a new rating is calculated. But if the second user then votes 1 star, that vote is still applied to the stale state of 2 votes of 1 star, and the new 5-star vote is not taken into account.
Also, I don't think it's safe to do it this way. A user could in theory just change the amount to 1 and the rating to 1.
I want to build a new voting system with a new table called votes that holds the user_id and the vote that the user gives. But that gives me the following problem:
If a user clicks on a really popular image, the app has to load a lot of votes and calculate an average. That isn't efficient.
Does anyone know what the best system to use is? I have spent days searching the internet but haven't found it yet...

I haven't tried implementing it myself, but my approach for this would be to use a snapshot listener.
It keeps the information in the application up to date whenever something changes in the document. The voting system I use collects the 5-star, 4-star, 3-star, etc. ratings separately. For example: how many 5-star ratings a user got, how many 4-star ratings a user got, and so on.
Then, to calculate the average rating I use the following formula:
((no_of_votes_for_5stars * 5) + (no_of_votes_for_4stars * 4) + (no_of_votes_for_3stars * 3) + (no_of_votes_for_2stars * 2) + (no_of_votes_for_1star * 1)) / (no_of_votes_for_5stars + no_of_votes_for_4stars + no_of_votes_for_3stars + no_of_votes_for_2stars + no_of_votes_for_1star)
You can save the number of stars data separately in the Firestore database and whenever someone rates a photo, you can increment the current count. And you can implement the above formula inside the app code so your application gets the number of stars info and calculates the average using the above formula, and then displays it.
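As a rough illustration of the per-star counters described above, here is a minimal Python sketch. It only models the arithmetic in plain dictionaries; the names `average_rating` and `add_vote` are made up for this example and are not Firestore APIs. In Firestore itself each bump would presumably be done as an atomic increment or inside a transaction rather than as a client-side read-modify-write, which is not shown here.

```python
from typing import Dict


def average_rating(star_counts: Dict[int, int]) -> float:
    """Weighted average of the per-star vote counts (0.0 if there are no votes)."""
    total_votes = sum(star_counts.values())
    if total_votes == 0:
        return 0.0
    weighted_sum = sum(stars * count for stars, count in star_counts.items())
    return weighted_sum / total_votes


def add_vote(star_counts: Dict[int, int], stars: int) -> Dict[int, int]:
    """Return a copy of the counters with one more vote in the given bucket."""
    if stars not in range(1, 6):
        raise ValueError("stars must be between 1 and 5")
    updated = dict(star_counts)
    updated[stars] = updated.get(stars, 0) + 1
    return updated


# The scenario from the question: an image with two 1-star votes,
# then one user votes 5 stars and another votes 1 star.
counts = {1: 2, 2: 0, 3: 0, 4: 0, 5: 0}
counts = add_vote(counts, 5)
counts = add_vote(counts, 1)
print(counts, average_rating(counts))
```

Because each vote only bumps one counter, the order in which the two votes arrive does not change the final counts, which is what makes this layout friendlier to concurrent voters than storing a running average.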

Last touch marketing campaign view in tableau

In campaign analysis I want to view in Tableau the last campaign that a lead has seen in a particular month and region. A lead may have seen multiple campaigns.
For example, in the sample data set below, Lead id abc has seen two campaigns, webnair and email, and the last one is webnair.
Lead id efg has also seen two campaigns, webnair and email, and webnair is the last one. Lead id fgh has seen one campaign, Tradeshow.
So when February and US are selected in the filter (month and region will be in the filter), the view will be a bar plot showing webnair with a count of 2 and tradeshow with a count of 1. This will give an idea of which campaigns mostly happen just before the lead converts to a customer.
I found some insights in "Campaign performance in Tableau" and tried replicating that here with some changes, but no luck so far.
Like earlier ones this one is also not difficult.
Create Desired_field by adding a calculated field as
IF [Month] = {FIXED [Lead Id]: MAX([Month])} THEN 1 ELSE 0 END
Create a view like the screenshot. (Note: Don't forget to add filters to context!)
Please note that in the case of UK, lead FYU has two campaigns on the same date. Both campaigns will be counted in the view when Feb and UK are selected.
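The logic of the calculated field above (flag the rows whose month equals the lead's latest month, then count campaigns on the flagged rows) can be sketched outside Tableau. The rows below are hypothetical stand-ins for the sample data, with months written as integers, and `last_touch_counts` is a name invented for this example.

```python
from collections import Counter

# (lead_id, month, campaign): hypothetical rows, already filtered to one region
touches = [
    ("abc", 1, "email"),
    ("abc", 2, "webnair"),
    ("efg", 1, "email"),
    ("efg", 2, "webnair"),
    ("fgh", 2, "Tradeshow"),
]


def last_touch_counts(rows):
    """Count campaigns seen in each lead's latest month."""
    # Equivalent of {FIXED [Lead Id]: MAX([Month])}
    latest = {}
    for lead, month, _campaign in rows:
        latest[lead] = max(latest.get(lead, month), month)
    # Keep only rows where the flag would be 1, then count by campaign
    return Counter(
        campaign for lead, month, campaign in rows if month == latest[lead]
    )


print(last_touch_counts(touches))
```

As with the Tableau version, a lead with two campaigns in its latest month contributes to both counts.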

Finding Difference from Previous Day Volume to Next Day Volume

I have been struggling to find the new incoming volume per day.
I have the categories Total Ticket, Resolved, Closed and Daily Left.
The calculation is that every day the resolved and closed tickets are removed from the queue, and
'daily left = Total Ticket - (Pending + Closed)'
Some volume is carried forward every day, so the total tickets for the next day include the Daily Left of the previous day.
I am not able to figure out how to show that number. I have tried using the previous value, but it is not helping. Please suggest. Attaching a print screen of the data.
For the 3rd, the # of records is 33; however, there is 1 carried forward from the previous day, hence the Fresh Vol should be 32. I have used the following formula to calculate it, but it is not giving the correct result:
sum([Number of Records]) - (PREVIOUS_VALUE([Daily Left Volume]))
This is taking the leftover of the current day and not of the previous day.
I am also using the LOOKUP function, but that also does not show the correct output.
The output from Tableau after using the LOOKUP function is attached below as well.
I am new to this community and don't have enough reputation to comment :P. So I am writing a few possible solutions here:
1) Make sure the data is sorted by date and is unique at the date level. If it is not, PREVIOUS_VALUE or LOOKUP might not work.
2) Another solution would be to take the running_sum of every field and then apply the operations. This should give the right answer.
3) If this does not work, would it be possible to change the way you import the data?
a) Simply create another field, Date_past = Date - 1, in your raw data.
b) Duplicate your data.
c) Join the two data sets on Date = Date_past.
d) Now you have the data for a date and for the adjacent day in one row, and you can perform operations as you need.
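The idea behind option 3 (line each day up with the previous day's leftover, then subtract) can be sketched in Python. Only the figures 33 and 1 come from the question; the dates, the other numbers, and the name `fresh_volume` are assumptions made for this illustration.

```python
from datetime import date, timedelta

# One row per day; values other than 33 and 1 are made up
daily = {
    date(2024, 1, 2): {"total": 20, "daily_left": 1},
    date(2024, 1, 3): {"total": 33, "daily_left": 4},
}


def fresh_volume(rows):
    """Total tickets of each day minus the Daily Left of the previous day."""
    result = {}
    for day, row in rows.items():
        previous = rows.get(day - timedelta(days=1))
        # No previous day in the data means nothing was carried forward
        carried = previous["daily_left"] if previous else 0
        result[day] = row["total"] - carried
    return result


print(fresh_volume(daily))
```

The lookup by `day - timedelta(days=1)` plays the role of the join on the shifted date, so it also depends on the data being unique per date, as noted in point 1.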

Tableau - Multiple data into one graph, with double dimension on the x-axis

I'm very new to Tableau, and (maybe because of that) struggling with a graph setting. I need to plot a simple line graph showing the ratio between the number of users who returned after registering x days ago and the total number of users who registered x days ago (regardless of whether they returned or not). To do this, I have two tables: TableA having (simplifying) USER_ID and DATE_REGISTRATION, and TableB having USER_ID and VISIT_DATE. Both tables are joined on USER_ID.
I'm able, of course, to plot each individually (i.e. count distinct of USER_ID with DATE_REGISTRATION on the x axis to get the number of new users registered per day), but not able to combine them. I guess the problem is that I'm using either DATE_REGISTRATION or VISIT_DATE on the x-axis, but in this case I can get one or the other info, but not the two combined.
Ultimately, I would like to be able to have, for each date, both the number of users who visited and the number of users who registered.
Thanks a lot in advance.
Raffaele
Well, the problem is that your database is not set up to generate that kind of analysis. Your tables are user_id oriented, meaning you can do lots of analyses centered on the user_id. To do date-oriented analysis you need a table like:
Date        User_id  Type of event
01/01/2014  1234     Registration
02/01/2014  1234     Visit
Then you can drag Date and Type of event to Columns, and COUNTD(User_id) to rows, to get a bar chart that will show, for each day, how many people registered that day and how many people visited that same day.
Additionally, you can still join this table with the one you have, to have the registration date for each user_id. That way you can, for instance, calculate how many days have passed since registration.
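To make the reshaping concrete, here is a small Python sketch that stacks the two user-oriented tables into one event table and then counts distinct users per date and event type. User 5678 and the helper names are hypothetical; dates are kept as the opaque strings used in the table above.

```python
from collections import defaultdict

# (USER_ID, DATE_REGISTRATION) and (USER_ID, VISIT_DATE); 5678 is made up
registrations = [(1234, "01/01/2014"), (5678, "01/01/2014")]
visits = [(1234, "02/01/2014"), (5678, "02/01/2014"), (1234, "02/01/2014")]


def to_events(registration_rows, visit_rows):
    """Stack both tables into (date, user_id, type_of_event) rows."""
    events = [(day, user, "Registration") for user, day in registration_rows]
    events += [(day, user, "Visit") for user, day in visit_rows]
    return events


def distinct_users(events):
    """Equivalent of COUNTD(User_id) by Date and Type of event."""
    users = defaultdict(set)
    for day, user, kind in events:
        users[(day, kind)].add(user)
    return {key: len(found) for key, found in users.items()}


print(distinct_users(to_events(registrations, visits)))
```

The repeated visit by user 1234 is counted once, which is the behaviour a distinct count gives in the view.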

Is this an approach to user-item recommendations that could work

I am designing an application that incorporates a recommendation system based on user interactions (collaborative filtering). On the homepage, the user is presented with a set of 6 items to interact with. There will be between 50 and 300 items. The following actions are possible:
click on an item (strong interest)
refresh an item (some interest)
open a read-more dialog (some interest)
don't do anything and move on (no interest)
This data is collected and stored. The system should recommend items of interest to the user. I am thinking about turning this data into a rating system.
Option A) If the user clicks on an item, this is translated into an implicit lifetime rating of 5. Refreshing an item is a 4, and so on. So my user->item matrix would look like this:
item 1 | item 2 | item 3
john 5 4
jane 4
In this example john has clicked on item 1 and refreshed item 3. The rating can only go up really, i.e. if a user has previously refreshed an item I write a 4 and update only to a 5 if the item is clicked later.
Option B) each time the user does one of the above actions, I'll increment a scalar value for the item, which means it can grow unbounded.
item 1 | item 2 | item 3
john 55 1 30
jane 41 9
Maybe this is a problem, since now the numbers are harder to translate into a rating scale from 1 to 10
Option C) I count every interaction separately
item 1 click | item 1 refresh | item 1 read
john 3 1
jane 1 1
Here the problem is that "reading about" an item is probably only done once.
Independent of whichever option I choose, my idea is to first find similar users using something like cosine similarity or Pearson correlation. Then pick the top 10 to 30 users from that list and compile a toplist of their favorite items. From that list, I will then recommend items that the current user has had little interaction with in the past.
Is this something that could work? I am worried that finding similar users will eliminate the chance of finding interesting (new) items for the current user.
What you suggest sounds reasonable. Your concern about not finding new items reflects the collaborative filtering method itself, which is metadata-based. To find new items you would have to do some content analysis, which would be a separate stage. For example, if your items are news articles you might try to identify important keywords for each user.
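The neighbour-based scheme described in the question (find similar users, then recommend their items that the current user has not interacted with) could look roughly like the Python sketch below. The ratings are hypothetical Option A style values, and `cosine` and `recommend` are names chosen for this example, not part of any library.

```python
import math

# user -> {item: implicit rating}; all values are made up for illustration
ratings = {
    "john": {"item 1": 5, "item 3": 4},
    "jane": {"item 1": 5, "item 2": 4},
    "joe": {"item 2": 3, "item 3": 5},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, all_ratings, k=2):
    """Rank items the user has not touched, weighted by the k most similar users."""
    mine = all_ratings[user]
    neighbours = sorted(
        ((cosine(mine, other), name)
         for name, other in all_ratings.items() if name != user),
        reverse=True,
    )[:k]
    scores = {}
    for similarity, name in neighbours:
        for item, rating in all_ratings[name].items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("john", ratings))
```

This sketch only suggests items the user has never interacted with; the question's "little interaction" variant would need a threshold instead of the `item not in mine` check.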

SQLite vs Memory

I have a situation with my app.
Suppose I have 6 users, and each user can have up to 9 score entries (e.g. a score of 1000 points at 8:00pm with 3 gold collected, 4 silver, etc.), say one score per stage and 9 stages.
All these scores are taken from an API call, so they can update at intervals of 3+ minutes.
The operations I need to do on this data are:
find the nearest min and max records from stage 4,
and some more operations like adding or subtracting two scores, etc.
All these 6 users and their score records are already in the database, and are updated if needed after the API call.
Now my question is:
Is it better for this kind of data (score data here) to keep all the data for all 6 users in memory in an NSArray or NSDictionary, and find the min and max in that array with a min-max algorithm?
OR
Should it be taken from the database by queries like " WHERE score<=200 " and " WHERE score >=200", in short, 2 database queries that each return the nearest min or max record, without keeping all the data in memory?
We are focusing on both speed and memory usage. The point is: would a DB call be fast and efficient for finding the min and max, or would a search for the min and max in an array of all the records from the DB be better?
All records can be 6 users * 9 scores each = 54.
Update time for records can be 3+ minutes.
The frequency of finding the min and max for certain values is high.
Please ask, if any more details are required.
Thanks in advance.
You're working with such a small amount of data that I wouldn't imagine it would be worth worrying about. Do whichever method makes your development process easiest!
Edit:
If I had a lot of data (hundreds of competitors) I'd use SQLite. You can do queries like the following:
SELECT MIN(`score`) FROM `T_SCORE` WHERE `stage` = '4';
That way you can let the database handle doing the calculation for you, so you never have to fetch all the results.
My SQL-fu isn't the most awesome, but I think you can also do this:
SELECT `stage`, MIN(`score`) AS min, MAX(`score`) AS max FROM `T_SCORE` GROUP BY `stage`
That would do all the calculations in one single query.
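For a quick way to try these queries, here is a sketch using Python's built-in sqlite3 module with an in-memory database. The `T_SCORE`, `stage` and `score` names come from the queries above; the `user_id` column, the sample rows and the `nearest` helper are assumptions for this example, and the nearest-below/nearest-above pair mirrors the two WHERE queries from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T_SCORE (user_id INTEGER, stage INTEGER, score INTEGER)")
# Hypothetical sample rows: (user_id, stage, score)
rows = [(1, 4, 150), (2, 4, 190), (3, 4, 230), (4, 4, 400), (1, 5, 90), (2, 5, 310)]
conn.executemany("INSERT INTO T_SCORE VALUES (?, ?, ?)", rows)

# Min and max per stage in a single query
per_stage = conn.execute(
    "SELECT stage, MIN(score), MAX(score) FROM T_SCORE GROUP BY stage ORDER BY stage"
).fetchall()


def nearest(connection, stage, target):
    """Closest score at or below the target, and closest at or above it."""
    below = connection.execute(
        "SELECT MAX(score) FROM T_SCORE WHERE stage = ? AND score <= ?",
        (stage, target),
    ).fetchone()[0]
    above = connection.execute(
        "SELECT MIN(score) FROM T_SCORE WHERE stage = ? AND score >= ?",
        (stage, target),
    ).fetchone()[0]
    return below, above


print(per_stage)
print(nearest(conn, 4, 200))
```

Either value comes back as None when no score lies on that side of the target.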