I am trying to implement infinite scrolling for documents stored in a MongoDB collection. Every document is a restaurant that has a numeric field rating, so I am using the rating field for sorting and showing restaurants with the highest rating first.
The problem is that the collection of the restaurants is not static. The ratings of the restaurants change in real time, therefore the order of the restaurants in the collection changes constantly. As a result, although I formally have the sorting key, it does not make much sense.
I can think of two solutions to the problem:
Accept that the order of the restaurants may change slightly while someone is doing the infinite scrolling. Make the front end responsible for getting rid of possible duplicates. Accept that some of the restaurants may not appear during a scrolling at all. But that looks more like working around the problem instead of solving it.
Only perform infinite scrolling against a static copy of the collection of restaurants. Update the static copy periodically (e.g., once a day) with the rating updates. But this approach seems overengineered. Also, what happens with the infinite scrolling at the moment when the static copy of the restaurants gets updated with the new ratings? Such scrolling will be broken as well because the problem with the changing order is still here, the order just does not change that frequently.
I am sure I am far from the first to face this problem. After all, there are a lot of examples of infinite scrolling implementations out there, like the Facebook or Instagram feeds. At the same time, all the articles I have read so far seem superficial and cover only infinite scrolling through static collections.
What is the right approach to infinite scrolling for a collection that may change its order at any time?
Thank you.
Infinite scrolling, as commonly implemented, isn't a precision navigation method to begin with. Who are your users?
Power users are likely to hate it (I do on GitHub, Facebook, etc.) and hence won't use it much.
Non-power users won't be able to tell that data is missing. If they happen to be looking for a particular restaurant and it vanishes, telling them to reload the page will be a sufficient explanation for most.
Users who scrape your data will do it without delays between requests to get all of your data.
When you show the same restaurant twice, people will notice, so check for those cases in the frontend.
You may also consider having a high-precision rating field for sorting. For example, if normally your UI shows integer rating, keep the floating-point rating used during the calculation and sort by that. This will produce a more stable sort.
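As a rough illustration of the frontend duplicate check, here is a minimal sketch in Python (the same logic carries over to whatever language the frontend uses). The field names id and rating and the helper append_page are assumptions made for the example, not part of any real API.

```python
from typing import Dict, Iterable, List, Set


def append_page(shown: List[Dict], seen_ids: Set[str], page: Iterable[Dict]) -> List[Dict]:
    """Append a newly fetched page, skipping restaurants already on screen."""
    for restaurant in page:
        if restaurant["id"] in seen_ids:
            continue  # already displayed; the order shifted between requests
        seen_ids.add(restaurant["id"])
        shown.append(restaurant)
    return shown


# Two pages fetched a moment apart. Between the requests another restaurant
# was promoted above "b", so "b" slid onto page 2 and is returned again.
page_1 = [{"id": "a", "rating": 4.9}, {"id": "b", "rating": 4.8}]
page_2 = [{"id": "b", "rating": 4.8}, {"id": "d", "rating": 4.5}]

shown: List[Dict] = []
seen_ids: Set[str] = set()
append_page(shown, seen_ids, page_1)
append_page(shown, seen_ids, page_2)
shown_ids = [r["id"] for r in shown]
```

This removes duplicates only. The restaurant that was promoted onto the already loaded first page is still never shown, which is the tradeoff accepted above.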
Context: We are implementing a news app. For now, you can assume that the news is the same for all users and that it keeps an order based on the parameters we set (trends and date).
Problem: We are not sure what the best implementation for keeping track of what users read is. We want to be able to configure a way in which we can track what users read and what they didn’t.
Assumption: You can assume that the posts in the database are in a descending order, based on time.
So, the ideal scenario is this: when posts A, B, C, D, E are fetched from the server in the app and the user reads A and B, the user sees only C, D, E when they check for the next posts. If they go to previous, they see the posts in the order B -> A.
Furthermore, when P,Q is added to the database, now, the user must see next posts in the order of P->Q->C->D->E and so on.
Example: Let us assume there are 20 news posts in our app right now, and Gavin picks up his phone and starts reading in our app. In the midst of his usage, he gets occupied with some other work, so he quits the app after reading 5 news posts.
The challenge for us now is to figure the best way to make sure Gavin doesn’t have to re-read the 5 posts he already did.
One way we thought we could solve this problem is through use of index. We can assume uniform ordering for our posts as mentioned in the context, so we could use an index to track where Gavin was last in the order of news and show him news based on that index.
However, one problem with that approach is, we could easily have 5 new posts when Gavin picks up his phone and uses our app again. So, if we have the news based on date, technically that indexing approach means that we omit 5 unread new posts instead of the 5 read old ones.
We've also thought of maintaining three lists: Read, Unread and New so that we fetch only posts that are not in our lists. For example, in my initial example: A-B-C-D-E is in unread initially. Then, after user reads A-B, read becomes A-B. Meanwhile, when P-Q is added in the database, P-Q is added to the list of unread posts as P-Q-C-D-E.
How do you solve this problem? Any suggestions are welcome as we kind of think we're not thinking out of box when it comes to a solution for the problem. Thank you! :)
When I first read the problem, the solution that came to mind was also to keep two lists, read and unread: new posts are added to the end of the unread list, and the unread list is shown in reverse order so the most recent ones are at the top. Is it the most efficient way, though? That is debatable. For example, if the number of new posts grows a lot, it will be memory-inefficient. But I assume small numbers in general.
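A minimal sketch of that idea, using the A-E and P-Q example from the question. It assumes the feed is a list of post IDs kept newest first and that the read history is stored per user in the order the posts were read. The names next_posts and read_history are made up for the illustration.

```python
from typing import List, Set


def next_posts(feed_newest_first: List[str], read_ids: Set[str], limit: int) -> List[str]:
    """Return up to `limit` unread posts, keeping the feed's order."""
    unread = [post_id for post_id in feed_newest_first if post_id not in read_ids]
    return unread[:limit]


feed = ["A", "B", "C", "D", "E"]   # order as delivered by the server
read_history: List[str] = []       # post IDs in the order the user read them

read_history.append("A")
read_history.append("B")

# P and Q are published while the user is away; they go to the front of the feed.
feed = ["P", "Q"] + feed

upcoming = next_posts(feed, set(read_history), limit=10)
previous = list(reversed(read_history))  # most recently read first
```

Here upcoming is P, Q, C, D, E and previous is B, A. Only the read IDs need to be stored per user, so memory grows with what a user has read rather than with the number of new posts.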
I have a requirement where we need to show around 24k records with 84 columns in one go, as the user wants filtering on the entire set of data.
So can we have a virtual scrolling mechanism with ag-grid without lazy loading? If so, could you please help here? Any examples are most welcome for reference.
Having tried this sort of thing with a similar number of rows and columns, I've found that it's just about impossible to get reasonable performance, especially if you are using things like "framework" renderers. And if you enable grouping, you're going to have a bad time.
What my team has done to enable filtering and sorting across an entire large dataset includes:
We used the client-side row model - the grid's simplest mode
We only load a "page" of data at a time. This involves trial and error with a reasonable sample of data and the actual features that you are using to arrive at the maximum page size that still allows the grid to perform well with respect to scrolling / rendering.
We implemented our own paging. This includes display of a paging control, and fetching the next/previous page from the server. This obviously requires server-side support. From an ag-grid point of view, it is only ever managing one page of data. Each page gets completely replaced with the next page via round-trip to the server.
We implemented sorting and filtering on the server side. When the user sorts or filters, we catch the event, and send the sort/filter parameters to the server, and get back a new page. When this happens, we revert to page 0 (or page 1 in user parlance).
This fits in nicely with support for non-grid filters that we have elsewhere in the page (in our case, a toolbar above the grid).
We only enable grouping when there is a single page of data, and encourage our users to filter their data to get down to one page of data so that they can group it. Depending on the data, page size might be as high as 1,000 rows. Again, you have to arrive at page size on a case-by-case basis.
So, in short, when we have the need to support filtering/sorting over a large dataset, we do all of the performance-intensive bits on the server side.
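To make the server-side part concrete, here is a minimal sketch of a page-fetching function that filters, then sorts, then slices. It works on an in-memory list purely for illustration; a real backend would push the same three steps into the database query. The function name fetch_page, its parameters, and the response shape are assumptions, not ag-grid or server APIs.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def fetch_page(
    rows: List[Row],
    page: int,
    page_size: int,
    sort_key: Optional[str] = None,
    descending: bool = False,
    predicate: Optional[Callable[[Row], bool]] = None,
) -> Dict[str, Any]:
    """Filter and sort the whole dataset, then return a single page of it."""
    matching = [r for r in rows if predicate is None or predicate(r)]
    if sort_key is not None:
        matching.sort(key=lambda r: r[sort_key], reverse=descending)
    start = page * page_size
    return {
        "rows": matching[start:start + page_size],  # the only rows the grid sees
        "total": len(matching),                     # lets the paging control show a page count
        "page": page,
    }


data = [{"id": i, "city": "Oslo" if i % 2 else "Rome"} for i in range(10)]

# The user filters on city and sorts by id descending: start again from page 0.
first = fetch_page(data, page=0, page_size=2, sort_key="id", descending=True,
                   predicate=lambda r: r["city"] == "Oslo")
last = fetch_page(data, page=2, page_size=2, sort_key="id", descending=True,
                  predicate=lambda r: r["city"] == "Oslo")
```

The grid then replaces its row data with the rows of the returned page and never holds more than page_size rows at a time.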
I'm sure that others will argue that ag-grid has a lot of advanced features that I'm suggesting that you not use. And they would be correct, for small-to-medium sized datasets, but when it comes to handling large datasets, I've found that ag-grid just can't handle it with reasonable performance.
I've gone through several tutorials on Flutter and I find that they cover basics just fine but there are some nagging aspects of good design and good architecture that are consistently missing. I'm writing my first actual (not toy) application in Flutter and find myself running into these missing points.
Global data. Once a person installs the application and tries to use it, I ask them to log in / create an account, since this is an application specifically for managing groups of people. I'm using Firebase on the back end, and the package to do authentication winds up returning Future<FirebaseUser> from everything. So, yes, when it comes to building a Widget that uses the user's data, I can use a FutureBuilder. That said, it seems weird to have to keep typing boilerplate FutureBuilder code to dereference the user every place I want to use the user's ID to look up their data (what groups are they part of, what actions do they have pending, etc.). I really feel like there ought to be a way to invoke the future, get the actual user object, and then store it somewhere so that anything that wants a user ID for a query can just go get it. What's the right solution? I can't believe I'm the only person who has this problem.
Updatable data. I've got a page where I list the groups the current user is a member of. The user, though, can create a new group, join an existing group, or leave a group. When they do that, I need to redraw the page. The list of groups comes from running a Firebase query, so performing an action (join, leave, etc.) should signal the app to redraw the page, which will have the side effect of re-running the query. Conceivably, one might make the page dependent (how?) on the query results and have it redraw whenever they update, and instead have some widget somewhere that keeps track of the query. There's another answer here that hints that this might be the right way to go, but that's really concerned with relatively invariant data (locale doesn't change all that often for a single user). So, again, I can't believe I'm the only one who does this sort of thing. What's the best practice in this case?
So basically I want to implement the same functionality as StackOverflow's:
viewed 59344 times
So here is some background information:
I want to count only unique visits. The assumption is that registered users will read the article many times (it is evolving).
I use MongoDB as a store
I would like it to be close to real-time
My system will have a registration, but I want to count the views of anonymous users as well
I understand that the best way to count unique visits is through registration, but the thing is that a big chunk of users will be just passive readers who do not need to create an account to read the information from the application. As far as I understand, the most convenient way is to save the IP address of every user, who reads the post. I also understand that IP addresses will not provide uniqueness (some different users will have the same IP, because they are behind the same ISP and one user can have different IPs, by using proxies, tor, etc)
Using Mongo is not absolutely essential; it is just that everything is written in Mongo right now, so I will switch only if the alternative is much faster or more convenient.
Background
Are you certain you need to track "unique" views?
I actually wouldn't expect popular sites to try to keep the view counts unique - bigger is better and re-visits for new comments are still additional "views" in the sense of showing new content/comments/ads. There are other possible subtleties to "correctness" that may or may not be important for your use case, such as excluding crawlers or your own company's users/IPs.
Instead of spending time tracking unique views (which isn't overly meaningful), I would look at counting unique user interactions such as voting/liking/commenting on the page. You can then determine "popularity" of a page with some formula based on those metrics. There is an interesting example of this approach in the Radioactivity module for Drupal, where a "hotness" metric is calculated from the recency of user interactions.
Approaches to consider
1) For a simple view counter in MongoDB, I would just use $inc to bump up the view count when the page is loaded. You can exclude logging users by role as needed (for example admin users).
2) For a more accurate view counter I would pass off the problem to a web analytics platform (which you should be using with your site for more detailed analysis anyway). For example, you can use Google Analytics API or an open source application like Piwik. Web analytics systems already have solutions in place for determining unique users/views, and the API calls for these can be asynchronous via JavaScript.
3) If implementing your own unique view tracking is a definite requirement, I would use a separate collection for tracking views and upsert based on your uniqueness criteria (a unique view per user/article pair for registered users, or per session_id/article pair for anonymous users). I would combine this with approach #1 by incrementing the article's view counter only if the upsert results in an insert.
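The logic of approach 3 can be sketched without a database. In the sketch below a Python set stands in for the tracking collection (in MongoDB, a collection with a unique index on the viewer/article pair, written with an upsert) and a dict stands in for the per-article counters bumped with $inc. The class ViewTracker and the key formats are assumptions made for the illustration.

```python
from typing import Dict, Set, Tuple


class ViewTracker:
    """In-memory stand-in for "upsert, then increment only on insert"."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()  # plays the role of the tracking collection
        self.view_counts: Dict[str, int] = {}     # plays the role of the article counters

    def record_view(self, viewer_key: str, article_id: str) -> bool:
        """Return True if this is the viewer's first view of the article."""
        pair = (viewer_key, article_id)
        if pair in self._seen:
            return False  # the "upsert" matched an existing entry: nothing to count
        self._seen.add(pair)  # the "upsert" inserted a new entry
        # Equivalent of the $inc step from approach 1.
        self.view_counts[article_id] = self.view_counts.get(article_id, 0) + 1
        return True


tracker = ViewTracker()
first_view = tracker.record_view("user:42", "article-1")       # registered user
repeat_view = tracker.record_view("user:42", "article-1")      # same user again
anon_view = tracker.record_view("session:abc", "article-1")    # anonymous session
```

After these three calls the counter for article-1 is 2, because the repeat view is ignored. With a real database, whether the upsert inserted a document is reported by the driver, so the check and the write happen in one round trip.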
One way to solve the problem is to use cookies: once a user has visited the page, set a cookie saying that they have already visited it, so you do not need to count them again. You can keep appending keys to the cookie to record which pages they have visited. I know cookies can be deleted, but any solution will involve a tradeoff.
From the MongoDB perspective, if you want very fast inserts and reads, I would suggest a couple of things.
1) When you create an article, create a document like this in, say, a log collection:
{"_id" : "Article URL", "Hit" : 0}
I am not suggesting adding IP addresses or any other information because, as you add IP addresses, the size of the document grows and MongoDB may need to find newly allocated space for it, which is bad for performance. If you only increment the counter, the document's size does not increase, so it does not need to be moved. In addition, there is a limit on the maximum size of a document.
2) Creating the document in advance lets you issue a plain update statement, without having to check whether a document for the article ID exists.
I have an iOS app that presents content in a tableView. I've added a 'like/dislike' feature that interacts with my database (I use Parse.com). Every time someone likes/dislikes a piece of content, the specifics are sent to the Parse database. For each piece of content, I'd like to calculate and display the percentage of 'likes' over 'likes' + 'dislikes'. This is pretty simple math, but I can't wrap my head around the best way of designing my database table and the most efficient way to calculate the 'liked' percentage for each piece of content before the tableView physically appears.
As it is, I already have a loop in my tableView's viewDidLoad which compares the content from another database table to the 'like/dislike' table to restore the 'like/dislike' button state of the user (if they already liked/disliked a piece of content).
At first, I thought of creating an array in the initial viewDidLoad loop. However, using the whereKey: equalTo: type of query for each piece of content simply to find the number of likes/dislikes takes forever. As predicted, it is very slow in cellForRowAtIndexPath as well.
Worst case, I can make these calculations server-side and just pull the 'liked' percentage. However, I'd like to implement this in the app somehow. I'm a complete beginner, so I may be going about this all wrong.
Here is the basis of my database table:
Edit: I've managed to build a server-side program that calculates the percentage of users that 'like' pieces of content. My app pulls this percentage from the database at runtime. To make the percentage change more responsive when the user 'likes' something, I locally calculate an updated percentage. The problem here is when the user exits the app and reopens, the data reloads. If the server-side program had not run recently, the app will display an old 'liked' percentage (the most up to date % would not be calculated yet). The two solutions I see to fix this are:
Run the server-side program every 1-3 min
Post more data to the database when someone likes content (this would involve additional database queries for every single 'like').
I think both of these options are way too expensive for what I'm trying to accomplish.
I'd suggest leaving the calculations to the server side, and responding with the information to utilize in the app. This will save you from processing and parsing the incoming results.
You have greater processing power on a server than on a device.
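For reference, the calculation itself is small once the like and dislike totals for a piece of content are known. The sketch below assumes the server keeps those two totals per item (an assumption, since the table layout is not shown in the question). The function name liked_percentage is made up for the example.

```python
from typing import Optional


def liked_percentage(likes: int, dislikes: int) -> Optional[float]:
    """Percentage of likes over likes + dislikes, or None if nobody has voted."""
    total = likes + dislikes
    if total == 0:
        return None  # avoid dividing by zero for content with no votes yet
    return 100.0 * likes / total


stored = liked_percentage(30, 10)       # value computed on the server
no_votes = liked_percentage(0, 0)       # content nobody has rated
after_like = liked_percentage(31, 10)   # same item after one more like
```

If the totals are updated at the moment a like or dislike is saved, the percentage is always current and no periodic recalculation job is needed.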