Load a mutable list dynamically (e.g. for RecyclerView) from a backend. Probably used in Facebook, GMail, 9GAG, Instagram and co - REST

I have a question about loading an extremely long, mutable list of posts.
First of all, for people who don't know Android's RecyclerView: you load a batch of posts from a server, say 25 posts, and show them in a scrollable list. When you scroll further and want to access the item at position 25, you load the next batch from the server. The RecyclerView is my concrete problem, but the same problem would occur on websites and so on.
So this is the problem: let's say the posts are ordered by likes and we have posts in the order {a, b, c, d, e, f, g, h} with decreasing likes. Our batch size is 3, so we always load 3 posts. Our screen can show 2 posts at a time. We begin by loading {a, b, c}. We wait 10 minutes before we scroll down and load pos3, pos4 and pos5. In this time the likes change and the new order is {a, b, d, c, f, e, g, h}. Our next batch will be {c, f, e}, since our backend request will be something like backend.com/get3items/?batch=1. Since our screen can show 2 posts, it could show pos2 and pos3, which is {c, c}.
The same could happen if we remove or add an item: batchsize=1, list={a, b, c}. We load a. A new item g is added to the beginning of the list. We load the next batch and get a again.
Other facts: the number of items on the screen can be higher or lower than the batch size. Changes to the list can happen at any position; maybe we are loading batch #500 but there was a change at pos0. The code behind backend.com/get3items/?batch=1 could be something like "select * from items order by likes desc limit 3, 3".
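The duplicate from the example above can be reproduced with a small in-memory simulation. This is only a sketch: `fetch_batch` is a hypothetical stand-in for the backend call, and it slices a Python list the same way the `limit 3, 3` query would slice the ordered result.

```python
def fetch_batch(ordered_posts, batch, batch_size=3):
    """Hypothetical stand-in for backend.com/get3items/?batch=N (offset paging)."""
    start = batch * batch_size
    return ordered_posts[start:start + batch_size]


# Order by likes at the moment the first batch is requested.
before = ["a", "b", "c", "d", "e", "f", "g", "h"]
# Order ten minutes later, after the likes have changed.
after = ["a", "b", "d", "c", "f", "e", "g", "h"]

shown = fetch_batch(before, 0) + fetch_batch(after, 1)
print(shown)       # ['a', 'b', 'c', 'c', 'f', 'e']
print(shown[2:4])  # ['c', 'c'] -> pos2 and pos3 show the same post
```

Note that besides the duplicated `c`, the post `d` is never shown at all, because it moved into a batch that had already been loaded.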
For the answer I thought of these approaches:
1) The server notifies the client that a change happened, via the observer pattern. Not useful, because if there are something like 1 million subscribers, the server dies on every change (and it is not REST-friendly, since the server has to track subscribers).
2) The server can calculate what the list looked like at a previous time. Is there a backend technique for this that I have never heard of?
3) Every time a new batch is loaded, the other batches in the buffer and the list are invalidated. The RecyclerView adapter then calls notifyDataSetChanged and all shown items are reloaded (so only the batches that are needed are loaded again). When loading the batches here, we have to make sure the other batches are not invalidated again, since we would otherwise have an endless loop of refreshes. On the other hand, what happens when there is a change between reloading the batches here? The same problem again, so unfortunately this is not the solution.
If I look at the posts on 9GAG: you can scroll, but you get the items as of the time you loaded the site (and in the old order). You will only see new posts if you refresh the page. So is there anything like what is described in 2)?
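One possible reading of approach 2) is that the server freezes the order once per client and serves every later batch from that frozen order. The sketch below only illustrates that idea in memory; `SnapshotFeed`, its methods and the snapshot token are hypothetical names, and nothing here is a claim about how 9GAG or any other site actually does it.

```python
import uuid


class SnapshotFeed:
    """Hypothetical in-memory backend that freezes the order per client."""

    def __init__(self, likes_by_post):
        self.likes_by_post = likes_by_post  # live data, may change at any time
        self.snapshots = {}                 # snapshot token -> frozen list of post ids

    def open_snapshot(self):
        # Freeze the current order once; later like changes do not affect it.
        ordered = sorted(self.likes_by_post, key=self.likes_by_post.get, reverse=True)
        token = uuid.uuid4().hex
        self.snapshots[token] = ordered
        return token

    def fetch_batch(self, token, batch, batch_size=3):
        ordered = self.snapshots[token]
        start = batch * batch_size
        return ordered[start:start + batch_size]


feed = SnapshotFeed({"a": 80, "b": 70, "c": 60, "d": 50, "e": 40, "f": 30})
token = feed.open_snapshot()
first = feed.fetch_batch(token, 0)
feed.likes_by_post["d"] = 65  # d overtakes c in the live data
feed.likes_by_post["f"] = 45  # f overtakes e in the live data
second = feed.fetch_batch(token, 1)
print(first + second)  # ['a', 'b', 'c', 'd', 'e', 'f']
```

The trade-off in this sketch is that the server keeps one frozen id list per open snapshot, so a real service would need some expiry policy, and the client only sees new or reordered posts after opening a fresh snapshot.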

Related

Pattern for updating a recursive linear tree?

Let's say you have a factory, where you put together different things. There are requests for those things, and you want to monitor what's needed to construct the thing.
For example for a simplified car you can have a request for:
car 2 (1)
  chassis 2 (1)
  wheels 8 (4)
    tyres 8 (1)
    rims 8 (1)
  motor 2 (1)
The numbers next to the parts indicate the amounts needed in real time, the numbers in parentheses indicate the amounts needed to construct one parent, and the indentation shows the tree structure. The children of a specific part show how much of each is needed to construct the parent.
At any time a wheel could become available in the inventory, which would update the amount of wheels needed to 7, and that in turn would update the amounts of tyres and rims needed to 7.
Similarly, a whole car could become available, reducing chassis to 1, motor to 1, and wheels from 7 to 3.
It may seem like a simple problem, but I've spent months trying to figure out a safe way to do this.
The inventories are tracked, and each inventory has different properties, such as when it was created, which item it is, and how much is available. Inventories can also be dedicated to a specific request.
When a new "shipment" comes in, it contains new inventories. When new inventories come in, a check runs to see whether any request needs that inventory.
Once an inventory is dedicated to a request, the request's amount needed is updated, and the amounts needed by all of its children are updated as well.
When an inventory is dedicated to a request, a new inventory is created with the dedicated amount and the same properties, except that it is dedicated to that request. The original inventory's amount is decreased by the amount used by the request.
There are a lot of possible problems with this.
Let's start with the main problem. Multiple inventories can come in in parallel, trying to dedicate themselves to the same request. A recursive function runs which needs to update all the children in the subtree of the request. The parent request is read, given the amount it has got from inventory, and the children are updated.
To understand:
1. one shipment of `car` comes in
2. check whether any requests need `car`
3. assign general inventory of `1 car` as dedicated inventory to the request
4. the `car` request's amount needed is reduced by `1`
5. the `car` request reads its children, and for each child:
   5.1. read available inventory for the child request
   5.2. update the child request's amount needed with `parentRequest.amountNeeded * childRequest.amountNeededPerParent - childRequestAvailableInventory`
   5.3. run step 5 for the children of the child recursively
So every request has a field that shows how much inventory is needed to construct the parent request. The formula for it is parentRequest.amountNeeded * request.amountNeededPerParent - requestAvailableInventory.
At any given point any request can get inventory, and if that happens, the request's tree must be updated by cascading down and updating each descendant's amount needed.
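The formula and the cascade can be checked against the car example with a single-threaded, in-memory sketch. The `Request` class and `cascade` function below are hypothetical illustrations, not the MongoDB models from the question, and the sketch deliberately ignores the concurrency issues described next.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Request:
    name: str
    amount_needed_per_parent: int
    available: int = 0        # inventory already available for this request
    amount_needed: int = 0
    children: List["Request"] = field(default_factory=list)


def cascade(parent: Request) -> None:
    """Recompute amount_needed for every descendant of `parent`."""
    for child in parent.children:
        child.amount_needed = (
            parent.amount_needed * child.amount_needed_per_parent - child.available
        )
        cascade(child)


tyres = Request("tyres", 1)
rims = Request("rims", 1)
wheels = Request("wheels", 4, children=[tyres, rims])
chassis = Request("chassis", 1)
motor = Request("motor", 1)
car = Request("car", 1, amount_needed=2, children=[chassis, wheels, motor])

cascade(car)
print(wheels.amount_needed, tyres.amount_needed)  # 8 8

wheels.available = 1   # one wheel becomes available
cascade(car)
print(wheels.amount_needed, tyres.amount_needed)  # 7 7

car.amount_needed = 1  # one whole car becomes available
cascade(car)
print(chassis.amount_needed, motor.amount_needed, wheels.amount_needed)  # 1 1 3
```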
First issue:
Between reading the children and reading a child's available inventory, the child request may get updated.
Second issue:
Between reading the child's available inventory and updating the child's amount needed, the child request and its available inventory can change.
Third issue:
I'm using MongoDB, and I cannot update a request's amount needed and create the dedicated inventory at exactly the same time. So it is not guaranteed that the request's amount needed will be in sync with the request's dedicated inventory amount.
Draft function:
// Recomputes amountNeeded for every request below parentRequest.
// The reads and writes below are separate operations, not one atomic step.
const updateChildRequestsAmountNeeded = async (
  parentRequest: Request & Document,
) => {
  // Load the direct children of this request.
  const childRequests = await RequestModel.find({
    parentRequestId: parentRequest._id,
  }).exec();
  return Promise.all(
    childRequests.map(async (childRequest) => {
      const availableInventory = await getAvailableInventory({
        requestId: childRequest._id,
      });
      const amountNeeded =
        parentRequest.amountNeeded * childRequest.amountNeededPerParent -
        availableInventory;
      childRequest.set({ amountNeeded });
      await childRequest.save();
      // Cascade into the subtree of this child.
      await updateChildRequestsAmountNeeded(childRequest);
    }),
  );
};
See examples of when it can go wrong.
Initial state for each case:
  A amountNeeded: 5
  B amountNeeded: 5 (amountNeededPerParent: 1)
  A available: 0
  B available: 0
1. Parent amount needed decreases (A1 and A2 are the same request; the number indicates the ID of the parallel process):
  1. A1 gets inventory (1)
  1. A2 gets inventory (2)
  2. A1 amount needed updated (4)
  2. A2 amount needed updated (2)
  3. A2's children read (B2) (needed 5)
  3. A1's children read (B1) (needed 5)
  6. B2 amount needed updated (to 2)
  6. B1 amount needed updated (to 4) (should be 2)
2. Request gets inventory while updating:
  1. A gets inventory (1)
  2. A amount needed updated (4)
  3. A's children read (B)
  4. B available inventory read (0)
  5. B gets inventory (1)
  6. B amount needed updated (4)
  7. B amount needed updated (4) (should be 3)
I've tried to find a way to solve the issue so that amount needed is never overwritten with outdated data, but couldn't find one. Maybe MongoDB is the wrong approach, or the whole data structure is, or there is a pattern for updating recursive data atomically.

Moodle-progress bar

In Moodle, I can see the default course progress for courses on the front end. But I want to show progress such as 10% completed when chapter 1 is completed, 20% completed when chapter 2 is completed, and so on. I could not find any module for this, nor figure out how to modify the code.
In other words:
1. How can I track the progress of course completion based on the completion of course subsections? The default tracking is based on whole courses only.
2. Is it possible to track the courses without ticking the course completion checkbox (refer to https://i.stack.imgur.com/GUqwT.png)?
3. Is it possible to track the course progress based on which course section URLs are viewed?
Thanks in advance.
You can sometimes track specific page views and interactions via the mdl_logstore_standard_log table. Different modules/activities in Moodle log different types/amounts of data, but views of typical course topics/sections are usually logged regardless of completion.
For example, imagine a course with id=10 where you visit section/topic 3. The URL usually looks something like this: <yourdomain>/course/view.php?id=10&section=3
In this case, the view should be logged in mdl_logstore_standard_log with an eventname value of \core\event\course_viewed. The course id should be in the courseid column and the section viewed should be in the "other" column, although that data is an array stored with PHP serialization, so it's helpful to use unserialize and array parsing functions to get the "3" quickly if needed.
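As a rough illustration of turning section views into a percentage, the sketch below tallies distinct section numbers from URLs shaped like the one above. It does not touch the Moodle database or any Moodle API: `section_progress` is a hypothetical helper, `moodle.example` is a placeholder domain, and treating "viewed" as "completed" is an assumption you may not want in practice.

```python
from urllib.parse import urlparse, parse_qs


def section_progress(visited_urls, course_id, total_sections):
    """Return the percentage of sections of `course_id` seen in `visited_urls`."""
    seen = set()
    for url in visited_urls:
        query = parse_qs(urlparse(url).query)
        # Only count views of the requested course that name a section.
        if query.get("id") == [str(course_id)] and "section" in query:
            seen.add(int(query["section"][0]))
    return 100 * len(seen) // total_sections


urls = [
    "https://moodle.example/course/view.php?id=10&section=1",
    "https://moodle.example/course/view.php?id=10&section=3",
    "https://moodle.example/course/view.php?id=10&section=3",  # repeated view
    "https://moodle.example/course/view.php?id=11&section=2",  # other course
]
print(section_progress(urls, course_id=10, total_sections=10))  # 20
```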
Again, keep in mind different activities/modules log data differently - for example, an assignment activity is logged differently - but hope this helps you find what you need. Good luck!

How do pages work if the DB is manipulated between next

The code I have below works as intended, but is there a better way to do it?
I am consuming a DB like a queue and processing it in batches of a maximum size. I'm thinking about how I can refactor it to use page.hasNext() and page.nextPageable().
However, I can't find any good tutorial or documentation on what happens if the DB is manipulated between getting a page and getting the next page.
List<Customer> toBeProcessedList = customerToBeProcessedRepo
        .findFirstXAsCustomer(new PageRequest(0, MAX_NR_TO_PROCESS));
while (!toBeProcessedList.isEmpty()) {
    // do something with each customer and
    // remove the customer, and its duplicates, from the customersToBeProcessed
    toBeProcessedList = customerToBeProcessedRepo
            .findFirstXAsCustomer(new PageRequest(0, MAX_NR_TO_PROCESS));
}
If you use the paging support, a new SQL statement gets executed for each page requested, and unless you do something fancy (and probably stupid) these statements get executed in different transactions. This can lead to the user getting elements multiple times, or not seeing them at all, when moving from page to page.
Example: page size 3; elements to start: A, B, C, D, E, F
The user opens the first page and sees
A, B, C (total number of pages is 2)
Element X gets inserted after B; the user moves to the next page and sees
C, D, E (total number of pages is now 3)
If, instead of X being added, C gets deleted, page 2 will show
E, F
since D moves to the first page.
In theory one could have a long-running transaction with read stability (if supported by the underlying database) so one gets consistent pages, BUT this opens up questions like:
When does this transaction end, so the user gets to see new/changed data?
When does this transaction end when the user moves away?
This approach would have some rather high resource costs, while the actual benefit is not at all clear.
So in 99 of 100 cases the default approach is pretty reasonable.
Footnote: I kind of assumed relational databases, but other stores should behave in basically the same way.

How to get Goal Funnel Step data such as "entered" and "proceeded" through Query API?

When looking at the Goal Funnel report on the Google Analytics website, I can see not only the number of goal starts and completions but also how many visits there were to each step.
How can I find the step data through the Google Analytics API?
I am testing with the Query Explorer on a goal with 3 steps, whose first step is marked as Required.
I was able to get the starts and completions by using goalXXStarts and goalXXCompletions:
https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3A90593258&start-date=2015-09-12&end-date=2015-10-12&metrics=ga%3Agoal7Starts%2Cga%3Agoal7Completions
However, I can't figure out a way to get the data for the goal's second step.
I tried using ga:users or ga:uniquePageviews with the URL of step 2 and ga:previousPagePath set to step 1 (required = true), and adding to that the ga:users or ga:uniquePageviews from the next stage with ga:previousPagePath of step 1 (since it is required = true) for backfill.
I also tried other combinations, but could never reach the right number or anything close to it.
One technique that can be used to perform conversion funnel analysis with the Google Analytics Core Reporting API is to define a segment for each step in the funnel. If the first step of the funnel is a 'required' step, then that step must also be included in segments for each of the subsequent steps.
For example, if your funnel has three steps named A, B, and C, then you will need to define a segment for A, another for B, and another again for C.
If step A is required then:
Segment 1: viewed page A,
Segment 2: viewed page A and viewed page B,
Segment 3: viewed page A and viewed page C.
Otherwise, if step A is NOT required then:
Segment 1: viewed page A,
Segment 2: viewed page B,
Segment 3: viewed page C.
To obtain the count for each step in the funnel, you perform a query against each segment to obtain the number of sessions where that segment matches. Additionally, you can query the previous and next pages, including entrances and exits, for each step (if you need to); in which case, query previousPagePath and pagePath as dimensions along with metrics uniquePageviews, entrances and exits. Keep in mind the difference between 'hit-level' vs 'session-level' data when performing, constructing and interpreting the results of each query.
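The per-step segment logic described above can be illustrated on toy data. This sketch does not call the Google Analytics API; `funnel_counts` is a hypothetical helper that counts, at session level, how many sessions would match the segment for each step, with each session modelled as the set of page paths it viewed.

```python
def funnel_counts(sessions, steps, first_step_required):
    """Count sessions matching one segment per funnel step.

    `sessions` is a list of sets of page paths viewed in each session.
    """
    counts = {}
    for step in steps:
        required = {step}
        if first_step_required:
            required.add(steps[0])  # every segment also demands the first step
        counts[step] = sum(1 for pages in sessions if required <= pages)
    return counts


sessions = [
    {"/a", "/b", "/c"},
    {"/a", "/b"},
    {"/b", "/c"},  # skipped step A
    {"/a"},
]
print(funnel_counts(sessions, ["/a", "/b", "/c"], first_step_required=True))
# {'/a': 3, '/b': 2, '/c': 1}
print(funnel_counts(sessions, ["/a", "/b", "/c"], first_step_required=False))
# {'/a': 3, '/b': 3, '/c': 2}
```

The session that skipped step A is what makes the two results differ, which is the effect of marking the first step as required.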
You can also achieve similar results by using sequential segmentation which will offer you finer control over how the funnel steps are counted, as well as allowing for non-sequential funnel analysis if required.

REST pagination content duplicates

Suppose you are creating a REST application that returns a collection of items (a topic with a collection of posts), sorted from newest to oldest.
If HATEOAS principles are followed and all content is chunked into pages, the client will get, for example, the current page id, offset, data limits and links to the first, current and next page.
There is no problem getting data from the next page, but if somebody adds content while the client is reading the current page, the new data is pushed onto the start of the collection and the last item of the current page moves to the next page.
If you just skip posts that have already been loaded, you get fewer items on the next page. One option is to get a count of the items pushed onto the start of the list and increment the offset by it.
What are the best practices for this?
Instead of offset indexes, using skip tokens that indicate the first value not to include (or the first value to include) is a good technique, provided the value is unique for every item in your result set and is an orderable field based on the current sort. But it's not flawless. Usually this just doesn't matter.
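A minimal in-memory sketch of the skip-token idea follows. It assumes each post has a unique `id` that grows with creation time, so that "newest first" is the same as "highest id first"; `page_after` and the `after_id` token are hypothetical names.

```python
def page_after(posts, limit, after_id=None):
    """Return up to `limit` posts, newest first, strictly older than `after_id`."""
    newest_first = sorted(posts, key=lambda p: p["id"], reverse=True)
    if after_id is not None:
        newest_first = [p for p in newest_first if p["id"] < after_id]
    return newest_first[:limit]


posts = [{"id": i} for i in range(1, 7)]  # ids 1..6, higher id = newer
page1 = page_after(posts, limit=3)
posts.append({"id": 7})                   # a new post arrives while page 1 is read
page2 = page_after(posts, limit=3, after_id=page1[-1]["id"])
print([p["id"] for p in page1], [p["id"] for p in page2])  # [6, 5, 4] [3, 2, 1]
```

With offset paging the second request would have returned ids 4, 3, 2, repeating post 4. Against a SQL store the same filter would typically be expressed as a `WHERE id < :after_id ORDER BY id DESC LIMIT :limit` query rather than an `OFFSET`.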
If it really does matter you have to put IDs of everything that's in the first page in the call to 2nd page, and again and again. HATEOAS helps you do stuff like this...but it can get very messy and still things can pop up on page 1 given the current sorting when you make a request for page 5...what do you do with that?
Another trick to avoid dupes in a UI is to use the self or canonical link relationships to uniquely identify resources in a page and compare those to existing resources in the UI. Updating the UI with the latest matching resources is usually a simple task. This of course puts some burden on the client.
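The client-side comparison by self link might look like the sketch below. The `links.self` shape of each resource and the `merge_page` helper are assumptions made for illustration; real HATEOAS payloads vary by media type.

```python
def merge_page(shown, page):
    """Merge `page` into `shown`, keyed by each resource's self link."""
    index = {item["links"]["self"]: pos for pos, item in enumerate(shown)}
    for item in page:
        key = item["links"]["self"]
        if key in index:
            shown[index[key]] = item  # refresh the copy already in the UI
        else:
            index[key] = len(shown)
            shown.append(item)
    return shown


shown = [
    {"links": {"self": "/posts/3"}, "title": "third"},
    {"links": {"self": "/posts/2"}, "title": "second"},
]
page = [
    {"links": {"self": "/posts/2"}, "title": "second (edited)"},  # pushed onto this page
    {"links": {"self": "/posts/1"}, "title": "first"},
]
merge_page(shown, page)
print([(p["links"]["self"], p["title"]) for p in shown])
# [('/posts/3', 'third'), ('/posts/2', 'second (edited)'), ('/posts/1', 'first')]
```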
There is no one-size-fits-all solution to this problem. You have to design for the UX you intend to fulfill.