How should modified pages be written out to secondary memory using Page Buffering? - operating-system

I am studying for my final OS exam and am currently stuck in a question:
Assume a system uses demand paging as its fetch policy.
The resident size is 2 pages.
Replacement policy is Least Recently Used (LRU).
Initial free frame list: 10, 20, 30, 40, 50
Assume a program runs with the following sequence of page references:
3(read), 2(read), 3(write), 1(write), 1(write), 0(write), 3(read)
I am asked to show the final contents of the free frame list, modified list, and the page table.
Here is the model answer.
This is what I managed to do.
The final Resident Set is correct, but the free frame list and the modified list are not. I just cannot see why the modified list does not contain page number 0 (implying it was already written out to secondary memory), while page number 1 was not written out even though it was referenced before page 0.
Any help would be appreciated.

Why do you recycle 3(10) to the free list in step 4? Page 3 is the more recently used of the two resident pages (and is dirty), so you would want to keep it and get rid of 2(20), the least recently used page, instead. That appears to be what the model answer is based on.
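For what it's worth, here is a rough simulation (a sketch, not part of the question or the model answer) of the policy described in the comment above, assuming: clean victims go to the tail of the free frame list, dirty victims go to the modified list, and a faulting page whose frame is still on the modified list is reclaimed without disk I/O. Reclaiming clean pages from the free list is left out since no clean page is re-referenced in this string. For the given reference string it ends with page 0 still resident, page 1 (frame 30) on the modified list, and frames 50 and 20 free, which is why page 0 never appears on the modified list.

RESIDENT_SIZE = 2
free_list = [10, 20, 30, 40, 50]      # free frame numbers, head is used first
modified_list = []                    # (frame, page) pairs of evicted dirty pages
resident = []                         # (page, frame, dirty) tuples, front = LRU

refs = [(3, "r"), (2, "r"), (3, "w"), (1, "w"), (1, "w"), (0, "w"), (3, "r")]

for page, mode in refs:
    hit = next((e for e in resident if e[0] == page), None)
    if hit:                                        # hit: move the page to the MRU end
        resident.remove(hit)
        resident.append((page, hit[1], hit[2] or mode == "w"))
        continue

    if len(resident) == RESIDENT_SIZE:             # fault with a full resident set
        victim_page, victim_frame, victim_dirty = resident.pop(0)
        if victim_dirty:
            modified_list.append((victim_frame, victim_page))   # held for write-out
        else:
            free_list.append(victim_frame)                      # frame reusable as-is

    # page buffering: reclaim the page if its frame is still on the modified list
    buffered = next((e for e in modified_list if e[1] == page), None)
    if buffered:
        modified_list.remove(buffered)
        frame, dirty = buffered[0], True           # still dirty, never written out
    else:
        frame, dirty = free_list.pop(0), False
    resident.append((page, frame, dirty or mode == "w"))

print("resident set :", [(p, f) for p, f, _ in resident])   # [(0, 40), (3, 10)]
print("free list    :", free_list)                          # [50, 20]
print("modified list:", modified_list)                      # [(30, 1)]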

Can you calculate active users using time series

My atomist client exposes metrics on commands that are run. Each command is a metric with a username element as well as a status element.
I've been scraping this data for months without resetting the counts.
My requirement is to show the number of active users over a time period, i.e. 1h, 1d, 7d and 30d, in Grafana.
The original query was:
count(count({Username=~".+"}) by (Username))
This is an issue because I don't clear the metrics, so it's always a count since inception.
I then tried this:
count(
  max_over_time(help_command{job="Application Name",Username=~".+"}[1w])
  -
  max_over_time(help_command{job="Application name",Username=~".+"}[1w] offset 1w)
  > 0
)
which works, but only for one command; I have about 50 other commands that would need to be added to that count.
I also tried:
{__name__=~".+_command",job="app name"}[1w] offset 1w
but this is obviously very expensive (it times out in the browser) and has issues with max_over_time, which doesn't support it.
Any help would be appreciated. Am I using the metric in the wrong way? Is there a better way to query this? My only option at the moment is to repeat the working count format above for each command.
Thanks in advance.
To start, I will point out a number of issues with your approach.
First, the Prometheus documentation recommends against using arbitrarily large sets of values for labels (as your usernames are). As you can see (based on your experience with the query timing out) they're not entirely wrong to advise against it.
Second, Prometheus may not be the right tool for analytics (such as active users). Partly due to the above, partly because it is inherently limited by the fact that it samples the metrics (which does not appear to be an issue in your case, but may turn out to be).
Third, you collect separate metrics per command (i.e. help_command, foo_command) instead of a single metric with the command name as a label (i.e. command_usage{command="help"}, command_usage{command="foo"}).
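If you control the instrumentation, that layout looks roughly like the sketch below (using the Python prometheus_client; the metric and label names are made up for illustration, not taken from your setup):

from prometheus_client import Counter, start_http_server

# One counter for every command, distinguished by labels instead of by metric name.
COMMAND_USAGE = Counter(
    "command_usage_total",
    "Number of times a command was run",
    ["command", "username", "status"],
)

def record_command(command, username, status):
    # e.g. record_command("help", "alice", "ok")
    COMMAND_USAGE.labels(command=command, username=username, status=status).inc()

if __name__ == "__main__":
    start_http_server(8000)   # expose /metrics for Prometheus to scrape
    record_command("help", "alice", "ok")

(The username label still carries the high-cardinality caveat from the first point.)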
To get back to your question though: you don't need max_over_time, you can simply write your query as:
count by(__name__)(
  (
    {__name__=~".+_command",job="Application Name"}
    -
    {__name__=~".+_command",job="Application name"} offset 1w
  ) > 0
)
This only works, though, because you say that whatever exports the counts never resets them. If that is simply because the exporter has never restarted, and the counts will drop to zero when it does, then you'd need to use increase instead of subtraction, and you'd run into the exact same performance issues as with max_over_time:
count by(__name__)(
  increase({__name__=~".+_command",job="Application Name"}[1w]) > 0
)

PowerApps datasource to overcome 500 visible or searchable items limit

For PowerApps, what data sources, other than SharePoint lists, are accessible via PowerShell?
There are actually two issues that I am dealing with. The first is dynamic updating and the second is the 500 item limit that SharePoint lists are subject to.
I need to dynamically update my data source, which I am currently doing with PowerShell. My data source is not static, and updating records by hand is time-consuming and error prone. The driving force behind my question is that the SharePoint list view threshold is 5,000 records; however, you are limited to 500 visible and searchable records when using SharePoint lists in the Gallery view, and my data source contains more than 500 but fewer than 1,000 records. If any items beyond the 500th record should match the filter criteria, they will not be found. So SharePoint lists are not an option for me until that limitation is remediated.
Reference: https://powerapps.microsoft.com/en-us/tutorials/function-filter-lookup/
To your first question, PowerShell can be used for almost anything on the Microsoft stack. You could use SQL Server, Dynamics 365, SharePoint, Azure, and in the future there will be an SDK for the Common Data Service. There are a lot of connectors, and PowerShell can work with a good majority of them.
Take note that working with these data structures through PowerShell is independent from PowerApps. PowerApps just takes the data that the data connector gives it, so if you have something updating the data in the background (PowerShell, a cron job, etc.), the app only sees those changes once the data source is refreshed. To get a dynamic list of items, you can use a Timer control and a Refresh function on your data source to update the list every ~5-20 seconds.
To your second question about SharePoint, there is an article that came out around the time you asked this regarding working with large lists. I wouldn't say it completely solves your problem, but it seems to state that using the "Filter" function on basic column types could work for you:
...if you’d like to filter the set of items that you are showing in the gallery control, you will make use of a “Filter” expression, rather than the “Search” expression, which is the default that existing apps used. With our changes, SharePoint connector now supports “equals” type of queries on columns that support filtering (Single line of text, choice, numbers, dates and people), so make sure that the columns and the expressions you use are supported and watch for the same warning to avoid reverting back to the top 500 items.
It also notes that if you want to pull from a list larger than the 5k threshold, you would need to use indexes. I have not fully tested this yet, but it seems this could potentially solve your problem.

Popularity of each wikipedia article

I would like to store a list of all en.wikipedia articles in my database. For each article I want to store the pageid, title and the popularity. I thought about using the view count (over the last month) as a measurement for popularity but if that is not possible, I could imagine going for something else (maybe use the number of revisions). I'm aware of http://dumps.wikimedia.org/enwiki/latest/ and that I can get a full list of articles from there (current count 36508337). However, I can not find a clever way to get the view count for each article.
// Updates, Edits, ...
The suggested duplicate does not help me because
a) I was looking for a popularity measurement. The answer to the other question just states that it is not possible to get the number of watchers for a page, which is fine with me.
b) There is no answer there that gives me the page views (or any other metric) for every page.
Okay I'm finally done. Here is what I did:
I found http://dumps.wikimedia.org/other/pagecounts-ez/ which provides page views per month. This seems promising, but they don't mention the pageid, so what I'm doing is getting a list of all articles from http://dumps.wikimedia.org/enwiki/latest/, creating a mapping name->pageid, and then parsing the pagecount dump (a rough sketch of this step is at the end of this answer). This takes about 30 minutes; here are some statistics:
68% of the articles in the page count file do not exist in the latest dump. This is probably due to some users linking to, for example, Misfits_(TV_series) while others link to Misfits_(tv_series), and even things like Misfits_%28TV_series%29... I did not bother with those because my program already took long enough to run.
The top 3 pages are:
1. Front page with 639 million views (in the last month)
2. Malware with 8.5 million views
3. Falcon 9 v1.1 with 4.7 million views (cool!)
I made a histogram for the number of pages with a certain view count, here it is:
I also plotted the number of pages I would have to deal with when I disregard all articles below a certain view count. Here it is:
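For anyone curious, here is a rough sketch of the mapping-and-parse step mentioned above. The file names and layouts are assumptions for illustration: a tab-separated "pageid<TAB>title" listing derived from the enwiki dump, and a whitespace-separated pagecounts-ez totals file of the form "project title monthly_views ...", with en.z assumed to be the project code for English Wikipedia.

import csv

def load_title_to_pageid(path):
    # Build the title -> pageid mapping from the article listing.
    mapping = {}
    with open(path, encoding="utf-8") as f:
        for pageid, title in csv.reader(f, delimiter="\t"):
            mapping[title] = int(pageid)
    return mapping

def popularity(pagecounts_path, title_to_pageid, project="en.z"):
    # Return {pageid: monthly_views} for titles that exist in the dump.
    views = {}
    missing = 0
    with open(pagecounts_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3 or parts[0] != project:
                continue
            title, count = parts[1], parts[2]
            pageid = title_to_pageid.get(title)
            if pageid is None:
                missing += 1          # titles like Misfits_(tv_series) end up here
                continue
            views[pageid] = views.get(pageid, 0) + int(count)
    print("titles not found in the dump:", missing)
    return views

# Example usage (hypothetical file names):
# mapping = load_title_to_pageid("enwiki-latest-page-ids.tsv")
# views = popularity("pagecounts-2015-09-views-ge-5-totals", mapping)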

Google Search Appliance (GSA) feeds - unpredictable behavior

We have a metadata-and-url feed and a content feed in our project. The indexing behaviour of the documents submitted using either feed is completely unpredictable. For the content feed, the documents get removed from the index after a random interval every time. For the metadata-and-url feed, the additional metadata we add is ignored, again randomly. The documents themselves do remain in the index in the latter case; only our custom metadata gets removed. Basically, it looks like the feeds get "forgotten" by the GSA after some time. What could be the cause of this issue, and how do we go about debugging it?
Points to note:
1) Due to unavoidable reasons, our GSA index is always hovering around the license limit (+/- 1000 documents or so). Could this have an effect? Are feeds purged when nearing license limit? We do have "lock = true" set in the feed records though.
2) These fed documents are not linked to from pages and hence (I believe) would have low page rank. Are feeds automatically purged if not linked to from pages?
3) Our follow patterns include the fed documents.
4) We do not use action=delete with the same documents, so that possibility is ruled out. Also for the content feed we always post all the documents. So they are not removed through feeds.
When you hit the license limit, the GSA will start dropping documents from the index, so I'd say that's definitely your problem.

ExpressionEngine missing channel entries

I am working on a new web app which is based on ExpressionEngine, and for the most part I am basing the content on channel entries. However, I am experiencing some very weird issues with the {exp:channel:entries} tag in that it is not returning all relevant entries each time. I can't figure out what's going on with it, as the entries are definitely available when viewing them in the control panel, and they will also show up as requested in my template, but sometimes they just disappear and/or are not processed properly. This happens with both large and small sets of entries, ranging from 3 channel entries which fit the criteria specified within the exp tag up to 500 entries.
Any thoughts or feedback would be greatly appreciated.
There could be a number of things going on here, so here are some things to look at, just in case:
If the entries have entry dates in the future - you'll need your channel entries tag to have the parameter show_future_entries = "yes"
Likewise, if the entries are closed or expired, you'll need to add status="open|closed" (and show_expired="yes" for expired entries)
Are you looking at a particular category and these entries aren't assigned to the category?
Are you looking at a particular category but have excluded category data from the entries tag?
Are you retrieving more than 100 entries? There is a default limit of 100 entries returned unless you specify a limit parameter.