How to approximate processing time? - iphone

It's common to see messages like "Installation will take approx. 10 min" in desktop applications, so I wonder how I can estimate how much time a certain process will take. Of course I won't install anything, but I want to update some internal data, and depending on the user's usage this might take some time.
Is this possible in an iPhone app? How do Cocoa developers do this, and would it work the same way in iPhone apps?
Thanks in advance.
UPDATE: I want to rewrite/edit some files on disk. Most of the time these files are not the same size, so I cannot time the first iteration and calculate the rest from that.
Is there any API that helps with calculating this?

If you have a list of things to process, each "thing" (or, usually better, a group of 10 or so "things") is a unit of work. Your goal is to measure how long it takes to process a single group and report the estimated time to completion.
One way is to create an NSDate at the start of each group and a new one at the end (the top and bottom of your for loop). Multiply the difference in seconds by however many groups you have left (not counting the one you just processed) and that should be a reasonable estimate of the time remaining.
Of course this gets more complicated if one "thing" takes a lot longer to process than another; the approach above assumes all things take the same amount of time. In that case, you may need to keep a moving average over a window (the last n "things" or groups thereof).
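The moving-average idea can be sketched roughly as follows. This is plain Python rather than Cocoa code, and the names EtaEstimator and process_all are made up for illustration; in an app the timing would come from NSDate as described above.

```python
import time
from collections import deque


class EtaEstimator:
    """Estimate remaining time from a moving average of recent group durations."""

    def __init__(self, window=5):
        # Only the most recent `window` durations are kept.
        self.durations = deque(maxlen=window)

    def record(self, seconds):
        """Store how long the group that just finished took."""
        self.durations.append(seconds)

    def remaining(self, groups_left):
        """Return the estimated seconds left, or None before any measurement."""
        if not self.durations:
            return None
        average = sum(self.durations) / len(self.durations)
        return average * groups_left


def process_all(groups, process_group, report, window=5):
    """Process every group and call `report` with the new estimate after each one."""
    estimator = EtaEstimator(window)
    for index, group in enumerate(groups):
        started = time.monotonic()
        process_group(group)
        estimator.record(time.monotonic() - started)
        report(estimator.remaining(len(groups) - index - 1))
```

A small window reacts quickly when file sizes change; a larger one gives a steadier number. Which is better depends on how uneven the work is.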
A more detailed response would require more details about your model and what work you're performing.


Firebase analytics - Unity - time spent on a level

Is there any way to get the exact time spent on a certain level in a game via Firebase Analytics? Thank you so much 🙏
I tried to use logEvents.
The best way to do so would be measuring the time on the level within your codebase, then have a very dedicated event for level completion, in which you would pass the time spent on the level.
Let's get to details. I will use Kotlin as an example, but it should be obvious what I'm doing here and you can see more language examples here.
firebaseAnalytics.setUserProperty("user_id", userId)
firebaseAnalytics.logEvent("level_completed") {
    param("name", levelName)
    param("difficulty", difficulty)
    param("subscription_status", subscriptionStatus)
    param("minutes", minutesSpentOnLevel)
    param("score", score)
}
Now see how I have a bunch of parameters with the event? These parameters are important since they will let you conduct a more thorough and robust analysis later on and answer more questions. For example: What is the most difficult level? Do people still have trouble with it when the game difficulty is lower? How many times has this level been rage-quit or lost? (For that you'd likely need a level_started event.) What about our paid players, are they having similar trouble on this level as well? How many people have rage-quit the game on this level and never played again? That last one would likely be easier to answer with SQL, taking the latest value of the level name for level_started, grouped by user_id. Or you could also set levelName as a UserProperty as well as an event property; then it would be fairly trivial to answer in the default analytics interface.
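To make the "latest level_started per user" idea concrete, here is a minimal sketch in Python instead of SQL. The record layout (keys "user_id", "event", "name", "timestamp") is an assumption for illustration, not the actual export schema.

```python
from collections import Counter


def last_level_started(events):
    """Map each user_id to the name of the most recent level they started.

    `events` is an iterable of dicts with the hypothetical keys
    "user_id", "event", "name" and "timestamp".
    """
    latest = {}
    for event in events:
        if event["event"] != "level_started":
            continue
        seen = latest.get(event["user_id"])
        if seen is None or event["timestamp"] > seen["timestamp"]:
            latest[event["user_id"]] = event
    return {user: event["name"] for user, event in latest.items()}


def levels_by_last_start(events):
    """Count how many users have each level as their most recent start."""
    return Counter(last_level_started(events).values())
```

Restricting the input to users who have been inactive for some time would then point at the levels people quit on.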
Note that you're limited in the number of event parameters you can send per event. The total number of unique parameter names is limited too, as is the number of unique event names you're allowed to have. In our case, the event name would be level_completed. See the limits here.
Because of those limitations, it's important to name your event properties in a somewhat generic way so that you can efficiently reuse them elsewhere. For this reason, I named it minutes and not something like minutes_spent_on_the_level. You could then reuse this property to send the minutes the player spent actively playing, minutes the player spent idling, minutes the player spent on any info page, minutes they spent choosing their upgrades, etc. The same idea applies to having a name property rather than level_name. It could just as well be id.
You need to carefully and thoughtfully stuff your event with event properties. I normally have a wrapper around the Firebase SDK, in which I enrich events with dimensions that I always want to be there, like user_id or subscription_status, so that I don't have to add them manually every time I send an event. I also usually have some more adequate logging there, because Firebase Analytics' default logging is completely awful. I also have some sanitizing there: lowercasing all values unless I'm passing something case-sensitive like base64 values, making sure I don't have double spaces (so replacing \s+ with " " (space)), and maybe also adding the user's local timestamp as another parameter. The latter is very helpful for spotting time-cheating users, especially if your game is an idler.
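The enriching and sanitizing part of such a wrapper could look roughly like this. It is a sketch in plain Python that only prepares the parameter dictionary; enrich_event and its arguments are hypothetical names, and the actual call into the Firebase SDK is left out.

```python
import re
import time


def enrich_event(params, defaults, case_sensitive=(), now=time.time):
    """Return a cleaned copy of `params` merged with always-present dimensions.

    `defaults` holds dimensions such as user_id or subscription_status;
    keys listed in `case_sensitive` keep their original casing.
    """
    merged = {**defaults, **params}
    cleaned = {}
    for key, value in merged.items():
        if isinstance(value, str):
            # Collapse any run of whitespace into a single space.
            value = re.sub(r"\s+", " ", value)
            if key not in case_sensitive:
                value = value.lower()
        cleaned[key] = value
    # Local timestamp in whole seconds, useful for spotting time cheating.
    cleaned["local_timestamp"] = int(now())
    return cleaned
```

The `now` argument exists only so the clock can be replaced in tests; in production the default is enough.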
Good. We're halfway there :) Bear with me.
Now you need to go to Firebase and register your eps (event parameters) as cds (custom dimensions and metrics). Eps that you don't register don't count towards the global cd limit (it's about 50 custom dimensions and 50 custom metrics), but they also won't be available in reports. You register the cds in the Custom Definitions section of Firebase.
Now you need to know whether each one is a dimension or a metric, as well as the scope of your dimension. It's much easier than it sounds. The rule of thumb is: if you want to be able to run mathematical aggregation functions on the value, then it's a metric. Otherwise, it's a dimension. So:
firebaseAnalytics.setUserProperty("user_id", userId) <-- dimension
param("name", levelName) <-- dimension
param("difficulty", difficulty) <-- dimension (or can be a metric, depends)
param("subscription_status", subscriptionStatus) <-- dimension (can be a metric too, but even less likely)
param("minutes", minutesSpentOnLevel) <-- metric
param("score", score) <-- metric
Now another important thing to understand is the scope. Because Firebase and GA4 are essentially still in beta and being actively worked on, you only have user or hit scope for dimensions and only hit scope for metrics. The scope basically just indicates how the value persists. In my example, we only need user_id as a user-scoped cd. Because user_id is a user-level dimension, it is set separately from the logEvent function. Although I suspect you can do it there too. Haven't tried tho.
Now, we're almost there.
Finally, you don't want to use Firebase to look at your data. It's horrible at data presentation. It's good at debugging though, because that's what it was initially intended for. Because of how horrible it is, it's always advised to link it to GA4, which will let you look at the Firebase values much more efficiently. Note that you will likely need to re-register your custom dimensions from Firebase in GA4, because GA4 can take multiple data streams, of which Firebase would be just one. GA4's cd limits are very close to Firebase's, though. OK, let's be frank: GA4's data model is almost exactly copied from Firebase's. But GA4 has much better analytics capabilities.
Good, you've moved to GA4. Now, GA4 is a very raw, not-officially-beta product, just like Firebase Analytics. Because of that, it's advised to first raise your data retention to the longest option available (14 months on standard properties) and only use the explorer for analysis, pretty much ignoring the pre-generated reports. They are just not very reliable at this point.
Finally, you may find it easier to just use SQL to get your analysis done. For that, you can easily copy your data from GA4 to a sandbox instance of BQ. It's very easy to do. This is the best, most reliable known method of using GA4 at this moment. I mean, advanced analysts export into BQ, then ETL the data from BQ into proper storage like Snowflake, S3, Aurora, or whatever you prefer, and then use a proper BI tool like Looker, Power BI, Tableau, etc. on top of that. A lot of people just stay in BQ though, and that's fine. Lots of BI tools have BQ connectors; it's just that BQ gets expensive quickly if you do a lot of analysis.
Whew, I hope you'll enjoy analyzing your game's data. Data-driven decisions rock in games. Well... They rock everywhere, to be honest.

How to determine when to start a counter to ensure it never catches the previous counter

I have a problem where I have several events that are occurring in a project, the events happen semi-concurrently, where they do not start at the same time but multiple can still be occurring at once.
Each event is a team of people working on a linear task, starting at the beginning and then working their way to the end. Their progress is based on a physical distance.
I essentially need to figure out each event's start time so that no teams are at the same location, or passing each other, at any point.
I am trying to program this in MATLAB so that the output would be the start and end time for each event. The idea would be to optimize the total time taken for the project.
I am not sure where to begin with something like this so any advice would be greatly appreciated.
If I understand correctly, you just want to optimize the "calendar" of events with limited resources (i.e. space and teams).
Problems of this kind are typically NP-hard, and there is no "easy" way to search for the best solution.
You have two options here:
Greedy-like algorithm: you will have a solution in a reasonable time, but it won't necessarily be the best one.
Brute-force-like algorithm: you will find the best solution, but maybe not in the time you need it.
Usually, if the number of events is low, you can go for the second option; if it isn't, you may need to go for the first one.
No matter which one you choose, the first thing you will need to do is check whether a solution is valid. What does this mean? It means checking, for every event, whether it collides with others in time, space and teams.
So let's imagine the problem of making the timetable at a university. There you have to think about:
Students
Teacher
Classroom
So for each event I have to check whether another event has the same students, teacher or classroom at the same time. First I find the events that overlap in time with the current event. Then I compare the current event with each of them.
Once you have this done, you can write a greedy algorithm that places events in time one by one, just checking whether each collides with any other.
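A minimal sketch of that greedy approach, written in Python rather than MATLAB. It assumes integer time slots and that each event lists the resources it needs (students, teacher, classroom) as a set; the field names are made up for illustration.

```python
def collides(a, b):
    """Two placed events collide if they overlap in time and share a resource."""
    overlap = (a["start"] < b["start"] + b["duration"]
               and b["start"] < a["start"] + a["duration"])
    return overlap and bool(a["resources"] & b["resources"])


def greedy_schedule(events, horizon):
    """Place each event at the earliest start that collides with nothing placed so far.

    Events are handled in the order given, so the result is valid but not
    necessarily the shortest possible schedule.
    """
    placed = []
    for event in events:
        for start in range(horizon - event["duration"] + 1):
            candidate = {**event, "start": start}
            if not any(collides(candidate, other) for other in placed):
                placed.append(candidate)
                break
        else:
            raise ValueError(f"no free slot for {event['name']}")
    return {e["name"]: (e["start"], e["start"] + e["duration"]) for e in placed}
```

The order in which events are fed in changes the outcome, so trying a few orderings (longest first, for example) is a cheap way to improve on a single greedy pass.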

Why is my identifier collision rate increasing?

I'm using a hash of IP + User Agent as a unique identifier for every user that visits a website. This is a simple scheme with a pretty clear pitfall: identifier collisions. Multiple individuals browse the internet with the same IP + user agent combination. Unique users identified by the same hash will be recognized as a single user. I want to know how frequently this identifier error will be made.
To calculate the frequency, I've created a two-step funnel that should theoretically convert at zero percent: publish.click > signup.complete. (Users have to sign up before they publish.) Running this funnel for 1 day gives me a conversion rate of 0.37%. That figure, I assumed, is my unique identifier collision probability for that funnel. Looking at the raw data (a table about 10,000 rows long), I confirmed this hypothesis: 37 signups were completed by new users identified by the same hash as old users who completed publish.click during the funnel period (1 day). (I know this because hashes matched up across the funnel, while UIDs, which are assigned at signup, did not.)
I thought I had it all figured out...
But then I ran the funnel for 1 week, and the conversion rate increased to 0.78%. For 5 months, the conversion rate jumped to 1.71%.
What could be at play here? Why is my conversion (collision) rate increasing with widening experiment period?
I think it may have something to do with the fact that unique users typically only fire signup.complete once, while they may fire publish.click multiple times over the course of a period. I'm struggling however to put this hypothesis into words.
Any help would be appreciated.
Possible explanations starting with the simplest:
The collision rate is relatively stable, but your initial measurement isn't significant because of the low volume of positives that you got. 37 isn't very many. In this case, you've got two decent data points.
The collision rate isn't very stable and changes over time as usage changes (at work, at home, using mobile, etc.). The fact that you got three data points that show an upward trend is just a coincidence. This wouldn't surprise me, as funnel conversion rates change significantly over time, especially on a weekly basis. Bots that we haven't caught could also play a part.
If you really get multiple publishes, and sign-ups are absolutely a one-time thing, then your collision rate would increase as users who only signed up and didn't publish eventually publish. That won't increase their funnel conversion, but it will provide an extra publish for somebody else to convert on. Essentially, every additional publish raises the probability that I, as a new user, am going to get confused with a previous publish event.
Note from OP. Hypothesis 3 turned out to be the correct hypothesis.
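Hypothesis 3 can be illustrated with a toy model. Assume, purely for illustration, that a new signup matches any one earlier publish hash with the same small probability and that matches are independent; real IP + user agent hashes are not that uniform. The chance of at least one false match then grows with the number of distinct publish hashes collected in the window.

```python
def false_conversion_probability(publish_hashes, match_probability):
    """Chance that a new signup matches at least one earlier publish hash.

    publish_hashes: number of distinct hashes that fired publish.click
    match_probability: assumed chance of matching any single one of them
    """
    return 1.0 - (1.0 - match_probability) ** publish_hashes
```

Under this model a longer funnel period means more accumulated publish hashes, so the measured "conversion" rises even though the per-pair collision chance stays the same.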

Postgresql Modeling Ranges

I'm looking to model a group of users that provide various services that take various amounts of time, and I'm hoping to build on the relatively new range data types that PostgreSQL supports to make things a lot cleaner.
Bookings:
user_id|integer
time|tsrange
service_id|integer
Services:
user_id|integer
time_required|integer #in hours
Users:
id|integer
Services vary between users: some might be identical but take one user 2 hours and another just 1 hour.
Searching for bookings that occur within, or not within, a given time period is easy. I'm having trouble figuring out how best to get all the users who have time available on a given day to perform one or more of their services.
I think I need to find the inverse of their booked ranges, bounded by 6am/8pm on a given day, and then see if the largest range within that set will fit their smallest service, but figuring out how to express that in SQL is eluding me.
Is it possible to do inverses of ranges, or would there be a better way to approach this?
The question isn't entirely clear, but if I get it right you're looking for the lead() or lag() window functions to compute available time slots. If so, this should put you on the right track:
-- for each booking, also fetch the same user's next booking
select bookings.*,
       lead("time") over (partition by user_id order by "time") as next_booking_time
from bookings;
http://www.postgresql.org/docs/current/static/tutorial-window.html
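The "inverse of the booked ranges" logic from the question can be sketched outside SQL to make the intended result clear. This is a Python illustration for a single user's day, with made-up function names, not a replacement for the window-function query.

```python
from datetime import datetime, timedelta


def free_gaps(bookings, day_start, day_end):
    """Return the unbooked (start, end) ranges between day_start and day_end."""
    gaps = []
    cursor = day_start
    for start, end in sorted(bookings):
        if start > cursor:
            # Everything from the cursor up to this booking is free.
            gaps.append((cursor, min(start, day_end)))
        cursor = max(cursor, end)
        if cursor >= day_end:
            break
    if cursor < day_end:
        gaps.append((cursor, day_end))
    return gaps


def can_fit(bookings, day_start, day_end, shortest_service_hours):
    """True if the longest free gap can hold the user's shortest service."""
    longest = max(
        (end - start for start, end in free_gaps(bookings, day_start, day_end)),
        default=timedelta(0),
    )
    return longest >= timedelta(hours=shortest_service_hours)
```

In SQL the same sweep corresponds to comparing the upper bound of each booking with the lower bound of the next one returned by lead(), plus the two edge gaps at 6am and 8pm.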

Are "swap move factories" worth the effort?

I noticed that for problems such as Cloudbalancing, move factories exist to generate moves and swaps. A "move move" transfers a cloud process from one computer to another. A "swap move" swaps any two processes from their respective computers.
I am developing a timetabling application.
A subjectTeacherHour (a combination of subject and teacher) has only a subset of Periods to which it may be assigned. If Jane teaches 6 hours at a class, there are 6 subjectTeacherHours, each of which has to be allocated a Period from the possible 30 Periods of that class; this is unlike the cloud balancing example, where a process can move to any computer.
Only one subjectTeacherHour may be allocated to a Period (naturally).
The solver tries to place each subjectTeacherHour in an eligible Period until an optimal combination is found.
Pros
The manual seems to recommend it.
...However, as the traveling tournament example proves, if you can remove
a hard constraint by using a certain set of big moves, you can win
performance and scalability...
...The [version with big moves] evaluates a lot less unfeasible
solutions, which enables it to outperform and outscale the simple
version....
...It's generally a good idea to use several selectors, mixing fine
grained moves and course grained moves:...
While only one subjectTeacher may be allocated to a Period, without a swap move the solver must temporarily break that constraint to discover that swapping two particular Period allocations leads to a better solution. A swap move "removes this brick wall" between those two states.
So a swap move can help lead to better solutions much faster.
Cons
A subjectTeacher has only a subset of Periods to which it may be assigned, so finding intersecting (common) hours between any two subjectTeachers is a bit tough (but doable in an elegant way: Good algorithm/technique to find overlapping values from objects' properties?).
Will it give me only small gains in time and optimality?
I am also worried about the crazy interactions that having two kinds of moves may cause, leading to getting stuck at a bad solution.
Swap moves are crucial.
Consider 2 courses assigned to a room which is fully booked. Without swapping, the solver would have to break a hard constraint by moving 1 course to a conflicted room and choose that move as the step (which is unlikely).
You can use the built-in generic swap MoveFactory. If you write your own, you can have the swap move's isDoable() return false when it would move either side to an ineligible period.
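The eligibility check behind such an isDoable() can be sketched as follows. This is plain Python for illustration only, not the solver's actual move API; assignment maps each subjectTeacherHour to its current Period, and eligible maps it to the Periods it may use (both hypothetical structures).

```python
def swap_is_doable(assignment, eligible, a, b):
    """A swap is doable only if each side is eligible for the other's period."""
    period_a, period_b = assignment[a], assignment[b]
    if period_a == period_b:
        # Swapping identical periods would change nothing.
        return False
    return period_b in eligible[a] and period_a in eligible[b]


def do_swap(assignment, a, b):
    """Return a new assignment with the periods of `a` and `b` exchanged."""
    swapped = dict(assignment)
    swapped[a], swapped[b] = assignment[b], assignment[a]
    return swapped
```

Because the check only looks at the two periods currently held, there is no need to precompute the intersection of eligible periods for every pair of subjectTeacherHours.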