Swift Firebase first write incredibly slow

I have an app in production and am doing some updates. I use an old iPhone 5 for testing, and I've noticed that the write times for initial writes to the Firebase database are incredibly slow. When I say slow, I mean it's taking over a minute to do a single write to a single node (70-80 seconds).
Here is the write request (it's very simple):
FB.ref
.child(FB.members)
.child(currentUser.firstName)
.updateChildValues([FB.currentPoints: currentUser.currentPoints])
The issue is, this only happens on the first write. Subsequent writes are much faster (a couple of seconds at the most), and usually instant.
What I've Tried
I've read the documentation, and also this post here on SO, but it's not helping me.
I also did tests on the simulator and found that the write time for the initial write is about 7 seconds. That still seems long to me, but it's nowhere near the 70-80 seconds on the physical device.
Simulator initial write: 7 seconds (slow). Subsequent writes: instant-ish
iPhone 5 initial write: 80 seconds (extremely slow!). Subsequent writes: instant-ish
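For reference, one way to time these writes is to attach a completion block to the same call and log when the server confirms the write. This is just a sketch; FB and currentUser are the same helpers as in the snippet above, and the start constant exists only for measurement:

import FirebaseDatabase

// Measure how long the server takes to confirm a single write.
// FB and currentUser are the app's own helpers from the snippet above.
let start = Date()
FB.ref
    .child(FB.members)
    .child(currentUser.firstName)
    .updateChildValues([FB.currentPoints: currentUser.currentPoints]) { error, _ in
        if let error = error {
            print("Write failed: \(error.localizedDescription)")
        } else {
            print("Write confirmed after \(Date().timeIntervalSince(start)) seconds")
        }
    }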
Any ideas on how to speed up that initial write? It's very problematic. For example, if a user logs in to make a quick change and then closes the app, the write will never happen (tested), or the delay will cause other inaccuracies within the app (also tested). What to do?
Is there a setting I can change on my database? Or a rule to adjust in my database rules? Or some code to include in my app?

Related

Throttling CPU usage in a Swift thread

I want to traverse the file tree for a potentially large directory in a macOS app. It takes about 3 mins for my example case if I just do it, but the CPU spikes to 80% or so for those 3 minutes.
I can afford to do it more slowly on a background thread, but am not sure of what the best approach would be.
I thought of just inserting a 1-millisecond sleep inside the loop, but I'm not confident that won't have some negative impact on scheduling, disk I/O, etc. An alternative would be to do 1 second of work, then wait 2-3 seconds, but I'm guessing there is something more elegant?
The core functionality I want is traversing a directory in a nested fashion checking file attributes:
let enumerator = FileManager.default.enumerator(atPath: filePath)
while let element = enumerator?.nextObject() as? String {
    // do something here
}
It's generally more energy-efficient to spike the CPU for a short time than to run it at a low level for a longer time. As long as your process has a lower priority than other processes, running the CPU at even 100% for a short time isn't a problem (particularly if it doesn't turn the fans on). Modern CPUs would like to be run very hard for short periods of time and then be completely idle. "Somewhat busy" for a longer time is much worse, because the CPU can't power off any subsystems.
Even so, users get very upset when they see high CPU usage. I used to work on system management software, and we spoke with Apple about throttling our CPU usage. They told us the above. We said, "Yes, but when users see us running at 100%, they complain to IT and try to uninstall our app." Apple's answer was to use sleep, like you're describing. If it makes your process take longer, it will likely have a negative overall impact on total energy use, but I wouldn't expect it to cause any other trouble.
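If you do go the sleep route, one simple pattern is to do a fixed-size chunk of work and then pause briefly, all on a low-priority background queue. A rough sketch, where the function name, batch size, and pause length are arbitrary and would need tuning for your workload:

import Foundation

// Sketch: throttled traversal on a low-priority background queue.
// batchSize and pause are illustrative values, not recommendations.
func throttledTraversal(of path: String, batchSize: Int = 500, pause: TimeInterval = 0.5) {
    DispatchQueue.global(qos: .utility).async {
        let enumerator = FileManager.default.enumerator(atPath: path)
        var processedInBatch = 0
        while let element = enumerator?.nextObject() as? String {
            // ... examine `element` / its attributes here ...
            _ = element
            processedInBatch += 1
            if processedInBatch >= batchSize {
                processedInBatch = 0
                Thread.sleep(forTimeInterval: pause)  // let the CPU idle briefly
            }
        }
    }
}

Using a .utility (or .background) quality-of-service class also lets the system deprioritize the work for you, without having to pick magic sleep values.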
That said, if you are scanning the same directory tree more than once, you should look at File System Events and File Metadata Search, which may perform these operations much more efficiently.
See also: Schedule Background Activity in the Energy Efficiency Guide for Mac Apps. I highly recommend this entire doc. There are many tools that have been added to macOS in recent years that may be useful for your problem. I also recommend Writing Energy Efficient Apps from WWDC 2017.
If you do need to scan everything directly with an enumerator, you can likely greatly improve things by using the URL-based API rather than the String-based API. It allows you to pre-fetch certain values (including attributeModificationDateKey, which may be of use here). Also, be aware of the fileAttributes property of DirectoryEnumerator, which caches the last-read file's attributes (so you don't need to query them again).
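For example, a sketch of the URL-based enumeration with pre-fetched resource values might look like the following; the path is a placeholder, and the keys are just the ones mentioned above:

import Foundation

// Sketch: URL-based enumeration with pre-fetched resource values.
let rootURL = URL(fileURLWithPath: "/path/to/scan")   // placeholder path
let keys: [URLResourceKey] = [.isDirectoryKey, .attributeModificationDateKey]

if let enumerator = FileManager.default.enumerator(at: rootURL,
                                                   includingPropertiesForKeys: keys,
                                                   options: [.skipsHiddenFiles]) {  // skipping hidden files just for this example
    for case let fileURL as URL in enumerator {
        if let values = try? fileURL.resourceValues(forKeys: Set(keys)),
           let modified = values.attributeModificationDate {
            // These values were pre-fetched during enumeration, so this lookup is cheap.
            print(fileURL.path, modified)
        }
    }
}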
Three minutes is a long time; it's possible you're doing more work than needed. Run your operation using the find command-line tool and use that as a benchmark for how much time it should take.

High load-avg on Heroku Postgres

About two weeks ago, I deployed some changes to my app (Flask + SQLAlchemy on top of Postgres) to Heroku. The response time of my dynos went up soon afterwards, and timeouts in responses started. Before these problems began, that version of the app had been running flawlessly for some 2-3 months.
Naturally, I suspected my changes to the app and went through them, but none of them seemed relevant (changes in the front end, replacing plain-text emails with HTML ones, minor changes to the static data the app uses).
I have a copy of the app for testing purposes, so I cloned the latest backup of the production DB and started investigating (the clone was some 45GiB, compared to 56GiB of the original, but this seems to be a normal consequence of "bloating").
It turns out that even trivial requests take a ridiculous amount of time on production, while they work as they should on the testing copy. For example, select * from A where some_id in (three, int, values) takes under 0.5 sec on testing, and some 12-15 sec on prod (A has 3M records and some_id is a foreign key to a much smaller table). Even select count(*) from A takes the same amount of time, so it's not indexing or anything like that.
This is not tied to a specific query or even a table, which removes my doubts about my code, since most of it was unchanged for months and worked fine until these problems started.
Looking further into this, I found that the logs contain load averages for the DB server, and my production one is showing load-avg 22 (I searched for postgres load-avg in Papertrail), and it seems to be almost constant (slowly rising over prolonged periods of time).
I upgraded the production DB from Postgres 9.6 / Standard 2 plan (although my connection count was around 105/400 and the cache hit rate was 100%) to Postgres 10 / Standard 3 plan, but this didn't make the slightest improvement. The upgrade also meant some 30-60 min of downtime. Soon after bringing the app back up, the DB server's load was high again (sadly, I didn't check during the downtime). Also, the DB server's load doesn't show spikes that would reflect the app's usage pattern (the app is mostly used in the USA and EU, and the app's usual load reflects that).
At this point, I am out of ideas (apart from contacting Heroku's support, which a colleague of mine will do) and would appreciate any suggestions on what to look at or try next.
I ended up upgrading from standard-2 to standard-7, and my DB's load dropped to around 0.3-0.4. I don't have an explanation for why it started so suddenly.

Wait/notify mechanism for multiple readers in Oracle SQL?

We have multiple processes which read one database table, get an available record, and work with it. This works fine.
When there is no record in the table, each process waits 5 seconds and reads it again.
So a record could sit idle in the table for up to 5 seconds, which is not good.
What would be the recommended solution to eliminate this waiting and proceed immediately when a record is created? One solution could be a trigger that does something when a record is created. But that solution requires knowledge of the worker processes in order to deliver the record to one of the idle processes.
It seems the ideal solution would be for each process to start reading, via SQL, from something, and when a record is created, one of the waiting processes gets that record while the others continue to wait.
Does Oracle 10 provide such a mechanism, or something similar?
Look at Database Change Notification in 10g, which has since been renamed Continuous Query Notification.
I normally like to include an example, but it's hard to find a 10g instance these days, and even a short example requires a lot of code. The process looks complicated; you might be better off using triggers as you suggested and dealing with the tight coupling.

One big call vs. multiple smaller TSQL calls

I have an ADO.NET/TSQL performance question. We have two options in our application:
1) One big database call with multiple result sets, then in code step through each result set and populate my objects. This results in one round trip to the database.
2) Multiple small database calls.
There is much more code reuse with Option 2, which is an advantage of that option. But I would like to get some input on what the performance cost is. Are two small round trips twice as slow as one big round trip to the database, or is it just a small, say 10%, performance loss? We are using C# 3.5 and SQL Server 2008 with stored procedures and ADO.NET.
I would think it would depend in part on when you need the data. For instance, if you return ten datasets in one large process and show all ten on the screen at once, then go for it. But if you return ten datasets and the user may only click through the pages to see three of them, then sending the others was a waste of server and network resources. If you return ten datasets but the user really needs to see sets seven and eight only after making changes to sets five and six, then the user would see the wrong info if you returned it too soon.
If you use separate stored procs for each data set called in one master stored proc, there is no reason at all why you can't reuse the code elsewhere, so code reuse is not really an issue in my mind.
It sounds a wee bit obvious, but only send what you need in one call.
For example, we have a "getStuff" stored proc for presentation. The "updateStuff" proc calls the "getStuff" proc, and the client wrapper method for "updateStuff" expects type "Thing". So: one round trip.
A chatty interface is something you prevent up front with minimal effort. Then you can tune the DB or client code as needed... but it's hard to factor out the round trips later, no matter how fast your code runs. In the extreme, what if your web server is in a different country from your DB server...?
Edit: it's interesting to note the SQL guys (HLGEM, astander, me) saying "one trip" and the client guys saying "multiple, code reuse"...
I am struggling with this problem myself. And I don't have an answer yet, but I do have some thoughts.
Having reviewed the answers given by others to this point, there is still a third option.
In my application, around ten or twelve calls are made to the server to get the data I need. Some of the data fields are varchar(max) and varbinary(max) fields (pictures, large documents, videos and sound files). All of my calls are synchronous - i.e., while the data is being requested, the user (and the client-side program) has no choice but to wait. He may only want to read or view the data, which makes sense only when it is ALL there, not just partially there. The process, I believe, is slower this way, so I am developing an alternative approach based on asynchronous calls to the server from a DLL library which raises events to announce progress to the client. The client is programmed to handle the DLL events and set a variable on the client side indicating which calls have been completed. The client program can then do what it must to prepare the data received in call #1 while the DLL proceeds asynchronously to get the data for call #2. When the client is ready to process the data of call #2, it checks the status and waits to proceed if necessary (I am hoping this will be a short wait, or no wait at all). In this way, both the server- and client-side software get the job done more efficiently.
If you're that concerned with performance, try a test of both and see which performs better.
Personally, I prefer the second method. It makes life easier for the developers, makes code more re-usable, and modularizes things so changes down the road are easier.
I personally like option two for the reason you stated: code reuse
But consider this: for small requests, the round-trip latency might be longer than the time spent doing the actual work for the request. You have to find the right balance.
As the ADO.Net developer, your job is to make the code as correct, clear, and maintainable as possible. This means that you must separate your concerns.
It's the job of the SQL Server connection technology to make it fast.
If you implement a correct, clear, maintainable application that solves the business problems, and it turns out that database access is the major bottleneck preventing the system from operating within acceptable limits, then, and only then, should you start pursuing ways to fix the problem. This may or may not include consolidating database queries.
Don't optimize for performance until a need arises to do so. This means that you should analyze your anticipated usage patterns and determine what the typical frequency of use for this process will be, and what user interface latency will result from the present design. If the user receives feedback from the app in less than a few (2-3) seconds, and the load this process puts on the server is not inordinate, then don't worry about it. If, on the other hand, the user is waiting an unacceptable amount of time for a response (subjective but definitely measurable) or the server is being overloaded, then it's time to begin optimization. And then, which optimization techniques will make the most sense, or be the most cost-effective, depends on what your analysis of the issue tells you.
So, in the meantime, focus on maintainability. That means, in your case, code reuse.
Personally I would go with 1 larger round trip.
This will definitely be influenced by the exact reusability of the calling code, and how it might be refactored.
But as mentioned, this will depend on your exact situation, where maintainability vs performance could be a factor.

What are the approaches for writing a simple clock application?

I am writing a small program to display the current time on the iPhone (learning :D), and I ran into the following questions.
Is calling currentSystemTime (e.g. stringFromDate:) every second, parsing it, and printing the time on screen a good approach?
Would it be more effective to call that routine once and then manually update the parsed seconds on every tick of your timer (say ++seconds, with some if statements to adjust the minutes and hours)?
Will the second approach drift out of sync with the actual time if the processor load increases?
Considering all this, which would be the best approach?
I doubt that the overhead of querying the system time will be noticeable compared to the CPU cycles used to update the display. Set up an NSTimer to fire however often you want to update the clock display, and update your display that way. Don't worry about optimizing it until you get the app working.
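A minimal sketch of that approach using Timer (the Swift name for NSTimer); the view controller and label names are just placeholders:

import UIKit

class ClockViewController: UIViewController {
    @IBOutlet weak var timeLabel: UILabel!   // placeholder outlet

    private let formatter: DateFormatter = {
        let f = DateFormatter()
        f.dateFormat = "HH:mm:ss"
        return f
    }()

    override func viewDidLoad() {
        super.viewDidLoad()
        // Fire once a second and re-read the system clock each time,
        // so the display can never drift out of sync.
        Timer.scheduledTimer(withTimeInterval: 1.0, repeats: true) { [weak self] _ in
            guard let self = self else { return }
            self.timeLabel.text = self.formatter.string(from: Date())
        }
    }
}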
I would drop the seconds entirely and just print the rest of the time; then you only have to parse it once a minute.
That's if you want a clock rather than a stopwatch. (Seriously, I can't remember the last time I looked at a clock without seconds and thought, "Gosh, I don't know if it's 12:51:00 or 12:51:59. How will I make my next appointment?")
If you want to ensure you're relatively accurate in updating the minute, follow these steps:
1) Get the full time (HHMMSS).
2) Display down to minute resolution (HH:MM).
3) Subtract SS from 61 and sleep for that many seconds.
4) Go back to the first step.
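A rough sketch of those steps in Swift; the display closure is a placeholder for whatever actually updates the UI (hop to the main queue inside it if it touches UIKit):

import Foundation

// Sketch of the minute-resolution approach: show HH:mm, then sleep until
// just past the next minute boundary and repeat.
func runMinuteClock(display: @escaping (String) -> Void) {
    let formatter = DateFormatter()
    formatter.dateFormat = "HH:mm"

    DispatchQueue.global(qos: .utility).async {
        while true {
            let now = Date()
            display(formatter.string(from: now))   // display down to minute resolution

            // Subtract SS from 61 and sleep for that many seconds, landing
            // one second past the next minute boundary.
            let seconds = Calendar.current.component(.second, from: now)
            Thread.sleep(forTimeInterval: TimeInterval(61 - seconds))
        }
    }
}

In a real app you would probably schedule a one-shot Timer for that same delay instead of sleeping a thread, but the arithmetic is the same.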