iPhone - Getting a reliable timestamp from the app to be stored into an external MySQL database

I want to write a timestamp in real milliseconds (I mean not just seconds x 1000) into a MySQL database.
How can I get that timestamp in the app? I saw that some methods on NSDate could work, but they are based on the iPhone's date/time, which can be changed by the user, so using those methods would result in "fake" timestamps.

Any timestamp generated off the local clock will be subject to the same attack, so you'll have to find a reliable, trustworthy source of time information in its stead. An SSL-secured server would do. You'll also have to account for lag in contacting the server, which can be hundreds of milliseconds or more on an EDGE WWAN connection.
From there, you would write the data to the database the same way you would write any other data.
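If millisecond precision on the storage side is the concern, here is a minimal MySQL sketch (the table and column names are hypothetical) that keeps a TIMESTAMP(3) column and lets the database server, rather than the phone, supply the value:
-- Hypothetical table; stores event times with millisecond precision (MySQL 5.6.4+).
CREATE TABLE events (
    id         BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    event_time TIMESTAMP(3) NOT NULL
);

-- Let the (trusted) database server assign the timestamp instead of the device clock:
INSERT INTO events (event_time) VALUES (NOW(3));
If you prefer a raw integer, a BIGINT column holding UNIX_TIMESTAMP(NOW(3)) * 1000 works just as well.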

Related

Why do I need to set a timezone for my server, but not when using Unix timestamps?

I am a newbie to backend development and am creating a backend server that handles data from multiple countries.
Some countries have daylight saving time, which requires us to correct timezones twice a year for their local data; I think that would mean more work on the server.
I know a Unix timestamp is the same everywhere, so
is it easier for us to use Unix timestamps for all data from anywhere rather than UTC+0?

TimescaleDB vs. InfluxDB for PLC logging

I am 100% new to logging to a database at all, so there will probably be some stupid questions here; I hope that's OK.
I would like to log data from a Beckhoff PLC controller into a DB that sits on the same IPC as my PLC.
The Beckhoff PLC has a direct link function to both InfluxDB and to PostgreSQL, which TimescaleDB is based on, so the connection will work fine.
We would like to log data against time so we can go back and see when certain things happened, and also query the database based on time.
I have been talking to different people, and most of them recommend TimescaleDB, so it would be great to hear the differences between them and which you would recommend I choose.
The data size we will log is pretty small.
We will have a structure of data containing about 10 INT registers, so 20 bytes.
We will log to the database every 1 second on the quick machines, and sometimes only once every 20 minutes, but this part I will control in my PLC.
So putting data in the DB will, I believe, be pretty straightforward, but then I have some thoughts about what I would like to do and what is possible.
Is it possible to query the DB for the count, highest value, lowest value, and mean value over the last 60 minutes, or 24 hours, etc., and have the database return these values for the time frame I give it in my query?
The resolution I log with, which is controlled from the PLC, is only needed for 7 days; after that I would like to "downsample / compress" the data. Is that possible in both of these databases, and does one of them have an advantage? Maybe it's easier in one of them?
Is there, in one of these two databases, a possibility to not write to the HD / disk every time my PLC pushes data to it? Or will it write to disk every time automatically? I read about something called WAL; what is that, and will it use RAM to buffer the data before writing to disk less often?
Is there any big difference in setting up these two databases?
I probably have more questions, but these are the main functions I need in the system.
Many thanks
Is it possible to query the DB for the count, highest value, lowest value, and mean value over the last 60 minutes, or 24 hours, etc., and have the database return these values for the time frame I give it in my query?
Yes! You can do that with queries. Consider the following table structure:
CREATE TABLE conditions (
    time        TIMESTAMPTZ NOT NULL,
    device      INTEGER NOT NULL,
    temperature FLOAT NOT NULL
);
SELECT * FROM create_hypertable('conditions', 'time');
ALTER TABLE conditions SET (timescaledb.compress, timescaledb.compress_orderby='time');
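For example, a query along these lines (using the conditions table above) returns the count, lowest, highest, and mean values over the last 60 minutes; swap the interval for 24 hours or any other window:
-- Aggregates over the last 60 minutes.
SELECT count(*)         AS amount,
       min(temperature) AS lowest,
       max(temperature) AS highest,
       avg(temperature) AS mean
FROM conditions
WHERE time > now() - INTERVAL '60 minutes';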
The resolution I log with, which is controlled from the PLC, is only needed for 7 days; after that I would like to "downsample / compress" the data. Is that possible in both of these databases, and does one of them have an advantage? Maybe it's easier in one of them?
You can create a continuous aggregate, which is a fast way to keep your summarized data materialized.
CREATE MATERIALIZED VIEW conditions_hourly(time, device, low, high, average)
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS time,
       device,
       min(temperature) AS low,
       max(temperature) AS high,
       avg(temperature) AS average
FROM conditions
GROUP BY 1, 2;
And then you can add a retention policy to keep only the last 7 days of raw data.
SELECT add_retention_policy('conditions', INTERVAL '7 day');
And add a continuous aggregate policy that will keep your view up to date, refreshing every hour:
SELECT add_continuous_aggregate_policy('conditions_hourly',
       start_offset => INTERVAL '1 day',
       end_offset => INTERVAL '1 hour',
       schedule_interval => INTERVAL '1 hour');
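Querying the hourly rollup is then an ordinary SELECT, for example the last 24 hours:
SELECT time, device, low, high, average
FROM conditions_hourly
WHERE time > now() - INTERVAL '24 hours'
ORDER BY time;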
Is there, in one of these two databases, a possibility to not write to the HD / disk every time my PLC pushes data to it? Or will it write to disk every time automatically? I read about something called WAL; what is that, and will it use RAM to buffer the data before writing to disk less often?
In PostgreSQL you can use asynchronous commits: https://www.postgresql.org/docs/current/wal-async-commit.html
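As a minimal sketch, turning synchronous commit off lets PostgreSQL acknowledge a commit before its WAL record is flushed to disk, so the RAM-buffered WAL is written out in batches (a crash can lose the last few hundred milliseconds of commits, but the database stays consistent):
-- Per session:
SET synchronous_commit = off;

-- Or cluster-wide:
ALTER SYSTEM SET synchronous_commit = 'off';
SELECT pg_reload_conf();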

I would like to cache new data in Redis before inserting it directly into Postgres

I am dealing with a great number of inserts per second into a Postgres DB (and a lot of reads too).
A few days ago I heard about Redis and started to think about sending all these INSERTs to Redis first, to avoid lots of open/insert/close cycles in Postgres every second.
Then, after some short period, I could group that data from Redis into one INSERT SQL statement and run it in Postgres with only one connection open.
The system stores GPS data, and an online map reads it in real time.
Any suggestions for that scenario? Thanks!
I do not know how important it is in your case to have the data available to your users in near real time. But from what is listed above, I do not see anything that cannot be solved by configuration/replication in PostgreSQL.
You have a lot of writes to your database; before going for a different technology, remember that PostgreSQL is battle-tested, and I am sure you can get more out of it by configuring it to handle more writes. link
You have a lot of reads from your database; a master-slave replication setup lets all your read traffic be targeted at the slaves, and you can scale reads horizontally as much as you need.
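If you do end up batching writes, a single multi-row INSERT already removes most of the per-statement overhead without bringing Redis in; the gps_points table and its columns below are made up for illustration:
-- One round trip instead of three separate statements.
INSERT INTO gps_points (device_id, recorded_at, latitude, longitude)
VALUES
    (42, '2024-01-01 12:00:00+00', 59.3293, 18.0686),
    (42, '2024-01-01 12:00:01+00', 59.3294, 18.0687),
    (43, '2024-01-01 12:00:00+00', 55.6761, 12.5683);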

Calculating price drops or apps going free - App Store

I am working on a website which displays all the apps from the App Store. I am getting App Store data from their EPF Data Feeds through the EPF Importer. In that database I get the pricing of each app for every store. There are dozens of rows in that set of data, whose table structure is like this:
application_price
The retail price of an application.
Name            Key   Description
export_date           The date this application was exported, in milliseconds since the UNIX Epoch.
application_id  Y     Foreign key to the application table.
retail_price          Retail price of the application, or null if the application is not available.
currency_code         The ISO3A currency code.
storefront_id   Y     Foreign key to the storefront table.
This is the table I get. My problem is that I cannot find a way to calculate the price reduction of apps, and the newly free apps, from this particular dataset. Does anyone have an idea how I can calculate it?
Any idea or answer will be highly appreciated.
I tried to store the previous data and the current data and then match them. The problem is that the table itself is too large, and the comparison requires a JOIN that pushes query execution time past an hour, which I cannot afford. There are approximately 60,000,000 rows in the table.
With these fields you can't directly determine price drops or new applications. You'll have to insert these into your own database and determine the differences from there. In a relational database like MySQL this isn't too complex:
To determine which applications are new, you can add your own column "first_seen", and then query your database for all rows whose first_seen value is no more than a day old.
To calculate price drops you'll have to calculate the difference between the retail_price of the current import and the previous import, as sketched below.
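A minimal sketch of that comparison, assuming you keep each day's import in a prices table with an import_date column (these names are hypothetical, not part of the EPF schema):
-- Price drops: compare today's import with yesterday's.
SELECT cur.application_id,
       prev.retail_price AS old_price,
       cur.retail_price  AS new_price
FROM prices cur
JOIN prices prev
  ON prev.application_id = cur.application_id
 AND prev.storefront_id  = cur.storefront_id
 AND prev.import_date    = cur.import_date - INTERVAL 1 DAY
WHERE cur.import_date = CURRENT_DATE
  AND cur.retail_price < prev.retail_price;
A composite index on (import_date, application_id, storefront_id) keeps the join workable at this row count.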
Since you've edited your question, my edited answer:
It seems like you're having storage/performance issues, and you know what you want to achieve. To solve this you'll have to start measuring and debugging: with datasets this large you'll have to make sure you have the correct indexes. Profiling your queries should help you find out whether they do.
And probably your environment is "write once a day" and "read many times a minute" (I'm guessing you're creating a website). So you could speed up the frontend by processing the differences (price drops and new applications) on import, rather than when displaying them on the website.
If you still are unable to solve this, I suggest you open a more specific question, detailing your DBMS, queries, etc, so the real database administrators will be able to help you. 60 million rows are a lot, but with the correct indexes it should be no real trouble for a normal database system.
Compare the table with one you've downloaded the previous day, and note the differences.
Added:
For only 60 million items, and on a contemporary PC, you should be able to store a sorted array of the store id numbers and previous prices in memory, and do an array lookup faster than the data is arriving from the network feed. Mark any differences found and double-check them against the DB in post-processing.
Actually, I am also playing with this data, and I think the best approach for you is based on how Apple delivers it.
You have two types of data: full and incremental (updated daily). Within the new data from the incremental feed (not nearly as big as the full feed) you can pick out only the records that were updated and insert them into another table to record that the pricing has changed; see the sketch below.
That gives you a list of records (app, song, video...) updated daily whose price has changed; just read from the new table you created instead of comparing or joining across various large tables.
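A rough sketch of that idea in MySQL, with a hypothetical price_changes table fed only from the (much smaller) incremental import:
-- Record only rows whose price differs from what we currently have.
INSERT INTO price_changes (application_id, storefront_id, old_price, new_price, changed_on)
SELECT inc.application_id,
       inc.storefront_id,
       cur.retail_price,
       inc.retail_price,
       CURRENT_DATE
FROM incremental_import inc
JOIN application_price cur
  ON cur.application_id = inc.application_id
 AND cur.storefront_id  = inc.storefront_id
WHERE NOT (inc.retail_price <=> cur.retail_price);   -- <=> is MySQL's NULL-safe equality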
Cheers

Raw Data or Pre-Calculated Values in Database?

In general, is it better to store raw data with pre-calculated values in the database and concentrate on keeping the database up-to-date if I remove or delete a row while using the pre-calculated values for display to the user
OR
is it better to store the raw data and calculate the correct display values on-the-fly?
An example (which is pertinent to my project) would be similar to the following:
You have a timer application. In my case it's using Core Data. It's not connected to the web, but a self-contained app that runs on a computer or mobile device (user's choice). The app stores a raw start time and a raw end time. The application needs to display the duration of the event and the interval at which the events are occurring. Would it be better to store a pre-calculated "duration" time, and even a pre-formatted duration string to be used for output, or would it be better to calculate the duration on the fly, so to speak, for display?
The same goes for the interval, although there's another layer involved, because when I create/delete/update a row in the database, I'll have to update the interval for the items that are affected by the change. Or is it better to just calculate as the app executes?
For the record, I'm not trying to micro-optimize. I'm trying to figure out the best way to reduce the amount of code I have to maintain. If performance improves as a result, so be it.
Thoughts?
Generally, you would want to avoid computed values in the DB (derived from existing columns/tables), unless profiling absolutely dictates that they are necessary (i.e., the DB is underperforming or too great a load is being placed on the server). This is even more true for formatting of the data, which should almost always be performed on the client side instead of wasting DB server cycles.
Of course, any data that is absolutely mandatory to perform the calculations should be stored in the database.
When you speak of reducing the amount of code you need to maintain, keep in mind that the DBA needs to maintain stored-proc code and table schemas, too. Moving maintenance responsibilities from Developers to DBAs is not eliminating work, it is just shifting it.
Finally, database changes often cascade to many applications, whereas application changes only affect that application.
The only time I store calculated values in a database is if I need it for historical purposes. You'll see this all the time in accounting software.
For example if I'm dealing with an invoice, I will typically save the calculated invoice total because perhaps the way that total will get calculated later on will change.
I will also sometimes perform the actual calculation on the database server using views.
As with so many other things, "it depends". For your described case, I would lean towards keeping the calculation in code. If you do choose to use the database, you should use a view to dynamically calculate rather than put in a static value. The risk of changing the start time or end time and forgetting to change the duration would be too high otherwise :)
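For instance, a simple SQL view (the events table here is made up for illustration; the question's app actually uses Core Data) computes the duration on every read, so it can never drift out of sync with the stored start and end times:
-- Duration is derived, never stored, so it always matches start_time/end_time.
CREATE VIEW event_durations AS
SELECT id,
       start_time,
       end_time,
       end_time - start_time AS duration
FROM events;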
This really depends on whether you want to be pure (keep your data clean) or fast. Compute capacity on the desktop facilitates purity; high-speed cores and large memory spaces make string composition for table cells feasible even with large data sets.
However on the phone, an iPhone 4 even, computing a single NSString for a UITableViewCell over a set of 1000 objects takes a noticeable amount of time, and this can affect your user experience.
So, tune the balance for your use case. The duration doesn't sound like it will change, so I would precalculate and store the duration AND the display string (it feels awful from the perspective of a DBA, but it will render fast on the phone).
For the interval it sounds like you actually need another entity, to relate the interval to a set of events. It would then be easy enough to pre-compute / maintain this calculation as well each time the relationship changes (i.e. you add an entity to the relationship, update the interval).