Let's say that I have GPS coordinates for 1000 stores and a short text for each one. What would be the best way to store this information? SQL? One more thing to consider is how to load the information into the app. It doesn't seem smart to load everything at once; the best way seems to be to load only the stores in a specific area, but how do I search for those stores? Is that easy to do in SQL? As you can see, I don't have much experience with database programming.
Storing the data in an SQLite3 file would be best for the moment: you have a lot of data, and with a db you can fetch just the required records back on demand (via a query).
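For example, the schema can stay very simple. Something like this (table and column names are only placeholders) would do, with indexes on lat/lon so area queries stay fast:

    // Illustrative SQLite schema for the stores; all names are placeholders.
    let createStoresSchema = """
    CREATE TABLE IF NOT EXISTS stores (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        info TEXT,               -- the short text for each store
        lat  REAL NOT NULL,
        lon  REAL NOT NULL
    );
    CREATE INDEX IF NOT EXISTS idx_stores_lat ON stores(lat);
    CREATE INDEX IF NOT EXISTS idx_stores_lon ON stores(lon);
    """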
As for getting the stores in a specific area - assuming you want to locate nearby stores within a 10 km radius -
this will not be hard to fetch from the db.
I am no server expert, but I can give you a guideline that will work:
You will have the current lat/long, and you will also have a lat/long stored in the table for each row.
A server-side person could help you build a query where a distance-calculation formula sits in the WHERE clause's condition, so you only get back the records for nearby (within 10 km) places.
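To make that concrete, here is a rough sketch of one practical variation on the same idea, assuming the illustrative stores table above: prefilter with a cheap bounding box in the WHERE clause (which the lat/lon indexes can use), then do the exact radius check in code with CLLocation. All names and numbers here are for illustration only.

    import Foundation
    import CoreLocation

    // Rows read back from the illustrative `stores` table.
    struct StoreRow {
        let id: Int64
        let name: String
        let info: String
        let lat: Double
        let lon: Double
    }

    // Builds a cheap bounding-box query; plain comparisons like these can use the lat/lon indexes.
    func boundingBoxSQL(center: CLLocationCoordinate2D, radiusKm: Double) -> String {
        let latDelta = radiusKm / 111.0                                        // ~111 km per degree of latitude
        let lonDelta = radiusKm / (111.0 * cos(center.latitude * .pi / 180.0)) // degrees of longitude shrink with latitude
        return """
        SELECT id, name, info, lat, lon FROM stores
        WHERE lat BETWEEN \(center.latitude - latDelta) AND \(center.latitude + latDelta)
          AND lon BETWEEN \(center.longitude - lonDelta) AND \(center.longitude + lonDelta);
        """
    }

    // Exact radius check on the (already small) candidate set.
    func storesWithin(_ radiusKm: Double, of center: CLLocationCoordinate2D, candidates: [StoreRow]) -> [StoreRow] {
        let here = CLLocation(latitude: center.latitude, longitude: center.longitude)
        return candidates.filter { row in
            CLLocation(latitude: row.lat, longitude: row.lon).distance(from: here) <= radiusKm * 1000.0
        }
    }

If your SQLite build has the math functions enabled, you could instead put a full haversine expression directly in the WHERE clause, but the bounding-box-plus-CLLocation split avoids depending on that.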
I have 21 million rows (lines in CSV files) that I want to import into MongoDB to report on.
The data comes from a process on each PC within our organisation, which creates a row every 15 minutes showing who is logged on.
Columns are: date/time, PC Name, UserName, Idle time (if user logged on)
I need to be able to report from a PC POV (PC usage metrics) and a User POV (user dwell time and activity/movement).
Initially I just loaded the data using mongoimport. But this raw data structure is not easy to report on. This could simply be my lack of knowledge of MongoDB.
I have been reading http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb which is a great article on schema design for time series data in mongodb.
This makes sense for reporting on PC usage - as I could pre-process the data and load it into Mongo as one document per PC/date combination, with an array of hourly buckets.
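For concreteness, I imagine one pre-aggregated document per PC per day looking roughly like this (sketched here as Swift value types purely to show the shape; the field names are just placeholders):

    // Purely illustrative shape of one pre-aggregated document per PC per day.
    struct HourBucket: Codable {
        let hour: Int            // 0-23
        let samples: Int         // number of 15-minute rows that fell in this hour
        let idleMinutes: Int     // summed idle time across those rows
        let users: [String]      // user names seen during this hour
    }

    struct PCDayDocument: Codable {
        let pcName: String
        let date: String         // e.g. "2014-01-31"
        let hours: [HourBucket]  // up to 24 buckets
    }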
However I suspect this would make reporting from the user POV difficult.
I'm now thinking of creating two collections - one for PC data and another for user data (one document per user/date combination, etc.).
I would like to know if I'm on the right track - or if anyone could suggest a better solution, or if indeed the original raw data would suffice and I just need to know how to query it from both angles (some kind of map-reduce).
Thanks
Tim
I am trying to implement a quick search as-you-type mechanism.
In my current implementation, when the user launches the app for the first time, he has to wait a little bit for a downloading process to complete. During that time, information about the 20,000 products that the app sells is being downloaded. Each product is represented by an instance of NSManagedObject and is added to a Core Data database.
The real problem is how to use those products. After the user launches the app again (not for the first time), the products need to be loaded into memory so the search will be quick.
To do that, I loop over the entire database and create an NSDictionary instance containing each product's information, because it is much easier for my program to retrieve information about a product from dictionary objects.
Because the dictionaries are stored in memory, the search process is very quick, but iterating over the 20,000 objects (once per launch) and creating the dictionaries takes a lot of time (about a minute), so that solution is not good.
I thought about another way to reach the quick-search goal: fetching objects from the database after each letter is typed. But I do not know how fast that would be.
What is the recommended way to do that?
Thanks,
Sagiftw
I have a similar feature in my app, but with considerably fewer records. I have indices on all search fields and build SQL queries (via NSPredicate) that are as simple (inexpensive) as possible from the input, using a second fetchedResultsController just for searching. The result set contains the 'search items'. This is at least fast enough for around 1000 entries (my test data size) with a random distribution of text-type search keys. It's possibly a good idea to fetch in the background to prevent the GUI from becoming unresponsive.
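A rough sketch of what such a per-keystroke fetch can look like, assuming a Core Data entity named "Product" with an indexed string attribute "name" (both names are placeholders, not from the question):

    import CoreData

    // Fetch a small, sorted batch of products whose name starts with the typed text.
    func searchProducts(matching text: String,
                        in context: NSManagedObjectContext,
                        completion: @escaping ([NSManagedObject]) -> Void) {
        context.perform {                                   // keeps the work off the main queue if `context` is a background context
            let request = NSFetchRequest<NSManagedObject>(entityName: "Product")
            request.predicate = NSPredicate(format: "name BEGINSWITH[cd] %@", text)
            request.sortDescriptors = [NSSortDescriptor(key: "name", ascending: true)]
            request.fetchLimit = 50                         // only as many rows as the UI will actually show
            let results = (try? context.fetch(request)) ?? []
            completion(results)
        }
    }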
I am working on a website which displays all the apps from the App Store. I am getting the App Store data from their EPF Data Feeds through the EPF Importer. In that database I get the pricing of each app for every storefront. There are dozens of rows in that set of data; the table structure looks like this:
application_price
The retail price of an application.
Name            Key   Description
export_date           The date this application was exported, in milliseconds since the UNIX Epoch.
application_id  Y     Foreign key to the application table.
retail_price          Retail price of the application, or null if the application is not available.
currency_code         The ISO3A currency code.
storefront_id   Y     Foreign key to the storefront table.
This is the table I get. My problem is that I cannot find a way to calculate the price reductions of apps, and the newly free apps, from this particular dataset. Does anyone have an idea how I can calculate this?
Any idea or answer will be highly appreciated.
I tried storing the previous data and the current data and then matching them. The problem is that the table itself is too large, and the comparison requires a JOIN, which pushes the query execution time to more than an hour, which I cannot afford. There are approximately 60,000,000 rows in the table.
With these fields you can't directly determine price drops or new applications. You'll have to insert the data into your own database and determine the differences from there. In a relational database like MySQL this isn't too complex:
To determine which applications are new, you can add your own column "first_seen", and then query your database for all rows whose first_seen value is no more than a day old.
To calculate price drops you'll have to calculate the difference between the retail_price of the current import, and the previous import.
Since you've edited your question, my edited answer:
It seems like you're having storage/performance issues, and you know what you want to achieve. To solve this you'll have to start measuring and debugging: with datasets this large you'll have to make sure you have the correct indexes. Profiling your queries should help you find out whether you do.
Your environment is probably "write once a day" and "read many times a minute" (I'm guessing you're building a website). So you could speed up the frontend by processing the differences (price drops and new applications) on import, rather than when displaying them on the website.
If you are still unable to solve this, I suggest you open a more specific question detailing your DBMS, queries, etc., so the real database administrators can help you. 60 million rows is a lot, but with the correct indexes it should be no real trouble for a normal database system.
Compare the table with one you've downloaded the previous day, and note the differences.
Added:
For only 60 million items, and on a contemporary PC, you should be able to store a sorted array of the store id numbers and previous prices in memory, and do an array lookup faster than the data is arriving from the network feed. Mark any differences found and double-check them against the DB in post-processing.
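A minimal sketch of that comparison, using a Dictionary keyed by (storefront, application) in place of the sorted array; the type and field names are illustrative, not the actual EPF schema:

    import Foundation

    struct PriceKey: Hashable {
        let storefrontID: Int
        let applicationID: Int
    }

    struct PriceDrop {
        let key: PriceKey
        let oldPrice: Decimal
        let newPrice: Decimal
    }

    // Compare yesterday's prices (already in memory) with today's feed.
    // A nil price means the app is currently unavailable, mirroring the null retail_price.
    func diff(previous: [PriceKey: Decimal?],
              current: [PriceKey: Decimal?]) -> (newApps: [PriceKey], drops: [PriceDrop]) {
        var newApps: [PriceKey] = []
        var drops: [PriceDrop] = []
        for (key, newPrice) in current {
            guard let previousPrice = previous[key] else {   // key never seen before -> new application
                newApps.append(key)
                continue
            }
            if let old = previousPrice, let new = newPrice, new < old {
                drops.append(PriceDrop(key: key, oldPrice: old, newPrice: new))  // price went down
            }
        }
        return (newApps, drops)
    }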
Actually, I am also playing with this data, and I think the best approach for you builds on the data Apple provides.
You have 2 types of data: full and incremental (updated daily). Within the new incremental data (which is not nearly as big as the full export) you can look at only the records that were updated, and insert them into another table to determine whose pricing has changed.
That gives you a list of records (app, song, video, ...) updated daily whose price has changed; you just read from the new table you created instead of comparing or joining across the various large tables.
Cheers
I will have an XML file on a server. It will store information about 600 stores; the information includes name, address, opening time, and coordinates. So is it OK to parse the whole file on the iPhone and then select the nearest stores according to the coordinates?
I am concerned about processing time and memory use.
Please advise.
The way I would do this is to write a web service, pass it the coordinates, and download only the stores within a certain radius. Always try to download as little data as possible to the iPhone (especially XML data).
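A minimal sketch of the client side, with an invented endpoint and parameter names (the real web service would define its own):

    import Foundation
    import CoreLocation

    // Ask the (hypothetical) service for only the stores inside the given radius.
    func fetchNearbyStores(around coordinate: CLLocationCoordinate2D,
                           radiusKm: Double,
                           completion: @escaping (Data?) -> Void) {
        var components = URLComponents(string: "https://example.com/api/stores")!
        components.queryItems = [
            URLQueryItem(name: "lat", value: String(coordinate.latitude)),
            URLQueryItem(name: "lon", value: String(coordinate.longitude)),
            URLQueryItem(name: "radius_km", value: String(radiusKm)),
        ]
        URLSession.shared.dataTask(with: components.url!) { data, _, _ in
            completion(data)   // a small JSON/XML payload describing only the nearby stores
        }.resume()
    }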
I just put this here
http://quatermain.tumblr.com/post/93651539/aqxmlparser-big-memory-win
A simple solution would be to group them into clusters that are somehow related, probably by location. You already have an XML file on a server, so simply split it up into 3 groups of around 200 related stores each, or preferably even smaller. I'm not entirely sure why you would want to store 600 data points of that nature on the device. I feel that if you filter/shrink on the server side you could save a lot of time/memory.
I have seen people store 300-400 data points, though it depends so much on how large the objects in your Core Data store are that it is probably best for you to just run some tests.
I have populated a Core Data database and need to query it based on my user's location. We use similar code on the back end of a web service, as a UDF that returns the distance as a column, but we now have a requirement to cache some of this data for offline use.
I know CLLocation has a method for getting the distance to another location (distanceFromLocation:), but is this going to be efficient when iterating over a few thousand rows of data?
A few thousand rows isn't a huge amount of data. Make sure you design your Core Data schema so that it can load the objects with the location information without also loading tons of other data (put any large blocks of data into their own objects and let Core Data load them lazily).
When you start getting beyond a few thousand, though, you might need to go to a different local data structure, such as some library that implements an R-Tree or range searches.
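For the distance check itself, a rough sketch of fetching the cached rows and sorting them by distance with CLLocation might look like this (the entity and attribute names are placeholders):

    import CoreData
    import CoreLocation

    // Load the lightweight location rows and return the nearest ones.
    func nearestPlaces(to user: CLLocation,
                       in context: NSManagedObjectContext,
                       limit: Int = 20) throws -> [NSManagedObject] {
        let request = NSFetchRequest<NSManagedObject>(entityName: "Place")
        request.returnsObjectsAsFaults = false               // the location attributes are small, so load them up front
        let places = try context.fetch(request)
        return places
            .map { place -> (NSManagedObject, CLLocationDistance) in
                let lat = place.value(forKey: "latitude") as? Double ?? 0
                let lon = place.value(forKey: "longitude") as? Double ?? 0
                return (place, CLLocation(latitude: lat, longitude: lon).distance(from: user))
            }
            .sorted { $0.1 < $1.1 }                          // nearest first
            .prefix(limit)
            .map { $0.0 }
    }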