I'm creating a CMS and have not yet settled on where to store the complete URL for a given page in the structure.
Every page has a slug (a URL-friendly name for the page), and every page has children and a parent, which is nullable for top-level pages.
Where do I store the complete url (/first-page/sub-page) for a given page? Should this go in the database along with the other properties of the page or in some cache?
Update
It's not the database design I'm asking about, rather where to store the complete url to a given page so I don't need to traverse the entire url to get the page that the user requested (/first-page/sub-page)
Update 2
I need to find which page belongs to the currently requested URL. If the requested URL is /first-page/sub-page, I don't want to split the URL and loop through the database (obviously).
I'd rather have the entire url in the table so that I can just do a single query (WHERE url = '/first-page/sub-page') but this does not seem ideal, what if I change the slug for the parent page? Then I also need to update the url-field for all descendants.
How do other people solve this issue? Are they putting it in the database? In a cache that maps /first-page/sub-page to the id of the page? Or are they splitting the requested URL and looping through the database?
Thanks
Anders
Store it in a cache, because the web servers will need to be looking up URLs constantly. Unless you expect the URLs of pages to change very rapidly, caching will greatly reduce load on the database, which is usually your bottleneck in database driven web sites.
Basically, you want a dictionary that maps URL -> whatever you need to render the page. Many web servers will automatically use the operating system's file system as the dictionary and will often have a built-in cache that can recognize when a file changes in the file system. This would probably be much more efficient than anything you can write in your CMS. It might be better, therefore, to have your CMS implement the structure directly in the file system and handle additional mapping with hard or soft links.
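As a rough illustration of the "dictionary that maps URL -> page" idea, here is a minimal sketch in Python. The PAGES data, the function names, and the choice of an in-process dict as the cache are all assumptions made for the example; a real CMS would load the rows from its own table and might use an external cache instead.

```python
# Hypothetical page rows: id -> (slug, parent_id).
# parent_id is None for top-level pages.
PAGES = {
    1: ("first-page", None),
    2: ("sub-page", 1),
    3: ("about", None),
}


def full_url(page_id, pages):
    """Walk up the parent chain and join the slugs into a full URL."""
    parts = []
    while page_id is not None:
        slug, parent_id = pages[page_id]
        parts.append(slug)
        page_id = parent_id
    return "/" + "/".join(reversed(parts))


def build_url_cache(pages):
    """Build the URL -> page id dictionary, e.g. at startup or after an edit."""
    return {full_url(page_id, pages): page_id for page_id in pages}


url_cache = build_url_cache(PAGES)

# Resolving a request is then a single dictionary lookup.
requested_page_id = url_cache.get("/first-page/sub-page")
```

When a slug changes, the simplest policy under these assumptions is to rebuild the whole dictionary, since the database still holds only slugs and parent ids.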
I just did this for MvcCms. I went with the idea of content categories/subcategories and content pages. When a content category or subcategory is created, I walk recursively up through its parents, build the entire route, and store it in the category table. Then, when a page is requested, I can find the correct content page from that stored route, and while building a nav structure I can tell whether the nav item being built is the current (active) route.
This approach requires some rules about what happens when a category is edited. The approach right now is that once the full path is set for a subcategory, it can't be changed later with the normal tools.
The source is at mvccms.codeplex.com
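For comparison, here is a sketch of how a stored full path could be kept in sync if slug edits were allowed. This is not MvcCms code: the page table, its columns, and the rename_slug helper are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page ("
    "id INTEGER PRIMARY KEY, slug TEXT, parent_id INTEGER, url TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO page (id, slug, parent_id, url) VALUES (?, ?, ?, ?)",
    [
        (1, "first-page", None, "/first-page"),
        (2, "sub-page", 1, "/first-page/sub-page"),
        (3, "first-page-2", None, "/first-page-2"),
    ],
)


def rename_slug(conn, page_id, new_slug):
    """Change a page's slug and rewrite the stored url of all descendants."""
    (old_url,) = conn.execute(
        "SELECT url FROM page WHERE id = ?", (page_id,)
    ).fetchone()
    new_url = old_url.rsplit("/", 1)[0] + "/" + new_slug
    prefix_len = len(old_url) + 1  # length of old_url plus the trailing "/"
    with conn:  # run both updates in one transaction
        conn.execute(
            "UPDATE page SET slug = ?, url = ? WHERE id = ?",
            (new_slug, new_url, page_id),
        )
        # Descendants are exactly the rows whose url starts with old_url + "/".
        conn.execute(
            "UPDATE page SET url = ? || substr(url, ?) "
            "WHERE substr(url, 1, ?) = ?",
            (new_url, prefix_len, prefix_len, old_url + "/"),
        )


rename_slug(conn, 1, "renamed")
urls = dict(conn.execute("SELECT id, url FROM page"))
found = conn.execute(
    "SELECT id FROM page WHERE url = ?", ("/renamed/sub-page",)
).fetchone()
```

The lookup stays a single indexed query on url, and the cost of a rename is one extra UPDATE that touches only the renamed page's subtree.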
Is there e.g. a crawler that can find (and list the form action etc.) all pages that have forms in my site?
I'd like to log all pages with unique actions to then audit further.
Norconex HTTP Collector is an open source web crawler that can certainly help you. Its "Importer" module has a "TextBetweenTagger" feature to extract text between any start and end text and store it in a metadata field of your choice. You can then filter out those that have no such text extracted (look at the EmptyMetadataFilter option for this).
You can do this without writing code. As far as storing the results, the product uses "Committers". A few committers are readily available (including a filesystem one), but you may want to write your own to "commit" your crawled data wherever you like (e.g. in a database).
Check its configuration page for ideas.
I am doing some updates to a site I have developed over the last few years. It has grown rather erratically (I tried to plan ahead, but with this site it has taken some odd turns).
Anyway, the site has a community blog (blog.domain.com, which used to be domainblog.com) and users with personal areas (user1.domain.com, user2.domain.com, etc.).
The personal areas have standard page content that the user can use, or add snippets of text to partially customize. Now the owner wants the users to be able to create their own content.
Everything is done except for the file browser.
I need a browser that will allow me to do the following:
the browser needs to be able to browse the common files at blog.domain.com/files and the user files at user_x.domain.com/files
the browser will also need to be able to differentiate between the two and generate the appropriate image url.
of course, the browser access to the user files will need to be dynamic and only show those files particular to the user (along with the common files)
I also need to be able to set a file size for images
the admin area is in a different directory than either the blog or the user subdomains.
general directory structure
--webdir--
|--client --
|--clientsite--
|--blog (blog.domain.com)
|--sites--
|--main site (domain.com)
|--admin (admin.domain.com)
|--users--
|--user1 (user1.domain.com)
|--user2 (user2.domain.com)
...etc.
I have tried several different browsers and using symlinks but the browsers don't seem to be able to follow them. I am also having trouble even setting them to use a directory that isn't the default.
What browser would you recommend? What would I need to customize to make it work?
TIA
OK, since I have not had any responses to this question, I guess I will have to do a workaround and then see about writing a custom file browser down the road.
I'm working on an iPhone app which has a news feed. This news feed is pulled from a JSON web service I've written (currently living on MAMP on my laptop).
Anyway, I use a MySQL DB to store references to my images, which are stored in the apache filesystem.
I store them in a very particular way, and this is how I store them:
Full Images: ng_(postid)_(seqid)
Thumbs: tng_(postid)_(seqid)
PostID is the unique ID that is assigned to every news post.
SeqID is an ID that is only unique for the photos for that news post.
I probably didn't make that very clear... example:
The images files in the first post might look like this
ng_1_1.jpg
ng_1_2.png
ng_1_3.jpg
The image files for the second post might look like this
ng_2_1.jpg
ng_2_2.png
ng_2_3.gif
This has worked great up till now, but I tried to see what would happen if I deleted a post and recreated one in its place.
Let's say we have a post called 'Old Post', which has 2 images, with a postid of 7.
Its images might look like this:
ng_7_1.jpg
ng_7_2.jpg
Let's say we deleted that post, and then created a new one afterwards, which has three images and is called 'New Post'.
Its images will look like this:
ng_7_1.jpg
ng_7_2.jpg
ng_7_3.jpg
Now, here comes the problem... If the device has viewed the old post, which was deleted, and then views this new post, they will see the first two images as the ones from OLD POST. Not the new ones.
Why? SDWebImage sees that the URL is identical and therefore pulls the cached image from disk. It doesn't display the cached version and then check whether the image has been updated.
So, I've worked out there are two possible solutions to this:
Somehow get SDWebImage to check the online image, after displaying the cached version
Pass down a key in my JSON, to tell my app to wipe SDWebImage's cache (when necessary)
So, my question is, how would you go about deleting SDWebImage's cache, or making it check the server after displaying the cached version?
I think your PostID values are not unique in your system and that causes you problems. If you had unique PostID values it would be impossible to delete a post with given ID and assign that ID value to a new post...
PostID shouldn't be reusable in my opinion - can you imagine a clerk deleting a specific order in his system, creating a new one with old ID and one day getting a call from customer that provides his order ID which is now overwritten in the system?
Another thing is that you should never ever delete cached images on the client's side: be user-friendly, save bandwidth and users' data plans (check this link for why that matters). However, you can specify cacheMaxCacheAge for SDWebImage to get rid of old, unused images. You can also remove specific images using removeImageForKey:, for example when the user decides to delete a specific post on his device.
Finally, the case you're describing relates more to updating a specific post, so that a post can get a different image set, for example. In that scenario the simplest thing you can do is to use unique image IDs, so when a post is downloaded, new images will be downloaded (old ones will be deleted when the cache reaches its max age; look for cacheMaxCacheAge). Alternatively, you could introduce a kind of synchronization mechanism in your DB/JSON (e.g. based on timestamps: if a post is downloaded and has a newer timestamp than the post stored in the application cache, you remove old resources and download new images, text, etc. If the timestamps are equal, you're good with the data you already downloaded).
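The timestamp comparison described above could look roughly like the following. It is written in Python purely to show the logic; the updated_at and images field names are assumptions about the JSON, not something the original web service is known to provide, and on the device the returned keys would be handed to something like removeImageForKey:.

```python
def images_to_evict(cached_posts, server_posts):
    """Return cached image names that belong to posts changed on the server.

    Both arguments map post id -> {"updated_at": int, "images": [str, ...]},
    where updated_at is a Unix timestamp (hypothetical field names).
    """
    stale = []
    for post_id, server_post in server_posts.items():
        cached = cached_posts.get(post_id)
        # Only posts that are already cached and older than the server copy
        # need their images dropped and downloaded again.
        if cached is not None and server_post["updated_at"] > cached["updated_at"]:
            stale.extend(cached["images"])
    return stale


cached_posts = {
    7: {"updated_at": 100, "images": ["ng_7_1.jpg", "ng_7_2.jpg"]},
    8: {"updated_at": 150, "images": ["ng_8_1.jpg"]},
}
server_posts = {
    7: {"updated_at": 200, "images": ["ng_7_1.jpg", "ng_7_2.jpg", "ng_7_3.jpg"]},
    8: {"updated_at": 150, "images": ["ng_8_1.jpg"]},
}

stale_images = images_to_evict(cached_posts, server_posts)
```

Here only the two cached images of post 7 are reported as stale, because post 8 has the same timestamp on both sides.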
An advanced solution would use RestKit and Core Data which would enable users to browse your posts offline and update content (images, text) when your web resource (JSON) changes.
What an epistle... I just hope my comments are useful for you :)
I want to automatically fetch data (the gold price) from a website and update a variable. Do I have to load the whole .html file into a string and find the price? Is there any other way? Even if I update the variable, how do I save it so that it retains its updated value (the price)?
Do I have to load the whole .html file in a string and find the price?
Yes
Is there any other way?
Only if the web site also provides an API that gives you access to just the data you need.
Even if I updated the variable, how do I save it, so it retains its updated value (price)?
A variable will keep its value until you change it. However, if you want to preserve it even when the user quits your app, so that it starts again from the same value, you could save it in NSUserDefaults, for example.
Do be aware, however, that the data is almost certainly copyrighted. You can't just scrape data from a website and publish an app based on that data without considering the legal perspective. Price data is normally owned by the exchange, and you will need a license to republish it.
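The load, find, and save steps from this answer can be sketched as follows. This is an illustration in Python rather than iOS code: the sample HTML, the gold-price element id, and the JSON file (standing in for NSUserDefaults) are all made up for the example, and a real page would need its own parsing rule.

```python
import json
import os
import tempfile
from html.parser import HTMLParser

# Stand-in for the downloaded page; the markup is hypothetical.
SAMPLE_HTML = (
    '<html><body><p>Gold: <span id="gold-price">1,234.50</span> USD</p>'
    "</body></html>"
)


class PriceParser(HTMLParser):
    """Collect the text of the first <span id="gold-price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.price = None

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("id") == "gold-price":
            self._in_price = True

    def handle_data(self, data):
        if self._in_price and self.price is None:
            self.price = float(data.replace(",", ""))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


def extract_price(html):
    parser = PriceParser()
    parser.feed(html)
    return parser.price


def save_price(path, price):
    """Persist the value so it survives a restart."""
    with open(path, "w") as f:
        json.dump({"gold_price": price}, f)


def load_price(path):
    with open(path) as f:
        return json.load(f)["gold_price"]


price = extract_price(SAMPLE_HTML)
state_file = os.path.join(tempfile.gettempdir(), "gold_price.json")
save_price(state_file, price)
restored = load_price(state_file)
```

The whole document still has to be downloaded and scanned; the parser only saves you from hand-written string searching.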
When creating a CMS which would you recommend?
Having an .htaccess rule dynamically create the pages based on ?pg=name
or
Making an FTP connection to auto-create each file on the fly? This means that when a page is created, edited, or deleted, the admin area would, on save, FTP into the site and create the page.
Pros and Cons
Option 1 (.htaccess):
"Pro" Fewer files means less space
"Con" More continual overhead for Apache to redirect
Option 2 (FTP):
"Con" More space taken
"Pro" Less work to find the file, since it is created once and only regenerated when changed
Alright, let me clarify. Which is the better option?
1. Create index.php and have .htaccess redirect everything to it, sending ?pg=name, and then get the content from the database
2. Have the admin area automatically FTP into the site when content is created, edited, or deleted and create the page, so that when a person types the page address it is hard coded
Without a doubt the best way to go for your CMS is using Apache mod_rewrite. This way you have more flexibility in the future for changing the way that you want URLs displayed, and it expedites the creation of new content so that it doesn't have to be uploaded via FTP every time.
If you have to use FTP to use your CMS, I'm afraid it won't be very scalable, which is one of the benefits of a CMS.
Your 'better option' is 1. Stick to mod_rewrite.
If you want to, you can mix those options - use htaccess for nice names for your pages, rewriting them to ?pg=name and then load data from file or database.
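To make the rewritten-request flow concrete, here is a small sketch of what the script behind the rewrite does once it receives ?pg=name. It is written in Python for illustration only (the thread itself is about index.php), and the PAGES dict is a stand-in for the database or file lookup.

```python
from urllib.parse import parse_qs, urlparse

# Stand-in for the content store; a real CMS would query its database here.
PAGES = {
    "home": "<h1>Welcome</h1>",
    "about": "<h1>About us</h1>",
}


def render(request_uri):
    """Return (status, body) for a rewritten request like /index.php?pg=about."""
    query = parse_qs(urlparse(request_uri).query)
    name = query.get("pg", ["home"])[0]  # fall back to the home page
    body = PAGES.get(name)
    if body is None:
        return 404, "<h1>Not found</h1>"
    return 200, body


status, body = render("/index.php?pg=about")
```

Creating or editing a page then only means changing a row in the content store; no file has to be uploaded.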