Page renders on browsers but throws "404 page not found" for SEO crawlers and/or requests made by program. What could be the issue? - zend-framework

I built a site out of ZF and installed it fine on my server. I have the MVC structure and use custom routing (for SEO purposes) as below:
mysite.com/controller.html
mysite.com/controller/action.html
Generally, everything is working fine but the only problem is that SE crawlers won't find any .html files. If i open the "Activity" window from Safari, I see all the css and other files being referenced/read fine but not the page itself.
So, the page renders fine on a browser but SE crawlers or any program that made the request won't find the page. I'm wondering if it's an Apache issue. My .htaccess file is the same file that shipped with the ZF.
I really appreciate any advise/suggestions/comments!

Is it possible that your app is serving all pages with a 404 status code? So browsers and crawlers are getting the same thing, but the browser will render the content whereas the crawlers ignore it. I've seen some people use the Error Controller in ZF as a way of doing routing (not a good idea), where the Error Controller 'catches' all requests and then examines the params to determine what to display.
If this isn't your problem please could you edit your question to include:
How it is you know that crawlers are getting a 404
Some more info on how you are doing your routing
Also if you can provide an example URL we can check the headers that are being returned.

Related

Facebook App Not Displayed Insecure Content Message In Chrome

I've been trying to get to bottom of this problem for a few hours but I can't seem to fix it, I've seen other questions similar to this and tried to use those to implement a fix for my problem but to no avail.
I've built a facebook contest canvas app which displays fine independantely but when I link it to a facebook page (as a link to a new contest) chrome no longer displays is and gives the following warning:
The page at 'https://www.facebook.com/contest/app_xxxxxxxx' was loaded over HTTPS, but ran insecure content from 'http://mydomain.com/': this content should also be loaded over HTTPS.
I've learned partly by trawling this site that the chrome security is fussier, and the app loads correctly, without errors in FireFox and IE but I can't find any resources that are loaded from a non https source.
I have been through with firebug checking in the net tab and checked that all of the loaded resources are using https (the png images, the jpg images, the css files and the jquery js files which are all hosted on the same server that has the certificate), I have even tried hosting the transitional dtd doc itself but nothing seems to make the warning go away and the app display correctly.
In the other similar questions it seems that there are either resources sourced from non-https sources or there are ssl switches used in the javascript library for facebook passed before the fb init.
The problem is that I am using only the php sdk not the js one (although I am using version 1.9 of jquery, hosted on my server) and I could find no similar ssl specific settings there.
If someone could give me a tip about how I could investigate further, what I might be missing or is familiar with this issue I'd be interested to hear about it.
Thanks a lot.
David
Facebook requires the app to come from https:// you need an ssl certificate on your server and to enable ssl. in the Facebook app settings change secure url to https://mydomain.com url
I did have a similar issue recently (but it only caused issues on IE10) and I resolved that by adding P3P header
header('P3P:CP="IDC DSP COR ADM DEVi TAIi PSA PSD IVAi IVDi CONi HIS OUR IND CNT');
Found the solution!
In the facebook app settings, if the page tab url is specific to a page e.g. https://www.mydomain.com/index.php, chrome doesn't complain with the insecure content message but if you reference a directory the error is propogated. I found this confusing since the 'canvas' urls need to be directories.
I hope this answer will save someone a few hours! :)

Silent failure loading page application in iframe over https

Problem
I have an application driving a tab on a client's page. The application works correctly if the user has not enabled FB's "secure browsing" feature. If attempting to view over HTTPS, the iframe doesn't even appear (no errors, no mixed-content warnings). When correctly loading over HTTP, the div with the id "pagelet_app_runner" has an iframe inserted into it and the application content is loaded inside there. Over HTTPS, this div remains empty and the iframe is not inserted into the page. There are no Javascript errors appearing in Firebug or Chrome's equivalent console.
Why I'm Asking Here
The host has a valid SSL certificate and there is no 'mixed content' at the URL in question. I can successfully view the content over HTTP or HTTPS by visiting the URL directly, and I can do the same by visiting apps.facebook.com/canvasURL/tabURL. It is only when attempting to view within a Page Tab that the HTTPS load fails as described above. My application is configured with both regular and secure canvas and tab URLs.
Attempted Debugging
I've recorded some sessions with Charles but since the iframe isn't being inserted into the page, I think I'm coming at the problem after it's already occured. I'm no Charles expert so happy to be corrected here.
Apache isn't seeing any request (in either regular or ssl logs) for the affected loads. non-SSL loads come through as expected in access_log.
Plea for Help
I'm out of ideas for debugging this. Does anybody have any suggestions? What really obvious and stupid mistake might I have made? :)
edit: nicer formatting
Your app canvas URL is https://skinnycomp.nextstudio.com.au/skinnycowcomps/ , which send 404 error to Facebook proxy (request is going through proxy when viewing app via tab), also when viewing your app via apps (https://apps.facebook.com/122381834451561/), again 404... maybe Facebook proxy is ignoring 404 and posting blank...
Try changing canvas URL to https://skinnycomp.nextstudio.com.au/skinnycowcomps/tab, also you can check if your app is accessed via page tab, in signed_request there should be page_id...
23:51:15.379[549ms][total 1667ms] Status: 404[Not Found]
GET https://skinnycomp.nextstudio.com.au/skinnycowcomps/
This is a real longshot since I'm sure you've triple checked all the settings, but the blank page can happen if an invalid url is specified in the Page Tab URL field in the app settings. Since it only happens on https, it would imply something specifically with the Secure Page Tab URL entry. It might be worth checking that again, and maybe even re-saving it or changing it to something else to see if it helps.
I was using relative URLs for the regular and secure tab URL fields. From memory relative URLs here were mandatory at some point in the past. It appears now that a relative URL will still work for HTTP but not for HTTPs. Fix: absolute URLs. Hopefully FB update their field validation to match what's required too.

security warning in IE9 "Show all content"

I'm implementing the facebook Comments plugin on my site. Users get the warning "Show all content" in IE9
This other publisher using the same plugin and it does not bring up the warning.
Can some please help me with this?
Asking users to turn of the mixed content warning in their IE9 is not an option.
We were just looking at this today and our workaround for now was to include the Facebook Library over https (even when the page itself is viewed over http). Although not ideal it gets rid of the mixed content warnings in IE9 until they have fixed their bug.
That seems to be how it was accomplished at www.vg.no linked in the original question, the library is linked via https.
From their code:
<script src="https://connect.facebook.net/nb_NO/all.js"></script>
I have the same problem:
I have a page that's 100% http. But, the facebook javascript (which I call over http), is returning assets (.js, images) over https, which is generating security warnings for IE(9) users.
I have figured out it's the comment widget from Facebook (
Here's an example of a live page on http: with the error:
http://app.gophoto.com/p?id=10173&rkey=CD01891B287792415384&s=1&a=6940
Here's one of the assets that Facebook returns over HTTPS
https://s-static.ak.facebook.com/rsrc.php/v1/y8/r/7Htnnss1mJY.js
(I'm unable to comment (for some reason?) on Joel's answer. But, his suggestion to fetch the initial all.js over https on http sites does not actually work. I've tried it, and it also inherently looks incorrect since even the initial js fetch violates the mixing up of http & https content.)

iPhone not caching Asp.Net pages

If I create a basic asp.net application and set the #outputcache the page is cached fine in chrome & IE on the desktop. First request returns 200, subsequent request return 304 for the default.aspx. (I'm monitoring through fiddler)
However accessing the same page from an iPhone I noticed that it's always returning 200 for the aspx file. All resources are being caching and are returning 304's. So it's just the aspx page.
Any ideas why this is happening?
Some technical details:
<%# OutputCache Duration="30" VaryByParam="None" Location="Any" %>
Stock standard ASPX page. Content-Length: 2464
Reloading on the iPhone using refresh control or keyboard "go" doesn't make a difference.
Explicitly setting eTag does not make a difference
Last-Modified is set
but If-Modified-Since is not being send for the ASPX page
Latest IOS 4.3.1
IIS 7.5 running on Win7 using ASP.NET 4
I think I figured it out. Feel free to correct me however. Website caching is a very messy area.
The root of the problem is that the iPhone is not sending "If-Modified-Since" headers with it's requests. Without that the server cannot reply with a 304.
After some experiments I've found that if you use a link to navigate to the page it will send the 'If-Modified-Since' header and everything works as expected and the server neatly returns a 304.
Cases where it does not send a "If-Modified-Since" even though it's cached:
Typing in the URL
Pressing the refresh button
Selecting the URL and pressing Go
Opening as a bookmark
Opening from a saved reference on the home screen
It only seems to be doing this for the primary url everything else that is referenced does have a "If-Modified-Since" header (where applicable).
Note: I've tested this on IOS 4.3.1 only. Looking at the link Paddo send and further investigation into that area it seems that Apple likes to change caching behaviour between IOS versions.
Found this, re php (will also apply to .net)... iphone doesn't seem to cache any resource over 15k in size, and total cache size is 1.5MB. Note this is old information so may have changed.
http://www.phpied.com/iphone-caching/
The solution for a file over 15k, is to use an offline application cache manifest file, as outlined here:
http://developer.apple.com/library/safari/#documentation/AppleApplications/Reference/SafariWebContent/Client-SideStorage/Client-SideStorage.html
PS I know your content length is below 15k - so something else must be amiss... but I'm still hopeful that the manifest file will work.
for ASP.NET just use
Response.AddHeader( "Cache-Control","no-cache");
or
Response.AddHeader( "Pragma", "no-cache");
or
Response.Cache.SetCacheability(HttpCacheability.NoCache);

Why does IE8 render my site then immediately redirect to its internal 404?

I administer a site, hosted on Yahoo! hosting, which has recently shown a strange behavior: when you visit in IE8, the page loads and is rendered normally, then as soon as it finishes rendering, the browser switches to show its local/internal 404 page. The address bar still shows the site URL.
When I view the site in (as far as I can tell) the same state on my local Apache server, it doesn't do this. This leads me to suspect it may have something to do with server configuration and response headers, but I don't know what that might be.
Is anyone familiar with this behavior?
I experienced this behavior when using a .htc hack to provide artificial CSS border-radius support.
I'm not sure what is causing that issue specifically, but you could use a packet capture utility like Wireshark or Fiddler2 to investigate the issue further. Otherwise, it would be helpful if you were to post a link to the site.
Your page contains JavaScript code which modifies the DOM while the page is still loading.
See other SO questions, such as here and here.
Solution: place your DOM manipulation code into < body onload> or jquery.ready() to execute after page loading is complete.