Does PageSpeed Insights ignore pages with noindex, nofollow tags added to discourage bots from crawling? - chrome-ux-report

Does PageSpeed Insights ignore pages with noindex, nofollow tags added to discourage bots from crawling?
If not, is there a way to make Google exclude a page from the origin data summary (field data)?

Pages that explicitly opt out of indexing should not have URL-level field data in PSI or the CrUX API. However, those pages may still be included in the origin-level performance distribution.
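If you want to verify this for a particular page, you can query the CrUX API directly and compare the URL-level and origin-level responses. A minimal sketch follows (TypeScript; the URLs are hypothetical and the API key is a placeholder). If the URL-level query returns no record while the origin-level query does, the page has no URL-level field data even though its traffic may still be folded into the origin summary:

```typescript
// Minimal sketch: check whether a specific URL has its own field data in CrUX.
// The endpoint and request body follow the public CrUX API; CRUX_API_KEY is a placeholder.
const CRUX_API_KEY = "YOUR_API_KEY";
const CRUX_ENDPOINT =
  `https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=${CRUX_API_KEY}`;

async function queryCrux(body: { url?: string; origin?: string }): Promise<unknown | null> {
  const res = await fetch(CRUX_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  // The API answers 404 NOT_FOUND when it has no record for the requested url/origin.
  return res.ok ? res.json() : null;
}

// Hypothetical example pages: compare URL-level vs origin-level records.
const urlRecord = await queryCrux({ url: "https://example.com/some-noindexed-page" });
const originRecord = await queryCrux({ origin: "https://example.com" });
console.log("URL-level record:", urlRecord ? "present" : "none");
console.log("Origin-level record:", originRecord ? "present" : "none");
```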

Related

How do sites like Facebook ensure vanity URLs don't conflict with internal pages/folders?

Sites like FB offer you a shallow "vanity" URL of your choice to give out to people, like facebook.com/examplestore.
But because it's so shallow in depth, it could easily conflict with internal pages. For example, if I register the vanity URL facebook.com/jobs, that means Facebook can never add a "Jobs" section to their site in the future without changing its URL. If they try to get rid of the existing page (maybe it's a fan page for Steve Jobs), that would mess up SEO for that page anyway.
I have thought that there should be a list of reserved words stored in a database table and the choice of vanity URL could be checked against that (a rough sketch of such a check is below). But is this the only way to do it?
It would mean having to brainstorm every single possible section/page you may EVER have on your site (e.g. settings, index,... potentially any other section in the site) and even then you could miss a few. Is there any alternative/better way to do it?
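For illustration, here is a minimal sketch of the reserved-word check described above (TypeScript; the segment list, naming rules and the isTaken helper are all hypothetical):

```typescript
// Hypothetical reserved-word check for vanity URL registration.
// Top-level path segments the site uses (or may plausibly use later) are off limits.
const RESERVED_SEGMENTS = new Set([
  "jobs", "settings", "index", "login", "help", "about", "pages",
]);

// Hypothetical naming rules: lowercase, starts with a letter/digit, 3-50 chars.
const VALID_VANITY = /^[a-z0-9][a-z0-9._-]{2,49}$/;

function canRegisterVanityUrl(
  requested: string,
  isTaken: (name: string) => boolean, // lookup against already-registered vanity URLs
): boolean {
  const name = requested.toLowerCase();
  if (!VALID_VANITY.test(name)) return false;    // malformed name
  if (RESERVED_SEGMENTS.has(name)) return false; // collides with a current or future section
  return !isTaken(name);                         // collides with another registered vanity URL
}

// Example: canRegisterVanityUrl("jobs", () => false) === false
```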

Googlebot is not reading dynamic content

The website is fully dynamic.
Meta tags, Open Graph tags and content are all created dynamically on the pages.
I might be doing something wrong. Please guide me on getting approved for the Google AdSense program.
Google AdSense gave the reason "Insufficient content" for rejecting it.
I think the only real answer is to implement some kind of partial caching, so that the rendered content ends up in the HTML source your server sends. If the needed content is not in the source code of your pages, it won't be indexed.
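To make that concrete, here is a minimal sketch (Node/TypeScript, with a hypothetical getPageData standing in for whatever database/CMS backs the dynamic site) of rendering the title, meta description and Open Graph tags into the initial HTML rather than injecting them client-side:

```typescript
import http from "node:http";

// Hypothetical lookup standing in for the database/CMS behind the dynamic site.
function getPageData(_path: string) {
  return {
    title: "Example product",
    description: "A server-rendered description for this page.",
    body: "<p>The actual article/product content, present in the source.</p>",
  };
}

http.createServer((req, res) => {
  const page = getPageData(req.url ?? "/");
  // Everything a crawler needs is in the HTML it receives, not added later by JavaScript.
  res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });
  res.end(`<!doctype html>
<html>
  <head>
    <title>${page.title}</title>
    <meta name="description" content="${page.description}">
    <meta property="og:title" content="${page.title}">
    <meta property="og:description" content="${page.description}">
  </head>
  <body>${page.body}</body>
</html>`);
}).listen(3000);
```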
What exactly do you mean by "fully dynamic" and what parts do you want to be indexed?

How to search Facebook's "invitable_friends" results

I have a Facebook app, and I'd like to allow my users to invite their Facebook friends to my app. The proper endpoint is /me/invitable_friends which is working well. But towards the bottom of that doc page, they recommend implementing a "search box" to filter the results, yet they don't offer any example of how to do this. I've searched around and haven't found anything.
It doesn't appear as though you can pass additional params for filtering the results. Obviously I can filter the results after the fact, but that's not scalable since the API only returns ~20 users at a time. That limit is modifiable (I believe), though it's of course not wise to bump it too high.
So how can I build a search box interface if I can't pass the search text to the endpoint? I must be missing something.
Thanks in advance.
PS - I'm using the JS SDK.
You should probably file a bug.
Based on the documentation, the default page size is 1000 records (the average Facebook friend list is 300-400 friends), so in practice you can usually fetch the whole list up front and filter it client-side.
If you don't see a next parameter under paging at the end of the result, there are no more results.
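A rough sketch of that approach with the JS SDK (TypeScript; the field names follow the Graph API docs, but the paging/cursor details are my assumption based on other Graph API edges): load the full list once, then drive the search box from a client-side filter.

```typescript
// Sketch: fetch the whole invitable_friends list up front (pages are large), cache it,
// and filter it client-side as the user types in the search box.
declare const FB: any; // global from the Facebook JS SDK

interface InvitableFriend {
  id: string;
  name: string;
  picture?: { data: { url: string } };
}

function loadInvitableFriends(): Promise<InvitableFriend[]> {
  return new Promise((resolve) => {
    const all: InvitableFriend[] = [];
    const fetchPage = (after?: string) => {
      const params: Record<string, unknown> = { limit: 1000 };
      if (after) params.after = after;
      FB.api("/me/invitable_friends", params, (response: any) => {
        all.push(...(response?.data ?? []));
        const cursor = response?.paging?.cursors?.after;
        // No `next` link under paging means there are no more results.
        if (cursor && response?.paging?.next) fetchPage(cursor);
        else resolve(all);
      });
    };
    fetchPage();
  });
}

// Search box handler: a plain substring filter over the cached list.
function filterFriends(friends: InvitableFriend[], query: string): InvitableFriend[] {
  const q = query.trim().toLowerCase();
  return q ? friends.filter((f) => f.name.toLowerCase().includes(q)) : friends;
}
```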

Facebook Graph API SEO Comments and Profanity Filter

I'm trying to integrate the Facebook comments left on our site in a way in which the content can be crawled by search engines and also for people (although I highly doubt there will be many) who don't have Javascript enabled on their browser.
Currently our Facebook comments are displayed via the Facebook comments social plugin (using the <fb:comments href="MY_URL" num_posts="50" width="665"></fb:comments> tag). This ends up rendering an iframe (which is mostly ignored by search engine crawlers), so the plan is to render this information and format it with basic HTML. To do this, the comments are pulled using the Graph API; this is then only displayed to crawlers and to people with JavaScript disabled.
This all works nicely using the Graph API call (https://graph.facebook.com/comments/?ids=MY_URL), parsing the JSON result and displaying it on the page. The catch is that the <fb:comments> approach filters its results based on a blacklist we have set up on one of our Facebook Apps. The App ID with the relevant blacklist is stored on the page as metadata (<meta property="fb:app_id" content="APP_ID"/>), which the <fb:comments> control obviously must somehow use to filter the comments.
The problem is that the Graph API method does not filter any results, as I guess no blacklist (or App ID containing a blacklist) is specified. Does anyone know how to specify a Facebook App ID in the API call URL, or of another way to avoid fetching comments that violate the blacklist?
On a side note, I know the debate about filtering content in comments rages on, but it is a management decision to implement the blacklist, and one that I have no influence in changing - just in case anyone feels the need to explain the reasons why content filtering is or isn't a good idea!
Any thoughts on a solution?
Unfortunately there's no way to access a filtered list of comments using the API. It might be a reasonable request to have this in the API, so you should file a wishlist item in Facebook's bug tracker.
Otherwise, the only solution I can think of is to implement your own filter on your side when retrieving and displaying the comments from the API.
According to the Comments plugin documentation, the filter on Facebook's side is implemented as a simple substring match, so it should be trivial to replicate.
A fairly simple substring or regular expression match should be able to check each comment against a relatively long list quickly; a sketch follows the quoted docs below.
(Unfortunately, the tradeoff here is that implementing a filter is easy, but you'd also need to write an interface so that whoever updates the list of disallowed words can maintain it for both the Facebook plugin and your own filtering.)
Quote from the docs:
The comment is checked via substring matching. This means if you blacklist the word 'at', if the comment contains the sequence 'a' 't' anywhere it will be marked with limited visibility; e.g. if the comment contained the words 'bat', 'hat', 'attend', etc it would be caught.
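As a rough sketch of that do-it-yourself filter (TypeScript; the blacklist terms are placeholders, and the response shape is assumed from how the question already parses the JSON):

```typescript
// Replicate Facebook's documented behaviour: case-insensitive substring matching
// against a blacklist, applied to comments pulled from the Graph API.
const BLACKLIST = ["badword1", "badword2"]; // placeholders; mirror the app's blacklist here

interface FbComment {
  id: string;
  message: string;
  from?: { name: string };
}

function passesBlacklist(message: string): boolean {
  const text = message.toLowerCase();
  return !BLACKLIST.some((term) => text.includes(term.toLowerCase()));
}

async function fetchFilteredComments(pageUrl: string): Promise<FbComment[]> {
  const res = await fetch(
    "https://graph.facebook.com/comments/?ids=" + encodeURIComponent(pageUrl),
  );
  const json = await res.json();
  // Assumed shape: each requested URL keyed to an object holding a `data` array of comments.
  const comments: FbComment[] = json?.[pageUrl]?.data ?? [];
  return comments.filter((c) => passesBlacklist(c.message));
}
```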
Pretty sure there is no current way of doing this from the Graph API; the only thing I can suggest is taking the blacklist and building your own filter.

Dealing with 301 redirects for a brand new website

I have seen multiple articles on redirecting URLs when a site has been redesigned or the URLs have just changed to a standard format, but I need to know how to manage it when the new URL has no correlation to the old one.
For instance, an old URL may have been www.mysite.com/index.php?product=12 but there is no way to map that URL to the new site.
I don't want search engines to think that the page has broken, so I assume the best thing to do is to 301 redirect to the home page, but I am not sure how I would do that effectively. Would I just change the 404 error page to do a 301 to the home page?
Also, would that then cause issues with duplicate content via different URLs?
Is it better to just not worry about these and let the search engines re-index the new URLs?
I am running IIS 7 with the URL Rewrite module and ASP.NET 2.0.
Thanks.
Why do you say there is no way to map that URL to the new one? There probably is, since both should be unique identifiers for a given resource. If your site has good rankings, it may be worth the pain to work this out and set up a 301 redirect to the right page (a sketch of such a mapping is below). That way, the rankings should carry over unchanged.
Redirecting everything to the new home page will probably have a negative effect. It really depends on how the bots interpret it, but it can look like an artificial way to inflate the rank of the home page and earn a penalty accordingly.
Doing nothing and waiting for the bots to index your new site will of course work, but often you cannot afford to lose the high rank you have gained.
All in all, I would advise you to ask a new question here on how to map the old URLs to the new ones, and do proper redirects.
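The mapping itself can be as simple as a lookup table from old URLs to their closest equivalents on the new site. A sketch of the idea (Node/TypeScript for brevity, with made-up URLs; on IIS 7 the same lookup could live in a URL Rewrite rewrite map or a catch-all ASP.NET handler):

```typescript
import http from "node:http";

// Old URL -> new URL lookup (made-up examples). Anything unmapped falls through
// to a genuine 404 instead of a blanket redirect to the home page.
const redirects = new Map<string, string>([
  ["/index.php?product=12", "/products/blue-widget"],
  ["/index.php?product=13", "/products/red-widget"],
]);

http.createServer((req, res) => {
  const target = redirects.get(req.url ?? "");
  if (target) {
    // 301 tells search engines the move is permanent, so ranking signals follow the new URL.
    res.writeHead(301, { Location: target });
    res.end();
  } else {
    res.writeHead(404, { "Content-Type": "text/plain" });
    res.end("Not found");
  }
}).listen(3000);
```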
That product URL you supplied is obviously, well, a product. The best bet is to 301 redirect it to the new page that is most relevant to the old one. If there aren't any external links pointing to it at all, just let it die. Be sure to remove it from any sitemaps or old internal navigation links though, or it will keep getting re-indexed, which is what you want to avoid.
Once you have your new site structure set up, visit a site like AuditMyPc.com and create a brand new sitemap of your new site setup. Then log in to Google Webmaster Tools and resubmit the new sitemap. This normally fixes the problem, but if the old page is indexed, expect it to stay in Google's index for a while. They don't clean themselves up too well.