Sitemap not really working in TYPO3 9.5.x

I'm trying to get a sitemap working in TYPO3 9.5.x. If I go to https://domain.tld/?type=1533906435 I get the following page:
<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="/typo3/sysext/seo/Resources/Public/CSS/Sitemap.xsl"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://domain.tld/index.html?sitemap=pages&cHash=38eee382dd3fc2edb80b67944d477100</loc>
<lastmod>2019-10-07T13:57:04-07:00</lastmod>
</sitemap>
</sitemapindex>
So far so good. The link in there should take me to the actual sitemap, but instead it takes me straight to the root page without any redirection. This happens on two different sites. I didn't configure anything special; I just enabled the seo system extension and included the static template as described here.
When I submitted the sitemap to Google's Search Console, it said "could not fetch", but the next day the status was "Success" and it had discovered URLs. I guess Google crawled the root page and found the links on it.
How do I get the sitemap working, or is there a bug somewhere?

The index.html part in the path to the pages sitemap looks weird to me.
Can you please try to open the loc URL without the index.html part?
If you can see the sitemap then, we have to take a look at where this index.html is coming from.
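For what it is worth: in TYPO3 9.5 an index.html suffix in generated links usually comes from a PageTypeSuffix route enhancer in the site configuration. If that is the case here, adding the sitemap page type to its map should give the sitemap a clean URL. A sketch for config/sites/<identifier>/config.yaml, assuming the default seo page type 1533906435 and keeping your existing default/index values:

routeEnhancers:
  PageTypeSuffix:
    type: PageType
    default: '.html'
    index: 'index'
    map:
      '.html': 0
      'sitemap.xml': 1533906435

With a mapping like this, the loc in the sitemap index should point to /sitemap.xml?sitemap=pages&cHash=... instead of index.html.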

Related

TYPO3 seo no sitemap ?type=1533906435

TYPO3 10.4.16
I am not able to create a sitemap.xml with the system extension SEO.
My current understanding is that the only requirement is to install seo and include its static template. Then I should be able to see a sitemap at https://yourdomain.com/?type=1533906435, but I always see my start page and not a sitemap.
Under EXT:seo/Resources/Public/CSS/Sitemap.xsl I see a file, but it is a month old and has nothing to do with my site.
Is there any possibility to trigger seo to generate a new sitemap?
Output from Template -> Template Analyzer:
Or what else can I check?
Edit 2021-06-14 10:10
I have moved the seo static template up to the top. No change!
Yes, there is a redirect from www.mysite.xyz to www.mysite.xyz/startseite!
When I use a different type, nothing changes.
This is a screenshot from the Object Browser:
Edit 2021-06-14 11:00
I have made a big step forward. The problem was the redirect, thanks to Richard.
But now I have a problem with the redirect. Even if I add a new route with sitemap.xml under Sites -> Static Routes, it does not work together with the redirect.
So the next thing I tried is the new seo feature to set an xslFile. I used xslFile = sitemap.xsl and also tried xslFile = /sitemap.xsl, but neither works!
Is it not possible to define an xslFile at the root?
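Two hedged sketches that might help, both with assumptions. For a plain /sitemap.xml URL, a static route in the site's config.yaml can point directly at the sitemap page type (uid 1 stands in for your root page); it only takes effect if the redirect record does not also match /sitemap.xml, so the redirect's source path may need to be restricted to the start page:

routes:
  -
    route: sitemap.xml
    type: uri
    source: 't3://page?uid=1&type=1533906435'

For the stylesheet, as far as I know xslFile has to be set in TypoScript below the xmlSitemap configuration of the seo extension, not as a bare line, and the value should be a resolvable path such as an EXT: path (my_sitepackage is a placeholder; please check the exact option level against the seo documentation for 10.4):

plugin.tx_seo {
  config {
    xmlSitemap {
      xslFile = EXT:my_sitepackage/Resources/Public/Css/Sitemap.xsl
    }
  }
}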

Facebook Object Debugger returns 404 not found when trying to scrape

I have a simple Tumblr blog on which I post content.
However, since I changed my DNS, the Facebook Object Debugger sees really old data for my root URL, http://www.kofferbaque.nl/, and for every post (for instance http://kofferbaque.nl/post/96638253942/moodoid-le-monde-moo) it shows a 404 Not Found, which makes no sense because the actual content is there.
The full error message: Error parsing input URL, no data was cached, or no data was scraped.
I have tried the following things to fix it:
cleared my browser cache / cookies / history
used ?fbrefresh=1 after the URL (didn't work)
added an FB app_id to the page (made sure the app was in production, added the correct namespaces, etc.; this also didn't change anything)
checked out other questions regarding this subject
rechecked all my meta tags a dozen times
What other options are there to fix this issue?
If you need more info please ask in the comments.
2014-09-08 - Update
When I throw my URL into the static debugger, https://developers.facebook.com/tools/debug/og/echo?q=http://www.kofferbaque.nl/, the 'Net' tab in Firebug gives the following response:
<meta http-equiv="refresh" content="0; URL=/tools/debug/og/echo?q=http%3A%2F%2Fwww.kofferbaque.nl%2F&_fb_noscript=1" /><meta http-equiv="X-Frame-Options" content="DENY" />
2014-09-11 - Update
removed duplicate <!DOCTYPE html> declaration
cleaned up the <html> start tag (i.e. temporarily removed IE support)
I've placed a test blog post to see if it would work; it didn't. Somehow my root URL started 'magically' updating itself, or rather, it dropped the old data, probably because I removed the old app it was still referring to. However, it still doesn't see the 'newer' tags correctly.
Still no success.
2014-09-12 - Update
Done:
moved <meta> tags to the top of the <head> element
removed fb:app_id from the page + the body script, since it has no purpose
This apparently doesn't change anything. It also appears that Tumblr injects lots of script tags at the start of the head element. Maybe that is why the Facebook scraper doesn't 'see' the meta tags.
The frustrating bit is that another og tag scanner, http://iframely.com/debug?uri=http%3A%2F%2Fkofferbaque.nl%2F, shows all the correct info.
First, the HTML is not valid. You have the doctype twice (at least on the post page), and there is content before the html tag (a script tag and IE conditionals).
This may be the problem, but also make sure you put the og tags together at the beginning of the head section. The debugger only reads part of the page AFAIK, so make sure the og tags are in that part. Put all the other og tags right after og:site_name.
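Roughly what the top of the head should look like once the injected scripts are out of the way; all values except the post URL are placeholders for your own data:

<head>
<meta charset="utf-8">
<meta property="og:site_name" content="Kofferbaque">
<meta property="og:type" content="article">
<meta property="og:title" content="Post title here">
<meta property="og:url" content="http://kofferbaque.nl/post/96638253942/moodoid-le-monde-moo">
<meta property="og:description" content="Short description of the post">
<meta property="og:image" content="http://www.kofferbaque.nl/path/to/cover.jpg">
<!-- scripts, stylesheets and IE conditionals only after the og tags -->
...
</head>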
Btw: ?fbrefresh=1 is not really necessary; you can use ANY parameter, just to create a different URL. But the debugger offers a button to refresh the scraping, so it's useless anyway.

How to correctly change page extensions site-wide without losing rankings in Google

I moved content from PHP files to a content management system and changed page URLs from .php to no extension at all, e.g. before: some-page.php, after: some-page.
All PHP pages were kept on the server, but I added a redirect in each PHP file, like header('Location:'. basename($_SERVER['PHP_SELF'],'.php') );. I cannot (do not know how to) do it in .htaccess, because .htaccess contains this code:
RewriteRule ^([a-zA-Z0-9_-]+)$ show-content.php?$1
DirectoryIndex show-content.php?index
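For reference, a rough .htaccess alternative would sit above the existing rules. It is untested against this setup and keys on %{THE_REQUEST} so that the internal rewrite to show-content.php cannot trigger a redirect loop:

# 301-redirect only URLs the client actually requested with a .php extension
RewriteCond %{THE_REQUEST} \s/([a-zA-Z0-9_-]+)\.php[\s?]
RewriteCond %{THE_REQUEST} !show-content\.php
RewriteRule ^ /%1 [R=301,L]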
In Google Webmaster Tools I removed the previous sitemap and added the new one. I can see that the new sitemap was accepted, but no URLs are included.
I did all of the above 3-4 days ago, and today I see that the website has dropped in the search results (some pages cannot be found on the first pages at all).
I'm trying to understand what I did wrong.
It seems the first problem is the wrong redirect code. I changed it to header('Location:'. basename($_SERVER['PHP_SELF'],'.php'), true, 301 );. Is such code OK?
What else did I do wrong?
You definitely need to do a 301 redirect from the old URL (with the extension) to the new URL (without the extension), as this tells Google the page has moved to a new location and that you want all links and other SEO goodness (like rankings) to be "transferred" to the new URL.
You can do the 301 redirect any way you want, so using PHP's header() is fine as long as you are redirecting to the right page. Your example should work fine.
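For completeness, a minimal sketch of such a redirect file; the leading slash keeps the Location target unambiguous and the exit stops the rest of the old page from running after the header is sent:

<?php
// some-page.php - permanently redirect to the extensionless URL
$target = '/' . basename($_SERVER['PHP_SELF'], '.php');
header('Location: ' . $target, true, 301); // 301 = moved permanently
exit;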

How to integrate an index.php file (which contains a form) into a fan gate, so it's only visible to fans?

How do I integrate an index.php file (which contains a form) into a fan gate, so it's only visible to fans?
I already have the fan gate PHP file, but I don't know how to integrate another PHP file into it.
It doesn't work with include "index.php"; because every time someone clicks the Send/Participate button of the form in the index.php file, a new white/empty page loads. The index.php file works fine without the fan gate integration, though.
Has anyone had the same problem, or does anyone have an idea how I can solve it?
An immediate solution would be to modify your fan_gate.php file so that it redirects fans to the index.php file.
However, the better solution would be to fix your form. It should work fine with include; I have several includes in my app and they work just fine. The problem is probably your form's target (action). Change it to fan_gate.php if you use include.
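Roughly what I mean, with placeholder names; the fan check itself is whatever your existing fan_gate.php already does with the signed_request:

<?php
// fan_gate.php - the file the Facebook tab actually loads
$is_fan = check_signed_request_liked(); // hypothetical: your existing signed_request check

if ($is_fan) {
    // index.php is rendered inside the fan gate; its <form action="fan_gate.php" method="post">
    // must point back here, otherwise the submit loads index.php on its own
    // and you get the blank page you are seeing
    include 'index.php';
} else {
    include 'teaser.php'; // hypothetical "please like the page first" view
}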
If it still doesn't work, can you give me a link to the page?

Get Google to index links from JavaScript-generated content

On my site I have a directory of things which is generated through jQuery AJAX calls that subsequently create the HTML.
To my knowledge, Google and other bots aren't aware of DOM changes after the page load, and won't index the directory.
What I'd like to achieve is to serve the search bots a dedicated page which only contains the links to the things.
Would adding a noscript tag to the directory page be a solution? (In the noscript section, I would link to a page which merely serves the links to the things.)
I've looked at both robots.txt and the meta tag, but neither seems to do what I want.
It looks like you stumbled on the answer to this yourself, but I'll post the answer to this question anyway for posterity:
Implement Google's AJAX crawling specification. If links to your page contain #! (a URL fragment starting with an exclamation point), Googlebot will send everything after the ! to the server in the special query string parameter _escaped_fragment_.
You then look for the _escaped_fragment_ parameter in your server code, and if present, return static HTML.
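A rough sketch of that server-side check in PHP; render_static_directory() is a hypothetical function that prints a plain list of <a href> links, and note that Google has since deprecated this crawling scheme:

<?php
// directory.php - a crawler that sees /directory#!page=2 will instead request
// /directory?_escaped_fragment_=page=2, so serve it pre-rendered HTML
if (isset($_GET['_escaped_fragment_'])) {
    $state = $_GET['_escaped_fragment_'];   // whatever followed the #!
    echo render_static_directory($state);   // hypothetical: static HTML with plain links
    exit;
}
// normal visitors fall through to the jQuery/AJAX version of the page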
(I went into a little more detail in this answer.)