google search console discovered urls is too few for sitemap index


My sitemap index file does not show any errors in Google Search Console, but it shows only 397 discovered URLs, whereas there should be over a million.
Basically my sitemap index file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>https://www.somesite.com/sitemap1</loc><lastmod>2020-09-14T04:38:25Z</lastmod></sitemap>
<sitemap><loc>https://www.somesite.com/sitemap2</loc></sitemap>
<sitemap><loc>https://www.somesite.com/sitemap3</loc></sitemap>
<sitemap><loc>https://www.somesite.com/sitemap4</loc></sitemap>
<sitemap><loc>https://www.somesite.com/sitemap5</loc></sitemap>
... (614 sitemap entries in total)
</sitemapindex>
What can be wrong? Do I have too many <sitemap> entries?

This was caused by a combination of having too many sitemap files and some of them timing out. Search Console does not report those errors well when there are a lot of sitemaps. The issue was fixed after reducing the sitemap count to 200 and making each sitemap load in under 30 seconds.
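To find out which child sitemaps are slow or failing before Search Console silently skips them, something like the following Python sketch may help. It lists the `<loc>` entries of a sitemap index and times each download. The function names are made up for illustration, and the 30-second limit is simply the figure that worked in the answer above, not a documented Google threshold.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET
from typing import List, Optional

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_child_sitemaps(index_xml: str) -> List[str]:
    """Return the <loc> URL of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def time_fetch(url: str, timeout: float = 30.0) -> Optional[float]:
    """Return the seconds needed to download url, or None if it failed."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except OSError:  # URLError, HTTPError and socket timeouts derive from OSError
        return None
    return time.monotonic() - start


def slow_or_failing(urls: List[str], limit: float = 30.0) -> List[str]:
    """Return the URLs that failed to load or took longer than limit seconds."""
    flagged = []
    for url in urls:
        elapsed = time_fetch(url, timeout=limit)
        if elapsed is None or elapsed > limit:
            flagged.append(url)
    return flagged
```

Calling `slow_or_failing(list_child_sitemaps(index_xml))` on the contents of the index file gives a list of sitemaps worth fixing or merging; `len(list_child_sitemaps(index_xml))` shows how many entries the index currently has.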

Related

How do I get Gatling reports to show URLs instead of request_0 etc?

I'm new to Gatling, apologies if this is a complete noob question.
The "Details" tab of my Gatling report looks like this:
The left-hand menu contains all the requests that were made. My problem is that, in all but a few rare cases, they're just labelled "request_x" instead of the URL or filename. So where there is a bottleneck I can't tell what page or resource was causing it.
I found that if I manually edit the .scala file before running the scan, I can change each one by hand, e.g. if I change...
.exec(http("request_0")
.get(uri01)
.headers(headers_0)
.resources(http("request_1")
.get(uri02)
.headers(headers_1)))
...to..
.exec(http(uri01)
.get(uri01)
.headers(headers_0)
.resources(http(uri02)
.get(uri02)
.headers(headers_1)))
...it seems to have the desired effect. But I don't want to have to change hundreds of these by hand every time I have a new test to run.
Surely there's a better way?
FWIW I'm generating this scala file using Gatling's "recorder" with an HAR file exported from Chrome, as opposed to running the recorder as a proxy. But I have tried the proxy option and got the same end result.
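One possible workaround, sketched here in Python and not taken from Gatling itself, is to post-process the recorder's output so that the manual edit shown above is applied to every request at once. The function name `rename_requests` and the regular expression are assumptions for illustration. The pattern only handles simple arguments without nested parentheses, such as `uri01` or `uri01 + "/page"`.

```python
import re

# Matches http("request_N") followed by .get(...), .post(...), etc.
# Group 1: whitespace plus ".get(" (or another method); group 2: its argument.
REQUEST_PATTERN = re.compile(
    r'http\("request_\d+"\)'
    r'(\s*\.(?:get|post|put|delete|patch|head|options)\()'
    r'([^()]*)\)'
)


def rename_requests(scala_source: str) -> str:
    """Replace each "request_N" label with the expression passed to the HTTP method."""
    def substitute(match: re.Match) -> str:
        method_call, argument = match.group(1), match.group(2)
        return f"http({argument}){method_call}{argument})"

    return REQUEST_PATTERN.sub(substitute, scala_source)
```

Reading the generated simulation file, passing its text through `rename_requests`, and writing the result back would turn `http("request_0")` followed by `.get(uri01)` into `http(uri01)` followed by `.get(uri01)`, which is the same change made by hand above.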

Cruise Control .net Changing Log File appearance

I would like to change the appearance of the log file generated by ccnet. It is useful that the error messages are separated from the original log messages, but when debugging it is tricky to see when an error actually happened. Our PowerShell script runs for 6-8 hours and creates about 38k lines in the log file, so I would really appreciate a way to list the errors inline with the other lines in the log file. Additionally, it would be nice if all the errors still appeared separately as well.
So far I have not found much documentation explaining how to change the log file output...
Simon

I'm not sure how this is logged, but in the end the logs produced during the build are put into the build log file, which you will find in the artifacts folder.
These logs are then transformed into HTML output using XSL transforms. If none of the built-in reports is useful to you, you can create a custom XSL file and plug it in. See the dashboard.config file; the following section allows you to add additional XSL transforms:
<buildReportBuildPlugin>
<xslReportBuildPlugin description="MSBuild Log" actionName="MSBuildBuildReport" xslFileName="xsl\MSBuild4Log.xsl"/>
...

If you know what the error messages are going to be, you can parse them with an XSL file and generate some HTML that will show up in the build emails. The following goes in ccservice.exe.config.
<xslFiles>
<file name="c:\path\to\custom_errors.xsl"/>
</xslFiles>
custom_errors.xsl is an XSL file that finds the error messages in the raw build log XML and generates HTML from them. This HTML will show up in the build emails. You have to create custom_errors.xsl yourself. It is a significant amount of work to get it working the first time, especially if you're new to XML/XSL/HTML/CSS. If you undertake this, I suggest doing all the testing outside of ccnet, using an XSL transformer with a sample ccnet build log as input. Be aware that ccnet uses a CSS file to style the HTML; you can edit this too.
Note you have to restart the ccnet service after editing ccservice.exe.config.

Getting only 100 crawl issues using webmaster tool

I am using Feed crawlIssues = wtr.GetCrawlIssues(encodedSiteID); to get the crawl errors from my webmaster tool account. There are more than 5k errors but the above code retrieves just the first 100. How do I retrieve all the errors?
Thanks

I've run into the same issue: I only got the first 100 errors, too. Basically, because of a bug in the webmaster tools, it only shows you the errors in batches of 100.
It does not have a built-in solution as far as I know, but there is a workaround. Instead of using the GetCrawlIssues function, you can access the data through HTTP requests with the provided ExecRequest.exe command line tool. The basic usage is:
ExecRequest cl QUERY http://www.google.com/webmasters/tools/feeds/example_site.com/crawlissues/?start-index=1&max-results=100 example@gmail.com mypassword
This will output the resulting XML to the console. You can specify the starting point, and the number of errors you want to download:
?start-index=startIndex
&max-results=100
You can set the max-results value to whatever you want, but it will only download a maximum of 100 items per request.
After downloading in batches, you can get the data from the downloaded xml files.
If you only need the data, I've also written a small script in Python, you can check it out here, it's pretty straightforward.
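The batching itself can be sketched in a few lines of Python. This is not the script mentioned above, only an illustration of the loop: `page_url` builds the query string from the answer, and `fetch_all` keeps requesting batches until one comes back short. The `fetch_batch` callable is a hypothetical placeholder for whatever actually downloads and parses one batch (for example a wrapper around ExecRequest).

```python
from typing import Callable, List
from urllib.parse import urlencode


def page_url(feed_url: str, start_index: int, max_results: int = 100) -> str:
    """Build the feed URL for one batch; start-index is 1-based."""
    query = urlencode({"start-index": start_index, "max-results": max_results})
    return f"{feed_url}?{query}"


def fetch_all(fetch_batch: Callable[[int, int], List], batch_size: int = 100) -> List:
    """Call fetch_batch(start_index, batch_size) until a short batch is returned."""
    items: List = []
    start_index = 1
    while True:
        batch = fetch_batch(start_index, batch_size)
        items.extend(batch)
        if len(batch) < batch_size:  # last (possibly empty) batch reached
            return items
        start_index += batch_size
```

With more than 5k errors this makes a little over 50 requests, each asking for at most 100 items.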

How to solve validation error on xsi:noNamespaceSchemaLocation in jdoconfig.xml

Since I updated today to GAE 1.7.2.1, I'm having validation errors in eclipse in all my jdoconfig.xml files.
I have the default jdoconfig.xml content :
[...]
<jdoconfig xmlns="http://java.sun.com/xml/ns/jdo/jdoconfig"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://java.sun.com/xml/ns/jdo/jdoconfig">
[...]
And eclipse validation throws:
Referenced file contains errors (http://java.sun.com/xml/ns/jdo/jdoconfig).
For more information, right click on the message in the Problems View and
select "Show Details..."
When clicking on details I can see a bunch of lines like:
s4s-elt-character: Non-whitespace characters are not allowed in schema elements
other than 'xs:appinfo' and 'xs:documentation'. Saw 'var_U = "undefined";'.
In different lines and different content in "Saw ... "
It occurs in every single project I start using the "New Web Application Project..." from the google plugin.
So does anyone have this problem? Any fix?

Try this:
<jdoconfig xmlns="http://java.sun.com/xml/ns/jdo/jdoconfig"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/jdo/jdoconfig http://java.sun.com/xml/ns/jdo/jdoconfig_3_0.xsd">
This is per the answer to "Validating jdoconfig with incorrect url".
The xmlns value is a namespace name, not a real file or directory, so nothing needs to exist at that address. The version is appended to get the real XSD file, namely http://java.sun.com/xml/ns/jdo/jdoconfig_3_0.xsd

There are a couple of problems here.
The syntactic problem is that the URI you are giving as the value of xsi:noNamespaceSchemaLocation is redirected to http://www.oracle.com/technetwork/java/index.html and returns an HTML document. The XSD validator you are using is trying without success to parse
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type"
content="text/html; charset=utf-8" />
<script type="text/javascript">
var _U = "undefined";
var g_HttpRelativeWebRoot = "/ocom/";
var SSContributor = false;
...
as an XSD schema document, and for one reason or another its attempts to explain what went wrong focus on finding the string var_U = "undefined" in a place where it was not expecting to see character data.
Then there are some conceptual problems.
Your document is in a namespace named http://java.sun.com/xml/ns/jdo/jdoconfig. Why on earth are you pointing the schema validator to a schema without a target namespace (which is what noNamespaceSchemaLocation does), if you want to validate your document? Given that (at least some of) your document's elements are namespace-qualified, you will want (as joncalhoun has already suggested) to use xsi:schemaLocation and provide a pair telling the validator where it can find a schema document for each namespace you want it to know about.
It's possible that a schema document used to be served from the location http://java.sun.com/xml/ns/jdo/jdoconfig, but since it's apparently the standard namespace named for your vocabulary, that's not actually very likely. Most systems distinguish fairly reliably between namespaces, which are abstract and poorly defined things, and schema documents, which are typically XML documents that define specific XSD schema components for a given namespace. It's not illegal to use the URI for a schema document as the name of a namespace, but it is unusual.
Note that the URL given by joncalhoun for the schema document (http://java.sun.com/xml/ns/jdo/jdoconfig_3_0.xsd) actually does resolve (after redirection to http://www.oracle.com/webfolder/technetwork/jsc/xml/ns/jdo/jdoconfig_3_0.xsd) to a schema document, which specifies http://java.sun.com/xml/ns/jdo/jdoconfig as its target namespace. (This means that even if you did succeed in retrieving this schema document by giving its URI as the value of xsi:noNamespaceSchemaLocation, you'd then get an error because it's not a schema document for elements and attributes with no namespace.)
This makes me think that you should read joncalhoun's answer again and try it again, carefully. If it didn't work when you tried it, my money says that either you tried something similar but not exactly what he suggested, or it solved this problem but that simply exposed some other problem, which is easy to mistake for failure.
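To check what a given jdoconfig.xml actually tells the validator, a small Python sketch along these lines can help. It reads the root element's namespace, splits `xsi:schemaLocation` into its namespace/location pairs, and reports any `xsi:noNamespaceSchemaLocation` value. The function name `schema_hints` is made up for this illustration, and nothing is downloaded or validated.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Tuple

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_hints(xml_text: str) -> Tuple[Optional[str], Dict[str, str], Optional[str]]:
    """Return (root namespace, schemaLocation pairs, noNamespaceSchemaLocation)."""
    root = ET.fromstring(xml_text)
    # ElementTree writes qualified names as "{namespace}localname".
    root_ns = root.tag[1:].split("}")[0] if root.tag.startswith("{") else None
    # xsi:schemaLocation is a whitespace-separated list: ns1 loc1 ns2 loc2 ...
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    pairs = dict(zip(tokens[0::2], tokens[1::2]))
    no_ns_location = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
    return root_ns, pairs, no_ns_location
```

If the returned root namespace is not a key in the pairs dictionary, the document gives the validator no schema hint for its own namespace, which is the situation described above for the default jdoconfig.xml.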

One solution is setting XML Catalog in Eclipse preferences.
Details:
Entry element: URI
Location: http://java.sun.com/xml/ns/jdo/jdoconfig_3_0.xsd
URI: http://java.sun.com/xml/ns/jdo/jdoconfig_3_0.xsd
Key type: Namespace name
Key: http://java.sun.com/xml/ns/jdo/jdoconfig

The syntactic and conceptual issues C.M. mentions are a problem with the plugin and Google's settings where both recommend,
xsi:noNamespaceSchemaLocation="http://java.sun.com/xml/ns/jdo/jdoconfig"
I don't specifically use jdo but I still get the validation error with this namespace. It was fine with this namespace until just recently.
I used LuboM's method and it worked for me. Neither LuboM's nor joncalhoun's is the real answer, though, since both tie me to JDO 3.0.
Oracle is going to have to provide the fix. Apparently their intent was to resolve the namespace issues themselves across versions of jdo.

This is what I did to fix it:
<?xml version="1.0" encoding="utf-8"?>
<jdoconfig xmlns="http://java.sun.com/xml/ns/jdo/jdoconfig"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://java.sun.com/xml/ns/jdo/jdoconfig_3_0.xsd">

This worked for me:
Right Click Project -> Properties -> Validation -> XML Syntax
Enable Project Specific Settings (If you need)
Under Validating Files, For No grammar Specified Select "Warning"
Click "Ok"
If you are asked whether to validate the file, click "Yes"
You can do the same for all the projects by going to Windows -> Preferences.
Make sure you are validating the file (Step 4).

I had the same issue, and excluded just this jdoconfig.xml file from Eclipse's validation. Even though your Eclipse throws an error for it, it in no way affects being able to deploy the project to GAE correctly.
Here is how to exclude just the jdoconfig.xml file to get rid of that pesky error:
Right click on your Eclipse Project, ->Properties->Validation->XML Validator, click on the "..." button for further options.
You should see Include Group and Exclude Group options. Click Exclude Group->Add Rule...->Folder or file name, and browse to your file.
Clean or rebuild your project. The validation error should be gone.
This worked for me in Eclipse Luna.

You might try this to solve your problem:
<?xml version="1.0" encoding="utf-8"?>
<jdoconfig xmlns="http://java.sun.com/xml/ns/jdo/jdoconfig"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://java.sun.com/xml/ns/jdo/jdoconfig">

iPhone: repeating Elements in XML

I am new to XML parsing and am parsing the following XML. There are tutorials for XML that has unique attributes, but this XML has repeating attributes.
<?xml version="1.0" encoding="utf-8"?>
<start>
<Period periodType="A" fYear="2005" endCalYear="2005" endMonth="3">
<ConsEstimate type="High">
<ConsValue dateType="CURR">-8.9919</ConsValue>
</ConsEstimate>
<ConsEstimate type="Low">
<ConsValue dateType="CURR">-13.1581</ConsValue>
</ConsEstimate>
</Period>
<Period periodType="A" fYear="2006" endCalYear="2006" endMonth="3">
<ConsEstimate type="High">
<ConsValue dateType="CURR">-100.000</ConsValue>
</ConsEstimate>
<ConsEstimate type="Low">
<ConsValue dateType="CURR">-13.1581</ConsValue>
</ConsEstimate>
</Period>
</start>
I need to fetch the low and high values based on the years 2005 and 2006.

I agree with SB's comment: if you want to handle XML data structures, you should know at least the basics.
A good tutorial I can recommend is the W3Schools XML Tutorial.
Once you have done that, you should know that there are several ways to parse XML files. For flat files I recommend the TBXML library; it is really fast and easy to use within your code.
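The traversal itself is the same in any parser: walk every Period, read its fYear attribute, then read the type attribute of each ConsEstimate inside it. Below is a language-neutral illustration in Python, not iOS code; with TBXML or NSXMLParser the same nesting of loops applies. The function name `estimates_by_year` is made up for this sketch.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def estimates_by_year(xml_text: str) -> Dict[str, Dict[str, float]]:
    """Map each fYear to its {"High": value, "Low": value} estimates."""
    root = ET.fromstring(xml_text)
    result: Dict[str, Dict[str, float]] = {}
    for period in root.findall("Period"):
        year = period.get("fYear")
        for estimate in period.findall("ConsEstimate"):
            value = estimate.find("ConsValue")
            if value is not None and value.text:
                # The repeated element is told apart by its "type" attribute.
                result.setdefault(year, {})[estimate.get("type")] = float(value.text)
    return result
```

For the XML in the question, `estimates_by_year(xml_text)["2005"]["High"]` gives the 2005 high value and `["2006"]["Low"]` the 2006 low value.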
