Getting only 100 crawl issues using webmaster tool - google-search-console

I am using Feed crawlIssues = wtr.GetCrawlIssues(encodedSiteID); to get the crawl errors from my Webmaster Tools account. There are more than 5k errors, but the code above retrieves only the first 100. How do I retrieve all of them?
Thanks

I've run into the same issue as you have; I only got the first 100 errors, too. Because of a bug in Webmaster Tools, it only shows you the errors in batches of 100.
As far as I know there is no built-in solution, but there is a workaround. Instead of using the GetCrawlIssues function, you can access the data through HTTP requests with the provided ExecRequest.exe command-line tool. The basic usage is:
ExecRequest cl QUERY http://www.google.com/webmasters/tools/feeds/example_site.com/crawlissues/?start-index=1&max-results=100 example@gmail.com mypassword
This will output the resulting XML to the console. You can specify the starting point, and the number of errors you want to download:
?start-index=startIndex
&max-results=100
You can set max-results to whatever you want, but the tool will only download a maximum of 100 items per request.
After downloading in batches, you can extract the data from the downloaded XML files.
If you only need the data, I've also written a small script in Python; you can check it out here, it's pretty straightforward.
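If you want to roll your own batching loop, it's easy to drive ExecRequest from Python. Here's a rough sketch (the site, account and password are placeholders, and the stop condition just checks whether the returned Atom feed still contains entries):

import subprocess

SITE = "example_site.com"                            # placeholder site ID
USER, PASSWORD = "example@gmail.com", "mypassword"   # placeholders
FEED = "http://www.google.com/webmasters/tools/feeds/%s/crawlissues/" % SITE

start, batch = 1, 0
while True:
    url = "%s?start-index=%d&max-results=100" % (FEED, start)
    # ExecRequest prints the resulting XML to stdout.
    xml = subprocess.run(
        ["ExecRequest", "cl", "QUERY", url, USER, PASSWORD],
        capture_output=True, text=True,
    ).stdout
    if "<entry" not in xml:                          # empty batch: done
        break
    with open("crawlissues_%03d.xml" % batch, "w") as f:
        f.write(xml)
    start, batch = start + 100, batch + 1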

Related

Tableau Command Line - Refresh datasource

I have a Tableau workbook which connects to a server-side data source. Researching online, it seems the accepted way to refresh this is something like
tabcmd refreshextracts -–datasource “Number of Goals”.
My datasource is named "aggregated usage table". I have tried basically copying what I saw online, using
tabcmd refreshextracts -–datasource “aggregated usage table”, and receive the error message "*** Item not found". Clearly I am not correctly identifying my data source.
Can someone help me determine the correct syntax for this?
The OS is Linux. Thank you!
Long shot, but something looks off about your dashes...
When I copy yours vs mine:
-– vs --
When I paste yours into Word and enlarge it, they look even weirder. This can happen when you copy and paste from the web into your script.
Try copying and pasting my command and see if it works:
tabcmd refreshextracts --datasource "aggregated usage table"
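If you want to double-check a pasted command for lookalike characters before running it, here's a quick Python sketch (the suspect string is inlined purely for illustration):

# Flag anything in a pasted command that isn't plain ASCII --
# en dashes (U+2013) and curly quotes (U+201C/U+201D) will show up here.
cmd = 'tabcmd refreshextracts -–datasource “aggregated usage table”'

for i, ch in enumerate(cmd):
    if ord(ch) > 127:
        print("position %d: %r is U+%04X, not plain ASCII" % (i, ch, ord(ch)))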

Why does Watson Personality Insights show different results using different API versions/demos?

My apologies if this question is a duplicate. We are facing an issue with the analysis of a profile using the Watson Personality Insights API in Spanish. We have a demo we implemented using PI API version 2, and then we tested the exact same text with the demo published on Developer Cloud (in Spanish). We found important differences in how the Big Five values were calculated, even though the facet values were not that different. Is it possible that these differences are caused by the API version? The issue is that our demo's Big Five values produced a rather negative summary profile, while the Developer Cloud summary is kinder.
We could send both result JSONs. For example, here is how the Big Five rated:
Big Five            DeveloperCloud   Demo (v2)
Openness            0.773834349      0.847273232
Conscientiousness   0.916616088      0.914907481
Extraversion        0.796331544      0.612606551
Agreeableness       0.17445636       0.096118648
Emotional range     0.036287447      0.01623536
Thanks in advance!
The API version would not make a difference, as it just governs the format of the requests and responses; the back-end models are the same for both v2 and v3 of the API.
So the gist of your question is that when you run the same text in your app and in the demo, you get different Big Five results, while the facet values are about the same.
This might be most easily solved by opening a support ticket so we can debug the issue together; if you'd rather not do that, can you provide a sample text? Typically it boils down to a difference in the way the text is parsed.
Another question: did you try making the request using curl? That would cut out any custom logic in your app and narrow down the problem.
Thanks, Neil, for your answer!
We tested the text using curl and noticed that the results didn't change with the service version, but with how the text was sent. If we called the service passing plain-text input (UTF-8, with line breaks), it returned the same results for version 2 and version 3, and they matched the ones from our demo. If we called the service passing JSON input WITHOUT line breaks, it returned the same values as well. But if we called the service passing JSON input WITH line breaks, the results changed and almost matched those shown by the IBM demo. My question is: which are the correct results? The ones returned when the text is sent as plain-text input (with line breaks), or when it is sent as JSON input (with line breaks)? Is there any technical guideline, besides the one on Developer Cloud, on how the text should be parsed for this service?
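For reference, here is roughly how we are making the two kinds of calls (a minimal sketch; the credentials, version date and file name are placeholders):

import requests

URL = "https://gateway.watsonplatform.net/personality-insights/api/v3/profile"
AUTH = ("username", "password")        # placeholder service credentials
PARAMS = {"version": "2016-10-20"}     # placeholder version date
TEXT = open("profile_text.txt", encoding="utf-8").read()

# Call 1: plain-text body, line breaks preserved.
r1 = requests.post(
    URL, params=PARAMS, auth=AUTH, data=TEXT.encode("utf-8"),
    headers={"Content-Type": "text/plain; charset=utf-8",
             "Content-Language": "es"},
)

# Call 2: the same text wrapped in a JSON contentItems payload.
body = {"contentItems": [
    {"content": TEXT, "contenttype": "text/plain", "language": "es"}]}
r2 = requests.post(URL, params=PARAMS, auth=AUTH, json=body)

# Both responses carry the Big Five scores under "personality".
print(r1.json()["personality"] == r2.json()["personality"])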
Thanks again!

How do I get Gatling reports to show URLs instead of request_0 etc?

I'm new to Gatling, apologies if this is a complete noob question.
The "Details" tab of my Gatling report looks like this:
The left-hand menu contains all the requests that were made. My problem is that, in all but a few rare cases, they're just labelled "request_x" instead of the URL or filename. So where there is a bottleneck I can't tell what page or resource was causing it.
I found that if I manually edit the .scala file before running the simulation, I can change each label by hand, e.g. if I change...
.exec(http("request_0")
.get(uri01)
.headers(headers_0)
.resources(http("request_1")
.get(uri02)
.headers(headers_1)))
...to..
.exec(http(uri01)
.get(uri01)
.headers(headers_0)
.resources(http(uri02)
.get(uri02)
.headers(headers_1)))
...it seems to have the desired effect. But I don't want to have to change hundreds of these by hand every time I have a new test to run.
Surely there's a better way?
FWIW I'm generating this scala file using Gatling's "recorder" with an HAR file exported from Chrome, as opposed to running the recorder as a proxy. But I have tried the proxy option and got the same end result.
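In the meantime I can script the rename as a stopgap. Here's a rough sketch in Python; it assumes every request follows the http("request_N") label / .get(...) or .post(...) pattern that the recorder emits:

import re
import sys

path = sys.argv[1]                     # e.g. RecordedSimulation.scala
with open(path) as f:
    src = f.read()

# For each http("request_N") label, look ahead to the .get(...)/.post(...)
# that follows and reuse its argument (uri01, uri02, ...) as the label.
renamed = re.sub(
    r'http\("request_\d+"\)(?=\s*\.(?:get|post)\(([^)]+)\))',
    lambda m: "http(" + m.group(1) + ")",
    src,
)

with open(path, "w") as f:
    f.write(renamed)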

How to get total error count from Klocwork automatically

I have a project running in Klocwork, and after the build completes the Klocwork results are generated. Every time, I need to go to the Klocwork portal to get the results and look for the new issues or the total issues. Instead, I'd like an API or script that fetches the total number of issues from the Klocwork results automatically when the build succeeds.
Is there any way to achieve this? One way is to scrape the portal page's HTML for the result I need, but I think there might be a better solution.
Can someone help me achieve this?
Thanks in advance.
I answered a similar question over here. Below is an updated answer with links to the documentation for the most recent release, Klocwork 11.
Klocwork has a WebAPI which you can use to query this type of information from your favorite scripting language, or for example with curl. API documentation is also provided on your Klocwork server at http://klocwork_server_host:port/review/api, for example http://localhost:8080/review/api.
The query:
curl --data "action=search&user=my_account&project=my_project&query=build:build_1 status:Analyze state:New,Existing&ltoken=xxxx" http://localhost:8080/review/api
will return a list of all open (state New and Existing), non-cited (status Analyze) issues found in a build named build_1 of project my_project.
For a list of the keywords you can use in the query string with the search action, see Searching in Klocwork Review.
If you want just a summary of the number of defects instead of getting the whole list, you can use the report action:
curl --data "action=report&user=my_account&project=my_project&build:build_1&x=Category&y=Component&filterQuery=status:Analyze state:New,Existing&ltoken=xxxx" http://localhost:8080/review/api
which returns back a summary of the number of defects by checker category (taxonomy) and component. Sample output is below:
{"rows":[{"id":1,"name":"C and C++"},{"id":3,"name":"MISRA C"},{"id":4,"name":"MISRA C++"}],"columns":[{"id":5,"name":"System Model"}],"data":[[122],[354],[890]],"warnings":[]}
You can modify the x and y axis parameters to produce different breakdowns of the issues, for example by Severity and State instead:
curl --data "action=report&user=my_account&project=my_project&build:build_3&x=Severity&y=State&filterQuery=state:New,Existing,Fixed&ltoken=xxxx" http://localhost:8080/review/api
output:
{"rows":[{"id":1,"name":"Critical"},{"id":2,"name":"Error"},{"id":3,"name":"Warning"},{"id":4,"name":"Review"}],"columns":[{"id":-1,"name":"Existing"},{"id":-1,"name":"Fixed"},{"id":-1,"name":"New"}],"data":[[10,5,2],[20,6,1],[45,11,3],[1112,78,23]],"warnings":[]}
The WebAPI cookbook documentation has an example of using python with the report action and processing and formatting the response.
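As a minimal sketch of that approach (the server URL, account, project and ltoken value are placeholders), you can total the counts straight from the report response:

import requests

URL = "http://localhost:8080/review/api"   # your Klocwork server
payload = {
    "action": "report",
    "user": "my_account",
    "project": "my_project",
    "x": "Category",
    "y": "Component",
    "filterQuery": "status:Analyze state:New,Existing",
    "ltoken": "xxxx",                      # value from your ltoken file
}

report = requests.post(URL, data=payload).json()
# "data" is a matrix of counts (one row per x value), so the total
# number of open issues is just the grand sum.
total = sum(sum(row) for row in report["data"])
print("Total open issues:", total)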

Problems with GitHub rendering my README.rst incorrectly?

I've got a GitHub repo/branch where I'm attempting to update the README.rst, but it's not formatting the way I expect when it comes to the bullet lists I'm including.
Everything seems ok except for my Usage section, in which I have the following:
*****
Usage
*****
- Open the template file that corresponds to the API call you'd like to make.
* Example: If we want to make a call to the RefundTransaction API we open up /templates/RefundTransaction.php
- You may leave the file here, or save this file to the location on your web server where you'd like this call to be made.
* I like to save the files to a separate location and keep the ones included with the library as empty templates.
* Note that you can also copy/paste the template code into your own file(s).
- Each template file includes PHP arrays for every parameter available to that particular API. Simply fill in the array parameters with your own dynamic (or static) data. This data may come from:
* Session Variables
* General Variables
* Database Recordsets
* Static Values
* Etc.
- When you run the file you will get a $PayPalResult array that consists of all the response parameters from PayPal, original request parameters sent to PayPal, and raw request/response info for troubleshooting.
* You may refer to the `PayPal API Reference Guide <https://developer.paypal.com/webapps/developer/docs/classic/api/>`_ for details about what response parameters you can expect to get back from any successful API request.
+ Example: When working with RefundTransaction, I can see that PayPal will return a REFUNDTRANSACTIONID, FEEREFUNDAMT, etc. As such, I know that those values will be included in $PayPalResult['REFUNDTRANSACTIONID'] and $PayPalResult['FEEREFUNDAMT'] respectively.
- If errors occur they will be available in $PayPalResult['ERRORS']
You may refer to this `overview video <http://www.angelleye.com/overview-of-php-class-library-for-paypal/>`_ of how to use the library,
and there are also samples provided in the /samples directory as well as blank templates ready to use under /templates.
You may `contact me directly <http://www.angelleye.com/contact-us/>`_ if you need additional help getting started. I offer 30 min of free training for using this library,
which is generally plenty to get you up-and-running.
For some reason, though, when you look at that on GitHub, the first line of each bullet list comes up bold and italic, and I have no idea why. Also, the sub-list showing Session Variables, General Variables, etc. is supposed to be a single sub-list; I'm not sure why it drops into another sub-level when it hits General Variables.
Any information on what I've done wrong here would be greatly appreciated. Thanks!
Switch from .rst to .md and then use '#' for your headings.
## Usage

- Open the template file that corresponds to the API call you'd like to make.
  * Example