robots.txt and Google spyware - robots.txt

On my website I have a robots.txt file in which I list some pages that I don't want Google to index.
Chrome and the Google Toolbar send information about the pages I visit.
I read somewhere that Google will index pages that I have blocked in robots.txt.
Is that true?
Where can I read more about it?

http://blog.ineedhits.com/tips-advice/why-google-can-index-your-robotstxt-blocked-urls-12346715.html
It seems it wasn't Google Chrome or the Toolbar that got the pages into the index. The problem was that external links point to the blocked pages.
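A robots.txt rule controls whether a compliant crawler may fetch a URL, which is a separate matter from whether the URL gets listed because other sites link to it. As a minimal sketch of the crawling side only, Python's standard `urllib.robotparser` can check a rule set against a URL. The robots.txt contents, the `is_crawl_allowed` helper, and the example.com URLs below are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/private/report.html",
                "https://example.com/public/index.html"):
        print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

A result of `False` here only means the crawler should not fetch the page; it does not tell you whether the URL can still show up in search results.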

Related

Why does my mobile site detect the Google crawler as desktop and switch it to the desktop version?

I am struggling to get my mobile site (under the m.xxxx.xxx.com subdomain) crawled and indexed by Google via Google Search Console and its "URL Inspection" tool.
The problem is that my mobile site redirects desktop clients to the desktop version of the site.
Googlebot appears to my site as a desktop device and is redirected to the desktop version.
Is there a way to request mobile-only indexing via the sitemap or robots.txt?
Thanks.
Oded.

Google Ads has disapproved our website due to malicious or unwanted external links but we couldn't find the mentioned link (debysale.com) anywhere

We run our website on WordPress with BeTheme. We are trying to put AdSense ads on our website. For that we contacted the Google team, but on their end they found malicious or unwanted external links, and they keep disapproving our website because of them. We previously had malware, which we removed recently. After that, we tried Wordfence, the Google transparency site scan, secure wp, and various other website scanners, but found no malware or malicious external links.
The malicious link Google mentioned: debysale[.]com
How can we find and get rid of this malicious link? If anyone could assist, that would be very helpful.
I am attaching Google's reply.
For reference, our website is https://rkpl.com.bd/
Open the page in Google Chrome, right-click on it and choose View Page Source, press Ctrl+F, and search for the link. Do this for each page on your website.
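Checking every page by hand gets tedious, so the same search can be scripted. Below is a minimal Python sketch that looks for a domain in HTML source you have already saved or fetched. The function name `find_domain_lines` and the sample markup (including the `/x.js` path) are hypothetical. A plain text search like this will miss links that are obfuscated or injected by scripts at load time, so an empty result is not proof that the page is clean.

```python
import re

# Hypothetical page source, for illustration only.
SAMPLE_HTML = """<html>
<body>
<a href="https://example.com/">Home</a>
<script src="https://DebySale.com/x.js"></script>
</body>
</html>"""


def find_domain_lines(html: str, domain: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line that mentions domain.

    The match is case-insensitive and treats the domain as literal text.
    """
    pattern = re.compile(re.escape(domain), re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(html.splitlines(), start=1)
        if pattern.search(line)
    ]


if __name__ == "__main__":
    for number, line in find_domain_lines(SAMPLE_HTML, "debysale.com"):
        print(f"line {number}: {line}")
```

On a WordPress site it may be worth running the same search over theme and plugin files and a database export as well, since an injected link does not have to live in the page templates.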

Website not indexed/scraped/crawled via Bing Custom Search

I want to create a multi-site search function using Bing Custom Search. I have created a search instance on https://www.customsearch.ai/ using my company-provided Microsoft account.
After adding my company's main www website (www.mysite.com) the search works, in the try-it-out tool and when going via the v7 API.
When I add a subsite, e.g. `mysubsite.mysite.com`, however, it does not crawl or display search results from that site.
I have tried:
Allowing subpages for mysite.com
Specifying protocol, e.g. HTTPS
Waiting for a day or two
What can be the problem? Sure, the subsite is not publicly released yet (or announced I mean), but it is accessible by everyone with an Internet connection and a web browser. How come Bing Custom Search does not find it when I tell it the exact address?
Thank you in advance.
If your subsite/pages are not crawled or indexed, you can check out the webmaster info on this page: https://learn.microsoft.com/en-us/azure/cognitive-services/bing-custom-search/define-your-custom-view. Search for webmaster documentation on this page.

How do I change robots.txt on a Google site?

I need to make changes to someone's robots.txt file, but their site is managed by Google (so no FTPing).
I have full access to the site via the browser (Site Actions / Manage Site, etc) but how do I get to the robots.txt file to update it?

Crawl Errors in Google Webmaster Tools

I have a selection of crawl errors in my Google Webmaster Tools for links that no longer exist on my site. These are the result of an old pharmacy hack that linked to PDFs. The PDFs were all removed months ago, but external sites are still linking to these pages, which is causing the crawl errors.
Is there a way to alert Google that these links are fake/spam?
There is a special page where Google allows reporting of various spam pages; you should check it:
https://www.google.com/webmasters/tools/spamreport?hl=en&pli=1