Google text search vs. metadata search - metadata

I'm very interested in search engines.
Today in a talk I heard that Google performs a text search, while more complex engines could rely on metadata, which Google apparently does not use much.
What is the difference between text search and metadata search?
Could you provide some links where I can read more on this subject?

Metadata is 100% text.
The reason Google doesn't rely on it is that people tend to lie about their content (not necessarily on purpose).
Now, what Google doesn't use is the keywords meta tag (although they may check it to see whether you're a liar...). They do use the other meta tags.
I just wrote a long list of meta tags supported by many systems. I still need to add many more, but of those already listed, og:image, description and a few others are very useful.
http://snapwebsites.org/implementation/feature-requirements/layout-feature-core/meta-tags-and-links-supported-core
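To see which meta tags a page actually exposes, something like the following can help. It is only a minimal sketch using Python's standard html.parser; the sample HTML and the names MetaTagCollector and extract_meta are made up for illustration, and it says nothing about how any search engine weighs these tags.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta> tags keyed by their name= or property= attribute."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Classic tags use name=, Open Graph tags use property=.
        key = attr.get("name") or attr.get("property")
        if key and "content" in attr:
            self.meta[key.lower()] = attr["content"]


# Hypothetical page head, invented for this example.
SAMPLE_HTML = """
<html><head>
<title>Example</title>
<meta name="description" content="A short summary of the page.">
<meta name="keywords" content="cheap, best, free">
<meta property="og:image" content="https://example.com/cover.png">
</head><body></body></html>
"""


def extract_meta(html):
    """Return a dict of meta tag name -> content for the given HTML."""
    parser = MetaTagCollector()
    parser.feed(html)
    return parser.meta


if __name__ == "__main__":
    for name, content in extract_meta(SAMPLE_HTML).items():
        print(f"{name}: {content}")
```

The keywords tag shows up in the output like any other, which is the point: it is plain text that the page author controls, so a consumer has to decide how far to trust it.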

Related

Why would "Disallow: /*?s=" be used in a robots.txt file?

We got a notice from Google's Search Console that one of our blog posts couldn't be crawled. When inspecting the URL in Google Search Console, it reports that the page was blocked by the following line in our robots.txt file.
Disallow: /*?s=
Why would "Disallow: /*?s=" be used? Why worry about URLs that contain the letter "s"? If we remove it, what's the risk? Thanks so much in advance for any additional insight that can be shared - P
The "?s=" query string is the search parameter commonly used on WordPress-based sites, so this rule keeps crawlers away from internal search result pages. It only matches URLs whose query string starts with "s=", not every URL that contains the letter "s".
There may be several types of content on your site, and the site builder may have wanted search to be available only for certain types of content, through another way of searching.
This makes sense, for example, on a store site that wants to keep users from finding products through a customized search form, so that they do not wander behind the scenes of the site.
Google's robot has a number of ways to identify whether a site is WordPress-based, which is probably why it requests URLs ending in that query string.
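To check which of your URLs the rule would catch, here is a rough sketch of the wildcard matching that Google documents for robots.txt ("*" matches any run of characters, a trailing "$" anchors the end, everything else is a prefix match on path plus query string). The function names and example URLs are assumptions for illustration, and the sketch ignores Allow rules, user-agent groups and rule precedence.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule):
    """Translate a robots.txt path rule with Google-style wildcards."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url, rule="/*?s="):
    """Return True if the URL's path and query string match the Disallow rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return rule_to_regex(rule).match(target) is not None


if __name__ == "__main__":
    # Hypothetical URLs, not taken from the site in question.
    for url in [
        "https://example.com/?s=shoes",
        "https://example.com/blog/my-post/",
        "https://example.com/blog/my-post/?s=1",
        "https://example.com/page?sort=asc",
    ]:
        print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

If the blog post reported by Search Console is linked with a "?s=" query string, that variant of the URL is what the rule blocks; the same post without the query string would not match.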

Google requiring certain properties for certain types when adding Schema.org markup?

I tried to add Schema.org markup to my site, using the type Article with a few properties. When I checked it in Google's Structured Data Testing Tool, it said that certain properties were required, such as datePublished, author, etc.
I can add some properties to meet the requirement, but not all of them. Is this requirement real? I mean, is it really required by Schema.org, or is it just Google's rule? I came across this page https://developers.google.com/search/docs/data-types/articles which says that for non-AMP pages those properties are only optional (ignored or recommended; none of them is marked required for non-AMP).
This confuses me. Does anyone know about this, and what's your opinion? Does Google's Structured Data Testing Tool already include the AMP requirements?
These are required/recommended only for getting one of Google’s search features; Schema.org itself doesn't define any property as required.
If you don’t want that Google search feature, or if you can’t provide all necessary properties, you can keep everything like it is and ignore the errors and warnings.
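As an illustration of providing only the properties you have, here is a small sketch that builds Article markup and simply leaves out anything that is missing. It uses JSON-LD, which is an assumption on my part (the question doesn't say which syntax is in use), and the function names and sample values are invented. The property names (headline, datePublished, author, image) are the Schema.org ones.

```python
import json


def article_jsonld(headline, date_published=None, author_name=None, image_url=None):
    """Build a Schema.org Article dict, omitting properties that are unknown."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
    }
    if date_published:
        data["datePublished"] = date_published
    if author_name:
        data["author"] = {"@type": "Person", "name": author_name}
    if image_url:
        data["image"] = image_url
    return data


def to_script_tag(data):
    """Wrap the JSON-LD in the script element that goes into the page."""
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"


if __name__ == "__main__":
    # Hypothetical article with no author or image available.
    print(to_script_tag(article_jsonld("Hello world", date_published="2024-01-15")))
```

Markup like this is still valid Schema.org; a testing tool may report the missing properties, which only matters if you want the corresponding search feature.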
Related answers
Should Schema.org dateModified have some default value if not available?
Schema.org/Microdata markup for list of recent posts without providing “author” / “publisher”?
Do I have to create new visible elements to abide by Google's Microdata Schema.org requirements?
Omitting price property for sold products?
Use Schema.org for Article without image property?
Image missing and required - Wordpress AMP Structure doesn't add Image attribute
On Webmasters SE:
Schema.org BlogPosting and image required
Is it mandatory to have rich snippets for AMP pages?

How to tag the code of a website for structured data recognition by Google SEO?

We're just completing a new site build. With the current theme, we have had issues with structured data (we highlighted it in Webmaster Tools, had to re-highlight it weeks later, and even then the highlighting prediction is not where we would like it to be).
It seems like Google is not able to find our title, author, categories, content, featured image and date very easily. I'd expect to be able to communicate this to Google with 100% accuracy, since it's so simple and we use the same format for all our articles. So maybe our theme is missing tags, or something else in the code, to point to and identify this data?
Is that the case? Could someone please tell me what this aspect is called (so I can research it by its term), explain what I need to do with the new build, and point me in the direction of an authoritative explanation/tutorial?
The site in question is a WordPress site, but I am also working on some PHP sites and would like to use this information on all sites, if it can be applied this way.
Thanks
You can use Microdata to mark up the structured data. Google will also understand your site better if you describe everything about it in the markup: navigation, sidebar (aside), content (article) and so on. I suggest you read about schema.org and microformats.
Here is a useful article about your problem and how to implement microformats on your site.

Tags vs. categories for website content?

I am creating a site for electronics and programming projects and articles, and I'm trying to figure out whether to use categories, tags or both. I've been leaning towards just using tags, as it's done here on StackOverflow.
Seen from the perspective of the user, what provides the best user experience and makes the information easy and intuitive to find? I realize that this is very much a question of personal preference, but I am interested in hearing opinions.
Here is what I ended up doing: I implemented both categories and tags; a post can only have one category but multiple tags.
The category is used as part of the URL. This puts a keyword in the URL, which is good for SEO, and it makes the URLs more structured. The categories are selected from a drop-down menu, and they are required. Categories are type specific, meaning articles will probably not have the same categories as projects or images.
articles/foobar // Show all articles with the category foobar
articles/1/foobar/article_slug // View a specific article
Tags can be added and attached to a post simply by typing them separated by commas; they are used in the meta keywords field. I don't think that matters much for SEO, but they are available, so why not. Multiple tags can be attached to a post, but at least one is required. Tags are not type specific but universal, meaning that all resources may share the same tags. So a search for a tag may return articles, projects and images.
tags // Show all tags, and number of resources that use them
tags/foobar // Show all resources with the tag foobar
articles/tagged/foobar // Show all articles with the tag foobar
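For anyone wanting to reproduce this URL scheme, here is a rough sketch of how the paths above could be resolved. It is not the code from my site; the route names, the regular expressions and the resolve function are assumptions made up for this example.

```python
import re

# Order matters: the more specific patterns come before the generic ones.
ROUTES = [
    (re.compile(r"^articles/tagged/(?P<tag>[^/]+)$"), "articles_by_tag"),
    (
        re.compile(r"^articles/(?P<id>\d+)/(?P<category>[^/]+)/(?P<slug>[^/]+)$"),
        "article_detail",
    ),
    (re.compile(r"^articles/(?P<category>[^/]+)$"), "articles_by_category"),
    (re.compile(r"^tags/(?P<tag>[^/]+)$"), "resources_by_tag"),
    (re.compile(r"^tags$"), "tag_index"),
]


def resolve(path):
    """Return (route_name, params) for a path, or None if nothing matches."""
    path = path.strip("/")
    for pattern, name in ROUTES:
        match = pattern.match(path)
        if match:
            return name, match.groupdict()
    return None


if __name__ == "__main__":
    for path in [
        "articles/foobar",
        "articles/1/foobar/article_slug",
        "tags",
        "tags/foobar",
        "articles/tagged/foobar",
    ]:
        print(path, "->", resolve(path))
```

One thing this makes visible: "articles/tagged" on its own would be read as the category "tagged", so that word has to be reserved and not offered as a category name.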

Google Rich Snippets warnings for hCard

I get the following errors from the Google Rich Snippet Tool for my website http://iancrowther.co.uk/
hcard
Warning: This information will not appear as a rich snippet in search results results, because it seems to describe an organization. Google does not currently display organization information in rich snippets
Warning: At least one field must be set for Hcard.
Warning: Missing required field "name (fn)".
I'm experimenting with vCard and Schema.org and am wondering if I'm missing something or if the validator is playing up. I have added vCard and Schema.org markup to the body, which may be causing confusion. Also, I am making the assumption that I can use both methods to mark up my code.
Update:
I guess with the body tag, I'm just trying to let Google discover the elements which make up the schema object within the page. I'm not sure if this is a good / bad way to approach things? However it lets my markup be free of specific blocks of markup. I guess this is open to discussion but I like the idea of having a natural flow to the content that's decorated in the background. Do you think there is any negative impact? I'm undecided.
I am in favour of the Person structure; this was a good call, as it is more representative of the current site content. I am a freelance developer and as such use this page as my Organisation landing page, so I guess I have to make a firmer decision about the site's goals and tailor the content accordingly, i.e. Organisation or Person.
I understand that there are no immediate rich snippet gains, but I'm a web guy, so I have a keen interest in these kinds of things.
With schema testing, I find it easiest to start from the most obvious problem and work deeper from there. Note, I have zero experience with hCard, but I don't believe the error you mentioned actually has anything to do with your hCard properties.
The most obvious problem I see is that your body tag has an itemtype of schema.org/Organization. When you set an itemtype on a DOM element, you are saying that everything inside that element helps describe that itemtype. Since you've placed this on your body element, you are quite literally telling Google that your entire page is about an organization.
From the content of your page, I would recommend changing that itemtype to schema.org/Person. This would seem to be a more accurate description. Once you make that change and run the scanner again, you may see more errors relating to the schema, and we can work through those too (for example, you'll probably need to set familyName and givenName).
With all of that said, you should know that currently there are no rich snippets that you will gain from adding this schema data. Setting this up properly on your page is still a good thing to do, especially since we don't know what rich snippets Google or others will expose in the future, but for now you won't see any additional rich snippets in Google search results from adding these tags. I don't want to discourage you from setting this up properly; I just want to set your expectations.
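If you want a quick local check of what your page declares before running the scanner again, a sketch like the one below lists every itemscope element with its itemtype, plus the itemprop names found. The sample markup and the name "Jane Doe" are placeholders, not your real page, and the class name is invented. It does not attribute properties to nested items the way a real Microdata parser would.

```python
from html.parser import HTMLParser


class ItemTypeLister(HTMLParser):
    """Record which elements declare an itemtype and which itemprops appear."""

    def __init__(self):
        super().__init__()
        self.items = []  # (tag, itemtype) for each itemscope element
        self.props = []  # itemprop names in document order

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if "itemscope" in attr and attr.get("itemtype"):
            self.items.append((tag, attr["itemtype"]))
        if attr.get("itemprop"):
            # itemprop may hold several space-separated names.
            self.props.extend(attr["itemprop"].split())


# Hypothetical markup showing the suggested Person type on the body element.
SAMPLE_HTML = """
<body itemscope itemtype="https://schema.org/Person">
  <h1 itemprop="name">
    <span itemprop="givenName">Jane</span>
    <span itemprop="familyName">Doe</span>
  </h1>
  <p itemprop="jobTitle">Freelance developer</p>
</body>
"""

if __name__ == "__main__":
    lister = ItemTypeLister()
    lister.feed(SAMPLE_HTML)
    for tag, itemtype in lister.items:
        print(f"<{tag}> declares {itemtype}")
    print("itemprops:", ", ".join(lister.props))
```

If the first line of output still says Organization for the body element, the page is still telling Google that the whole document describes an organization.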