What Are Google Search Essentials?
Google categorizes the Search Essentials into three main areas:
- Technical requirements
- Spam policies
- Key best practices
We will now address these sections one after another.
Technical Requirements
Technical requirements refer to the minimum settings and configurations Google requires to crawl, index, and serve your content on search results pages. These requirements are basic, and many sites meet them without requiring any additional work.
The technical requirements listed in Google Search Essentials include:
- Ensure Googlebot can access your content
- Ensure your pages return an HTTP 200 OK status code
- Ensure your pages contain indexable content
1 Ensure Googlebot Can Access Your Content
Googlebot is the web crawler Google uses to discover and crawl pages on the web.
If you want your content to appear on Google’s search results pages, you must ensure that Googlebot can access and crawl it. Content that cannot be crawled cannot be indexed, and content that is not indexed cannot be served on search results pages.
By default, Google will not crawl pages that require the visitor to log in or enter a password. Googlebot can also be blocked from crawling your content by your robots.txt file, or kept from indexing it by a noindex meta tag.
So, if you want a page or content to appear in search results:
- Ensure the page does not require a password
- Ensure the page does not require the user to log in
- Ensure the page is accessible to anyone on the web
- Ensure the page is not blocked using the robots.txt file
- Ensure the page is not set to noindex
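For instance, a stray robots.txt rule or a leftover noindex tag is enough to keep a page out of Google’s index. The sketches below show what each looks like; the /private/ path is a placeholder, so check your own robots.txt file and page templates for rules like these:

```
# robots.txt — this rule blocks Googlebot from crawling anything under /private/
User-agent: Googlebot
Disallow: /private/
```

```html
<!-- A robots meta tag in the page <head>; "noindex" keeps the page out of Google's index -->
<meta name="robots" content="noindex">
```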
2 Ensure Your Pages Return an HTTP 200 OK Status Code
The 200 OK status code is the HTTP response code indicating a request was successful. In this case, it means the requested webpage is available, and Googlebot can access it.
Google only indexes pages that return a 200 OK status code. It will not index pages that return a client-side or server-side error. These errors are typically 4xx and 5xx series status codes, such as 404 Not Found, 410 Gone, 500 Internal Server Error, and 502 Bad Gateway.
You should inspect your pages to ensure they return a 200 OK response code and not a 4xx or 5xx error. Note that 5xx errors can occur intermittently, so Googlebot may encounter them even when you do not.
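A quick way to spot-check a page is with curl: the -I flag sends a HEAD request and prints the response status line and headers. The URL below is a placeholder:

```
# The first line of output shows the status, e.g. "HTTP/1.1 200 OK" or "HTTP/2 200"
curl -I https://www.example.com/some-page
```

For pages Google has already crawled, the URL Inspection tool in Google Search Console will also report the status code Googlebot received.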
3 Ensure Your Pages Contain Indexable Content
Google only indexes content that is not spammy and is published in a supported file format. Googlebot can crawl many file formats, and you typically would not have any issues if you stick to common file formats.
For reference, here are the main file formats that Googlebot currently supports:
Image file formats
- PNG (Portable Network Graphics)
- JPEG (Joint Photographic Experts Group)
- GIF (Graphics Interchange Format)
- SVG (Scalable Vector Graphics)
- WebP (Web Picture format)
- BMP (Bitmap Image File)
Video file formats
- MP4 (MPEG-4)
- MOV (QuickTime)
- AVI (Audio Video Interleave)
- MPEG (Moving Picture Experts Group)
- WebM
- WMV (Windows Media Video)
- MKV (Matroska)
- 3GP (3GPP)
- OGV (Ogg Video)
Document file formats
- .csv (comma-separated values)
- .doc and .docx (Microsoft Word)
- .epub (electronic publication)
- .htm, .html, and related formats (HyperText markup language)
- .hwp (Hancom Hanword)
- .odp (OpenOffice presentation)
- .ods (OpenOffice spreadsheet)
- .odt (OpenOffice text)
- .pdf (Adobe portable document format)
- .ppt, .pptx (Microsoft PowerPoint)
- .ps (Adobe PostScript)
- .rtf (rich text format)
- .svg (scalable vector graphics)
- .tex (TeX/LaTeX)
- .txt, .text, and related formats (text)
- .wml and .wap (Wireless markup language)
- .xls and .xlsx (Microsoft Excel)
- .xml (Extensible markup language)
Geographic file formats
- .gpx (GPS exchange format)
- .kml and .kmz (Google Earth)
Source code file formats
- .bas (Basic source code)
- .c, .cc, .cpp, .cxx, .h, and .hpp (C and C++ source codes)
- .cs (C# source code)
- .java (Java source code)
- .pl (Perl source code)
- .py (Python source code)
Spam Policies
The spam policies outline the manipulative practices you must avoid to keep your content eligible to appear in Google’s search results.
Sites that engage in the spam practices listed in this section will have their rankings demoted or their content removed from search results pages altogether.
Google has automated systems that identify spammy content and practices. It may also have human reviewers examine a site and apply a manual action penalty.
The spam content and practices mentioned in Google Search Essentials include:
1 Cloaking
Cloaking is a deceptive SEO technique in which a site shows different content to search engines and human visitors. Typically, the site displays optimized content to search engines while delivering different and often less relevant content to visitors.
Examples of cloaking include:
- A website shows an optimized, keyword-rich article to search engines, but displays an entirely different low-quality promotional landing page to visitors
- A website serves search engine crawlers a page of relevant, helpful content while showing visitors an unrelated page filled with ads or affiliate links
2 Doorway Pages
Doorway pages are webpages created specifically to rank for a search query and funnel visitors to a different page. These pages contain little to no helpful content and exist solely to drive visitors to another page.
Examples of doorway pages include:
- An online store creates pages optimized for keywords like “cheap running shoes for men” and “best running shoes for flat feet,” but they all redirect visitors to the “Running shoes” category page of the store
- A plumbing service company creates city-specific pages optimized for local keywords like “plumber in New York” and “plumber in Chicago,” but all pages redirect users to the same service page
3 Expired Domain Abuse
Expired domain abuse involves acquiring expired domains and then redirecting the backlinks pointing to the acquired domain to your site. This practice leverages the expired domain’s authority and rankings without requiring any additional work from the domain’s new owner.
Examples of expired domain abuse include:
- Purchasing an expired domain that previously belonged to a government agency and redirecting the traffic to an unrelated e-commerce site
- Purchasing an expired domain that once hosted a high-quality blog and repurposing it to host low-quality, ad-filled content
4 Hacked Content
Hacked content refers to malicious or spammy content inserted into compromised sites. Hackers can use this content to distribute malware or display harmful and irrelevant content to visitors.
Examples of hacked content include:
- Spammy ads for illegal products on an existing or new page on a hacked site
- Code that automatically redirects visitors of a hacked site to spammy or harmful pages
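If you are cleaning up a hacked site, injected redirect code often looks something like the simplified sketch below (the spam URL is a placeholder); look for unfamiliar script tags like this in your page templates:

```html
<!-- Injected script that silently sends every visitor to a spam domain -->
<script>
  window.location.replace("https://spam-site.example/offer");
</script>
```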
5 Hidden Links and Text
Hidden links and text involve using content or hyperlinks that are invisible to users but visible to search engines. The hidden texts are typically keywords the page wants to rank for, while the hyperlinks link to unrelated pages or sites.
Examples of hidden links and texts include:
- A site places links to various affiliate products in a 1-pixel by 1-pixel image that is invisible to visitors
- A site includes blocks of keywords on a webpage, but changes their font color to white so that they blend into the white background
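When auditing a page for hidden text, patterns like the following simplified sketches are what to look for (the keywords and URL are placeholders):

```html
<!-- Keyword block hidden by matching the font color to the page background -->
<p style="color:#ffffff; background-color:#ffffff;">
  cheap running shoes best running shoes discount running shoes
</p>

<!-- Link hidden by positioning its container far off-screen -->
<div style="position:absolute; left:-9999px;">
  <a href="https://affiliate-offer.example/">hidden affiliate link</a>
</div>
```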
6 Keyword Stuffing
Keyword stuffing is the practice of overloading a page’s textual content or meta tags with unnecessary keywords and phrases. The keywords typically cause sentences and paragraphs to appear unnatural and, in extreme cases, incomprehensible.
Examples of keyword stuffing include:
- An e-commerce website that repeats the same set of phrases in a product description
- A local business site containing blocks of text listing the cities and areas the business wants to rank for
7 Link Schemes
Link schemes refer to any practice designed to manipulate the links pointing to or from a site. This includes link spam, link farms, hidden links, paid links, spammy link directories, and automated links.
Examples of link schemes include:
- A blogger buys links from high-traffic sites and blogs in an unrelated niche
- A blogger publishes hundreds of comments on various blog posts and forums with links pointing back to their site
8 Machine-Generated Traffic
Machine-generated traffic refers to visits from non-human sources created using automated scripts or bots. This traffic is used to skew analytics data, inflate site visits, and manipulate search rankings.
Examples of machine-generated traffic include:
- A blog uses automated scripts to click on ads repeatedly, creating artificial impressions and clicks
- A website operator uses bots to generate thousands of fake pageviews to make the site appear more popular than it is
9 Malware and Malicious Behaviors
Malware is malicious software designed to harm, exploit, or compromise a network, system, or device. It can steal data, damage systems, or allow unauthorized access to the visitor’s device.
Examples of malware and malicious behaviors include:
- Software that alters the visitor’s browser settings without their knowledge, input, or permission
- A free application that installs a keylogger on a visitor’s computer and steals their sensitive information
10 Misleading Functionality
Misleading functionality refers to websites that deceive visitors into believing they will be able to access a specific function, piece of content, or service. In reality, the site does not offer that function, content, or service, and instead redirects visitors to other content.
Examples of misleading functionality include:
- A site claims to offer a free PDF converter tool, but instead redirects users to a page filled with ads
- An SEO tool promises to provide detailed keyword analysis, but secretly installs malware that changes the visitor’s browser homepage
11 Scaled Content Abuse
Scaled content abuse refers to the mass production and distribution of low-quality, duplicate, or near-duplicate content across multiple pages or sites. The content may be generated by AI, humans, or a combination of AI and humans.
Examples of scaled content abuse include:
- An SEO firm creates hundreds of nearly identical pages targeting every city and state combination in the country, with minimal variations in content
- An online store generates thousands of product pages using the same basic template and only slightly altering product descriptions to create the appearance of a vast inventory
12 Scraped Content
Scraped content is material copied from other websites without any helpful content added to it. It is typically copied without permission or proper attribution to the original publishing site.
Examples of scraped content include:
- A site changes a few words in the content originally published by another site and passes it off as its own
- A site copies entire articles from other reputable sites and republishes them without permission, without citing the source, and without adding any new value
13 Sneaky Redirects
Sneaky redirects refer to sending users to a URL other than the one they initially requested. In most cases, the search engine is shown a high-quality page, while human visitors are redirected to a low-quality, spammy page.
Examples of sneaky redirects include:
- A page appears to be a blog post about health tips, but it redirects visitors to a spammy online pharmacy
- A page directs search engines and desktop visitors to a legitimate-looking e-commerce site but redirects mobile users to an unrelated site selling questionable supplements
14 Site Reputation Abuse
Site reputation abuse refers to situations in which a site publishes content from a third-party partner or advertiser without any direct involvement in creating that content. The content typically has little value to visitors and is intended to capitalize on the host site’s rankings.
Examples of site reputation abuse include:
- A medical site hosting third-party content advertising a casino
- A fitness site hosting affiliate content about “best workout supplements,” written entirely by a partner without any input or oversight from the site itself
15 Thin Affiliate Pages
Thin affiliate pages are pages that contain affiliate links but whose product descriptions and reviews are copied directly from the original merchant site, with no original input from the site publishing them.
Examples of thin affiliate pages include:
- A page that advertises an affiliate product, but with little to no unique content or genuine reviews about the product
- A travel blog that links to affiliate booking sites but provides little or no helpful information about the travel destination
16 User-Generated Spam
User-generated spam refers to spammy content published by visitors, often without the site owners’ knowledge. The spammy content could be published to the site’s comment section, hosting service, or file hosting platform.
Examples of user-generated spam include:
- An online forum filled with spammy posts and comments that contain irrelevant links to third-party sites
- A free web hosting platform that allows anyone to register a site, but does not moderate the content published to those sites
Google Search Essentials lists other content and behaviors that could cause Google to demote a site even when the content is not considered spam. Some include:
- Sites with lots of copyright removal requests
- Sites that publish defamatory content
- Sites that deal in counterfeit items
- Sites that engage in fraudulent activities
- Sites that impersonate a legitimate business
- Sites with too many legal removal requests issued against them
- Sites that request compensation to remove personal information from the web
- Sites that receive many requests to remove personal information from the web
- Site owners who repeatedly create spammy sites that violate Google’s policies
Key Best Practices
The key best practices outline suggestions for improving your chances of ranking in Google search results. The key best practices listed in Google Search Essentials are:
1 Create Helpful Content
Google recommends creating content for people rather than for search engines. This sort of content is helpful and genuinely addresses visitors’ needs and queries.
Helpful content is well-researched, comprehensive, and easy to read. It engages the audience, provides unique insights into a topic, and demonstrates the author’s expertise and authority on the topic.
2 Include Keywords in Your Content
Google recommends including relevant words and phrases that your target audience is likely to search for in your content. You should place these keywords at strategic locations, such as in your title, main headings, alt text, link text, and within the content.
This helps Google understand the topic and context of the content, making it more likely to appear in relevant search results. However, ensure the keywords flow naturally within your content; forcing them in can read as unnatural and be perceived as keyword stuffing.
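As a rough illustration, here is how a hypothetical target phrase like “running shoes for flat feet” might naturally appear in the strategic locations mentioned above:

```html
<head>
  <!-- Keyword in the page title -->
  <title>The Best Running Shoes for Flat Feet</title>
</head>
<body>
  <!-- Keyword in the main heading -->
  <h1>Running Shoes for Flat Feet: A Buyer’s Guide</h1>

  <!-- Keyword in image alt text -->
  <img src="stability-running-shoe.jpg" alt="Stability running shoe designed for flat feet">

  <!-- Keyword in descriptive link text -->
  <p>See <a href="/reviews/stability-shoes">our reviews of running shoes for flat feet</a>.</p>
</body>
```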
3 Include Internal Links in Your Content
Google recommends including internal links in your content. These links help Google find other content on your site, understand your site structure, and understand the relationships between different pages on your site.
These links also help visitors navigate your site, discover related content, and stay engaged for longer periods. However, you should ensure the internal links to the content you want in search results are crawlable.
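In practice, “crawlable” means the link is a standard anchor tag with a resolvable href attribute; Google may not follow navigation that happens only in JavaScript. A minimal sketch of the difference:

```html
<!-- Crawlable: a standard <a> element with an href Googlebot can follow -->
<a href="/guides/technical-seo">Read our technical SEO guide</a>

<!-- Not reliably crawlable: no href, the navigation exists only in a click handler -->
<span onclick="window.location='/guides/technical-seo'">Read our technical SEO guide</span>
```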
4 Inform Others About Your Site
Google recommends that you inform people about your site. Stay actively engaged in the online and offline communities where you can tell like-minded people about the products and services you cover on your site.
5 Include Structured Data in Your Content
Google recommends that you include structured data in your content. This helps Google to understand the context and details of the information presented in your content.
Structured data enhances your search engine visibility and is a requirement for content to be eligible for rich results.
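Structured data is most commonly added as a JSON-LD script using schema.org vocabulary. Here is a minimal sketch for a blog article; the author name and date are placeholders:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What Are Google Search Essentials?",
  "author": { "@type": "Person", "name": "Jane Doe" },
  "datePublished": "2024-01-01"
}
</script>
```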
6 Prevent Google From Indexing Content You Don’t Want in Search Results
Google recommends you block Googlebot from accessing content you do not want on search results pages.
By default, Googlebot will not index content that requires a password. You can also keep content out of the index with the noindex meta tag, or prevent Googlebot from crawling it altogether by disallowing it in your robots.txt file. Note that Googlebot must be able to crawl a page to see its noindex tag, so do not block a page in robots.txt if you are relying on noindex to keep it out of search results.
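For non-HTML files such as PDFs, where a meta tag is not an option, the same noindex signal can be sent as an HTTP response header instead. A sketch for a site running Apache with mod_headers enabled:

```
# .htaccess — sends a noindex header for every PDF so they stay out of search results
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
```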
7 Follow Image, Video, Structured Data, and JavaScript Best Practices
If your site uses images, video, structured data, or JavaScript, follow the relevant best practices to ensure Google can understand that content.
For example, your images should be properly formatted, optimized, and published in a supported format. They should also have descriptive titles, file names, and alt text.
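As a simple sketch, an image that follows these practices might be marked up like this (the file name and text are illustrative):

```html
<!-- Descriptive file name, explicit dimensions, and meaningful alt text -->
<img src="/images/golden-retriever-puppy-fetch.jpg"
     alt="Golden retriever puppy playing fetch in a park"
     width="800" height="600">
```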