Google Declined To Comment On A Possibly Huge Leak Of Its Search Algorithm Documentation

Google’s search algorithm is likely the most important mechanism on the internet, determining which sites survive and which die, and shaping what online content looks like.

However, how Google ranks websites has long been a mystery, with journalists, scholars, and search engine optimization professionals piecing together the details.

Now, an enormous leak that purports to contain thousands of pages of internal documents appears to provide an unprecedented view behind the scenes of how Search works, and to show that Google hasn’t been totally open about it for years. So far, Google has not responded to several requests for comment on the documents’ validity.

Rand Fishkin, who has worked in SEO for over a decade, says that a source shared 2,500 pages of documents with him in the hope that reporting on the leak would counter the “lies” shared by Google employees about how the search algorithm works. According to Fishkin, the documents detail Google’s search API and explain what information employees have access to.

Fishkin’s details are deep and technical, and are likely to be more understandable to developers and SEO professionals than the average person. The leak’s contents are not necessarily proof that Google employs the precise data and signals mentioned for search rankings. Rather, the leak reveals what data Google obtains from webpages, sites, and searches, as well as providing SEO experts with indirect suggestions about what Google appears to care about, according to SEO expert Mike King’s review of the documents.

The leaked documents cover a variety of issues, including how Google collects and uses data, which sites Google prioritizes for sensitive topics such as elections, how Google handles small websites, and more. According to Fishkin and King, some of the information in the documents appears to contradict public statements by Google employees.

“‘Lied’ is harsh, but it’s the only accurate word to use here,” writes King. “While I don’t necessarily fault Google’s public representatives for protecting their proprietary information, I do take issue with their efforts to actively discredit people in the marketing, tech, and journalism worlds who have presented reproducible discoveries.”

Google has not responded to The Verge’s requests for comment on the documents, including a direct request to deny their authenticity. Fishkin told The Verge via email that the company has not contested the authenticity of the leak, but that an employee asked him to adjust some phrasing in the post about how an event was described.

Google’s proprietary search algorithm has spawned an entire industry of marketers who rigorously follow Google’s public guidance and implement it for millions of businesses around the world. These pervasive, often annoying practices have led to a widespread perception that Google Search results are becoming increasingly clogged with junk that website operators believe they must produce in order to have their sites seen. In response to The Verge’s previous reports on SEO-driven methods, Google staff frequently use the same defense: that’s not what the Google guidelines say.

However, several elements in the released documents throw into question Google’s public pronouncements about how Search works.

One example Fishkin and King cite is whether Google Chrome data is used in ranking at all. Google has consistently stated that it does not use Chrome data to rank pages, but Chrome is explicitly mentioned in sections about how websites appear in Search. According to the documents, the links that appear underneath the main vogue.com URL in search results may have been generated in part using Chrome data.

Another question is how, if at all, E-E-A-T influences ranking. E-E-A-T, which stands for experience, expertise, authoritativeness, and trustworthiness, is a Google metric used to assess the quality of results. Google personnel have previously stated that E-E-A-T is not a ranking factor. Fishkin observes that he hasn’t found many documents that mention E-E-A-T by name.

However, King described how Google appears to collect author data from a page, including a field for determining whether an entity on the page is the author. According to one of the documents released by King, the field was “mainly developed and tuned for news articles… but is also populated for other content (e.g., scientific articles).” Though this does not establish that bylines constitute an explicit ranking factor, it does demonstrate that Google is keeping track of this property. Google personnel have previously stated that author bylines are something website owners should do for their readers, not Google, because they have no impact on results.

Though the records aren’t precisely smoking guns, they do provide a detailed, unfiltered glimpse at a closely guarded black box system. The US government’s antitrust action against Google, which centered upon Search, has also resulted in internal documentation becoming public, providing additional insights into how the company’s core product works.

Google’s general ambiguity about how Search works has resulted in websites that appear identical as SEO marketers attempt to beat Google using cues provided by the business. Fishkin also criticizes publications that credulously support Google’s public claims as true without much further investigation.

“Historically, some of the search industry’s most prominent voices and prolific publications have been content to uncritically regurgitate Google’s public remarks. They use headlines like ‘Google says XYZ is true,’ rather than ‘Google claims XYZ; evidence suggests otherwise,’” Fishkin adds. “Please do better. If this leak and the DOJ trial can bring about just one reform, I hope it’s this.”
