Yandex caught scraping Google SEO code

As TechRadar Pro reported earlier in January 2023, a former Yandex employee with a “political” motive has allegedly leaked a wide-ranging repository of source code for many of the web portal’s products, potentially shedding light on the dark art of search engine optimization.

BleepingComputer reports that the employee leaked Git sources totalling 44.7GB of files, obtained in July 2022 and containing “all of” Yandex’s source code except for its anti-spam rules.

While the raw source code won’t be of interest to everyone, Search Engine Land's report that 17,854 search ranking factors have been uncovered as part of the leak should interest any person, business or publication looking to see their pages ranked highly in search engines.

Yandex leak SEO insights

A partial list of ranking factors used by the Yandex search engine, taken from one file in the codebase and shared by Martin MacDonald, CEO of SEO consultancy MOG Media, does shed some light on the aspects of copy that Yandex gives weight to.

Per Russian Search News, these include PageRank and several aspects of links such as age and relevancy, the perceived relevance of copy, host reliability, and innate preferences towards specific sites with perceived authority, such as Wikipedia.

A longer, more technical deep dive by Search Engine Land shows that these factors also include a “NEWS_AGENCY_RATING”, allowing Yandex’s search engine to show preference to certain news organizations.

Others include the number of unique visitors, percentages of organic traffic, and average domain rankings across queries.

However, it’s perhaps melodramatic, or a little desolate, for MacDonald to describe it as “the most interesting thing to have happened in SEO in years.”

While the leaked codebase certainly offers a raft of insights, it’s worth noting that many websites will be looking to rank well on Google over Yandex, purely because the former is far better known. 

Both companies have shared web engineers over the years, Yandex does use many of Google’s open source technologies, such as TensorFlow and BERT, and references to Google data appear in the leaked codebase.

Search Engine Land’s deep dive argues that the Yandex leak can give general insight into the anatomy of a modern search engine. However, per Russian Search News, many of Yandex’s leaked search ranking factors go unused or are officially considered deprecated.

Even the technical deep dive admits that many known aspects of Google’s search engine, such as its crawler and index systems, differ from Yandex’s.

All of this, combined with the age of the leaked codebase, makes it unclear how well assumptions about how Yandex and Google rank pages will hold up.


