Yandex caught scraping Google SEO code

As TechRadar Pro reported earlier in January 2023, a former Yandex employee with a “political” motive has allegedly leaked a wide-ranging repository of source code for many of the web portal’s products, potentially shedding light on the dark art of search engine optimization.

BleepingComputer reports that the employee leaked Git sources totalling 44.7GB of files, which were obtained in July 2022 and contain “all of” Yandex’s source code except for its anti-spam rules.

The raw source code won’t be of interest to everyone, but Search Engine Land's report that 17,854 search ranking factors have been uncovered as part of the leak should interest any person, business or publication looking to see their pages ranked highly in search engines.

Yandex leak SEO insights

A partial list of ranking factors used by the Yandex search engine, taken from one file in the codebase and shared by Martin MacDonald, CEO of SEO consultancy MOG Media, sheds some light on the aspects of copy that Yandex gives weight to.

Per Russian Search News, these include PageRank and several aspects of links such as age and relevancy, the perceived relevance of copy, host reliability, and innate preferences towards specific sites with perceived authority, such as Wikipedia.

A longer, more technical deep dive by Search Engine Land shows that the list also includes a “NEWS_AGENCY_RATING”, allowing Yandex’s search engine to show preference to certain news organizations.

Others include the number of unique visitors, percentages of organic traffic, and average domain rankings across queries.

However, it’s perhaps melodramatic, or a little desolate, for MacDonald to describe it as “the most interesting thing to have happened in SEO in years.”

While the leaked codebase certainly offers a raft of insights, it’s worth noting that many websites will be looking to rank well on Google over Yandex, purely because the former is far better known. 

The two are connected: both companies have shared web engineers over the years, Yandex uses many of Google’s open source technologies, such as TensorFlow and BERT, and references to Google data appear in the leaked codebase.

However, while Search Engine Land’s deep dive argues that the Yandex leak can give general insight into the anatomy of a modern search engine, Russian Search News notes that many of Yandex’s leaked search ranking factors go unused or are officially considered deprecated.

Even the technical deep dive admits that many known aspects of Google’s search engine, such as its crawler and index systems, differ from Yandex’s.

All of this, combined with the age of the leaked codebase, makes it unclear how well assumptions about how both Yandex and Google rank pages will hold up.


