Open source devs are fighting AI crawlers with cleverness and vengeance

AI web-crawling bots are the cockroaches of the internet, many software developers believe. Some devs have started fighting back in ingenious, often humorous ways.

While any website might be targeted by bad crawler behavior, sometimes taking the site down, open source developers are “disproportionately” impacted, writes Niccolò Venerandi, developer of a Linux desktop known as Plasma and owner of the blog LibreNews.

By their nature, sites hosting free and open source (FOSS) projects share more of their infrastructure publicly, and they also tend to have fewer resources than commercial products.

The issue is that many AI bots don’t honor the Robots Exclusion Protocol robots.txt file, the tool that tells bots what not to crawl, originally created for search engine bots.
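To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler would consult robots.txt before fetching a page, using Python’s standard `urllib.robotparser`. The file contents, the bot name `ExampleAIBot`, and the `git.example.org` URLs are made up for illustration. Note that the check runs inside the crawler: nothing on the server enforces it, which is why a bot that simply skips this step is not stopped.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a Git hosting site (not from any real project).
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt above permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler asks first and skips anything that is disallowed.
    for agent in ("ExampleAIBot", "SomeSearchBot"):
        print(agent, allowed(agent, "https://git.example.org/repo"))
```

The whole scheme depends on the crawler calling something like `allowed()` voluntarily and reporting its real user agent when it does.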

In a “cry for help” blog post in January, FOSS developer Xe Iaso described how AmazonBot relentlessly pounded on a Git server website to the point of causing DDoS outages. Git servers host FOSS projects so that anyone who wants can download the code or contribute to it.

But this bot ignored Iaso’s robots.txt, hid behind other IP addresses, and pretended to be other users, Iaso said.

“It’s futile to block AI crawler bots because they lie, change their user agent, use residential IP addresses as proxies, and more,” Iaso lamented.

“They will scrape your site until it falls over, and then they will scrape it some more. They will click every link on every link on every link, viewing the same pages over and over and over and over. Some of them will even click on the same link multiple times in the same second,” the developer wrote in the post.

Enter the god of graves

So Iaso fought back with cleverness, building a tool called Anubis.

Anubis is a reverse proxy proof-of-work check that must be passed before requests are allowed to hit a Git server. It blocks bots but lets through browsers operated by humans.
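For readers unfamiliar with proof-of-work, the general idea can be sketched in a few lines of Python. This is an illustration of the technique only, not Anubis’s actual code: the function names, the SHA-256 leading-zeros rule, and the difficulty value are all assumptions chosen for the example. The server hands out a random challenge, the client must search for a number (a nonce) whose hash meets a target, and the server confirms the answer with a single hash.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random challenge string (hypothetical format)."""
    return secrets.token_hex(16)


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one cheap hash confirms that the client did the work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce that passes verify()."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=3)
    print(challenge, nonce, verify(challenge, nonce, difficulty=3))
```

In this sketch each extra leading zero multiplies the expected number of attempts by 16, while checking an answer always costs one hash. That asymmetry is the point: a single visitor barely notices the delay, but a crawler making many requests pays the cost on every one.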

The funny part: Anubis is the name of a god in Egyptian mythology who leads the dead to judgment.

“Anubis weighed your soul (heart) and if it was heavier than a feather, your heart got eaten and you, like, mega died,” Iaso told TechCrunch. If a web request passes the challenge and is determined to be human, a cute anime picture announces success. The drawing is “my take on anthropomorphizing Anubis,” says Iaso. If it’s a bot, the request gets denied.

The wryly named project has spread like the wind among the FOSS community. Iaso shared it on GitHub on March 7, and in just a few days, it collected 2,000 stars, 20 contributors, and 39 forks.

Revenge as defense

The instant popularity of Anubis shows that Iaso’s pain is not unique. In fact, Venerandi shared story after story:

  • Founder CEO of SourceHut Drew DeVault described spending “from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale,” and “experiencing dozens of brief outages per week.”
  • Jonathan Corbet, a famed FOSS developer who runs Linux industry news site LWN, warned that his site was being slowed by DDoS-level traffic “from AI scraper bots.”
  • Kevin Fenzi, the sysadmin of the Linux Fedora project, said the AI scraper bots had gotten so aggressive that he had to block the entire country of Brazil from access.

Venerandi tells TechCrunch that he knows of multiple other projects experiencing the same issues. One of them “had to temporarily ban all Chinese IP addresses at one point.”

Let that sink in for a moment: developers “even have to turn to banning entire countries” just to fend off AI bots that ignore robots.txt files, says Venerandi.

Beyond weighing the soul of a web requester, other devs believe vengeance is the best defense.

A few days ago on Hacker News, user xyzal suggested loading robots.txt forbidden pages with “a bucket load of articles on the benefits of drinking bleach” or “articles about positive effect of catching measles on performance in bed.”

“Think we need to aim for the bots to get _negative_ utility value from visiting our traps, not just zero value,” xyzal explained.

As it happens, in January, an anonymous creator known as “Aaron” released a tool called Nepenthes that aims to do exactly that. It traps crawlers in an endless maze of fake content, a goal that the dev admitted to Ars Technica is aggressive if not downright malicious. The tool is named after a carnivorous plant.
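The “endless maze” idea is simple to sketch. The Python below is an illustration of the general tarpit technique, not Nepenthes’s implementation: the function `maze_page`, the word list, and the URL scheme are all invented for the example. Every path deterministically produces a page of filler text plus links to deeper paths, so a crawler that follows links never runs out of pages, and the server stores nothing.

```python
import hashlib
import random

# Filler vocabulary for the fake pages (arbitrary, for illustration only).
WORDS = ["river", "signal", "copper", "lantern", "orbit", "meadow", "cipher", "harbor"]


def maze_page(path: str, n_links: int = 5, n_words: int = 40) -> tuple[str, list[str]]:
    """Return (fake text, outgoing links) for a path in the maze.

    The random generator is seeded from a hash of the path, so the same
    path always yields the same page without anything being saved.
    """
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    text = " ".join(rng.choice(WORDS) for _ in range(n_words))
    base = path.rstrip("/")
    links = [f"{base}/{rng.getrandbits(32):08x}" for _ in range(n_links)]
    return text, links


if __name__ == "__main__":
    text, links = maze_page("/maze")
    print(text)
    print(links)
    # Every link leads to another generated page with more links.
    print(maze_page(links[0])[1])
```

A real deployment would sit behind a web server and would typically only be reachable through paths that robots.txt forbids, so that compliant crawlers never enter it.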

And Cloudflare, perhaps the biggest commercial player offering several tools to fend off AI crawlers, last week released a similar tool called AI Labyrinth.

Cloudflare said in its blog post that it feeds misbehaving AI crawlers “irrelevant content rather than extracting your legitimate website data.”

SourceHut’s DeVault told TechCrunch that “Nepenthes has a satisfying sense of justice to it, since it feeds nonsense to the crawlers and poisons their wells, but ultimately Anubis is the solution that worked” for his site.

But DeVault also issued a public, heartfelt plea for a more direct fix: “Please stop legitimizing LLMs or AI image generators or GitHub Copilot or any of this garbage. I am begging you to stop using them, stop talking about them, stop making new ones, just stop.”

Since the likelihood of that is zilch, developers, particularly in FOSS, are fighting back with cleverness and a touch of humor.
