Google previously open-sourced the C++ robots.txt parser and matcher that it uses in production. It has now announced two additional open source projects: a testing framework for robots.txt parsers and an official Java port of the parser.
C++ and Java:
Google has officially released a testing framework that validates robots.txt parsers. It checks that a parser's results follow the official robots.txt specification, and it can also validate parsers written in languages other than C++. Besides C++, Java is widely used across many applications, so Google has also released an official Java port of its parser.
In its blog post, Google writes:
“Last year we released the robots.txt parser and matcher that we use in our production systems to the open source world. Since then, we’ve seen people build new tools with it, contribute to the open source library (effectively improving our production systems- thanks!), and release new language versions like golang and rust, which make it easier for developers to build new tools.”
Robots.txt Specification Test:
The blog post states:
“Currently there is no official and thorough way to assess the correctness of a parser, so Andreea built a tool that can be used to create robots.txt parsers that are following the protocol.”
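To make the idea of a specification test concrete, here is a minimal sketch in Java of what such a check might look like. It is not the code of Google's framework: the `RobotsChecker` interface, the `ToyChecker` implementation, and the test cases are all hypothetical names introduced for illustration. The toy checker handles only `User-agent`, `Allow`, and `Disallow` lines with plain path prefixes (no wildcards), which is enough to show how a table of expected outcomes can be run against any parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class RobotsConformanceSketch {

    /** Hypothetical interface for a parser under test; not Google's API. */
    public interface RobotsChecker {
        boolean isAllowed(String robotsTxt, String userAgent, String path);
    }

    /** One expectation: for this robots.txt, agent and path, is crawling allowed? */
    public static final class Case {
        final String name, robotsTxt, userAgent, path;
        final boolean expected;

        Case(String name, String robotsTxt, String userAgent, String path, boolean expected) {
            this.name = name;
            this.robotsTxt = robotsTxt;
            this.userAgent = userAgent;
            this.path = path;
            this.expected = expected;
        }
    }

    /** Toy parser (assumption: prefix rules only, no wildcards) so the sketch runs. */
    public static final class ToyChecker implements RobotsChecker {
        @Override
        public boolean isAllowed(String robotsTxt, String userAgent, String path) {
            List<String[]> specific = new ArrayList<>(); // rules from groups naming this agent
            List<String[]> wildcard = new ArrayList<>(); // rules from "User-agent: *" groups
            boolean specificSeen = false, inSpecific = false, inWildcard = false;
            boolean lastWasAgent = false;
            for (String raw : robotsTxt.split("\n")) {
                int hash = raw.indexOf('#'); // strip comments
                String line = hash >= 0 ? raw.substring(0, hash) : raw;
                int colon = line.indexOf(':');
                if (colon < 0) continue;
                String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                String value = line.substring(colon + 1).trim();
                if (key.equals("user-agent")) {
                    // A User-agent line after a rule starts a new group.
                    if (!lastWasAgent) { inSpecific = false; inWildcard = false; }
                    if (value.equals("*")) inWildcard = true;
                    else if (value.equalsIgnoreCase(userAgent)) { inSpecific = true; specificSeen = true; }
                    lastWasAgent = true;
                } else if (key.equals("allow") || key.equals("disallow")) {
                    lastWasAgent = false;
                    if (value.isEmpty()) continue; // an empty pattern restricts nothing
                    if (inSpecific) specific.add(new String[] {key, value});
                    if (inWildcard) wildcard.add(new String[] {key, value});
                }
            }
            // A group that names the agent takes precedence over the "*" group.
            List<String[]> rules = specificSeen ? specific : wildcard;
            int bestLen = -1;
            boolean allowed = true; // no matching rule means allowed
            for (String[] rule : rules) {
                if (!path.startsWith(rule[1])) continue;
                boolean isAllow = rule[0].equals("allow");
                // Longest matching pattern wins; on a tie, Allow wins.
                if (rule[1].length() > bestLen || (rule[1].length() == bestLen && isAllow)) {
                    bestLen = rule[1].length();
                    allowed = isAllow;
                }
            }
            return allowed;
        }
    }

    /** Illustrative cases only; a real suite would cover far more of the protocol. */
    public static List<Case> defaultCases() {
        String basic = "User-agent: *\nDisallow: /private/\n";
        return Arrays.asList(
            new Case("disallowed prefix", basic, "FooBot", "/private/page.html", false),
            new Case("unlisted path", basic, "FooBot", "/public/page.html", true),
            new Case("longest match wins",
                "User-agent: *\nDisallow: /shop/\nAllow: /shop/sale/\n", "FooBot", "/shop/sale/item", true),
            new Case("specific group beats wildcard",
                "User-agent: *\nDisallow: /\n\nUser-agent: FooBot\nDisallow: /tmp/\n", "FooBot", "/index.html", true),
            new Case("empty disallow", "User-agent: *\nDisallow:\n", "FooBot", "/anything", true));
    }

    /** Runs every case and returns the names of those the checker got wrong. */
    public static List<String> run(RobotsChecker checker, List<Case> cases) {
        List<String> failures = new ArrayList<>();
        for (Case c : cases) {
            if (checker.isAllowed(c.robotsTxt, c.userAgent, c.path) != c.expected) failures.add(c.name);
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> failures = run(new ToyChecker(), defaultCases());
        System.out.println(failures.isEmpty() ? "All cases passed" : "Failed: " + failures);
    }
}
```

Because the cases are plain data and the parser sits behind a small interface, the same table could in principle be pointed at a parser written in any language by wrapping it in an adapter that implements `RobotsChecker`.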
Java robots.txt parser and matcher:
In addition, Ian created the official Java port of the C++ robots.txt parser. In terms of functions and behavior, the new parser is a one-to-one translation of the C++ parser. According to Google, teams are already planning to use the Java robots.txt parser in production systems, and the company hopes other developers will find it useful too.
Requirements to Run the Testing Framework:
Java Development Kit (JDK) 1.7+
Apache Maven (a build automation tool for Java projects)
Google’s protocol buffers
As always, Google invites developers to contribute to these projects and to let the team know where there is room for improvement.