CodeCrawler in search of developers -- ADTmag

By Kathleen Ohlson
May 24, 2005

A group of developers at the University of Illinois at Urbana-Champaign has released CodeCrawler, a Web-based search engine tailored for developers who need to search source code.

The tool, which is available at http://codecrawler.sourceforge.net, is a token-identifying, code-discovering engine that ranks results by relevance for developers trying to find components of source code across local file systems and Web-based code banks. Its core is composed of several components, including Lucene, Ctags and Highlight, and is delivered through a Web interface.

Administrators install CodeCrawler and configure source code repositories to be searched. CodeCrawler then builds a search index for the source code while analyzing each file and extracting semantic information. Developers search the indexed repositories and examine the code from within a Web browser.
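The indexing step described above can be illustrated with a minimal sketch. It is not CodeCrawler's implementation, which the article says relies on Lucene and Ctags; the regular expressions, the `build_index` function and the sample files are hypothetical stand-ins chosen only to show the idea of building a token index while extracting definitions from each file.

```python
import re
from collections import defaultdict

# Hypothetical patterns standing in for a real tagger such as Ctags.
DEFINITION = re.compile(r"^\s*(class|def)\s+([A-Za-z_]\w*)", re.MULTILINE)
TOKEN = re.compile(r"[A-Za-z_]\w*")


def build_index(files):
    """Map each token to the files containing it, and record definitions."""
    index = defaultdict(set)
    definitions = defaultdict(list)
    for path, text in files.items():
        # Token index: which files mention this identifier at all.
        for token in TOKEN.findall(text):
            index[token.lower()].add(path)
        # Semantic information: where a class or function is defined.
        for kind, name in DEFINITION.findall(text):
            definitions[name.lower()].append((path, kind))
    return index, definitions


# Made-up repository contents for illustration.
files = {
    "stack.py": "class Stack:\n    def push(self, item):\n        self.items.append(item)\n",
    "main.py": "from stack import Stack\ns = Stack()\ns.push(1)\n",
}
index, definitions = build_index(files)
print(sorted(index["stack"]))
print(definitions["push"])
```

The token index answers "which files mention this name," while the definitions table keeps the extra semantic fact of where the name is declared.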

Developers often are burdened with large-scale software development and maintenance. As the code base grows, it becomes more difficult to keep the code and documentation up to date and to fix existing bugs.

Developers use grep utilities to find a particular piece of code by searching source files for a match with a regular expression, but grep utilities have disadvantages, according to the university's developers. Writing a regular expression for a search requires at least some knowledge about what is being searched, and a grep returns every match to the given regular expression, though not all of the matches are relevant. Grep utilities are also part of the operating system or integrated within an IDE, so the results can't be viewed from the Web.
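A small sketch of a grep-style search shows the limitation the developers describe. The `grep` function and the sample files here are hypothetical, written with Python's standard `re` module rather than any actual grep utility.

```python
import re


def grep(pattern, files):
    """Return every line matching the pattern, in file order, with no ranking."""
    regex = re.compile(pattern)
    hits = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path, number, line.strip()))
    return hits


# Made-up source files for illustration.
files = {
    "parser.py": "def parse(text):\n    return text.split()\n",
    "notes.py": "# TODO: parse later\nvalue = parse('a b')\n",
}
for hit in grep(r"\bparse\b", files):
    print(hit)
```

The definition, a comment and a call site all come back with equal standing; nothing tells the reader which of the three is the one worth opening.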

Although Web search engines aren't as precise as grep utilities, these search engines allow inexact matches, rank the results by relevance and display them in a Web-viewable form. Search engines also compute a relevance score for a particular result based on how many occurrences of the search keywords appear in the results.
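The occurrence-based scoring described above can be sketched in a few lines. This is a deliberately simplified assumption about how such a score might be computed, not the formula used by any particular search engine; the `score` and `rank` functions and the sample documents are hypothetical.

```python
import re


def score(query, text):
    """Count how many times the query keywords occur in the text."""
    words = re.findall(r"\w+", text.lower())
    return sum(words.count(keyword) for keyword in query.lower().split())


def rank(query, docs):
    """Order documents by descending score, dropping those with no match."""
    scored = [(score(query, text), name) for name, text in docs.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for points, name in ordered if points > 0]


# Made-up documents for illustration.
docs = {
    "a.txt": "sort the list then sort again",
    "b.txt": "a list of names",
    "c.txt": "nothing here",
}
print(rank("sort list", docs))
```

Unlike a regular expression, the query does not have to match exactly: a document containing only one of the two keywords still appears, just lower in the list.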

CodeCrawler combines features of Web search engines and grep utilities, adding knowledge about programming language syntax and source code semantics so that it can determine the relevance of search results more accurately. It provides a Web interface that lets users submit queries using the regular expressions found in grep searches, the keywords used in Web searches and special programming-specific extensions.

Search results will be ranked by relevance, taking into account source code semantics, such as whether a match is a class, method or variable, and will point to the original source code. CodeCrawler will support many programming languages and will be extended with support for new ones.
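One way to picture semantics-aware ranking is to weight a match more heavily when it is the definition of a class or method than when it is an ordinary reference. The sketch below is an illustration under that assumption only; the weights, the `semantic_score` and `rank` functions and the sample files are hypothetical and do not describe CodeCrawler's actual scoring.

```python
import re

# Hypothetical weights: a definition counts for more than a plain mention.
WEIGHTS = {"class": 5, "def": 3}
PLAIN = 1


def semantic_score(name, text):
    """Score a file for a name, favoring lines that define it."""
    mention = re.compile(rf"\b{re.escape(name)}\b")
    definition = re.compile(rf"\s*(class|def)\s+{re.escape(name)}\b")
    total = 0
    for line in text.splitlines():
        count = len(mention.findall(line))
        if count == 0:
            continue
        match = definition.match(line)
        if match:
            # The defining occurrence gets the heavier weight.
            total += WEIGHTS[match.group(1)] + PLAIN * (count - 1)
        else:
            total += PLAIN * count
    return total


def rank(name, files):
    """Order files by descending semantic score."""
    return sorted(files, key=lambda path: -semantic_score(name, files[path]))


# Made-up source files for illustration.
files = {
    "app.py": "q = Queue()\nr = Queue()\n",
    "queue.py": "class Queue:\n    pass\n",
}
print(rank("Queue", files))
```

A plain occurrence count would put app.py first, since it mentions the name twice; with the weighting, the file that defines the class comes out on top.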

About the Author

Kathleen Ohlson is senior editor at Application Development Trends magazine.
