Code-Reading Tools

Diomidis Spinellis
Department of Management Science and Technology
Athens University of Economics and Business
Athens, Greece
dds@aueb.gr

Typical Tasks

Regular Expressions

Regular Expression Symbols

^                        Beginning of a line
$                        End of a line
.                        Any character
Expression?              The expression zero or one times
Expression*              The expression zero or more times
Expression+              The expression one or more times
Expression{n}            The expression n times
Expression{n,}           The expression at least n times
Expression{n,m}          The expression at least n but no more than m times
Expression1|Expression2  The expression1 or the expression2
(Expression)             The expression within the brackets
\1 \2 ... \n             The content of the nth bracket
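
The symbols above can be tried out with a few lines of Java. The following is only a minimal sketch: the class name RegexSymbolsDemo, the helper found, and the sample strings are made up for illustration, and the behaviour shown is that of the java.util.regex package (other tools may differ in details).

```java
import java.util.regex.Pattern;

class RegexSymbolsDemo {
    // Hypothetical helper: returns true if the regular expression
    // matches anywhere in the text (an unanchored search).
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // ^ and $ anchor the match to the beginning and end of the line
        System.out.println(found("^int", "int x;"));            // true
        System.out.println(found("^int", "print x;"));          // false
        System.out.println(found(";$", "int x;"));              // true
        // ?, *, + and {n,m} control repetition
        System.out.println(found("^colou?r$", "color"));        // true
        System.out.println(found("^ab*c$", "ac"));              // true
        System.out.println(found("^ab+c$", "ac"));              // false
        System.out.println(found("^a{2,3}$", "aaaa"));          // false
        // | selects between alternatives; ( ) groups them
        System.out.println(found("^(if|while) ", "while (x)")); // true
        // \1 refers back to the text matched by the first bracket
        System.out.println(found("(.+) \\1", "the the"));       // true
    }
}
```

Note that each backslash of the regular expression has to be doubled inside a Java string literal.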

Character Classes

[abc]   One of a, b, or c
[a-z]   A letter from a to z
[^abc]  Any character apart from a, b, and c
\t      The character tab
\n      The character newline
\r      The character carriage return
\a      The character alert
\f      The character form feed
\e      The character escape
\cx     The character control-x (a-z)
\d      Digit
\D      Non-digit
\s      Whitespace character
\S      Non-whitespace character
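
As a rough illustration of the character classes, the sketch below extracts matches from a sample line of code. The class name CharacterClassDemo, the helper allMatches, and the sample line are assumptions introduced for this example; the matching rules are those of java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharacterClassDemo {
    // Hypothetical helper: collects every non-overlapping match
    // of the regular expression in the text, from left to right.
    static List<String> allMatches(String regex, String text) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find())
            result.add(m.group());
        return result;
    }

    public static void main(String[] args) {
        // Sample line; the two assignments are separated by a tab
        String line = "x1 = 42;\ty2 = 7";
        // \d+ : runs of digits
        System.out.println(allMatches("\\d+", line));          // [1, 42, 2, 7]
        // [a-z]\d : a lowercase letter followed by a digit
        System.out.println(allMatches("[a-z]\\d", line));      // [x1, y2]
        // [^a-z\d\s] : anything but a lowercase letter, digit, or whitespace
        System.out.println(allMatches("[^a-z\\d\\s]", line));  // [=, ;, =]
        // \S+ : runs of non-whitespace characters
        System.out.println(allMatches("\\S+", line).size());   // 6
        // \t : the tab character
        System.out.println(allMatches("\\t", line).size());    // 1
    }
}
```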

The Editor as a Code Browser

Code Searching with grep

Locating File Differences

Roll Your Own Tool: Implementation Options

Choosing an Implementation

A Simple Grep Program in Java ...

/*
 * Globally match regular expression and print
 * Modelled after the Unix command with the same name
 * D. Spinellis
 */

import java.util.regex.*;
import java.io.*;

class Grep {
    public static void main(String args[]) {
        if (args.length != 2) {
            System.err.println("Usage: Grep pattern file");
            System.exit(1);
        }

        Pattern cre = null;        // Compiled RE
        try {
            cre = Pattern.compile(args[0]);
        } catch (PatternSyntaxException e) {
            System.err.println("Invalid RE syntax: " + e.getDescription());
            System.exit(1);
        }

        BufferedReader in = null;
        try {
            in = new BufferedReader(new InputStreamReader(
                 new FileInputStream(args[1])));
        } catch (FileNotFoundException e) {
            System.err.println("Unable to open file " +
                args[1] + ": " + e.getMessage());
            System.exit(1);
        }

        try {
            String s;
            while ((s = in.readLine()) != null) {
                Matcher m = cre.matcher(s);
                if (m.find())
                    System.out.println(s);
            }
        } catch (IOException e) {
            System.err.println("Error reading line: " + e.getMessage());
            System.exit(1);
        }
    }
}

... And its Equivalent in Perl

#!/usr/bin/perl -n
BEGIN {$pat = shift;}
print if (/$pat/);

Tool Building Advice

Example: Signature Survey

Example of a signature survey

Using the Compiler

Compiler Warning Messages

(Depending on the language, some of the above may be errors)

The Compiler as a Code-Reading Tool

Code Browsers

Code browsers typically offer the following facilities:

Code Browsers in OO

In OO languages given a class you can find:

Beautifiers

Other related tools:

Runtime Tools

Drawing Diagrams

Lab Tasks (Java)

Lab Tasks (Unix)

Further Reading

Exercises and Discussion Topics

  1. If you are using Windows, try installing a Unix-type operating system on a spare (old) computer, disk drive, or disk partition. Linux (e.g. Ubuntu), FreeBSD, and Solaris are some possible choices for a free Unix-like operating system. Alternatively, install the Cygwin environment on your Windows machine; it offers comparable functionality.
  2. Learn the regular expression syntax and the related find commands provided by the editor you are using.
  3. Write regular expressions to locate integers, floating point numbers, and a given word within a character string. Where would you use such expressions?
  4. Learn about and experiment with the tag facility of the editor you are using.
  5. Often programmers identify code parts that need further attention using a special marker such as XXX or FIXME. Search for, count, and examine such instances in the source code tree.
  6. Propose a simple way to calculate a "measure of similarity" metric for two different files. Apply this metric to files in the source collection that share the same filename and create a list of files that could benefit from a structured approach towards code reuse.
  7. Identify code-reading tasks that can benefit from a custom-built tool. Briefly describe how each such tool could be built. Implement one of the tools you described.
  8. Provide (or, better yet, locate) code examples for some of the compiler warnings discussed in this lecture. Can your compiler detect them? Are there legitimate reasons for using such code?
  9. Many warnings generated by C compilers would be treated as errors in other strongly typed languages. Find a list of warnings that your C compiler generates, and mark those that identify code with legitimate uses. See how many of them are detected in other languages.
  10. Run some of the winning IOCCC (http://www.ioccc.org) entries through the C preprocessor and try to understand how the program works and how the original obfuscation was achieved.
  11. Compile a program (with compiler optimizations disabled) and read the generated symbolic code. Repeat this process with all optimizations enabled. Describe how arguments are passed between functions in each case.
  12. Format one of the course's example files using a pretty-printer.
  13. See if your editor supports user-defined syntax coloring. Define syntax-coloring for a language that you use and the editor does not support.