Using HTML Linking to Help Novice Programmers to Reuse Components

Bohdan Nebesh
Department of Computer Science
The George Washington University
Washington D.C. 20052
danko@acm.org
and
Dr. Michael Feldman
Department of Computer Science
The George Washington University
Washington D.C. 20052
mfeldman@seas.gwu.edu

This paper appeared originally in Proc. 28th ACM-SIGCSE Technical Symposium on Computer Science Education, San Jose, CA, March 1997. We have added some hyperlinks and reformatted parts of the paper.

    1. Introduction
    2. Problems with Current Systems
    3. Ada 95 and the Java API
    4. System Description
    5. Implementation
    6. Preliminary Experimental Results
    7. Conclusion
    8. Acknowledgments
    9. References

Abstract

Software reuse needs to be taught early in the computer science curriculum. One of the major obstacles that students face when reusing software is the difficulty in learning how to use components from a software library. To aid in understanding components we built a tool that automatically embeds Hypertext Markup Language (HTML) links in Ada 95 specification files. Derived types are linked to their parent types, child packages are linked to their parents, and all subprogram parameter and return types are linked to their declarations. We conducted a controlled experiment to determine if these links help novice programmers to learn to use library components. Researchers have not formally investigated which comprehension techniques are effective and which are not. Our results indicate that our techniques are effective in aiding novice programmers to learn to use a reusable component.

1. Introduction

Software reuse is needed to meet the demands that are being placed on software development teams today. To address this need, the teaching of object-oriented programming early in the undergraduate computer science curriculum is being advocated [6]. Along with teaching object-oriented programming, many educators believe that teaching reuse early in the curriculum is essential [1]. They believe that students should learn to reuse components before learning how to create components. Many computer science curricula still teach students to build everything from scratch [7].

One of the major problems that students have with learning object-oriented programming is understanding how to use inheritance properly, especially deciding when to use composition instead of inheritance [1]. Teaching students about the proper use of inheritance by teaching software reuse first has been recommended [1]. Unfortunately, learning how to use a given component from a component library is difficult. Often, learning how to reuse a component is more time-consuming than reinventing it. This research aims to make it easier for programmers, especially novice programmers, to learn how to use a reusable component from a library.

As the popularity of the World Wide Web (WWW) increases, more software repositories will become widely available. Programmers already browse the WWW in search of needed software applications and components. Most of the component repositories on the WWW have a simple indexing scheme to aid in finding a specific component, usually a text file containing an alphabetical listing of all the components in the library. The repositories usually contain some component documentation, although in most cases the documentation is neither comprehensive nor very useful. Once a potentially useful component library is found, it is usually downloaded to the computer system on which it will be used. The programmer then needs to look at the source specification files in the repository to see how well the components match his or her needs and to understand how to use them. Unfortunately, this step is not done with a Web browser, but most likely with a program editor. We believe that a programmer should be able to search the WWW for desired components and examine the components in detail using the same tool, a Web browser.

Object-oriented libraries expanded the problem of placing information about a component in more than one file. Even well documented object-oriented libraries place information about the use of a component in several files. For example, the use of inheritance in a library separates component information into base component files and a series of derived component files. Thus, to fully understand the functionality of a given component one has to look at all of the component’s ancestors.

Another problem with understanding an object-oriented library is the complexity of subprogram arguments. Often the arguments are components that must be initialized prior to calling the subprogram. This initialization information is usually stored in a separate file. The problem is compounded if the component that needs to be initialized has an initialization subprogram that requires yet another component as an argument. This research attempts to overcome these problems.

2. Problems with Current Systems

Programmers usually go through the following five steps when reusing components: identify the required components; find components that potentially match the requirements; understand the components; adapt the components to the exact needs; and compose the components [2, 3]. Much work has been done on building tools to make this process easier.

There are two major problems with the systems that programmers currently use when reusing components from a library. First, all the systems require the user to learn a new tool that is usually language and computer system dependent. Second, the system usually either works with no other tools or works only within a complete software engineering environment. Thus, the tool does not easily integrate into a programmer’s current working environment. None of the systems unify component search and retrieval with component understanding.

Many reusability tools are very complicated and are not easy to use. Many systems’ usability studies indicate that users did not use all the available features. Users typically picked a small set of features and used them throughout the study. We do not know if the popular features were the most useful or just the easiest to use. We believe that it is very important to learn not just which systems are useful, but which comprehension techniques are effective in understanding a reusable component library. Without measured information about the effectiveness of various techniques it is difficult to determine the proper direction for reusability research.

3. Ada 95 and the Java API

Since our work was tested on the Ada/Java bindings, we’ll review the basic aspects of Java. The Java system consists of:

Much has been made of J-code's platform independence; however, its language independence has received less publicity. Nevertheless, compilers are available or under development for (at least) Rexx, Smalltalk, and Ada 95.

The Ada 95 compiler, called AppletMagic and developed by Intermetrics, is available for several platforms as a free public beta from http://www.inmet.com/javadir/download/index.html. This compiler comes with a set of Ada 95 package interfaces that collectively implement a "binding" to the Java API. Thus an Ada 95 program can be written to make full use of the applet capabilities of the Java system, including the GUI classes.

Other AppletMagic work at this university has been reported in [5].

4. System Description

Ideally we would like to see a programmer search the Web for a desired component library using a Web search application. Once a candidate library is found the programmer should be able to browse through all the library’s documentation and specification files. The documentation includes: component descriptions, specification files, example programs, performance information, testing information, requirements, analysis and designs. Providing this information is a burden on the library developer. No one has determined which information is essential and which information is useful for component comprehension.

We decided to link only information from the specification files since this is the most commonly available library information. We created an index page that consists of an alphabetical listing of all package names linked to their respective specification files. This information is used to find a desired component from the library. In the future it would be interesting to integrate our system with a library search tool. In this way more powerful searching capabilities would be available for finding the desired component.
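The index page described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' implementation (which is written in Ada). The function names and the file naming scheme in `spec_page_name` are assumptions made only for this example.

```python
from html import escape


def spec_page_name(package_name: str) -> str:
    """Hypothetical naming scheme: java.awt.Button -> java-awt-button.html."""
    return package_name.lower().replace(".", "-") + ".html"


def build_index(package_names) -> str:
    """Return an HTML page listing the packages alphabetically,
    each linked to the page for its specification file."""
    items = []
    # Ada identifiers are case-insensitive, so sort without regard to case.
    for name in sorted(package_names, key=str.lower):
        href = escape(spec_page_name(name), quote=True)
        items.append(f'<li><a href="{href}">{escape(name)}</a></li>')
    return (
        "<html><body><h1>Package Index</h1>\n<ul>\n"
        + "\n".join(items)
        + "\n</ul>\n</body></html>"
    )


if __name__ == "__main__":
    print(build_index(["java.lang.String", "java.awt.Rectangle", "java.awt.Button"]))
```

Because the output is plain HTML with ordinary anchors, any browser can display it, which matches the goal of not requiring a special viewing tool.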

Since many libraries are not well documented, especially freely available ones, we assume that a library’s specification files are the sole documentation. We also assume that programmers will learn to use components without any printouts of the specification files. This is not only more convenient but also more ecologically sound. We avoid using source files that describe the implementation of a component because a component should be reused without knowing its implementation details.

Our system supports the following types of linking. Every child package is linked to its parent package. Every derived type is linked to the type it is derived from. The only way to ascertain the full behavior of a type that is derived from a tagged type is to look at the behavior defined by all the type’s ancestors. Similarly every subtype is linked to its parent type. Every parameter type and return type is linked to the type’s declaration. The only way to understand how to use a subprogram is to understand its parameters and return type. Additionally every with’ed and use’d package is linked to the included package.
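Of these link kinds, the child-to-parent package link is the simplest to compute, because in Ada 95 the name of a child package is its parent's name followed by a dot and the child's own identifier. The following Python sketch is only an illustration of that rule; the function names are hypothetical and it is not how the tool itself is implemented.

```python
def parent_package(name: str):
    """Return the parent of a dotted Ada package name,
    or None if the name has no parent (a root library package)."""
    head, dot, _ = name.rpartition(".")
    return head if dot else None


def package_ancestors(name: str):
    """Return all ancestors of a package, nearest first."""
    chain = []
    parent = parent_package(name)
    while parent is not None:
        chain.append(parent)
        parent = parent_package(parent)
    return chain


if __name__ == "__main__":
    print(parent_package("java.awt.button"))
    print(package_ancestors("java.awt.button"))
```

A real tool would also confirm that a specification file for each computed parent exists in the library before emitting a link to it.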

The linking will be illustrated using simplified specification files from the Ada/Java bindings. The Ada/Java bindings include a package called java.awt.button whose specification file is shown in Program 1. On the third line in the package there is a link to java.awt.button’s parent package. Clicking on java.awt will display package java.awt. This is an example of child package to parent package linking.

On the second line of code in package java.awt.button we see that package java.lang.string is with’ed and use’d. To see package java.lang.string, we just have to click on the java.lang.string link. This is an example of links to with’ed and use’d packages.

Program 1

The fourth line of package java.awt.button declares type button_obj to be derived from type component_obj. If we click on the component_obj link, package java.awt.component is displayed. Part of this package is shown in Program 2 below. This package declares that type component_obj is derived from type Object (line 2). This is an example of links representing the inheritance relationship.

1. package java.awt.Component is
    2. type Component_Obj is abstract new Object with null record;
    3. type Component_Ptr is access all Component_Obj'Class;
    4. function getParent(Obj : access Component_Obj) return Object_Ptr;
    5. function getPeer(Obj : access Component_Obj) return Object_Ptr;
    6. function getToolkit(Obj : access Component_Obj) return Object_Ptr;
    7. function bounds(Obj : access Component_Obj) return Rectangle_Ptr;
8. end java.awt.Component;

Program 2

If we look at function bounds (line 7), we see that it returns a Rectangle_Ptr. To see how Rectangle_Ptr is declared, click on it. This brings up package java.awt.rectangle, shown in Program 3 below. In this package we see that Rectangle_Ptr is an access to Rectangle_Obj'Class (line 8). This is an example of linking the use of a type in a subprogram signature to the type’s declaration. This kind of information is required for a programmer to understand what types of components can be passed as arguments. Often an argument component must be created or initialized. This information is usually located in the file that defines the component being passed. If the component file names are not intuitive, then finding the correct file can be difficult, especially for a novice programmer.

1. package java.awt.Rectangle is
    2. type Rectangle_Obj is new Object with record
      3. x : Integer;
      4. y : Integer;
      5. width : Integer;
      6. height : Integer;
    7. end record;
    8. type Rectangle_Ptr is access all Rectangle_Obj'Class;
    9. function new_Rectangle return Rectangle_Ptr;
10. end java.awt.Rectangle;

Program 3
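The linking of parameter and return types to their declarations can be sketched as follows. This Python fragment is a simplified illustration under stated assumptions: the `DECLARED_IN` table and its page names are hypothetical stand-ins for information a real tool would extract by parsing, and the sketch scans identifiers with a regular expression rather than using an Ada grammar, so it would also link names that appear in comments.

```python
import re
from html import escape

# Hypothetical lookup table: lowercased type name -> page that declares it.
DECLARED_IN = {
    "component_obj": "java-awt-component.html",
    "rectangle_ptr": "java-awt-rectangle.html",
}

IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def link_types(line: str) -> str:
    """Wrap each identifier that names a known type in an HTML anchor.
    Everything between identifiers is HTML-escaped and copied through."""
    out = []
    pos = 0
    for match in IDENTIFIER.finditer(line):
        out.append(escape(line[pos:match.start()]))
        word = match.group()
        # Ada identifiers are case-insensitive, so look up in lower case.
        target = DECLARED_IN.get(word.lower())
        if target:
            out.append(f'<a href="{target}">{word}</a>')
        else:
            out.append(word)
        pos = match.end()
    out.append(escape(line[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(link_types("function bounds(Obj : access Component_Obj) return Rectangle_Ptr;"))
```

Applied to line 7 of Program 2, the sketch links Component_Obj and Rectangle_Ptr and leaves the function name, parameter name, and reserved words as plain text.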

We did not create any links from a parent component to all its children because this information would be presented better in a graphical diagram. This information would be used to discover what kinds of variations of a component are available. Often a programmer will find a component that somewhat fits his or her needs, but the component needs some modifications to fully meet the requirements. One of the parent component’s children may already have the proper modifications, or may be closer to matching the programmer’s requirements than the parent. Parent-to-child information could also identify what kinds of components can be placed into a container component.

Our system will not do any re-formatting or highlighting of the specification files for increased readability because we decided to concentrate on the issue of what to link. We conducted a controlled experiment to determine whether the linked specification files effectively aid in component comprehension.

We tested our system on the Ada/Java bindings since this is the first industrial-strength inheritance-based Ada 95 library we have seen. We publicly released the following Hypertext-Linked Ada Libraries: AppletMagic Bindings to Java libraries, Ada 95 Standard Libraries and GNU Ada 95 (GNAT) Extensions, and the GNAT Bindings to Ada 95 Macintosh Application Framework (AMAF). The adalib-html embedding software is also freely available.

5. Implementation

Our software automatically converts a set of Ada 95 specification files into a linked set of HTML files, creating one HTML file for every Ada 95 file plus one HTML index file. We chose Ada 95 as the target language because we wanted to provide the Ada community with a tool that will help it to take advantage of the new features available in Ada 95. Our system is designed to minimize the effort required to change target languages.

The system is implemented as a two-pass parser using AYACC, which is like YACC (yet another compiler compiler) except that it generates Ada code. Initially it parses the set of Ada 95 files to extract all the information that is required for linking. The second pass uses the extracted linking information to embed links in the files. The lexical analyzer is used to read tokens and to embed linking information. We wrote a lexical analyzer without using any tools like ALEX because we needed to give our lexical analyzer the ability to embed links and to write output files.
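The reason for two passes is that a file may use a type declared in a file that has not been read yet, so all declarations must be collected before any link can be written. The Python sketch below illustrates only that structure. It substitutes regular expressions for the AYACC-generated parser, recognizes only `type` and `subtype` declarations that start a line, and uses hypothetical file names and function names.

```python
import re
from html import escape

# Simplification: a declaration is a line starting with "type" or "subtype".
TYPE_DECL = re.compile(r"^\s*(?:sub)?type\s+(\w+)", re.IGNORECASE | re.MULTILINE)


def collect_declarations(specs):
    """Pass 1: map each declared type name (lowercased) to its file."""
    table = {}
    for filename, text in specs.items():
        for match in TYPE_DECL.finditer(text):
            table[match.group(1).lower()] = filename
    return table


def embed_links(specs, table):
    """Pass 2: produce one HTML page per file, linking each use of a
    type that is declared in a different file."""
    pages = {}
    for filename, text in specs.items():
        parts = []
        # re.split with a capturing group alternates separators (even
        # indices) and words (odd indices).
        for i, piece in enumerate(re.split(r"(\w+)", text)):
            if i % 2 == 0:
                parts.append(escape(piece))
                continue
            target = table.get(piece.lower())
            if target and target != filename:
                parts.append(f'<a href="{target}.html">{piece}</a>')
            else:
                parts.append(piece)
        pages[filename] = "<pre>" + "".join(parts) + "</pre>"
    return pages


if __name__ == "__main__":
    specs = {
        "component.ads": "function bounds(Obj : access Component_Obj) return Rectangle_Ptr;",
        "rectangle.ads": "type Rectangle_Ptr is access all Rectangle_Obj'Class;",
    }
    for name, page in embed_links(specs, collect_declarations(specs)).items():
        print(name, page)
```

In the usage example, `component.ads` is processed first yet still receives a link to `Rectangle_Ptr`, which is declared in the later file; a single pass could not have produced that link.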

The system is designed to be completely independent of the formatting of the input files. The input files must only be consistent with the Ada 95 grammar. The system design makes it easy to add the ability to format and indent the input files. Similarly the system could be modified to support software maintenance work. For example, both specification and body files could be parsed, and HTML links representing call graphs could be embedded.

A Web browser is used to traverse the hypertext links associated with the components in the source code, as was similarly done by Helm and Maarek [4]. Since several Web (HTML) browsers have been developed and tested on most computers, our work is not limited to a specific computing platform. We used HTML without embedding any extra parsable information within it. This is important because it allows any HTML display program to display the embedded source code.

6. Preliminary Experimental Results

We conducted a controlled experiment to determine if the links we created were useful. The experiment consisted of four practice and five actual questions that subjects had to answer about the Ada/Java bindings. All the subjects had to browse through the Ada/Java bindings in order to answer the questions. We had three groups of subjects in our experiment. Each group interacted with the Ada/Java bindings using a Web browser, but each had a different interface. All three groups began browsing by viewing a Web page that contained a list of packages linked to the specification file that declared the given package.

The first group had access to all the specification files that were linked with our tool. The second group also had access to linked specification files and in addition had access to a UNIX grep based searching tool. The third group had access to unlinked specification files and access to the searching tool. The third group represented the typical way in which programmers currently interact with libraries, that is, using grep and a text editor. The first group represented the new way of exploring libraries using hypertext links. The second group measured the effect that using grep would have on subjects using the new system.

The subjects were drawn from two pools, one of novice programmers and one of expert programmers. All the novice programmers were undergraduate computer science students, primarily from the George Washington University and the University of Scranton. The expert programmers all had at least a bachelor's degree in computer science or a related field and also had several years of programming experience beyond their undergraduate education. All the experts understood object-oriented programming and knew several programming languages. They had all been exposed to either the Pascal or the Ada programming language.

Our results indicate that the novice programmers who used just the linked files answered the questions most accurately and most quickly. The time difference among the three groups is statistically significant. The novice group without the linked files was the least accurate and by far the slowest in answering the questions. They were also much more frustrated during the experiment. The differences among the three groups were greater for the novice programmers than for the expert programmers.

All links were used by subjects except the with’ed and use’d package links. Preliminary results indicate that “with” and “use” links are not very useful for understanding how to use a component. The links from a child package to a parent package were not as useful as the other links. This is probably because the Ada/Java bindings do not have a deep package hierarchy: it is no more than three packages deep.

The links from a derived type to its parent were very useful, especially for searching for subprograms that can act on a type that is derived from a tagged type. Similarly the links from subprogram parameter and return types to their declarations were very useful in determining how to use a subprogram. The usefulness of both these types of links is illustrated in the following example:

To determine the font family type of a button object first traverse the link to the button object’s parent, component object. The component object contains a function called getFont, which returns an access to a font object. By traversing the link from the return type of getFont, we view the font object package. This package contains the getFamily function that returns a string containing the name of the font family associated with the current font object.
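The traversal in this example can be modelled as a walk up the derivation chain followed by a hop along a return-type link. The Python sketch below uses a small hand-written model, not data extracted from the actual bindings. The type name `Font_Obj` is an assumption, and the access types returned by the real functions are collapsed to the object types they designate.

```python
# Hypothetical model: derived type -> parent type.
PARENT = {
    "Button_Obj": "Component_Obj",
    "Component_Obj": "Object",
    "Font_Obj": "Object",
}

# Hypothetical model: type -> {function name: return type}.
FUNCTIONS = {
    "Component_Obj": {"getFont": "Font_Obj"},
    "Font_Obj": {"getFamily": "String"},
}


def find_function(type_name, function_name):
    """Follow derived-type links upward until a type that declares
    function_name is found. Returns (declaring_type, return_type),
    or None if no ancestor declares it."""
    current = type_name
    while current is not None:
        return_type = FUNCTIONS.get(current, {}).get(function_name)
        if return_type is not None:
            return current, return_type
        current = PARENT.get(current)
    return None


if __name__ == "__main__":
    # Step 1: the button itself has no getFont, so climb to its parent.
    declaring_type, font_type = find_function("Button_Obj", "getFont")
    # Step 2: follow the return-type link and look for getFamily there.
    print(declaring_type, font_type, find_function(font_type, "getFamily"))
```

Each loop iteration corresponds to one click on a derived-type link, and the second call corresponds to clicking the return type of getFont.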

This example also illustrates how our experimental questions relate to actions that programmers must accomplish in order to understand a reusable library.

After completing the experiment, some of the subjects in the linked novice group said that they finally understood how inheritance works. This indicates that a software library that was linked with our tool might be used to promote discovery learning of object-oriented programming. At the same time it might be used to encourage students to reuse components at an early point in the computer science curriculum. Since our experiment was not designed to test the subjects’ understanding of object-oriented programming concepts, we cannot confirm these possible advantages of our tool.

7. Conclusion

We believe that students need to learn to reuse components from a reusable library early in the computer science curriculum. We also believe that learning how to use a component from a library can be very difficult. The hypertext linking of specification files can be used to overcome this problem. To determine which comprehension techniques are effective and which are not we conducted a controlled experiment. Results from our experiment indicate that embedding HTML links into specification files can make it easier for novice programmers to understand how to use a component from a library. Specifically linking derived types to their parent types and subprogram parameter and return types to their declarations can help programmers to learn how to use a component. Linking of child packages to parent packages is useful, although not as essential as the other links.

8. Acknowledgments

This work was funded in part by DISA grant DCA100-95-1-1-0011. We would like to thank Drs. Rachelle Heller, Dianne Martin, Jack Beidler, and Claudia Pearce for valued research assistance.

9. References

1. Biddle, R., Tempero, E., “Explaining inheritance: A Code Reusability Perspective,” ACM SIGCSE Bulletin, March 1996, pp. 217-221.

2. Biggerstaff, T., and Richter, C., “Reusability Frameworks, Assessment, and Directions,” IEEE Software, March 1987, pp. 41-49.

3. Capretz, L., and Lee, P., “Reusability and Life Cycle Issues Within an Object-Oriented Methodology,” Proceedings of the Technology of Object-Oriented Languages and Systems Conference 8, 1992, pp. 139-150.

4. Helm, R., and Maarek, Y., “Integrating Information Retrieval and Domain Specific Approaches for Browsing and Retrieval in Object-Oriented Class Libraries,” ACM OOPSLA 91 Conference Proceedings, October 1991, pp. 47-61.

5. Kann, C. W., et al., “Experiences Using Ada and Java in Computer Science Education,” Proceedings of the Tenth ASEET Symposium, Prescott, AZ, June 1996, pp. 41-49.

6. Kolling, M., and Rosenberg, J., “BLUE - A Language for Teaching Object-Oriented Programming,” ACM SIGCSE Bulletin, March 1996, pp. 190-194.

7. Tewari, R., “Software Reuse and Object-Oriented Software Engineering in the Undergraduate Curriculum,” ACM SIGCSE Bulletin, March 1995, pp. 253-257.