3.3. Offline mode: separating instrumentation and execution

As convenient as the on-the-fly mode is, in many cases it is not sufficient. For example, running a commercial J2EE container in a custom instrumenting classloader is practically impossible. Certain (bad) coding practices also fail for code executing in a custom classloader. Finally, in large-scale development there is a common need to collect and merge coverage data from multiple execution runs and multiple JVM processes.

This is where separate instrument/execute/report phases become a necessity. This section repeats the previous exercise using the <emma> ANT task, which provides several subtasks for managing offline instrumentation: <instr>, <merge>, and <report>. In a typical ANT build each <emma> tag acts as a container for an arbitrary sequence of sub-tags. This design allows for a simple form of build flow control, whereby entire sequences of EMMA commands can be disabled at a single point.

Let's go back to the starting point of the previous section and assume that you have a build.xml file with EMMA tasks imported and the following build infrastructure created:

  <!-- root directory for the example source code: --> 
  <property name="src.dir" value="${basedir}/src" />

  <!-- javac class output directory: -->
  <property name="out.dir" value="${basedir}/out" />

  <!-- output directory used for EMMA work files and coverage reports: -->
  <property name="coverage.dir" value="${basedir}/coverage" />

  <target name="init" >
    <mkdir dir="${out.dir}" />
    <path id="run.classpath" >
      <pathelement location="${out.dir}" />
    </path>
  </target>

  <target name="compile" depends="init" description="compiles the example source code" >
    <javac debug="on" srcdir="${src.dir}" destdir="${out.dir}" />
  </target>

  <target name="run" depends="init, compile" description="runs the examples" >
    <java classname="Main" classpathref="run.classpath" />
  </target>

In a real-world project the actual application could be either an end user application or your test framework driver. Adding offline coverage instrumentation and reporting to this build is not much harder than it was in the command line tools case, in Section 2.3, “Offline mode: separating instrumentation and execution”. All you need to do is sandwich <java> (or your test framework driver, or anything that can run on Java classes) between EMMA's <instr> and <report>:

  <target name="emma" description="turns on EMMA instrumentation/reporting" >
    <property name="emma.enabled" value="true" />
    <!-- EMMA instr class output directory: -->
    <property name="out.instr.dir" value="${basedir}/outinstr" />
    <mkdir dir="${out.instr.dir}" />
  </target>

  <target name="run" depends="init, compile" description="runs the examples" >
    <emma enabled="${emma.enabled}" >
      <instr instrpathref="run.classpath"
             destdir="${out.instr.dir}"
             metadatafile="${coverage.dir}/metadata.emma"
             merge="true"
      />
    </emma>

    <java classname="Main" fork="true" >
      <classpath>
        <pathelement location="${out.instr.dir}" />
        <path refid="run.classpath" />
        <path refid="emma.lib" />
      </classpath> 
      <jvmarg value="-Demma.coverage.out.file=${coverage.dir}/coverage.emma" />
      <jvmarg value="-Demma.coverage.out.merge=true" />
    </java>

    <emma enabled="${emma.enabled}" >
      <report sourcepath="${src.dir}" >
        <fileset dir="${coverage.dir}" >
          <include name="*.emma" />
        </fileset>

        <txt outfile="${coverage.dir}/coverage.txt" />
        <html outfile="${coverage.dir}/coverage.html" />
      </report>
    </emma>
  </target>

When EMMA instrumentation is enabled via the emma.enabled build property, the sequence of logic here is as follows:

	Again, there is an emma helper target to set `emma.enabled` to `true`. Additionally, this target defines the `${out.instr.dir}` property for coverage-instrumented output. It is important that this property exist only when `emma.enabled=true`, as you will see later.
	EMMA's offline instrumentor executes and copies instrumented classes into `${out.instr.dir}`. Class metadata is dumped into a `metadata.emma` file. (Note that <instr> accepts the same `run.classpath` path element reference as the original build: this is what makes EMMA so easy to integrate with ANT.)
	The application runs with instrumented classes.
	In version 2.0, EMMA's runtime coverage data is dumped by a JVM exit handler, and for this to happen <java> needs to be forked.
	As in the command line case, the instrumented classes need to be first in the classpath. Note that this only happens when the `${out.instr.dir}` property has been defined by the emma helper target. Also, <java> needs to have `emma.jar` in its classpath (recall the path reference created in Section 3.1, “Adding EMMA tasks to your ANT build”).
	For certainty, <java> is configured with an explicit filename that accepts runtime coverage profile data.
	Each time you run, do you want to accumulate coverage data or discard previous results? The default is the former, which is what is forced explicitly above. Note, however, that you could set `coverage.out.merge` to `false` and make EMMA delete the previous coverage dump file before creating a new one.
	EMMA report processor executes.
	The report processor combines the class metadata and runtime coverage profile to produce a couple of reports.
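
To illustrate the accumulate-or-discard choice described above, here is a minimal sketch of the same <java> invocation with merging turned off, so that each run starts from a fresh coverage.emma file. It reuses only names already present in the build above; whether discarding earlier data is appropriate depends on your workflow.

    <java classname="Main" fork="true" >
      <classpath>
        <pathelement location="${out.instr.dir}" />
        <path refid="run.classpath" />
        <path refid="emma.lib" />
      </classpath>
      <jvmarg value="-Demma.coverage.out.file=${coverage.dir}/coverage.emma" />
      <!-- false: delete any previous coverage dump instead of merging into it -->
      <jvmarg value="-Demma.coverage.out.merge=false" />
    </java>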

Overwrite mode

EMMA can do "in place" instrumentation (overwrite mode), whereby the classes and jar files are overwritten with their instrumented versions. However, this is easy to abuse, so the build above is careful to create a separate directory for coverage-instrumented output.
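
For comparison, in-place instrumentation might look roughly like the sketch below. It assumes that <instr> selects this behavior through a mode attribute set to overwrite; check the reference manual for the exact attribute and values your EMMA version accepts. No destdir is given because the classes under run.classpath are themselves replaced.

    <emma enabled="${emma.enabled}" >
      <!-- assumption: mode="overwrite" requests in-place instrumentation -->
      <instr instrpathref="run.classpath"
             mode="overwrite"
             metadatafile="${coverage.dir}/metadata.emma"
             merge="true"
      />
    </emma>

After such a step ${out.dir} holds instrumented classes, so returning to a plain build requires recompiling into a clean directory. This is one reason the separate-directory approach is the safer default.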

Let's see the new build in action. Again, code coverage is turned on only if the emma target appears on the ANT command line before any run targets:

>ant emma run
Buildfile: build.xml

emma:

init:
    [mkdir] Created dir: .../examples/out

compile:
    [javac] Compiling 4 source files to .../examples/out

run:
[emma.instr] processing instrumentation path ...
[emma.instr] instrumentation path processed in 200 ms
[emma.instr] [3 class(es) instrumented, 0 resource(s) copied]
[emma.instr] metadata merged into [.../coverage/metadata.emma] {in 31 ms}
     [java] EMMA: collecting runtime coverage data ...
     [java] main(): running doSearch()...
     [java] main(): done
     [java] EMMA: runtime coverage data written to [.../coverage/coverage.emma] {in 15 ms}
[emma.report] 2 file(s) read and merged in 0 ms
[emma.report] writing [txt] report to [.../coverage/coverage.txt] ...
[emma.report] writing [html] report to [.../coverage/coverage.html] ...

BUILD SUCCESSFUL
Total time: 7 seconds

Instrumenting just the right classes

Unlike some other tools, EMMA's design is based around filtering for the right set of classes at instrumentation time. Doing so allows EMMA to scale to enterprise size projects and ensures the best performance throughout all three offline stages: instrumentation, execution, and reporting. To select a subset of classes to be processed, you can use two complementary techniques:

Segment your ANT path elements well; do not lump everything into just one path. Specifically, keep third-party libraries, your application classes, and testcase classes in separate path elements. It will be easy to combine them when needed, and code coverage will be easier because you will have just the right path element to use as <instr>'s instrpath attribute or nested element: your application classes.
Additionally, you can give <instr> several <filter> nested elements, each potentially containing a list of class name inclusion/exclusion patterns. This will allow you to zoom in on just the right part of your application. Adding an ANT command-line override for one of those will allow every developer on the team to narrow things down to their own module. See the exact ANT syntax for specifying coverage filters in the reference manual.
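
A minimal sketch combining both techniques follows. The path ids app.classpath and lib.classpath, the ${basedir}/lib directory, the com.example package pattern, and the emma.filter property name are illustrative assumptions rather than part of the build above; the reference manual remains the authority on <filter> syntax.

  <!-- hypothetical: application classes only, the one path worth instrumenting -->
  <path id="app.classpath" >
    <pathelement location="${out.dir}" />
  </path>

  <!-- hypothetical: third-party libraries, kept separate and never instrumented -->
  <path id="lib.classpath" >
    <fileset dir="${basedir}/lib" includes="*.jar" />
  </path>

  <!-- empty by default; a developer can override it on the ANT command line -->
  <property name="emma.filter" value="" />

    <emma enabled="${emma.enabled}" >
      <instr instrpathref="app.classpath"
             destdir="${out.instr.dir}"
             metadatafile="${coverage.dir}/metadata.emma"
             merge="true"
      >
        <!-- hypothetical patterns: keep application packages, drop test classes -->
        <filter includes="com.example.*" />
        <filter excludes="*Test*" />
        <!-- per-developer override supplied via -Demma.filter=... -->
        <filter value="${emma.filter}" />
      </instr>
    </emma>

Because ANT properties are immutable once set, a value passed as -Demma.filter=... on the command line takes precedence over the empty default declared in the build file.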

Note that your build can contain several instrumentation and execution stages and <report> will happily merge all of the results in memory before generating the reports. To merge all files on disk (for maintenance and disk storage reasons) you can use the <merge> subtask:

  <target name="merge" description="demonstrates dump file merging" >
    <emma>
      <merge outfile="${coverage.dir}/session.emma" >
        <fileset dir="${coverage.dir}" >
          <include name="*.emma" />
        </fileset>
      </merge>
    </emma>
  </target>

In EMMA's offline mode, you are in complete control of mixing and matching metadata and coverage data from different application runs. You can instruct all tools to merge all data in the same file or keep everything in separate file repositories. EMMA tools default to merge=true output file mode for metadata and runtime coverage data, but properties exist to alter this behavior. See Chapter 3, EMMA Property Reference in the reference manual for full details on EMMA configuration.
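
As one possible way to keep runs in separate files, the sketch below writes each execution to its own coverage file. The run.id property is a hypothetical name introduced only for this illustration; the remaining names come from the build above.

  <!-- hypothetical: identifies one execution run; override with -Drun.id=... -->
  <property name="run.id" value="default" />

    <java classname="Main" fork="true" >
      <classpath>
        <pathelement location="${out.instr.dir}" />
        <path refid="run.classpath" />
        <path refid="emma.lib" />
      </classpath>
      <!-- one coverage file per run instead of a single accumulated file -->
      <jvmarg value="-Demma.coverage.out.file=${coverage.dir}/coverage-${run.id}.emma" />
      <jvmarg value="-Demma.coverage.out.merge=false" />
    </java>

The *.emma filesets already used by <report> and <merge> above would pick up every such file, so the per-run data can still be combined whenever a consolidated report is needed.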

To summarize, an existing build.xml can be converted to use EMMA's offline instrumentation mode by following these steps:

add EMMA task definitions
add the necessary <instr> tasks, making sure that the application classes are instrumented before they are used at runtime
configure classpaths and coverage inclusion/exclusion filters such that only your application classes, not third-party libraries or testcases, are instrumented
make sure the instrumented classes will be prepended to your application's runtime classpath (that is, if you are not packaging them as EJBs, a web app, etc. instead)
add a <report> task that aggregates class metadata and runtime coverage profiles for reporting as needed
make sure there is a way to turn coverage instrumentation on and off (you can use either the existing ANT solutions for that or the enabled attribute on all EMMA tasks)

Further reading. This has been a quick intro to EMMA's offline instrumentation tools for ANT. For further details read the reference manual starting with Section 3, <instr>/instr.