
Automation for the people: Remove the smell from your build scripts

2009-09-16 10:27

Practices to create consistent, repeatable, and maintainable builds

Paul Duvall, CTO, Stelligent Incorporated

Paul Duvall is the CTO of Stelligent Incorporated, which helps companies
address software quality with effective developer testing strategies and
Continuous Integration techniques that enable teams to monitor and improve
code quality early and often. He is a contributing author to the UML™ 2
Toolkit and is currently co-authoring Continuous Integration: Improving
Software Quality and Reducing Risk (Addison-Wesley).
Summary:
How much time do you spend maintaining project
build scripts? Probably much more than you'd expect or would like to
admit. It doesn't have to be such a painful experience. Development
automation expert Paul Duvall uses this installment of Automation for the people
to demonstrate how to improve a number of common build practices that
prevent teams from creating consistent, repeatable, and maintainable
builds.

I dislike the term "smell" when it comes to describing something
like code. It feels strange to speak anthropomorphically about bits and
bytes. It's not that the word "smell" doesn't accurately reflect a
symptom that indicates that code may be wrong; it just sounds funny to
me. Yet I am choosing to perpetuate my vexation to describe software
builds because, frankly, many build scripts I've seen over the years stink.

Often, even great programmers have difficulty constructing a build
script; it's as if they had only recently learned how to write procedural
code: writing large monolithic build files, copying and pasting
scripted code, hard-coding attributes, and so on. I've always wondered
why that is (maybe because build scripts don't get compiled into
something a customer will eventually use?). Yet we all know that build
scripts are central to creating the code the customer eventually uses,
and if those scripts are a big ball of mud, creating that software
efficiently becomes challenging.

Thankfully, you can easily employ a number of practices on a build
(whether it be Ant, Maven, or even a custom one) that will go a long
way toward keeping your builds consistent, repeatable, and
maintainable. One effective way to learn how to create better build
scripts is to see what not to do, understand why that is the case, and
then see the correct way to do something. In this article, I take that
approach. I detail the following nine common build smells, why you
should avoid them, and how to fix them:

IDE-only builds

Copy-and-paste scripting

Long targets

Large build files

Failing to clean up

Hard-coded values

Builds that succeed when tests fail

Magic machines

A lack of style

About this series

As developers, we work to automate processes for users; yet, many of us
overlook opportunities to automate our own development processes. To
that end, Automation for the people is a series of articles dedicated to
exploring the practical uses of automating software development
processes and teaching you when and how to apply automation successfully.

Although this is not meant to be a comprehensive list, it does represent
some of the more common smells I've encountered over the years in build
scripts I've read and written. Also, some tools, such as Maven, which
are designed to handle much of the plumbing associated with builds, can
help alleviate a portion of these smells, but many of these issues can
occur no matter which tool you use.

Avoid the aroma of IDE-only builds

An IDE-only build is a build that can be executed only
through a developer's IDE and, unfortunately, this seems to be one of
the more common build smells. The problem with an IDE-only build is
that it can perpetuate the "works on my machine" problem, where software
works in a developer's environment but not in anyone else's
environment. What's more, because IDE-only builds are not very
automatable, they are extremely challenging to integrate into a
Continuous Integration environment; in fact, IDE-only builds are often
impossible to automate without human intervention.

Let me be clear: It's fine to use an IDE to execute a build, but your IDE shouldn't be the only thing capable
of building software. In particular, a fully scripted build enables
teams to use multiple IDEs because the dependencies will be from the
IDE to the build and not the other way around, as shown in Figure 1:

Figure 1. IDE and build dependencies

IDE-only builds prohibit automation, and the only way to fix this
stench is to create a scriptable build. There is enough documentation
and a plethora of books out there to guide you on your way (see
Resources), and projects like Maven make it extremely easy to define a
build from scratch too. Either way, pick a build platform and make your
project scriptable as soon as possible.
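
As a starting point, a scripted build does not need to be elaborate. The following is only a minimal sketch of an Ant build file, not a recommended layout: the project name, directory locations, and target names are assumptions chosen for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical minimal build.xml; every name and path here is an assumption -->
<project name="brewery" default="compile" basedir=".">

  <!-- Locations are resolved relative to basedir, not to any IDE workspace -->
  <property name="src.dir" location="src"/>
  <property name="build.dir" location="build"/>
  <property name="classes.dir" location="${build.dir}/classes"/>

  <!-- Remove output left over from earlier builds -->
  <target name="clean">
    <delete dir="${build.dir}" quiet="true"/>
  </target>

  <!-- Compile the production sources from the command line -->
  <target name="compile" depends="clean">
    <mkdir dir="${classes.dir}"/>
    <javac srcdir="${src.dir}" destdir="${classes.dir}"
           debug="true" includeantruntime="false"/>
  </target>

</project>
```

With a file like this in the project root, running ant compile from a shell produces the same result on a developer workstation and on a Continuous Integration server, and an IDE can simply invoke the same target.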


Copy-and-paste is like cheap perfume

Duplicate code is a common problem on software projects. In fact, even
many popular open source projects have duplication percentages in the
20-30 percent range. And just as code duplication can make a software
program more difficult to maintain, so too does duplicate code in build
scripts. For instance, imagine you need to reference specific files
through Ant's fileset type, as shown in Listing 1:

Listing 1. Copy-and-paste Ant script

<fileset dir="./brewery/src">
  <include name="**/*.java"/>
  <exclude name="**/*.groovy"/>
</fileset>

If you need to refer to this set of files elsewhere, say for
compilation, inspection, or generating documentation, you may end up
using the same fileset in multiple places, and if, at a later point, you
need to make a change to that fileset (say to exclude .groovy files),
you may end up needing to make the change in multiple places.
Clearly, this isn't a maintainable solution; however, fixing this smell
is simple.

Ant's patternset type, shown in Listing 2, allows me to reference a
logical name that represents the files I need. Now when I need to add
files to (or remove files from) the fileset, I have to do it only once.

Listing 2. Using a patternset

<patternset id="sources.pattern">
  <include name="**/*.java"/>
  <exclude name="**/*.groovy"/>
</patternset>

...

<fileset dir="./brewery/src">
  <patternset refid="sources.pattern"/>
</fileset>

This fix will look familiar to anyone versed in object-oriented
programming: Rather than defining the same logic over and over again in
various classes, an established practice is to place that logic into a
method, which can be called in various places. This method then becomes
a single point of maintenance, limiting cascading defects and fostering
reuse.


Don't savor long targets

In his book, Refactoring, Martin Fowler describes the issue with the
Long Method code smell quite nicely as "the longer a procedure is, the
more difficult it is to understand." Long methods, in essence, also end
up having too much responsibility. When it comes to builds, the Long
Target build smell presents a script that is more difficult to
understand and maintain. Listing 3 shows a relatively long target:

Listing 3. Long target

<target name="run-tests">
  <mkdir dir="${classes.dir}"/>
  <javac destdir="${classes.dir}" debug="true">
    <src path="${src.dir}" />
    <classpath refid="project.class.path"/>
  </javac>
  <javac destdir="${classes.dir}" debug="true">
    <src path="${test.unit.dir}"/>
    <classpath refid="test.class.path"/>
  </javac>
  <mkdir dir="${logs.junit.dir}" />
  <junit fork="yes" haltonfailure="true" dir="${basedir}" printsummary="yes">
    <classpath refid="test.class.path" />
    <classpath refid="project.class.path"/>
    <formatter type="plain" usefile="true" />
    <formatter type="xml" usefile="true" />
    <batchtest fork="yes" todir="${logs.junit.dir}">
      <fileset dir="${test.unit.dir}">
        <patternset refid="test.sources.pattern"/>
      </fileset>
    </batchtest>
  </junit>
  <mkdir dir="${reports.junit.dir}" />
  <junitreport todir="${reports.junit.dir}">
    <fileset dir="${logs.junit.dir}">
      <include name="TEST-*.xml" />
      <include name="TEST-*.txt" />
    </fileset>
    <report format="frames" todir="${reports.junit.dir}" />
  </junitreport>
</target>

This long target (believe me, I've seen much longer ones) is
performing four distinct processes: compiling source, compiling tests,
running JUnit tests, and creating a JUnitReport. That's a lot of
responsibility, not to mention the complexity of having all that XML in
one place. This target can be broken into four distinct, logical
targets, as demonstrated in Listing 4:

Listing 4. Extract targets

<target name="compile-src">
  <mkdir dir="${classes.dir}"/>
  <javac destdir="${classes.dir}" debug="true">
    <src path="${src.dir}" />
    <classpath refid="project.class.path"/>
  </javac>
</target>

<target name="compile-tests">
  <mkdir dir="${classes.dir}"/>
  <javac destdir="${classes.dir}" debug="true">
    <src path="${test.unit.dir}"/>
    <classpath refid="test.class.path"/>
  </javac>
</target>

<target name="run-tests" depends="compile-src,compile-tests">
  <mkdir dir="${logs.junit.dir}" />
  <junit fork="yes" haltonfailure="true" dir="${basedir}" printsummary="yes">
    <classpath refid="test.class.path" />
    <classpath refid="project.class.path"/>
    <formatter type="plain" usefile="true" />
    <formatter type="xml" usefile="true" />
    <batchtest fork="yes" todir="${logs.junit.dir}">
      <fileset dir="${test.unit.dir}">
        <patternset refid="test.sources.pattern"/>
      </fileset>
    </batchtest>
  </junit>
</target>

<target name="run-test-report" depends="compile-src,compile-tests,run-tests">
  <mkdir dir="${reports.junit.dir}" />
  <junitreport todir="${reports.junit.dir}">
    <fileset dir="${logs.junit.dir}">
      <include name="TEST-*.xml" />
      <include name="TEST-*.txt" />
    </fileset>
    <report format="frames" todir="${reports.junit.dir}" />
  </junitreport>
</target>

As you can see, because each target has one responsibility, the code
in Listing 4 is much easier to follow. By isolating targets based on
purpose, you can reduce the complexity and, furthermore, provide the
capability to use the targets in different contexts, enabling reuse if
necessary.


Large build files also have a strong scent

Fowler also identifies the Large Class as a code smell. With build
scripts, a similar smell is the large build file, which is amazingly
difficult to read. It's hard to know which target is doing what and
what the target's dependencies are. This, again, creates a maintenance
issue; what's more, enormous build files usually have quite a lot of
cut-and-paste aspects to them.

To reduce the size of build files, you can seek portions of the script
that are logically related and extract those aspects into smaller build
files that are executed by the main build file (for example, in Ant you
can call other build files using the ant task).

Typically, I like to break up build scripts by core function and
ensure they can be executed as stand-alone scripts (think build
componentization). For example, I like to define four types of
developer tests in my Ant builds: unit, component, system, and
functional. Furthermore, I also like to run four types of automated
inspectors: coding standard, dependency analysis, code coverage, and
code complexity. Instead of placing the execution of these tests and
inspectors in one monolithic build script (along with compilation,
database integration, and deployment), I extract the test and inspector
execution targets into two separate build files as demonstrated in
Figure 2:

Figure 2. Extract build files

Smaller, more concise build files are much easier to maintain and
understand; in fact, this pattern happens to hold true for code as
well. Seems like we're seeing a pattern here, no?
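
The extraction described above can be sketched with Ant's ant task. This is an illustration under stated assumptions rather than the layout from Figure 2: the file names build-test.xml and build-inspect.xml, and the target names inside them, are hypothetical.

```xml
<project name="brewery" default="all" basedir=".">

  <!-- Hypothetical file names: build-test.xml and build-inspect.xml are assumed
       to sit next to this file and to define the targets called below. -->

  <!-- Delegate all developer tests to the test build file -->
  <target name="run-tests">
    <ant antfile="build-test.xml" target="run-tests" inheritAll="true"/>
  </target>

  <!-- Delegate all automated inspections to the inspection build file -->
  <target name="run-inspections">
    <ant antfile="build-inspect.xml" target="run-inspections" inheritAll="true"/>
  </target>

  <target name="all" depends="run-tests, run-inspections"/>

</project>
```

Because each extracted file is an ordinary Ant project, it can also be run on its own (for example, ant -f build-test.xml run-tests), provided it defines or loads the properties it needs rather than relying only on those inherited from the main build.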


Not cleaning up

Builds that don't strictly reduce all underlying assumptions are a
disaster waiting to happen. For instance, if your build doesn't remove
generated binaries that contain stale data, an error could arise from a
file left over from a previous build. Or, perhaps even worse, a build
may be "successful" only because files from a previous build were
still present.

Fortunately, the solution is straightforward: You can easily eliminate
assumptions by removing all generated directories and files from any
previous builds. This simple action reduces assumptions and assures
that your build's success or failure status is accurate. Listing 5
demonstrates an example of cleaning a build environment using the
delete Ant task to remove any files or directories used in previous
builds:

Listing 5. Cleaning up before yourself

<target name="clean">
  <delete dir="${logs.dir}" quiet="true" failonerror="false"/>
  <delete dir="${build.dir}" quiet="true" failonerror="false"/>
  <delete dir="${reports.dir}" quiet="true" failonerror="false"/>
  <delete file="cobertura.ser" quiet="true" failonerror="false"/>
</target>

Stray files from older builds have been known to cause many an
unnecessary headache. Do yourself a favor and always remove any
artifact your build creates before running a build.
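
One way to make sure the cleanup always happens is to put the clean target first in the dependency list of whatever target starts a full build. A minimal sketch, assuming the clean target from Listing 5 and the run-tests target from Listing 4; the full-build target name is hypothetical:

```xml
<!-- Hypothetical entry point. Ant evaluates the depends list from left to
     right, so clean runs before run-tests and the targets it depends on. -->
<target name="full-build" depends="clean, run-tests"/>
```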


The stench of hard-codedness

Just as copy-and-paste programming prohibits reuse, so too do hard-coded
values. When build scripts contain hard-coded values and an aspect
requires modification, you need to modify that value in more than one
location. Or worse, you could miss one and have subtle errors
associated with mismatching values. Moreover, if you follow my advice
and choose to use multiple build scripts, hard-coded values can become
the ultimate challenge in build maintenance. Trust me on that one!

For example, in Listing 6, the run-simian target has a number of
hard-coded paths and values, most notably the _reports directory:

Listing 6. Hard-coded values

<target name="run-simian">
  <taskdef resource="simiantask.properties"
           classpath="simian.classpath" classpathref="simian.classpath" />
  <delete dir="./_reports" quiet="true" />
  <mkdir dir="./_reports" />
  <simian threshold="2" language="java"
          ignoreCurlyBraces="true" ignoreIdentifierCase="true" ignoreStrings="true"
          ignoreStringCase="true" ignoreNumbers="true" ignoreCharacters="true">
    <fileset dir="${src.dir}"/>
    <formatter type="xml" toFile="./_reports/simian-log.xml" />
  </simian>
  <xslt taskname="simian"
        in="./_reports/simian-log.xml"
        out="./_reports/Simian-Report.html"
        style="./_config/simian.xsl" />
</target>

Hard-coding the _reports directory may make things difficult should I
decide to push my Simian reports to another directory; furthermore, if
other tools use this directory elsewhere in the script, someone could
easily mistype the directory name, causing reports to show up in
different directories. It is much easier and more maintainable to
define a property that points to this directory. Then, throughout the
script, I can reference the property, which means changes can be
localized to one spot: the property definition. Listing 7 shows a
refactored run-simian target:

Listing 7. Using properties

<target name="run-simian">
  <taskdef resource="simiantask.properties"
           classpath="simian.classpath" classpathref="simian.classpath" />
  <delete dir="${reports.simian.dir}" quiet="true" />
  <mkdir dir="${reports.simian.dir}" />
  <simian threshold="${simian.threshold}" language="${language.type}"
          ignoreCurlyBraces="true" ignoreIdentifierCase="true" ignoreStrings="true"
          ignoreStringCase="true" ignoreNumbers="true" ignoreCharacters="true">
    <fileset dir="${src.dir}"/>
    <formatter type="xml" toFile="${reports.simian.dir}/${simian.log.file}" />
  </simian>
  <xslt taskname="simian"
        in="${reports.simian.dir}/${simian.log.file}"
        out="${reports.simian.dir}/${simian.report.file}"
        style="${config.dir}/${simian.xsl.file}" />
</target>

Hard-coded values don't facilitate flexibility; they inhibit it. Just as you should avoid hard-coding database connection
Strings in your source code, you should also avoid hard-coding things like paths in build scripts.
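
The properties referenced in the refactored run-simian target have to be defined somewhere. The following is a sketch of what those definitions might look like; the values are simply the literals from Listing 6, and the build.properties file name is an assumption.

```xml
<!-- Optional overrides: Ant properties are immutable once set, so anything
     defined in this (hypothetical) file takes precedence over the defaults below. -->
<property file="build.properties"/>

<!-- Defaults, using the values that were hard-coded in Listing 6 -->
<property name="config.dir" location="_config"/>
<property name="reports.simian.dir" location="_reports"/>
<property name="simian.threshold" value="2"/>
<property name="language.type" value="java"/>
<property name="simian.log.file" value="simian-log.xml"/>
<property name="simian.report.file" value="Simian-Report.html"/>
<property name="simian.xsl.file" value="simian.xsl"/>
```

Using the location attribute for directories makes Ant resolve them against the project's basedir, so the build does not depend on the directory from which it happens to be launched.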


Build succeeds when tests reek (or fail)

A build is much more than just source code compilation; it may also
include the execution of automated developer tests, and if you want to
keep your software functioning, don't let even one failed test creep
into a build. After all, what's the point of having tests if they can't
be trusted?

Listing 8 is an example of this build smell. Notice that the
haltonfailure attribute of the junit Ant task is set to false (its
default value). This means the build will not fail even if JUnit tests
fail.

Listing 8. Smell: Build succeeds although the tests fail

<junit fork="yes" haltonfailure="false"
       dir="${basedir}" printsummary="yes">
  <classpath refid="test.class.path" />
  <classpath refid="project.class.path"/>
  <formatter type="plain" usefile="true" />
  <formatter type="xml" usefile="true" />
  <batchtest fork="yes" todir="${logs.junit.dir}">
    <fileset dir="${test.unit.dir}">
      <patternset refid="test.sources.pattern"/>
    </fileset>
  </batchtest>
</junit>

There are a couple of approaches to preventing this build smell. The
first is simply to set the haltonfailure attribute to true. This stops
the build, and marks it as failed, as soon as a test fails.

The only thing I don't like about this solution is that I like to
see what percentage of my tests have failed so that I can see patterns
in the failure. Therefore, the second approach is to set a property if
any of the tests fail. Then, I configure Ant to fail the build after it
has executed all of the tests. Either approach will work. Listing 9
demonstrates the second approach using the tests.failed property:

Listing 9. Tests fail the build

<junit dir="${basedir}" haltonfailure="false" printsummary="yes"
       errorProperty="tests.failed" failureproperty="tests.failed">
  <classpath>
    <pathelement location="${classes.dir}" />
  </classpath>
  <batchtest fork="yes" todir="${logs.junit.dir}" unless="testcase">
    <fileset dir="${src.dir}">
      <include name="**/*Test*.java" />
    </fileset>
  </batchtest>
  <formatter type="plain" usefile="true" />
  <formatter type="xml" usefile="true" />
</junit>

<fail if="tests.failed" message="Test(s) failed." />

Builds that pass, even though tests fail, provide a false sense of
security. If tests fail, fail the build: better to deal with a problem
early than late one night when you'd rather be sleeping.
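
A practical benefit of the property-based approach is that a report can be generated between running the tests and failing the build, so the failure patterns are still visible. The sketch below combines pieces of Listing 4 and Listing 9 under that assumption; the target name run-tests-with-report is hypothetical, and the properties and path references are the ones used in the earlier listings.

```xml
<target name="run-tests-with-report" depends="compile-src,compile-tests">
  <mkdir dir="${logs.junit.dir}"/>
  <mkdir dir="${reports.junit.dir}"/>

  <!-- Run every test; record failures in a property instead of halting -->
  <junit fork="yes" haltonfailure="false" printsummary="yes"
         errorproperty="tests.failed" failureproperty="tests.failed">
    <classpath refid="test.class.path"/>
    <classpath refid="project.class.path"/>
    <formatter type="xml" usefile="true"/>
    <batchtest todir="${logs.junit.dir}">
      <fileset dir="${test.unit.dir}">
        <patternset refid="test.sources.pattern"/>
      </fileset>
    </batchtest>
  </junit>

  <!-- Build the HTML report even when some tests failed -->
  <junitreport todir="${reports.junit.dir}">
    <fileset dir="${logs.junit.dir}" includes="TEST-*.xml"/>
    <report format="frames" todir="${reports.junit.dir}"/>
  </junitreport>

  <!-- Only now fail the build if any test failed or errored -->
  <fail if="tests.failed" message="Test(s) failed; see ${reports.junit.dir}."/>
</target>
```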


Magic machine smells

Of all the smells covered in this article, this one is probably the most
fetid, for magic machines are those one-of-a-kind magical pieces of
hardware that happen to be the only machines capable of building a
company's software application. This scenario isn't as far-fetched as
it may seem. I've run across these wizardly beasts a number of times in
my career. These machines turn demonic, though, when dependencies are
lost or when the inevitable bit rot strikes.

It's easy to see how a normal machine in a company's infrastructure
can turn enchanted: over time, developers inadvertently added hard
dependencies into the machine's script, made references to fully
qualified directory paths, or even installed tools that only exist on a
select machine, which slowly prevented the build from being able to run
on any other machine. See Figure 3 for an example:

Figure 3. Magic machine

Hard-coded references to a machine, paths that include specific drives
(like C:), and specific machine tools are all red flags that will
quickly hex a machine. Any time you see a reference to the C: drive or
a call to a specific tool (like grep), change the script immediately.
If you catch yourself saying "but the C:/Program Files/ directory is on
every machine" or some variation of this statement, think again.
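
One possible way to break such a spell is to resolve everything relative to the project and to fail early, with a clear message, when an external prerequisite is missing. This is a sketch under assumptions: the lib directory, the JAR name pattern, and the check-environment target name are hypothetical.

```xml
<!-- Expose environment variables as env.* properties -->
<property environment="env"/>

<!-- Hypothetical layout: tool JARs are kept with the project under lib/
     instead of being installed at a fixed location on one machine -->
<property name="lib.dir" location="${basedir}/lib"/>

<path id="simian.classpath">
  <fileset dir="${lib.dir}" includes="simian*.jar"/>
</path>

<!-- Stop with an explicit message if a required setting is absent,
     rather than failing later in an obscure way -->
<target name="check-environment">
  <fail unless="env.JAVA_HOME"
        message="JAVA_HOME must be set to run this build."/>
</target>
```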


Bad style stinks

As with programming style in mainstream languages, there are analogous considerations when managing build scripts.
When considering programming style for build scripts, you need to account for the following:

property names

target names

directory names

environment variable names

indentation

line length

Personally, I prefer to leverage the rules of others as much as possible
when dealing with stylistic conventions. Fortunately, a group of
individuals have created such a reference called The Elements of Ant
Style (see Resources). In it, the authors describe rules such as naming
targets using lowercase with hyphens separating words, line length, and
indentation. Whichever resource you choose, consistently applying
stylistic rules will help in the long-term maintenance of build files.


Builds never smelled so nice

I can put up with the smell of cheap perfume; however, if there's one
thing I can't stand, anthropomorphically speaking, it's the odor of
unmaintainable build scripts. Just like smelly code will surely cost
you valuable time down the road, so too can poorly designed builds. If
the waft of inconsistent, unrepeatable, and unmaintainable builds is in
the air, take the time now to refactor these vital assets. Your
development environment will smell like roses.

Resources

Learn

The Elements of Ant Style: Scripting Ant build scripts.

"Apache Ant 101: Make Java™ builds a snap" (Matt Chapman, developerWorks, December 2003): Whether you're a veteran user of Apache Ant in need of a refresher or just starting out with this open source Java-based build tool, this tutorial provides a wealth of information.

"Make Ant easy with Eclipse" (Prashant Deva, developerWorks, April 2006): Discover the Ant integration features in the Eclipse integrated development environment and learn how to write, build, and debug code in Eclipse through the Ant editor.

"Project management: Maven makes it easy" (Charles Chan, developerWorks, April 2003): Java developer Charles Chan introduces Maven's features and walks you through a complete Maven project setup.

"The Magic Machine AntiPattern" (testearly.com, July 2006): More on the Magic machine build smell.

"10 Bad Build Practices - Part I" (testearly.com, July 2006): More on the bad practice of hard-coding values.

"Build file refactoring" (testearly.com, March 2006): See the Replace Magic Number with Symbolic Constant technique in action.

"Automating the build and test process" (Erik Hatcher, developerWorks, August 2001): Learn how to use Ant to build and test your software.

"Top 15 Ant Best Practices" (Eric Burke, OnJava, December 2003): Effective ways to use the Ant build scripting tool.

The Java technology zone: Hundreds of articles about every aspect of Java programming.

Get products and technologies

Apache Ant: The mother of Java build platforms.

Maven: A powerful build platform built using lessons learned from Ant.
