Logging is a code smell

Use events instead of repetitious logging calls to avoid a well-known code smell.

Introduction

Code smells are issues with either the design or coding style of software such that they negatively impact a program’s comprehensibility and maintainability. Code duplication, a well-known code smell, is when the same code structure appears many times. Studies have shown both positive and negative impacts on duplicating source code, with findings weighted towards detrimental effects. Drawbacks to duplicating code include: propagating bugs, introducing new bugs, increased difficulty to improve or modify a system, increased maintenance costs, and additional resource requirements.

Logging is an example of code duplication. When I started working on my plain text editor, KeenWrite, I deliberately avoided inserting logging statements directly into the code. Topically enough, an overarching goal of the editor is to ease inserting and referencing variables within documents, which helps reduce duplicated content. Later, we’ll circle back to how the editor handles log messages. First, let’s take a deep technical dive into logging.

Logging

Logging is useful because it:

In Java, logging statements are often cloned in the following form, where the call to debug will vary depending on the situation’s severity (e.g., by calling error or info instead):

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

private static final Logger sLogger =
  LogManager.getLogger( Classy.class );

sLogger.debug( "descriptive text" );

Effectively, those lines interweave logging framework knowledge into application logic, without strong justification. This repeated anti-pattern is endemic (as evidenced by Californium, Cassandra, ElasticSearch, Intellij IDEA, JBoss, Jitsi, Selenium, SonarQube, Splunk, and countless others). Although Aspect-Oriented Programming (AOP) can help decouple logging, using AOP can lead to intractable debugging and testing problems. Further, despite Logger being an interface, swapping out the implementation may range from trivial to laborious depending on the extent of its API usage.

In each case—debugging, information, or intervention—logging statements describe the system’s state at a particular place in a program. That is, log messages capture system events.

Events

Conceptually, an event bus is responsible for receiving events and dispatching the events to listeners that have subscribed to particular events. An entity that posts events to the event bus is called a publisher and an entity that receives events is called a subscriber. To draw analogy from the bingo card game: the random number generator is the publisher, the caller is the event bus, the players are the subscribers, and a randomly selected number is an event.

Rather than log in situ, publishing events to an event bus offers the following advantages:

The main disadvantages include:

Transitioning from direct logging to an event-based mechanism is mostly a mechanical change, albeit many of them. Generally speaking, the changes entail reading the log message then transforming the message into a suitably equivalent event class, such as:

// Before
sLogger.debug( "system entered new state" );

// After
NewSystemStateEvent.publish();

The next sections transform real-world log messages into events, based on code fragments from the software applications mentioned previously.

Cassandra

The following code checks a condition then logs when the condition remains true:

if( preciseTime.now() - lastPause < MAX_LOCAL_PAUSE_IN_NANOS ) {
  sLogger.debug( "Not marking nodes down due to local pause" );
  return;
}

As an event, the code would resemble:

final var localPause = preciseTime.now() - lastPause;

if( localPause < MAX_LOCAL_PAUSE_IN_NANOS ) {
  NodeMarkingInactiveEvent.publish( localPause );
  return;
}

Notice how the revised code no longer has a concept of logging, a hard-coded string, or any concerns regarding the frequency that messages are logged. Further, whether the value of the local pause is also logged can be changed without needing to update the core logic, which adheres to the single responsibility principle.

ElasticSearch

The next code snippet embeds a ternary conditional when logging:

sLogger.debug(
  "[{}] testing snapshot [{}] deletion eligibility: {}",
  repository,
  snapshot.snapshotId(),
  eligible ? "ELIGIBLE" : "INELIGIBLE"
);

Using an event removes the log message formatting altogether, giving the event handler the responsibility of formatting the ternary condition (possibly using a resource bundle for internationalization purposes):

SnapshotEligibilityTestEvent.publish( snapshot, repo, eligible );

When developing object-oriented software, keeping concerns separate allows them to vary independently.

JBoss

Here’s a classic logging example pertaining to a security change:

sLogger.debug( "Assign user to role " + roleName );
userRoles.addMember( p );

We can imagine additional behaviours to hook into that event: clearing web session cache, creating an audit trail, emailing a system administrator, forcing a password update, etc. To open that door, we could write:

userRoles.addMember( p );
RoleAssignedEvent.publish( p, roleName );

Using an event helps fix a subtle issue with the original implementation: If the call to addMember fails to assign the role, the log will contain a message that implies the user’s role was assigned. Nobody enjoys debugging red herrings. Knowing that an event may be used by other components, presumably a unit test would exist to verify the program’s correctness; only after a successful role assignment would the event be published, not before.

Jitsi

In the next example, an event kicks off a log message:

PropertyChangeEvent event = // ...

sLogger.error(
  "A PropertyChangeListener threw an exception"
  + " while handling a PropertyChangeEvent.",
  t
);

The event happens when attempting to change a mutable object’s value. If a problem is encountered when changing the value, an error is logged. We can improve system diagnostics by having the event publish a related event as follows:

PropertyChangeErrorEvent.publish( event, t );

Passing the original PropertyChangeEvent instance into our new event affords the opportunity to inform the end user of both the failure and the data that caused the error (i.e., the old and new values). This gives more control over the output format by not having to rely on the stack trace message including the old and new values, to say nothing of internationalization.

Selenium

Another randomly selected example shows a double dose of duplication alongside logging call usage differences:

try {
  if( !service.awaitTermination( 5, SECONDS ) ) {
    LOG.warning( format( "Failed to shutdown %s", name ) );
    service.shutdownNow();
  }
} catch( InterruptedException e ) {
  Thread.currentThread().interrupt();
  LOG.log( WARNING, format( "Failed to shutdown %s", name ), e );
  service.shutdownNow();
}

We can remove the duplication by inverting the conditional and publishing a single event before the end:

Exception ex = new RuntimeException();

try {
  if( service.awaitTermination( 5, SECONDS ) ) {
    return;
  }
} catch( final InterruptedException cause ) {
  Thread.currentThread().interrupt();
  ex = new RuntimeException( cause );
}

ShutdownFailedEvent.publish( service, ex );
service.shutdown();

To avoid coupling individual event handlers to a specific logging implementation, we could coerce the shutdown failure event into a log event, such as:

@Subscribe
private void handle( final ShutdownFailedEvent event ) {
  LogEvent.publish( event.getMessage(), event.getException() );
}

Doing so would lead to architectural duplication because that pattern would be repeated everywhere dispatching log events is required. We can improve the design: We have the technology. Well, we don’t, but we will. Read on.

Reporting framework

Let’s separate some concepts that would otherwise lead to repetitious code:

A rough design of each follows.

Register

The register is responsible for adding subscribers to the event bus. Although the primary purpose is for logging, with due care it is possible to build a general framework. Such a framework hinges on using a builder-like pattern to link events with subscribers via a bus. This may resemble the following diagram:

Register

Broadly, one possible sequence for registering events follows:

  1. whenEvents( Event.class ) — register a set of classes
  2. thenFormat( formatter1 ) — define how to format a string
  3. thenFormat( formatter2 ) — optionally, define a second formatting
  4. thenApply( strategy ) — instruct how to handle the formatted string
  5. subscribe() — register the events

For example, we’d like to register the events as follows:

ReportRegister
  .whenEvents(
    NodeMarkingInactiveEvent.class,
    SnapshotEligibilityTestEvent.class,
    RoleAssignedEvent.class,
    ShutdownFailedEvent.class
  )
  .thenFormat( new ClassNameFormatter() )
  .thenFormat( new TextCountFormatter() )
  .thenApply( formatter -> new PeriodicStrategy( formatter, ... ) )
  .subscribe();

When run, the above snippet would log each of the listed events periodically, to avoid spamming the logs. Next, we’ll review formatters, followed up by strategies.

Formats

Formatters are responsible for creating text strings from incoming events. Quite often events need no context. That is, the event itself is sufficient to understand the system state by indicating control flow. Consider the following class diagram:

Formatter

The formatters include:

Formatters separate generating the event message text strings from the actions applied to those strings.

Strategies

Strategies dictate how events are handled. Keeping a generalized framework in mind, we can conceive of the following type hierarchy:

Strategy

The register does not mandate using a particular strategy; however, since the subject is logging, we’ll focus on designing various logging strategies.

Logging strategy

The logging strategy is responsible for writing preformatted text messages to the log. This can be accomplished in a few ways. First, by having a reference (the only reference) to the logger implementation. Second, by publishing its own LogEvent instances to the event bus. Architecturally, either implementation will accomplish the same goal of isolating the logger from the rest of the system.

In the tables that follow, a value in the Event column indicates that an event was published at the time listed in the Time column. Entries in the Log column mark the messages written to the log; a blank entry means the log message for the event was suppressed.

Regulating strategy

A regulating strategy logs the event after a predefined interval has elapsed since the event was logged previously. Consider using a regulating strategy with an interval of five minutes, with example events fired at the times listed in the following table:

Regulating Strategy
EventTimeLog
12:00.00[12:00.00] Message
12:02.61
12:03.17
12:04.98
12:05.02[12:05.02] Message
12:10.00
12:10.02[12:10.02] Message

The first event is logged because it is the first instance. The three subsequent events are suppressed (not logged) because the interval has not elapsed. After the interval elapses, the next event instance is logged immediately. The second last event message is suppressed because five minutes has not passed since the most recent event.

Periodic strategy

A periodic strategy logs whether an event has occurred, but only reports on a fixed schedule. The following table lists entries logged with a five-minute schedule:

Periodic Strategy
EventTimeLog
12:00.00[12:00.00] Message
12:02.00
12:03.50
12:05.00[12:05.00] Message
12:09.12
12:10.00[12:10.00] Message
12:15.00

The second and third events are suppressed because the strategy needs only track whether a particular event type has fired since the previous message was logged. When the timer finally activates, a single message is logged. The final entry in the table is blank because no event transpired between 12:10.00 and 12:15.00.

Note that implementations can share a single timer thread to avoid creating unnecessary threads.

Immediate strategy

Also known as the spamming strategy, the immediate strategy logs each event when it occurs, without delay or further processing. For completeness, this is captured in the following table:

Immediate Strategy
EventTimeLog
12:00.00[12:00.00] Message
12:01.47[12:01.47] Message
12:03.98[12:03.98] Message
12:06.52[12:06.52] Message

Strategy and formatter

Configuring the report register with a periodic strategy, class name formatter, and text count formatter would produce results similar to the following listing:

Periodic Strategy with Multiple Formatters
EventTimeLog
12:00.00[12:00.00] 1 ◆ event
12:02.00
12:03.00
12:05.00[12:05.00] 2 ◆ events; 1 ◇ event
12:08.11
12:10.00[12:10.00] 2 ◆ events; 2 ◇ events
12:15.00

Notice how the behaviour of when the event message is logged is separated from what message is logged, allowing for powerful constructs using very little code. In this manner, both the event and event counts are logged and tracked from a single location in the code, without having to change the core application logic.

Text editor events

My text editor uses a simpler approach because it is a desktop application that has no complex background tasks. Rather than implement formatters, strategies, and a report registry, it formats event messages directly at the call site. There are multiple ways a user is informed that a problem has occurred, which include:

In this contrived example, the R erroneous code (x?) triggers an exception when evaluated by the R engine, as the following source listing shows:

private static String evaluate( final String r ) {
  try {
    return sEngine.eval( r ).toString();
  } catch( final Exception ex ) {
    final var expr = r.substring( 0, min( r.length(), 50 ) );
    clue( get( "status.error.r", expr, ex.getMessage() ), ex );
    throw new IllegalArgumentException( r );
  }
}

The clue method creates a StatusEvent and subsequently publishes it to the event bus. Each handler, therefore, receives the exact same event instance that was published at a single location in the code base (the call to clue, above). Each handler’s code is specific to how the event is presented.

The log view’s simplified event handler adds the event information as an entry into a table widget:

@Subscribe
public void log( final StatusEvent event ) {
  final var logEntry = new LogEntry( event );

  mItems.add( logEntry );
  mTable.scrollTo( logEntry );
}

The status bar’s handler code can only show a single line, so any text in the event message is truncated after a new line prior to changing the widget’s text:

@Subscribe
public void handle( final StatusEvent event ) {
  final var m = event.toString();
  final var i = m.indexOf( '\n' );

  setText( m.substring( 0, i > 0 ? i : m.length() ) );
}

The standard output’s code writes the entire event message to the console verbatim, provided that the application isn’t suppressing status event messages:

@Subscribe
public void handle( final StatusEvent event ) {
  if( !mArgs.quiet() ) {
    System.out.println( event );
  }
}

The same event is handled three different ways without any need to couple the event publication call site to the various status event handlers.

Meta-programming

A footgun is usually a tool that’s designed such that the tool is extremely likely to be used in a way that will—inevitably and invariably—cause problems.

Give developers a footgun, and they’ll shoot their own feet off. Teach developers to use a footgun, and everyone will need crutches.

Sometimes adding debugging to help track down a problem can be too arduous, time-consuming, or overwhelming. Rather than working harder, we can work a little smarter by programming the computer to change the source code on our behalf.

Meta-programming footguns tools can greatly simplify this process of reading Java source code into an abstract syntax tree (AST). An AST is a programmable object model that can simplify updating and saving source files. Here are a few tools that provide such capabilities:

At time of writing, logger.debug returned 12,444,188 results. Those usages include a familiar pattern:

public void arbitraryAction( final Object arg ) {
  logger.debug( "start arbitrary action: " + arg );

By and large those 12 million lines of code were probably hand-written. Instead, a little time spent up-front with an AST API would allow us to implement a tool that can be run as follows:

$ java injector.jar --package=com.company.project.subpackage

The output from the tool could be a new class:

public class MethodEnterEvent {
  static Token publish( final String name, final Object... args ) {
    // ...
  }
}

For every method within every class in the given package, the injector could publish events along the lines of:

public void action( final Object arg ) {
  final var token = MethodEnterEvent.publish( "action", arg );

  // No need to duplicate the method name.
  MethodExitEvent.publish( token );
  return;

Of course, since it’s our injector, we can program it to inject at any scope: package, class, method, private static methods, etc. This technique is similar to AOP, but without modifying class binaries as part of the build process.

Libraries

Numerous publisher-subscriber (a.k.a. pub-sub) event libraries abound. A couple of gems include:

Summary

Logging is a form of code duplication. Code duplication is a well-known code smell. Tightly coupling systems to particular logging implementations by invoking the logger liberally throughout an application makes replacing the logger difficult in practice. Ideally, switching logger implementations would take no more than revising a single source file and creating a new configuration for the replacement library.

See CVE-2021-44832, the exploitable remote code vulnerability in Apache’s Log4J.

Contact

About the Author

My career has spanned tele- and radio communications, enterprise-level e-commerce solutions, finance, transportation, modernization projects in both health and education, and much more.

Delighted to discuss opportunities to work with revolutionary companies combatting climate change.