Delimited flat-file parsing often leads to brittle index-based code. In this post, I show how enums make field positions easier to read and maintain.

In the examples below, we assume the input has already been split:

String[] split = delimitedData.split("\\|");

There are caveats to using the split method this way, but they are outside the scope of this post.
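For readers who want a concrete picture of one such caveat: with the default limit, String.split drops trailing empty strings, so a row whose last columns are empty yields a shorter array than expected. The sketch below only illustrates that behavior; the class and method names are made up for this example and are not part of the code discussed in this post.

```java
class SplitCaveatDemo {

    // Default split (limit 0): trailing empty strings are removed from the result.
    static String[] splitDefault(String line) {
        return line.split("\\|");
    }

    // A negative limit keeps trailing empty strings, so column positions stay stable.
    static String[] splitKeepingTrailing(String line) {
        return line.split("\\|", -1);
    }

    public static void main(String[] args) {
        String line = "a|b||"; // four columns, the last two are empty
        System.out.println(splitDefault(line).length);         // prints 2
        System.out.println(splitKeepingTrailing(line).length); // prints 4
    }
}
```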

Direct indexing

DbData existingData = dbHandler.getExistingData(
    split[2],
    split[7],
    split[8],
    split[9]
);

Direct indexing is compact, but brittle and hard to scan.

Local variables

String id = split[2];
String date = split[7];
String time = split[8];
String reason = split[9];

DbData existingData = dbHandler.getExistingData(
    id, 
    date, 
    time, 
    reason
);

Local variables improve readability at the call site, but the field mappings are still scattered across the code base.

Enum mapping

public enum FlatFileField {
    ID(2),
    DATE(7),
    TIME(8),
    REASON(9);

    private final int index;

    FlatFileField(int index) {
        this.index = index;
    }

    public int index() {
        return index;
    }
}

DbData existingData = dbHandler.getExistingData(
    split[FlatFileField.ID.index()],
    split[FlatFileField.DATE.index()],
    split[FlatFileField.TIME.index()],
    split[FlatFileField.REASON.index()]
);
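If the repeated split[...index()] lookups feel noisy, one possible refinement is to let the enum read its own field. The sketch below is a variant of FlatFileField with a hypothetical from method (not part of the original enum) that also reports short rows with a descriptive message instead of a bare ArrayIndexOutOfBoundsException.

```java
enum FlatFileField {
    ID(2),
    DATE(7),
    TIME(8),
    REASON(9);

    private final int index;

    FlatFileField(int index) {
        this.index = index;
    }

    int index() {
        return index;
    }

    // Hypothetical helper: returns this field's value from an already split row.
    String from(String[] fields) {
        if (index >= fields.length) {
            throw new IllegalArgumentException(
                "Missing field " + name() + " at index " + index
                    + "; row has only " + fields.length + " fields");
        }
        return fields[index];
    }
}
```

With a helper like this, the call site would read FlatFileField.ID.from(split) rather than split[FlatFileField.ID.index()], and the bounds check lives in one place.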

Comparison

| Approach | Pros | Cons |
|---|---|---|
| Direct indexing | Concise | Uses magic numbers, hard to maintain, higher cognitive load |
| Local variables | Readable at call site | Field mapping still scattered |
| Enum mapping | Centralized field positions, clearer intent | Requires an additional enum |

Takeaway

Enums are a simple way to replace magic numbers with meaningful names when working with delimited data. They improve readability and centralize field positions. When parsing logic grows beyond simple positional access, a dedicated parser or DTO is usually a better choice.
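As a rough idea of what the DTO route could look like, here is a minimal sketch that assumes Java 16 or later for records. ExistingDataKey, its parse method, and the nested Field enum are hypothetical names invented for this illustration; the field positions are the ones used throughout this post.

```java
record ExistingDataKey(String id, String date, String time, String reason) {

    // Field positions, kept in one place next to the parsing logic.
    enum Field {
        ID(2), DATE(7), TIME(8), REASON(9);

        final int index;

        Field(int index) {
            this.index = index;
        }
    }

    // Hypothetical factory: splits one pipe-delimited line and picks out the named fields.
    static ExistingDataKey parse(String line) {
        // The negative limit keeps trailing empty columns, so positions stay stable.
        String[] fields = line.split("\\|", -1);
        if (fields.length <= Field.REASON.index) {
            throw new IllegalArgumentException(
                "Expected at least " + (Field.REASON.index + 1)
                    + " fields but got " + fields.length);
        }
        return new ExistingDataKey(
            fields[Field.ID.index],
            fields[Field.DATE.index],
            fields[Field.TIME.index],
            fields[Field.REASON.index]);
    }
}
```

The caller then works with key.id() or key.date() and never touches an index, which keeps positional knowledge out of the rest of the code.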

  • Baizey@feddit.dk
    4 days ago

    This is just having constants like

    final int FLATFILE_DATE_INDEX = 2

    with more steps and boilerplate?

    • The frustrated developer@programming.dev (OP)
      3 days ago

      That’s absolutely solid feedback, and if you’re only doing a handful of field mappings, I’d agree that this pattern could be used. However, as mappings grow, you tend to end up with a whole bunch of constants, which can be hard to maintain. Or you simply have an overeager developer who thinks it’s a good idea to refactor the code into static final int TWO = 2, which of course leads you right back to the original problem.

      When you group the constants together, you make it clearer how they tie into your domain model, and you make it easier for maintainers to read and extend the code.

      Personally I wouldn’t call an enum boilerplate, since it is quite small and efficient, and I’d rather take that over Constants.java with a group of constants I cannot easily understand.