Data-Oriented Programming

What is DOP?
A Lengthy Example
That was DOP!

Programming Paradigms

The goal of any programming paradigm is to manage complexity.

  • complexity comes in many forms

  • not all paradigms handle all forms equally well

⇝ "It depends"

Object-Oriented Programming

Everything is an object

  • combines state and behavior

  • hinges on encapsulation

  • polymorphism through inheritance

Works best when defining/upholding boundaries.

Mixed Programming

Great use cases for OOP:

  • boundaries between libraries and clients

  • in large programs to enable modular reasoning

Consider a data-oriented approach for:

  • smaller (sub)systems

  • focused on data

Guiding principles:

Brian Goetz formulated them in June 2022:
Data Oriented Programming in Java

I offered an updated version in May 2024:
Data Oriented Programming in Java - Version 1.1

Data-Oriented Programming

Version 1.1:

  • Model data immutably and transparently.

  • Model the data, the whole data,
    and nothing but the data.

  • Make illegal states unrepresentable.

  • Separate operations from data.

A Lengthy Example

Crawling GitHub

Starting with a seed URL:

  1. connect to URL

  2. identify kind of page

  3. identify interesting section

  4. identify outgoing links

  5. for each link, start at 1.
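
The steps above can be sketched as a recursive traversal. This is only a minimal sketch and not the talk's implementation: instead of opening network connections it reads from a made-up in-memory map (URL to outgoing links), and the names CrawlSketch, WEB, and crawl as well as the depth limit are assumptions for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CrawlSketch {

	// hypothetical stand-in for the network: URL -> outgoing links
	static final Map<String, List<String>> WEB = Map.of(
		"gh/issue/1", List.of("gh/pr/2", "example.org"),
		"gh/pr/2", List.of("gh/issue/1"),
		"example.org", List.of());

	static Set<String> crawl(String seedUrl, int depth) {
		Set<String> visited = new LinkedHashSet<>();
		crawl(seedUrl, depth, visited);
		return visited;
	}

	private static void crawl(String url, int depth, Set<String> visited) {
		// stop at the depth limit and never visit a page twice
		if (depth < 0 || !visited.add(url))
			return;
		// steps 1-4 collapsed: "connect" and read the outgoing links
		List<String> links = WEB.getOrDefault(url, List.of());
		// step 5: for each link, start over
		for (String link : links)
			crawl(link, depth - 1, visited);
	}

	public static void main(String[] args) {
		System.out.println(crawl("gh/issue/1", 2));
	}
}
```

The visited set is what keeps the mutual links between the issue and the PR from recursing forever.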

Crawling GitHub

That logic is implemented in:

public class PageTreeFactory {

	public static Page loadPageTree(/*...*/) {
		// [...]
	}
}


What does Page look like?



  • all pages have a url

  • unresolved pages have an error

  • resolved pages have content

  • GitHub pages have:

    • outgoing links

    • issueNumber or prNumber


Possible operations on pages
(and their subtree):

  • display as interactive graph

  • compute graph properties

  • categorize pages by topic

  • analyze mood of interactions

  • process payment for analyses


Example operations going forward:

  • evaluate statistics

  • create pretty string

A Possible Implementation

A single Page class with this API:

public URL url();
public Exception error();
public String content();
public int issueNumber();
public int prNumber();
public Set<Page> links();
public Stream<Page> subtree();

public Stats evaluateStatistics();
public String toPrettyString();

A Possible Implementation


  • page "type" is implicit

  • legal combination of fields is unclear

  • clients must "divine" the type

  • disparate operations on same class
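
To make these problems concrete, here is a hypothetical client of such a single-class design. The talk does not show the class's internals, so the sentinel conventions (null for an absent error or content, -1 for an absent number) and the names SinglePage and kindOf are assumptions for illustration only.

```java
import java.net.URI;

// hypothetical single class: which fields are set depends on the implicit "type"
final class SinglePage {
	final URI url;
	final Exception error;  // assumed: null unless loading failed
	final String content;   // assumed: null if loading failed
	final int issueNumber;  // assumed: -1 unless this is an issue page
	final int prNumber;     // assumed: -1 unless this is a PR page

	SinglePage(URI url, Exception error, String content, int issueNumber, int prNumber) {
		this.url = url;
		this.error = error;
		this.content = content;
		this.issueNumber = issueNumber;
		this.prNumber = prNumber;
	}
}

class DivineTheType {

	// clients must reverse-engineer the page kind from field combinations
	static String kindOf(SinglePage page) {
		if (page.error != null)
			return "error";
		if (page.issueNumber >= 0)
			return "issue";
		if (page.prNumber >= 0)
			return "pr";
		return "external";
	}

	public static void main(String[] args) {
		URI url = URI.create("https://example.org/");
		// nothing stops this illegal combination: an error *and* an issue *and* a PR
		SinglePage nonsense = new SinglePage(url, new Exception(), "text", 1, 2);
		System.out.println(kindOf(nonsense));
	}
}
```

Whatever kindOf answers for the last page is arbitrary, because the type never ruled that state out.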

A Better Implementation

Data-oriented programming:

  • makes all data explicit and "obviously correct"

  • separates operations from data

  • models systems as production line

Applying DOP

Model data immutably and transparently.

We got a language construct that fits perfectly.

(Just in case: You can achieve this with regular classes, too.)

Detour: Records

[Finalized in Java 16 — JEP 395]

Transparent carriers for immutable data.

  • compiler understands internals

  • couples API to internals

  • reduces verbosity a lot

Detour: Records

record ExternalPage(URI url, String content) { }

  • ExternalPage is final

  • private final fields: URI url and String content

  • constructor: ExternalPage(URI url, String content)

  • accessors: URI url() and String content()

  • equals(), hashCode(), toString() that use the two fields

All method/constructor bodies can be customized.
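
A minimal sketch of what these generated members give you in practice. RecordDemo and the example URL are made up for illustration.

```java
import java.net.URI;

record ExternalPage(URI url, String content) { }

class RecordDemo {

	public static void main(String[] args) {
		URI url = URI.create("https://example.org/");
		ExternalPage first = new ExternalPage(url, "Hello");
		ExternalPage second = new ExternalPage(url, "Hello");

		// generated accessor
		System.out.println(first.url().getHost()); // example.org
		// generated equals() compares the two components
		System.out.println(first.equals(second)); // true
		// generated toString() lists the components
		System.out.println(first); // ExternalPage[url=https://example.org/, content=Hello]
	}
}
```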

Ensuring Immutability

Records are shallowly immutable,
but field types may not be.

⇝ Fix that during construction.

Ensuring Immutability

public record GitHubPrPage(..., Set<Page> links) {

	// compact constructor
	GitHubPrPage {
		links = Set.copyOf(links);
	}
}
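
A small sketch of the effect of that defensive copy. LinkedPage is a hypothetical, simplified record (the talk's GitHubPrPage has more components), and the URLs are made up.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// hypothetical simplified record
record LinkedPage(URI url, Set<URI> links) {

	// compact constructor: replace the argument with an unmodifiable copy
	LinkedPage {
		links = Set.copyOf(links);
	}
}

class ImmutabilityDemo {

	public static void main(String[] args) {
		Set<URI> mutable = new HashSet<>();
		mutable.add(URI.create("https://example.org/a"));
		LinkedPage page = new LinkedPage(URI.create("https://example.org/"), mutable);

		// later changes to the caller's set do not leak into the record
		mutable.add(URI.create("https://example.org/b"));
		System.out.println(page.links().size()); // 1

		// and the record's own set rejects modification
		try {
			page.links().add(URI.create("https://example.org/c"));
		} catch (UnsupportedOperationException ex) {
			System.out.println("links are unmodifiable");
		}
	}
}
```

Note that Set.copyOf also rejects null sets and null elements with a NullPointerException, which may or may not be what you want.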


Applying DOP

Model the data, the whole data,
and nothing but the data.

There are four kinds of pages:

  • error page

  • external page

  • GitHub issue page

  • GitHub PR page

⇝ Use four records to model them!

Modeling The Data

public record ErrorPage(
	URI url, Exception ex) { }

public record ExternalPage(
	URI url, String content) { }

public record GitHubIssuePage(
	URI url, String content,
	int issueNumber, Set<Page> links) { }

public record GitHubPrPage(
	URI url, String content,
	int prNumber, Set<Page> links) { }

Modeling The Data

There are additional relations between them:

  • a page (load) is either successful or not

  • a successful page is either external or GitHub

  • a GitHub page is either for a PR or an issue

⇝ Use sealed types to model the alternatives!

Detour: Sealed Types

[Finalized in Java 17 — JEP 409]

Sealed types limit inheritance,
by only allowing specific subtypes.

  • communicates intention to developers

  • allows compiler to check exhaustiveness

Detour: Sealed Types

public sealed interface Page
		permits ErrorPage, SuccessfulPage {
	// ...
}

Only ErrorPage and SuccessfulPage
can implement/extend Page.

interface MyPage extends Page doesn’t compile

Modeling Alternatives

public sealed interface Page
		permits ErrorPage, SuccessfulPage {
	URI url();
}

public sealed interface SuccessfulPage
		extends Page permits ExternalPage, GitHubPage {
	String content();
}

public sealed interface GitHubPage
		extends SuccessfulPage
		permits GitHubIssuePage, GitHubPrPage {
	Set<Page> links();
	default Stream<Page> subtree() { ... }
}
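
Put together, the records and sealed interfaces could look as follows. The implements clauses follow from the permits lists. The body of subtree() is elided on the slide, so the version here (the page itself, followed by everything reachable through its links) is an assumption, as are HierarchyDemo and the example URLs.

```java
import java.net.URI;
import java.util.Set;
import java.util.stream.Stream;

sealed interface Page permits ErrorPage, SuccessfulPage {
	URI url();
}

sealed interface SuccessfulPage extends Page permits ExternalPage, GitHubPage {
	String content();
}

sealed interface GitHubPage extends SuccessfulPage permits GitHubIssuePage, GitHubPrPage {
	Set<Page> links();

	// assumed implementation: this page, then the subtrees of all linked pages
	default Stream<Page> subtree() {
		return Stream.concat(
			Stream.of(this),
			links().stream().flatMap(link -> link instanceof GitHubPage ghLink
				? ghLink.subtree()
				: Stream.of(link)));
	}
}

record ErrorPage(URI url, Exception ex) implements Page { }

record ExternalPage(URI url, String content) implements SuccessfulPage { }

record GitHubIssuePage(URI url, String content, int issueNumber, Set<Page> links)
	implements GitHubPage { }

record GitHubPrPage(URI url, String content, int prNumber, Set<Page> links)
	implements GitHubPage { }

class HierarchyDemo {

	public static void main(String[] args) {
		// made-up pages: an issue linking to a PR that links to an external page
		Page external = new ExternalPage(URI.create("https://example.org/"), "docs");
		Page pr = new GitHubPrPage(
			URI.create("https://github.com/o/r/pull/2"), "fix", 2, Set.of(external));
		GitHubPage issue = new GitHubIssuePage(
			URI.create("https://github.com/o/r/issues/1"), "bug", 1, Set.of(pr));

		System.out.println(issue.subtree().count()); // 3
	}
}
```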

Applying DOP

Make illegal states unrepresentable.

Many are already, e.g.:

  • with error and with content

  • with issueNumber and prNumber

  • with issueNumber or prNumber but no links


⇝ Reject other illegal states in constructors.

record ExternalPage(URI url, String content) {

	ExternalPage {
		if (content.isBlank())
			throw new IllegalArgumentException();
	}
}


Where Are We?

  • page "type" is explicit in Java’s type system

  • only legal combinations of fields are possible

  • API is more self-documenting

  • code is easier to test

But where did the operations go?

Operations On Data

Separate operations from data.

⇝ Record methods should be limited to derived quantities.

public Stats evaluateStatistics();
public String toPrettyString();

This actually applies to our operations.

But what if it didn’t? 😁

Operations On Data

Pattern matching on sealed types is perfect
to apply polymorphic operations to data!

And records eschew encapsulation,
so everything is accessible.

Detour: Type Patterns

[Finalized in Java 16 — JEP 394]

Typecheck, cast, and declaration all in one.

if (rootPage instanceof GitHubPage ghPage)
	// do something with `ghPage`

  • checks rootPage instanceof GitHubPage

  • declares variable GitHubPage ghPage

Only where the check is passed, is ghPage in scope.

Detour: Flow Scoping

Only where the check is passed,
is ghPage in scope.

if (!(rootPage instanceof GitHubPage ghPage))
	// can't use `ghPage` here

// do something with `ghPage` here 😈
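
A runnable sketch of the same idea. It uses a reduced, hypothetical two-type hierarchy instead of the full one, and the names FlowScopingDemo and contentLength are made up.

```java
import java.net.URI;

// reduced hierarchy for illustration only
sealed interface Page permits ExternalPage, ErrorPage {
	URI url();
}

record ExternalPage(URI url, String content) implements Page { }

record ErrorPage(URI url, Exception ex) implements Page { }

class FlowScopingDemo {

	static int contentLength(Page page) {
		if (!(page instanceof ExternalPage ext))
			// `ext` is not in scope in this branch
			return 0;

		// the branch above always returns, so the check must have
		// passed to get here and `ext` is in scope
		return ext.content().length();
	}

	public static void main(String[] args) {
		URI url = URI.create("https://example.org/");
		System.out.println(contentLength(new ExternalPage(url, "abc"))); // 3
		System.out.println(contentLength(new ErrorPage(url, new RuntimeException()))); // 0
	}
}
```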

Detour: Patterns in Switch

[Finalized in Java 21 — JEP 441]

All patterns can be used in switches:

switch (page) {
	case GitHubPrPage pr -> // use `pr`
	case ExternalPage ext -> // use `ext`
	// ...
}

  • checks page against all listed types

  • executes matching branch with respective variable

Gathering Statistics

In class Statistician:

public static Stats evaluate(Page rootPage) {
	Statistician statistician = new Statistician();
	statistician.evaluateTree(rootPage);
	return statistician.result();
}

private void evaluateTree(Page page) {
	if (page instanceof GitHubPage ghPage)
		// [...]
}

Gathering Statistics

In class Statistician:

private void evaluatePage(Page page) {
	// `numberOf...` are fields
	switch (page) {
		case GitHubIssuePage issue -> numberOfIssues++;
		case GitHubPrPage pr -> numberOfPrs++;
		case ExternalPage ext -> numberOfExternals++;
		case ErrorPage err -> numberOfErrors++;
	}
}
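
The slides only show fragments of Statistician, so here is one way the pieces might fit together. Treat it as a sketch under assumptions: the shape of the Stats record, the body of subtree(), the way evaluateTree walks the subtree, and result() are not shown in the talk and are filled in here for illustration.

```java
import java.net.URI;
import java.util.Set;
import java.util.stream.Stream;

sealed interface Page permits ErrorPage, SuccessfulPage { URI url(); }
sealed interface SuccessfulPage extends Page permits ExternalPage, GitHubPage { String content(); }
sealed interface GitHubPage extends SuccessfulPage permits GitHubIssuePage, GitHubPrPage {
	Set<Page> links();

	// assumption: the subtree is the page itself plus everything reachable from it
	default Stream<Page> subtree() {
		return Stream.concat(
			Stream.of(this),
			links().stream().flatMap(link -> link instanceof GitHubPage ghLink
				? ghLink.subtree()
				: Stream.of(link)));
	}
}

record ErrorPage(URI url, Exception ex) implements Page { }
record ExternalPage(URI url, String content) implements SuccessfulPage { }
record GitHubIssuePage(URI url, String content, int issueNumber, Set<Page> links) implements GitHubPage { }
record GitHubPrPage(URI url, String content, int prNumber, Set<Page> links) implements GitHubPage { }

// hypothetical result type
record Stats(int issues, int prs, int externals, int errors) { }

class Statistician {

	private int numberOfIssues;
	private int numberOfPrs;
	private int numberOfExternals;
	private int numberOfErrors;

	public static Stats evaluate(Page rootPage) {
		Statistician statistician = new Statistician();
		statistician.evaluateTree(rootPage);
		return statistician.result();
	}

	// assumption: GitHub pages are evaluated together with their subtree
	private void evaluateTree(Page page) {
		if (page instanceof GitHubPage ghPage)
			ghPage.subtree().forEach(this::evaluatePage);
		else
			evaluatePage(page);
	}

	private void evaluatePage(Page page) {
		switch (page) {
			case GitHubIssuePage issue -> numberOfIssues++;
			case GitHubPrPage pr -> numberOfPrs++;
			case ExternalPage ext -> numberOfExternals++;
			case ErrorPage err -> numberOfErrors++;
		}
	}

	private Stats result() {
		return new Stats(numberOfIssues, numberOfPrs, numberOfExternals, numberOfErrors);
	}

	public static void main(String[] args) {
		URI url = URI.create("https://example.org/");
		Page external = new ExternalPage(url, "docs");
		Page issue = new GitHubIssuePage(url, "bug", 1, Set.of(external));
		System.out.println(evaluate(issue)); // Stats[issues=1, prs=0, externals=1, errors=0]
	}
}
```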

Creating A Pretty String

In class Pretty:

public static String toPrettyString(Page rootPage) {
	if (!(rootPage instanceof GitHubPage ghPage))
		return createPrettyString(rootPage);

	return ghPage
		// [...]
}

Creating A Pretty String

In class Pretty:

private static String createPrettyString(Page page) {
	return switch (page) {
		case GitHubIssuePage issue
			-> "🐈 ISSUE #" + issue.issueNumber();
		case GitHubPrPage pr
			-> "🐙 PR #" + pr.prNumber();
		case ExternalPage ext
			-> "💤 EXTERNAL: " + ext.url().getHost();
		case ErrorPage err
			-> "💥 ERROR: " + err.url().getHost();
	};
}
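
For reference, a self-contained sketch of the whole Pretty class. The slide cuts off after "return ghPage", so mapping every page of the subtree to its pretty string and joining the results with line breaks is an assumption, as is the body of subtree().

```java
import java.net.URI;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

sealed interface Page permits ErrorPage, SuccessfulPage { URI url(); }
sealed interface SuccessfulPage extends Page permits ExternalPage, GitHubPage { String content(); }
sealed interface GitHubPage extends SuccessfulPage permits GitHubIssuePage, GitHubPrPage {
	Set<Page> links();

	// assumption: the subtree is the page itself plus everything reachable from it
	default Stream<Page> subtree() {
		return Stream.concat(
			Stream.of(this),
			links().stream().flatMap(link -> link instanceof GitHubPage ghLink
				? ghLink.subtree()
				: Stream.of(link)));
	}
}

record ErrorPage(URI url, Exception ex) implements Page { }
record ExternalPage(URI url, String content) implements SuccessfulPage { }
record GitHubIssuePage(URI url, String content, int issueNumber, Set<Page> links) implements GitHubPage { }
record GitHubPrPage(URI url, String content, int prNumber, Set<Page> links) implements GitHubPage { }

class Pretty {

	public static String toPrettyString(Page rootPage) {
		if (!(rootPage instanceof GitHubPage ghPage))
			return createPrettyString(rootPage);

		// assumption: one line per page in the subtree
		return ghPage
			.subtree()
			.map(Pretty::createPrettyString)
			.collect(Collectors.joining("\n"));
	}

	private static String createPrettyString(Page page) {
		return switch (page) {
			case GitHubIssuePage issue -> "🐈 ISSUE #" + issue.issueNumber();
			case GitHubPrPage pr -> "🐙 PR #" + pr.prNumber();
			case ExternalPage ext -> "💤 EXTERNAL: " + ext.url().getHost();
			case ErrorPage err -> "💥 ERROR: " + err.url().getHost();
		};
	}

	public static void main(String[] args) {
		// made-up pages
		Page external = new ExternalPage(URI.create("https://example.org/docs"), "docs");
		Page issue = new GitHubIssuePage(
			URI.create("https://github.com/o/r/issues/1"), "bug", 1, Set.of(external));
		System.out.println(toPrettyString(issue));
	}
}
```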

⇝ Simpler access with record/deconstruction patterns.

Detour: Record Patterns

[Finalized in Java 21 — JEP 440]

Records are transparent, so you can
deconstruct them in if and switch:

record ExternalPage(URI url, String content) { }

// elsewhere
switch (page) {
	case ExternalPage(var url, var content)
		-> // use `url` and `content` here
	// ...
}

Deconstructing Data

Use deconstruction patterns:

public static String createPrettyString(Page page) {
	return switch (page) {
		case GitHubIssuePage(
				var url, var content,
				int issueNumber, var links)
			-> "🐈 ISSUE #" + issueNumber;
		case ErrorPage(var url, var ex)
			-> "💥 ERROR: " + url.getHost();
		// ...
	};
}

⇝ Even simpler access with unnamed patterns.

Detour: Unnamed Patterns

[Preview in Java 21 — JEP 443 / Finalized in 22 — JEP 456]

Replace variables you don’t need with _:

case GitHubIssuePage(_, _, int issueNumber, _)
	-> "🐈 ISSUE #" + issueNumber;
case ErrorPage(var url, _)
	-> "💥 ERROR: " + url.getHost();

Deconstructing Data

Use record and unnamed patterns for simple access:

private static String createPrettyString(Page page) {
	return switch (page) {
		case GitHubIssuePage(_, _, int issueNumber, _)
			-> "🐈 ISSUE #" + issueNumber;
		case GitHubPrPage(_, _, int prNumber, _)
			-> "🐙 PR #" + prNumber;
		case ExternalPage(var url, _)
			-> "💤 EXTERNAL: " + url.getHost();
		case ErrorPage(var url, _)
			-> "💥 ERROR: " + url.getHost();
	};
}

Operations On Data

Looks good?

"Isn’t switching over types icky?"

Yes, but why?

⇝ It fails unpredictably when new types are added.

Extending Operations On Data

This approach behaves much better:

  • let’s add GitHubCommitPage implements GitHubPage

  • follow the compile errors!

Follow the errors

Starting point:

record GitHubCommitPage(/*…*/) implements GitHubPage {

	// ...

}

Compile error because supertype is sealed.

⇝ Go to the sealed supertype.

Follow the errors

Next stop: the sealed supertype

⇝ Permit the new subtype!

public sealed interface GitHubPage
		extends SuccessfulPage
		permits GitHubIssuePage, GitHubPrPage,
				GitHubCommitPage {
	// [...]
}

Follow the errors

Next stop: all switches that are no longer exhaustive.

private static String createPrettyString(Page page) {
	return switch (page) {
		case GitHubIssuePage issue -> // ...
		case GitHubPrPage pr -> // ...
		case ExternalPage external -> // ...
		case ErrorPage error -> // ...
	};
}

"Exhaustive?" 🤔

Detour: Exhaustiveness

Unlike an if-else-if-chain,
a pattern switch needs to cover all cases!

Two ways to achieve this:

  • have a default branch

  • enumerate all subtypes

We want the compile error on new types!

(⇝ Avoid the default branch.)

Follow the errors

Next stop: all switches that are no longer exhaustive.

private static String createPrettyString(Page page) {
	return switch (page) {
		case GitHubIssuePage issue -> // ...
		case GitHubPrPage pr -> // ...
		case ExternalPage external -> // ...
		case ErrorPage error -> // ...
		// missing case: GitHubCommitPage
	};
}

Fix the errors

⇝ Handle the new subtype!

private static String createPrettyString(Page page) {
	return switch (page) {
		case GitHubIssuePage issue -> // ...
		case GitHubPrPage pr -> // ...
		case GitHubCommitPage commit -> // ...
		case ExternalPage external -> // ...
		case ErrorPage error -> // ...
	};
}

Operations On Data

To keep operations maintainable:

  • switch over sealed types

  • enumerate all possible types
    (even if you need to ignore some)

  • avoid default branch

⇝ Compile error when new type is added.

Avoiding Default

Sometimes you have "defaulty" behavior:

public static String createPageEmoji(Page page) {
	return switch (page) {
		case GitHubIssuePage issue -> "🐈";
		case GitHubPrPage pr -> "🐙";
		default -> "n.a.";
	};
}

But we need to avoid default!

Avoiding Default

Write explicit branches:

public static String createPageEmoji(Page page) {
	return switch (page) {
		case GitHubIssuePage issue -> "🐈";
		case GitHubPrPage pr -> "🐙";
		// duplication 😢
		case ErrorPage err -> "n.a.";
		case ExternalPage ext -> "n.a.";
	};
}

Avoiding Default

Use _ to combine "default branches":

public static String createPageEmoji(Page page) {
	return switch (page) {
		case GitHubIssuePage issue -> "🐈";
		case GitHubPrPage pr -> "🐙";
		case ErrorPage _, ExternalPage _ -> "n.a.";
	};
}

⇝ Default behavior without default branch.

Where Are We?

  • operations separate from data

  • adding new operations is easy

  • adding new data types is more work,
    but supported by the compiler

⇝ Like the visitor pattern, but less painful.

That was DOP!

Dynamic dispatch

Dynamic dispatch selects the invoked method by type.

As language feature:

  • via inheritance

  • makes method part of API

What if methods shouldn’t be part of the API?

Dynamic dispatch

Without methods becoming part of the API.

Via visitor pattern:

  • makes "visitation" part of API

  • cumbersome and indirect
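
For comparison, a classic visitor for a reduced, hypothetical two-type page hierarchy might look like the sketch below. The names PageVisitor, accept, visit, IssuePage, and PrPage are conventional or made up for illustration and are not taken from the talk.

```java
// the "visitation" API that becomes part of the page types
interface PageVisitor<R> {
	R visit(IssuePage page);
	R visit(PrPage page);
}

interface Page {
	<R> R accept(PageVisitor<R> visitor);
}

record IssuePage(int issueNumber) implements Page {
	@Override
	public <R> R accept(PageVisitor<R> visitor) {
		// second dispatch: overload chosen by the static type of `this`
		return visitor.visit(this);
	}
}

record PrPage(int prNumber) implements Page {
	@Override
	public <R> R accept(PageVisitor<R> visitor) {
		return visitor.visit(this);
	}
}

// one operation = one visitor implementation
class PrettyVisitor implements PageVisitor<String> {
	@Override
	public String visit(IssuePage page) {
		return "ISSUE #" + page.issueNumber();
	}

	@Override
	public String visit(PrPage page) {
		return "PR #" + page.prNumber();
	}
}

class VisitorDemo {
	public static void main(String[] args) {
		Page page = new PrPage(2);
		// first dispatch: accept() is selected by the runtime type of `page`
		System.out.println(page.accept(new PrettyVisitor())); // PR #2
	}
}
```

Every page type has to carry the accept boilerplate, and each operation bounces through two calls before any real work happens.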

Dynamic dispatch

Without methods becoming part of the API.

Via pattern matching (new):

  • makes "sealed" part of type

  • straightforward

Patterns and language

Design patterns make up for gaps in the language.

Good example is the strategy pattern:

  • used to be "a thing" in Java

  • you use it every time you pass a lambda

But do you still think of it as a design pattern?
(I don’t.)

Pattern matching does the same for the visitor pattern.
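
A small sketch of the strategy-pattern point: the Comparator parameter is the strategy, and the lambda is its implementation. StrategyDemo and sorted are made-up names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class StrategyDemo {

	// the "strategy" is the Comparator; callers choose the ordering
	static List<String> sorted(List<String> words, Comparator<String> strategy) {
		List<String> copy = new ArrayList<>(words);
		copy.sort(strategy);
		return copy;
	}

	public static void main(String[] args) {
		List<String> words = List.of("pear", "fig", "banana");

		// strategy passed as a lambda: order by length
		System.out.println(sorted(words, (a, b) -> Integer.compare(a.length(), b.length())));
		// prints [fig, pear, banana]

		// a different strategy: natural (alphabetical) order
		System.out.println(sorted(words, Comparator.naturalOrder()));
		// prints [banana, fig, pear]
	}
}
```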


Good use cases for DOP:

  • ad-hoc data structures

  • complex return types

  • complex domains

Ad-hoc Data Structures

Often local, throw-away types used in one class or package.

record PageWithLinks(Page page, Set<URI> links) {

	PageWithLinks {
		links = new HashSet<>(links);
	}

	public PageWithLinks(Page page) {
		this(page, Set.of());
	}

}


Complex Return Types

Return values that are deconstructed immediately:

// type declaration
sealed interface Match<T> { }

record None<T>() implements Match<T> { }
record Exact<T>(T entity) implements Match<T> { }
record Fuzzies<T>(Collection<Fuzzy<T>> entities)
	implements Match<T> { }

record Fuzzy<T>(T entity, int distance) { }

// method declaration
Match<User> findUser(String userName) { ... }

Complex Return Types

Return values that are deconstructed immediately:

// calling the method
switch (findUser("John Doe")) {
	case None<User> none -> // ...
	case Exact<User> exact -> // ...
	case Fuzzies<User> fuzzies -> // ...
}
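
A runnable sketch of this idea from declaration to call site. The Match types are the ones declared above; the User record, the user list, and the toy "same last name" fuzzy rule in findUser are hypothetical and only there to produce all three outcomes.

```java
import java.util.Collection;
import java.util.List;

sealed interface Match<T> permits None, Exact, Fuzzies { }

record None<T>() implements Match<T> { }
record Exact<T>(T entity) implements Match<T> { }
record Fuzzies<T>(Collection<Fuzzy<T>> entities) implements Match<T> { }

record Fuzzy<T>(T entity, int distance) { }

// hypothetical domain type
record User(String name) { }

class UserSearch {

	private static final List<User> USERS =
		List.of(new User("John Doe"), new User("Jane Doe"));

	static Match<User> findUser(String userName) {
		for (User user : USERS)
			if (user.name().equals(userName))
				return new Exact<>(user);

		// toy fuzzy rule (assumption): same last word, fixed distance of 1
		String lastName = userName.substring(userName.lastIndexOf(' ') + 1);
		List<Fuzzy<User>> fuzzies = USERS.stream()
			.filter(user -> user.name().endsWith(" " + lastName))
			.map(user -> new Fuzzy<User>(user, 1))
			.toList();
		if (fuzzies.isEmpty())
			return new None<>();
		return new Fuzzies<>(fuzzies);
	}

	// the result is deconstructed right where it is received
	static String describe(Match<User> match) {
		return switch (match) {
			case None<User> none -> "no match";
			case Exact<User>(var user) -> "exact: " + user.name();
			case Fuzzies<User>(var candidates) -> candidates.size() + " fuzzy matches";
		};
	}

	public static void main(String[] args) {
		System.out.println(describe(findUser("John Doe"))); // exact: John Doe
		System.out.println(describe(findUser("Jim Doe")));  // 2 fuzzy matches
		System.out.println(describe(findUser("Nobody")));   // no match
	}
}
```

Because Match is sealed, the switch needs no default branch, and a fourth kind of match would surface as a compile error in describe.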

Complex Domains

Long-living objects that are part
of the program’s domain.

For example Page.

Algebraic Data Types

  • records are product types

  • sealed types are sum types

This simple combination of mechanisms — aggregation and choice — is deceptively powerful

Functional Programming?!

  • immutable data structures

  • methods (functions?) that operate on them

Isn’t this just functional programming?!

Kind of.


Functional programming:

Everything is a function

⇝ Focus on creating and composing functions.

Data-oriented programming:

Model data as data.

⇝ Focus on correctly modeling the data.


OOP is not dead (again):

  • valuable for complex entities or rich libraries

  • use whenever encapsulation is needed

  • still a good default on high level

Consider DOP when:

  • mainly handling outside data

  • working with simple or ad-hoc data

  • data and behavior should be separated

Data-Oriented Programming

Use Java’s strong typing to model data as data:

  • use classes to represent data, particularly:

    • data as data with records

    • alternatives with sealed classes

  • use methods (separately) to model behavior, particularly:

    • exhaustive switch without default

    • pattern matching to destructure polymorphic data


So long…

