What is DOP? |
A Lengthy Example |
That was DOP! |
Slides at slides.nipafx.dev.
What is DOP?
The goal of any programming paradigm is to manage complexity.
complexity comes in many forms
not all paradigms handle all forms equally well
⇝ "It depends"
Everything is an object
combines state and behavior
hinges on encapsulation
polymorphism through inheritance
Works best when defining/upholding boundaries.
Great use cases for OOP:
boundaries between libraries and clients
in large programs to enable modular reasoning
Consider a data-oriented approach for:
smaller (sub)systems
focused on data
Brian Goetz formulated the principles of data-oriented programming in June 2022:
Data Oriented Programming in Java
I offered an updated version in May 2024:
Data Oriented Programming in Java - Version 1.1
Version 1.1:
Model data immutably and transparently.
Model the data, the whole data,
and nothing but the data.
Make illegal states unrepresentable.
Separate operations from data.
A Lengthy Example
Starting with a seed URL:
1. connect to URL
2. identify kind of page
3. identify interesting section
4. identify outgoing links
5. for each link, start at 1.
(Code on github.com/nipafx/modern-java-demo.)
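The five steps can be sketched in a few lines. This is not the code from the repository, only a minimal sketch under loud assumptions: `Node`, `CrawlSketch`, the depth limit, the `visited` set, and the in-memory `web` map (standing in for real HTTP connections) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a loaded page: its URL and the pages it links to.
record Node(String url, List<Node> children) { }

class CrawlSketch {

    // `web` is a hypothetical stand-in for the network: URL -> outgoing links.
    // A URL that is missing from the map is treated as a page without links.
    static Node crawl(String url, Map<String, List<String>> web, int depth, Set<String> visited) {
        visited.add(url);
        // steps 1 to 4: "connect" and look up the outgoing links
        List<String> links = web.getOrDefault(url, List.of());
        List<Node> children = new ArrayList<>();
        if (depth > 0) {
            for (String link : links) {
                // step 5: for each link not seen before, start at step 1
                if (!visited.contains(link))
                    children.add(crawl(link, web, depth - 1, visited));
            }
        }
        return new Node(url, List.copyOf(children));
    }

    // Counts the nodes in the tree below (and including) `node`.
    static int size(Node node) {
        int size = 1;
        for (Node child : node.children())
            size += size(child);
        return size;
    }
}
```

The `visited` set keeps cyclic links from causing endless recursion; whether the real implementation does the same is not stated on the slides.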
That logic is implemented in:
public class PageTreeFactory {
public static Page loadPageTree(/*...*/) {
// [...]
}
}
What does Page look like?
Pages:
all pages have a url
unresolved pages have an error
resolved pages have content
GitHub pages have:
outgoing links
issueNumber or prNumber
Possible operations on pages
(and their subtree):
display as interactive graph
compute graph properties
categorize pages by topic
analyze mood of interactions
process payment for analyses
Example operations going forward:
evaluate statistics
create pretty string
A single Page class with this API:
public URL url();
public Exception error();
public String content();
public int issueNumber();
public int prNumber();
public Set<Page> links();
public Stream<Page> subtree();
public Stats evaluateStatistics();
public String toPrettyString();
Problems:
page "type" is implicit
legal combination of fields is unclear
clients must "divine" the type
disparate operations on same class
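To make these problems concrete, here is a condensed sketch of such a class and a client. The accessor names follow the API above; the constructor, the `null` and `-1` conventions, and `PageClient` are assumptions made for illustration.

```java
import java.net.URI;

// Hypothetical condensed version of the single-class design.
// The null and -1 conventions are assumptions made for this sketch.
class Page {
    private final URI url;
    private final Exception error;  // null unless loading failed
    private final String content;   // null if loading failed
    private final int issueNumber;  // -1 unless this is an issue page
    private final int prNumber;     // -1 unless this is a PR page

    Page(URI url, Exception error, String content, int issueNumber, int prNumber) {
        this.url = url;
        this.error = error;
        this.content = content;
        this.issueNumber = issueNumber;
        this.prNumber = prNumber;
    }

    URI url() { return url; }
    Exception error() { return error; }
    String content() { return content; }
    int issueNumber() { return issueNumber; }
    int prNumber() { return prNumber; }
}

class PageClient {
    // The client has to "divine" the page type from field values.
    static String kindOf(Page page) {
        if (page.error() != null) return "error";
        if (page.issueNumber() >= 0) return "issue";
        if (page.prNumber() >= 0) return "pr";
        return "external";
    }
}
```

Nothing stops a caller from creating `new Page(url, new Exception(), "text", 1, 2)`: an error page with content that is both an issue and a PR.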
Data-oriented programming:
makes all data explicit and "obviously correct"
separates operations from data
models systems as production line
Model data immutably and transparently.
We got a language construct that fits perfectly.
(Just in case: You can achieve this with regular classes, too.)
[Finalized in Java 16 — JEP 395]
Transparent carriers for immutable data.
compiler understands internals
couples API to internals
reduces verbosity a lot
record ExternalPage(URI url, String content) { }
ExternalPage is final
private final fields: URI url and String content
constructor: ExternalPage(URI url, String content)
accessors: URI url() and String content()
equals(), hashCode(), toString() that use the two fields
All method/constructor bodies can be customized.
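A small check of the generated members, assuming nothing beyond the record above (`RecordDemo` is a hypothetical helper class):

```java
import java.net.URI;

record ExternalPage(URI url, String content) { }

class RecordDemo {
    // Two distinct instances with equal components are equal
    // and have the same hash code; accessors return the components.
    static boolean generatedMembersWork() {
        URI url = URI.create("https://example.org");
        ExternalPage first = new ExternalPage(url, "hello");
        ExternalPage second = new ExternalPage(url, "hello");
        return first != second
                && first.equals(second)
                && first.hashCode() == second.hashCode()
                && first.content().equals("hello");
    }
}
```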
Records are shallowly immutable,
but field types may not be.
⇝ Fix that during construction.
public record GitHubPrPage(..., Set<Page> links) {
// compact constructor
GitHubPrPage {
links = Set.copyOf(links);
}
}
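What the copy buys you, as a minimal sketch: `Links` is a hypothetical, stripped-down record that keeps only a set of link strings, and `DefensiveCopyDemo` is a hypothetical helper.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical simplified record: only the links, to keep the example short.
record Links(Set<String> links) {
    // compact constructor: replace the argument with an unmodifiable copy
    Links {
        links = Set.copyOf(links);
    }
}

class DefensiveCopyDemo {
    // Later changes to the caller's set do not leak into the record.
    static boolean isIsolatedFromCaller() {
        Set<String> mutable = new HashSet<>(Set.of("a"));
        Links page = new Links(mutable);
        mutable.add("b");
        return page.links().size() == 1;
    }

    // The set returned by the accessor cannot be modified.
    static boolean rejectsMutation() {
        try {
            new Links(Set.of("a")).links().add("b");
            return false;
        } catch (UnsupportedOperationException ex) {
            return true;
        }
    }
}
```

Note that `Set.copyOf` also throws a `NullPointerException` for a `null` set or `null` elements.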
Model the data, the whole data,
and nothing but the data.
There are four kinds of pages:
error page
external page
GitHub issue page
GitHub PR page
⇝ Use four records to model them!
public record ErrorPage(
URI url, Exception ex) { }
public record ExternalPage(
URI url, String content) { }
public record GitHubIssuePage(
URI url, String content,
int issueNumber, Set<Page> links) { }
public record GitHubPrPage(
URI url, String content,
int prNumber, Set<Page> links) { }
There are additional relations between them:
a page (load) is either successful or not
a successful page is either external or GitHub
a GitHub page is either for a PR or an issue
⇝ Use sealed types to model the alternatives!
[Finalized in Java 17 — JEP 409]
Sealed types limit inheritance,
by only allowing specific subtypes.
communicates intention to developers
allows compiler to check exhaustiveness
public sealed interface Page
permits ErrorPage, SuccessfulPage {
// ...
}
Only ErrorPage and SuccessfulPage can implement/extend Page.
⇝ interface MyPage extends Page doesn’t compile
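The permitted subtypes are also visible at run time via `Class::getPermittedSubclasses`. A minimal sketch with a shortened, hypothetical hierarchy (`String` URLs, only one successful page type) and a hypothetical `SealedDemo` helper:

```java
import java.util.Arrays;
import java.util.List;

sealed interface Page permits ErrorPage, SuccessfulPage { }
record ErrorPage(String url) implements Page { }
sealed interface SuccessfulPage extends Page permits ExternalPage { }
record ExternalPage(String url) implements SuccessfulPage { }

// interface MyPage extends Page { }
// ^ compile error: MyPage is not listed in Page's permits clause

class SealedDemo {
    // Simple names of the direct subtypes that Page permits
    // (the order of the returned elements is unspecified).
    static List<String> permittedSubtypesOfPage() {
        return Arrays.stream(Page.class.getPermittedSubclasses())
                .map(Class::getSimpleName)
                .toList();
    }
}
```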
public sealed interface Page
permits ErrorPage, SuccessfulPage {
URI url();
}
public sealed interface SuccessfulPage
extends Page permits ExternalPage, GitHubPage {
String content();
}
public sealed interface GitHubPage
extends SuccessfulPage
permits GitHubIssuePage, GitHubPrPage {
Set<Page> links();
default Stream<Page> subtree() { ... }
}
Make illegal states unrepresentable.
Many are already, e.g.:
with error and with content
with issueNumber and prNumber
with issueNumber or prNumber but no links
⇝ Reject other illegal states in constructors.
record ExternalPage(URI url, String content) {
ExternalPage {
Objects.requireNonNull(url);
Objects.requireNonNull(content);
if (content.isBlank())
throw new IllegalArgumentException();
}
}
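A quick way to see the constructor at work. `ValidationDemo` is a hypothetical helper; the record is the one from above.

```java
import java.net.URI;
import java.util.Objects;

record ExternalPage(URI url, String content) {
    ExternalPage {
        Objects.requireNonNull(url);
        Objects.requireNonNull(content);
        if (content.isBlank())
            throw new IllegalArgumentException();
    }
}

class ValidationDemo {
    // Returns true if the constructor refuses the given arguments.
    static boolean rejects(URI url, String content) {
        try {
            new ExternalPage(url, content);
            return false;
        } catch (NullPointerException | IllegalArgumentException ex) {
            return true;
        }
    }
}
```

Because every instance passes through this constructor, code that receives an ExternalPage never has to repeat these checks.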
page "type" is explicit as a Java type
only legal combinations of fields are possible
API is more self-documenting
code is easier to test
But where did the operations go?
Separate operations from data.
⇝ Record methods should be limited to derived quantities.
public Stats evaluateStatistics();
public String toPrettyString();
This actually applies to our operations.
But what if it didn’t? 😁
Pattern matching on sealed types is perfect
to apply polymorphic operations to data!
And records eschew encapsulation,
so everything is accessible.
[Finalized in Java 16 — JEP 394]
Typecheck, cast, and declaration all in one.
if (rootPage instanceof GitHubPage ghPage)
// do something with `ghPage`
checks rootPage instanceof GitHubPage
declares variable GitHubPage ghPage
Only where the check has passed is ghPage in scope (flow scoping).
if (!(rootPage instanceof GitHubPage ghPage))
// can't use `ghPage` here
return;
// do something with `ghPage` here 😈
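Flow scoping works with any type. A minimal sketch with `Object` and `String`, independent of the page types (`FlowScopingDemo` is a hypothetical class):

```java
class FlowScopingDemo {

    static String describe(Object value) {
        if (!(value instanceof String text))
            // `text` is not in scope here
            return "not a string";
        // `text` is in scope here because the method returned otherwise
        return "string of length " + text.length();
    }

    static boolean isLongString(Object value) {
        // `text` is in scope to the right of && because the check passed
        return value instanceof String text && text.length() > 5;
    }
}
```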
[Finalized in Java 21 — JEP 441]
All patterns can be used in switches
switch (page) {
case GitHubPrPage pr -> // use `pr`
case ExternalPage ext -> // use `ext`
// ...
};
checks page against all listed types
executes matching branch with respective variable
In class Statistician:
public static Stats evaluate(Page rootPage) {
Statistician statistician = new Statistician();
statistician.evaluateTree(rootPage);
return statistician.result();
}
private void evaluateTree(Page page) {
if (page instanceof GitHubPage ghPage)
ghPage.subtree().forEach(this::evaluatePage);
else
evaluatePage(page);
}
In class Statistician:
private void evaluatePage(Page page) {
// `numberOf...` are fields
switch (page) {
case GitHubIssuePage issue -> numberOfIssues++;
case GitHubPrPage pr -> numberOfPrs++;
case ExternalPage ext -> numberOfExternals++;
case ErrorPage err -> numberOfErrors++;
}
}
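Put together as a self-contained sketch (needs Java 21). Assumptions: the hierarchy is flattened to a single sealed interface, URLs are plain strings, pages arrive as a list instead of a tree, and the shape of `Stats` is made up.

```java
import java.util.List;

sealed interface Page
        permits ErrorPage, ExternalPage, GitHubIssuePage, GitHubPrPage { }
record ErrorPage(String url) implements Page { }
record ExternalPage(String url) implements Page { }
record GitHubIssuePage(String url, int issueNumber) implements Page { }
record GitHubPrPage(String url, int prNumber) implements Page { }

// Hypothetical result type; the slides do not show what Stats looks like.
record Stats(int issues, int prs, int externals, int errors) { }

class Statistician {
    private int numberOfIssues;
    private int numberOfPrs;
    private int numberOfExternals;
    private int numberOfErrors;

    static Stats evaluate(List<Page> pages) {
        Statistician statistician = new Statistician();
        pages.forEach(statistician::evaluatePage);
        return statistician.result();
    }

    private void evaluatePage(Page page) {
        // no default branch: the compiler checks that all page types are covered
        switch (page) {
            case GitHubIssuePage issue -> numberOfIssues++;
            case GitHubPrPage pr -> numberOfPrs++;
            case ExternalPage ext -> numberOfExternals++;
            case ErrorPage err -> numberOfErrors++;
        }
    }

    private Stats result() {
        return new Stats(numberOfIssues, numberOfPrs, numberOfExternals, numberOfErrors);
    }
}
```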
In class Pretty:
public static String toPrettyString(Page rootPage) {
if (!(rootPage instanceof GitHubPage ghPage))
return createPrettyString(rootPage);
return ghPage
.subtree()
.map(Pretty::createPrettyString)
.collect(joining("\n"));
}
In class Pretty:
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage issue
-> "🐈 ISSUE #" + issue.issueNumber();
case GitHubPrPage pr
-> "🐙 PR #" + pr.prNumber();
case ExternalPage ext
-> "💤 EXTERNAL: " + ext.url().getHost();
case ErrorPage err
-> "💥 ERROR: " + err.url().getHost();
};
}
⇝ Simpler access with record/deconstruction patterns.
[Finalized in Java 21 — JEP 440]
Records are transparent, so you can deconstruct them in if and switch:
record ExternalPage(URI url, String content) { }
// elsewhere
switch (page) {
case ExternalPage(var url, var content)
-> // use `url` and `content` here
}
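A runnable variant of both forms (needs Java 21). The two-type hierarchy is a shortened assumption; `RecordPatternDemo`, `summarize`, and `hostIfExternal` are hypothetical names.

```java
import java.net.URI;

sealed interface Page permits ErrorPage, ExternalPage { }
record ErrorPage(URI url, Exception ex) implements Page { }
record ExternalPage(URI url, String content) implements Page { }

class RecordPatternDemo {

    // deconstruction in a switch: components are bound to local variables
    static String summarize(Page page) {
        return switch (page) {
            case ExternalPage(var url, var content)
                    -> url.getHost() + ": " + content.length() + " chars";
            case ErrorPage(var url, var ex)
                    -> url.getHost() + ": " + ex.getMessage();
        };
    }

    // deconstruction in an if: type check and component access in one step
    static String hostIfExternal(Object object) {
        if (object instanceof ExternalPage(URI url, String content))
            return url.getHost();
        return "n.a.";
    }
}
```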
Use deconstruction patterns:
public static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage(
var url, var content,
int issueNumber, var links)
-> "🐈 ISSUE #" + issueNumber;
case ErrorPage(var url, var ex)
-> "💥 ERROR: " + url.getHost();
// ...
};
}
⇝ Even simpler access with unnamed patterns.
Use record and unnamed patterns for simple access:
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage(_, _, int issueNumber, _)
-> "🐈 ISSUE #" + issueNumber;
case GitHubPrPage(_, _, int prNumber, _)
-> "🐙 PR #" + prNumber;
case ExternalPage(var url, _)
-> "💤 EXTERNAL: " + url.getHost();
case ErrorPage(var url, _)
-> "💥 ERROR: " + url.getHost();
};
}
Looks good?
"Isn’t switching over types icky?"
Yes, but why?
⇝ It fails unpredictably when new types are added.
This approach behaves much better:
let’s add GitHubCommitPage implements GitHubPage
follow the compile errors!
Starting point:
record GitHubCommitPage(/*…*/) implements GitHubPage {
// ...
}
Compile error because supertype is sealed.
⇝ Go to the sealed supertype.
Next stop: the sealed supertype
⇝ Permit the new subtype!
public sealed interface GitHubPage
extends SuccessfulPage
permits GitHubIssuePage, GitHubPrPage,
GitHubCommitPage {
// [...]
}
Next stop: all switches that are no longer exhaustive.
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage issue -> // ...
case GitHubPrPage pr -> // ...
case ExternalPage external -> // ...
case ErrorPage error -> // ...
};
}
"Exhaustive?" 🤔
Unlike an if-else-if chain, a pattern switch needs to cover all cases!
Two ways to achieve this:
have a default branch
enumerate all subtypes
We want the compile error on new types!
(⇝ Avoid the default branch.)
Next stop: all switches that are no longer exhaustive.
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage issue -> // ...
case GitHubPrPage pr -> // ...
case ExternalPage external -> // ...
case ErrorPage error -> // ...
// missing case: GitHubCommitPage
};
}
⇝ Handle the new subtype!
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage issue -> // ...
case GitHubPrPage pr -> // ...
case GitHubCommitPage commit -> // ...
case ExternalPage external -> // ...
case ErrorPage error -> // ...
};
}
To keep operations maintainable:
switch over sealed types
enumerate all possible types
(even if you need to ignore some)
avoid default branch
⇝ Compile error when new type is added.
Sometimes you have "defaulty" behavior:
public static String createPageEmoji(Page page) {
return switch (page) {
case GitHubIssuePage issue -> "🐈";
case GitHubPrPage pr -> "🐙";
default -> "n.a.";
};
}
But we need to avoid default!
Write explicit branches:
public static String createPageEmoji(Page page) {
return switch (page) {
case GitHubIssuePage issue -> "🐈";
case GitHubPrPage pr -> "🐙";
// duplication 😢
case ErrorPage err -> "n.a.";
case ExternalPage ext -> "n.a.";
};
}
Use _ to combine "default branches":
public static String createPageEmoji(Page page) {
return switch (page) {
case GitHubIssuePage issue -> "🐈";
case GitHubPrPage pr -> "🐙";
case ErrorPage _, ExternalPage _ -> "n.a.";
};
}
⇝ Default behavior without default branch.
operations separate from data
adding new operations is easy
adding new data types is more work,
but supported by the compiler
⇝ Like the visitor pattern, but less painful.
That was DOP!
Dynamic dispatch selects the invoked method by type.
As language feature:
via inheritance
makes method part of API
What if methods shouldn’t be part of the API?
Without methods becoming part of the API.
Via visitor pattern:
makes "visitation" part of API
cumbersome and indirect
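For comparison, a minimal visitor sketch. All names (`PageVisitor`, `accept`, `PrettyVisitor`) are hypothetical and the page types are reduced to two records with a single component each.

```java
// The visitor interface needs one method per page type ...
interface PageVisitor<R> {
    R visitIssue(GitHubIssuePage page);
    R visitPr(GitHubPrPage page);
}

// ... and every page type must expose "visitation" in its API.
interface Page {
    <R> R accept(PageVisitor<R> visitor);
}

record GitHubIssuePage(int issueNumber) implements Page {
    @Override
    public <R> R accept(PageVisitor<R> visitor) {
        return visitor.visitIssue(this);
    }
}

record GitHubPrPage(int prNumber) implements Page {
    @Override
    public <R> R accept(PageVisitor<R> visitor) {
        return visitor.visitPr(this);
    }
}

// The operation itself lives in a visitor implementation.
class PrettyVisitor implements PageVisitor<String> {
    @Override
    public String visitIssue(GitHubIssuePage page) {
        return "ISSUE #" + page.issueNumber();
    }

    @Override
    public String visitPr(GitHubPrPage page) {
        return "PR #" + page.prNumber();
    }
}
```

A call looks like `page.accept(new PrettyVisitor())`: two dispatches and three participating types for what a pattern switch expresses in one place.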
Without methods becoming part of the API.
Via pattern matching (new):
makes "sealed" part of type
straightforward
Design patterns make up for gaps in the language.
Good example is the strategy pattern:
used to be "a thing" in Java
you use it every time you pass a lambda
But do you still think of it as a design pattern?
(I don’t.)
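A minimal illustration of the strategy pattern hiding in plain sight: the `Comparator` is the strategy and the lambda or method reference is its implementation. `StrategyDemo` and `sorted` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class StrategyDemo {
    // The sorting strategy is passed in instead of being hard-coded.
    static List<String> sorted(List<String> words, Comparator<String> strategy) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(strategy);
        return copy;
    }
}
```

Usage: `StrategyDemo.sorted(words, Comparator.comparingInt(String::length))` sorts by length, `StrategyDemo.sorted(words, (left, right) -> right.compareTo(left))` sorts in reverse alphabetical order.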
Pattern matching does the same for the visitor pattern.
ad-hoc data structures
complex return types
complex domains
Often local, throw-away types used in one class or package.
record PageWithLinks(Page page, Set<URI> links) {
PageWithLinks {
requireNonNull(page);
requireNonNull(links);
links = new HashSet<>(links);
}
public PageWithLinks(Page page) {
this(page, Set.of());
}
}
Return values that are deconstructed immediately:
// type declaration
sealed interface Match<T> { }
record None<T>() implements Match<T> { }
record Exact<T>(T entity) implements Match<T> { }
record Fuzzies<T>(Collection<Fuzzy<T>> entities)
implements Match<T> { }
record Fuzzy<T>(T entity, int distance) { }
// method declaration
Match<User> findUser(String userName) { ... }
Return values that are deconstructed immediately:
// calling the method
switch (findUser("John Doe")) {
case None<User> none -> // ...
case Exact<User> exact -> // ...
case Fuzzies<User> fuzzies -> // ...
}
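A complete, runnable version of this idea (needs Java 21). The type declarations are the ones from above; `User`, the user list, the case-insensitive "fuzzy" rule, and `describe` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

sealed interface Match<T> { }
record None<T>() implements Match<T> { }
record Exact<T>(T entity) implements Match<T> { }
record Fuzzies<T>(Collection<Fuzzy<T>> entities) implements Match<T> { }
record Fuzzy<T>(T entity, int distance) { }

// Hypothetical entity type.
record User(String name) { }

class UserSearch {
    // Hypothetical user base.
    private static final List<User> USERS =
            List.of(new User("John Doe"), new User("Jane Doe"));

    // Assumption: an exact name is an exact match; a name that only
    // differs in case counts as a fuzzy match with distance 1.
    static Match<User> findUser(String userName) {
        List<Fuzzy<User>> fuzzies = new ArrayList<>();
        for (User user : USERS) {
            if (user.name().equals(userName))
                return new Exact<>(user);
            if (user.name().equalsIgnoreCase(userName))
                fuzzies.add(new Fuzzy<>(user, 1));
        }
        if (fuzzies.isEmpty())
            return new None<>();
        return new Fuzzies<>(fuzzies);
    }

    // The return value is deconstructed right at the call site.
    static String describe(String userName) {
        return switch (findUser(userName)) {
            case None<User> none -> "no match";
            case Exact<User>(var user) -> "found " + user.name();
            case Fuzzies<User>(var candidates) -> candidates.size() + " similar";
        };
    }
}
```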
Long-living objects that are part
of the program’s domain.
For example Page.
records are product types
sealed types are sum types
This simple combination of mechanisms — aggregation and choice — is deceptively powerful
immutable data structures
methods (functions?) that operate on them
Isn’t this just functional programming?!
Kind of.
Functional programming:
Everything is a function
⇝ Focus on creating and composing functions.
Data-oriented programming:
Model data as data.
⇝ Focus on correctly modeling the data.
OOP is not dead (again):
valuable for complex entities or rich libraries
use whenever encapsulation is needed
still a good default on high level
DOP — consider when:
mainly handling outside data
working with simple or ad-hoc data
data and behavior should be separated
Use Java’s strong typing to model data as data:
use classes to represent data, particularly:
data as data with records
alternatives with sealed classes
use methods (separately) to model behavior, particularly:
exhaustive switch without default
pattern matching to destructure polymorphic data