What is DOP? |
A Lengthy Example |
That was DOP! |
Slides at slides.nipafx.dev.
(This is all just snitched.)
Seminal article by Brian Goetz on InfoQ:
Data Oriented Programming in Java
What is DOP?
Paradigms often come with an
"Everything is a …" sentence.
The goal of any programming paradigm is to manage complexity.
complexity comes in many forms
not all paradigms handle all forms equally well
⇝ "It depends"
Object-oriented programming:
Everything is an object
combines state and behavior
works best when defining/upholding boundaries
Great use cases:
boundaries between libraries and clients
in large programs to enable modular reasoning
Smaller programs/subsystems have less need for boundaries.
Data-oriented programming:
Use Java’s strong typing to model data as data:
use classes to represent data, particularly:
data as data with records
alternatives with sealed classes
use methods (separately) to model behavior, particularly:
exhaustive switch
without default
pattern matching to destructure polymorphic data
model the data, the whole data,
and nothing but the data
data is immutable
validate at the boundary
make illegal states unrepresentable
A Lengthy Example
Starting with a seed URL:
1. connect to URL
2. identify kind of page
3. identify interesting section
4. identify outgoing links
5. for each link, start at 1.
That logic is implemented in:
public class PageTreeFactory {
    public static Page loadPageTree(/*...*/) {
        // [...]
    }
}
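The crawl steps above can be sketched as a small recursion. This is only an illustration under loud assumptions, not the talk's `loadPageTree`: the method name `crawl`, the `depth` limit, the in-memory `FAKE_WEB` map that stands in for a real connection, and the placeholder `Page` record with a `kind` string are all hypothetical. Identifying the interesting section (step 3) is left out.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CrawlSketch {

    // Hypothetical placeholder; the real Page model is developed later in the talk.
    record Page(URI url, String kind, Set<Page> links) { }

    // Hypothetical stand-in for the network: URL -> outgoing links.
    static final Map<URI, List<URI>> FAKE_WEB = Map.of(
            URI.create("https://github.com/a"),
            List.of(URI.create("https://github.com/b"), URI.create("https://example.org/x")),
            URI.create("https://github.com/b"), List.of(),
            URI.create("https://example.org/x"), List.of());

    static Page crawl(URI url, int depth) {
        // step 1: "connect" to the URL (here: look it up in the fake web)
        List<URI> outgoing = FAKE_WEB.get(url);
        if (outgoing == null)
            return new Page(url, "error", Set.of());
        // step 2: identify the kind of page
        String kind = "github.com".equals(url.getHost()) ? "github" : "external";
        if (depth == 0 || kind.equals("external"))
            return new Page(url, kind, Set.of());
        // steps 4 and 5: follow each outgoing link and start over
        Set<Page> links = new LinkedHashSet<>();
        for (URI link : outgoing)
            links.add(crawl(link, depth - 1));
        return new Page(url, kind, Set.copyOf(links));
    }

    public static void main(String[] args) {
        Page root = crawl(URI.create("https://github.com/a"), 2);
        System.out.println(root.kind());          // github
        System.out.println(root.links().size());  // 2
    }
}
```

The depth limit is one simple way to keep the recursion from running forever on cyclic links; the talk does not say how the real factory handles that.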
What does Page look like?
Page requirements:
all pages have a url
pages that couldn’t be resolved have an error
pages that could be resolved have content
GitHub pages have:
outgoing links
issueNumber or prNumber
Operations on pages and their subtree:
pretty print
collect statistics
A single Page class with this API:
public URL url();
public Exception error();
public String content();
public int issueNumber();
public int prNumber();
public Set<Page> links();
public Stream<Page> subtree();
public Stats evaluateStatistics();
public String printPageList();
Problems:
page "type" is implicit
legal combination of fields is unclear
clients must "divine" the type
disparate operations on same class
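To make these problems concrete, here is a minimal sketch of client code against such a catch-all class. The class below is a hypothetical stand-in (plain fields instead of the accessors listed above), and the convention that unused fields are `null` or `-1` is an assumption made for illustration.

```java
import java.net.URI;

class SinglePageDemo {

    // Hypothetical stand-in for the single-class design.
    // Assumption: fields that do not apply are null or -1.
    static final class Page {
        final URI url;
        final Exception error;
        final String content;
        final int issueNumber;
        final int prNumber;

        Page(URI url, Exception error, String content, int issueNumber, int prNumber) {
            this.url = url;
            this.error = error;
            this.content = content;
            this.issueNumber = issueNumber;
            this.prNumber = prNumber;
        }
    }

    // Clients have to "divine" the kind of page from which fields are set.
    static String kindOf(Page page) {
        if (page.error != null) return "error";
        if (page.issueNumber >= 0) return "issue";
        if (page.prNumber >= 0) return "pr";
        return "external";
    }

    public static void main(String[] args) {
        Page issue = new Page(URI.create("https://example.org/issues/1"), null, "text", 1, -1);
        // Nothing prevents this contradictory combination of fields:
        Page nonsense = new Page(URI.create("https://example.org"), new Exception("boom"), "text", 1, 2);
        System.out.println(kindOf(issue));     // issue
        System.out.println(kindOf(nonsense));  // error
    }
}
```

The order of the checks in `kindOf` silently decides what the contradictory page "is", which is exactly the kind of guesswork an explicit type would remove.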
Model the data, the whole data, and nothing but the data.
There are four kinds of pages:
error page
external page
GitHub issue page
GitHub PR page
⇝ Use four records to model them!
public record ErrorPage(
    URI url, Exception ex) { }
public record ExternalPage(
    URI url, String content) { }
public record GitHubIssuePage(
    URI url, int issueNumber,
    String content, Set<Page> links) { }
public record GitHubPrPage(
    URI url, int prNumber,
    String content, Set<Page> links) { }
Records are transparent data carriers.
record ExternalPage(URI url, String content) { } generates:
private final fields: URI url and String content
constructor: ExternalPage(URI url, String content)
accessors: URI url() and String content()
equals(), hashCode(), toString() that use the two fields
ExternalPage is final
All method/constructor bodies can be customized!
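A small sketch of what those generated members look like in use. The wrapper class `RecordDemo` and the example URL are only there to make the snippet runnable.

```java
import java.net.URI;

class RecordDemo {

    record ExternalPage(URI url, String content) { }

    public static void main(String[] args) {
        URI uri = URI.create("https://example.org/a");
        ExternalPage first = new ExternalPage(uri, "hello");
        ExternalPage second = new ExternalPage(uri, "hello");

        // accessors are named after the components
        System.out.println(first.url().getHost());  // example.org
        // equals and hashCode are based on the two fields
        System.out.println(first.equals(second));   // true
        System.out.println(first.hashCode() == second.hashCode());  // true
        // toString lists the components
        System.out.println(first);  // ExternalPage[url=https://example.org/a, content=hello]
    }
}
```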
Model the data, the whole data, and nothing but the data.
There are additional relations between them:
a page (load) is either successful or not
a successful page is either external or GitHub
a GitHub page is either for a PR or an issue
⇝ Use sealed types to model the alternatives!
public sealed interface Page
        permits ErrorPage, SuccessfulPage {
    URI url();
}
public sealed interface SuccessfulPage
        extends Page permits ExternalPage, GitHubPage {
    String content();
}
public sealed interface GitHubPage
        extends SuccessfulPage
        permits GitHubIssuePage, GitHubPrPage {
    Set<Page> links();
    default Stream<Page> subtree() { ... }
}
Use sealed types to limit inheritance.
public sealed interface Page
        permits ErrorPage, SuccessfulPage {
    // ...
}
Only ErrorPage and SuccessfulPage can implement/extend Page.
⇝ interface MyPage extends Page doesn’t compile
Inheriting types must:
be in the same module as the sealed type (or in the same package, if that module is the unnamed module)
directly inherit from the sealed type
be final, sealed, or non-sealed
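The three allowed modifiers can be seen side by side in a reduced hierarchy. This is a sketch with simplified types: `PluginPage` and `WikiPage` are hypothetical additions that exist only to show `non-sealed`, and the components are cut down to a single `String`.

```java
class SealedDemo {

    sealed interface Page permits ErrorPage, SuccessfulPage { }

    // final: records are implicitly final
    record ErrorPage(String url) implements Page { }

    // sealed: continues the closed hierarchy one level down
    sealed interface SuccessfulPage extends Page permits ExternalPage, PluginPage { }

    record ExternalPage(String url) implements SuccessfulPage { }

    // non-sealed (hypothetical): reopens this branch for arbitrary subtypes
    non-sealed interface PluginPage extends SuccessfulPage { }

    // allowed without a permits entry, because PluginPage is non-sealed
    record WikiPage(String url) implements PluginPage { }

    public static void main(String[] args) {
        System.out.println(Page.class.isSealed());        // true
        System.out.println(PluginPage.class.isSealed());  // false
        System.out.println(Page.class.getPermittedSubclasses().length);  // 2
    }
}
```

Note that a `non-sealed` branch gives up exhaustiveness for its own subtypes: a switch can still cover `PluginPage` as a whole, but not enumerate everything below it.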
Make illegal states unrepresentable.
Many are already, e.g. pages:
with error and with content
with issueNumber and prNumber
with issueNumber or prNumber but no links
Validate at the boundary.
⇝ Reject other illegal states in constructors.
public ExternalPage {
    Objects.requireNonNull(url);
    Objects.requireNonNull(content);
    if (content.isBlank())
        throw new IllegalArgumentException();
}
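For context, a compact constructor like this sits inside the record body and runs before the fields are assigned. A self-contained sketch follows; the wrapper class `ValidationDemo`, the helper `isRejected`, and the exception messages are additions for illustration.

```java
import java.net.URI;
import java.util.Objects;

class ValidationDemo {

    record ExternalPage(URI url, String content) {
        // compact canonical constructor: no parameter list, fields are assigned afterwards
        ExternalPage {
            Objects.requireNonNull(url, "url");
            Objects.requireNonNull(content, "content");
            if (content.isBlank())
                throw new IllegalArgumentException("content must not be blank");
        }
    }

    // Hypothetical helper: reports whether construction was rejected.
    static boolean isRejected(URI url, String content) {
        try {
            new ExternalPage(url, content);
            return false;
        } catch (NullPointerException | IllegalArgumentException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        URI uri = URI.create("https://example.org");
        System.out.println(isRejected(uri, "hello"));  // false
        System.out.println(isRejected(uri, "   "));    // true
        System.out.println(isRejected(null, "hello")); // true
    }
}
```

Because every `ExternalPage` has to pass through this constructor, code that receives one can rely on these checks without repeating them.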
Data is immutable.
Records are shallowly immutable,
but field types may not be.
⇝ Fix that during construction.
public GitHubPrPage {
    // [...]
    links = Set.copyOf(links);
}
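The effect of that defensive copy can be checked directly. The sketch below uses a reduced, hypothetical `PrPage` record with `Set<URI>` links so that it stays self-contained; it is not the talk's `GitHubPrPage`.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

class ImmutabilityDemo {

    // Hypothetical, reduced variant of a PR page.
    record PrPage(int prNumber, Set<URI> links) {
        PrPage {
            // Set.copyOf returns an unmodifiable copy (and rejects nulls)
            links = Set.copyOf(links);
        }
    }

    public static void main(String[] args) {
        Set<URI> mutable = new HashSet<>();
        mutable.add(URI.create("https://example.org/a"));
        PrPage page = new PrPage(1, mutable);

        // later changes to the caller's set do not leak into the record
        mutable.add(URI.create("https://example.org/b"));
        System.out.println(page.links().size());  // 1

        // and the record's own set cannot be modified
        try {
            page.links().add(URI.create("https://example.org/c"));
        } catch (UnsupportedOperationException ex) {
            System.out.println("unmodifiable");
        }
    }
}
```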
page "type" is explicit in Java’s type system
only legal combinations of fields are possible
API is more self-documenting
code is easier to test
But where did the operations go?
Model the data, the whole data, and nothing but the data.
⇝ Operations should be limited to derived quantities.
public Stats evaluateStatistics();
public String printPageList();
This actually applies to our operations.
But what if it didn’t? 😁
Pattern matching on sealed types is perfect
to apply polymorphic operations to data!
And records eschew encapsulation,
so everything is accessible.
In class Pretty:
public static String printPageList(Page rootPage) {
    if (!(rootPage instanceof GitHubPage ghPage))
        return createPageName(rootPage);
    return ghPage
        .subtree()
        .map(Pretty::createPageName)
        .collect(joining("\n"));
}
In class Pretty:
private static String createPageName(Page page) {
    return switch (page) {
        case ErrorPage err
            -> "💥 ERROR: " + err.url().getHost();
        case ExternalPage ext
            -> "💤 EXTERNAL: " + ext.url().getHost();
        case GitHubIssuePage issue
            -> "🐈 ISSUE #" + issue.issueNumber();
        case GitHubPrPage pr
            -> "🐙 PR #" + pr.prNumber();
    };
}
Typecheck, cast, and declaration all in one.
if (rootPage instanceof GitHubPage ghPage)
    // do something with `ghPage`
checks rootPage instanceof GitHubPage
declares variable GitHubPage ghPage
casts rootPage to GitHubPage and assigns it to ghPage
Only where the check passed is ghPage in scope (flow scoping).
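Flow scoping is easiest to see with a negated check and with `&&`. A minimal sketch, assuming simplified stand-in types: here `GitHubPage` is a record with a hypothetical `linkCount` component rather than the sealed interface from the example.

```java
class FlowScopingDemo {

    sealed interface Page permits ErrorPage, GitHubPage { }
    record ErrorPage(String url) implements Page { }
    // Simplified stand-in; linkCount is a hypothetical component.
    record GitHubPage(String url, int linkCount) implements Page { }

    static int linkCount(Page page) {
        if (!(page instanceof GitHubPage ghPage))
            return 0;               // ghPage is NOT in scope here
        return ghPage.linkCount();  // ghPage IS in scope here: the check must have passed
    }

    static boolean hasManyLinks(Page page) {
        // the right-hand side of && only runs if the check passed, so gh is in scope there
        return page instanceof GitHubPage gh && gh.linkCount() > 10;
    }

    public static void main(String[] args) {
        System.out.println(linkCount(new ErrorPage("https://example.org")));         // 0
        System.out.println(linkCount(new GitHubPage("https://example.org", 3)));     // 3
        System.out.println(hasManyLinks(new GitHubPage("https://example.org", 42))); // true
    }
}
```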
All patterns can be used in switches (finalized in Java 21):
return switch (page) {
    case ErrorPage err -> // use `ErrorPage err`
    case ExternalPage ext -> // use `ext`
    case GitHubIssuePage issue -> // use `issue`
    case GitHubPrPage pr -> // use `pr`
};
checks page against all listed types
executes matching branch with respective variable
requires no default branch if exhaustive
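Besides type patterns, a switch can also use record patterns to destructure the data in the same step (Java 21 and later). The following is a sketch with a reduced two-type hierarchy and `String` URLs; the `summarize` method is a hypothetical example, not part of the talk's code.

```java
class DestructuringDemo {

    sealed interface Page permits ErrorPage, ExternalPage { }
    record ErrorPage(String url, Exception ex) implements Page { }
    record ExternalPage(String url, String content) implements Page { }

    static String summarize(Page page) {
        // exhaustive over the sealed type, so no default branch is needed
        return switch (page) {
            case ErrorPage(String url, Exception ex)
                -> url + " failed: " + ex.getMessage();
            case ExternalPage(String url, String content)
                -> url + " has " + content.length() + " characters";
        };
    }

    public static void main(String[] args) {
        System.out.println(summarize(new ExternalPage("https://example.org", "hello")));
        // https://example.org has 5 characters
        System.out.println(summarize(new ErrorPage("https://example.org", new Exception("timeout"))));
        // https://example.org failed: timeout
    }
}
```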
In class Statistician:
public static Stats evaluate(Page rootPage) {
    Statistician statistician = new Statistician();
    statistician.evaluateTree(rootPage);
    return statistician.result();
}
private void evaluateTree(Page page) {
    if (page instanceof GitHubPage ghPage)
        ghPage.subtree().forEach(this::evaluatePage);
    else
        evaluatePage(page);
}
In class Statistician:
private void evaluatePage(Page page) {
    // `numberOf...` are fields
    switch (page) {
        case ErrorPage __ -> numberOfErrors++;
        case ExternalPage __ -> numberOfExternalLinks++;
        case GitHubIssuePage __ -> numberOfIssues++;
        case GitHubPrPage __ -> numberOfPrs++;
    }
}
Yes, switching over types is icky.
But switching over sealed types is safe.
What happens when we add:
public record GitHubCommitPage(
        URI url, String hash,
        String content, Set<Page> links)
        implements GitHubPage {
    // [...]
}
Follow the compile errors!
First stop: the sealed supertype.
⇝ Permit the new subtype!
public sealed interface GitHubPage
        extends SuccessfulPage
        permits GitHubIssuePage, GitHubPrPage,
            GitHubCommitPage {
    // [...]
}
Next stop: every switch without default.
⇝ Handle the new subtype!
switch (page) {
    case ErrorPage __ -> numberOfErrors++;
    case ExternalPage __ -> numberOfExternalLinks++;
    case GitHubIssuePage __ -> numberOfIssues++;
    case GitHubPrPage __ -> numberOfPrs++;
    case GitHubCommitPage __ -> numberOfCommits++;
}
operations separate from data
adding new operations is easy
adding new data types is more work,
but supported by the compiler
⇝ Like the visitor pattern, but less painful.
That was DOP!
records are product types
sealed types are sum types
This simple combination of mechanisms — aggregation and choice — is deceptively powerful
ad-hoc data structures
complex return types
complex domains
Often local, throw-away types used in one class or package.
record PageWithLinks(Page page, Set<URI> links) {
    PageWithLinks {
        requireNonNull(page);
        requireNonNull(links);
        // unmodifiable copy, so the record stays immutable
        links = Set.copyOf(links);
    }
    public PageWithLinks(Page page) {
        this(page, Set.of());
    }
}
Return values that are deconstructed immediately:
sealed interface MatchResult<T> {
    record NoMatch<T>() implements MatchResult<T> { }
    record ExactMatch<T>(T entity)
        implements MatchResult<T> { }
    record FuzzyMatches<T>(
        Collection<FuzzyMatch<T>> entities)
        implements MatchResult<T> { }
    record FuzzyMatch<T>(T entity, int distance) { }
}
MatchResult<User> findUser(String userName) { ... }
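"Deconstructed immediately" could look like the following on the caller's side. This is a sketch: it repeats the `MatchResult` declaration to stay self-contained, uses `String` in place of `User`, and the `describe` method is a hypothetical caller. It needs Java 21 or later for patterns in `switch`.

```java
import java.util.Collection;
import java.util.List;

class MatchResultDemo {

    sealed interface MatchResult<T> {
        record NoMatch<T>() implements MatchResult<T> { }
        record ExactMatch<T>(T entity) implements MatchResult<T> { }
        record FuzzyMatches<T>(Collection<FuzzyMatch<T>> entities) implements MatchResult<T> { }
        record FuzzyMatch<T>(T entity, int distance) { }
    }

    // Hypothetical caller: handles each alternative right where the result arrives.
    static String describe(MatchResult<String> result) {
        return switch (result) {
            case MatchResult.NoMatch<String> none -> "no match";
            case MatchResult.ExactMatch<String> exact -> "exact: " + exact.entity();
            case MatchResult.FuzzyMatches<String> fuzzy
                -> fuzzy.entities().size() + " fuzzy matches";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new MatchResult.NoMatch<>()));           // no match
        System.out.println(describe(new MatchResult.ExactMatch<>("alice")));  // exact: alice
        System.out.println(describe(new MatchResult.FuzzyMatches<>(List.of(
                new MatchResult.FuzzyMatch<>("alice", 1),
                new MatchResult.FuzzyMatch<>("alicia", 2)))));                // 2 fuzzy matches
    }
}
```

Compared with returning `null`, an `Optional`, or a list, the caller cannot forget one of the three outcomes, because the switch would stop compiling.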
Long-living objects that are part
of the program’s domain.
For example, Page.
immutable data structures
methods (functions?) that operate on them
Isn’t this just functional programming?!
Kind of.
Functional programming:
Everything is a function
⇝ Focus on creating and composing functions.
Data-oriented programming:
Model data as data.
⇝ Focus on correctly modeling the data.
OOP is not dead (again):
valuable for complex entities
or rich libraries
use whenever encapsulation is needed
still a good default on high level
DOP — consider when:
handling outside data (like JSON)
working with simple or ad-hoc data
data and behavior should be separated
model the data, the whole data,
and nothing but the data
data is immutable
validate at the boundary
make illegal states unrepresentable