What is DOP? |
A Lengthy Example |
That was DOP! |
Slides at slides.nipafx.dev.
Paradigms often come with an
"Everything is a …" sentence.
The goal of any programming paradigm is to manage complexity.
complexity comes in many forms
not all paradigms handle all forms equally well
⇝ "It depends"
Everything is an object
combines state and behavior
hinges on encapsulation
polymorphism through inheritance
Works best when defining/upholding boundaries.
Great use cases for OOP:
boundaries between libraries and clients
in large programs to enable modular reasoning
Consider a data-oriented approach for:
smaller (sub)systems
focused on data
Guiding principles:
model the data, the whole data,
and nothing but the data
data is immutable
validate at the boundary
make illegal states unrepresentable
From Brian Goetz' seminal article:
Data Oriented Programming in Java
Starting with a seed URL:
1. connect to URL
2. identify kind of page
3. identify interesting section
4. identify outgoing links
5. for each link, start at 1.
That logic is implemented in:
public class PageTreeFactory {
public static Page loadPageTree(/*...*/) {
// [...]
}
}
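The steps above amount to a recursive traversal. A minimal sketch, assuming an in-memory map in place of real network access and a depth limit to stop the recursion. CrawledPage, Crawler, web, and depth are hypothetical names for illustration, not the talk's actual PageTreeFactory, and steps 2 and 3 (kind of page, interesting section) are left out:

```java
import java.net.URI;
import java.util.List;
import java.util.Map;

// Hypothetical result type, used only in this sketch.
record CrawledPage(URI url, List<CrawledPage> links) { }

final class Crawler {

    // Assumption: a map from URL to outgoing links stands in for the network.
    private final Map<URI, List<URI>> web;

    Crawler(Map<URI, List<URI>> web) {
        this.web = web;
    }

    CrawledPage crawl(URI url, int depth) {
        // "connect" to the URL and identify its outgoing links
        List<URI> outgoing = web.getOrDefault(url, List.of());
        // assumed depth limit, so cyclic links cannot recurse forever
        if (depth == 0)
            return new CrawledPage(url, List.of());
        // for each link, start over at step 1
        List<CrawledPage> children = outgoing.stream()
                .map(link -> crawl(link, depth - 1))
                .toList();
        return new CrawledPage(url, children);
    }
}
```

Calling crawl with a seed URL and a small depth returns the seed page with its linked pages nested below it.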
What does Page look like?
Pages:
all pages have a url
unresolved pages have an error
resolved pages have content
GitHub pages have:
outgoing links
issueNumber
or prNumber
Operations on pages
(and their subtree):
pretty print
evaluate statistics
A single Page class with this API:
public URL url();
public Exception error();
public String content();
public int issueNumber();
public int prNumber();
public Set<Page> links();
public Stream<Page> subtree();
public Stats evaluateStatistics();
public String printPageList();
Problems:
page "type" is implicit
legal combination of fields is unclear
clients must "divine" the type
disparate operations on same class
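To make these problems concrete, here is a sketch of what client code against such a single class could look like. SinglePage and Client are hypothetical names, and encoding "absent" as null or -1 is an assumption of this sketch, not something the original API specifies:

```java
import java.net.URI;

// Hypothetical single-class model: every field exists on every page.
// Assumption: "absent" is encoded as null or -1.
final class SinglePage {
    final URI url;
    final Exception error;  // null unless loading failed
    final String content;   // null if loading failed
    final int issueNumber;  // -1 unless this is a GitHub issue
    final int prNumber;     // -1 unless this is a GitHub PR

    SinglePage(URI url, Exception error, String content,
            int issueNumber, int prNumber) {
        this.url = url;
        this.error = error;
        this.content = content;
        this.issueNumber = issueNumber;
        this.prNumber = prNumber;
    }
}

final class Client {
    // The client has to guess the page "type" from field values,
    // and the order of the checks silently decides ambiguous cases.
    static String describe(SinglePage page) {
        if (page.error != null) return "error";
        if (page.issueNumber >= 0) return "issue";
        if (page.prNumber >= 0) return "pr";
        return "external";
    }
}
```

Nothing stops a caller from creating a page that has an error, content, an issue number, and a PR number all at once; describe then just reports whichever check comes first.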
Model the data, the whole data,
and nothing but the data.
There are four kinds of pages:
error page
external page
GitHub issue page
GitHub PR page
⇝ Use four records to model them!
[Finalized in Java 16]
[T]ransparent carriers for immutable data
opt out of encapsulation
allow compiler to understand internals
Most obvious consequence: less boilerplate.
record ExternalPage(URI url, String content) { }
ExternalPage is final
private final fields: URI url and String content
constructor: ExternalPage(URI url, String content)
accessors: URI url() and String content()
equals(), hashCode(), toString() that use the two fields
All method/constructor bodies can be customized!
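A small sketch of that customization, assuming we want a shorter toString() that leaves out the content. The override is an illustration only; everything else is still generated by the compiler:

```java
import java.net.URI;

record ExternalPage(URI url, String content) {

    // Customized: omit the (potentially long) content from the output.
    @Override
    public String toString() {
        return "ExternalPage[" + url + "]";
    }

    // Still generated: constructor, url(), content(), equals(), hashCode().
}
```

Two ExternalPage instances with the same url and content remain equal and share a hash code, because only toString() was replaced.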
public record ErrorPage(
URI url, Exception ex) { }
public record ExternalPage(
URI url, String content) { }
public record GitHubIssuePage(
URI url, int issueNumber,
String content, Set<Page> links) { }
public record GitHubPrPage(
URI url, int prNumber,
String content, Set<Page> links) { }
Model the data, the whole data,
and nothing but the data.
There are additional relations between them:
a page (load) is either successful or not
a successful page is either external or GitHub
a GitHub page is either for a PR or an issue
⇝ Use sealed types to model the alternatives!
[Finalized in Java 17]
Sealed types limit inheritance,
by only allowing specific subtypes.
communicates intention to developers
allows compiler to check exhaustiveness
public sealed interface Page
permits ErrorPage, SuccessfulPage {
// ...
}
Only ErrorPage and SuccessfulPage can implement/extend Page.
⇝ interface MyPage extends Page doesn’t compile
public sealed interface Page
permits ErrorPage, SuccessfulPage {
// ...
}
Inheriting types must:
be in the same module as the sealed type (or in the same package, if in the unnamed module)
directly inherit from the sealed type
be final, sealed, or non-sealed
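The three modifiers can be seen side by side in a sketch like the following. Node, Leaf, Branch, Pair, Plugin, and ThirdParty are hypothetical names chosen for illustration; all types are assumed to live in the same source file and package:

```java
// Sealed root: only the three listed types may inherit directly.
sealed interface Node permits Leaf, Branch, Plugin { }

// final: the hierarchy ends here.
final class Leaf implements Node { }

// sealed: inheritance continues, but again only for listed types.
sealed interface Branch extends Node permits Pair { }

// Records are implicitly final, so no modifier is needed.
record Pair(Node left, Node right) implements Branch { }

// non-sealed: reopens this branch of the hierarchy for anyone.
non-sealed interface Plugin extends Node { }

// Allowed, because Plugin is non-sealed.
final class ThirdParty implements Plugin { }
```

A type that implements Node directly without being listed in its permits clause would be rejected by the compiler.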
public sealed interface Page
permits ErrorPage, SuccessfulPage {
URI url();
}
public sealed interface SuccessfulPage
extends Page permits ExternalPage, GitHubPage {
String content();
}
public sealed interface GitHubPage
extends SuccessfulPage
permits GitHubIssuePage, GitHubPrPage {
Set<Page> links();
default Stream<Page> subtree() { ... }
}
Make illegal states unrepresentable.
Many are already, e.g. a page:
with error and with content
with issueNumber and prNumber
with issueNumber or prNumber but no links
Validate at the boundary.
⇝ Reject other illegal states in constructors.
record ExternalPage(URI url, String content) {
// compact constructor
ExternalPage {
Objects.requireNonNull(url);
Objects.requireNonNull(content);
if (content.isBlank())
throw new IllegalArgumentException();
}
}
Data is immutable.
Records are shallowly immutable,
but field types may not be.
⇝ Fix that during construction.
// compact constructor
GitHubPrPage {
// [...]
links = Set.copyOf(links);
}
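The effect of the defensive copy can be checked with a short sketch. LinkedPage is a hypothetical, simplified record (links are plain URIs here) used only to keep the example self-contained:

```java
import java.net.URI;
import java.util.Set;

// Hypothetical, simplified stand-in for the GitHub page records.
record LinkedPage(URI url, Set<URI> links) {

    // compact constructor
    LinkedPage {
        // Set.copyOf returns an unmodifiable copy and rejects
        // null sets as well as null elements.
        links = Set.copyOf(links);
    }
}
```

Later changes to the set that was passed in do not show up in the record, and calling add on the set returned by links() throws an UnsupportedOperationException.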
page "type" is explicit in Java’s type system
only legal combinations of fields are possible
API is more self-documenting
code is easier to test
But where did the operations go?
Model the data, the whole data,
and nothing but the data.
⇝ Operations should be limited to derived quantities.
public Stats evaluateStatistics();
public String printPageList();
This actually applies to our operations.
But what if it didn’t? 😁
Pattern matching on sealed types is perfect
to apply polymorphic operations to data!
And records eschew encapsulation,
so everything is accessible.
[Finalized in Java 16]
Typecheck, cast, and declaration all in one.
if (rootPage instanceof GitHubPage ghPage)
// do something with `ghPage`
checks rootPage instanceof GitHubPage
declares variable GitHubPage ghPage
Only where the check is passed is ghPage in scope (flow scoping).
if (!(rootPage instanceof GitHubPage ghPage))
// can't use `ghPage` here
return;
// do something with `ghPage` here 😈
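A compilable sketch of the same scoping rule, using String instead of the page types to stay self-contained. FlowScoping and lengthOrMinusOne are hypothetical names:

```java
final class FlowScoping {

    // Returns the length of `obj` if it is a non-empty String, else -1.
    static int lengthOrMinusOne(Object obj) {
        // `text` is usable on the right of `||`, because that side is
        // only evaluated if the instanceof check passed.
        if (!(obj instanceof String text) || text.isEmpty())
            return -1;
        // The failed-check branch returned, so `text` is in scope here.
        return text.length();
    }
}
```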
[Preview since Java 17; probably final in 21 - JEP 441]
All patterns can be used in switches
switch (page) {
case ExternalPage ext -> // use `ext`
case GitHubPrPage pr -> // use `pr`
// ...
};
checks page against all listed types
executes matching branch with respective variable
In class Statistician:
public static Stats evaluate(Page rootPage) {
Statistician statistician = new Statistician();
statistician.evaluateTree(rootPage);
return statistician.result();
}
private void evaluateTree(Page page) {
if (page instanceof GitHubPage ghPage)
ghPage.subtree().forEach(this::evaluatePage);
else
evaluatePage(page);
}
In class Statistician:
private void evaluatePage(Page page) {
// `numberOf...` are fields
switch (page) {
case ErrorPage __ -> numberOfErrors++;
case ExternalPage __ -> numberOfExternals++;
case GitHubIssuePage __ -> numberOfIssues++;
case GitHubPrPage __ -> numberOfPrs++;
}
}
In class Pretty:
public static String printPageList(Page rootPage) {
if (!(rootPage instanceof GitHubPage ghPage))
return createPageName(rootPage);
return ghPage
.subtree()
.map(Pretty::createPageName)
.collect(joining("\n"));
}
In class Pretty:
private static String createPageName(Page page) {
return switch (page) {
case ErrorPage err
-> "💥 ERROR: " + err.url().getHost();
case ExternalPage ext
-> "💤 EXTERNAL: " + ext.url().getHost();
case GitHubIssuePage issue
-> "🐈 ISSUE #" + issue.issueNumber();
case GitHubPrPage pr
-> "🐙 PR #" + pr.prNumber();
};
}
⇝ Simpler access with record/deconstruction patterns.
[Preview since Java 19; probably final in 21 - JEP 440]
Records are transparent, so you can
deconstruct them in if and switch:
record ExternalPage(URI url, String content) { }
// elsewhere
Object obj = // ...
if (obj instanceof ExternalPage(var url, var content))
// use `url` and `content` here
switch (obj) {
case ExternalPage(var url, var content) ->
// use `url` and `content` here
}
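Record patterns can also be nested. A runnable sketch, assuming a JDK where record patterns are final (Java 21 or later). Link, Deconstruct, and describe are hypothetical names added for illustration:

```java
import java.net.URI;

record ExternalPage(URI url, String content) { }

// Hypothetical wrapper record, only here to show nesting.
record Link(String label, ExternalPage target) { }

final class Deconstruct {

    static String describe(Object obj) {
        // flat pattern: binds both components of ExternalPage
        if (obj instanceof ExternalPage(var url, var content))
            return url.getHost();
        // nested pattern: looks into Link and into its ExternalPage
        if (obj instanceof Link(var label, ExternalPage(var url, var content)))
            return label + " -> " + url.getHost();
        return "unknown";
    }
}
```

The pattern variables of the first if are only in scope in its own branch, which is why the second if can reuse the names url and content.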
Use deconstruction patterns:
public static String createPageName(Page page) {
return switch (page) {
case ErrorPage(var url, var ex)
-> "💥 ERROR: " + url.getHost();
case GitHubIssuePage(
var url, int issueNumber,
var content, var links)
-> "🐈 ISSUE #" + issueNumber;
// ...
};
}
⇝ Even simpler access with unnamed patterns.
[Maybe preview in Java 21 - JEP 443]
Replace variables you don’t need with _:
case ErrorPage(var url, _)
-> "💥 ERROR: " + url.getHost();
case GitHubIssuePage(_, int issueNumber, _, _)
-> "🐈 ISSUE #" + issueNumber;
Use record and unnamed patterns for simple access:
private static String createPageName(Page page) {
return switch (page) {
case ErrorPage(var url, _)
-> "💥 ERROR: " + url.getHost();
case ExternalPage(var url, _)
-> "💤 EXTERNAL: " + url.getHost();
case GitHubIssuePage(_, var issueNumber, _, _)
-> "🐈 ISSUE #" + issueNumber;
case GitHubPrPage(_, var prNumber, _, _)
-> "🐙 PR #" + prNumber;
};
}
Looks good?
"Isn’t switching over types icky?"
Yes, but why?
What happens when we add:
public record GitHubCommitPage(
URI url, String hash,
String content, Set<Page> links)
implements GitHubPage {
// [...]
}
Follow the compile errors!
First stop: the sealed supertype.
⇝ Permit the new subtype!
public sealed interface GitHubPage
extends SuccessfulPage
permits GitHubIssuePage, GitHubPrPage,
GitHubCommitPage {
// [...]
}
Next stop: all switches without default.
// non-exhaustive ⇝ compile error
switch (page) {
case ErrorPage __ -> numberOfErrors++;
case ExternalPage __ -> numberOfExternals++;
case GitHubIssuePage __ -> numberOfIssues++;
case GitHubPrPage __ -> numberOfPrs++;
}
⇝ Handle the new subtype!
switch (page) {
case ErrorPage __ -> numberOfErrors++;
case ExternalPage __ -> numberOfExternals++;
case GitHubIssuePage __ -> numberOfIssues++;
case GitHubPrPage __ -> numberOfPrs++;
case GitHubCommitPage __ -> numberOfCommits++;
}
To keep operations maintainable:
switch over sealed types
enumerate all possible types
(even if you need to ignore some)
avoid default branch
⇝ Compile error when new type is added.
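A sketch of an operation that follows these rules, assuming a JDK with final pattern matching for switch (Java 21 or later). The hierarchy is flattened to keep the example short, and LinkCounter with outgoingLinks is a hypothetical operation, not part of the original example:

```java
import java.net.URI;
import java.util.Set;

// Flattened hierarchy (assumption: no intermediate interfaces).
sealed interface Page
        permits ErrorPage, ExternalPage, GitHubIssuePage, GitHubPrPage {
    URI url();
}

record ErrorPage(URI url, Exception ex) implements Page { }
record ExternalPage(URI url, String content) implements Page { }
record GitHubIssuePage(URI url, int issueNumber,
        String content, Set<Page> links) implements Page { }
record GitHubPrPage(URI url, int prNumber,
        String content, Set<Page> links) implements Page { }

final class LinkCounter {

    static int outgoingLinks(Page page) {
        return switch (page) {
            case GitHubIssuePage issue -> issue.links().size();
            case GitHubPrPage pr -> pr.links().size();
            // Ignored types are listed explicitly instead of using
            // `default`, so a new Page subtype breaks compilation here.
            case ErrorPage err -> 0;
            case ExternalPage ext -> 0;
        };
    }
}
```

With a default branch in place of the last two cases, a newly permitted subtype would silently count as 0 instead of causing a compile error.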
operations separate from data
adding new operations is easy
adding new data types is more work,
but supported by the compiler
⇝ Like the visitor pattern, but less painful.
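For comparison, a sketch of the same idea as a classic visitor. It is reduced to two page types with a plain String host, and PageVisitor, accept, and PageNameVisitor are hypothetical names for illustration:

```java
// One visit method per data type.
interface PageVisitor<R> {
    R visitError(ErrorPage page);
    R visitExternal(ExternalPage page);
}

// Every data type needs an accept method for the double dispatch.
interface Page {
    <R> R accept(PageVisitor<R> visitor);
}

record ErrorPage(String host) implements Page {
    @Override
    public <R> R accept(PageVisitor<R> visitor) {
        return visitor.visitError(this);
    }
}

record ExternalPage(String host) implements Page {
    @Override
    public <R> R accept(PageVisitor<R> visitor) {
        return visitor.visitExternal(this);
    }
}

// One operation = one visitor implementation.
final class PageNameVisitor implements PageVisitor<String> {
    @Override
    public String visitError(ErrorPage page) {
        return "ERROR: " + page.host();
    }

    @Override
    public String visitExternal(ExternalPage page) {
        return "EXTERNAL: " + page.host();
    }
}
```

Adding a page type forces a new visit method on every visitor, much like the compile errors in an exhaustive switch, but the accept methods and the visitor interface are extra ceremony that the switch does not need.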
records are product types
sealed types are sum types
This simple combination of mechanisms — aggregation and choice — is deceptively powerful
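The names "product" and "sum" refer to the number of possible values. A small sketch that counts them, with hypothetical types (Light, Door, Crossing, Signal) chosen only because they have few values:

```java
import java.util.ArrayList;
import java.util.List;

enum Light { RED, GREEN }            // 2 values
enum Door { OPEN, CLOSED, LOCKED }   // 3 values

// Product type: a Light AND a Door, so 2 * 3 possible values.
record Crossing(Light light, Door door) { }

// Sum type: a LightSignal OR a DoorSignal, so 2 + 3 possible values.
sealed interface Signal { }
record LightSignal(Light light) implements Signal { }
record DoorSignal(Door door) implements Signal { }

final class Cardinality {

    static List<Crossing> allCrossings() {
        List<Crossing> all = new ArrayList<>();
        for (Light light : Light.values())
            for (Door door : Door.values())
                all.add(new Crossing(light, door));
        return all;
    }

    static List<Signal> allSignals() {
        List<Signal> all = new ArrayList<>();
        for (Light light : Light.values())
            all.add(new LightSignal(light));
        for (Door door : Door.values())
            all.add(new DoorSignal(door));
        return all;
    }
}
```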
ad-hoc data structures
complex return types
complex domains
Often local, throw-away types used in one class or package.
record PageWithLinks(Page page, Set<URI> links) {
PageWithLinks {
requireNonNull(page);
requireNonNull(links);
links = Set.copyOf(links);
}
public PageWithLinks(Page page) {
this(page, Set.of());
}
}
Return values that are deconstructed immediately:
// type declaration
sealed interface Match<T> { }
record None<T>() implements Match<T> { }
record Exact<T>(T entity) implements Match<T> { }
record Fuzzies<T>(Collection<Fuzzy<T>> entities)
implements Match<T> { }
record Fuzzy<T>(T entity, int distance) { }
// method declaration
Match<User> findUser(String userName) { ... }
Return values that are deconstructed immediately:
// calling the method
switch (findUser("John Doe")) {
case None<User> none -> // ...
case Exact<User> exact -> // ...
case Fuzzies<User> fuzzies -> // ...
}
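Put together, a runnable sketch of such a return type could look as follows, assuming Java 21 or later. User, UserSearch, the fixed user list, the case-insensitive "fuzzy" match, and the distance of 1 are all placeholders invented for this example:

```java
import java.util.Collection;
import java.util.List;

sealed interface Match<T> { }
record None<T>() implements Match<T> { }
record Exact<T>(T entity) implements Match<T> { }
record Fuzzies<T>(Collection<Fuzzy<T>> entities) implements Match<T> { }
record Fuzzy<T>(T entity, int distance) { }

// Hypothetical domain type.
record User(String name) { }

final class UserSearch {

    // Placeholder data instead of a real user store.
    private static final List<User> USERS =
            List.of(new User("John Doe"), new User("Jane Doe"));

    static Match<User> findUser(String userName) {
        for (User user : USERS)
            if (user.name().equals(userName))
                return new Exact<User>(user);
        // Placeholder "fuzzy" logic: ignore case, distance always 1.
        List<Fuzzy<User>> fuzzies = USERS.stream()
                .filter(user -> user.name().equalsIgnoreCase(userName))
                .map(user -> new Fuzzy<User>(user, 1))
                .toList();
        if (fuzzies.isEmpty())
            return new None<User>();
        return new Fuzzies<User>(fuzzies);
    }

    // The result is deconstructed right where it is returned.
    static String greet(String userName) {
        return switch (findUser(userName)) {
            case None<User> none -> "nobody found";
            case Exact<User>(var user) -> "hello " + user.name();
            case Fuzzies<User>(var candidates)
                -> candidates.size() + " similar users";
        };
    }
}
```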
Long-living objects that are part
of the program’s domain.
For example, Page.
immutable data structures
methods (functions?) that operate on them
Isn’t this just functional programming?!
Kind of.
Functional programming:
Everything is a function
⇝ Focus on creating and composing functions.
Data-oriented programming:
Model data as data.
⇝ Focus on correctly modeling the data.
OOP is not dead (again):
valuable for complex entities or rich libraries
use whenever encapsulation is needed
still a good default on high level
DOP — consider when:
mainly handling outside data
working with simple or ad-hoc data
data and behavior should be separated
Use Java’s strong typing to model data as data:
use classes to represent data, particularly:
data as data with records
alternatives with sealed classes
use methods (separately) to model behavior, particularly:
exhaustive switch without default
pattern matching to destructure polymorphic data
model the data, the whole data,
and nothing but the data
data is immutable
validate at the boundary
make illegal states unrepresentable