What is DOP? |
A Lengthy Example |
That was DOP! |
Slides at slides.nipafx.dev.
Paradigms often come with an
"Everything is a …" sentence.
The goal of any programming paradigm is to manage complexity.
complexity comes in many forms
not all paradigms handle all forms equally well
⇝ "It depends"
Everything is an object
combines state and behavior
hinges on encapsulation
polymorphism through inheritance
Works best when defining/upholding boundaries.
Great use cases for OOP:
boundaries between libraries and clients
in large programs to enable modular reasoning
Consider a data-oriented approach for:
smaller (sub)systems
focused on data
Guiding principles:
model the data, the whole data,
and nothing but the data
data is immutable
validate at the boundary
make illegal states unrepresentable
From Brian Goetz' seminal article:
Data Oriented Programming in Java
Starting with a seed URL:
1. connect to URL
2. identify kind of page
3. identify interesting section
4. identify outgoing links
5. for each link, start at 1.
(Code on github.com/nipafx/loom-lab.)
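The recursion in these steps can be sketched roughly as follows. This is not the code from the repository: `CrawlSketch`, `Fetched`, `Node`, the `fetch` function, and the `depth` limit are all assumptions made for illustration, and connecting to a URL is replaced by a lookup function so the sketch runs without a network.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public final class CrawlSketch {

    // hypothetical result of connecting to a URL and inspecting the document:
    // the kind of page, its interesting section, and its outgoing links
    record Fetched(String kind, String section, List<URI> links) { }

    // hypothetical node of the resulting tree
    record Node(URI url, String kind, String section, List<Node> children) { }

    // `fetch` stands in for the connect/identify steps;
    // `depth` is an assumed guard so that cyclic links terminate
    static Node crawl(URI url, int depth, Function<URI, Fetched> fetch) {
        Fetched fetched = fetch.apply(url);
        List<Node> children = depth <= 0
                ? List.of()
                : fetched.links().stream()
                        // for each link, start over with that link as the new URL
                        .map(link -> crawl(link, depth - 1, fetch))
                        .toList();
        return new Node(url, fetched.kind(), fetched.section(), children);
    }

    // counts the nodes of the tree
    static int size(Node node) {
        return 1 + node.children().stream().mapToInt(CrawlSketch::size).sum();
    }

    public static void main(String[] args) {
        URI seed = URI.create("https://example.org/seed");
        URI child = URI.create("https://example.org/child");
        // two pages that link to each other
        Map<URI, Fetched> web = Map.of(
                seed, new Fetched("issue", "seed text", List.of(child)),
                child, new Fetched("external", "child text", List.of(seed)));
        Node tree = crawl(seed, 2, web::get);
        System.out.println(size(tree)); // prints 3
    }
}
```

Passing the lookup as a function keeps the traversal testable; a real implementation would also have to deal with failed connections.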
That logic is implemented in:
public class PageTreeFactory {
public static Page loadPageTree(/*...*/) {
// [...]
}
}
What does Page look like?
Pages:
all pages have a url
unresolved pages have an error
resolved pages have content
GitHub pages have:
outgoing links
issueNumber
or prNumber
Possible operations on pages
(and their subtree):
display as interactive graph
compute graph properties
categorize pages by topic
analyze mood of interactions
process payment for analyses
Example operations going forward:
evaluate statistics
create pretty string
A single Page class with this API:
public URL url();
public Exception error();
public String content();
public int issueNumber();
public int prNumber();
public Set<Page> links();
public Stream<Page> subtree();
public Stats evaluateStatistics();
public String toPrettyString();
Problems:
page "type" is implicit
legal combination of fields is unclear
clients must "divine" the type
disparate operations on same class
Model the data, the whole data,
and nothing but the data.
There are four kinds of pages:
error page
external page
GitHub issue page
GitHub PR page
⇝ Use four records to model them!
[Finalized in Java 16 — JEP 395]
Transparent carriers for immutable data.
compiler understands internals
couples API to internals
reduces verbosity a lot
record ExternalPage(URI url, String content) { }
ExternalPage is final
private final fields: URI url and String content
constructor: ExternalPage(URI url, String content)
accessors: URI url() and String content()
equals(), hashCode(), toString() that use the two fields
All method/constructor bodies can be customized.
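A small sketch of what the compiler derives for such a record. The class name `RecordDemo` and the example URL are assumptions for illustration only.

```java
import java.net.URI;

public final class RecordDemo {

    // same shape as the record above
    record ExternalPage(URI url, String content) { }

    public static void main(String[] args) {
        URI url = URI.create("https://example.org/docs");
        ExternalPage first = new ExternalPage(url, "Hello");
        ExternalPage second = new ExternalPage(url, "Hello");

        // accessors are named after the components
        System.out.println(first.url().getHost()); // prints example.org
        // equals() and hashCode() are based on the component values
        System.out.println(first.equals(second)); // prints true
        System.out.println(first.hashCode() == second.hashCode()); // prints true
        // toString() lists all components
        System.out.println(first);
        // prints ExternalPage[url=https://example.org/docs, content=Hello]
    }
}
```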
public record ErrorPage(
URI url, Exception ex) { }
public record ExternalPage(
URI url, String content) { }
public record GitHubIssuePage(
URI url, String content,
int issueNumber, Set<Page> links) { }
public record GitHubPrPage(
URI url, String content,
int prNumber, Set<Page> links) { }
Model the data, the whole data,
and nothing but the data.
There are additional relations between them:
a page (load) is either successful or not
a successful page is either external or GitHub
a GitHub page is either for a PR or an issue
⇝ Use sealed types to model the alternatives!
[Finalized in Java 17 — JEP 409]
Sealed types limit inheritance,
by only allowing specific subtypes.
communicates intention to developers
allows compiler to check exhaustiveness
public sealed interface Page
permits ErrorPage, SuccessfulPage {
// ...
}
Only ErrorPage and SuccessfulPage can implement/extend Page.
⇝ interface MyPage extends Page doesn’t compile
public sealed interface Page
permits ErrorPage, SuccessfulPage {
URI url();
}
public sealed interface SuccessfulPage
extends Page permits ExternalPage, GitHubPage {
String content();
}
public sealed interface GitHubPage
extends SuccessfulPage
permits GitHubIssuePage, GitHubPrPage {
Set<Page> links();
default Stream<Page> subtree() { ... }
}
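Put together, the sealed interfaces and the four records could look like the sketch below. The wrapper class `PageHierarchySketch`, the `sampleTree` helper, and the example URLs are assumptions. The body of `subtree()` is elided in the original, so the traversal shown here is only one possible implementation, and it does not guard against cyclic links.

```java
import java.net.URI;
import java.util.Set;
import java.util.stream.Stream;

public final class PageHierarchySketch {

    sealed interface Page permits ErrorPage, SuccessfulPage {
        URI url();
    }

    sealed interface SuccessfulPage extends Page permits ExternalPage, GitHubPage {
        String content();
    }

    sealed interface GitHubPage extends SuccessfulPage
            permits GitHubIssuePage, GitHubPrPage {
        Set<Page> links();

        // ASSUMPTION: one possible depth-first traversal (no cycle detection)
        default Stream<Page> subtree() {
            return Stream.concat(
                    Stream.<Page>of(this),
                    links().stream().flatMap(link -> link instanceof GitHubPage gh
                            ? gh.subtree()
                            : Stream.of(link)));
        }
    }

    record ErrorPage(URI url, Exception ex) implements Page { }
    record ExternalPage(URI url, String content) implements SuccessfulPage { }
    record GitHubIssuePage(URI url, String content, int issueNumber, Set<Page> links)
            implements GitHubPage { }
    record GitHubPrPage(URI url, String content, int prNumber, Set<Page> links)
            implements GitHubPage { }

    // hypothetical sample data: a PR linking to an issue and an external page,
    // where the issue links to a page that failed to load
    static GitHubPage sampleTree() {
        Page error = new ErrorPage(
                URI.create("https://example.org/broken"), new RuntimeException("timeout"));
        Page external = new ExternalPage(URI.create("https://example.org/docs"), "docs");
        Page issue = new GitHubIssuePage(
                URI.create("https://example.org/issues/1"), "issue", 1, Set.of(error));
        return new GitHubPrPage(
                URI.create("https://example.org/pull/2"), "pr", 2, Set.of(issue, external));
    }

    public static void main(String[] args) {
        System.out.println(sampleTree().subtree().count()); // prints 4
    }
}
```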
Make illegal states unrepresentable.
Many are already, e.g.:
with error and with content
with issueNumber and prNumber
with issueNumber or prNumber but no links
Validate at the boundary.
⇝ Reject other illegal states in constructors.
record ExternalPage(URI url, String content) {
// compact constructor
ExternalPage {
Objects.requireNonNull(url);
Objects.requireNonNull(content);
if (content.isBlank())
throw new IllegalArgumentException();
}
}
Data is immutable.
Records are shallowly immutable,
but field types may not be.
⇝ Fix that during construction.
// compact constructor
GitHubPrPage {
// [...]
links = Set.copyOf(links);
}
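A sketch of both ideas, validation and defensive copying, in one runnable class. To stay self-contained it is simplified: `links` holds plain URIs instead of pages, and `BoundaryDemo` and the helper methods are assumptions, not part of the original code.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public final class BoundaryDemo {

    // simplified variant of GitHubPrPage: links are URIs here (assumption)
    record GitHubPrPage(URI url, String content, int prNumber, Set<URI> links) {
        GitHubPrPage {
            Objects.requireNonNull(url);
            Objects.requireNonNull(content);
            if (content.isBlank())
                throw new IllegalArgumentException("content must not be blank");
            // Set.copyOf creates an unmodifiable copy
            // and rejects null sets and null elements
            links = Set.copyOf(links);
        }
    }

    // returns the size of the record's links after the caller mutated its own set
    static int linksAfterOutsideMutation() {
        Set<URI> mutable = new HashSet<>();
        mutable.add(URI.create("https://example.org/a"));
        GitHubPrPage pr = new GitHubPrPage(
                URI.create("https://example.org/pull/1"), "content", 1, mutable);
        mutable.add(URI.create("https://example.org/b"));
        return pr.links().size();
    }

    static boolean rejectsBlankContent() {
        try {
            new GitHubPrPage(URI.create("https://example.org/pull/1"), " ", 1, Set.of());
            return false;
        } catch (IllegalArgumentException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(linksAfterOutsideMutation()); // prints 1
        System.out.println(rejectsBlankContent()); // prints true
    }
}
```

Because the copy happens in the compact constructor, no instance with a shared or modifiable set can be created.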
page "type" is explicit in Java’s type system
only legal combinations of fields are possible
API is more self-documenting
code is easier to test
But where did the operations go?
Model the data, the whole data,
and nothing but the data.
⇝ Operations should be limited to derived quantities.
public Stats evaluateStatistics();
public String toPrettyString();
This actually applies to our operations.
But what if it didn’t? 😁
Pattern matching on sealed types is perfect
to apply polymorphic operations to data!
And records eschew encapsulation,
so everything is accessible.
[Finalized in Java 16 — JEP 394]
Typecheck, cast, and declaration all in one.
if (rootPage instanceof GitHubPage ghPage)
// do something with `ghPage`
checks rootPage instanceof GitHubPage
declares variable GitHubPage ghPage
Only where the check is passed is ghPage in scope.
(Flow-scoping)
if (!(rootPage instanceof GitHubPage ghPage))
// can't use `ghPage` here
return;
// do something with `ghPage` here 😈
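A compilable sketch of flow scoping. The two-type hierarchy is a simplified stand-in for the page types: here `GitHubPage` is a record with a hypothetical `repo` component, and `FlowScopingDemo` is an assumed name.

```java
public final class FlowScopingDemo {

    // simplified stand-ins for the page types (assumption)
    sealed interface Page permits GitHubPage, ExternalPage { }
    record GitHubPage(String repo) implements Page { }
    record ExternalPage(String host) implements Page { }

    static String describe(Page page) {
        if (!(page instanceof GitHubPage ghPage))
            // `ghPage` is not in scope here because the check failed
            return "not a GitHub page";
        // the if-branch always returns, so the check must have passed here
        return "GitHub page for " + ghPage.repo();
    }

    static boolean isRepo(Page page, String repo) {
        // the right-hand side of && only runs if the check passed
        return page instanceof GitHubPage ghPage && ghPage.repo().equals(repo);
    }

    public static void main(String[] args) {
        System.out.println(describe(new GitHubPage("loom-lab"))); // prints GitHub page for loom-lab
        System.out.println(describe(new ExternalPage("example.org"))); // prints not a GitHub page
        System.out.println(isRepo(new GitHubPage("loom-lab"), "loom-lab")); // prints true
    }
}
```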
[Finalized in Java 21 — JEP 441]
All patterns can be used in switches
switch (page) {
case GitHubPrPage pr -> // use `pr`
case ExternalPage ext -> // use `ext`
// ...
};
checks page
against all listed types
executes matching branch with respective variable
In class Statistician:
public static Stats evaluate(Page rootPage) {
Statistician statistician = new Statistician();
statistician.evaluateTree(rootPage);
return statistician.result();
}
private void evaluateTree(Page page) {
if (page instanceof GitHubPage ghPage)
ghPage.subtree().forEach(this::evaluatePage);
else
evaluatePage(page);
}
In class Statistician:
private void evaluatePage(Page page) {
// `numberOf...` are fields
switch (page) {
case GitHubIssuePage issue -> numberOfIssues++;
case GitHubPrPage pr -> numberOfPrs++;
case ExternalPage ext -> numberOfExternals++;
case ErrorPage err -> numberOfErrors++;
}
}
In class Pretty:
public static String toPrettyString(Page rootPage) {
if (!(rootPage instanceof GitHubPage ghPage))
return createPrettyString(rootPage);
return ghPage
.subtree()
.map(Pretty::createPrettyString)
.collect(joining("\n"));
}
In class Pretty:
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage issue
-> "🐈 ISSUE #" + issue.issueNumber();
case GitHubPrPage pr
-> "🐙 PR #" + pr.prNumber();
case ExternalPage ext
-> "💤 EXTERNAL: " + ext.url().getHost();
case ErrorPage err
-> "💥 ERROR: " + err.url().getHost();
};
}
⇝ Simpler access with record/deconstruction patterns.
[Finalized in Java 21 — JEP 440]
Records are transparent, so you can
deconstruct them in if and switch:
record ExternalPage(URI url, String content) { }
// elsewhere
switch (page) {
case ExternalPage(var url, var content)
-> // use `url` and `content` here
}
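The same deconstruction works in an if with instanceof. A minimal sketch, where `DeconstructionDemo` and `hostOf` are hypothetical names:

```java
import java.net.URI;

public final class DeconstructionDemo {

    record ExternalPage(URI url, String content) { }

    static String hostOf(Object page) {
        // checks the type and binds both components in one step
        if (page instanceof ExternalPage(URI url, String content))
            return url.getHost();
        return "unknown";
    }

    public static void main(String[] args) {
        Object page = new ExternalPage(URI.create("https://example.org/docs"), "Hello");
        System.out.println(hostOf(page)); // prints example.org
        System.out.println(hostOf("not a page")); // prints unknown
    }
}
```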
Use deconstruction patterns:
public static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage(
var url, var content,
int issueNumber, var links)
-> "🐈 ISSUE #" + issueNumber;
case ErrorPage(var url, var ex)
-> "💥 ERROR: " + url.getHost();
// ...
};
}
⇝ Even simpler access with unnamed patterns.
Use record and unnamed patterns for simple access:
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage(_, _, int issueNumber, _)
-> "🐈 ISSUE #" + issueNumber;
case GitHubPrPage(_, _, int prNumber, _)
-> "🐙 PR #" + prNumber;
case ExternalPage(var url, _)
-> "💤 EXTERNAL: " + url.getHost();
case ErrorPage(var url, _)
-> "💥 ERROR: " + url.getHost();
};
}
Looks good?
"Isn’t switching over types icky?"
Yes, but why?
⇝ It fails unpredictably when new types are added.
This approach behaves much better:
let’s add GitHubCommitPage implements GitHubPage
follow the compile errors!
Starting point:
record GitHubCommitPage(/*…*/) implements GitHubPage {
// ...
}
Compile error because supertype is sealed.
⇝ Go to the sealed supertype.
Next stop: the sealed supertype
⇝ Permit the new subtype!
public sealed interface GitHubPage
extends SuccessfulPage
permits GitHubIssuePage, GitHubPrPage,
GitHubCommitPage {
// [...]
}
Next stop: all switches that are no longer exhaustive.
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage issue -> // ...
case GitHubPrPage pr -> // ...
case ExternalPage external -> // ...
case ErrorPage error -> // ...
};
}
"Exhaustive?" 🤔
Unlike an if-else-if-chain,
a pattern switch needs to cover all cases!
Two ways to achieve this:
have a default branch
enumerate all subtypes
We want the compile error on new types!
(⇝ Avoid the default branch.)
Next stop: all switches that are no longer exhaustive.
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage issue -> // ...
case GitHubPrPage pr -> // ...
case ExternalPage external -> // ...
case ErrorPage error -> // ...
// missing case: GitHubCommitPage
};
}
⇝ Handle the new subtype!
private static String createPrettyString(Page page) {
return switch (page) {
case GitHubIssuePage issue -> // ...
case GitHubPrPage pr -> // ...
case GitHubCommitPage commit -> // ...
case ExternalPage external -> // ...
case ErrorPage error -> // ...
};
}
To keep operations maintainable:
switch over sealed types
enumerate all possible types
(even if you need to ignore some)
avoid default branch
⇝ Compile error when new type is added.
Sometimes you have "defaulty" behavior:
public static String createPageEmoji(Page page) {
return switch (page) {
case GitHubIssuePage issue -> "🐈";
case GitHubPrPage pr -> "🐙";
default -> "n.a.";
};
}
But we need to avoid default!
Write explicit branches:
public static String createPageEmoji(Page page) {
return switch (page) {
case GitHubIssuePage issue -> "🐈";
case GitHubPrPage pr -> "🐙";
// duplication 😢
case ErrorPage err -> "n.a.";
case ExternalPage ext -> "n.a.";
};
}
Use _ to combine "default branches":
public static String createPageEmoji(Page page) {
return switch (page) {
case GitHubIssuePage issue -> "🐈";
case GitHubPrPage pr -> "🐙";
case ErrorPage _, ExternalPage _ -> "n.a.";
};
}
⇝ Default behavior without default branch.
operations separate from data
adding new operations is easy
adding new data types is more work,
but supported by the compiler
⇝ Like the visitor pattern, but less painful.
Dynamic dispatch selects the invoked method by type.
As language feature:
via inheritance
makes method part of API
What if methods shouldn’t be part of the API?
Without methods becoming part of the API.
Via visitor pattern:
makes "visitation" part of API
cumbersome and indirect
Without methods becoming part of the API.
Via pattern matching (new):
makes "sealed" part of type
straightforward
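To make the comparison concrete, here is the same operation implemented twice: once via the visitor pattern and once via a pattern switch. The two-type hierarchy and all names (`VisitorComparison`, `PageVisitor`, `IssuePage`, `PrPage`) are simplified assumptions. Both variants share one hierarchy only for brevity; the switch variant does not need `accept` at all.

```java
public final class VisitorComparison {

    // visitor pattern: one `visit` method per data type
    interface PageVisitor<R> {
        R visitIssue(IssuePage issue);
        R visitPr(PrPage pr);
    }

    sealed interface Page permits IssuePage, PrPage {
        // "visitation" becomes part of the data type's API
        <R> R accept(PageVisitor<R> visitor);
    }

    record IssuePage(int issueNumber) implements Page {
        public <R> R accept(PageVisitor<R> visitor) {
            return visitor.visitIssue(this);
        }
    }

    record PrPage(int prNumber) implements Page {
        public <R> R accept(PageVisitor<R> visitor) {
            return visitor.visitPr(this);
        }
    }

    // the operation as a visitor: dispatch happens indirectly via `accept`
    static String prettyViaVisitor(Page page) {
        return page.accept(new PageVisitor<String>() {
            @Override
            public String visitIssue(IssuePage issue) {
                return "ISSUE #" + issue.issueNumber();
            }

            @Override
            public String visitPr(PrPage pr) {
                return "PR #" + pr.prNumber();
            }
        });
    }

    // the operation as a pattern switch: only needs the type to be sealed
    static String prettyViaSwitch(Page page) {
        return switch (page) {
            case IssuePage issue -> "ISSUE #" + issue.issueNumber();
            case PrPage pr -> "PR #" + pr.prNumber();
        };
    }

    public static void main(String[] args) {
        Page page = new PrPage(42);
        System.out.println(prettyViaVisitor(page)); // prints PR #42
        System.out.println(prettyViaSwitch(page)); // prints PR #42
    }
}
```

In both variants, adding a third page type causes compile errors until every operation handles it.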
Design patterns make up for gaps in the language.
Good example is the strategy pattern:
used to be "a thing" in Java
you use it every time you pass a lambda
But do you still think of it as a design pattern?
(I don’t.)
Pattern matching does the same for the visitor pattern.
ad-hoc data structures
complex return types
complex domains
Often local, throw-away types used in one class or package.
record PageWithLinks(Page page, Set<URI> links) {
PageWithLinks {
requireNonNull(page);
requireNonNull(links);
links = new HashSet<>(links);
}
public PageWithLinks(Page page) {
this(page, Set.of());
}
}
Return values that are deconstructed immediately:
// type declaration
sealed interface Match<T> { }
record None<T>() implements Match<T> { }
record Exact<T>(T entity) implements Match<T> { }
record Fuzzies<T>(Collection<Fuzzy<T>> entities)
implements Match<T> { }
record Fuzzy<T>(T entity, int distance) { }
// method declaration
Match<User> findUser(String userName) { ... }
Return values that are deconstructed immediately:
// calling the method
switch (findUser("John Doe")) {
case None<User> none -> // ...
case Exact<User> exact -> // ...
case Fuzzies<User> fuzzies -> // ...
}
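A runnable sketch of the whole round trip. The `User` record, the lookup logic in `findUser` (exact match first, then case-insensitive match with a fixed distance of 1), and the `describe` method are assumptions invented for this example.

```java
import java.util.Collection;
import java.util.List;

public final class MatchDemo {

    sealed interface Match<T> { }
    record None<T>() implements Match<T> { }
    record Exact<T>(T entity) implements Match<T> { }
    record Fuzzies<T>(Collection<Fuzzy<T>> entities) implements Match<T> { }
    record Fuzzy<T>(T entity, int distance) { }

    // hypothetical domain type
    record User(String name) { }

    // hypothetical lookup: exact name match first, then case-insensitive matches
    static Match<User> findUser(String userName) {
        List<User> users = List.of(new User("John Doe"), new User("Jane Doe"));
        for (User user : users)
            if (user.name().equals(userName))
                return new Exact<>(user);
        List<Fuzzy<User>> fuzzies = users.stream()
                .filter(user -> user.name().equalsIgnoreCase(userName))
                .map(user -> new Fuzzy<>(user, 1))
                .toList();
        if (fuzzies.isEmpty())
            return new None<>();
        return new Fuzzies<>(fuzzies);
    }

    // the result is deconstructed right away; no default branch needed
    static String describe(Match<User> match) {
        return switch (match) {
            case None<User> none -> "no match";
            case Exact<User>(User user) -> "exact: " + user.name();
            case Fuzzies<User>(Collection<Fuzzy<User>> entities)
                    -> entities.size() + " fuzzy match(es)";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(findUser("John Doe"))); // prints exact: John Doe
        System.out.println(describe(findUser("john doe"))); // prints 1 fuzzy match(es)
        System.out.println(describe(findUser("Nobody"))); // prints no match
    }
}
```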
Long-living objects that are part
of the program’s domain.
For example Page.
records are product types
sealed types are sum types
This simple combination of mechanisms — aggregation and choice — is deceptively powerful
immutable data structures
methods (functions?) that operate on them
Isn’t this just functional programming?!
Kind of.
Functional programming:
Everything is a function
⇝ Focus on creating and composing functions.
Data-oriented programming:
Model data as data.
⇝ Focus on correctly modeling the data.
OOP is not dead (again):
valuable for complex entities or rich libraries
use whenever encapsulation is needed
still a good default on high level
DOP — consider when:
mainly handling outside data
working with simple or ad-hoc data
data and behavior should be separated
Use Java’s strong typing to model data as data:
use classes to represent data, particularly:
data as data with records
alternatives with sealed classes
use methods (separately) to model behavior, particularly:
exhaustive switch without default
pattern matching to destructure polymorphic data
More on pattern matching: