Karlsruhe
October 24th, 25th
50% discount: AccJUGCH23
I have two free tickets!
this talk covers Java 21
and whatever else we have time for
this is a showcase, not a tutorial
slides at slides.nipafx.dev/java-x
(hit "?" to get navigation help)
Pattern Matching |
Data-Oriented Programming |
Virtual Threads |
Preparing For Virtual Threads |
String Templates |
Sequenced Collections |
On-Ramp |
GenZGC |
Costs of running on old versions:
support contract for Java
waning support in libraries / frameworks
Costs of not running on new versions:
lower productivity
less observability and performance
(more on that later)
less access to talent
bigger upgrade costs
Resistance is futile.
Preparations:
stick to supported APIs
stick to standardized behavior
stick to well-maintained projects
keep dependencies and tools up to date
stay ahead of removals (jdeprscan)
build on many JDK versions
Prepare by building on multiple JDK versions:
your baseline version
every supported version since then
latest version
EA build of next version
It’s not necessary to build …
… each commit on all versions
… the whole project on all versions
Build as much as feasible.
Within OpenJDK, there is no LTS.
→ has no impact on features, reliability, etc.
It’s a vendor-centric concept
to offer continuous fixes
(usually for money).
You’re paying not to get new features.
scrapes GitHub projects
creates Page instances:
GitHubIssuePage
GitHubPrPage
ExternalPage
ErrorPage
further processes pages
Features:
display as interactive graph
compute graph properties
categorize pages by topic
analyze mood of interactions
process payment for analyses
etc.
How to implement features?
methods on Page 🧐
visitor pattern 😫
type checks 😱
public void categorize(Page page) {
if (page instanceof GitHubIssuePage) {
GitHubIssuePage issue = (GitHubIssuePage) page;
categorizeIssuePage(issue);
} else if (page instanceof GitHubPrPage) {
// ... etc. for all types
}
}
Ignore the 😱 and let’s work on this.
[Finalized in Java 16 — JEP 394]
They combine:
type check
variable declaration
cast/assignment
public void categorize(Page page) {
if (page instanceof GitHubIssuePage issue)
categorizeIssuePage(issue);
else if (page instanceof GitHubPrPage pr)
// ... etc. for all types
}
→ Standardizes and eases a common pattern.
Generally, patterns consist of three parts:
a boolean check
variable declaration(s)
extraction(s)/assignment(s)
[Finalized in Java 16 — JEP 395]
record ExternalPage(URI url, String content) { }
Transparent carriers for immutable data.
compiler understands internals
couples API to internals
reduces verbosity a lot
[Finalized in Java 21 — JEP 440]
check whether variable is of correct type
declare one variable per component
assign component values to variables
if (page instanceof
ExternalPage(var url, var content)) {
// use `url` and `content`
}
→ Standardizes and eases a common pattern.
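As a minimal sketch of the three steps above (type check, one variable per component, assignment), assuming Java 21 and reusing the ExternalPage record from the earlier slide. The class name `RecordPatternDemo` and the method `hostOf` are made up for illustration:

```java
import java.net.URI;

// Stand-in for the talk's ExternalPage record.
record ExternalPage(URI url, String content) { }

class RecordPatternDemo {

    // Hypothetical helper: describes an ExternalPage, returns "n.a." for anything else.
    static String hostOf(Object page) {
        // One expression: type check, two variable declarations, two extractions.
        if (page instanceof ExternalPage(var url, var content)) {
            return url.getHost() + " (" + content.length() + " chars)";
        }
        return "n.a.";
    }

    public static void main(String[] args) {
        var page = new ExternalPage(URI.create("https://example.org/post"), "hello");
        System.out.println(hostOf(page));
        System.out.println(hostOf("not a page"));
    }
}
```

The pattern variables `url` and `content` are only in scope where the match succeeded, so there is no way to use them on the "n.a." path.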
[Finalized in Java 21 — JEP 441]
public void categorize(Page page) {
switch (page) {
case GitHubIssuePage issue
-> categorizeIssuePage(issue);
case ExternalPage(var url, var content)
-> categorizeExternalUrl(url);
// ... etc. for all types
}
}
But:
error: the switch statement does not cover
all possible input values
Unlike an if-else-if-chain,
a pattern switch needs to be exhaustive:
public void categorize(Page page) {
switch (page) {
case GitHubIssuePage issue ->
categorizeIssuePage(issue);
// ... etc. for all types
default ->
throw new IllegalArgumentException();
}
}
That touches the 😱 nerve.
[Finalized in Java 17 — JEP 409]
Sealed types limit inheritance,
by only allowing specific subtypes.
public sealed interface Page
permits GitHubIssuePage, GitHubPrPage,
ExternalPage, ErrorPage {
// ...
}
→ class MyPage implements Page doesn’t compile
If all subtypes of a sealed type are covered,
the switch is exhaustive …
public void categorize(Page page) {
switch (page) {
case GitHubIssuePage issue -> // ...
case GitHubPrPage pr -> // ...
case ExternalPage external -> // ...
case ErrorPage error -> // ...
}
}
… and the compiler is happy!
(But still watching.)
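Put together, a compilable version of the sealed hierarchy plus an exhaustive switch could look like the sketch below (assuming Java 21). The record components are assumptions, since the slides do not show them, and `ExhaustiveSwitchDemo` is a hypothetical name:

```java
import java.net.URI;

// Trimmed-down page types; the components are assumptions made for this sketch.
sealed interface Page permits GitHubIssuePage, GitHubPrPage, ExternalPage, ErrorPage { }
record GitHubIssuePage(URI url, int issueNumber) implements Page { }
record GitHubPrPage(URI url, int prNumber) implements Page { }
record ExternalPage(URI url, String content) implements Page { }
record ErrorPage(URI url, Exception error) implements Page { }

class ExhaustiveSwitchDemo {

    // No default branch: the compiler checks that every permitted subtype is covered.
    static String categorize(Page page) {
        return switch (page) {
            case GitHubIssuePage issue -> "issue #" + issue.issueNumber();
            case GitHubPrPage pr -> "pull request #" + pr.prNumber();
            case ExternalPage(var url, var content) -> "external: " + url.getHost();
            case ErrorPage error -> "error: " + error.error().getMessage();
        };
    }

    public static void main(String[] args) {
        var url = URI.create("https://github.com/example/repo");
        System.out.println(categorize(new GitHubIssuePage(url, 42)));
        System.out.println(categorize(new ErrorPage(url, new IllegalStateException("timeout"))));
    }
}
```

Adding a fifth permitted subtype to `Page` without touching `categorize` turns the switch into a compile error, which is exactly the safety net this section is after.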
Why is switching over the type scary?
Because it may not be future proof!
But this one is!
Let’s add GitHubCommitPage implements Page.
→ Follow the compile errors!
First stop: the sealed supertype.
→ Permit the new subtype!
public sealed interface Page
permits GitHubIssuePage, GitHubPrPage,
GitHubCommitPage,
ExternalPage, ErrorPage {
// ...
}
Next stop: all switches that are no longer exhaustive.
public void categorize(Page page) {
switch (page) {
case GitHubIssuePage issue -> // ...
case GitHubPrPage pr -> // ...
case ExternalPage external -> // ...
case ErrorPage error -> // ...
// missing case
}
}
Bingo!
(But only works without default branch.)
Dynamic dispatch selects the invoked method by type.
As language feature:
via inheritance
makes method part of API
What if methods shouldn’t be part of the API?
Dynamic dispatch without methods becoming part of the API:
Via visitor pattern:
makes "visitation" part of API
cumbersome and indirect
Via pattern matching (new):
makes "sealed" part of type
straight-forward
Design patterns make up for gaps in the language.
Good example is the strategy pattern:
used to be "a thing" in Java
you use it every time you pass a lambda
But do you still think of it as a design pattern?
(I don’t.)
Pattern matching does the same for the visitor pattern.
Pattern matching will probably see
further improvements, e.g.:
unnamed patterns
(JEP 443, preview in Java 21)
deconstruction on assignment
(no JEP, but it’s coming)
with expression
(design document from Aug 2020)
[Preview in Java 21 — JEP 443]
Use _ to ignore components:
public static String createPageName(Page page) {
return switch (page) {
case ErrorPage(var url, _)
-> "💥 ERROR: " + url.getHost();
case GitHubIssuePage(_, _, _, int issueNumber)
-> "π ISSUE #" + issueNumber;
// ...
};
}
→ Focus on what’s essential.
Use _ to define default behavior:
public static String createPageEmoji(Page page) {
return switch (page) {
case GitHubIssuePage issue -> "π";
case GitHubPrPage pr -> "π";
case ErrorPage _, ExternalPage _ -> "n.a.";
};
}
→ Default behavior without default branch.
When keeping functionality separate from types:
seal the supertype
switch over sealed types
enumerate all subtypes
avoid default branches!
Use Java’s strong typing to model data as data:
use types to model data, particularly:
data as data with records
alternatives with sealed types
use (static) methods to model behavior, particularly:
exhaustive switch without default
pattern matching to destructure polymorphic data
but it’s similar (data + functions)
first priority is data, not functions
use OOP to modularize large systems
use DOP to model small, data-focused (sub)systems
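To make the ingredients concrete, here is a small sketch (assuming Java 21) that models data with records, alternatives with a sealed interface, and behavior as a static method with an exhaustive switch. The `Shape` domain and all names in it are hypothetical and not from the talk:

```java
// Data as data: records. Alternatives: a sealed interface.
sealed interface Shape permits Circle, Rectangle, Square { }
record Circle(double radius) implements Shape { }
record Rectangle(double width, double height) implements Shape { }
record Square(double side) implements Shape { }

class Shapes {

    // Behavior lives outside the types, as a static method.
    // Record patterns destructure the data; no default branch is needed.
    static double area(Shape shape) {
        return switch (shape) {
            case Circle(var radius) -> Math.PI * radius * radius;
            case Rectangle(var width, var height) -> width * height;
            case Square(var side) -> side * side;
        };
    }

    public static void main(String[] args) {
        System.out.println(area(new Rectangle(2, 3)));
        System.out.println(area(new Circle(1)));
    }
}
```

New operations (say, `perimeter`) can be added as further static methods without touching the types, while a new `Shape` subtype forces every such method to be revisited.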
More on data-oriented programming:
seminal article by Brian Goetz on InfoQ
GitHub crawler on github.com/nipafx/loom-lab
intro in Inside Java Newscast #29
deeper tutorial in JEP Cafe #14
practical example in Inside Java Newscast #33
Imagine a hypothetical HTTP request:
1. interpret request
2. query database (blocks)
3. process data for response
Resource utilization:
good for 1. and 3.
really bad for 2.
How to implement that request?
Align application’s unit of concurrency (request)
with Java’s unit of concurrency (thread):
use thread per request
simple to write, debug, profile
blocks threads on certain calls
limited number of platform threads
→ bad resource utilization
→ low throughput
Only use threads for actual computations:
use non-blocking APIs (futures / reactive streams)
harder to write, challenging to debug/profile
incompatible with synchronous code
shares platform threads
→ great resource utilization
→ high throughput
There’s a conflict between:
simplicity
throughput
There are other conflicts:
design vs performance (→ Valhalla)
explicitness vs succinctness (→ Amber)
flexibility vs safety (→ Panama)
optimization vs specification (→ Leyden)
A virtual thread:
is a regular Thread
low memory footprint ([k]bytes)
small switching cost
scheduled by the Java runtime
requires no OS thread when waiting
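That a virtual thread really is a regular Thread can be seen in a few lines. The following is a minimal sketch assuming Java 21; the class name, method name, and thread name are chosen for illustration only:

```java
import java.util.concurrent.atomic.AtomicBoolean;

class VirtualThreadBasics {

    // Starts one virtual thread and reports whether its body ran on a virtual thread.
    static boolean ranOnVirtualThread() {
        var observed = new AtomicBoolean();
        Thread thread = Thread.ofVirtual()
                .name("demo-virtual-thread") // hypothetical name, only for illustration
                .start(() -> observed.set(Thread.currentThread().isVirtual()));
        try {
            thread.join(); // a virtual thread is joined like any other Thread
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println("task ran on virtual thread: " + ranOnVirtualThread());
        System.out.println("main is virtual: " + Thread.currentThread().isVirtual());
    }
}
```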
Virtual memory:
maps large virtual address space
to limited physical memory
gives illusion of plentiful memory
Virtual threads:
map large number of virtual threads
to a small number of OS threads
give the illusion of plentiful threads
Programs rarely care about virtual vs physical memory.
Programs rarely need to care about virtual vs platform threads.
Instead:
write straightforward (blocking) code
runtime shares available OS threads
reduces the cost of blocking to near zero
try (var executor = Executors
.newVirtualThreadPerTaskExecutor()) {
IntStream
.range(0, 1_000_000)
.forEach(number -> {
executor.submit(() -> {
Thread.sleep(Duration.ofSeconds(1));
return number;
});
});
} // executor.close() is called implicitly, and waits
Virtual threads:
remove "number of threads" as bottleneck
match app’s unit of concurrency to Java’s
→ simplicity && throughput
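A self-contained variant of the executor example, scaled down so that it finishes quickly, might look like this (assuming Java 21). The class `ManyBlockingTasks`, the method name, and the 100 ms sleep are assumptions for the sketch:

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ManyBlockingTasks {

    // Runs `taskCount` tasks that each block for 100 ms and returns how many completed.
    static int runBlockingTasks(int taskCount) {
        var completed = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                // The lambda returns a value, so it is submitted as a Callable
                // and may throw the InterruptedException from Thread.sleep.
                executor.submit(() -> {
                    Thread.sleep(Duration.ofMillis(100));
                    return completed.incrementAndGet();
                });
            }
        } // close() is called implicitly and waits for all submitted tasks
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(10_000);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks completed in " + millis + " ms");
    }
}
```

Each task still sleeps its full 100 ms (same latency), but since a sleeping virtual thread does not occupy an OS thread, the tasks wait concurrently rather than queueing behind a fixed-size pool.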
Virtual threads aren’t "faster threads":
same number of CPU cycles
each task takes the same time (same latency)
So why bother?
| | Parallelism | Concurrency |
|---|---|---|
| Task origin | solution | problem |
| Control | developer | environment |
| Resource use | coordinated | competitive |
| Metric | latency | throughput |
| Abstraction | CPU cores | tasks |
| # of threads | # of cores | # of tasks |
When workload is not CPU-bound:
start waiting as early as possible
for as many tasks as possible
→ Virtual threads increase throughput:
when workload is not CPU-bound
when number of concurrent tasks is high
For servers:
request handling threads are started by web framework
frameworks will offer (easy) configuration options
We’re not there yet.
Replace executors (Spring Boot):
@Bean
public TomcatProtocolHandlerCustomizer<?>
createExecutorForSyncCalls() {
return handler -> handler.setExecutor(
Executors.newVirtualThreadPerTaskExecutor());
}
@Bean
public AsyncTaskExecutor
createExecutorForAsyncCalls() {
return new TaskExecutorAdapter(
Executors.newVirtualThreadPerTaskExecutor());
}
Annotate request handling method (Quarkus):
@GET
@Path("api")
@RunOnVirtualThread
public String handle() {
// ...
}
(Requires --add-opens java.base/java.lang=ALL-UNNAMED.)
Go forth and multiply (your threads)
Virtual threads:
always work correctly
may not scale perfectly
Code changes can improve scalability
(and maintainability, debuggability, observability).
Only pool expensive resources
but virtual threads are cheap.
→ Replace thread pools (for concurrency)
with virtual threads plus, e.g., semaphores.
// limits concurrent queries but pools 👎🏾
private static final ExecutorService DB_POOL =
Executors.newFixedThreadPool(16);
public <T> Future<T> queryDatabase(Callable<T> query) {
return DB_POOL.submit(query);
}
// limits concurrent queries without pool 👍🏾
private static final Semaphore DB_SEMAPHORE =
new Semaphore(16);
public <T> T queryDatabase(Callable<T> query)
throws Exception {
DB_SEMAPHORE.acquire();
try {
return query.call();
} finally {
DB_SEMAPHORE.release();
}
}
Where are the virtual threads? → Later.
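For readers who want to try the semaphore variant right away, here is one way it might be driven from virtual threads (a sketch assuming Java 21). The fake query, the 10 ms sleep, and the names `SemaphoreLimitDemo` and `maxObservedConcurrency` are assumptions made for illustration:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreLimitDemo {

    private static final int MAX_CONCURRENT_QUERIES = 16;
    private static final Semaphore DB_SEMAPHORE = new Semaphore(MAX_CONCURRENT_QUERIES);

    // Same shape as the semaphore-based queryDatabase shown above.
    static <T> T queryDatabase(Callable<T> query) throws Exception {
        DB_SEMAPHORE.acquire();
        try {
            return query.call();
        } finally {
            DB_SEMAPHORE.release();
        }
    }

    // Submits `taskCount` fake queries on virtual threads and returns
    // the highest number of queries that were observed running at once.
    static int maxObservedConcurrency(int taskCount) {
        var running = new AtomicInteger();
        var maxRunning = new AtomicInteger();
        Callable<Integer> task = () -> queryDatabase(() -> {
            int now = running.incrementAndGet();
            maxRunning.accumulateAndGet(now, Math::max);
            Thread.sleep(10); // stands in for the blocking database call
            running.decrementAndGet();
            return now;
        });
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(task);
            }
        } // waits until all tasks have finished
        return maxRunning.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent queries: " + maxObservedConcurrency(1_000));
    }
}
```

There is one virtual thread per task, yet the semaphore keeps the number of simultaneous "queries" at or below 16, so the limit protects the database without pooling threads.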
To understand virtual thread caveats
we need to understand how they work.
(Also, it’s very interesting.)
The Java runtime manages virtual threads:
runs them on a pool of carrier threads
on blocking call:
internally calls non-blocking operation
unmounts from carrier thread!
when call returns:
mounts to (other) carrier thread
continues
A virtual thread stack:
when waiting, is stored on heap (stack chunk objects)
when continuing, is lazily streamed to stack
This keeps switching cheap.
Remember the hypothetical request:
1. interpret request
2. query database (blocks)
3. process data for response
In a virtual thread:
runtime submits task to carrier thread pool
when 2. blocks, virtual thread unmounts
runtime hands carrier thread back to pool
when 2. unblocks, runtime resubmits task
virtual thread mounts and continues with 3.
Virtual threads work correctly with everything:
all blocking operations
synchronized
Thread, currentThread, etc.
thread interruption
thread-locals
native code
But not all scale perfectly.
Some operations pin (operations don’t unmount):
native method call (JNI)
foreign function call (FFM)
synchronized block (for now)
→ No compensation
⚠️ Problematic when:
pinning is frequent
contains blocking operations
If possible:
avoid pinning operations
remove blocking operations
from pinning code sections.
// guarantees sequential access, but pins (for now) 👎🏾
public synchronized String accessResource() {
return access();
}
// guarantees sequential access without pinning 👍🏾
private static final ReentrantLock LOCK =
new ReentrantLock();
public String accessResource() {
// lock guarantees sequential access
LOCK.lock();
try {
return access();
} finally {
LOCK.unlock();
}
}
Thread-locals can hinder scalability:
can be inherited
to keep them thread-local,
values are copied
can occupy lots of memory
(There are also API shortcomings.)
→ Refactor to scoped values (JEP 446).
// copies value for each inheriting thread 👎🏾
static final ThreadLocal<Principal> PRINCIPAL =
new ThreadLocal<>();
public void serve(Request request, Response response) {
var level = request.isAdmin() ? ADMIN : GUEST;
var principal = new Principal(level);
PRINCIPAL.set(principal);
Application.handle(request, response);
}
// immutable, so no copies needed 👍🏾
static final ScopedValue<Principal> PRINCIPAL =
ScopedValue.newInstance();
public void serve(Request request, Response response) {
var level = request.isAdmin() ? ADMIN : GUEST;
var principal = new Principal(level);
ScopedValue
.where(PRINCIPAL, principal)
.run(() -> Application
.handle(request, response));
}
Most importantly:
replace thread pools with semaphores
Also helpful:
remove long-running I/O from pinned sections
replace thread-locals with scoped values
replace synchronized with locks
Composing strings in Java is cumbersome:
String property = "last_name";
String value = "Doe";
// concatenation
String query =
"SELECT * FROM Person p WHERE p."
+ property + " = '" + value + "'";
// formatting
String query =
"SELECT * FROM Person p WHERE p.%s = '%s'"
.formatted(property, value);
Comes with free SQL injection! 😳
Why not?
// (fictional syntax!)
String query =
"SELECT * FROM Person p "
+ "WHERE p.\{property} = '\{value}'";
Also comes with free SQL injection! 😳
SQL injections aren’t the only concern.
These also need validation and sanitization:
HTML/XML
JSON
YAML
…
All follow format-specific rules.
[Preview in Java 21 — JEP 430]
String query = STR."""
SELECT * FROM Person p
WHERE p.\{property} = '\{value}'
""";
Template expression ingredients:
template with embedded expressions ~> StringTemplate
template processor (e.g. STR): transforms StringTemplate into String*
String form = STR."""
Desc Unit Qty Amount
\{desc} $\{price} \{qty} $\{price * qty}
Subtotal $\{price * qty}
Tax $\{price * qty * tax}
Total $\{price * qty * (1.0 + tax)}
""";
Desc Unit Qty Amount
hammer $7.88 3 $23.64
Subtotal $23.64
Tax $3.546
Total $27.186
String form = FMT."""
Desc Unit Qty Amount
%-10s\{desc} $%5.2f\{price} %5d\{qty} $%5.2f\{price * qty}
Subtotal $%5.2f\{price * qty}
Tax $%5.2f\{price * qty * tax}
Total $%5.2f\{price * qty * (1.0 + tax)}
""";
Desc Unit Qty Amount
hammer $ 7.88 3 $23.64
Subtotal $23.64
Tax $ 3.55
Total $27.19
Often, strings are just an exchange format, e.g.:
start with: String + values
validate / sanitize (i.e. parse)
dumb down to: String 🤔
parse to: JSONObject, Statement, …
Why the detour?
STR is a singleton instance of a Processor implementation:
public interface Processor<RESULT, EX extends Throwable> {
RESULT process(StringTemplate s) throws EX;
}
RESULT can be of any type!
// prevents SQL injections
Statement query = SQL."""
SELECT * FROM Person p
WHERE p.\{property} = '\{value}'
""";
// validates & escapes JSON
JSONObject doc = JSON."""
{
"name": "\{name}",
"year": "\{bday.getYear()}"
}
""";
String templates:
simplify string concatenation
enable domain-specific processing
incentivize the "right way"
Collections with order and indexed access:
List
Collections with order without indexed access:
SortedSet (sort order)
Deque (insertion order)
LinkedHashSet (insertion order)
and more
New interfaces capture the concept of order:
SequencedCollection
SequencedSet
SequencedMap
Use as parameter or return type
and enjoy new methods.
Getting the first element:
list.get(0)
sortedSet.first()
deque.getFirst()
linkedHashSet.iterator().next()
Now for all:
sequencedCollection.getFirst()
Getting the last element:
list.get(list.size() - 1)
sortedSet.last()
deque.getLast()
linkedHashSet.🤷🏾()
Now for all:
sequencedCollection.getLast()
list.listIterator() → ListIterator
navigableSet.descendingSet() → NavigableSet (view)
deque.descendingIterator() → Iterator
linkedHashSet.🤷🏾()
Now for all:
sequencedCollection.reversed() → SequencedCollection (view)
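A short sketch of the new methods working uniformly across the collection types mentioned above (assuming Java 21). `SequencedDemo` and `describe` are hypothetical names:

```java
import java.util.ArrayDeque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.SequencedCollection;
import java.util.TreeSet;

class SequencedDemo {

    // Works for any collection with a defined encounter order.
    static <T> String describe(SequencedCollection<T> elements) {
        return "first=" + elements.getFirst()
                + ", last=" + elements.getLast()
                // reversed() is a view; it is copied into a list here only for printing
                + ", reversed=" + List.copyOf(elements.reversed());
    }

    public static void main(String[] args) {
        var letters = List.of("a", "b", "c");
        System.out.println(describe(letters));
        System.out.println(describe(new LinkedHashSet<>(letters)));
        System.out.println(describe(new ArrayDeque<>(letters)));
        System.out.println(describe(new TreeSet<>(letters)));
    }
}
```

Declaring the parameter as `SequencedCollection` is what lets one method accept a List, a LinkedHashSet, a Deque, and a SortedSet alike.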
Java 21 makes life easier
for new (Java) developers.
We all know Java, IDEs, build tools, etc.
do we all?
what about your kids?
what about students?
what about the frontend dev?
what about ML/AI folks?
Java needs to be approachable!
Java needs an on-ramp for new (Java) developers!
To write and run a simple Java program, you need:
a JDK
an editor (IDE?)
javac (build tool? IDE?)
java (IDE?)
some Java code
Minimal Java code:
public class Main {
public static void main(String[] args) {
System.out.println("Hello, World!");
}
}
visibility
classes & methods
static vs instance
returns & parameters
statements & arguments
That’s a lot of tools and concepts!
Java is great for large-scale development:
detailed toolchain
refined programming model
This makes it less approachable.
Let’s change that!
Java 9 added jshell:
all you need:
tools: JDK, jshell
concepts: statements & arguments
but:
not great for beginners (IMO)
no progression
More is needed.
Java 11 added single-file execution (JEP 330):
java Prog.java
removed: javac
but: no progression
Much better for beginners,
but just a section of an on-ramp.
Expand single-file execution in two directions:
simplify code: reduce required Java concepts
ease progression: run multiple files with java
Remove requirement of:
String[] args parameter
main being static
main being public
the class itself
// all the code in Prog.java
void main() {
System.out.println("Hello, World!");
}
[Preview in Java 21 — JEP 445]
Say you have a folder:
MyFirstJava
├─ Prog.java
├─ Helper.java
└─ Lib
   └─ library.jar
Run with:
java -cp 'Lib/*' Prog.java
Natural progression:
start with main()
need arguments? → add String[] args
need to organize code? → add methods
need shared state? → add fields
need more functionality? → explore JDK APIs
even more? → explore simple libraries
need more structure? → split into multiple files
even more? → use visibility & packages
Doesn’t even have to be that order!
Java’s strengths for large-scale development
make it less approachable:
detailed toolchain
refined programming model
There are new features that:
make it easier to start
allow gradual progression
entice the future dev generation
Compared to other GCs, ZGC:
optimizes for ultra-low pause times
can have higher memory footprint or higher CPU usage
In Java 21, ZGC becomes generational.
most objects die young
those who don’t, grow (very) old
GCs can make use of this by tracking
young and old generations.
ZGC didn’t do this, but can do it now with:
-XX:+UseZGC -XX:+ZGenerational
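To check which collector a JVM actually ended up with after passing such flags, one option is to ask the management API. This is only a sketch: the class name `ActiveCollectors` is hypothetical, and the bean names that get printed depend on the JVM and the selected GC, so no specific output is assumed here:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

class ActiveCollectors {

    // Returns the names of the garbage collector beans registered in this JVM.
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .toList();
    }

    public static void main(String[] args) {
        // Run e.g. with: java -XX:+UseZGC -XX:+ZGenerational ActiveCollectors.java
        collectorNames().forEach(System.out::println);
    }
}
```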
A Cassandra 4 benchmark of ZGC vs GenZGC showed:
4x throughput with a fixed heap or
1/4x heap size with stable throughput
(Probably not representative but very promising.)
random generator API diagrams:
Nicolai Parlog
(CC-BY-NC 4.0)