we’ll implement a GitHub crawler
we’ll aggressively use, abuse, and overuse
modern Java features
this is a showcase, not a tutorial
⇝ go to youtube.com/@java for more
ask questions at any time
Starting with a seed URL:
1. connect to URL
2. identify kind of page
3. identify interesting section
4. identify outgoing links
5. for each link, start at 1.
Then:
print statistics
print page list
show pages on localhost
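The crawl steps above can be sketched as a depth-limited recursion. This is a minimal sketch, not the talk's implementation: the names `CrawlSketch`, `WEB`, and `crawl` are hypothetical, and an in-memory link map stands in for the network so the example runs offline.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CrawlSketch {

    // hypothetical stand-in for the network: page URL -> outgoing links
    static final Map<URI, List<URI>> WEB = Map.of(
        URI.create("https://example.org/a"), List.of(
            URI.create("https://example.org/b"),
            URI.create("https://example.org/c")),
        URI.create("https://example.org/b"), List.of(
            URI.create("https://example.org/a")),
        URI.create("https://example.org/c"), List.of());

    // visits `url`, then follows its links until `depth` is used up;
    // already visited URLs are skipped, so cycles terminate
    static void crawl(URI url, int depth, Set<URI> visited) {
        if (depth < 0 || !visited.add(url))
            return;
        for (URI link : WEB.getOrDefault(url, List.of()))
            crawl(link, depth - 1, visited);
    }

    static Set<URI> crawl(URI seed, int depth) {
        var visited = new LinkedHashSet<URI>();
        crawl(seed, depth, visited);
        return visited;
    }

    public static void main(String[] args) {
        var pages = crawl(URI.create("https://example.org/a"), 1);
        System.out.println("pages found: " + pages.size());
        pages.forEach(System.out::println);
    }
}
```

The `visited` set is what keeps the "for each link, start at 1" step from looping forever when pages link back to each other.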
Domain model:
create with records and sealed interfaces
operate on with pattern matching
Fetching pages:
HTTP client to fetch from GitHub
virtual threads via structured concurrency
Present results:
format with text blocks and string templates
host with simple file server
(And modules for reliability.)
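For the "present results" part, here is a minimal sketch of formatting statistics with a text block. It uses `String::formatted` rather than string templates, which were a preview feature in JDK 21 and 22 and are not part of later releases. The class, method, and parameter names are hypothetical.

```java
class StatsSketch {

    // the closing delimiter's indentation determines
    // how much leading whitespace is stripped from each line
    static String statistics(int pages, int issues, int prs) {
        return """
            Crawled %d pages:
              * %d issues
              * %d pull requests
            """.formatted(pages, issues, prs);
    }

    public static void main(String[] args) {
        System.out.print(statistics(5, 2, 3));
    }
}
```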
JDK 23 EA with preview features!
Features that aren’t final in JDK 21:
launch multi-file programs (final in 22)
unnamed patterns (final in 22)
StructuredTaskScope (preview in 21-23)
public sealed interface Page
permits ErrorPage, SuccessfulPage {
URI url();
}
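The slides only show the root of the hierarchy. One way the remaining types could fit together is sketched below. Where `ExternalPage` and `GitHubPage` sit, the `content()` and `links()` accessors on the interfaces, and the second component of `ErrorPage` are assumptions inferred from the patterns used later, not confirmed by the talk. The types are nested in a hypothetical holder class to keep the example in one file.

```java
import java.net.URI;
import java.util.Set;

class PageModelSketch {

    sealed interface Page permits ErrorPage, SuccessfulPage {
        URI url();
    }

    // assumption: the second component carries the failure
    record ErrorPage(URI url, Exception error) implements Page { }

    // assumption: every successfully fetched page has content
    sealed interface SuccessfulPage extends Page
            permits ExternalPage, GitHubPage {
        String content();
    }

    record ExternalPage(URI url, String content)
            implements SuccessfulPage { }

    // assumption: only GitHub pages track outgoing links
    sealed interface GitHubPage extends SuccessfulPage
            permits GitHubIssuePage, GitHubPrPage {
        Set<Page> links();
    }

    record GitHubIssuePage(
            URI url, String content, Set<Page> links, int number)
            implements GitHubPage { }

    record GitHubPrPage(
            URI url, String content, Set<Page> links, int number)
            implements GitHubPage { }

    public static void main(String[] args) {
        Page page = new ExternalPage(
            URI.create("https://example.org"), "<html></html>");
        System.out.println(page instanceof SuccessfulPage);
    }
}
```

Because every interface is sealed and every leaf is a record, the compiler knows the full set of page kinds, which is what lets a `switch` over `Page` be exhaustive without a `default` branch.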
public record GitHubPrPage(
URI url, String content, Set<Page> links, int number)
implements GitHubPage {
public GitHubPrPage {
// argument validation
}
public GitHubPrPage(
URI url, String content, int number) {
this(url, content, new HashSet<>(), number);
}
// `equals` and `hashCode` based on `url`
}
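The two comments in the record above could be filled in roughly as follows. This is a sketch under assumptions: the specific validation rules are made up for illustration, and `links` holds `URI`s instead of `Page`s so the example stands alone. One plausible reason for `url`-based equality is that pages can link to each other in cycles, and the generated `equals` and `hashCode` would recurse through `links`.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

record PrPageSketch(URI url, String content, Set<URI> links, int number) {

    // compact canonical constructor: runs before the fields are assigned
    PrPageSketch {
        Objects.requireNonNull(url, "url");
        Objects.requireNonNull(content, "content");
        Objects.requireNonNull(links, "links");
        if (number <= 0)
            throw new IllegalArgumentException(
                "number must be positive, was " + number);
    }

    // convenience constructor: links start empty and get filled in later
    PrPageSketch(URI url, String content, int number) {
        this(url, content, new HashSet<>(), number);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof PrPageSketch page && url.equals(page.url);
    }

    @Override
    public int hashCode() {
        return url.hashCode();
    }
}
```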
public static String pageName(Page page) {
return switch (page) {
case ErrorPage(var url, _)
-> "💥 ERROR: " + url.getHost();
case ExternalPage(var url, _)
-> "💤 EXTERNAL: " + url.getHost();
case GitHubIssuePage(_, _, _, int number)
-> "🐈 ISSUE #" + number;
case GitHubPrPage(_, _, _, int number)
-> "🐙 PR #" + number;
};
}
// creation
var client = HttpClient.newHttpClient();
// use
var request = HttpRequest
.newBuilder(url)
.GET()
.build();
return client
.send(request, BodyHandlers.ofString())
.body();
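Put together, the two snippets above could look like the following sketch. To run without network access, it fetches from a throwaway local server instead of GitHub. That server, the class name `FetchSketch`, and the methods `fetch` and `demo` are assumptions made for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse.BodyHandlers;
import java.nio.charset.StandardCharsets;

class FetchSketch {

    // one client can be reused for all requests
    static final HttpClient CLIENT = HttpClient.newHttpClient();

    static String fetch(URI url) throws IOException, InterruptedException {
        var request = HttpRequest
            .newBuilder(url)
            .GET()
            .build();
        return CLIENT
            .send(request, BodyHandlers.ofString())
            .body();
    }

    // starts a local server on a free port, fetches one page, stops the server
    static String demo() throws IOException, InterruptedException {
        var server = HttpServer.create(
            new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] body = "<html>hello</html>"
                .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (var out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            return fetch(URI.create("http://127.0.0.1:" + port + "/"));
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

`send` blocks the calling thread, which is a good fit when each fetch runs in its own virtual thread.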
try (var scope =
new StructuredTaskScope.ShutdownOnFailure()) {
var futurePages = links.stream()
.map(link -> scope.fork(
() -> createPage(link, depth)))
.toList();
scope.join();
scope.throwIfFailed();
return futurePages.stream()
.map(Subtask::get)
.collect(toSet());
} catch (ExecutionException ex) {
// [...]
}
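`StructuredTaskScope` is a preview API, so the snippet above needs `--enable-preview`. Where that isn't an option, a similar fan-out can be approximated with a virtual-thread-per-task executor, which is final in JDK 21. This sketch swaps in that technique and is not structured concurrency: a failing task does not cancel its siblings. `FanOutSketch` and the simplified `createPage` are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FanOutSketch {

    // hypothetical stand-in for the talk's createPage(link, depth)
    static String createPage(String link) {
        return "page for " + link;
    }

    static Set<String> createPages(List<String> links)
            throws InterruptedException, ExecutionException {
        // each submitted task runs in its own virtual thread;
        // close() at the end of the block waits for all of them
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futurePages = links.stream()
                .map(link -> executor.submit(() -> createPage(link)))
                .toList();
            var pages = new HashSet<String>();
            for (var future : futurePages)
                pages.add(future.get());
            return pages;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(createPages(List.of("a", "b", "c")));
    }
}
```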
SimpleFileServer.createFileServer(
address,
serverDir.toAbsolutePath(),
OutputLevel.INFO)
.start();
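A self-contained version of the call above might look like this sketch. `SimpleFileServer` lives in the `jdk.httpserver` module and is available since JDK 18. The temporary directory, port 8080, and the names `ServeSketch` and `serve` are assumptions for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import com.sun.net.httpserver.SimpleFileServer;
import com.sun.net.httpserver.SimpleFileServer.OutputLevel;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;

class ServeSketch {

    // the root directory must be an absolute path to an existing directory;
    // port 0 lets the system pick a free port
    static HttpServer serve(Path serverDir, int port) {
        var address = new InetSocketAddress("127.0.0.1", port);
        var server = SimpleFileServer.createFileServer(
            address,
            serverDir.toAbsolutePath(),
            OutputLevel.INFO);
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crawler-results");
        Files.writeString(
            dir.resolve("index.html"), "<h1>Crawl results</h1>");
        var server = serve(dir, 8080);
        System.out.println("Serving " + dir + " at http://127.0.0.1:"
            + server.getAddress().getPort());
    }
}
```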
java -p jars $mainClass $args
Let’s watch Jose’s exploration…
great domain modeling with
records, sealed classes, pattern matching
easy, structured (and scalable) concurrency
on-board HTTP client and simple web server
easy experimentation and packaging