🔗 java.net.URI: Precise, Immutable Identifiers
Essence:
URI is an immutable parser/builder for Uniform Resource Identifiers. It models the pieces of an identifier (scheme, authority, path, query, fragment), does no I/O, and can resolve, relativize, and normalize paths safely.
1) The anatomy of a URI
scheme ":" [//authority] path ["?" query] ["#" fragment]
authority = [userinfo "@"] host [ ":" port ]
Example:
https://alice:secret@example.com:8443/api/v1/users?q=bob#top
└─┬─┘   └────┬─────┘ └────┬────┘ └┬─┘└─────┬─────┘ └─┬─┘ └┬┘
scheme    userinfo      host     port    path      query fragment
        └─────────authority─────────┘
Opaque vs hierarchical
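- Hierarchical: an absolute URI whose scheme-specific part starts with / (e.g., https://..., file:/tmp/x), or any relative URI → has a path, can resolve/relativize.
- Opaque: an absolute URI whose scheme-specific part does not start with / (e.g., mailto:joe@example.com, urn:isbn:...) → no path semantics.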
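A minimal sketch of how the two kinds behave at runtime. The class and method names (`UriKinds`, `describe`) are made up for illustration; only `URI.create`, `isOpaque`, and `getPath` come from the JDK.

```java
import java.net.URI;

class UriKinds {
    // Reports whether a URI is opaque or hierarchical, plus what getPath() returns.
    static String describe(String text) {
        URI uri = URI.create(text);
        String kind = uri.isOpaque() ? "opaque" : "hierarchical";
        return kind + ", path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("https://example.com/docs/index.html")); // hierarchical, path=/docs/index.html
        System.out.println(describe("docs/index.html"));                     // hierarchical, path=docs/index.html
        System.out.println(describe("mailto:joe@example.com"));              // opaque, path=null
    }
}
```

An opaque URI has no path at all, so `getPath()` returns null rather than an empty string.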
2) Getting/inspecting parts
URI u = URI.create("https://example.com:8443/a/b%20c?x=1&y=2#frag");
u.getScheme(); // "https"
u.getUserInfo(); // null (if present: "alice:secret")
u.getHost(); // "example.com"
u.getPort(); // 8443 (-1 if absent)
u.getPath(); // "/a/b c" (decoded view)
u.getRawPath(); // "/a/b%20c" (encoded view)
u.getQuery(); // "x=1&y=2"
u.getRawQuery(); // "x=1&y=2" (percent-escapes, if any, are kept as written)
u.getFragment(); // "frag"
u.isAbsolute(); // true (has scheme)
u.isOpaque(); // false
u.getAuthority(); // "example.com:8443"
u.getSchemeSpecificPart(); // "//example.com:8443/a/b c?x=1&y=2" (decoded; getRawSchemeSpecificPart() keeps %20)
Use getRaw* when you care about exact percent-encoding; get* returns decoded characters.
3) Building URIs correctly (no broken encoding)
A) Using multi-arg constructor (lets the JDK encode for you)
URI u = new URI(
"https", // scheme
null, // userInfo
"example.com", // host
8443, // port
"/a/b c", // path (unencoded input OK)
"q=a b&lang=en", // query (unencoded input; a literal % would be re-encoded as %25)
"top" // fragment
);
// https://example.com:8443/a/b%20c?q=a%20b&lang=en#top
B) For opaque URIs (e.g., mailto)
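A minimal sketch, assuming the three-argument constructor `URI(scheme, schemeSpecificPart, fragment)` is the intended tool here. The class and helper names are illustrative only.

```java
import java.net.URI;
import java.net.URISyntaxException;

class OpaqueUriExample {
    // Builds a mailto URI; the address becomes the scheme-specific part.
    static URI mailto(String address) {
        try {
            return new URI("mailto", address, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI mail = mailto("joe@example.com");
        System.out.println(mail);                         // mailto:joe@example.com
        System.out.println(mail.isOpaque());              // true
        System.out.println(mail.getSchemeSpecificPart()); // joe@example.com
    }
}
```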
C) If you must start from a string
Avoid manual string concatenation for paths. Let the constructor encode path segments. For complex query building, assemble the query string with careful encoding of values (see §6).
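When the string comes from outside your code, one option is to parse it with the single-string constructor and handle the checked exception explicitly. The sketch below wraps that in a hypothetical `tryParse` helper that returns an empty `Optional` instead of throwing.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

class UriParsing {
    // Parses text that is expected to be already percent-encoded.
    // Returns Optional.empty() when the syntax is invalid.
    static Optional<URI> tryParse(String text) {
        try {
            return Optional.of(new URI(text));
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("https://example.com/a/b%20c").isPresent()); // true
        System.out.println(tryParse("https://example.com/a/b c").isPresent());   // false: raw space is illegal
    }
}
```

The single-string constructor does not encode anything for you, so the input must already be a legal URI.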
4) Resolve / relativize / normalize (path math)
URI base = URI.create("https://example.com/app/");
URI rel = URI.create("../img/logo.png");
base.resolve(rel); // https://example.com/img/logo.png
URI target = URI.create("https://example.com/app/a/b/c");
base.relativize(target); // "a/b/c" (relative from base; a target outside /app/ is returned unchanged)
URI messy = URI.create("https://x/y/./z/../a");
messy.normalize(); // https://x/y/a
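These operations are meaningful only for hierarchical URIs: with an opaque URI, resolve and relativize return the given URI unchanged, and normalize returns the URI itself.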
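The edge cases are easy to misremember, so here is a small sketch that exercises them. The class and its `relativize` wrapper are illustrative; the behavior shown is that of `URI.relativize` and `URI.resolve`.

```java
import java.net.URI;

class PathMathEdgeCases {
    // Thin wrapper so the result is easy to print and compare.
    static String relativize(String base, String target) {
        return URI.create(base).relativize(URI.create(target)).toString();
    }

    public static void main(String[] args) {
        String base = "https://example.com/app/";

        // Target under the base path: a relative URI comes back.
        System.out.println(relativize(base, "https://example.com/app/a/b"));   // a/b

        // Target outside the base path: the target is returned unchanged.
        System.out.println(relativize(base, "https://example.com/other/x"));   // https://example.com/other/x

        // Resolving an opaque URI against a base returns the opaque URI itself.
        URI mail = URI.create("mailto:joe@example.com");
        System.out.println(URI.create(base).resolve(mail));                    // mailto:joe@example.com
    }
}
```

Note that `relativize` never produces `../` segments; it only strips a matching prefix.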
5) Hostnames, IPv6, and IDNs
URI v6 = new URI("http", null, "[2001:db8::1]", 8080, "/api", null, null);
// http://[2001:db8::1]:8080/api
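// Internationalized domain: URI does not apply Punycode itself.
URI idn = URI.create("https://münich.example/straße");
idn.getHost(); // null (a non-ASCII host is kept only as a registry-based authority)
idn.toASCIIString(); // "https://m%C3%BCnich.example/stra%C3%9Fe" (percent-encoded UTF-8, not Punycode)
// For a Punycode host, convert it with java.net.IDN.toASCII(...) before building the URI.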
Rule: IPv6 hosts must be bracketed ([addr]) in URI strings; the multi-argument constructors add the brackets if you omit them.
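For internationalized host names, one approach is to convert the host with `java.net.IDN` first and hand the ASCII form to the multi-argument constructor. This is a sketch under that assumption; `IdnHostExample` and `withAsciiHost` are illustrative names, and the Unicode escapes only keep the source file ASCII-safe.

```java
import java.net.IDN;
import java.net.URI;
import java.net.URISyntaxException;

class IdnHostExample {
    // Converts a Unicode host to its ASCII (Punycode) form, then builds the URI.
    static URI withAsciiHost(String scheme, String unicodeHost, String path) {
        String asciiHost = IDN.toASCII(unicodeHost);
        try {
            return new URI(scheme, null, asciiHost, -1, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "m\u00fcnich.example" is münich.example, "/stra\u00dfe" is /straße.
        URI uri = withAsciiHost("https", "m\u00fcnich.example", "/stra\u00dfe");
        System.out.println(uri.getHost());       // an ASCII host whose first label starts with "xn--"
        System.out.println(uri.toASCIIString()); // non-ASCII path characters become %XX escapes
    }
}
```

With an ASCII host, `getHost()` is non-null again, which matters for HTTP clients that need a server-based authority.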
6) Query parameters (safe construction)
Do not run URLEncoder over a whole query string: it implements HTML form encoding and would escape the & and = separators too. Apply it to individual parameter names and values.
String q = "q=" + URLEncoder.encode("a+b c", StandardCharsets.UTF_8)
+ "&lang=en";
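// Parse as a single string: the multi-arg constructors would re-encode % as %25.
URI u = URI.create("https://example.com/search?" + q); // https://example.com/search?q=a%2Bb+c&lang=en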
If you’re in Spring, prefer UriComponentsBuilder for ergonomic, safe query building.
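Without a framework, a small helper can do the per-value encoding. The sketch below is one way to do it, not a canonical recipe; `QueryBuilding` and `toQuery` are hypothetical names.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class QueryBuilding {
    // Form-encodes each name and value separately, then joins the pairs with '&'.
    static String toQuery(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "a+b c");
        params.put("lang", "en");

        // Single-string parsing keeps the already-encoded query exactly as written.
        URI uri = URI.create("https://example.com/search?" + toQuery(params));
        System.out.println(uri); // https://example.com/search?q=a%2Bb+c&lang=en
    }
}
```

URLEncoder writes a space as `+`, which is form encoding; servers that expect `%20` in queries need a different encoder.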
7) URI vs URL vs URLConnection
| Type | I/O? | Represents | Notes |
|---|---|---|---|
| URI | ❌ | Identifier only | Parsing, composition, math |
| URL | ⚠️ can open | Location (protocol-bound) | Legacy, mixes ID with I/O |
| HttpClient / URLConnection | ✅ | Network access | Build requests; pass URI to them |
Convert when you must:
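A minimal sketch of the round trip using `URI.toURL()` and `URL.toURI()`. The wrapper class and `roundTrip` helper are illustrative; no connection is opened.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UriUrlConversion {
    // URI -> URL -> URI, converting the checked exceptions to an unchecked one.
    static URI roundTrip(URI uri) {
        try {
            URL url = uri.toURL(); // needs an absolute URI and a known protocol handler
            return url.toURI();    // fails if the URL text is not a valid URI
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = URI.create("https://example.com/a/b%20c?x=1");
        System.out.println(roundTrip(uri));             // https://example.com/a/b%20c?x=1
        System.out.println(roundTrip(uri).equals(uri)); // true
    }
}
```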
8) Equality, normalization, case
URI a = URI.create("HTTP://EXAMPLE.com/%7Ealice");
URI b = URI.create("http://example.com/~alice");
a.equals(b); // false: scheme and host compare case-insensitively, but %7E is not treated as ~
a.normalize().equals(b.normalize()); // still false: normalize() only removes "." and ".." segments
a.getPath().equals(b.getPath()); // true: both decode to "/~alice"
Takeaway: URI#equals ignores case only in the scheme, the host, and the hex digits of escapes; it never decodes escapes or normalizes paths. For logical equality, normalize and compare the components you care about.
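One possible component-wise comparison is sketched below. Which components count as "the same resource" is an application decision; `sameResource` is a hypothetical helper that ignores userinfo, fragments, and default ports, and it treats an escaped `%2F` like a literal `/` because it compares decoded paths.

```java
import java.net.URI;
import java.util.Objects;

class UriEquivalence {
    // Compares scheme and host ignoring case, then port, decoded path
    // (after dot-segment removal), and decoded query.
    static boolean sameResource(URI a, URI b) {
        URI x = a.normalize();
        URI y = b.normalize();
        return equalsIgnoreCase(x.getScheme(), y.getScheme())
                && equalsIgnoreCase(x.getHost(), y.getHost())
                && x.getPort() == y.getPort()
                && Objects.equals(x.getPath(), y.getPath())
                && Objects.equals(x.getQuery(), y.getQuery());
    }

    private static boolean equalsIgnoreCase(String s, String t) {
        return s == null ? t == null : s.equalsIgnoreCase(t);
    }

    public static void main(String[] args) {
        URI a = URI.create("HTTP://EXAMPLE.com/%7Ealice");
        URI b = URI.create("http://example.com/~alice");
        System.out.println(a.equals(b));        // false
        System.out.println(sameResource(a, b)); // true
    }
}
```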
9) Common operations cookbook
Join base + segment (without double slashes)
URI base = URI.create("https://api.example.com/users/");
URI next = base.resolve("42"); // https://api.example.com/users/42
Append query param
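A sketch of one way to do this for hierarchical URIs, reassembling the raw (already-encoded) components so nothing is encoded twice. The helper name `append` is hypothetical, and opaque URIs are out of scope here.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class AppendQueryParam {
    // Appends name=value to the query of a hierarchical URI.
    static URI append(URI uri, String name, String value) {
        String pair = URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
        String rawQuery = uri.getRawQuery();
        String query = (rawQuery == null || rawQuery.isEmpty()) ? pair : rawQuery + "&" + pair;

        // Rebuild from raw parts; URI.create parses the result without re-encoding.
        StringBuilder sb = new StringBuilder();
        if (uri.getScheme() != null) sb.append(uri.getScheme()).append(':');
        if (uri.getRawAuthority() != null) sb.append("//").append(uri.getRawAuthority());
        sb.append(uri.getRawPath()).append('?').append(query);
        if (uri.getRawFragment() != null) sb.append('#').append(uri.getRawFragment());
        return URI.create(sb.toString());
    }

    public static void main(String[] args) {
        URI u = URI.create("https://example.com/search?q=1#top");
        System.out.println(append(u, "tag", "a b")); // https://example.com/search?q=1&tag=a+b#top
    }
}
```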
Strip fragment
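A minimal sketch that cuts the string form at the `#` delimiter, which leaves every other component byte-for-byte unchanged. `stripFragment` is an illustrative helper name.

```java
import java.net.URI;

class StripFragment {
    // Returns the URI without its fragment; returns the same instance if there is none.
    static URI stripFragment(URI uri) {
        if (uri.getRawFragment() == null) {
            return uri;
        }
        String text = uri.toString();
        // In a valid URI, '#' can only appear as the fragment delimiter.
        return URI.create(text.substring(0, text.indexOf('#')));
    }

    public static void main(String[] args) {
        URI u = URI.create("https://example.com/a/b%20c?x=1#frag");
        System.out.println(stripFragment(u)); // https://example.com/a/b%20c?x=1
    }
}
```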
Replace path but keep origin
URI replaced = new URI(u.getScheme(), u.getUserInfo(), u.getHost(), u.getPort(),
"/health", null, null);
10) Exceptions and validation
- new URI(...) throws URISyntaxException (checked) for invalid syntax.
- URI.create(...) throws IllegalArgumentException (unchecked).
- Typical bad inputs: unescaped spaces in single-string constructor, malformed IPv6, illegal chars in host.
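The two failure modes above can be observed with a short sketch. `UriErrors` and `check` are illustrative names; `getReason()` and `getIndex()` are the accessors on `URISyntaxException`.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriErrors {
    // Returns "ok" if the text parses, otherwise the parser's reason and error index.
    static String check(String text) {
        try {
            new URI(text);
            return "ok";
        } catch (URISyntaxException e) {
            return e.getReason() + " at index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        System.out.println(check("https://example.com/a%20b")); // ok
        System.out.println(check("https://example.com/a b"));   // input with an unescaped space
        System.out.println(check("http://[2001:db8::1/"));      // malformed IPv6 literal (no closing bracket)

        try {
            URI.create("https://example.com/a b");
        } catch (IllegalArgumentException e) {
            // URI.create wraps the checked exception as the cause.
            System.out.println(e.getCause() instanceof URISyntaxException); // true
        }
    }
}
```

Use the checked form for untrusted input and `URI.create` for constants you control.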
11) Performance & immutability
- URI is immutable and thread-safe. Cache freely.
- Parsing is cheap; the heavy part is your own encoding/decoding and network I/O (which URI doesn’t do).
12) Pitfalls (and fixes)
| Pitfall | Why it hurts | Do instead |
|---|---|---|
| Hand-concatenating "/a" + "/" + b | Double slashes, missed encoding | Use multi-arg constructor + resolve |
| Using URLEncoder on entire URI | Escapes the delimiters too (https%3A%2F%2F...) | Encode values only, or let URI encode path |
| Forgetting brackets for IPv6 | Parser treats : as port | Use [2001:db8::1] |
| Comparing URIs with equals expecting “same resource” | Too strict; doesn’t canonicalize | Normalize and/or compare components |
| Assuming getPath() returns encoded string | It’s decoded view | Use getRawPath() for exact bytes |
13) Minimal patterns to memorize
String → URI (validate)
Safe build (JDK encodes path)
Base + relative
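The three patterns named so far (string to URI, safe build, base plus relative) can be sketched together as follows. The class and helper names are illustrative, and `ex.com` is just a placeholder host.

```java
import java.net.URI;
import java.net.URISyntaxException;

class MinimalPatterns {
    // Safe build: the multi-argument constructor percent-encodes the path.
    static URI safeBuild(String host, String path) {
        try {
            return new URI("https", host, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) throws URISyntaxException {
        // String -> URI (validate): the checked exception signals bad syntax.
        URI parsed = new URI("https://ex.com/a/b%20c");

        URI built = safeBuild("ex.com", "/a/b c");

        // Base + relative: resolve against a base that ends with '/'.
        URI joined = URI.create("https://ex.com/a/").resolve("b/c");

        System.out.println(built);                // https://ex.com/a/b%20c
        System.out.println(parsed.equals(built)); // true
        System.out.println(joined);               // https://ex.com/a/b/c
    }
}
```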
Relativize
URI root = URI.create("https://ex.com/a/");
URI tgt = URI.create("https://ex.com/a/b/c");
root.relativize(tgt); // "b/c"
14) Where it plugs into your stack
- HttpClient (Java 11+): HttpRequest.newBuilder(URI).build()
- Spring (MVC/WebClient): URI everywhere; for building use UriComponentsBuilder.
- JAX-RS: UriBuilder mirrors the same ideas.
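For the JDK case, a minimal sketch of handing a URI to `java.net.http`. It only builds the request and sends nothing; the class name and the `get` helper are illustrative.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class RequestFromUri {
    // Builds (but does not send) a GET request for the given URI.
    static HttpRequest get(URI uri) {
        return HttpRequest.newBuilder(uri).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest request = get(URI.create("https://example.com/api/v1/users?q=bob"));
        System.out.println(request.method() + " " + request.uri()); // GET https://example.com/api/v1/users?q=bob
    }
}
```

Sending it would go through `HttpClient.send(...)`, which is where the actual I/O happens.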