Creating a URL
The easiest way to create a URL object is from a String that represents the human-readable form of the URL address. This is typically the form that another person will use for a URL. In your Java program, you can use a String containing this text to create a URL object:
URL myURL = new URL("http://example.com/");
The URL object created above represents an absolute URL. An absolute URL contains all of the information necessary to reach the resource in question. You can also create URL objects from a relative URL address.
Creating a URL Relative to Another
A relative URL contains only enough information to reach the resource relative to (or in the context of) another URL.
Relative URL specifications are often used within HTML files. For example, suppose you write an HTML file called JoesHomePage.html. Within this page are links to other pages, PicturesOfMe.html and MyKids.html, that are on the same machine and in the same directory as JoesHomePage.html. The links to PicturesOfMe.html and MyKids.html from JoesHomePage.html could be specified just as file names, like this:
<a href="PicturesOfMe.html">Pictures of Me</a>
<a href="MyKids.html">Pictures of My Kids</a>
These URL addresses are relative URLs. That is, the URLs are specified relative to the file in which they are contained: JoesHomePage.html.
In your Java programs, you can create a URL object from a relative URL specification. For example, suppose you know two URLs at the site example.com:
http://example.com/pages/page1.html
http://example.com/pages/page2.html
You can create URL objects for these pages relative to their common base URL: http://example.com/pages/ like this:
URL myURL = new URL("http://example.com/pages/");
URL page1URL = new URL(myURL, "page1.html");
URL page2URL = new URL(myURL, "page2.html");
This code snippet uses the URL constructor that lets you create a URL object from another URL object (the base) and a relative URL specification. The general form of this constructor is:
URL(URL baseURL, String relativeURL)
The first argument is a URL object that specifies the base of the new URL. The second argument is a String that specifies the rest of the resource name relative to the base. If baseURL is null, then this constructor treats relativeURL like an absolute URL specification. Conversely, if relativeURL is an absolute URL specification, then the constructor ignores baseURL.
This constructor is also useful for creating URL objects for named anchors (also called references) within a file. For example, suppose the page1.html file has a named anchor called BOTTOM at the bottom of the file. You can use the relative URL constructor to create a URL object for it like this:
URL page1BottomURL = new URL(page1URL,"#BOTTOM");
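The behavior of the relative constructor can be checked with a short sketch. The class name RelativeUrlSketch and the helper resolve are illustrative names that do not appear in the text above, and the deprecation warning is suppressed because recent JDK releases deprecate the URL constructors.

```java
import java.net.MalformedURLException;
import java.net.URL;

class RelativeUrlSketch {
    // Hypothetical helper: resolves a relative specification against a base
    // URL and returns the result in string form.
    @SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
    static String resolve(String base, String relative) {
        try {
            URL baseURL = new URL(base);
            return new URL(baseURL, relative).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String page1 = resolve("http://example.com/pages/", "page1.html");
        System.out.println(page1);                      // http://example.com/pages/page1.html
        System.out.println(resolve(page1, "#BOTTOM"));  // http://example.com/pages/page1.html#BOTTOM
        // An absolute specification takes precedence over the base.
        System.out.println(resolve("http://example.com/pages/", "http://example.org/other.html"));
    }
}
```

The last call illustrates the rule stated above: when the second argument is already an absolute URL specification, the base contributes nothing to the result.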
Other URL Constructors
The URL class provides two additional constructors for creating a URL object. These constructors are useful when you are working with URLs, such as HTTP URLs, that have host name, filename, port number, and reference components in the resource name portion of the URL. These two constructors are useful when you do not have a String containing the complete URL specification, but you do know various components of the URL.
For example, suppose you design a network browsing panel similar to a file browsing panel that allows users to choose the protocol, host name, port number, and filename. You can construct a URL from the panel’s components. The first constructor creates a URL object from a protocol, host name, and filename. The following code snippet creates a URL to the page1.html file at the example.com site:
new URL("http", "example.com", "/pages/page1.html");
This is equivalent to
new URL("http://example.com/pages/page1.html");
The first argument is the protocol, the second is the host name, and the last is the pathname of the file. Note that the filename contains a forward slash at the beginning. This indicates that the filename is specified from the root of the host.
The final URL constructor adds the port number to the list of arguments used in the previous constructor:
URL gamelan = new URL("http", "example.com", 80, "/pages/page1.html");
This creates a URL object for the following URL:
http://example.com:80/pages/page1.html
If you construct a URL object using one of these constructors, you can get a String containing the complete URL address by using the URL object’s toString method or the equivalent toExternalForm method.
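As a rough illustration of the component constructors and of toExternalForm, the following sketch builds a URL from its parts. ComponentUrlSketch and fromParts are hypothetical names chosen for this example; the constructor itself is the four-argument form described above.

```java
import java.net.MalformedURLException;
import java.net.URL;

class ComponentUrlSketch {
    // Hypothetical helper: builds a URL from protocol, host, port, and file.
    // A port of -1 means "use the protocol's default port".
    @SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
    static URL fromParts(String protocol, String host, int port, String file) {
        try {
            return new URL(protocol, host, port, file);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL withPort = fromParts("http", "example.com", 80, "/pages/page1.html");
        System.out.println(withPort.toExternalForm()); // http://example.com:80/pages/page1.html

        URL defaultPort = fromParts("http", "example.com", -1, "/pages/page1.html");
        System.out.println(defaultPort.toString());    // http://example.com/pages/page1.html
    }
}
```

Note that the file argument starts with a forward slash; the constructor uses the string as given and does not insert a separator between the port and the path.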
URL Addresses with Special Characters
Some URL addresses contain special characters, such as the space character, like this:
http://example.com/hello world/
To make these characters legal, they must be encoded before they are passed to the URL constructor.
URL url = new URL("http://example.com/hello%20world");
Encoding the special character in this example is easy because only one character needs encoding. For URL addresses that contain several such characters, or when you do not know in advance which URL addresses your code will need to access, you can use the multi-argument constructors of the java.net.URI class, which take care of the encoding automatically.
URI uri = new URI("http", "example.com", "/hello world/", "");
Then convert the URI to a URL by calling its toURL method:
URL url = uri.toURL();
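The encoding step can be sketched end to end as follows. EncodedUrlSketch and encode are illustrative names, and the sketch passes null as the fragment (meaning "no fragment") rather than an empty string, which is an assumption made here to keep the resulting address free of a trailing "#".

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class EncodedUrlSketch {
    // Hypothetical helper: lets the multi-argument URI constructor quote
    // illegal characters in the path, then converts the URI to a URL.
    static URL encode(String scheme, String host, String path) {
        try {
            URI uri = new URI(scheme, host, path, null); // null fragment: none
            return uri.toURL();
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL url = encode("http", "example.com", "/hello world/");
        System.out.println(url); // http://example.com/hello%20world/
    }
}
```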
MalformedURLException
Each of the four URL constructors throws a MalformedURLException if the arguments to the constructor refer to a null or unknown protocol. Typically, you want to catch and handle this exception by embedding your URL constructor statements in a try/catch pair, like this:
try {
    URL myURL = new URL(...);
}
catch (MalformedURLException e) {
    // exception handler code here
    // ...
}
See Exceptions for information about handling exceptions.
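A minimal, runnable variant of this try/catch pattern might look like the sketch below. MalformedUrlSketch, tryParse, and the protocol name nosuchprotocol are hypothetical; the sketch assumes no stream handler is registered for that protocol name.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Optional;

class MalformedUrlSketch {
    // Hypothetical helper: returns the parsed URL, or an empty Optional
    // when the constructor rejects the specification.
    @SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
    static Optional<URL> tryParse(String spec) {
        try {
            return Optional.of(new URL(spec));
        } catch (MalformedURLException e) {
            // Reached for a missing or unknown protocol, among other cases.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("http://example.com/").isPresent());           // true
        System.out.println(tryParse("nosuchprotocol://example.com/").isPresent()); // false (assumes no handler)
        System.out.println(tryParse("example.com/no-protocol").isPresent());       // false
    }
}
```

Returning an Optional is only one way to surface the failure; rethrowing or logging inside the catch block fits the same pattern.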
URLs are "write-once" objects. Once you’ve created a URL object, you cannot change any of its attributes (protocol, host name, filename, or port number).
Class URL
Class URL represents a Uniform Resource Locator, a pointer to a "resource" on the World Wide Web. A resource can be something as simple as a file or a directory, or it can be a reference to a more complicated object, such as a query to a database or to a search engine. More information on the types of URLs and their formats can be found in the "Types of URL" reference.
In general, a URL can be broken into several parts. Consider the following example:
http://www.example.com/docs/resource1.html
The URL above indicates that the protocol to use is http (HyperText Transfer Protocol) and that the information resides on a host machine named www.example.com. The information on that host machine is named /docs/resource1.html. The exact meaning of this name on the host machine is both protocol dependent and host dependent. The information normally resides in a file, but it could be generated on the fly. This component of the URL is called the path component.
A URL can optionally specify a "port", which is the port number to which the TCP connection is made on the remote host machine. If the port is not specified, the default port for the protocol is used instead. For example, the default port for http is 80. An alternative port could be specified as:
http://www.example.com:1080/docs/resource1.html
The syntax of URL is defined by RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax, amended by RFC 2732: Format for Literal IPv6 Addresses in URLs. The Literal IPv6 address format also supports scope_ids. The syntax and usage of scope_ids are described in the documentation of the Inet6Address class.
A URL may have appended to it a "fragment", also known as a "ref" or a "reference". The fragment is indicated by the sharp sign character "#" followed by more characters. For example,
http://www.example.com/index.html#chapter1
This fragment is not technically part of the URL. Rather, it indicates that after the specified resource is retrieved, the application is specifically interested in that part of the document that has the tag chapter1 attached to it. The meaning of a tag is resource specific.
An application can also specify a "relative URL", which contains only enough information to reach the resource relative to another URL. Relative URLs are frequently used within HTML pages. For example, a relative URL that appears in the contents of the URL:
http://www.example.com/index.html
is shorthand for a complete URL that is resolved against that address.
The relative URL need not specify all the components of a URL. If the protocol, host name, or port number is missing, the value is inherited from the fully specified URL. The file component must be specified. The optional fragment is not inherited.
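The parts of a URL and the inheritance rules for relative URLs can be explored with java.net.URI, as in the sketch below. UrlPartsSketch and its resolve helper are illustrative names, and FAQ.html is a hypothetical relative reference used only for this example.

```java
import java.net.URI;

class UrlPartsSketch {
    // Hypothetical helper: resolves a relative reference against a base URI.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://www.example.com:1080/docs/resource1.html#chapter1");
        System.out.println(uri.getScheme());   // http
        System.out.println(uri.getHost());     // www.example.com
        System.out.println(uri.getPort());     // 1080 (-1 when no port is given)
        System.out.println(uri.getPath());     // /docs/resource1.html
        System.out.println(uri.getFragment()); // chapter1

        // Protocol and host are inherited from the base; the path is replaced.
        System.out.println(resolve("http://www.example.com/index.html", "FAQ.html"));
        // A fragment-only reference keeps the base document and adds the fragment.
        System.out.println(resolve("http://www.example.com/index.html", "#chapter1"));
    }
}
```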
Constructing instances of URL
The java.net.URL constructors are deprecated. Developers are encouraged to use java.net.URI to parse or construct a URL. In cases where an instance of java.net.URL is needed to open a connection, URI can be used to construct or parse the URL string, possibly calling URI.parseServerAuthority() to validate that the authority component can be parsed as a server-based authority, and then calling URI.toURL() to create the URL instance.
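One possible shape for this URI-first approach is sketched below. UriToUrlSketch, toUrl, and isUsable are hypothetical names; the calls to parseServerAuthority() and toURL() are the ones named in the paragraph above.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UriToUrlSketch {
    // Hypothetical helper: parses the string as a URI, checks that the
    // authority is server-based, and only then creates the URL.
    static URL toUrl(String spec) {
        try {
            return new URI(spec).parseServerAuthority().toURL();
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
    }

    // Hypothetical convenience wrapper that reports success as a boolean.
    // URI.toURL() itself throws IllegalArgumentException for non-absolute URIs.
    static boolean isUsable(String spec) {
        try {
            toUrl(spec);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(toUrl("http://example.com/pages/page1.html").getHost()); // example.com
        System.out.println(isUsable("http://example.com/hello world/"));            // false: unescaped space
        System.out.println(isUsable("pages/page1.html"));                           // false: not absolute
    }
}
```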
The URL constructors are specified to throw MalformedURLException but the actual parsing/validation that is performed is implementation dependent. Some parsing/validation may be delayed until later, when the underlying stream handler’s implementation is called. Being able to construct an instance of URL doesn’t provide any guarantee about its conformance to the URL syntax specification.
The URL class does not itself encode or decode any URL components according to the escaping mechanism defined in RFC2396. It is the responsibility of the caller to encode any fields that need to be escaped before calling URL, and also to decode any escaped fields that are returned from URL. Furthermore, because URL has no knowledge of URL escaping, it does not recognise equivalence between the encoded and decoded forms of the same URL. For example, the two URLs:
http://foo.com/hello world/ and http://foo.com/hello%20world
would be considered not equal to each other.
Note, the URI class does perform escaping of its component fields in certain circumstances. The recommended way to manage the encoding and decoding of URLs is to use URI, and to convert between these two classes using toURI() and URI.toURL().
The URLEncoder and URLDecoder classes can also be used, but only for HTML form encoding, which is not the same as the encoding scheme defined in RFC2396.
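The difference between the two encoding schemes can be seen in a small comparison. EncodingComparisonSketch and its two helpers are illustrative names; the sketch assumes Java 10 or later for the URLEncoder.encode overload that takes a Charset.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class EncodingComparisonSketch {
    // HTML form encoding: a space becomes '+'.
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // URI path escaping: a space becomes "%20".
    // The host foo.com is only a placeholder needed to build the URI.
    static String pathEncode(String path) {
        try {
            return new URI("http", "foo.com", path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("hello world"));   // hello+world
        System.out.println(pathEncode("/hello world/")); // /hello%20world/
    }
}
```

URI also decodes on the way out: getPath() on the same URI returns the path with the space restored, while getRawPath() keeps the escaped form.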
API Note: Applications working with file paths and file URIs should take great care to use the appropriate methods to convert between the two. The Path.of(URI) factory method and the File(URI) constructor can be used to create Path or File objects from a file URI. Path.toUri() and File.toURI() can be used to create a URI from a file path, which can be converted to URL using URI.toURL(). Applications should never try to construct or parse a URL from the direct string representation of a File or Path instance.
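The conversions named in this note might be combined as in the following sketch. FileUriSketch, toUrl, and roundTrip are hypothetical names, and "my file.txt" is an arbitrary file name chosen to show that the space is escaped; the file does not need to exist.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Path;

class FileUriSketch {
    // Hypothetical helper: Path -> URI -> URL, never via the path's string form.
    static URL toUrl(Path path) {
        try {
            return path.toAbsolutePath().toUri().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: converts a path to a file URI and back again.
    static Path roundTrip(Path path) {
        return Path.of(path.toAbsolutePath().toUri());
    }

    public static void main(String[] args) {
        Path path = Path.of("my file.txt").toAbsolutePath();
        System.out.println(toUrl(path));                  // a file: URL ending in my%20file.txt
        System.out.println(roundTrip(path).equals(path)); // true
    }
}
```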
Before constructing a URL from a URI, and depending on the protocol involved, applications should consider validating whether the URI authority can be parsed as server-based.
Some components of a URL or URI, such as userinfo, may be abused to construct misleading URLs or URIs. Applications that deal with URLs or URIs should take into account the recommendations advised in RFC3986, Section 7, Security Considerations.
All URL constructors may throw MalformedURLException. In particular, if the underlying URLStreamHandler implementation rejects, or is known to reject, any of the parameters, MalformedURLException may be thrown. Typically, a constructor that calls the stream handler’s parseURL method may throw MalformedURLException if the underlying stream handler implementation of that method throws IllegalArgumentException. However, which checks are performed, or not, by the stream handlers is implementation dependent, and callers should not rely on such checks for full URL validation.