Host from url javascript

Содержание

Parsing Hostname and Domain from a Url with Javascript
Parsing the Hostname From a Url
Parsing the Domain From a Url
Checking If a Url is an External Link
Try the Javascript Yourself @ JSFiddle
About the Author
Как разобрать URL в JavaScript?
1. Структура URL
2. Конструктор URL()
3. Строка запроса (query string)
4. Название хоста (hostname)
5. Путь (pathname)
6. Хеш (hash)
7. Проверка (валидация) URL
8. Работа с URL
9. Заключение
URL: host property
Value
Examples
Specifications
Browser compatibility
See also
Found a content problem with this page?

Parsing Hostname and Domain from a Url with Javascript

Parsing urls is a common task in a variety of web applications, including both server-side based code (as in the case of C# ASP .NET or node.js web applications) and client based code, including Javascript. Parsing urls to extract the hostname and domain can present its own set of challenges, as the format of a url may vary greater. Typically, extracting the hostname itself is often easier than determining the actual domain. This is simply due to the format of a hostname, which may include multiple sub-domains preceding the actual domain (for example, sub.site.co.uk).

In this tutorial, we’ll describe two basic Javascript methods for parsing a url on the client and extracting the host name and domain. We’ll also include a bonus Javascript method for determining if a url points to an external domain from the existing web page.

Parsing the Hostname From a Url

Extracting the hostname from a url is generally easier than parsing the domain. The hostname of a url consists of the entire domain plus sub-domain. We can easily parse this with a regular expression, which looks for everything to the left of the double-slash in a url. We remove the “www” (and associated integers e.g. www2), as this is typically not needed when parsing the hostname from a url.

The Javascript for parsing the hostname from a url appears as follows:

The above code will successfully parse the hostnames for the following example urls:

http://WWW.first.com/folder/page.html
first.com

http://mail.google.com/folder/page.html
mail.google.com

https://mail.google.com/folder/page.html
mail.google.com

http://www2.somewhere.com/folder/page.html?q=1
somewhere.com

https://www.another.eu/folder/page.html?q=1
another.eu

Parsing the Domain From a Url

We can extract the domain from a url by leveraging our method for parsing the hostname. Since the above getHostName() method gets us very close to a solution, we just need to remove the sub-domain and clean-up special cases (such as .co.uk). Our Javascript code for parsing the domain from a url appears as follows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
function getDomain(url) {
var hostName = getHostName(url);
var domain = hostName;

if (hostName != null) {
var parts = hostName.split(‘.’).reverse();

if (parts != null && parts.length > 1) {
domain = parts[1] + ‘.’ + parts[0];

if (hostName.toLowerCase().indexOf(‘.co.uk’) != —1 && parts.length > 2) {
domain = parts[2] + ‘.’ + domain;
}
}
}

return domain;
}

In the above code, we take the hostname from the url, split its parts by period, and then reverse the list of parts. We then concatenate the first two parts of the hostname (actually, the last two parts of the hostname, but reversed). We optionally pre-pend any additional pieces, per the TLD rules for the domain, such as in the case of .co.uk.

The above code will successfully parse the domains for the following example urls:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
http://sub.first-STUFF.com/folder/page.html?q=1
first-STUFF.com

http://www.amazon.com/gp/registry/wishlist/3B513E3J694ZL/?tag=123
amazon.com

http://sub.this-domain.co.uk/folder
this-domain.co.uk

http://mail.google.com/folder/page.html
google.com

https://mail.google.com/folder/page.html
google.com

http://www2.somewhere.com/folder/page.html?q=1
somewhere.com

https://www.another.eu/folder/page.htmlq=1
another.eu

https://my.sub.domain.get.com:567/folder/page.html?q=1
get.com

Note, the above code works for a large percentage of urls. However, for perfect accuracy you would require a TLD lookup table. This would allow you to determine the exact types of sub-domains to parse for country-specific domains. In the above method, we’re hard-coding in Javascript the determination for .co.uk. Each country has their own rules and definitions for their TLD. The above code successfully parses the majority, but not 100% of all country-specific urls.

Checking If a Url is an External Link

Our final Javascript method determines if a link is an external link on a page. An external link is a url that points to a 3rd-party outside domain, different from the domain the web browser is currently displaying. A different domain includes a different root domain, different sub-domain, different protocol (http vs https), or different port number. We can use a Javascript regular expression to parse the url and compare the url parts, as described above, to determine if the url is an external link. The Javascript code appears as follows:

function isExternal(url) {
var match = url.match(/^([^:\/?#]+:)?(?:\/\/([^\/?#]*))?([^?#]+)?(\?[^#]*)?(#.*)?/);
if (match != null && typeof match[1] === ‘string’ &&
match[1].length > 0 && match[1].toLowerCase() !== location.protocol)
return true;

if (match != null && typeof match[2] === ‘string’ &&
match[2].length > 0 &&
match[2].replace(new RegExp(‘:(‘+{‘http:’:80,‘https:’:443}[location.protocol]+‘)?$’),»)
!== location.host) {
return true;
}
else {
return false;
}
}

Try the Javascript Yourself @ JSFiddle

You can try the above Javascript to parse the hostname and domain from a url online at JSFiddle.

About the Author

This article was written by Kory Becker, software developer and architect, skilled in a range of technologies, including web application development, machine learning, artificial intelligence, and data science.

Источник

Как разобрать URL в JavaScript?

Представляю Вашему вниманию перевод заметки «How to Parse URL in JavaScript: hostname, pathname, query, hash» автора Dmitri Pavlutin.

Унифицированный указатель ресурса или, сокращенно, URL — это ссылка на веб-ресурс (веб-страницу, изображение, файл). URL определяет местонахождения ресурса и способ его получения — протокол (http, ftp, mailto).

Например, вот URL данной статьи:

https://dmitripavlutin.com/parse-url-javascript

Часто возникает необходимость получить определенные элементы URL. Это может быть название хоста (hostname, dmitripavlutin.com ) или путь (pathname, /parse-url-javascript ).

Удобным способом получить отдельные компоненты URL является конструктор URL() .

В этой статье мы поговорим о структуре и основных компонентах URL.

1. Структура URL

Изображение лучше тысячи слов. На представленном изображении Вы можете видеть основные компоненты URL:

2. Конструктор URL()

Конструктор URL() — это функция, позволяющая разбирать (парсить) компоненты URL:

const url = new URL(relativeOrAbsolute [, absoluteBase])

Аргумент relativeOrAbsolute может быть абсолютным или относительным URL. Если первый аргумент — относительная ссылка, то второй аргумент, absoluteBase , является обязательным и представляет собой абсолютный URL — основу для первого аргумента.

Например, инициализируем URL() с абсолютным URL:

const url = new URL('http://example.com/path/index.html') url.href // 'http://example.com/path/index.html'

Теперь скомбинируем относительный и абсолютный URL:

const url = new URL('/path/index.html', 'http://example.com') url.href // 'http://example.com/path/index.html'

Свойство href экземпляра URL() возвращает переданную URL-строку.

После создания экземпляра URL() , Вы можете получить доступ к компонентам URL. Для справки, вот интерфейс экземпляра URL() :

Здесь тип USVString означает, что JavaScript должен возвращать строку.

3. Строка запроса (query string)

Свойство url.search позволяет получить строку запроса URL, начинающуюся с префикса ? :

const url = new URL( 'http://example.com/path/index.html?message=hello&who=world' ) url.search // '?message=hello&who=world'

Если строка запроса отсутствует, url.search возвращает пустую строку (»):

const url1 = new URL('http://example.com/path/index.html') const url2 = new URL('http://example.com/path/index.html?') url1.search // '' url2.search // ''

3.1. Разбор (парсинг) строки запроса

Вместо получения исходной строки запроса, мы можем получать ее параметры.

Легкий способ это сделать предоставляет свойство url.searchParams . Значением данного свойства является экземпляр интерфейса URLSeachParams.

Объект URLSearchParams предоставляет множество методов для работы с параметрами строки запроса ( get(param), has(param) и т.д.).

Давайте рассмотрим пример:

const url = new Url( 'http://example.com/path/index.html?message=hello&who=world' ) url.searchParams.get('message') // 'hello' url.searchParams.get('missing') // null

url.searchParams.get(‘message’) возвращает значение параметра message строки запроса.

Доступ к несуществующему параметру url.searchParams.get(‘missing’) возвращает null .

4. Название хоста (hostname)

Значением свойства url.hostname является название хоста URL:

const url = new URL('http://example.com/path/index.html') url.hostname // 'example.com'

5. Путь (pathname)

Свойство url.pathname содержит путь URL:

const url = new URL('http://example.com/path/index.html?param=value') url.pathname // '/path/index.html'

Если URL не имеет пути, url.pathname возвращает символ / :

const url = new URL('http://example.com/'); url.pathname; // '/'

6. Хеш (hash)

Наконец, хеш может быть получен через свойство url.hash :

const url = new URL('http://example.com/path/index.html#bottom') url.hash // '#bottom'

Если хеш отсутствует, url.hash возвращает пустую строку (»):

const url = new URL('http://example.com/path/index.html') url.hash // ''

7. Проверка (валидация) URL

При вызове конструктора new URL() не только создается экземпляр, но также осуществляется проверка переданного URL. Если URL не является валидным, выбрасывается TypeError .

Например, http ://example.com не валидный URL, поскольку после http имеется пробел.

Попробуем использовать этот URL:

Поскольку ‘http ://example.com’ неправильный URL, как и ожидалось, new URL(‘http ://example.com’) выбрасывает TypeError .

8. Работа с URL

Такие свойства, как search, hostname, pathname, hash доступны для записи.

Например, давайте изменим название хоста существующего URL с red.com на blue.io :

const url = new URL('http://red.com/path/index.html') url.href // 'http://red.com/path/index.html' url.hostname = 'blue.io' url.href // 'http://blue.io/path/index.html'

Свойства origin, searchParams доступны только для чтения.

9. Заключение

Конструктор URL() является очень удобным способом разбора (парсинга) и проверки (валидации) URL в JavaScript.

new URL(relativeOrAbsolute, [, absoluteBase] в качестве первого параметра принимает абсолютный или относительный URL. Если первый параметр является относительным URL, вторым параметром должен быть абсолютный URL — основа для первого аргумента.

После создания экземпляра URL() , Вы можете получить доступ к основным компонентам URL:

url.search — исходная строка запроса
url.searchParams — экземпляр URLSearchParams для получения параметров строки запроса
url.hostname — название хоста
url.pathname — путь
url.hash — значение хеша