Reading XML with JavaScript

Reads XML documents and emits JavaScript objects with a simple, easy to use structure.

  • Small, fast and simple
  • Runs everywhere (browser, node.js, React Native, ServiceWorkers, WebWorkers...)
  • Event driven and synchronous API
  • Can process input piece-by-piece in a serial fashion
  • Stream mode (low memory usage)
  • Reads CDATA sections
npm install --save xml-reader

Objects emitted by the reader are trees where each node has the following structure:

interface XmlNode {
    name: string; // element name (empty for text nodes)
    type: string; // node type (element or text), see NodeType constants
    value: string; // value of a text node
    parent: XmlNode; // reference to parent node (null with parentNodes option disabled or root node)
    attributes: {[name: string]: string}; // map of attributes name => value
    children: XmlNode[]; // array of children nodes
}
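As a rough illustration of how such a tree can be consumed, the sketch below walks a hand-built tree and concatenates its text nodes. The collectText helper and the sample tree are assumptions made for this example; neither is part of xml-reader.

```javascript
// Hypothetical helper (not part of xml-reader): gathers the text of a node
// and of all its descendants, depth first.
function collectText(node) {
    if (node.type === 'text') {
        return node.value;
    }
    return node.children.map(collectText).join('');
}

// Small factories that build nodes with the XmlNode shape described above.
// Parent references are left as null to keep the sample short.
const textNode = (value) => ({
    name: '', type: 'text', value, parent: null, attributes: {}, children: [],
});
const elementNode = (name, children) => ({
    name, type: 'element', value: '', parent: null, attributes: {}, children,
});

// Hand-written stand-in for a tree emitted by the reader.
const tree = elementNode('message', [
    elementNode('to', [textNode('Alice')]),
    elementNode('from', [textNode('Bob')]),
]);

console.log(collectText(tree)); // AliceBob
```

Because text nodes and element nodes share one shape, a single recursive function is enough; checking node.type is the only branching needed.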

Breaking changes in version 2

Added the tagPrefix option with a default value of 'tag:'. This way we avoid possible name collisions with the done event. To keep the old behavior, set it to an empty string.

Check the xml-query package! It is very useful to read values from the structures returned by xml-reader.

Read document (event driven)

Basic example. Read and parse an XML document.

const XmlReader = require('xml-reader');
const reader = XmlReader.create();
const xml =
`<message>
    <to>Alice</to>
    <from>Bob</from>
    <heading color="blue">Hello</heading>
    <body color="red">This is a demo!</body>
</message>`;
reader.on('done', data => console.log(data));
reader.parse(xml);
/*
Console output:
{ name: 'message',
  type: 'element',
  children: [
    { name: 'to',
      type: 'element',
      children: [{ type: 'text', value: 'Alice' }]},
    { name: 'from',
      type: 'element',
      children: [{ type: 'text', value: 'Bob' }]},
    { name: 'heading',
      type: 'element',
      attributes: { color: 'blue' },
      children: [{ type: 'text', value: 'Hello' }]},
    { name: 'body',
      type: 'element',
      attributes: { color: 'red' },
      children: [{ type: 'text', value: 'This is a demo!' }]}]}
Note: empty values and references to parent nodes removed for brevity!
*/

Read document (synchronous)

This mode is only valid for reading complete documents (root node must be closed).

const XmlReader = require('xml-reader');
const xml = '<message>Hello!</message>';
const result = XmlReader.parseSync(xml/*, options*/);

In stream mode, nodes are removed from the root as they are emitted, so memory usage does not grow with the size of the document.

const XmlReader = require('xml-reader');
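const reader = XmlReader.create({stream: true});
const xml =
`<root>
    <item/>
    <item/>
    <item/>
</root>`;
reader.on('tag:item', (data) => console.log(data));
// {name: 'item', type: 'element', ..., children: []}
// {name: 'item', type: 'element', ..., children: []}
// {name: 'item', type: 'element', ..., children: []}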
reader.on('done', (data) => console.log(data.children.length));
// 0
reader.parse(xml);

You can also listen to all tags:

reader.on('tag', (name, data) => console.log(`received a ${name} tag:`, data));

In this example we call the parser multiple times. This is useful if your XML document is a stream that comes from a TCP socket or WebSocket (for example XMPP streams).

Simply feed the parser with the data as it arrives. As you can see, the result is exactly the same as the previous one.

const XmlReader = require('xml-reader');
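const reader = XmlReader.create({stream: true});
const xml =
`<root>
    <item/>
    <item/>
    <item/>
</root>`;
reader.on('tag:item', (data) => console.log(data));
// {name: 'item', type: 'element', ..., children: []}
// {name: 'item', type: 'element', ..., children: []}
// {name: 'item', type: 'element', ..., children: []}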
reader.on('done', (data) => console.log(data.children.length));
// 0
// Note that we are calling the parse function providing just one char each time
xml.split('').forEach(char => reader.parse(char));

Use the reset() method to reset the reader. This is useful if a stream gets interrupted and you want to start a new one or to use the same reader instance to parse multiple documents (just reset the reader between them).

const doc1 = '<doc>...</doc>';
const doc2 = '<doc>...</doc>';
reader.parse(doc1);
// when the document ends, the reader stops emitting events
reader.reset();
// now you can parse a new document
reader.parse(doc2);

Default options:

{
    stream: false,
    parentNodes: true,
    tagPrefix: 'tag:',
    doneEvent: 'done',
    emitTopLevelOnly: false,
}

parentNodes

If true (default), each node has a parent property that points to its parent node. If false, parent is always null.

stream

Enables or disables stream mode. In stream mode, nodes are removed from the root after being emitted. Default value is false. Ignored in parseSync.

doneEvent

Default value is 'done'. This is the name of the event emitted when the root node is closed and parsing is done. Ignored in parseSync.

tagPrefix

Default value is 'tag:'. The event driven API emits an event each time a tag is read. Use this option to set a prefix for those event names. Ignored in parseSync.

emitTopLevelOnly

Default value is false. When true, tag events are only emitted for top level nodes (direct children of the root). This is useful for streams like XMPP, where each top level child is a stanza.

For example, given the following XML stream:

<stream>
<message from="alice" to="bob">
<body>hello</body>
<date>2016-10-06</date>
</message>
<message from="alice" to="bob">
<body>bye</body>
<date>2016-10-07</date>
</message>

tags emitted with emitTopLevelOnly=false

body
date
message
body
date
message

tags emitted with emitTopLevelOnly=true

message
message
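
The effect of the option can be approximated on a plain tree. The sketch below assumes that a tag event fires when an element closes (which is what the order listed for emitTopLevelOnly=false suggests) and that the still-open root never fires one. The emittedTags helper and the sample tree are hypothetical and only model the behavior; they do not call xml-reader.

```javascript
// Hypothetical helper: lists element names in the order their closing tags
// would be read (children before parents), skipping the unclosed root.
function emittedTags(root, topLevelOnly) {
    const names = [];
    const visit = (node, depth) => {
        if (node.type !== 'element') {
            return;
        }
        node.children.forEach((child) => visit(child, depth + 1));
        // depth 0 is the root, depth 1 is a direct child of the root
        if (depth >= 1 && (!topLevelOnly || depth === 1)) {
            names.push(node.name);
        }
    };
    visit(root, 0);
    return names;
}

// Minimal node factories for the sample stream shown above.
const el = (name, children) => ({name, type: 'element', children});
const txt = (value) => ({name: '', type: 'text', value, children: []});

const stream = el('stream', [
    el('message', [el('body', [txt('hello')]), el('date', [txt('2016-10-06')])]),
    el('message', [el('body', [txt('bye')]), el('date', [txt('2016-10-07')])]),
]);

console.log(emittedTags(stream, false).join(' ')); // body date message body date message
console.log(emittedTags(stream, true).join(' '));  // message message
```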

XML: read and write with Node.js

This post demonstrates reading and writing XML in Node.js using fast-xml-parser. We'll use the Docusaurus XML sitemap as an example.

Docusaurus sitemap

I was prompted to write this post by wanting to edit the sitemap on my Docusaurus blog. I wanted to remove the /page/ and /tag/ routes from the sitemap. They effectively serve as duplicate content and I don’t want them to be indexed by search engines. (A little more is required to remove them from search engines — see the section at the end of the post.) I was able to find the sitemap in the build folder of my Docusaurus site. It’s called sitemap.xml and it’s in the root of the build folder. It looks like this:

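<?xml version="1.0" encoding="UTF-8"?>
<urlset
  xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
  xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://blog.johnnyreilly.com/2012/01/07/standing-on-shoulders-of-giants</loc>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://blog.johnnyreilly.com/2022/09/20/react-usesearchparamsstate</loc>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://blog.johnnyreilly.com/page/10</loc>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://blog.johnnyreilly.com/tags/ajax</loc>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>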

fast-xml-parser

After experimenting with a few different XML parsers I settled on fast-xml-parser . It’s fast, it’s simple and it’s well maintained. It also handles XML namespaces and attributes well. (This appears to be rare in XML parsers.) Let’s scaffold up an example project alongside our Docusaurus site:

mkdir trim-xml
cd trim-xml
npx -p typescript tsc --init
yarn init
yarn add @types/node fast-xml-parser ts-node typescript

Then add a start script to the package.json:

{
  "scripts": {
    "start": "ts-node index.ts"
  }
}

Reading XML

Our Docusaurus sitemap is in the build folder of our Docusaurus site. Let’s read it in and parse it into a JavaScript object:

import { XMLParser, XMLBuilder } from 'fast-xml-parser';
import fs from 'fs';
import path from 'path';

interface Sitemap {
  urlset: {
    url: { loc: string; changefreq: string; priority: number }[];
  };
}

async function trimXML() {
  const sitemapPath = path.resolve(
    '..',
    'blog-website',
    'build',
    'sitemap.xml'
  );

  console.log(`Loading ${sitemapPath}`);
  const sitemapXml = await fs.promises.readFile(sitemapPath, 'utf8');

  const parser = new XMLParser({
    ignoreAttributes: false,
  });
  let sitemap: Sitemap = parser.parse(sitemapXml);

  console.log(sitemap);
}

trimXML();

We're using the XMLParser class to parse the XML into a JavaScript object. We're also setting the ignoreAttributes option to false to ensure that attributes are included in the parsed object. When we run this we get the following output:

Loading /home/john/code/github/blog.johnnyreilly.com/blog-website/build/sitemap.xml
{
  '?xml': { '@_version': '1.0', '@_encoding': 'UTF-8' },
  urlset: {
    url: [
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object], [Object],
      ... 1481 more items
    ],
    '@_xmlns': 'http://www.sitemaps.org/schemas/sitemap/0.9',
    '@_xmlns:news': 'http://www.google.com/schemas/sitemap-news/0.9',
    '@_xmlns:xhtml': 'http://www.w3.org/1999/xhtml',
    '@_xmlns:image': 'http://www.google.com/schemas/sitemap-image/1.1',
    '@_xmlns:video': 'http://www.google.com/schemas/sitemap-video/1.1'
  }
}

As we can see, the fast-xml-parser library has parsed the XML into a JavaScript object. We can see that the urlset element has an array of url elements. Each url element has a loc , changefreq and priority element. We can also see that the urlset element has a number of attributes. This matches the XML we saw earlier and the interface we defined.
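
One caveat about relying on that array, hedged because it depends on how the parser is configured: with default settings, fast-xml-parser only produces an array when a tag repeats, so a sitemap containing a single url entry would yield a plain object under urlset.url. The sketch below normalizes either shape. The asArray helper is hypothetical, and the sample objects are hand-written stand-ins for parser output with made-up example.com URLs.

```javascript
// Hypothetical helper: always returns an array, whether the parser produced
// nothing, a single object, or an array of objects.
function asArray(value) {
    if (value === undefined || value === null) {
        return [];
    }
    return Array.isArray(value) ? value : [value];
}

// Stand-in for a parsed sitemap with one <url> entry (assumed shape).
const single = {
    urlset: {
        url: {loc: 'https://example.com/a', changefreq: 'weekly', priority: 0.5},
    },
};

// Stand-in for a parsed sitemap with several <url> entries.
const many = {
    urlset: {
        url: [
            {loc: 'https://example.com/a', changefreq: 'weekly', priority: 0.5},
            {loc: 'https://example.com/b', changefreq: 'weekly', priority: 0.5},
        ],
    },
};

console.log(asArray(single.urlset.url).length); // 1
console.log(asArray(many.urlset.url).length);   // 2
```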

Filtering and writing XML

Now that we have the XML parsed into a JavaScript object we can filter it just like we would any other JavaScript object. We have all the power of JavaScript at our fingertips!

As I mentioned earlier, I want to remove all the URLs that represent duplicate content. This includes "pagination" URLs. These are URLs that are used to navigate between pages of content. For example, the URL https://blog.johnnyreilly.com/page/10 is a pagination URL. I want to remove these URLs from the sitemap.

I also want to get rid of the "tags" URLs. These are URLs that are used to navigate between posts that have a particular tag. For example, the URL https://blog.johnnyreilly.com/tags/ajax is a tag URL. I want to remove these URLs from the sitemap too.

This is simplicity itself now we're in JavaScript land. We can use the filter method on the url array to remove the URLs we don't want:

const rootUrl = 'https://blog.johnnyreilly.com';

const filteredUrls = sitemap.urlset.url.filter(
  (url) =>
    url.loc !== `${rootUrl}/tags` &&
    !url.loc.startsWith(rootUrl + '/tags/') &&
    !url.loc.startsWith(rootUrl + '/page/')
);

sitemap.urlset.url = filteredUrls;
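
To see the filter in isolation, here is a small self-contained sketch that applies the same three conditions to a hand-written object. The sample data mirrors the four URLs from the sitemap excerpt; the isDuplicateRoute name is an assumption introduced for readability and does not appear in the original script.

```javascript
const rootUrl = 'https://blog.johnnyreilly.com';

// Hypothetical predicate wrapping the three conditions used in the filter.
const isDuplicateRoute = (loc) =>
    loc === `${rootUrl}/tags` ||
    loc.startsWith(`${rootUrl}/tags/`) ||
    loc.startsWith(`${rootUrl}/page/`);

// Hand-written stand-in for the parsed sitemap object.
const sitemap = {
    urlset: {
        url: [
            {loc: `${rootUrl}/2012/01/07/standing-on-shoulders-of-giants`, changefreq: 'weekly', priority: 0.5},
            {loc: `${rootUrl}/2022/09/20/react-usesearchparamsstate`, changefreq: 'weekly', priority: 0.5},
            {loc: `${rootUrl}/page/10`, changefreq: 'weekly', priority: 0.5},
            {loc: `${rootUrl}/tags/ajax`, changefreq: 'weekly', priority: 0.5},
        ],
    },
};

// Keep only the entries that are not pagination or tag routes.
sitemap.urlset.url = sitemap.urlset.url.filter((url) => !isDuplicateRoute(url.loc));

console.log(sitemap.urlset.url.length); // 2
```

Checking for the exact /tags URL separately from the /tags/ prefix avoids accidentally dropping a post whose path merely begins with the word tags.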
