
The xmlstarlet utility for Linux

· 3 min read
Christophe
Markdown, WSL and Docker lover ~ PHP developer ~ Insatiable curious.

xmlstarlet is a powerful command-line utility for Linux that lets you manipulate XML data and can be integrated into shell scripts.

With xmlstarlet you can beautify XML output and also filter it, for instance to show only a given node.

To check whether xmlstarlet is already installed on your system, run which xmlstarlet. If you get xmlstarlet not found as an answer (or no output at all, depending on your shell), install it: sudo apt-get update && sudo apt-get -y install xmlstarlet
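Inside a script, the same check can be done without relying on the exact message that which prints. This is a minimal sketch, assuming a POSIX-like shell; the helper name has_command is made up for this illustration, and the script only prints the install command rather than running it.

```shell
#!/usr/bin/env bash

# Hypothetical helper: returns 0 when the given command is on PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

if has_command xmlstarlet; then
    echo "xmlstarlet is installed: $(command -v xmlstarlet)"
else
    # Only a hint is printed; nothing is installed automatically.
    echo "xmlstarlet not found; install it with:"
    echo "  sudo apt-get update && sudo apt-get -y install xmlstarlet"
fi
```

command -v is a shell builtin, so it behaves the same way whether or not the which binary is present.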

Let's play

To follow along, start a Linux shell and run mkdir -p /tmp/xmlstarlet && cd $_.

Create a new file called data.xml with this content:

<?xml version="1.0" encoding="UTF-8"?><bookstore><book category="cooking"><title lang="en">Everyday Italian</title><author>Giada De Laurentiis</author><year>2005</year><price>30.00</price></book><book category="children"><title lang="en">Harry Potter</title><author>J K. Rowling</author><year>2005</year><price>29.99</price></book><book category="web"><title lang="en">XQuery Kick Start</title><author>James McGovern</author><author>Per Bothner</author><author>Kurt Cagle</author><author>James Linn</author><author>Vaidyanathan Nagarajan</author><year>2003</year><price>49.99</price></book><book category="web"><title lang="en">Learning XML</title><author>Erik T. Ray</author><year>2003</year><price>39.95</price></book></bookstore>

As you can see, our XML is not formatted: everything is on a single line.

We can beautify it using the format action:

โฏ cat "data.xml" | xmlstarlet format --indent-spaces 4
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
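The format action also accepts a file name, so the cat pipe is optional, and the result can be redirected to a new file. The sketch below is only an illustration: it works in a temporary directory with a shortened stand-in for data.xml, and the output name data.pretty.xml is an arbitrary choice.

```shell
#!/usr/bin/env bash

# Work in a throwaway directory so no existing file is overwritten.
workdir="$(mktemp -d)"

# A shortened, single-line stand-in for the data.xml used in this post.
printf '%s' '<?xml version="1.0" encoding="UTF-8"?><bookstore><book category="web"><title lang="en">Learning XML</title></book></bookstore>' > "$workdir/data.xml"

# Pass the file as an argument and save the indented result to a new file.
xmlstarlet format --indent-spaces 4 "$workdir/data.xml" > "$workdir/data.pretty.xml"

# Show the formatted copy; the original file is left untouched.
cat "$workdir/data.pretty.xml"
```

Writing to a separate file rather than redirecting onto data.xml itself avoids truncating the input before xmlstarlet has read it.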

We can also use XPath to select just the output we want:

โฏ cat "data.xml" | xmlstarlet sel -t -v "/bookstore/book/title"
Everyday Italian
Harry Potter
XQuery Kick Start
Learning XML

If you don't know XPath yet: we've used "/bookstore/book/title" because it mirrors the structure of our XML. As you can see below, the root node is called bookstore; it contains one or more book nodes, and each book has a title.

<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title>
        [...]
    </book>
    [...]
</bookstore>
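The same path logic can print several values per book. The sketch below uses the sel template options -m (match each node), -v (print a value), -o (print literal text) and -n (print a newline); the function name list_titles_with_year and the two-book sample file are made up for this illustration.

```shell
#!/usr/bin/env bash

# Small sample file with the same structure as data.xml.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking"><title lang="en">Everyday Italian</title><year>2005</year></book>
<book category="children"><title lang="en">Harry Potter</title><year>2005</year></book>
</bookstore>
EOF

# Hypothetical helper: prints "title (year)" for every book in the given file.
# Inside -m, the paths "title" and "year" are relative to the matched book.
list_titles_with_year() {
    xmlstarlet sel -t -m "/bookstore/book" -v "title" -o " (" -v "year" -o ")" -n "$1"
}

list_titles_with_year "$sample"
```

Because -m sets the current node to each book in turn, the inner expressions stay short and do not need to repeat /bookstore/book.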

We can also filter the result, for example to get only the books for children:

โฏ cat "data.xml" | xmlstarlet sel -t -v "//book[@category='children']/title"
Harry Potter

Here, the XPath expression //book[@category='children']/title means: give me every book node, wherever it is located in the document, but only if it has an attribute named category whose value is children. For each matching book, display its title.

<bookstore>
    <book category="children">
        <title lang="en">Harry Potter</title>
        [...]
    </book>
</bookstore>
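In a shell script, the category can come from a variable. This is a rough sketch under stated assumptions: the function names titles_in_category and count_in_category and the sample file are invented for the example, and the category value is assumed not to contain a single quote, since it is pasted directly into the XPath expression.

```shell
#!/usr/bin/env bash

# Small sample file with the same structure as data.xml.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="children"><title lang="en">Harry Potter</title></book>
<book category="web"><title lang="en">XQuery Kick Start</title></book>
<book category="web"><title lang="en">Learning XML</title></book>
</bookstore>
EOF

# Hypothetical helper: prints one title per line for the given category.
# Usage: titles_in_category FILE CATEGORY
titles_in_category() {
    local file="$1" category="$2"
    xmlstarlet sel -t -m "//book[@category='$category']" -v "title" -n "$file"
}

# Hypothetical helper: prints how many books belong to the given category.
count_in_category() {
    local file="$1" category="$2"
    xmlstarlet sel -t -v "count(//book[@category='$category'])" "$file"
}

titles_in_category "$sample" web
count_in_category "$sample" web
```

The XPath is wrapped in double quotes so the shell expands $category, while the single quotes inside it delimit the XPath string literal.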

Read the official documentation to learn more about xmlstarlet.