xmllint is a command-line XML tool used to validate and pretty-print XML documents. More usefully, it offers an interactive shell mode (--shell) in which you can use XPath expressions to print out elements. For example, //body will print out the body element of an HTML document.
I wrote a small bash function that uses xmllint to evaluate XPath expressions easily:
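xpath()
{
    # Expect exactly two arguments: the XPath expression and the file.
    if [ $# -ne 2 ]; then
        echo "Usage: xpath xpath file"
        return 1
    fi
    # Feed "cat <xpath>" to the xmllint shell, then drop the "/ >" prompt lines.
    xmllint --shell "$2" <<< "cat $1" | sed '/^\/ >/d'
}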
Example:
sharfah@starship:~> xpath "//body" index.html
<body>Hello World!</body>
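If your xmllint build supports the --xpath option, the same lookup can probably be done without going through the shell mode at all. The following is only a sketch under that assumption. The name xpath_direct is hypothetical, chosen here to avoid clashing with the function above.

```shell
# Hypothetical variant of the xpath function above; the name xpath_direct
# is illustrative. Assumes an xmllint build that supports --xpath.
xpath_direct()
{
    # Expect exactly two arguments: the XPath expression and the file.
    if [ $# -ne 2 ]; then
        echo "Usage: xpath_direct xpath file" >&2
        return 1
    fi
    # --xpath evaluates the expression and prints the matching nodes,
    # so there are no shell prompt lines to filter out with sed.
    xmllint --xpath "$1" "$2"
}
```

It is called the same way, e.g. xpath_direct "//body" index.html. For real-world HTML that is not well-formed XML, adding xmllint's --html option may be necessary.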