
jq, xq and yq - Handy tools for the command line

The classic Unix command line tools like grep, sed and awk do not cope well with structured data formats like JSON, YAML or even XML. Although I did use them with some acrobatics for such formats, a proper solution requires tools that natively understand those formats.

The jq tool for JSON seems to be already well known and indeed is a very handy tool as many software packages use JSON for serialized data. Here is a short example for parsing metadata of the npm package manager:

dzu@krikkit:~$ jq .name ~/node_modules/bash-language-server/package.json
dzu@krikkit:~$ jq .repository ~/node_modules/bash-language-server/package.json
{
  "type": "git",
  "url": "git+"
}
dzu@krikkit:~$ jq .repository.url ~/node_modules/bash-language-server/package.json
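
To try this kind of query without an npm installation at hand, here is a minimal sketch that feeds jq a small hand-written document. The contents of the `pkg` variable are made up for illustration and are not the real bash-language-server metadata. The `-r` option prints strings raw, without the JSON quotes, and `-c` prints compact output on one line.

```shell
# Hypothetical sample data, only loosely shaped like a package.json
pkg='{"name": "example-package", "repository": {"type": "git", "url": "git+https://example.org/example-package.git"}}'

# Top-level key: the string is printed with its JSON quotes
name=$(printf '%s' "$pkg" | jq .name)

# Nested key; -r prints the raw string without quotes
url=$(printf '%s' "$pkg" | jq -r .repository.url)

# Keys of the nested object as one compact JSON array
keys=$(printf '%s' "$pkg" | jq -c '.repository | keys')

echo "$name"
echo "$url"
echo "$keys"
```

The raw form is usually what you want when the value is passed on to other shell commands.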

The GNU/Linux distributions I use daily, e.g. Debian and Ubuntu, include the package so it is easy to install and keep up to date.

While working on the scripts for my GPX collection I wondered if there is an equivalent for the XML format used here. It would be a significant advantage over the improvised grep and sed commands which I know to be not very robust.

It turns out that there are converters from YAML and XML to JSON, which can easily be combined with jq into a command-line YAML and XML processor.

Having installed yq with the Python package installer pip, we get the yq binary for working with YAML files and the xq binary for XML. With the jq manual at hand, it is straightforward to apply xq to the GPX files in my collection. Here are a few examples intended to convey an idea of the possibilities. Extracting the name of the track is easy, as is finding the number of track points in it:

dzu@krikkit:/tmp/gpx$ xq 20191231.gpx
"GC - Laushalde"
dzu@krikkit:/tmp/gpx$ xq '.gpx.trk.trkseg.trkpt | length' 20191231.gpx
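
Since xq converts the XML to JSON and hands the result to jq, such filters can also be tried with plain jq on the converted form. The sketch below uses a hand-written JSON stand-in for a tiny GPX track, with attributes carrying an `@` prefix as xq renders them. The track name, the coordinates and the filter `.gpx.trk.name` are assumptions for illustration; the filter simply follows the GPX element nesting.

```shell
# Hand-written stand-in for the JSON that xq derives from a GPX file (assumed shape)
gpx='{"gpx": {"trk": {"name": "Sample track", "trkseg": {"trkpt": [
  {"@lat": "48.0", "@lon": "10.0", "ele": "380", "time": "2020-01-01T10:00:00.000Z"},
  {"@lat": "48.1", "@lon": "10.1", "ele": "385", "time": "2020-01-01T10:05:00.000Z"},
  {"@lat": "48.2", "@lon": "10.2", "ele": "390", "time": "2020-01-01T10:10:00.000Z"}
]}}}}'

# Name of the track
track_name=$(printf '%s' "$gpx" | jq '.gpx.trk.name')

# Number of track points in the segment
count=$(printf '%s' "$gpx" | jq '.gpx.trk.trkseg.trkpt | length')

echo "$track_name"
echo "$count"
```

With xq installed, the same filters should apply directly to a GPX file in place of the `printf ... | jq` part.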

Array indices make it easy to extract specific track points and, for example, the start and end time of the whole track:

dzu@krikkit:/tmp/gpx$ xq '.gpx.trk.trkseg.trkpt[0]' 20191231.gpx
{
  "@lat": "48.46705179",
  "@lon": "10.03310901",
  "ele": "382",
  "time": "2019-12-31T11:49:09.000Z"
}
dzu@krikkit:/tmp/gpx$ xq '.gpx.trk.trkseg.trkpt[0,-1].time' 20191231.gpx
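
The index `[0,-1]` uses jq's comma operator, so the filter yields one result for index 0 and one for index -1, the last element. Here is a minimal sketch with plain jq on hand-written data; the timestamps and the variable names are invented for illustration.

```shell
# Invented track points in the shape xq produces for trkpt elements
pts='[{"time": "2020-01-01T10:00:00.000Z"}, {"time": "2020-01-01T10:05:00.000Z"}, {"time": "2020-01-01T10:10:00.000Z"}]'

# Two outputs, one per line: time of the first and of the last point
times=$(printf '%s' "$pts" | jq -r '.[0,-1].time')

# The same two values collected into a single compact JSON array
span=$(printf '%s' "$pts" | jq -c '[.[0,-1].time]')

echo "$times"
echo "$span"
```

Wrapping the filter in `[...]` is handy when a later step expects exactly one JSON value per track.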

So jq and its relatives are definitely a worthwhile addition to every command-line toolbox, but for my GPX files I would love to see tools capable of doing real processing of the track points. Although I have looked for such tools, I have not found any so far. So if you know of command-line tools that calculate the usual statistics like distance, average speed, max speed, etc., I would love to see pointers in the comment section.

