A much better Python script to rename all tracknames in a gpx file with the first trackpoint date

Posted on 2011-02-05 by Daniel Belasco Rogers

Of course, just as I predicted, Peter beat me to it. He seems to have learned as much about Python in a few days as I have in the few years it’s taken me to get this far (sigh). Last night, as I was making progress, he sent me a script he’d already made that does exactly the same job as mine.

The job is to name all the tracks in a GPX file with the time of the first trackpoint. See the earlier post for reasons why.

But I finally got my head around parsing an XML (GPX) file with one of Python’s XML modules. The one in question is the etree module in the lxml package (search the Ubuntu repositories for python-lxml).

#!/usr/bin/env python
#-*- coding:utf-8 -*-
#
# a script to change the track name of each track in a gpx file to the
# date time of the first trackpoint.
# 
# This script is an update of the very stupid renameTracks.py which
# did the same thing but with string functions
#
# TODO: make it take two arguments, one input file, one output
#
# Copyright 2011 Daniel Belasco Rogers
# 
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see
# <http://www.gnu.org/licenses/>.

from lxml import etree
import sys, os.path
from optparse import OptionParser

SUFFIX = '_tracknames'

def main():
    usage = "usage: %prog /path/to/gpx/file.gpx"
    parser = OptionParser(usage, version="%prog 0.1")
    (options, args) = parser.parse_args()
    if len(args) != 1:
        parser.error("\nplease define input GPX file")
    filename = args[0]
    
    if not os.path.isfile(filename):
        print "input file does not exist"
        sys.exit(1)
    
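    # splitext copes with paths that contain several dots (or none);
    # the extension it returns already includes the leading dot
    newfile1, newfile2 = os.path.splitext(filename)
    newfilename = '%s%s%s' % (newfile1, SUFFIX, newfile2)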
    print newfilename
    tree = etree.parse(filename)
    root = tree.getroot()
    # get namespace
    xmlns = root.nsmap[None]
    #print '\nxmlns = %s' % xmlns

    # the following variables are to simplify the searching under
    # root.iter() and item.find() below - there must be a better way than
    # this to not refer to the namespace all the time

    trk =       '{%s}trk' % xmlns
    name =      '{%s}name' % xmlns
    trkpttime = '{%s}trkseg/{%s}trkpt/{%s}time' % (xmlns, xmlns, xmlns)

    trknum=0 # number of trk tags in file
    for element in root.iter(trk):
        trknum += 1 
        print '-'*48
        print 'track %d' % trknum
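        old_name = element.find(name)
        new_name = element.find(trkpttime)
        if new_name is None:
            # no trackpoint with a time in this track, so leave it alone
            print 'no trackpoint time found, skipping'
            continue
        if old_name is None:
            # track has no name tag yet: create one as the first child
            old_name = etree.Element(name)
            element.insert(0, old_name)
        print 'old_name: %s' % old_name.text
        print 'new_name: %s' % new_name.text
        old_name.text = new_name.text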
    print '-'*48

    print '\nwriting file %s\n' % newfilename

    # let lxml serialise straight to the file, XML declaration included
    tree.write(newfilename, encoding="utf-8", xml_declaration=True)

    print 'done - script ends\n'

if __name__ == '__main__':
    sys.exit(main())

It took me a while to get my head around how to search forward in the document once I’d found a trk tag: dig down to the first trackpoint time, which lives under trk>trkseg>trkpt>time, and use its text as the trk>name text. I did it in the end with element.find('{NAMESPACE}trkseg/{NAMESPACE}trkpt/{NAMESPACE}time'), which returns the first match in document order.
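
On the "there must be a better way" comment in the script: one possible answer, as far as I can tell, is to hand find() a prefix map so the path can be written with a short prefix instead of the full {namespace} every time. The sketch below is not the script above. It uses the standard library's xml.etree.ElementTree rather than lxml so it runs without installing anything (lxml's find() takes the same namespaces argument), it assumes the file uses the GPX 1.1 namespace, and the 'gpx' prefix, the rename_tracks name and the sample data are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# GPX 1.1 namespace (assumed); 'gpx' is an arbitrary prefix chosen for this sketch
NS = {'gpx': 'http://www.topografix.com/GPX/1/1'}

# Made-up sample data, not from a real GPS log
SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="sketch">
  <trk>
    <name>ACTIVE LOG</name>
    <trkseg>
      <trkpt lat="52.5" lon="13.4"><time>2011-02-04T09:15:00Z</time></trkpt>
      <trkpt lat="52.6" lon="13.5"><time>2011-02-04T09:16:00Z</time></trkpt>
    </trkseg>
  </trk>
  <trk>
    <name>ACTIVE LOG 001</name>
    <trkseg>
      <trkpt lat="48.1" lon="11.5"><time>2011-02-05T18:30:00Z</time></trkpt>
    </trkseg>
  </trk>
</gpx>
"""


def rename_tracks(root):
    """Set each trk/name to its first trackpoint time; return the new names."""
    new_names = []
    for trk in root.findall('gpx:trk', NS):
        name = trk.find('gpx:name', NS)
        # find() returns the first match in document order,
        # i.e. the time of the first trackpoint in the first segment
        first_time = trk.find('gpx:trkseg/gpx:trkpt/gpx:time', NS)
        if name is None or first_time is None:
            continue  # nothing to rename, or nothing to rename it to
        name.text = first_time.text
        new_names.append(name.text)
    return new_names


root = ET.fromstring(SAMPLE)
names = rename_tracks(root)
print(names)
```

With lxml the namespace would not need to be hard-coded: the script already reads it from root.nsmap[None], so the prefix map could be built as {'gpx': root.nsmap[None]}.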
