I think different identification systems might work better for different tasks.
For example, books are classified in libraries with different schemes: you have an author and a title, an aisle and a shelf number to find the physical book, a code following the Dewey Classification, but also an ISBN number.
Each way to "address" a book is actually answering a question: who wrote this book and what is it about? Where do I find it? How can I identify it and order it from the publisher? Which questions do we have about Stunts tracks and replays?
In ZakStunts I need to answer the questions "which of the known tracks does this replay belong to?" and "have I seen this replay before?". For this, I have been using a full-file SHA1 hash, which spreads its values evenly and makes collisions unlikely. Given the small number of existing tracks and replays, you could keep the first 7-8 characters of the hash and still avoid collisions. But these identifiers are opaque and difficult to remember: they are not meant for humans.
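A minimal Python sketch of this idea, for illustration only. The names `file_id` and `find_collisions` are hypothetical and not taken from the ZakStunts code; the only real API used is `hashlib.sha1` from the standard library.

```python
import hashlib


def file_id(data: bytes, length: int = 8) -> str:
    """Return the first `length` hex characters of the SHA-1 digest of `data`."""
    return hashlib.sha1(data).hexdigest()[:length]


def find_collisions(files: dict[str, bytes], length: int = 8) -> dict[str, list[str]]:
    """Group file names by shortened id and keep only ids shared by 2+ names.

    Note: byte-identical files also end up in the same group, which is
    exactly the "have I seen this file before?" check.
    """
    groups: dict[str, list[str]] = {}
    for name, data in files.items():
        groups.setdefault(file_id(data, length), []).append(name)
    return {short: names for short, names in groups.items() if len(names) > 1}
```

Running `find_collisions` over the whole archive with `length=7` or `length=8` would be one way to check whether the shortened ids really stay unique for the files that exist today.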
This is also a strict file comparison, while other "similarity" measures are of course possible. One could consider only the track, minus scenery elements, but sometimes a well-placed building can make a difference. Maybe there's a good "checksum" based on adding the "values" of each tile in the track, beginning with the start line and following all paths. And going full data-science, if you consider the map of a track as an image, you can apply feature recognition techniques to compare tracks. But what question does this answer? Can it map tracks on a space which gives us useful insights, or is it merely a numeric exercise, and looking at the maps would be faster?
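To make the simplest of these "similarity" ideas concrete, here is a rough Python sketch. It assumes a track has already been reduced to a flat sequence of one-byte tile codes of equal length; that layout, the scenery codes, and the function names are all assumptions made for the example, not a description of the real track format.

```python
def strip_scenery(tiles: bytes, scenery_codes: frozenset[int]) -> bytes:
    """Replace every tile whose code is in `scenery_codes` with 0 (empty).

    `scenery_codes` is a hypothetical set: which codes count as scenery
    has to come from the actual track format.
    """
    return bytes(0 if t in scenery_codes else t for t in tiles)


def tile_similarity(a: bytes, b: bytes) -> float:
    """Fraction of positions where the two tile sequences hold the same code."""
    if len(a) != len(b):
        raise ValueError("tile sequences must have the same length")
    if not a:
        return 1.0
    same = sum(x == y for x, y in zip(a, b))
    return same / len(a)
```

Comparing `strip_scenery(a, codes)` with `strip_scenery(b, codes)` gives the "track minus scenery" variant; comparing the raw sequences keeps that well-placed building in the picture.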
If the question instead is about storing files on disc, the ZCTnnnn format is not bad, and could be extended to other competitions. However, I've recently been working on splitting the concepts for "track" and "race", at least in the back-end.
A track is just a track file, paired with a record (title, author, preferred 8-character name). A race has a ZCT number, a track, start and end dates, extra rules, etc. So ZCT079 is a race that ran between the 1st and the 31st of December 2007, on the DEFAULT track, which was created by Distinctive Software in 1990. The split can of course be confusing, so when you download the track it's still called ZCT079.TRK.
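The track/race split could be modelled roughly as follows. This is only a Python sketch under my own assumptions: the class and field names are hypothetical, the real back-end may look nothing like this, and the hash value in the usage note is a placeholder rather than the real hash of any file.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Track:
    """A track file plus its descriptive record."""
    sha1: str        # full-file hash, used as the opaque identifier
    title: str
    author: str
    short_name: str  # preferred 8-character name


@dataclass(frozen=True)
class Race:
    """A competition round that is run on some track."""
    code: str                    # e.g. "ZCT079"
    track: Track
    start: date
    end: date
    rules: tuple[str, ...] = ()  # extra rules, if any

    def download_name(self) -> str:
        """File name offered for download: named after the race, not the track."""
        return f"{self.code}.TRK"
```

For example, `Race("ZCT079", Track("0" * 40, "Default", "Distinctive Software", "DEFAULT"), date(2007, 12, 1), date(2007, 12, 31))` describes the race above, and its `download_name()` is `ZCT079.TRK`. Several races can point at the same `Track` without duplicating the record.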
While for competition tracks we could use a "competition + counter" id, I can't think of a good way to classify non-competition tracks, which also abound. The reason might be that we don't have corridors and shelves to explore, and that's what we need to find out first, because it will inform our classification system.
Maybe it's by "track type" (flat, aerial, ...), or by "number of minutes for a Lambo track", or...
The number of numbering systems is infinite, but each one will suffer from the mismatch between its claim to universality and the subjectivity of whoever defines it. For this reason, I am not too interested in finding the perfect naming pattern, and I'll stick to unreadable hashes. On top of that, we can build facets and tags and any sort of partial classification, each a different view of the track-space, answering different questions.
