File Naming#

A logical, well-thought-out file naming convention enables others (and yourself) to locate files easily. A file name should give clear context for the contents and history of the file. Developing a file naming convention at the beginning of a research project has many benefits, some of which are:

  • makes files easy to find and retrieve

  • allows for easy identification of the current or most recent version

  • lets collaborators easily access and share files

  • allows files to be logically sorted and retrieved using key identifiers

  • makes retention and disposal information easily recognisable

Three principles for file names#

When developing a file naming convention, there are three principles that should be considered.

The three principles are that file names should be:#

_images/Three-principles.png
Image: The three principles of file naming

Human readable#

Human readable file names can be easily read and understood by people. The description within the file name should be informative without being too long. Any intended user of the file, whether a collaborator or yourself, should be able to understand what the file contains and how it should be used.

In the example below, the file names on the left use a more descriptive format than their abbreviated counterparts on the right. The added description may save you time and could be an important identifier if the data is reused after a project is complete.

_images/human_readable_not_options_v2.png
Image: How to name files

Machine readable#

Machine readable file names can be easily read and processed by computers. There are a few simple rules that can help you decide on a file name structure that supports computational processing.

  • only use alphanumeric characters.
    Using special characters and spaces in your file name can create problems for machine readability. Some operating systems use special characters to issue specific commands, so adding a special character to a file name could generate an error.

  • special characters are also used in glob patterns.
    This is where a wildcard character such as * is inserted into a search or command to widen the list of results. For example, when searching a comprehensive list of birds, the search term *bird will return any words that end with bird, e.g. blackbird, lyrebird, wattlebird.

  • avoid using spaces between words.
    Like special characters, spaces are often misread by machines and may cause errors when a file is processed. Best practice for separating words or elements within a file name is to use a dash or underscore.

  • some operating systems are case sensitive and the use of lowercase letters in file names is recommended.
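The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `make_machine_readable`, the choice to keep dashes, underscores and dots, and the sample file name are all hypothetical, and characters outside a-z and 0-9 (including accented letters) are simply dropped. The second part shows the glob wildcard from the bird example using the standard library's `fnmatchcase`.

```python
import re
from fnmatch import fnmatchcase


def make_machine_readable(name: str) -> str:
    """Return a lowercase name made of letters, digits, '-', '_' and '.' only."""
    name = name.strip().lower()
    # Replace each run of whitespace with a single underscore.
    name = re.sub(r"\s+", "_", name)
    # Drop every character that is not a letter, digit, dot, underscore or dash.
    name = re.sub(r"[^a-z0-9._-]", "", name)
    # Collapse repeated underscores left behind by removed characters.
    return re.sub(r"_{2,}", "_", name)


print(make_machine_readable("Survey Results & Notes (Final).CSV"))
# survey_results_notes_final.csv

# Glob matching: '*' stands for any run of characters, so '*bird'
# matches every word that ends in 'bird'.
birds = ["blackbird", "lyrebird", "wattlebird", "magpie"]
matches = [b for b in birds if fnmatchcase(b, "*bird")]
print(matches)
# ['blackbird', 'lyrebird', 'wattlebird']
```

Because * has this special meaning in searches and command prompts, a file name that itself contains * can be matched or expanded in ways you did not intend.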

Plays well with default ordering#

There are a few ways to apply default ordering to your file naming convention.

  • Alphabetical ordering: Files are ordered alphabetically, which is often the default used by operating systems. Alphabetical order may not be the most logical for your files, in which case prefixing the file name with a number or date may make more sense.

  • Numerical ordering: Prefixing file names with numbers lets you control their order. Best practice is to left pad each number with leading zeros so that all numbers have the same width (for example 01, 02, 10); otherwise 10 sorts before 2.

_images/left_pad_ordering.png
Image: Left Pad Ordering table
  • Chronological ordering: Arranges files by date. The ISO 8601 standard is recommended when prefixing a file name with a date. Its YYYY-MM-DD format lends itself well to sorting, because alphabetical order then matches chronological order.

_images/iso_8601_2x_small.png
Image: ISO 8601
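
The effect of left padding and ISO 8601 dates on default ordering can be checked with a short Python sketch. The file names and dates below are invented purely for illustration; `sorted` compares the names character by character, which is similar to the alphabetical ordering many file browsers apply by default.

```python
from datetime import date

# Hypothetical file names: the same three files with and without left padding.
sections = [(1, "intro"), (2, "methods"), (10, "results")]
unpadded = [f"{n}_{label}.txt" for n, label in sections]
padded = [f"{n:02d}_{label}.txt" for n, label in sections]

print(sorted(unpadded))
# ['10_results.txt', '1_intro.txt', '2_methods.txt']
print(sorted(padded))
# ['01_intro.txt', '02_methods.txt', '10_results.txt']

# date.isoformat() produces YYYY-MM-DD, so text order equals date order.
visits = [date(2023, 11, 5), date(2024, 1, 23), date(2023, 2, 14)]
dated = [f"{d.isoformat()}_field-notes.txt" for d in visits]

print(sorted(dated))
# ['2023-02-14_field-notes.txt', '2023-11-05_field-notes.txt', '2024-01-23_field-notes.txt']
```

The padding width is a design choice: two digits cover up to 99 files, so a larger collection would need three or more.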

The three principles

The principles are not prescriptive and may not be applicable in all circumstances.

Creating a file naming schema that makes sense for your data#

The file naming principles explained in this resource are not mutually exclusive. This means that it is okay for you to use various elements from each module to develop a file naming structure that works for you and your data.
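
As one possible way of combining the principles, the Python sketch below builds a file name from an ISO 8601 date, a human readable description and a left-padded version number, then splits it back into its elements. The schema (date, description, version, separated by underscores, with dashes inside the description) and the names `FileName`, `build_name` and `parse_name` are assumptions made for this example, not a recommended standard.

```python
from datetime import date
from typing import NamedTuple


class FileName(NamedTuple):
    created: date
    description: str  # words joined with dashes, never underscores
    version: int
    extension: str


def build_name(parts: FileName) -> str:
    """Join the elements with underscores, e.g. 2024-03-05_notes_v02.txt."""
    stem = f"{parts.created.isoformat()}_{parts.description}_v{parts.version:02d}"
    return f"{stem}.{parts.extension}"


def parse_name(name: str) -> FileName:
    """Split a name produced by build_name back into its elements."""
    stem, extension = name.rsplit(".", 1)
    created, description, version = stem.split("_")
    # version looks like 'v02'; drop the leading 'v' before converting.
    return FileName(date.fromisoformat(created), description, int(version[1:]), extension)


original = FileName(date(2024, 3, 5), "interview-transcript", 2, "txt")
name = build_name(original)
print(name)
# 2024-03-05_interview-transcript_v02.txt
print(parse_name(name) == original)
# True
```

Reserving the underscore for separating elements and the dash for separating words is what lets the name be split apart again without ambiguity.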

Below is an example of a project that uses a file naming convention that is clear and logical. The naming convention uses unique identifiers to outline the contents of each file and the data set has been published along with a key making each identifier easily readable.

_images/zenodo-cat-meows.jpg
Zenodo: CatMeows

CatMeows project file naming convention key:#

_images/Key_zenodo-cat-meows.jpg

Listen#


Maine Coon being brushed


Maine Coon waiting for food


European shorthair isolated in an unfamiliar environment

Video: Organising and Naming your Data#

Note

In the video Organising and Naming your Data, Nicola Laurent, a University of Melbourne project archivist, talks about the file naming conventions used for the Find and Connect research project. Below is an example of the described structure. Click on the information points to read about each segment.

Question

Looking at the file naming convention used in the structure above, can you identify any possible issues with the file names?
Expand the dropdown arrows below to read about some potential issues.

Issue one: The length of the file names. Long file names can create issues for operating systems and for the preservation of data. If the file is later converted, the name may be truncated, resulting in the loss of essential identifying information.
Issue two: The use of the special character &. Within a file name, it is best to avoid special characters. Some operating systems use special characters to perform certain tasks and may misinterpret the name or be unable to read the file.
Issue three: Spaces used between words. Like special characters, spaces create issues for some operating systems when reading a file. Spaces can be replaced with a dash or underscore to separate words.
Issue four: Unclear file naming topic/description. The term 'mockup 2' in the PowerPoint file (MACG-2016-1-23 Website mockup 2.pptx) has not been clearly defined. The number 2 could refer to a second website or to the second version of the document.
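
The first three issues can be checked automatically, as in the rough Python sketch below. The function name `find_issues`, the 50-character limit and the second sample file name are assumptions chosen for illustration; an appropriate length limit depends on your systems. Issue four cannot be detected this way, because judging whether a description is ambiguous requires a human reader.

```python
import re

MAX_LENGTH = 50  # hypothetical limit; choose one that suits your systems


def find_issues(name: str, max_length: int = MAX_LENGTH) -> list[str]:
    """Return a list of naming issues found in a single file name."""
    issues = []
    if len(name) > max_length:
        issues.append("too long")
    if " " in name:
        issues.append("contains spaces")
    # Anything other than letters, digits, dot, underscore, dash or space
    # is treated as a special character.
    if re.search(r"[^A-Za-z0-9._\- ]", name):
        issues.append("contains special characters")
    return issues


print(find_issues("MACG-2016-1-23 Website mockup 2.pptx"))
# ['contains spaces']
print(find_issues("Budget & Plan.xlsx"))
# ['contains spaces', 'contains special characters']
print(find_issues("2016-01-23_website-mockup_v02.pptx"))
# []
```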