Predictive Hacks

How to get the most recent Filename based on Creation or Modification Date

Suppose your data directory holds many files and you want your script to read the most recent one each time it runs. Below is an example of how to find the most recent file using the os module.

Note that we can work with either the creation date or the modification date. In our example, we will keep track of both. Assume that our data are under the data directory and we want to read the file with the most recent modification date:

import os
import pandas as pd

my_files = os.listdir('data')
my_files
['file1.txt', 'file2.txt', 'file3.txt', 'file4.txt']

Now we will create a list of three-element tuples: the filename, the creation time, and the modification time.

creation_times = []
modification_times = []
for f in my_files:
    path = os.path.join('data', f)  # full path, since os.listdir returns bare names
    creation_times.append(os.path.getctime(path))
    modification_times.append(os.path.getmtime(path))

files_and_times = list(zip(my_files, creation_times, modification_times))
files_and_times

Output:

[('file1.txt', 1621337449.9179041, 1621337450.1148534),
 ('file2.txt', 1621337468.3333888, 1621337468.5313864),
 ('file3.txt', 1621337480.414948, 1621337486.8144865),
 ('file4.txt', 1621338525.3003578, 1621338525.5193233)]
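
The times above are seconds since the epoch, which are hard to read at a glance. As a minimal sketch, they can be converted into datetime objects; the helper name to_utc is our own, and we assume UTC is the timezone you want to display:

```python
from datetime import datetime, timezone

def to_utc(timestamp):
    """Convert seconds since the epoch into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)

# Modification time of file4.txt, copied from the output above
modified = to_utc(1621338525.5193233)
print(modified.strftime('%Y-%m-%d %H:%M:%S'))
```

Passing an explicit timezone keeps the result independent of the machine's local settings.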

Now, we want to get the most recent file name based on the modification time.

most_recent = sorted(files_and_times, key = lambda x:x[2], reverse=True)[0][0]
most_recent

And we get:

'file4.txt'
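
Sorting the whole list just to take its first element works, but it raises an IndexError when the directory is empty, and os.listdir also returns subdirectories. As a minimal sketch, the same lookup can be wrapped in a function built on max(); the name most_recent_file and its by parameter are our own, and we assume that subdirectories should be skipped:

```python
import os

def most_recent_file(directory, by='mtime'):
    """Return the name of the newest regular file in directory, or None if there is none.

    by='mtime' compares os.path.getmtime values; any other value compares os.path.getctime.
    """
    get_time = os.path.getmtime if by == 'mtime' else os.path.getctime
    # Build full paths and keep regular files only (os.listdir also lists subdirectories)
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = [p for p in paths if os.path.isfile(p)]
    if not files:
        return None
    # max() finds the newest file without sorting the whole list
    return os.path.basename(max(files, key=get_time))
```

For the data directory used here, most_recent_file('data') should return the same filename as the sorted version.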

Finally, let’s read our most recent file. Note that we need to specify the full path.

df = pd.read_csv(os.path.join('data',most_recent))
df

Notes

We worked with the os module. For the creation time and modification time we used the following functions:

os.path.getctime(path)
Return the system’s ctime which, on some systems (like Unix) is the time of the last metadata change, and, on others (like Windows), is the creation time for path. The return value is a number giving the number of seconds since the epoch (see the time module). Raise OSError if the file does not exist or is inaccessible.

os.path.getmtime(path)
Return the time of last modification of path. The return value is a floating point number giving the number of seconds since the epoch (see the time module). Raise OSError if the file does not exist or is inaccessible.
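
Because getctime does not return a creation time on Unix, a script that really needs the creation time has to look elsewhere. One option, sketched below, is the st_birthtime attribute of os.stat results, which exists only on some platforms (for example macOS and FreeBSD). The helper name creation_time and the fallback to st_ctime are our own assumptions:

```python
import os

def creation_time(path):
    """Return the best available creation timestamp for path, in seconds since the epoch."""
    stat_result = os.stat(path)
    # st_birthtime is only present on some platforms; elsewhere fall back to st_ctime,
    # which on Linux is the time of the last metadata change, not the creation time
    return getattr(stat_result, 'st_birthtime', stat_result.st_ctime)
```

On platforms without st_birthtime, the modification time is usually the more reliable way to pick the newest file.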
