Predictive Hacks

# How to Read and Write Files in Python

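Most data scientists use Pandas to read files when the data are structured. In this tutorial, we will work with the built-in “open” function, which in its simplest form takes two arguments: the file name and the mode. The mode indicates the required action, such as reading, writing or creating a file, and it also defines the format, text or binary. The most common modes are “r” (read, the default), “w” (write, which truncates an existing file), “x” (create, which fails if the file already exists), “a” (append), “t” (text, the default), “b” (binary) and “+” (open for both reading and writing).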

## File Modes
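
As a quick illustration of how the mode changes what open returns, here is a minimal sketch that writes a small file in text mode and then reads it back in both text and binary mode. The file name modes_demo.txt and the temporary directory are assumptions made only for this example.

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory just for this demo
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

# 'w' opens the file for writing in text mode, so we write a str
with open(path, mode='w', encoding='utf-8') as my_file:
    my_file.write('hello')

# 'r' opens the file for reading in text mode, so we get a str back
with open(path, mode='r', encoding='utf-8') as my_file:
    text_data = my_file.read()

# 'rb' opens the file for reading in binary mode, so we get bytes back
with open(path, mode='rb') as my_file:
    binary_data = my_file.read()

print(type(text_data), type(binary_data))
```

Text mode decodes the bytes on disk into a string, while binary mode hands back the bytes untouched.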

For this tutorial, we have created a simple txt file called myfile.txt with the following content:

This is the first line
This is the second line
This is the third line
This is the forth line
and this is the fith and final line

Let’s see how we can read it.

### Using the open function

We can read the file using the “open” function as follows:

# open the file with the mode r which means "read"
my_file = open('myfile.txt', mode = 'r')

# read the content of the file storing
# it in a variable called data
data = my_file.read()

print(data)

# close the connection
my_file.close()



Output:

This is the first line
This is the second line
This is the third line
This is the forth line
and this is the fith and final line

### Using the with open statement

Alternatively, we can call open inside a with statement. The main difference is that the file is closed automatically when the block ends, even if an error occurs inside it, which makes file handling much safer. Let’s code!

with open('myfile.txt', mode = 'r') as my_file:
    data = my_file.read()
    print(data)


Output:

This is the first line
This is the second line
This is the third line
This is the forth line
and this is the fith and final line
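
To check that the file really is closed once the with block ends, we can inspect the closed attribute of the file object. This is a minimal sketch; the file closed_demo.txt is a hypothetical file created only for this example.

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory just for this demo
path = os.path.join(tempfile.mkdtemp(), 'closed_demo.txt')
with open(path, mode='w') as my_file:
    my_file.write('some text')

with open(path, mode='r') as my_file:
    data = my_file.read()
    # inside the block the file is still open
    closed_inside = my_file.closed

# after the block the file has been closed for us
closed_after = my_file.closed

print(closed_inside, closed_after)
```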

### The Three Methods for Reading Files

The three methods for reading files in Python with the open function are read(), readline() and readlines().

read() returns the entire contents of the file as a single string. You can also pass in an integer to return at most that number of characters. For example, let’s return the first 10 characters.

with open('myfile.txt', mode = 'r') as my_file:

    # read the 10 first characters
    data = my_file.read(10)
    print(data)



Output:

This is th

readline() returns the next line of the file, which is the first line when the file has just been opened. For example:

with open('myfile.txt', mode = 'r') as my_file:

    data = my_file.readline()
    print(data)



Output:

This is the first line

Notice that the readline() method can take an integer argument that limits the number of characters returned from the current line.
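
The following sketch shows how the integer argument behaves, and that the next readline() call continues from where the previous one stopped. The file readline_demo.txt is a hypothetical file created only for this example.

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory just for this demo
path = os.path.join(tempfile.mkdtemp(), 'readline_demo.txt')
with open(path, mode='w') as my_file:
    my_file.write('This is the first line\nThis is the second line\n')

with open(path, mode='r') as my_file:
    # at most 7 characters of the first line
    first_part = my_file.readline(7)
    # the remainder of the first line, including its newline
    rest_of_first = my_file.readline()
    # the whole second line
    second = my_file.readline()

print(repr(first_part), repr(rest_of_first), repr(second))
```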

readlines() returns the entire content as a list of strings, where each element corresponds to a line, including its trailing newline character. For example:

with open('myfile.txt', mode = 'r') as my_file:

    data = my_file.readlines()
    print(data)



Output:

['This is the first line\n', 'This is the second line\n', 'This is the third line\n', 'This is the forth line\n', 'and this is the fith and final line\n']
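
Since every element keeps its trailing newline, we often want to strip it. One possible approach, sketched below, is to iterate over the file object directly, which yields one line at a time. The file lines_demo.txt is a hypothetical file created only for this example.

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory just for this demo
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(path, mode='w') as my_file:
    my_file.writelines(['first line\n', 'second line\n'])

with open(path, mode='r') as my_file:
    # iterate over the file object and drop the trailing newline of each line
    lines = [line.rstrip('\n') for line in my_file]

print(lines)
```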

## Write Files

By changing the mode in the open function, we can create files. Let’s create a new empty file called “newfile.txt”.

# the 'w' mode is for write
with open('newfile.txt', mode='w') as my_file:
    pass



### write() method

We have created an empty file called “newfile.txt”. Let’s see how we can add content to a new file.

# the 'w' mode is for write
with open('newfile.txt', mode='w') as my_file:

    # add text to the new file
    my_file.write('I write the first line')



So “newfile.txt” now contains the single line “I write the first line”. Note that write() does not add a newline at the end.

### writelines() method

We can write multiple lines at once by passing a list of strings to the writelines method. Note that writelines does not add line separators, so each string must end with \n. For example:

# the 'w' mode is for write
with open('newfile.txt', mode='w') as my_file:

    # add text as a list and add the \n for the new lines
    my_file.writelines(['first line\n', 'second line\n', 'third line\n'])



Let’s see the content of the file:

cat newfile.txt

first line
second line
third line

Notice that every time we open a file with the “w” mode, its existing content is erased, so whatever we write replaces what was there before.
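
The overwriting behaviour of the “w” mode can be verified with a short sketch. The file overwrite_demo.txt is a hypothetical file created only for this example.

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory just for this demo
path = os.path.join(tempfile.mkdtemp(), 'overwrite_demo.txt')

# first write
with open(path, mode='w') as my_file:
    my_file.write('old content')

# opening again with 'w' erases the previous content before writing
with open(path, mode='w') as my_file:
    my_file.write('new content')

with open(path, mode='r') as my_file:
    content = my_file.read()

print(content)
```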

### Append New Lines

We can append new lines using the “a” mode, which stands for append. For example, let’s add another three lines to our previous file.

# the 'a' mode is for append
with open('newfile.txt', mode='a') as my_file:

    # append new lines
    my_file.writelines(['forth line\n', 'fifth line\n', 'sixth line\n'])



Let’s see the content of the file.

cat newfile.txt

first line
second line
third line
forth line
fifth line
sixth line

## Error Handling

In Data Engineering pipelines, it is common to try to read files that, for whatever reason, do not exist. It is therefore necessary to handle such errors with exceptions. For example, assume that we try to open a file that does not exist.

with open('nonexisting.txt', mode = 'r') as my_file:

    # read the 10 first characters
    data = my_file.read(10)
    print(data)



Output:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_21724\2298896326.py in <module>
----> 1 with open('nonexisting.txt', mode = 'r') as my_file:
2
3     # read the 10 first characters
5     print(data)

FileNotFoundError: [Errno 2] No such file or directory: 'nonexisting.txt'

As we can see, we got the “FileNotFoundError” error. Let’s see how we can handle it with try-except.

try:
    with open('nonexisting.txt', mode = 'r') as my_file:

        # read the 10 first characters
        data = my_file.read(10)
        print(data)
except FileNotFoundError as e:
    print('Error', e)



Output:

Error [Errno 2] No such file or directory: 'nonexisting.txt'
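
In a pipeline, we may prefer to continue with a default value instead of only printing the error. The sketch below shows one possible way to wrap this logic; the helper read_text_or_default and its default parameter are hypothetical names introduced only for this example.

```python
import os
import tempfile


def read_text_or_default(path, default=''):
    """Return the content of the file, or `default` if the file does not exist."""
    try:
        with open(path, mode='r') as my_file:
            return my_file.read()
    except FileNotFoundError:
        # the file is missing, so fall back to the default value
        return default


# a path inside a fresh temporary directory, so the file cannot exist
missing_path = os.path.join(tempfile.mkdtemp(), 'nonexisting.txt')
result = read_text_or_default(missing_path, default='no data')
print(result)
```

Catching only FileNotFoundError keeps other problems, such as permission errors, visible instead of silently hiding them.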
