Assume that you have multiple .txt files and you want to concatenate all of them into a single .txt file. Assume also that the .txt files are in the dataset folder. First collect their paths, then write the contents of each file to the output file:
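    import os

    # find all the txt files in the dataset folder
    # (sorted, because os.listdir returns names in arbitrary order)
    inputs = []
    for file in sorted(os.listdir("dataset")):
        if file.endswith(".txt"):
            inputs.append(os.path.join("dataset", file))

    # concatenate all txt files into a file called merged_file.txt
    # (written as UTF-8 to match the encoding used when reading)
    with open('merged_file.txt', 'w', encoding="utf-8") as outfile:
        for fname in inputs:
            # errors='ignore' silently drops bytes that are not valid UTF-8
            with open(fname, encoding="utf-8", errors='ignore') as infile:
                outfile.write(infile.read())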
The snippet above concatenates all of the files into one file called merged_file.txt. It reads each input file fully into memory before writing it, so if the files are large, copy them line by line instead:
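    # inputs is the list of .txt paths collected from the dataset folder
    with open('merged_file.txt', 'w', encoding="utf-8") as outfile:
        for fname in inputs:
            with open(fname, encoding="utf-8", errors='ignore') as infile:
                # iterate line by line so only one line is in memory at a time
                for line in infile:
                    outfile.write(line)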
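As an alternative for large files, here is a minimal sketch that copies the files in fixed-size chunks with shutil.copyfileobj. The function name merge_txt_files and its chunk_size parameter are hypothetical, introduced only for this example. It opens the files in binary mode, so the bytes are copied unchanged; unlike the snippets above, it does not decode the text or drop invalid UTF-8 bytes.

```python
import shutil
from pathlib import Path


def merge_txt_files(src_dir, out_path, chunk_size=1024 * 1024):
    """Concatenate every .txt file in src_dir into out_path (hypothetical helper)."""
    out_path = Path(out_path)
    # Sort for a reproducible order and skip the output file itself,
    # in case it is written inside src_dir.
    inputs = sorted(
        p for p in Path(src_dir).glob("*.txt")
        if p.resolve() != out_path.resolve()
    )
    with open(out_path, "wb") as outfile:
        for path in inputs:
            with open(path, "rb") as infile:
                # Copy in chunks of chunk_size bytes so no file is read whole.
                shutil.copyfileobj(infile, outfile, chunk_size)
    return inputs
```

With the layout used above, the call would be merge_txt_files("dataset", "merged_file.txt"). Note that if an input file does not end with a newline, its last line is joined to the first line of the next file; the same is true of the earlier snippets.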