You can split a DataFrame into train and test sets with pandas as follows:
    # use a random state to be reproducible
    # 80% for train and 20% for test
    train = df.sample(frac=0.8, random_state=5)
    test = df.drop(train.index)

    # if you want an absolute number of rows for the train set
    train = df.sample(n=1000000, random_state=5)
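One caveat worth checking: `df.drop(train.index)` removes rows by index label, so the split is only clean when the index is unique. The sketch below runs the same approach end to end on a small made-up DataFrame (the column names `x` and `y` and the 10-row size are assumptions for illustration only) and resets the index first as a precaution.

```python
import pandas as pd

# Hypothetical toy data; column names and size are placeholders.
df = pd.DataFrame({"x": range(10), "y": [i % 2 for i in range(10)]})

# drop() works on index labels, so make sure they are unique.
# This matters if df came from a concat or a filtered frame.
df = df.reset_index(drop=True)

# 80% of the rows for train, the remaining rows for test.
train = df.sample(frac=0.8, random_state=5)
test = df.drop(train.index)

print(len(train), len(test))  # 8 2

# Sanity check: no row appears in both sets.
print(train.index.intersection(test.index).empty)  # True
```

If the original index carries meaning (for example, timestamps), it may be preferable to keep it and verify `df.index.is_unique` instead of resetting it.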