Part 5: Importing Data and Writing Files
Topic 1: Reading .csv, .txt, .xlsx Files
Reading in .csv Using read_csv() Function
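To read in a csv file you can use the command pd.read_csv(PATH), where pd is the usual alias for the pandas package. The path is the location of the file on your computer; on Windows it tends to look like “C:\Users\yourname\etc…”. Inside a regular Python string a backslash starts an escape sequence, so you will have to write two backslashes instead of one (or put an r in front of the string to make it a raw string, or use forward slashes). This applies anytime you use a Windows path. The only time you won’t have to write out the full path is when the file is in Python’s current working directory, which is usually the same folder as the python file; in that case the file name alone is enough.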
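As a minimal sketch of the call described above, the example below first creates a small csv in a temporary folder so it can run anywhere, then reads it back. The file names, column names, and the "data.csv" path are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Create a small sample file so the example runs anywhere
# (file name and columns are hypothetical).
folder = tempfile.mkdtemp()
csv_path = os.path.join(folder, "sample.csv")
with open(csv_path, "w") as f:
    f.write("name,score\nAna,90\nBen,85\n")

# Read the csv into a DataFrame.
df = pd.read_csv(csv_path)
print(df)

# Two equivalent spellings of the same (hypothetical) Windows path:
escaped = "C:\\Users\\yourname\\data.csv"  # doubled backslashes
raw = r"C:\Users\yourname\data.csv"        # raw string, single backslashes
print(escaped == raw)  # True
```

Forward slashes ("C:/Users/yourname/data.csv") are a third option that Python accepts on Windows.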
Reading in .txt Using open() Function
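To read in a txt file you can use the “open” function. The three basic modes are “r”, “w”, and “a”. Read mode (“r”, the default) limits you to opening the file and reading what’s inside without being able to change or add to it. Write mode (“w”) creates the file, or erases its contents if it already exists. Append mode (“a”) adds whatever you write to the end of the file.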
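The three modes can be sketched as follows. The file name "notes.txt" and its contents are made up, and the file is created in a temporary folder so nothing on your computer is overwritten.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
txt_path = os.path.join(folder, "notes.txt")  # hypothetical file name

# "w" creates the file (or empties it if it already exists).
f = open(txt_path, "w")
f.write("first line\n")
f.close()

# "a" adds to the end of the existing content.
f = open(txt_path, "a")
f.write("second line\n")
f.close()

# "r" only allows reading.
f = open(txt_path, "r")
contents = f.read()
f.close()

print(contents)
```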
Reading the file once it’s opened is done using the read() and readlines() methods of the file object.
The read method returns everything in the file as one string, but you can pass a number to limit how many characters it returns. The readlines method returns a list with one string per line. In the output you will see “\n”, which indicates a new line. This is customary notation when reading txt files; there are other special characters like it, but they’re not important to know at this moment.
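Here is a small sketch of read() and readlines() on a made-up two-line file ("poem.txt" is a hypothetical name). Printing with repr() makes the “\n” characters visible.

```python
import os
import tempfile

# Build a small sample file (name and contents are hypothetical).
folder = tempfile.mkdtemp()
txt_path = os.path.join(folder, "poem.txt")
data = open(txt_path, "w")
data.write("roses are red\nviolets are blue\n")
data.close()

data = open(txt_path, "r")
first_five = data.read(5)  # only the first 5 characters
rest = data.read()         # everything after those 5 characters
data.close()

data = open(txt_path, "r")
lines = data.readlines()   # list of lines, each ending in "\n"
data.close()

print(repr(first_five))  # 'roses'
print(repr(rest))        # ' are red\nviolets are blue\n'
print(lines)             # ['roses are red\n', 'violets are blue\n']
```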
Whenever you open a file, make sure you close it when you’re done by calling close() on the file object, for example data.close() if the file was opened as data. Keeping a file open uses up memory and system resources, and if a file you were writing to is still open when the program terminates, some of the data may never be saved.
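The sketch below shows closing a file by hand and checking that it is closed. It also shows the with statement, a common alternative that this section does not otherwise cover: Python closes the file automatically when the indented block ends. The file name "log.txt" is made up.

```python
import os
import tempfile

# Build a small sample file (name and contents are hypothetical).
folder = tempfile.mkdtemp()
txt_path = os.path.join(folder, "log.txt")
data = open(txt_path, "w")
data.write("hello\n")
data.close()

# Option 1: close the file yourself.
data = open(txt_path, "r")
text = data.read()
data.close()
closed_by_hand = data.closed
print(closed_by_hand)  # True

# Option 2: let a with block close it for you.
with open(txt_path, "r") as data:
    text_again = data.read()
closed_by_with = data.closed
print(closed_by_with)  # True
```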
Reading in .xlsx Using read_excel() Function
Reading in an excel sheet works much the same as reading in a csv, using pd.read_excel(PATH). An xlsx file can hold several sheets, so you use the sheet_name argument to give the name or index of the sheet you want to import; if you leave it out, pandas reads the first sheet.
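As a rough sketch, the example below builds a two-sheet workbook in a temporary folder and then reads one sheet by name and one by index. The file name, sheet names, and columns are made up, and it assumes an Excel engine such as openpyxl is installed alongside pandas.

```python
import os
import tempfile

import pandas as pd

# Build a sample workbook with two sheets (all names are hypothetical).
folder = tempfile.mkdtemp()
xlsx_path = os.path.join(folder, "sample.xlsx")
with pd.ExcelWriter(xlsx_path) as writer:
    pd.DataFrame({"a": [1, 2]}).to_excel(writer, sheet_name="first", index=False)
    pd.DataFrame({"b": [3, 4]}).to_excel(writer, sheet_name="second", index=False)

# Pick a sheet by name...
by_name = pd.read_excel(xlsx_path, sheet_name="second")

# ...or by position (0 is the first sheet).
by_index = pd.read_excel(xlsx_path, sheet_name=0)

print(by_name)
print(by_index)
```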
Topic 2: Reading in Data From URLs
To read in data from a URL you can use the urllib package. This allows you to take files from the internet without manually downloading them or leaving your programming environment.
Using the urlretrieve function from urllib.request you can take the URL, save the csv to a local file, and then read that file into your environment.
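The two steps can be wrapped in a small helper, sketched below. The function name read_csv_from_url is made up for illustration, and the URL in the commented-out call is a placeholder, not a real data set.

```python
from urllib.request import urlretrieve

import pandas as pd


def read_csv_from_url(url, filename):
    """Save the file at `url` as `filename`, then load it with pandas."""
    urlretrieve(url, filename)    # step 1: copy the remote file to disk
    return pd.read_csv(filename)  # step 2: read the local copy


# Example call (placeholder URL, replace it with a real csv link):
# df = read_csv_from_url("https://example.com/data.csv", "data.csv")
```

The saved copy stays on disk afterwards, so later runs can read it directly with pd.read_csv.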
Topic 3: Writing Data to .csv, .xlsx
Writing out data is pretty simple when it comes to csv’s. Dataframe objects have the method to_csv. Within that you’re able to select the path where the file should be put, the name of the file, and a whole lot of other things you can look at in the documentation.
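A minimal sketch of to_csv follows, with a made-up dataframe and file name. Passing index=False is one of the optional settings: it leaves the row numbers out of the file.

```python
import os
import tempfile

import pandas as pd

# A small dataframe to write out (contents are hypothetical).
df = pd.DataFrame({"name": ["Ana", "Ben"], "score": [90, 85]})

# The path combines the folder and the file name.
folder = tempfile.mkdtemp()
out_path = os.path.join(folder, "scores.csv")

# Write the dataframe; index=False skips the row-number column.
df.to_csv(out_path, index=False)

# Look at what was written.
with open(out_path) as f:
    written = f.read()
print(written)
```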
Just like with csv files, dataframes have a to_excel method. Here you can also designate a sheet name with the sheet_name argument; if you don’t, the sheet is called “Sheet1”.
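The same idea for Excel can be sketched like this. The dataframe, file name, and sheet name "results" are made up, and the example assumes an Excel engine such as openpyxl is installed alongside pandas. Reading the sheet back is only there to check the result.

```python
import os
import tempfile

import pandas as pd

# A small dataframe to write out (contents are hypothetical).
df = pd.DataFrame({"name": ["Ana", "Ben"], "score": [90, 85]})

folder = tempfile.mkdtemp()
out_path = os.path.join(folder, "scores.xlsx")

# Write the dataframe to a sheet called "results".
df.to_excel(out_path, sheet_name="results", index=False)

# Read the same sheet back to confirm what was saved.
check = pd.read_excel(out_path, sheet_name="results")
print(check)
```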