The CSV format is the most commonly used import and export format for databases and spreadsheets. This tutorial will give a introduction of the csv module in Python. We will learn about all the functions and classes that it makes available for you to read and write data to CSV files. We have also included some working examples to show you how to read and write data to an actual CSV file in Python.
What Is a CSV File?
The term CSV stands for Comma Separated Values. A CSV format file allows data to be saved in a tabular structure with a .csv extension. CSV files have been used extensively in e-commerce applications because they are considered very easy to process. Some of the areas where they have been used include:
- importing and exporting customer data
- importing and exporting products
- exporting orders
- exporting e-commerce analytic reports
Reader and Writer Functions
The csv module has several functions and classes available for working with CSV files. Lets begin by learning how to use the reader()
and writer()
functions.
csv.reader
This function returns a reader object which iterates over all the lines in the given CSV file. The csv.reader()
function takes the following parameters:
-
csvfile
: The first parameter can be any object which supports the iterator protocol and returns a string each time its__next__()
method is called. -
dialect='excel'
: An optional parameter used to define a set of parameters specific to a particular CSV dialect. It is set to excel by default. You can get the name of all the registered dialects by calling thelist_dialects()
function. -
fmtparams
: The third parameter indicates that you can optionally pass multiple keyword arguments. They are useful for overriding the individual formatting parameters in the current dialect.
Here is a very simple example showing how to use the csv.reader()
function.
import csv with open('people.csv', newline='') as File: reader = csv.reader(File) for row in reader: print(row)
Did you notice the newline=''
passed to open()
while opening example.csv? That’s ideally how you should open a csvfile
object if it is a file.
csv.writer
This function returns a writer object that has a bunch of helper methods to help you write your data into the CSV file. This function is similar to the csv.reader()
function. It takes the following parameters:
-
csvfile
: This can be any object with awrite()
method. Again, you should open it using thenewline=''
parameter if it is a file object. -
dialect='excel'
: An optional parameter used to define a set of parameters specific to a particular CSV. -
fmtparam
: This just indicates that you can pass a bunch of optional keyword arguments to override the behavior of the current dialect.
DictReader
and DictWriter
Classes
The DictReader
and DictWriter
classes from the csv module are also helpful for reading and writing to CSV file. Although they are similar to the reader and writer functions, these classes use dictionary objects for reading and writing the data.
DictReader
The DictReader
class creates an object which maps the information in each row of the given file to a dictionary whose keys can be specified by the optional fieldnames
parameter. If you omit the fieldnames
parameter, the values in the first row will be used as keys.
Here is a very basic example of reading the contents of a file called people.csv using the DictReader
class.
import csv with open('people.csv', newline='') as csvfile: reader = csv.DictReader(csvfile) for row in reader: print(row['Name'], row['Occuption'])
Ideally, you will have the same number of rows in your file as the number of fields you passed in the fieldnames
parameter. What happens if that’s not the case? There are two possibilities here.
First, a row that you are reading has more fields than those in fieldnames
. In this case, the remaining data will be put inside a list and then stored in a fieldname whose value is determined by the restkey
parameter. Second, a row has fewer fields than those in fieldnames
. In this case, the missing values will be filled-in using the value you specified for the restval
parameter.
DictWriter
The DictWriter
class creates an object which maps dictionaries onto rows that you want to output or write to your file. Just like DictReader
, this class also accepts fieldnames
as its second parameter. The order of different fields inside fieldnames
determines the order in which the values in different dictionaries will be passed to the writerow()
method.
Unlike the DictReader
class, the fieldnames
parameter in DictWriter
is not optional. This makes sense because it is possible for some of the dictionaries to have a different set of keys and using fieldnames
parameter will clear any ambiguity around usage of those keys.
The dictionaries that you want to write to a CSV file can have a missing key that is present in fieldnames
. In such cases, the value is determined by the optional restval
parameter. If any dictionary has an extra key this is not in fieldnames
, you can use the extrasaction
key to specify what to do. The default behavior is to raise a ValueError
exception.
Dialects and Formatting
A dialect is a helper class used to define the parameters for a specific reader
or writer
instance. Dialects and formatting parameters need to be declared when calling a reader or writer function or method.
There are several attributes which are supported by a dialect:
-
delimiter
: A one-character string used to separate fields. It defaults to','
. -
quotechar
: A one-character string used to quote fields containing special characters. This includes thedelimiter
or thequotechar
character. It defaults to'"'
. -
doublequote
: Controls how instances of quotechar appearing inside a field should be quoted. Can beTrue
orFalse
. Keep in mind that you will have to set a value of escapechar if doublequote is set toFalse
. -
escapechar
: A string used by the writer to escape the delimiter if quoting is set toQUOTE_NONE
. -
lineterminator
: A string used to terminate lines produced by thewriter
. It defaults to'rn'
. -
skipinitialspace
: If set toTrue
, any white space immediately following the delimiter is ignored. -
strict
: If set toTrue
, it raises an exception Error on bad CSV input. -
quoting
: Controls when quotes should be generated when reading or writing to a CSV.
The quoting attribute can have one of the four possible values. These are:
-
QUOTE_ALL
: This tells thewriter
to quote all fields. -
QUOTE_MINIMAL
: This tells thewriter
to only quote those fields which contain a special character like the delimiter or thequotechar
character. -
QUOTE_NONNUMERIC
: This tells thewriter
to quote all non-numeric fields and convert the numeric fields to floats. -
QUOTE_NONE
: This tells thewriter
to never quote any fields.
Reading a CSV File
Let’s see how to read a CSV file using the function and classes we have discussed above.
We will be working with a CSV file named state-data.csv which contains information about different states in the USA. Here we have our CSV file which contains the names of states, their capital, largest city and population.
Below is the code for reading the data in our CSV using both the csv.reader
function and the csv.DictReader
class.
Reading a CSV File With csv.reader
import csv with open('state-data.csv', newline='') as state_file: reader = csv.reader(state_file) for row in reader: print(row)
In the code above, we import the csv module and then open our CSV file as state_file
. We then define the reader object and use the csv.reader()
function to extract the data into the object. We then iterate over the reader
object and retrieve each row of our data.
We show the read data by printing its contents to the console. You can see that the data in individual rows has been returned as a list.
Output
['Name', 'Capital City', 'Largest City', 'Population'] ['Alabama', 'Montgomery', 'Huntsville', '5024279'] ['Alaska', 'Juneau', 'Anchorage', '733391'] ['Arizona', 'Phoenix', 'Phoenix', '7151502'] ['Arkansas', 'Little Rock', 'Little Rock', '3011524']
Reading a CSV File With DictReader
As I mentioned above, the DictReader
class allows us to read a CSV file by mapping the data to a dictionary instead of returning a list as in the case of the csv.reader()
function. Here’s how to read a CSV file using the DictReader
class.
import csv with open('state-data.csv', newline='') as state_file: reader = csv.DictReader(state_file) for row in reader: print(row)
As usual, we first import the csv module. We then define the reader object and use the DictReader
class to extract the data into the object. We then iterate over the reader
object and retrieve each row of our data.
Finally, we output the read data into the console. You can see that the data in individual rows was returned as a dictionary this time.
Output
{'Name': 'Alabama', 'Capital City': 'Montgomery', 'Largest City': 'Huntsville', 'Population': '5024279'} {'Name': 'Alaska', 'Capital City': 'Juneau', 'Largest City': 'Anchorage', 'Population': '733391'} {'Name': 'Arizona', 'Capital City': 'Phoenix', 'Largest City': 'Phoenix', 'Population': '7151502'} {'Name': 'Arkansas', 'Capital City': 'Little Rock', 'Largest City': 'Little Rock', 'Population': '3011524'}
Using the DictReader
class is better because it outputs our data as a dictionary which can be easier to work with in certain situations.
We did not pass any value for the fieldnames
parameter in the previous example so DictReader
used the fields in the first row as dictionary keys. Lets pass a list as the value for the fieldnames
parameter this time. I have intentionally left out Largest City and Population from the list of fieldnames
to show you how DictReader
behaves in this case. I also pass a value for the restkey
parameter because we have more items in each row than the fields in fieldnames
.
import csv fields = ['Name', 'Capital City'] with open('state-data.csv', newline='') as state_file: reader = csv.DictReader(state_file, fields, restkey='Extra Data') for row in reader: print(row)
The above code gives the following output:
{'Name': 'Name', 'Capital City': 'Capital City', 'Extra Data': ['Largest City', 'Population']} {'Name': 'Alabama', 'Capital City': 'Montgomery', 'Extra Data': ['Huntsville', '5024279']} {'Name': 'Alaska', 'Capital City': 'Juneau', 'Extra Data': ['Anchorage', '733391']} {'Name': 'Arizona', 'Capital City': 'Phoenix', 'Extra Data': ['Phoenix', '7151502']} {'Name': 'Arkansas', 'Capital City': 'Little Rock', 'Extra Data': ['Little Rock', '3011524']}
Since we were already passing the dictionary keys in fieldnames
, the first row is no longer used to create the keys and is output as a dictionary. Also note that all the extra values in each row are now being stored in a list and assigned to the key Extra Data in the dictionary.
Writing to a CSV File
Lets now see how to go about writing data into a CSV file using the csv.writer()
function and the Dictwriter
class discussed at the beginning of this tutorial.
Writing to a CSV File Using csv.writer
The code below writes the data defined to the state-data.csv
file.
import csv state_info = ["California", "Sacramento", "Los Angeles", "39538223"] with open('state-data.csv', 'a', newline='') as state_file: writer = csv.writer(state_file) writer.writerow(state_info)
First we import the csv module, and the writer()
function will create an object suitable for writing. We use the method writerow()
because there is a single row whose data we want to write to the file.
Here is our CSV with the data we have written to it.
Writing to a CSV File Using DictWriter
Lets write the following data to our CSV file. It contains information about three different US states in a list with individual elements being a dictionary.
state_info = [ { "Name": "Colorado", "Largest City": "Denver", "Capital City": "Denver", "Population": "5773714" }, { "Name": "Connecticut", "Largest City": "Hartford", "Capital City": "Bridgeport", "Population": "3605944" }, { "Name": "Delaware", "Largest City": "Dover", "Capital City": "Wilmington", "Population": "989948" } ]
The code is as shown below.
import csv fields = ['Name', 'Capital City', 'Largest City', 'Population']; with open('state-data.csv', 'a', newline='') as state_file: writer = csv.DictWriter(state_file, fields) writer.writerows(state_info)
We first define the fieldnames as a list and store them in fields
variable. This lets the writer
object know what would be the heading of each column in the CSV file. The writerows()
method will write all the rows that we supply to it at once to the CSV file. The only condition for writerows()
is that the rows that we want to write are iterable.
Each individual row itself should be also be an iterable of strings or numbers if we use the writer()
function like we did in the previous example. Otherwise, each row should be a dictionary that maps fieldnames
to strings or numbers for the DictWriter
class to process it.
Lets try to write the following data to our CSV file now:
state_info = [ { "Name": "Florida", "Capital City": "Tallahassee" }, { "Name": "Georgia", "Area": "153910", "Largest City": "Atlanta", "Population": "10711908", "Capital City": "Atlanta", } ]
There are two noteworthy things about the above data. First, our state Florida is missing some information. Second, the state Georgia has some extra information based what we are storing in the table. Third, the keys for Georgia state are not in the order in which we have defined the fields for our CSV file.
How can we handle non-standard data like this? The DictWriter
class offers a solution. For missing keys, we can simply provide a default value using the restval
parameter. It is set to an empty string by default. However, you can also specify a custom value such as Unknown. For extra keys, you can use the extrasaction
parameter to tell DictWriter
to ignore those keys. This parameter will raise a ValueError
by default.
Here is how to write to all the rows at once.
import csv fields = ['Name', 'Capital City', 'Largest City', 'Population'] with open('state-data.csv', 'a', newline='') as state_file: writer = csv.DictWriter(state_file, fields, restval='Unknown', extrasaction='ignore') writer.writerows(state_info)
Our CSV file will look like this after all the writing operations:
Conclusion
This tutorial has covered most of what is required to be able to successfully read and write to a CSV file using the different functions and classes provided by Python. CSV files have been widely used in software applications because they are easy to read and manage and their small size makes them relatively fast to process and transfer.
Learn Python
Learn Python with our complete python tutorial guide, whether you’re just getting started or you’re a seasoned coder looking to learn new skills.
This post has been updated with contributions from Monty Shokeen. Monty is a full-stack developer who also loves to write tutorials, and to learn about new JavaScript libraries.