How to Read and Write CSV Files in Python

The CSV format is the most commonly used import and export format for databases and spreadsheets. This tutorial will give a introduction of the csv module in Python. We will learn about all the functions and classes that it makes available for you to read and write data to CSV files. We have also included some working examples to show you how to read and write data to an actual CSV file in Python.

What Is a CSV File?

The term CSV stands for Comma Separated Values. A CSV format file allows data to be saved in a tabular structure with a .csv extension. CSV files have been used extensively in e-commerce applications because they are considered very easy to process. Some of the areas where they have been used include:

  • importing and exporting customer data
  • importing and exporting products
  • exporting orders
  • exporting e-commerce analytic reports

Reader and Writer Functions

The csv module has several functions and classes available for working with CSV files. Lets begin by learning how to use the reader() and writer() functions.

csv.reader

This function returns a reader object which iterates over all the lines in the given CSV file. The csv.reader() function takes the following parameters:

  • csvfile: The first parameter can be any object which supports the iterator protocol and returns a string each time its __next__() method is called.
  • dialect='excel': An optional parameter used to define a set of parameters specific to a particular CSV dialect. It is set to excel by default. You can get the name of all the registered dialects by calling the list_dialects() function.
  • fmtparams: The third parameter indicates that you can optionally pass multiple keyword arguments. They are useful for overriding the individual formatting parameters in the current dialect.

Here is a very simple example showing how to use the csv.reader() function.

Did you notice the newline='' passed to open() while opening example.csv? That’s ideally how you should open a csvfile object if it is a file.

csv.writer

This function returns a writer object that has a bunch of helper methods to help you write your data into the CSV file. This function is similar to the csv.reader() function. It takes the following parameters:

  • csvfile: This can be any object with a write() method. Again, you should open it using the newline='' parameter if it is a file object.
  • dialect='excel': An optional parameter used to define a set of parameters specific to a particular CSV.
  • fmtparam: This just indicates that you can pass a bunch of optional keyword arguments to override the behavior of the current dialect.

DictReader and DictWriter Classes

The DictReader and DictWriter classes from the csv module are also helpful for reading and writing to CSV file. Although they are similar to the reader and writer functions, these classes use dictionary objects for reading and writing the data.

DictReader

The DictReader class creates an object which maps the information in each row of the given file to a dictionary whose keys can be specified by the optional fieldnames parameter. If you omit the fieldnames parameter, the values in the first row will be used as keys.

Here is a very basic example of reading the contents of a file called people.csv using the DictReader class.

Ideally, you will have the same number of rows in your file as the number of fields you passed in the fieldnames parameter. What happens if that’s not the case? There are two possibilities here.

First, a row that you are reading has more fields than those in fieldnames. In this case, the remaining data will be put inside a list and then stored in a fieldname whose value is determined by the restkey parameter. Second, a row has fewer fields than those in fieldnames. In this case, the missing values will be filled-in using the value you specified for the restval parameter.

DictWriter

The DictWriter class creates an object which maps dictionaries onto rows that you want to output or write to your file. Just like DictReader, this class also accepts fieldnames as its second parameter. The order of different fields inside fieldnames determines the order in which the values in different dictionaries will be passed to the writerow() method.

Unlike the DictReader class, the fieldnames parameter in DictWriter is not optional. This makes sense because it is possible for some of the dictionaries to have a different set of keys and using fieldnames parameter will clear any ambiguity around usage of those keys.

The dictionaries that you want to write to a CSV file can have a missing key that is present in fieldnames. In such cases, the value is determined by the optional restval parameter. If any dictionary has an extra key this is not in fieldnames, you can use the extrasaction key to specify what to do. The default behavior is to raise a ValueError exception.

Dialects and Formatting

A dialect is a helper class used to define the parameters for a specific reader or writer instance. Dialects and formatting parameters need to be declared when calling a reader or writer function or method.

There are several attributes which are supported by a dialect:

  • delimiter: A one-character string used to separate fields. It defaults to ','.
  • quotechar: A one-character string used to quote fields containing special characters. This includes the delimiter or the quotechar character. It defaults to '"'.
  • doublequote: Controls how instances of quotechar appearing inside a field should be quoted. Can be True or False. Keep in mind that you will have to set a value of escapechar if doublequote is set to False.
  • escapechar: A string used by the writer to escape the delimiter if quoting is set to QUOTE_NONE.
  • lineterminator: A string used to terminate lines produced by the writer. It defaults to 'rn'.
  • skipinitialspace: If set to True, any white space immediately following the delimiter is ignored.
  • strict: If set to True, it raises an exception Error on bad CSV input.
  • quoting: Controls when quotes should be generated when reading or writing to a CSV.

The quoting attribute can have one of the four possible values. These are:

  • QUOTE_ALL: This tells the writer to quote all fields.
  • QUOTE_MINIMAL: This tells the writer to only quote those fields which contain a special character like the delimiter or the quotechar character.
  • QUOTE_NONNUMERIC: This tells the writer to quote all non-numeric fields and convert the numeric fields to floats.
  • QUOTE_NONE: This tells the writer to never quote any fields.

Reading a CSV File

Let’s see how to read a CSV file using the function and classes we have discussed above.

We will be working with a CSV file named state-data.csv which contains information about different states in the USA. Here we have our CSV file which contains the names of states, their capital, largest city and population.

Sample State Data in CSV FileSample State Data in CSV FileSample State Data in CSV File

Below is the code for reading the data in our CSV using both the csv.reader function and the csv.DictReader class.

Reading a CSV File With csv.reader

In the code above, we import the csv module and then open our CSV file as state_file. We then define the reader object and use the csv.reader() function to extract the data into the object. We then iterate over the reader object and retrieve each row of our data.

We show the read data by printing its contents to the console. You can see that the data in individual rows has been returned as a list.

Output

Reading a CSV File With DictReader

As I mentioned above, the DictReader class allows us to read a CSV file by mapping the data to a dictionary instead of returning a list as in the case of the csv.reader() function. Here’s how to read a CSV file using the DictReader class.

As usual, we first import the csv module. We then define the reader object and use the DictReader class to extract the data into the object. We then iterate over the reader object and retrieve each row of our data.

Finally, we output the read data into the console. You can see that the data in individual rows was returned as a dictionary this time.

Output

Using the DictReader class is better because it outputs our data as a dictionary which can be easier to work with in certain situations.

We did not pass any value for the fieldnames parameter in the previous example so DictReader used the fields in the first row as dictionary keys. Lets pass a list as the value for the fieldnames parameter this time. I have intentionally left out Largest City and Population from the list of fieldnames to show you how DictReader behaves in this case. I also pass a value for the restkey parameter because we have more items in each row than the fields in fieldnames.

The above code gives the following output:

Since we were already passing the dictionary keys in fieldnames, the first row is no longer used to create the keys and is output as a dictionary. Also note that all the extra values in each row are now being stored in a list and assigned to the key Extra Data in the dictionary.

Writing to a CSV File

Lets now see how to go about writing data into a CSV file using the csv.writer() function and the Dictwriter class discussed at the beginning of this tutorial.

Writing to a CSV File Using csv.writer

The code below writes the data defined to the state-data.csv file.

First we import the csv module, and the writer() function will create an object suitable for writing. We use the method writerow() because there is a single row whose data we want to write to the file.

Here is our CSV with the data we have written to it.

Appending State Data in CSV File Using csv.writer()Appending State Data in CSV File Using csv.writer()Appending State Data in CSV File Using csv.writer()

Writing to a CSV File Using DictWriter

Lets write the following data to our CSV file. It contains information about three different US states in a list with individual elements being a dictionary.

The code is as shown below.

We first define the fieldnames as a list and store them in fields variable. This lets the writer object know what would be the heading of each column in the CSV file. The writerows() method will write all the rows that we supply to it at once to the CSV file. The only condition for writerows() is that the rows that we want to write are iterable.

Each individual row itself should be also be an iterable of strings or numbers if we use the writer() function like we did in the previous example. Otherwise, each row should be a dictionary that maps fieldnames to strings or numbers for the DictWriter class to process it.

Lets try to write the following data to our CSV file now:

There are two noteworthy things about the above data. First, our state Florida is missing some information. Second, the state Georgia has some extra information based what we are storing in the table. Third, the keys for Georgia state are not in the order in which we have defined the fields for our CSV file.

How can we handle non-standard data like this? The DictWriter class offers a solution. For missing keys, we can simply provide a default value using the restval parameter. It is set to an empty string by default. However, you can also specify a custom value such as Unknown. For extra keys, you can use the extrasaction parameter to tell DictWriter to ignore those keys. This parameter will raise a ValueError by default.

Here is how to write to all the rows at once.

Our CSV file will look like this after all the writing operations:

Stata Data Table After Writing OperationsStata Data Table After Writing OperationsStata Data Table After Writing Operations

Conclusion

This tutorial has covered most of what is required to be able to successfully read and write to a CSV file using the different functions and classes provided by Python. CSV files have been widely used in software applications because they are easy to read and manage and their small size makes them relatively fast to process and transfer.

Learn Python

Learn Python with our complete python tutorial guide, whether you’re just getting started or you’re a seasoned coder looking to learn new skills.

This post has been updated with contributions from Monty Shokeen. Monty is a full-stack developer who also loves to write tutorials, and to learn about new JavaScript libraries.

Leave a comment

Your email address will not be published.