Bulk Import a CSV File Into MongoDB Using Mongoose With Node.js

This topic is a really enjoyable one for me. It’s quite common in many web applications to accept user input and save a single record to your database. But what about when your users (or you) want to perform multiple inserts in a single command?

In this article, we will demonstrate how to create a CSV template and a form to upload the CSV file, and how to parse the CSV into a Mongoose Model that will be saved to a MongoDB database.

This article assumes that you have a basic understanding of Mongoose and how it interacts with MongoDB. If you don’t, I would suggest reading my Introduction to Mongoose for MongoDB and Node.js article first. That article describes how Mongoose interacts with MongoDB through strongly-typed Schemas from which Models are created. If you already have a good understanding of Mongoose, then let’s proceed.

Getting Started

To begin, create a new Node.js application. In a command prompt, navigate to where you want to host your Node.js application and initialise the project by running npm init -y.

Using the -y flag means that the project gets initialised with the default values in place, so the application will start with index.js. Before creating and parsing CSV files, we need to do some initial setup. We want to make this a web application, so we are going to use the Express package to handle all of the nitty-gritty server setup. In your command prompt, install Express by running npm install express.

Since this web application will accept files via a web form, we are also going to use the express-fileupload middleware package. Let’s install that now as well by running npm install express-fileupload.

We have done enough initial configuration to set up the web application and create a basic web page that will create a file upload form.

In the root directory, create an index.js file and add the following code snippets. This file sets up the web server. 

This file imports Express and the Express File Upload libraries, configures the web application to use the File Upload, and listens on port 8080. It also creates a route using Express at “/” which will be the default landing page for the web application. This route returns an index.html file that contains the web form that will allow a user to upload a CSV file.
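The setup described above could be sketched roughly as follows. This is a minimal sketch, not the article's exact file: the createApp and start helper names are hypothetical, and the two require calls sit inside createApp so that loading the file has no side effects. It assumes the express and express-fileupload packages are installed.

```javascript
// index.js (sketch): web server setup as described above.
const path = require('path');

const PORT = 8080;

// Hypothetical helper: builds the Express app. The packages are required
// here, not at the top of the file, so loading this file has no side effects.
function createApp() {
  const express = require('express');
  const fileUpload = require('express-fileupload');

  const app = express();

  // Make uploaded files available on req.files.
  app.use(fileUpload());

  // Default landing page: the HTML upload form.
  app.get('/', (req, res) => {
    res.sendFile(path.join(__dirname, 'index.html'));
  });

  return app;
}

// Hypothetical helper: starts listening and logs the startup message.
function start() {
  return createApp().listen(PORT, () => {
    console.log(`Server started on port ${PORT}`);
  });
}

module.exports = { createApp, start, PORT };
```

With this layout, adding a start() call at the end of the file lets node index.js launch the server directly.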

You can start your web server by running node index.js in your terminal.

You should see the message Server started on port 8080 in your terminal. This means that the web server is up and running.

In the root directory, create an index.html file. This file creates the form for uploading a CSV file.

This HTML file contains two important things:

  1. A link to /template which, when clicked, will download a CSV template that can be populated with the information to be imported.
  2. A form with the enctype attribute set to multipart/form-data and an input field with a type of file that accepts files with a .csv extension. multipart/form-data is the encoding an HTML form must use when it contains a file upload.

When you visit http://localhost:8080/ you will see the form created earlier.

[Figure: Index page]

As you may have noticed, the HTML makes reference to an Author template. If you read the Introduction to Mongoose article, you will remember that it created an Author Schema. In this article, you are going to recreate this Schema and allow the user to bulk import a collection of authors into the MongoDB database. Before we take a look at the Author Schema, though, we need to install the Mongoose package by running npm install mongoose.

Creating the Schema and Model

With Mongoose installed, let’s create a new author.js file that will define the author schema and model:

With the author schema and model created, let’s switch gears and focus on creating the CSV template that can be downloaded by clicking on the template link. To aid with the CSV template generation, we will use the json2csv package. This npm package converts JSON into CSV with column titles and proper line endings.

Now you are going to update the index.js file created earlier by appending a new route for /template:

The first thing this code does is include a new template.js file (to be created next) and create a route for /template. This route will call a get function in the template.js file.

With the Express server updated to include the new route, let’s create the new template.js file:

This file first includes the installed json2csv package. It then creates and exports a get function. This function accepts the request and response objects from the Express server.

Inside the function, you create an array of the fields that you want to include in the CSV template. This can be done in one of two ways. The first way (which is done in this tutorial) is to create a static list of the fields to be included in the template. The second way is to dynamically create the list of fields by extracting the properties from the author schema.

The second way would be to use the following code:
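A minimal sketch of that dynamic approach, assuming the field names are read from the paths map that Mongoose exposes on a schema. The templateFields helper and its default exclusion list are hypothetical, and the function takes plain path names so it can be tried without a database connection.

```javascript
// Hypothetical helper: derive CSV template columns from schema path names,
// skipping properties that are populated in code, not by the user.
function templateFields(schemaPaths, excluded = ['_id', '__v', 'created']) {
  return schemaPaths.filter((name) => !excluded.includes(name));
}

// With the Author model loaded, the path names would come from the schema:
//   const fields = templateFields(Object.keys(Author.schema.paths));

// Stand-in path names, for illustration only.
const examplePaths = ['_id', 'name.firstName', 'name.lastName', 'created', '__v'];
console.log(templateFields(examplePaths));
```

Only the two name paths survive the filter in this illustration; the excluded list is where any other code-populated property would be added.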

The dynamic method is appealing, but it needs extra filtering when some properties of the Schema should not appear in the CSV template. In this case, our CSV template doesn’t include the _id and created properties because these will be populated via code. However, if you don’t have fields you wish to exclude, the dynamic method will work as well.

Creating the CSV Template

With the array of fields defined, you will use the json2csv package to create the CSV template from your JavaScript object. The resulting csv value is what this route returns.

And finally, using the res property from the Express server, two header properties will be set that will force the downloading of an authors.csv file.
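Putting these steps together, the get function might look like the sketch below. It is an approximation under stated assumptions: the column names are illustrative, the header row is built by a small hand-written buildTemplate helper standing in for the json2csv call (whose exact signature differs between versions), and text/csv is an assumed content type.

```javascript
// template.js (sketch). Column names are illustrative assumptions.
const fields = ['name.firstName', 'name.lastName', 'biography', 'twitter'];

// Hypothetical stand-in for json2csv: quote each title and join with commas.
function buildTemplate(columns) {
  const quote = (value) => `"${String(value).replace(/"/g, '""')}"`;
  return columns.map(quote).join(',') + '\r\n';
}

// Express route handler: respond with the template as a file download.
function get(req, res) {
  const csv = buildTemplate(fields);

  // These two headers prompt the browser to download authors.csv.
  res.set({
    'Content-Type': 'text/csv',
    'Content-Disposition': 'attachment; filename=authors.csv',
  });
  res.send(csv);
}

module.exports = { get, buildTemplate };
```

Keeping the CSV construction in its own function means the json2csv call can be swapped in later without touching the header logic.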

At this point, if you were to run your Node application and navigate to http://localhost:8080/ in your web browser, the web form would be displayed with a link to download the template. Clicking the link will download the authors.csv file. You populate this file before uploading it.

When you open the template file in Microsoft Excel, it should look like this:

[Figure: authors.csv]

Here is an example of the populated CSV file:

This example, when uploaded, will create three authors.

The puzzle pieces are starting to come together and form a picture. Let’s get to the meat and potatoes of this example and parse that CSV file. The index.js file requires some updating to connect to MongoDB and create a new POST route that will accept the file upload:

With a database connection and a new POST route configured, it’s time to parse the CSV file. Luckily, there are several great libraries that assist with this job. For this tutorial, we will use the fast-csv package, which can be installed by running npm install fast-csv.

The POST route was created similarly to the template route, which invokes a post function from the upload.js file. It is not necessary to place these functions in separate files; however, it is best to create separate files for these routes as it helps keep the code nice and organized.

Submitting Data

And finally, let’s create the upload.js file that contains the post function that is called when the previously created form is submitted:

Quite a bit is happening in this file. The first three lines include the necessary packages that will be required to parse and save the CSV data.

Next, the post function is defined and exported for use by the index.js file. Inside this function is where the magic takes place.

The function first checks that a file is attached to the request. If there is not, an error is returned indicating that a file must be uploaded.

When a file has been uploaded, a reference to the file is saved to a variable called authorFile. This is done by accessing the files object on the request and its file property. The file property matches the name of the file input defined in the index.html example.

An empty authors array that will be populated as the CSV file is parsed is created. This array will be used to save the data to the database.

The fast-csv library is now called by leveraging the fromString function (named parseString in newer releases of fast-csv). This function accepts the CSV file as a string. The string is extracted from the authorFile.data property, which holds the contents of the uploaded CSV file as a buffer.

The next line includes two options for the fast-csv function: headers and ignoreEmpty. These are both set to true. This tells the library that the first line of the CSV file will contain the headers and that empty rows should be ignored.

With the options configured, we set up two listener functions that are called when the data event and the end event are triggered. The data event is fired once for every row of the CSV file, and its listener receives a JavaScript object of the parsed data.

The object is updated to include the _id property of the author schema with a new ObjectId. This object is then added to the authors array.

When the CSV file has been fully parsed, the end event is triggered. Inside the event callback function, we will call the create function on the author model, passing the array of authors to it.

If an error occurs trying to save the array, an exception is thrown; otherwise, a success message is displayed to the user indicating how many authors have been uploaded and saved to the database.
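The whole flow described above can be sketched as follows. This is a hedged approximation, not the article's exact file: the createPostHandler wrapper is hypothetical and receives its dependencies as arguments so the logic can be exercised without a database, the parser is called through parseString (the newer fast-csv name for fromString), and save errors are reported with a 500 response where the description above throws.

```javascript
// upload.js (sketch). In the tutorial, csv would be the fast-csv module,
// Author the Mongoose model, and newId a function returning a new ObjectId,
// for example () => new mongoose.Types.ObjectId().
function createPostHandler({ csv, Author, newId }) {
  return function post(req, res) {
    // express-fileupload exposes uploads on req.files, keyed by input name.
    if (!req.files || !req.files.file) {
      return res.status(400).send('No files were uploaded.');
    }

    const authorFile = req.files.file;
    const authors = [];

    csv
      .parseString(authorFile.data.toString(), { headers: true, ignoreEmpty: true })
      .on('error', (err) => {
        res.status(400).send(`Could not parse CSV: ${err.message}`);
      })
      .on('data', (row) => {
        // Fired once per CSV row: attach an _id and queue the row.
        authors.push({ ...row, _id: newId() });
      })
      .on('end', () => {
        // The file is fully parsed: save every queued author in one call.
        Author.create(authors)
          .then(() => {
            res.send(`${authors.length} authors have been successfully uploaded.`);
          })
          .catch((err) => {
            res.status(500).send(`Could not save authors: ${err.message}`);
          });
      });
  };
}

module.exports = { createPostHandler };
```

In index.js, the result of createPostHandler would be passed to the POST route in place of a plain post function.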

The database schema should be similar to this:

[Figure: database schema]

If you would like to see the full source code, you can check out the GitHub repository with the code.

Conclusion

In this tutorial, we have only uploaded a couple of records. If your use case requires the ability to upload thousands of records, it might be a good idea to save the records in smaller chunks.

This can be done in several ways. I would update the data callback function to check the length of the authors array. When the array reaches your defined length, e.g. 100, call the Author.create function on the array, and then reset the array to empty. This saves the records in chunks of 100. Be sure to leave the final create call in the end callback function to save the remaining records.
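One way to sketch this chunking idea is a small buffer object shared by the data and end callbacks. The createBatcher helper below is hypothetical, and the save argument is an assumed function that returns a promise, such as (docs) => Author.create(docs).

```javascript
// Hypothetical helper: buffer rows and hand them to save in fixed-size chunks.
function createBatcher(size, save) {
  let batch = [];
  const pending = [];

  return {
    // Call from the data callback for every parsed row.
    push(row) {
      batch.push(row);
      if (batch.length >= size) {
        pending.push(save(batch));
        batch = []; // start a fresh array; the saved chunk is left untouched
      }
    },

    // Call from the end callback: save the remainder, then wait for all chunks.
    flush() {
      if (batch.length > 0) {
        pending.push(save(batch));
        batch = [];
      }
      return Promise.all(pending);
    },
  };
}

module.exports = { createBatcher };
```

For example, createBatcher(100, (docs) => Author.create(docs)) would save the records in chunks of 100. This sketch does not pause the CSV stream while a chunk is being saved, so very large files may still need additional flow control.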

Enjoy!

This post has been updated with contributions from Mary Okosun. Mary is a software developer based in Lagos, Nigeria, with expertise in Node.js, JavaScript, MySQL, and NoSQL technologies.
