This topic is a really enjoyable one for me. It’s quite common in many web applications to accept user input and save a single record to your database. But what about when your users (or you) want to perform multiple inserts in a single command?
In this article, we will demonstrate how to create a CSV template and a form to upload the CSV file, and how to parse the CSV into a Mongoose Model that will be saved to a MongoDB database.
This article assumes that you have a basic understanding of Mongoose and how it interacts with MongoDB. If you don’t, I would suggest reading my Introduction to Mongoose for MongoDB and Node.js article first. This article describes how Mongoose interacts with MongoDB by creating strongly-typed Schemas which a Model is created from. If you already have a good understanding of Mongoose, then let’s proceed.
Getting Started
To begin, create a new Node.js application. In a command prompt, navigate to where you want to host your Node.js applications and run the following commands:
```
mkdir csvimport
cd csvimport
npm init -y
```
Using the `-y` flag means that the project is initialised with the default values in place, so the application's entry point will be index.js. Before creating and parsing CSV files, we need to do some initial setup. We want to make this a web application, and to do that, you are going to use the Express package to handle all of the nitty-gritty server setup. In your command prompt, install Express by running the following command:
```
npm install express --save
```
Since this web application will accept files via a web form, we are also going to use the Express sub-package Express File Upload. Let’s install that now as well:
```
npm install express-fileupload --save
```
We have done enough initial configuration to set up the web application and create a basic web page that will create a file upload form.
In the root directory, create an index.js file and add the following code snippets. This file sets up the web server.
```javascript
var app = require('express')();
var fileUpload = require('express-fileupload');
var server = require('http').Server(app);

app.use(fileUpload());

app.get('/', function (req, res) {
    res.sendFile(__dirname + '/index.html');
});

server.listen(8080, () => {
    console.log('Server started on port 8080');
});
```
This file imports Express and the Express File Upload libraries, configures the web application to use the File Upload, and listens on port 8080. It also creates a route using Express at “/” which will be the default landing page for the web application. This route returns an index.html file that contains the web form that will allow a user to upload a CSV file.
You can start your web server by running the following command in your terminal:
```
node index.js
```
You should see the message Server started on port 8080 in your terminal. This means the web server has started successfully.
In the root directory, create an index.html file. This file creates the form for uploading a CSV file.
```html
<!DOCTYPE html>
<html lang="en">
<head>
    <title>Upload Authors</title>
</head>
<body>
    <p>Use the form below to upload a list of authors. Click <a href="/template">here</a> for an example template.</p>
    <form action="/" method="POST" encType="multipart/form-data">
        <input type="file" name="file" accept=".csv" />
        <br/><br/>
        <input type="submit" value="Upload Authors" />
    </form>
</body>
</html>
```
This HTML file contains two important things:
- A link to /template which, when clicked, will download a CSV template that can be populated with the information to be imported.
- A form with the `encType` set to `multipart/form-data` and an input field with a type of `file` that accepts files with a .csv extension. `multipart/form-data` is the encoding used in HTML when the form contains a file upload.
When you visit http://localhost:8080/ you will see the form created earlier.
As you may have noticed, the HTML makes reference to an Author template. If you read the Introduction to Mongoose article, an Author Schema was created. In this article, you are going to recreate this Schema and allow the user to bulk import a collection of authors into the MongoDB database. Let’s take a look at the Author Schema. Before we do that, though, you probably guessed it—we need to install the Mongoose package:
```
npm install mongoose --save
```
Creating the Schema and Model
With Mongoose installed, let’s create a new author.js file that will define the author schema and model:
```javascript
var mongoose = require('mongoose');

var authorSchema = mongoose.Schema({
    _id: mongoose.Schema.Types.ObjectId,
    name: {
        firstName: {
            type: String,
            required: true
        },
        lastName: String
    },
    biography: String,
    twitter: {
        type: String,
        validate: {
            validator: function(text) {
                if (text !== null && text.length > 0)
                    return text.indexOf('https://twitter.com/') === 0;
                return true;
            },
            message: 'Twitter handle must start with https://twitter.com/'
        }
    },
    facebook: {
        type: String,
        validate: {
            validator: function(text) {
                if (text !== null && text.length > 0)
                    return text.indexOf('https://www.facebook.com/') === 0;
                return true;
            },
            message: 'Facebook Page must start with https://www.facebook.com/'
        }
    },
    linkedin: {
        type: String,
        validate: {
            validator: function(text) {
                if (text !== null && text.length > 0)
                    return text.indexOf('https://www.linkedin.com/') === 0;
                return true;
            },
            message: 'LinkedIn must start with https://www.linkedin.com/'
        }
    },
    profilePicture: Buffer,
    created: {
        type: Date,
        default: Date.now
    }
});

var Author = mongoose.model('Author', authorSchema);
module.exports = Author;
```
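All three URL validators in the schema follow the same pattern: accept empty values (the fields are optional), otherwise require a specific prefix. Extracted as a standalone function, the shared logic looks like this (a sketch for illustration; the name `startsWithPrefix` is not part of the tutorial's code):

```javascript
// Sketch of the prefix-validation logic shared by the twitter,
// facebook and linkedin validators in the schema above.
function startsWithPrefix(text, prefix) {
    // Empty or missing values pass validation, since the fields are optional.
    if (text !== null && text !== undefined && text.length > 0)
        return text.indexOf(prefix) === 0;
    return true;
}

console.log(startsWithPrefix('https://twitter.com/mary', 'https://twitter.com/')); // true
console.log(startsWithPrefix('twitter.com/mary', 'https://twitter.com/'));         // false
console.log(startsWithPrefix('', 'https://twitter.com/'));                         // true
```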
With the author schema and model created, let’s switch gears and focus on creating the CSV template that can be downloaded by clicking on the template link. To aid with the CSV template generation, we will use the json2csv package. This npm package converts JSON into CSV with column titles and proper line endings.
```
npm install json2csv --save
```
Now you are going to update the previously created index.js file to include a new route for /template. You will append this new code for the template route to the previously created index.js file:
```javascript
var template = require('./template.js');

app.get('/template', template.get);
```
The first thing this code does is include a new template.js file (to be created next) and create a route for /template. This route will call a `get` function in the template.js file.
With the Express server updated to include the new route, let’s create the new template.js file:
```javascript
var json2csv = require('json2csv').parse;

exports.get = function(req, res) {
    var fields = [
        'name.firstName',
        'name.lastName',
        'biography',
        'twitter',
        'facebook',
        'linkedin'
    ];

    // With an empty data set and the fields option provided,
    // json2csv outputs just the header row, which is exactly
    // what we want for a template.
    var csv = json2csv([], { fields: fields });

    res.set("Content-Disposition", "attachment;filename=authors.csv");
    res.set("Content-Type", "application/octet-stream");
    res.send(csv);
};
```
This file first includes the installed `json2csv` package. It then creates and exports a `get` function. This function accepts the request and response objects from the Express server.
Inside the function, you create an array of the fields that you want to include in the CSV template. This can be done in one of two ways. The first way (which is done in this tutorial) is to create a static list of the fields to be included in the template. The second way is to dynamically create the list of fields by extracting the properties from the author schema.
The second way would be to use the following code:
```javascript
var fields = Object.keys(Author.schema.obj);
```
It would have been a nice idea to use the dynamic method, but it becomes a bit complicated when you don't want to include several of the Schema's properties in the CSV template. In this case, our CSV template doesn't include the `_id` and `created` properties because these will be populated via code. However, if you don't have fields you wish to exclude, the dynamic method will work as well.
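If you did want the dynamic approach while still excluding a few fields, one option (a sketch; the `schemaObj` below stands in for `Author.schema.obj` so the example is self-contained) is to filter the keys:

```javascript
// Sketch: derive template fields from a schema definition object
// while excluding fields populated in code. Values are elided (null)
// because only the keys matter here. profilePicture is also excluded,
// since binary data does not belong in a CSV template.
var schemaObj = {
    _id: null,
    name: null,
    biography: null,
    twitter: null,
    facebook: null,
    linkedin: null,
    profilePicture: null,
    created: null
};

var excluded = ['_id', 'created', 'profilePicture'];
var fields = Object.keys(schemaObj).filter(function (key) {
    return excluded.indexOf(key) === -1;
});

console.log(fields); // ['name', 'biography', 'twitter', 'facebook', 'linkedin']
```

Note that nested paths such as name.firstName would still need to be flattened by hand, which is why this tutorial opts for a static list.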
Creating the CSV Template
With the array of fields defined, you will use the `json2csv` package to create the CSV template from your JavaScript object. This `csv` object will be the result of this route.

And finally, using the `res` object from the Express server, two header properties are set that force the download of an authors.csv file.
At this point, if you were to run your Node application and navigate to http://localhost:8080/ on your web browser, the web form would be displayed with a link to download the template. Clicking the link to download the template will allow you to download the authors.csv file. This file will be populated before it is uploaded.
Opening the template file in Microsoft Excel, you should see a single header row containing the field names. Here is an example of the populated CSV file:
```
name.firstName,name.lastName,biography,twitter,facebook,linkedin
Mary,Doe,frontend developer,https://twitter.com/mary,https://www.facebook.com/mary,https://www.linkedin.com/mary
Michael,Doe,backend developer,https://twitter.com/michael,https://www.facebook.com/michael,https://www.linkedin.com/michael
John,Doe,app developer,https://twitter.com/john,https://www.facebook.com/john,https://www.linkedin.com/john
```
This example, when uploaded, will create three authors.
The puzzle pieces are starting to come together and form a picture. Let’s get to the meat and potatoes of this example and parse that CSV file. The index.js file requires some updating to connect to MongoDB and create a new POST route that will accept the file upload:
```javascript
var app = require('express')();
var fileUpload = require('express-fileupload');
var mongoose = require('mongoose');
var server = require('http').Server(app);
var template = require('./template.js');
var upload = require('./upload.js');

app.use(fileUpload());

app.get('/', function (req, res) {
    res.sendFile(__dirname + '/index.html');
});

app.get('/template', template.get);
app.post('/', upload.post);

mongoose.connect('mongodb://localhost/csvimport');

server.listen(8080, () => {
    console.log('Server started on port 8080');
});
```

Note that mongoose must now be required in index.js for the `mongoose.connect` call to work.
With a database connection and a new POST route configured, it's time to parse the CSV file. Luckily, there are several great libraries that assist with this job. For this tutorial, we will use the `fast-csv` package, which can be installed with the following command:
```
npm install fast-csv --save
```
The POST route was created similarly to the template route: it invokes a `post` function from the upload.js file. It is not necessary to place these functions in separate files; however, creating separate files for these routes helps keep the code nice and organized.
Submitting Data
And finally, let’s create the upload.js file that contains the post
function that is called when the previously created form is submitted:
```javascript
var csv = require('fast-csv');
var mongoose = require('mongoose');
var Author = require('./author');

exports.post = function (req, res) {
    if (!req.files)
        return res.status(400).send('No files were uploaded.');

    var authorFile = req.files.file;
    var authors = [];

    csv
        .fromString(authorFile.data.toString(), {
            headers: true,
            ignoreEmpty: true
        })
        .on("data", function (data) {
            data['_id'] = new mongoose.Types.ObjectId();
            authors.push(data);
        })
        .on("end", function () {
            Author.create(authors, function (err, documents) {
                if (err) throw err;

                // Respond only after the save has completed.
                res.send(authors.length + ' authors have been successfully uploaded.');
            });
        });
};
```
Quite a bit is happening in this file. The first three lines include the necessary packages that will be required to parse and save the CSV data.
Next, the `post` function is defined and exported for use by the index.js file. Inside this function is where the magic takes place.
The function first checks that there is a file contained in the request body. If there is not, an error is returned indicating that a file must be uploaded.
When a file has been uploaded, a reference to it is saved to a variable called `authorFile`. This is done by accessing the `files` object on the request and its `file` property. The `file` property matches the `name` of the file input that was defined in the index.html example.
An empty `authors` array is created; it will be populated as the CSV file is parsed and then used to save the data to the database.
The `fast-csv` library is now called by leveraging the `fromString` function. This function accepts the CSV file as a string. The string is extracted from the `authorFile.data` property; the `data` property contains the contents of the uploaded CSV file.
The next line includes two options for the `fast-csv` function: `headers` and `ignoreEmpty`. These are both set to `true`, which tells the library that the first line of the CSV file contains the headers and that empty rows should be ignored.
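To make concrete what `headers: true` produces, here is a tiny hand-rolled illustration (not the fast-csv library itself, and deliberately naive: it does not handle quoting or escaping) of turning a header row plus data rows into objects:

```javascript
// Illustration only: a naive CSV-to-objects conversion showing the
// shape of what fast-csv emits per "data" event when headers: true.
function naiveParse(csvText) {
    var lines = csvText.trim().split('\n').filter(function (line) {
        return line.length > 0; // mimic ignoreEmpty: true
    });
    var headers = lines[0].split(',');
    return lines.slice(1).map(function (line) {
        var values = line.split(',');
        var row = {};
        headers.forEach(function (header, i) {
            row[header] = values[i];
        });
        return row;
    });
}

var rows = naiveParse('name.firstName,name.lastName\nMary,Doe\nJohn,Doe');
console.log(rows[0]); // { 'name.firstName': 'Mary', 'name.lastName': 'Doe' }
```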
With the options configured, we set up two listener functions that are called when the `data` event and the `end` event are triggered. The `data` event is fired once for every row of the CSV file and contains a JavaScript object of the parsed data.
The object is updated to include the `_id` property of the author schema with a new `ObjectId`. This object is then added to the `authors` array.
When the CSV file has been fully parsed, the `end` event is triggered. Inside the event callback function, we call the `create` function on the Author model, passing the array of `authors` to it.
If an error occurs trying to save the array, an exception is thrown; otherwise, a success message is displayed to the user indicating how many authors have been uploaded and saved to the database.
Each document saved to the database will contain the fields from the CSV columns, along with the generated `_id` and `created` values.
If you would like to see the full source code, you can check out the GitHub repository with the code.
Conclusion
In this tutorial, we have only uploaded a couple of records. If your use case requires the ability to upload thousands of records, it might be a good idea to save the records in smaller chunks.
This can be done in several ways. If I were to implement it, I would suggest updating the `data` callback function to check the length of the authors array. When the array exceeds your defined length, e.g. 100, call the `Author.create` function on the array, and then reset the array to empty. This will save the records in chunks of 100. Be sure to leave a final `create` call in the `end` callback function to save the remaining records.
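The chunking idea can be sketched as follows. This is not the tutorial's code: `saveChunk` stands in for `Author.create`, the chunk size of 100 is arbitrary, and the loop simulates the stream of `data` events:

```javascript
// Sketch of chunked saving. saveChunk is a stand-in for Author.create;
// in the real upload.js it would be an asynchronous call with a callback.
var CHUNK_SIZE = 100;
var authors = [];
var saved = 0; // counts records handed off for saving

function saveChunk(chunk) {
    // Placeholder for Author.create(chunk, callback).
    saved += chunk.length;
}

function onData(row) {
    authors.push(row);
    if (authors.length >= CHUNK_SIZE) {
        saveChunk(authors);
        authors = []; // reset so the next chunk starts fresh
    }
}

function onEnd() {
    // Save whatever is left over (fewer than CHUNK_SIZE records).
    if (authors.length > 0) {
        saveChunk(authors);
        authors = [];
    }
}

// Simulate parsing 250 rows: expect chunks of 100, 100 and 50.
for (var i = 0; i < 250; i++) {
    onData({ name: 'author' + i });
}
onEnd();
console.log(saved); // 250
```

In real code you would also need to coordinate the asynchronous `Author.create` callbacks before sending the response, for example by counting outstanding saves or using promises.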
Enjoy!
This post has been updated with contributions from Mary Okosun. Mary is a software developer based in Lagos, Nigeria, with expertise in Node.js, JavaScript, MySQL, and NoSQL technologies.
Post thumbnail generated with OpenAI DALL-E.