I am using MongoDB 2.2.2 on a 32-bit Windows 7 machine. I have a complex aggregation query in a .js file. I need to execute this file in the shell and direct the output to a CSV file. I ensure that the query returns "flat" JSON (no nested keys), so it is inherently convertible to a neat CSV.
I know about load() and eval(). eval() requires me to paste the whole query into the shell and allows only printjson() inside the script, while I need CSV. The second way, load(), prints the output on the screen, and again in JSON format.
Is there a way Mongo can do this conversion from json to csv? (I need csv file to prepare charts on the data). I am thinking:
1. Either mongo has a built-in command for this that I can't find right now.
2. Mongo can't do it for me; I can at most send the json output to a file which I then need to convert to csv myself.
3. Mongo can send the json output to a temporary collection, the contents of which can be easily exported to csv format with mongoexport. But I think only map-reduce queries support output collections. Is that right? I need it for an aggregation query.
Thanks for any help :)
I know this question is old, but I spent an hour trying to export a complex query to CSV and wanted to share my thoughts. First, I couldn't get any of the JSON-to-CSV converters to work (although this one looked promising). What I ended up doing was manually writing the CSV file in my mongo script.
This is a simple version but essentially what I did:
print("name,id,email");
db.User.find().forEach(function(user){
print(user.name+","+user._id.valueOf()+","+user.email);
});
Then I just redirected the output of the script to a file:
mongo test export.js > out.csv
where test is the name of the database I use.
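One caveat with plain string concatenation is that a value containing a comma, a double quote or a line break will break the row. Below is a minimal sketch of a quoting helper, assuming the usual CSV convention of doubling embedded quotes; csvField and csvLine are hypothetical names, not shell built-ins:

```javascript
// Quote a single value for CSV when needed: wrap it in double quotes and
// double any embedded double quotes.
function csvField(value) {
  var text = String(value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Join an array of values into one CSV line.
function csvLine(values) {
  return values.map(csvField).join(",");
}

// In the mongo shell this could replace the manual concatenation, e.g.:
//   print(csvLine([user.name, user._id.valueOf(), user.email]));
```

Values without special characters are left unquoted, so the output for simple data stays the same as in the script above.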
Mongo's built-in export works fine unless you want to do any data manipulation, such as formatting dates or converting data types.
The following command works like a charm:
mongoexport -h localhost -d database -c collection --type=csv \
--fields erpNum,orderId,time,status \
-q '{"time":{"$gt":1438275600000}, "status":{"$ne" :"Cancelled"}}' \
--out report.csv
Note: recent versions of mongoexport take --type=csv instead of --csv.
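mongoexport reads from a collection, so it cannot run an aggregation by itself. On MongoDB 2.6 or later, one option is to end the pipeline with a $out stage, which writes the results to a collection that mongoexport can then read. The following is only a sketch: the field names are taken from the command above, and report_tmp is an assumed name for the temporary collection.

```javascript
// Hypothetical pipeline: field names come from the mongoexport example;
// "report_tmp" is an assumed name for the temporary output collection.
var pipeline = [
  { $match: { time: { $gt: 1438275600000 }, status: { $ne: "Cancelled" } } },
  { $project: { _id: 0, erpNum: 1, orderId: 1, time: 1, status: 1 } },
  { $out: "report_tmp" } // $out has to be the last stage of the pipeline
];

// In the mongo shell, run it against the source collection, e.g.:
//   db.collection.aggregate(pipeline);
// then export report_tmp with mongoexport and drop it afterwards.
```

Note that $out replaces the target collection if it already exists, so the temporary name should not clash with real data.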
Extending other answers:
I found @GEverding's answer most flexible. It also works with aggregation:
test_db.js
print("name,email");
db.users.aggregate([
{ $match: {} }
]).forEach(function(user) {
print(user.name+","+user.email);
});
Execute the following command to export results:
mongo test_db < ./test_db.js >> ./test_db.csv
Unfortunately, it adds additional text to the CSV file which requires processing the file before we can use it:
MongoDB shell version: 3.2.10
connecting to: test_db
But we can make the mongo shell stop printing those lines and only print what we have asked for by passing the --quiet flag:
mongo --quiet test_db < ./test_db.js >> ./test_db.csv
Note that in some setups a log line such as
2022-05-20T11:19:13.288+0200 I CONTROL [js] machdep.cpu.extfeatures unavailable
still appears as the first line even with --quiet.
Here is what you can try:
print("id,name,startDate")
cursor = db.<collection_name>.find();
while (cursor.hasNext()) {
jsonObject = cursor.next();
print(jsonObject._id.valueOf() + "," + jsonObject.name + ",\"" + jsonObject.startDate.toUTCString() +"\"")
}
Save that in a file, say "export.js". Run the following command:
mongo <host>/<dbname> -u <username> -p <password> export.js > out.csv
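The script above throws if a document has no date, because toUTCString() is called on undefined. Here is a minimal sketch of a per-cell formatter that tolerates missing values and writes dates in ISO 8601 form; formatCell is a hypothetical helper name, and the choice of ISO format is an assumption rather than part of the original answer:

```javascript
// Turn one value into CSV cell text. Missing values become an empty cell,
// dates become quoted ISO 8601 strings, everything else is stringified.
function formatCell(value) {
  if (value === null || value === undefined) {
    return "";
  }
  if (value instanceof Date) {
    return '"' + value.toISOString() + '"';
  }
  return String(value);
}

// In export.js the print line could then read, for example (startDate as
// named in the header line):
//   print([jsonObject._id.valueOf(), jsonObject.name, jsonObject.startDate]
//     .map(formatCell).join(","));
```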
Have a look at this for outputting from the mongo shell to a file. There is no support for outputting CSV from the mongo shell. You would have to write the JavaScript yourself or use one of the many converters available. Google "convert json to csv", for example.
Just weighing in here with a nice solution I have been using. This is similar to Lucky Soni's solution above in that it supports aggregation, but doesn't require hard coding of the field names.
cursor = db.<collection_name>.<my_query_with_aggregation>;
headerPrinted = false;
while (cursor.hasNext()) {
item = cursor.next();
if (!headerPrinted) {
print(Object.keys(item).join(','));
headerPrinted = true;
}
line = Object
.keys(item)
.map(function(prop) {
return '"' + item[prop] + '"';
})
.join(',');
print(line);
}
Save this as a .js file (in this case we'll call it example.js) and run it with the mongo command line like so:
mongo <database_name> example.js --quiet > example.csv
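The script above takes the header from the first document only, so the columns can drift if later documents lack a field or carry an extra one. The sketch below builds the header from the union of all keys instead. It assumes the results fit in memory as an array; toCsvLines is a hypothetical helper name.

```javascript
// Build CSV lines from an array of flat documents. The header is the union
// of all keys, in first-seen order, so documents with missing or extra
// fields still line up with the right columns.
function toCsvLines(docs) {
  var columns = [];
  docs.forEach(function (doc) {
    Object.keys(doc).forEach(function (key) {
      if (columns.indexOf(key) === -1) {
        columns.push(key);
      }
    });
  });

  var lines = [columns.join(",")];
  docs.forEach(function (doc) {
    var cells = columns.map(function (key) {
      var value = doc[key];
      var text = (value === null || value === undefined) ? "" : String(value);
      // Quote every cell and double any embedded quotes.
      return '"' + text.replace(/"/g, '""') + '"';
    });
    lines.push(cells.join(","));
  });
  return lines;
}

// In the mongo shell, assuming the query returns a cursor:
//   toCsvLines(cursor.toArray()).forEach(function (line) { print(line); });
```

The trade-off is that all documents are read before the first line is printed, which may not suit very large result sets.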
I use the following technique. It makes it easy to keep the column names in sync with the content:
var cursor = db.getCollection('Employees.Details').find({})
var header = []
var rows = []
var firstRow = true
cursor.forEach((doc) =>
{
var cells = []
if (firstRow) header.push("employee_number")
cells.push(doc.EmpNum.valueOf())
if (firstRow) header.push("name")
cells.push(doc.FullName.valueOf())
if (firstRow) header.push("dob")
cells.push(doc.DateOfBirth.valueOf())
var row = cells.join(',')
rows.push(row)
firstRow = false
})
print(header.join(','))
print(rows.join('\n'))
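A variation on the same idea is to declare each column once as a header plus an extractor function, so the two cannot get out of sync and the firstRow flag is not needed. This is only a sketch: the field names follow the script above, while headerLine and rowLine are hypothetical helper names.

```javascript
// Each column pairs a header with a function that extracts the cell value.
var columns = [
  { header: "employee_number", value: function (doc) { return doc.EmpNum.valueOf(); } },
  { header: "name", value: function (doc) { return doc.FullName.valueOf(); } },
  { header: "dob", value: function (doc) { return doc.DateOfBirth.valueOf(); } }
];

// Header line built from the column definitions.
function headerLine(cols) {
  return cols.map(function (col) { return col.header; }).join(",");
}

// One data line for a single document.
function rowLine(cols, doc) {
  return cols.map(function (col) { return String(col.value(doc)); }).join(",");
}

// mongo shell usage:
//   print(headerLine(columns));
//   cursor.forEach(function (doc) { print(rowLine(columns, doc)); });
```

With this layout the header is printed even when the query returns no documents, and rows are printed as they are read instead of being collected first.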
When executing a script on a remote server, Mongo will add its own logging output, which we might want to omit from our file. The --quiet option only disables connection-related logs, not all mongo logs. In that case we might need to filter out the unneeded lines manually. A Windows-based example:
mongo dbname --username userName --password password --host replicaset/ip:port --quiet printDataToCsv.js | findstr /v "NETWORK" > data.csv
This pipes the script output through findstr to filter out any lines that contain the string NETWORK. More information on findstr: https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/findstr
A Linux version of this would use grep.
use <database> to the top of the script
test in the last command is the name of the database; just replace it with the name of your database.