Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

Jason McIver


A Javascript developer excited about Meteor, Mongo, React and Node.

Basic MongoDB queries

Intro

This is a bit of a cheatsheet and collection of common MongoDB queries with examples. I've selected a document structure that might be used to describe a bunch of movies. It contains strings, numbers, dates and arrays as both values and nested docs.

Database structure

Movies = [{  
    _id: ObjectId("507f191e810c19728aa120bf"),
    title: 'Super Troopers',
    director: 'Jay Chandrasekhar',
    writers: [    'Jay Chandrasekhar',
                'Kevin Heffernan',
                'Steve Lemme',
                'Paul Soter',
                'Erik Stolhanske'
    ],
    cast: [
        {
            name: 'André Vippolis',
            character: 'College Boy 1',
            gender: 'm'
        },
        {
            name: 'Joey Kern',
            character: 'College Boy 2',
            gender: 'm'
        },
        {
            name: 'Geoffrey Arend',
            character: 'College Boy 3',
            gender: 'm'
        },
        {
            name: 'Erik Stolhanske',
            character: 'Rabbit',
            gender: 'm'
        },
        {
            name: 'Jay Chandrasekhar',
            character: 'Thorny',
            gender: 'm'
        },
        {
            name: 'Steve Lemme',
            character: 'Mac',
            gender: 'm'
        },
        {
            name: 'Kevin Heffernan',
            character: 'Farva',
            gender: 'm'
        },
        {
            name: 'Paul Soter',
            character: 'Foster',
            gender: 'm'
        },
        {
            name: 'Camille Hickman',
            character: 'Thin Queen Bartender',
            gender: 'f'
        },
        {
            name: 'Marisa Coughlan',
            character: 'Ursula',
            gender: 'f'
        },
        {
            name: 'Aria Alpert',
            character: 'Waitress',
            gender: 'f'
        }
    ],
    genres: ['comedy', 'crime', 'mystery'],
    boxOffice: {
        budget: 1200000,
        openingWeekend: 7149203,
           releaseDate: ISODate('2002-02-14T13:00:00.000Z')
    }
}]

Data copied from IMDB

Create

Create document by inserting JSON object as the first parameter in insert().

db.movies.insert({  
    title: 'Super Troopers 2',
    director: 'Jay Chandrasekhar',
    writers: [    'Jay Chandrasekhar',
                'Kevin Heffernan',
                'Steve Lemme',
                'Paul Soter',
                'Erik Stolhansk']
})

If no _id is provided MongoDB will create one with an ObjectId() that consists of:

  • seconds since epoch
  • machine id
  • process id
  • a random number

If you don't like the ObjectId() include whatever unique value you like. It would be perfectly fine to use something like the users email as a unique _id. I find that a little unnerving and would prefer to set the _id to something the user has no control over and will never change.

The ObjectId() can be converted back to time stamp

ObjectId("507f191e810c19728aa120bf").getTimestamp()  

Technically you dont need a 'created' date key as you could use the _id but I generally embed createdAt and modifiedAt keys as ISODates to keep things simple for date range queries.

Find

Return all results

db.movies.find({'director': 'Jay Chandrasekhar'})  

Return only the first result.

db.movies.findOne({title:'Super Troopers'})  

Query Operators and Modifiers

find() takes an object to match in the dataset, but you can pass into the value another object proceeded by an Operator

db.movies.find({'boxOffice.budget': {$gte: 1000000}})  

find all movies with a budget over $1,000,000

surrounding field/key names with quotes are optional unless you are specifying a path to a subdocument.

$gte means any value Greater Than or Equal 1000000. There are other operators:

  • $gt greater than
  • $lt less than
  • $gte greater than or equal to
  • $lte less than or equal to
  • $eq equal to
  • $in in provided array
  • $nin Not in provided array

Greater and less than can also be used with dates. We can find all movies in the year of 2002 with.

db.movies.find({  
    $and: [
        {'boxOffice.releaseDate': {$gte: ISODate('2002-01-01T00:00:00.000Z')}},
        {'boxOffice.releaseDate': {$lte: ISODate('2002-12-31T23:59:59.999Z')}}
        ]
    })
Equal to

Can be used against any value including array values.

db.movies.find({genres: {$eq: 'comedy'}})  
Not Equal to

Inverse of $eq means !==

db.movies.find({director: {$ne: 'Steven Allan Spielberg'}})  
In

If you have an array of possible matches. Returns all movies that contain a genre of at least one of these matching genres.

db.movies.find({genres: {$in: ['action','comedy','scifi']}})  
db.movies.find({  
    writers: {
        $in:[   'Jay Chandrasekhar',
                'Kevin Heffernan',
                'Steve Lemme',
                'Paul Soter',
                'Erik Stolhansk'
        ]}
})

Finds all movies written by any of the Broken Lizard teams members

Nin - Not in

Reverse of $in, anything not in array

db.movies.find({genres: {$nin: ['action','romance','scifi']}})  
AND logic operator

Mongo implicitly uses AND with comma seperated queries.

db.movies.find({  
        director: 'Jay Chandrasekhar',
        'boxOffice.openingWeekend': {$gt: 5000000}
})

Which should retreive all movies directed by Jay Chandrasekhar AND grossing over $5,000,000 on the first opening weekend.
But if you need to peform multiple quries on the same field/key you will need to use $and

db.movies.find({  
        $and:[
            {director: {$ne: 'Steven Allan Spielberg'}},
            {director: {$eq: 'Jay Chandrasekhar'}}
        ]
})

$and takes an array of quries where all need to be a match

db.movies.find({  
        $and: [
            {writers: {$eq: 'Jay Chandrasekhar'}},
                {writers: {$eq: 'Kevin Heffernan'}},
                {writers: {$eq: 'Steve Lemme'}},
                {writers: {$eq: 'Paul Soter'}},
                {writers: {$eq: 'Erik Stolhanske'}}
        ]
})

Find all movies where all the members of Broken Lizard are writers.
But there is a better way with $all

ALL logic operator

$all is a bit like $and, the entire array must be present.

db.movies.find({  
            $and: [
              {'writers': {
                $all:['Jay Chandrasekhar',
                  'Kevin Heffernan',
                  'Steve Lemme',
                  'Paul Soter',
                  'Erik Stolhanske']
                }},
              {'cast.name': {
                $all:['Jay Chandrasekhar',
                  'Kevin Heffernan',
                  'Steve Lemme',
                  'Paul Soter',
                  'Erik Stolhanske']
              }}
            ]
})

Where all Broken Lizard members are both writers and cast.

OR logic operator

Same as $and but only one condition must be a match to be included in the result set.

db.movies.find({  
        $or:[
            {director: {$ne: 'Jay Chandrasekhar'}},
            {writers: {$eq: 'Jay Chandrasekhar'}},
            {'cast.name': {$eq: 'Jay Chandrasekhar'}}
        ]
})

Finds any movies where Jay is the Director OR Writer

Here $and is not required but I find it is easier to read and extend later

db.movies.find({  
        $and:[
            {'boxOffice.budget': {$lte: 100000}},
            {'boxOffice.openingWeekend': {$gte: 1000000}}
        ]
})

All movies with a budget less than $100,000 and grossing over $1,000,000 on opening weekend.

$and and $or can be combined

db.movies.find({  
        $and: [
            {$or: [{
              writers: {
                $all:['Jay Chandrasekhar',
                  'Kevin Heffernan',
                  'Steve Lemme',
                  'Paul Soter',
                  'Erik Stolhansk']
                }
              },
              {
              'cast.name': {
                $all:['Jay Chandrasekhar',
                  'Kevin Heffernan',
                  'Steve Lemme',
                  'Paul Soter',
                  'Erik Stolhanske']
                }
              }]
              },
              {genres: {$nin: ['comedy']}}
        ]
})

Looks for any movies where the whole Broken Lizard team are writers or cast and also not a comedy. I don't think we'll see any results here :)

Find search LIKE

In the last examples we've been searching for exact matches or less or greater than but sometimes we need more flexability by providing partial values. Mongo can handle Regular Expression for matching strings in two ways:
Pattern syntax:

db.movies.find({'title': /Trooper/})  

$regex operator

db.movies.find({'title': {$regex: /Trooper/}})  

There are some differences and limitations between the two. You cannot use $regex in $in query. For that you would use /pattern/

db.movies.find({ director: { $in: [ /son/i, /^mc/i, /^mac/i ] } }  

Will return any movies that were directed by people with names ending son, or starting with Mc or Mac.

Search from the begining of the string use the carrot.

db.movies.find({'title': {$regex: /^Super/}})  

Search with case insensitive

db.movies.find({'title': {$regex: /^super/i}})  

Matches Super only at the start of the string, case insensitive.

For long text values that have \n (new lines) use m to search each line

db.movies.find({'plot': {$regex: /Spurbury/i, $options: 'm'}})  

Looks for a match of Spurbury on each line in the plot.

Projection

find() accepts a 2nd parameter that allows specific control over which fields are returned.

db.users.find({_id: '4y8t93y448t0'},{username: 1, password:false})  

This is how we trim down details for security and/or thinning your dataset down saving bandwidth. Generally it's better practice to only return fields that you need.

db.movies.find({  
    $or: [
        {title: /jay/i},
        {director: /jay/i}
    ]
}, {_id: 0, title: 1, director:1})

This might work well as an autocompletion query where the user has searched 'jay'. It searches where titles or directors contain jay as case insensitive. The user will recieve a list of movies slimmed down to just the title and director.

Only the fields contained in the projection object marked true will be returned with the addition of _id. It needs to be explicitly removed if not required.

Projections are also used in aggregation queries by creating smaller datasets to perform additional queries on.

Sort, Limit, Skip

Are kind of like resultset methods and are executed after the query has ran but before it is returned.

Sort takes an object of the key and 1/-1.
1 for Ascending and -1 for Descending

db.movies.find({genres: 'comedy'}).sort('boxOffice.budget': 1)  

finds all comedy movies and sorts it by budget in Ascending order

We can also chain them

db.movies.find({genres: 'comedy'}).sort('boxOffice.budget': 1).limit(5)  

The same as above but only the first 5 Movies

And if you need something for Pagination use Skip() and Limit()

db.movies.find({genres: 'comedy'}).sort('boxOffice.budget': 1).skip(50).limit(25)  

This would give you 25 results per page skipping the first two pages.

If you wish to use a combination of these modifiers it makes sense to use them in the above order or you may have some bizarre results. You wouldn't want to limit before skipping or you'll end up with no results.

Update

The first paramater is the find query where the update with target. The second paramater

db.movies.update(  
    {title: 'Super Troopers'},           // find/select the doc
    {$set: {                             // make changes
        title: 'Super Troopers 2',       // set title
        },
        $currentDate: { modified: true } // update modified date
    }
)

Finds the Movie Super Troopers, changes the title to Super Troopers 2 and updates the modified date to the current time. Although is bad practice and should have been done by searching by _id. If there were multiple matches update() will only apply to the first match found.

Remember to use $set when updating a field. It is very easy to accidently replace the entire document otherwise.

db.movies.update(  
    {_id: ObjectId("507f191e810c19728aa120bf")},
    {$inc: {
        'boxOffice.openingWeekend: 100
        }
    }
)

Increments the opening weekend reported earnings by $100. $inc can also accept a negative number for subtracting.

db.movies.update(  
    {title: 'Super Troopers'},
    {$push: 
        {'cast': {
            'name': 'Little Billy',
            'character': 'Background kid 13',
            'gender': 'm'
            }
        }
    }
)    

Pushes a whole object into the cast array.

Upsert

Will create a new document if it doesn't already exist otherwise will update the existing one. And is the 3rd parameter of find().

I can imagine this being a fast way to dump data from IoT devices where doing a find first maybe too slow on a dataset with billions of entries.

db.drills.update(  
    {'drill_id': 'y43787t4t834y9u92r0'},
    {$set: 
        {
        'operator_id': '8yt4378ty70340',
        'temperature': 149
        },
        $currentDate: { modified: true }
    },
    {true} // enable upsert
)

Everytime this drill is used it sets the operators ID and its temperature. It doesn't matter how many drills there are or when new ones are added.

db.fasteners.update(  
    {'_id': 'jf9ewrg9w845h0f9k3'},
    {$push: 'tightens': {
                'torqueSet': 59,
                'torqueApplied': 32,
                'drill_id': 'y43787t4t834y9u92r0',
                'operator_id': '8yt4378ty70340',
                'createdAt': new ISODate()
                }
    }
    {true} // enable upsert
)

Why not barcode every screw/rivet/nail when build a plane. A tiny QR code on the head could be scanned when loaded into the drill and each time the torque clutch slips it creates a new screw document, if the operator re-tightens the screw further data will be saved on the same screw.
This would open a world of statistics on drill usage, clutch wear and whether operators need to be retrained. Not to mention history on how the plane was constructed and accountability.
This is something Bosch are already doing.

Multi

In the examples above changes would only be applied to the first query match. To apply changes to every match there is 4th parameter required.

Maybe Jay decided that all his movies were also actions movies now. This will add the action genre to all his movies.
$push will add an element to an array:

db.movies.update(  
    {director: 'Jay Chandrasekhar'},
    {$push: {'genres': 'action'}},
    {false}, // we don't necessarily need upsert
    {true}   // update multiple matches
)    

But then whole Broken Lizard team decide when together their movies are not action.
$pull will remove an element from an array:

db.movies.update(  
      {'writers': {
          $all:['Jay Chandrasekhar',
            'Kevin Heffernan',
            'Steve Lemme',
            'Paul Soter',
            'Erik Stolhanske']
      }},
      {$pull: {genres: 'action'}},
      {false},
      {true}
)

Now we're back to normal with our genres, unless there are any movies that were directed by Jay without the Broken Lizard team.

Delete

remove() is pretty straight forward:

db.movies.remove({director: 'Jay Chandrasekhar'})  

Removes all movies that were directed by Jay Chandrasekhar.

To limit deletions to 1 you can pass in true as the second parameter

db.movies.remove({director: 'Jay Chandrasekhar'},1)  

Will stop after deleting one.