Query speed in Mongo DB.

 

First we need to find out how much time a query takes.

For that we need a larger data set, otherwise we get 0.

db[‘primer-dataset’].find({}).explain(“executionStats”)

will do this.

The most important of the many results appears to be:

 

executionTimeMillis

 

I got 205.0 for the above query.

COLLSCAN means that the whole collection (“table”) had to be searched to find the result.

That is not surprising, because we asked for the whole collection.

 

So let’s try to look only for restaurants in Manhattan.

db[‘primer-dataset’].find({borough: “Manhattan”}).explain(“executionStats”)

 

we still get COLLSCAN and a run time for 128 milliseconds

 

Now we create an index on borough:

db[‘primer-dataset’].createIndex( { borough: 1 });

And we rerun the same query:

db[‘primer-dataset’].find({borough: “Manhattan”}).explain(“executionStats”)

This time we get a runtime of

25 milliseconds

There is no more COLLSCAN.  There is an IXSCAN.

Now there are only five boroughs.

Let’s see what happens if we look for Italian restaurants with and without index.

db[‘primer-dataset’].find({cuisine: “Italian”}).explain(“executionStats”)

 

Runtime is 132 milliseconds

db[‘primer-dataset’].createIndex( { cuisine: 1 });

Now rerun

db[‘primer-dataset’].find({cuisine: “Italian”}).explain(“executionStats”)

It takes 4 milliseconds

Now let’s look for both borough and cuisine without index.

So I have to drop the indexes.

Except… notice I did not assign a name to the indexes, the system did.

So I have to retrieve the names of the indexes first.

db[‘primer-dataset’].getIndexes();

I find that the names of the two indexes are

borough_1

and

cuisine_1

So I drop them now.

db[‘primer-dataset’].dropIndex(“borough_1”);

db[‘primer-dataset’].dropIndex(“cuisine_1”);

I check again and see they are gone:

db[‘primer-dataset’].getIndexes();

Now I do a combined query for cuisine and borough.

db[‘primer-dataset’].find({cuisine: “Italian”, borough: “Manhattan”}).explain(“executionStats”)

(Note: When I used single quotes for a value it worked as well.)

execution time 215 milliseconds

 

Now I will define a combined index.

Defining a combined index is order-dependent.

 

db[‘primer-dataset’].createIndex( { cuisine: 1, borough: 1 });

or

db[‘primer-dataset’].createIndex( { borough: 1, cuisine: 1 });

I need to run both of those.

db[‘primer-dataset’].find({cuisine: “Italian”, borough: “Manhattan”}).explain(“executionStats”)

execution time: 3ms

keys examined: 621

docs examined: 621

 

db[‘primer-dataset’].getIndexes();

db[‘primer-dataset’].dropIndex(“borough_1_cuisine_1”);

After dropping the index,

execution time: 129 ms

keys examined: 0

docs examined: 95359

Now I will create the other index version.

db[‘primer-dataset’].createIndex( { cuisine: 1, borough: 1 });

db[‘primer-dataset’].find({cuisine: “Italian”, borough: “Manhattan”}).explain(“executionStats”)

Result is the same as with the other index.

execution time: 2ms

keys examined: 621

docs examined: 621

The 2ms versus 3ms seems to be a rounding error.

So I was able to show the effect of the combined index, but I was not able to show the order effect.

Material is based on:

https://docs.mongodb.com/manual/tutorial/analyze-query-plan/