Part 1 of this series ended with the conclusion that performing a text search on listings within a specific area is problematic, because you cannot use two indexes simultaneously. In this second part, I will discuss why this is the case in document-based databases like Mongo. Part 3 will go over the possible solutions that follow from this.
Why can you not use both indexes simultaneously?
To understand why you cannot use both indexes simultaneously, you first have to understand how databases based on the document model work, and what indexes really are in such an environment.
Mongo (like other non-relational/NoSQL databases of its kind) is based on the document model. This makes it fundamentally different from relational database management systems such as MySQL and Postgres, and more similar to most real-life filing systems.
The document model is a key design choice for Mongo as a database system, and it comes with its pros and cons.
Pros:
- Simple setup: this increases the speed of development
- Fast: certain operations are faster and more efficient, because all relevant information is stored in one place
- Flexible: it is easy to add features at a later stage
- Scalable: supports horizontal partitions (shards) out of the box
- Geospatial: location features for geospatial queries such as $geoNear
Cons:
- Redundancy: the document model sometimes forces you to store redundant information
- Inefficient: the document model makes certain operations slow and inefficient, for example when information that is stored redundantly in many documents needs to be updated
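The trade-off between "fast" and "redundant" in the list above can be sketched in a few lines of Python. This is only an illustration with plain dictionaries standing in for documents; the listings, field names and helper functions are all hypothetical and not taken from the actual project.

```python
# Hypothetical listing documents: everything needed to display a listing
# is stored in one place, including a copy of the agent's contact details.
listings = [
    {
        "_id": 1,
        "title": "Sunny two-bedroom apartment",
        "agent": {"name": "Jane Doe", "phone": "555-0100"},  # stored redundantly
    },
    {
        "_id": 2,
        "title": "Canal-side studio",
        "agent": {"name": "Jane Doe", "phone": "555-0100"},  # same agent, copied again
    },
]


def show_listing(listing_id):
    """Fast case: one document holds the whole listing, so no joins are needed."""
    for doc in listings:
        if doc["_id"] == listing_id:
            agent = doc["agent"]
            return f'{doc["title"]} (contact: {agent["name"]}, {agent["phone"]})'
    return None


def update_agent_phone(name, new_phone):
    """Slow case: every document that holds a copy of the agent must be rewritten."""
    touched = 0
    for doc in listings:
        if doc["agent"]["name"] == name:
            doc["agent"]["phone"] = new_phone
            touched += 1
    return touched
```

Reading a listing touches a single document, while changing one phone number touches every listing that embeds it. With two listings that is trivial; with many thousands it is the kind of operation the "Inefficient" point refers to.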
Database systems based on the document model are definitely not always the right choice. If your use case is full of different sorts of entities that have all kinds of different relationships to each other, a relational database management system such as MySQL will probably still be the right choice.
In our use case, however, the data can simply be structured in a way similar to a human filing system. That, together with the flexibility, scalability and speed of development, definitely made Mongo the right choice for our project.
Now that you understand how document-based databases like Mongo work, it's time to discuss what an index really is. And it is helpful, in this regard, to think of the index in a book, especially considering the document-like nature of databases like Mongo.
Whether it be in a book or in a database, the goal of an index is to make it quicker to find something. An index in a book makes sure that you don't have to browse through the whole book to find that specific section you're looking for. In the same way, databases use indexes to prevent having to search through the entire database to find a small number of documents.
For our purposes, imagine that you have a pretty comprehensive management book with well over a thousand pages, divided up into 101 chapters, and that you're looking for a few sections in the book about the hiring and firing process.
Luckily, the book features two sorts of indexes. In the front of the book, there is a table of contents. This is a very simple index: a document listing chapter titles and their corresponding page numbers.
In the back of the book there is also a keyword index. This is a document listing keywords and corresponding page numbers.
Now, to find the section you are looking for you could do a few things.
Firstly, you could browse through the entire book and see if you find the sections you are looking for. This is obviously the least efficient and the most time-consuming option.
Secondly, you could look through the table of contents to see if any chapter titles mention something about dealing with employees, and then browse through those chapters to find what you're looking for. This would be far more efficient.
Thirdly, you could look up the words hiring and firing in the keyword index, and go straight to the pages where those words are mentioned. This, too, would be far more efficient than just browsing through the book.
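The three options above can be sketched as a toy model in Python. The book contents, chapter titles and function names are made up for illustration, and a real database index is a far more sophisticated structure than a dictionary; the point is only to show how much of the book each approach has to read.

```python
# A tiny hypothetical "book": page number -> text on that page.
pages = {
    1: "introduction to management",
    2: "setting goals for your team",
    3: "hiring the right people",
    4: "onboarding new employees",
    5: "firing with respect",
    6: "budgeting basics",
}

# Table of contents: chapter title -> pages in that chapter.
table_of_contents = {
    "Getting started": [1, 2],
    "Dealing with employees": [3, 4, 5],
    "Finance": [6],
}

# Keyword index: keyword -> pages mentioning it (built once, up front).
keyword_index = {}
for number, text in pages.items():
    for word in text.split():
        keyword_index.setdefault(word, []).append(number)


def full_scan(words):
    """Option 1: read every page. Returns (matching pages, pages read)."""
    hits = [n for n, text in pages.items() if any(w in text.split() for w in words)]
    return hits, len(pages)


def via_table_of_contents(chapter, words):
    """Option 2: jump to one chapter, then read only the pages in it."""
    candidates = table_of_contents[chapter]
    hits = [n for n in candidates if any(w in pages[n].split() for w in words)]
    return hits, len(candidates)


def via_keyword_index(words):
    """Option 3: look the words up directly; no pages are read to locate them."""
    hits = sorted({n for w in words for n in keyword_index.get(w, [])})
    return hits, 0
```

All three functions find the same pages for `["hiring", "firing"]`, but the full scan reads all six pages, the table of contents narrows that to three, and the keyword index reads none before it knows where to look.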
What you would almost certainly not do, though, is first look through the table of contents to find chapters that deal with employees; then also go through the keyword index in the back of the book to find which pages mention the words hiring and firing; and then combine the data you found, by cross-referencing the page numbers with the chapters, to determine which pages you're actually going to look at.
In other words, you would not use two indexes simultaneously, because it would be so grossly inefficient as to defeat the entire purpose of using that second index after the first one.
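To make the cross-referencing step concrete, here is a minimal sketch of what consulting both indexes and combining the results would involve. The page numbers and names are invented, and the count of entries read is only a rough stand-in for effort; it says nothing about the actual costs inside Mongo.

```python
# Hypothetical index entries for the management book, as page numbers.
chapter_pages = {"Dealing with employees": list(range(300, 340))}
keyword_pages = {
    "hiring": [12, 301, 305, 870],
    "firing": [44, 310, 905],
}


def cross_reference(chapter, words):
    """Consult both indexes in full, then intersect their page lists.

    Returns (pages found in both indexes, index entries that had to be read).
    """
    from_chapters = set(chapter_pages[chapter])
    from_keywords = {p for w in words for p in keyword_pages.get(w, [])}
    entries_read = len(from_chapters) + len(from_keywords)
    return sorted(from_chapters & from_keywords), entries_read
```

In this toy example, both complete lists of page numbers have to be collected and compared before a single page of the book is opened, which is the extra bookkeeping the analogy describes.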
So there is no way to combine $geoNear and $text in a single query: each requires its own index to work, and the two cannot be used at the same time. But no coder would be worth their salt if they simply accepted the challenges they faced and gave up. So I'm considering a few possible solutions, which I will discuss further in part 3 of this series.
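For reference, this is roughly the shape of the aggregation pipeline the conclusion rules out, written as plain Python dictionaries. The coordinates, distance and search string are assumptions for illustration, nothing here is sent to a database, and the helper function is a hypothetical check, not part of any Mongo driver. It relies on one documented restriction: a $geoNear stage and a $match stage containing $text are each only accepted as the first stage of a pipeline.

```python
# Hypothetical geospatial stage: listings within 5 km of a point.
geo_stage = {
    "$geoNear": {
        "near": {"type": "Point", "coordinates": [4.8952, 52.3702]},
        "distanceField": "distance",
        "maxDistance": 5000,
        "spherical": True,
    }
}

# Hypothetical text stage: listings whose text index matches the search terms.
text_stage = {"$match": {"$text": {"$search": "garden balcony"}}}

pipeline = [geo_stage, text_stage]


def stages_needing_first_position(stages):
    """Return the positions of stages that are only valid as the first stage."""

    def must_be_first(stage):
        return "$geoNear" in stage or "$text" in stage.get("$match", {})

    return [i for i, stage in enumerate(stages) if must_be_first(stage)]
```

Calling `stages_needing_first_position(pipeline)` reports both positions, 0 and 1: two stages each insist on going first, and whichever order you choose, one of them ends up in a position where it is not allowed.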