Geospatial features have proven over the years to significantly increase user engagement. In fact, we have grown so used to them that we now expect any kind of recommendation (news, restaurants, products) to be location-based. Most applications today rely mainly on radius-based queries, which can lead to imprecise or sub-optimal results. In a world where companies compete for the user’s attention, precision can be a key differentiator, which is why polygon-based search is becoming increasingly popular.

In this article, we will talk about polygon-based search in Couchbase, a feature that has been available since Couchbase 6.6.

 

What is Geospatial Polygon Search

A geospatial query specifies an area and returns each document that contains a reference to a location within that area. Areas and locations are represented by means of latitude-longitude coordinate pairs.

There are many flavors of geospatial query. Radius-based queries (also known as Point Distance) are the most commonly used and also the easiest to start with. All you need is a coordinate and the radius of the circle:

radius based search
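As a rough illustration, a radius query body for the Search service can be assembled like this. This is a minimal sketch: the helper name radius_query and the coordinates are hypothetical, and the key names (location, distance, field) follow the Full-Text Search query JSON as commonly documented, so check them against your server version.

```python
import json


def radius_query(lat, lon, distance, field="geo"):
    """Build a point-distance (radius) query body.

    `distance` is a string with a unit suffix, e.g. "10mi" or "5km".
    `field` is the indexed geopoint field (assumed to be "geo" here).
    """
    return {
        "query": {
            "location": {"lat": lat, "lon": lon},
            "distance": distance,
            "field": field,
        }
    }


# Illustrative coordinates only; they are not taken from the article's dataset.
body = radius_query(37.7749, -122.4194, "10mi")
print(json.dumps(body, indent=2))
```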

 

In a bounding box search, you specify two latitude-longitude coordinate pairs instead. These are taken to be the top-left and bottom-right corners of a rectangle, respectively. Documents are returned if they reference a location within the area of the rectangle:

rectangle geospatial search with Couchbase NoSQL
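A bounding box query can be sketched the same way. The helper name bounding_box_query and the coordinates below are hypothetical, and the key names (top_left, bottom_right, field) are assumed from the Full-Text Search query JSON.

```python
import json


def bounding_box_query(top_left, bottom_right, field="geo"):
    """Build a bounding-box query body from two (lat, lon) corner tuples."""
    tl_lat, tl_lon = top_left
    br_lat, br_lon = bottom_right
    # The top-left corner must not lie south of the bottom-right corner.
    # Longitudes are not compared because a box may cross the antimeridian.
    if tl_lat < br_lat:
        raise ValueError("top_left must not be south of bottom_right")
    return {
        "query": {
            "top_left": {"lat": tl_lat, "lon": tl_lon},
            "bottom_right": {"lat": br_lat, "lon": br_lon},
            "field": field,
        }
    }


# Illustrative corners only; they are not taken from the article's dataset.
body = bounding_box_query((38.0, -123.0), (37.0, -122.0))
print(json.dumps(body, indent=2))
```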

The two methods above are great when approximate results are good enough, but they fall short when you need to limit your search to a precise area:

polygon search building

Geospatial Geometry/Polygon search allows you to look for documents located inside a closed polygon-shaped area defined by a sequence of coordinates. In the image above, for instance, we want to limit our search to documents whose coordinates fall inside the polygon we have defined, which represents a building in this case.

There is virtually no limit on the number of coordinates/points that can be specified in your polygon query but, as with any other search engine out there, performance will naturally degrade if you specify very complex polygons.
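To make the idea of "inside a polygon" concrete, here is a minimal ray-casting sketch. It is not how Couchbase evaluates the query internally; it treats coordinates as points on a flat plane and only illustrates why the cost grows with the number of polygon edges. The function name and sample coordinates are hypothetical.

```python
def point_in_polygon(lat, lon, polygon):
    """Return True if (lat, lon) lies inside `polygon`.

    `polygon` is a list of (lat, lon) tuples; it may or may not repeat
    the first point at the end. Uses the ray-casting (even-odd) rule.
    """
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        lat_i, lon_i = polygon[i]
        lat_j, lon_j = polygon[j]
        # Does the edge (j -> i) straddle the point's latitude?
        if (lat_i > lat) != (lat_j > lat):
            # Longitude at which the edge crosses that latitude.
            lon_cross = lon_i + (lat - lat_i) * (lon_j - lon_i) / (lat_j - lat_i)
            if lon < lon_cross:
                inside = not inside
        j = i
    return inside


# A closed square: the first and last points are the same.
square = [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)]
print(point_in_polygon(5, 5, square))
print(point_in_polygon(15, 5, square))
```

Each point is checked against every edge, so a polygon with thousands of vertices costs proportionally more per candidate document.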

Geospatial polygon-based use case

Polygon-based search (also referred to as Geometry search) is not restricted to people analyzing satellite images. There are many common use cases that could benefit from it:

  • Real Estate: Find offices/houses in a given village or in a specific area
  • Gaming: Spawn items in specific areas (e.g. different types of Pokémon appearing according to the terrain in Pokémon Go)
  • Analytics: How many people passed through a specific region (e.g. how many Uber/Lyft drivers have been at the airport on a given day)
  • Advertisement: Increase the CPC (Cost Per Click) when users are in a specific place (e.g. a shopping mall)
  • Smart Cities: Notify citizens of a region about a potential threat (e.g. hailstorm, flooding)

Geospatial Search Growth

We can indirectly demonstrate how popular location-based search has become by comparing the percentage of apps requesting permission to access the user’s location over the years:

source: https://hotforsecurity.bitdefender.com/blog/1-2-percent-of-google-play-store-is-thief-ware-study-shows-7340.html

 

According to the graph above, back in 2013 only 11.68% of mobile apps requested access to the user’s location. This other report suggests that in 2014 nearly 24% of apps requested the user’s location:

mobile_app_permissions_2014

source: https://www.statista.com/statistics/486440/leading-google-play-app-permissions/

If we fast forward to 2020, this third report already suggests that 95% of the apps in China are requesting access to the user’s location:

permissions_mobile_app_2020

source: https://www.statista.com/statistics/1111353/china-most-common-mobile-social-app-permissions-by-category/

 

Creating Geospatial indexes 

For this demo, you will need this small dataset of earthquakes in the US. You can quickly load it into Couchbase by creating a bucket called earthquakes, clicking on Documents -> Import Documents, selecting the earthquake.json file, and then clicking on Import Data:

import dataset into Couchbase

Now let’s create our geo FTS index. First, go to the Search tab and click on “Add Index”. Then specify the following configuration:

    • Name: earthquake_idx
    • Bucket: earthquakes
    • Type Mappings:
      • Uncheck default
      • Add a new type mapping called earthquake
        • Insert a child field: 
          • Field: geo
          • Type: geopoint
          • Searchable as: geo

index creation

Once the Index Progress reaches 100% we are ready to make our first polygon search.

 

Polygon / Geometric Search in Action

The search has two main requirements: fields with coordinates must be indexed using the geopoint type (as we did in the previous section), and the coordinates must form a closed polygon, which means that the first and the last coordinates must be the same. Here is an example of a valid polygon:

If we plot these coordinates on a map, this is what we end up with:

map coordinates

Source: https://www.keene.edu/campus/maps/tool/

 

Supported Coordinates Formats

The following formats for polygon coordinates are accepted:

  • Single Array: [ “lat, lon”, “lat, lon”, “lat, lon”, …]
  • Multiple Arrays: [ [ lon, lat],  [ lon, lat], … ]
  • GeoJson: [ { “lat”: 1, “lon”: 1},  { “lat”: 1, “lon”: 1}, … ]
  • Geohash: [ “9q8zjbkp”, “9q8yvvdh”, “9q8yyp1e” ]
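The formats above differ in the order of latitude and longitude, which is easy to get wrong. The sketch below uses hypothetical helper names and illustrative coordinates: it closes a ring by repeating the first point, then renders the same points in the first three formats. Geohash encoding is left out because it needs a dedicated library.

```python
def close_ring(points):
    """Return a copy of `points` whose last point equals its first."""
    pts = list(points)
    if len(pts) < 3:
        raise ValueError("a polygon needs at least three distinct points")
    if pts[0] != pts[-1]:
        pts.append(pts[0])
    return pts


def as_strings(points):
    """Single array of "lat,lon" strings (latitude first)."""
    return [f"{lat},{lon}" for lat, lon in points]


def as_arrays(points):
    """Array of [lon, lat] pairs (longitude first)."""
    return [[lon, lat] for lat, lon in points]


def as_objects(points):
    """Array of {"lat": ..., "lon": ...} objects."""
    return [{"lat": lat, "lon": lon} for lat, lon in points]


# Illustrative (lat, lon) tuples only; not taken from the article's dataset.
ring = close_ring([(37.0, -122.0), (37.0, -121.0), (38.0, -121.5)])
print(as_strings(ring))
print(as_arrays(ring))
print(as_objects(ring))
```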

Geospatial Search using the REST API

Apart from using the native SDKs, you can also use the Full-Text Search REST API to make geospatial queries using the following format:

Here is a real example using our dataset and index:

and here is the output of the command above:

Note that you need to specify the coordinates using the attribute polygon_points. You can also expand this query to filter by other attributes of the document (e.g. region, magnitude, etc.).
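For readers who prefer a script over the command line, the same kind of request can be prepared in Python. This is a minimal sketch under stated assumptions: the Search service is reachable on its default port 8094, the endpoint is /api/index/{index}/query, and the host, credentials and coordinates are placeholders rather than values from the article. The request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_polygon_search(host, index, field, polygon_points, user, password):
    """Prepare (but do not send) a polygon search request."""
    url = f"http://{host}:8094/api/index/{index}/query"
    body = {"query": {"field": field, "polygon_points": polygon_points}}
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Placeholder polygon; the first and last points match so the ring is closed.
points = ["37.0,-122.0", "37.0,-121.0", "38.0,-121.5", "37.0,-122.0"]
req = build_polygon_search(
    "localhost", "earthquake_idx", "geo", points, "Administrator", "password"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the prepared request to urllib.request.urlopen(req) would send it to a running cluster and return the JSON response.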

 

Geospatial Ring / Donut Shaped Search 

You can also specify one or more holes in your polygon in case you want to filter some specific areas out:

hole-polygon

You can achieve something like the image above by using boolean queries:

 

In summary, you simply have to specify your polygon coordinates inside the “must” block and your holes inside the “must_not” block.

I highly recommend always using disjunctions when specifying your holes. Although you could still use a conjunction if you have a single hole, using a conjunction for multiple holes will potentially not filter your data properly, because it would only exclude documents whose coordinates fall inside all of the holes at once. If you have no idea what I’m talking about, check out this documentation on compound queries.
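The shape of such a boolean query can be sketched as follows. The helper name and the coordinates are hypothetical, and the key names (must/conjuncts, must_not/disjuncts) are assumed from the compound query syntax. The outer polygon goes into the conjunction, and every hole becomes one entry of the disjunction, so a document inside any hole is excluded.

```python
import json


def polygon_with_holes_query(outer, holes, field="geo"):
    """Build a boolean query: inside `outer` but not inside any of `holes`.

    `outer` and each item of `holes` are closed lists of "lat,lon" strings.
    """
    def polygon_clause(points):
        return {"field": field, "polygon_points": points}

    return {
        "query": {
            "must": {"conjuncts": [polygon_clause(outer)]},
            "must_not": {"disjuncts": [polygon_clause(h) for h in holes]},
        }
    }


# Illustrative coordinates only; they are not taken from the article's dataset.
outer = ["36.0,-123.0", "36.0,-120.0", "39.0,-120.0", "39.0,-123.0", "36.0,-123.0"]
hole_a = ["36.5,-122.5", "36.5,-122.0", "37.0,-122.0", "37.0,-122.5", "36.5,-122.5"]
hole_b = ["38.0,-121.0", "38.0,-120.5", "38.5,-120.5", "38.5,-121.0", "38.0,-121.0"]
print(json.dumps(polygon_with_holes_query(outer, [hole_a, hole_b]), indent=2))
```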

 

Further Reading

If you are interested in Geo Search, I highly recommend reading the official documentation. In case you are new to Full-Text Search, check out this video showing how to create a Netflix-like search using FTS.

We also have a series of articles on Couchbase’s blog talking about important aspects of Full-Text Search:

 

 

 

Author

Posted by Denis Rosa, Developer Advocate, Couchbase

Denis Rosa is a Developer Advocate for Couchbase and lives in Munich - Germany. He has a solid experience as a software engineer and speaks fluently Java, Python, Scala and Javascript. Denis likes to write about search, Big Data, AI, Microservices and everything else that would help developers to make a beautiful, faster, stable and scalable app.
