In this blog post, I plan on talking about a few foundational online mapping concepts that are essential for anyone doing map development. Most of these concepts are discussed in this Bing Maps Tile System article. But I feel like the article didn’t really take the time to explain all of them in detail, and beginner-level map developers might find them hard to understand. So the purpose of this blog post is to bridge that gap for beginner-level developers and “demystify” these concepts in plainer, layman’s terms. Maybe it will even help reinforce these concepts for more advanced readers.
Agenda
- Raster vs. Vector Maps
- Spherical Mercator Projection
- Ground Resolution
- Map Scale
- Pixel Coordinates
- Tile Coordinates (Leaflet/Mapbox Tile Loading)
- Concluding Remarks
Raster vs. Vector Maps
The most widely used type of online map is the raster map. A raster map can also be thought of as a tiled map, where the whole map is cut into image tiles of 256 x 256 pixels. We cut the map into tiles at different levels of detail (zoom levels) for quick retrieval and display at each zoom level. As you zoom to different levels or pan to different locations, more tiles are fetched from the server to give you the images for that location and zoom level. This raster map approach has real benefits: the user can pan and zoom easily and quickly, and the user’s map client can cache the tile images to optimize performance.
Raster maps are typically static, read-only images (e.g., .png files). You can’t really change them very much, except maybe for things like the map color. They are good for background or contextual maps such as satellite image maps. All the major map providers, such as Google, Bing Maps and Mapbox, use raster maps.
Vector maps use vector data for geographical features or shapes. These geographical features can be expressed by geometries such as points, lines or polygons, instead of static images. These geometries are associated with other data in the database to represent map objects such as fields, roads, buildings, etc. Because vector data are not static images, but precise geometries, you can have a visually smooth experience zooming on a map. You will also not lose the original resolution or form.
Vector maps generally allow more flexibility for user input and customization. But they are more costly and complex to produce initially. As an example, the Maps app on iOS uses a vector map for its Standard street view.
Spherical Mercator Projection
In order to make the mapping experience seamless to the user on a flat computer screen, we need some way to project the earth’s entire spherical surface onto a flat surface. The most common projection for online maps is the Spherical Mercator projection, used by almost all free and commercial tile providers. It assumes that the earth is a sphere, which it’s not. It’s more of an ellipsoid, or to be more accurate, a geoid.
This post is NOT about discussing all the technical details of the Spherical Mercator projection. You will find all kinds of info about it with a quick search on the Internet. What I will do here is talk about a few important characteristics of the Spherical Mercator projection, and then explain in detail a few core mapping concepts in the context of that projection. It should give you a good idea of what it is and make you comfortable talking about it as a map developer.
The first thing you need to understand is that the Spherical Mercator projection significantly distorts the earth’s actual scale and surface area. Well, why do we use this projection then if it’s so bad? That’s because it’s a model that’s relatively easy to understand and good enough for most mapping needs, plus it has the following 2 key benefits that outweigh the distortion. I will just directly quote from the Bing Maps Tile System page:
- It’s a conformal projection, which means that it preserves the shape of relatively small objects. This is especially important when showing aerial imagery, because we want to avoid distorting the shape of buildings. Square buildings should appear square, not rectangular.
- It’s a cylindrical projection, which means that north and south are always straight up and down, and west and east are always straight left and right.
So by using this projection, we can project the earth’s surface onto a square map.
No wonder this projection is so popular. Thinking and reasoning in my head about map objects on a perfectly flat square is a lot easier than doing it on a sphere. Or at least it is for most people. I work with some really smart people in my company who know a lot about mapping, and they will probably feel right at home talking about the Mercator projection or mapping in terms of a sphere.
Ground Resolution
The Bing Maps Tile System article defines “ground resolution” as such:
The ground resolution indicates the distance on the ground that’s represented by a single pixel in the map. For example, at a ground resolution of 10 meters/pixel, each pixel represents a ground distance of 10 meters. The ground resolution varies depending on the level of detail and the latitude at which it’s measured. Using an earth radius of 6378137 meters, the ground resolution (in meters per pixel) can be calculated as:
ground resolution = cos(latitude * pi/180) * earth circumference / map width
= (cos(latitude * pi/180) * 2 * pi * 6378137 meters) / (256 * 2 ^ level pixels)
If you have some math background in Geometry or Trigonometry (which I do), then this formula shouldn’t be too hard to understand. But if not, then this is probably not so straightforward to you, and some explanations will definitely help here.
Before we dig into the formula, let’s remind ourselves what we are trying to do here. We are trying to calculate the ground resolution in meters/pixel of any tile image displayed on the screen, using the Mercator projection. The Mercator model projects the sphere (assuming the earth is a perfect sphere) onto a flat square shape. The middle horizontal line represents the equator, which has a latitude of 0 degrees. The top and bottom borders of the square approximately represent 85.05 and -85.05 degrees latitude (instead of 90 and -90 degrees), respectively, due to the nature of the Mercator projection. The middle vertical line represents the Greenwich meridian, or a longitude of 0 degrees. The left and right borders of the square represent -180 and 180 degrees longitude, respectively.
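If you are wondering where the 85.05 comes from, here is a tiny sketch. It leans on an assumption this post doesn’t spell out: in the standard Spherical Mercator math, the map is cut off at the latitude where the projected map becomes exactly as tall as it is wide, which works out to atan(sinh(pi)). The variable name is just mine.

```python
import math

# Assumption: the map is cut off where the Mercator y value reaches pi,
# which makes the projected map exactly as tall as it is wide (a square).
max_latitude = math.degrees(math.atan(math.sinh(math.pi)))

print(round(max_latitude, 4))  # 85.0511
```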
As you can see, this projection significantly distorts the scale and surface area of the earth, esp. near the poles. So the closer you are to the equator, the more accurate the projection. On the other hand, the closer you are to the poles, the less accurate the projection.
Here’s another way to think about this. The Mercator projection is a cylindrical projection, which means that north and south are always straight up and down, and west and east are always straight left and right. For a ring (Ring A) at a latitude near the north pole, the circumference is a lot smaller than that of a ring (Ring B) at a latitude near the equator. Yet the Mercator projection stretches both rings across the same map width, so Ring A gets stretched a lot more than Ring B and its projection is a lot less accurate. In other words, at the same zoom level, a pixel in an image tile near Ring A represents a lot fewer meters on the ground than a pixel in an image tile near Ring B, so its ground resolution value in meters/pixel is smaller.
With that understanding, let’s dig into the formula for calculating ground resolution.
The part “latitude * pi/180” converts the latitude in degrees to radians. If you have some math background, this is very straightforward. If not, this is how it works. In trigonometry, pi radians (pi being the constant 3.1415926535…) is equal to 180 degrees. So let’s say we have an angle, say X, in degrees. How do we convert it to radians? Well, how many radians are in a degree? pi/180 radians are in a degree. So if we multiply X by “pi/180”, we will get our answer. In our case we just replace X with “latitude”.
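Here is a minimal Python sketch of that conversion. The function name degrees_to_radians is my own; Python’s standard library already ships math.radians, which does the same thing.

```python
import math

def degrees_to_radians(degrees: float) -> float:
    """Convert an angle in degrees to radians by multiplying by pi/180."""
    return degrees * math.pi / 180

# 180 degrees should come out as pi, and 90 degrees as pi/2.
print(round(degrees_to_radians(180), 4))  # 3.1416
print(round(degrees_to_radians(90), 4))   # 1.5708
```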
So what’s with the “cos” (cosine)? Well, before we go there, we need to consider the bigger picture. The first display of the formula (the more English one)
cos(latitude * pi/180) * earth circumference / map width
tells me that it is calculating the ground resolution based on some kind of ratio between the earth’s circumference and the map width in pixels. This makes sense because the Mercator projection is literally projecting each ring at a certain latitude of the earth onto a horizontal line of a square image. So the ratio of the two should give you the ground resolution in meters/pixel.
The second display of the formula (the more mathematical one) is very similar,
(cos(latitude * pi/180) * 2 * pi * 6378137 meters) / (256 * 2 ^ level pixels)
except it spells out the formula used to calculate earth circumference and the map pixel width. Nothing fancy here.
Now let’s examine this part:
(cos(latitude * pi/180) * 2 * pi * 6378137 meters)
Let me rearrange this a little using the commutative law of multiplication:
(cos(latitude * pi/180) * 6378137 meters * 2 * pi)
Let me introduce a variable that will make things more clear:
ring_radius = cos(latitude * pi/180) * 6378137 meters
So now it becomes
(ring_radius * 2 * pi)
Do you see it? If you have some math background, you will recognize immediately that this formula gives you the circumference of the ring. But wait a minute. How do you know that “ring_radius” is truly the radius of a ring at a given latitude? Well, glad you asked. Let’s look at this image again.
The latitude value gives me the angle I need. Picture a right triangle whose hypotenuse is the earth’s radius, drawn from the center of the earth to a point on the ring, at an angle equal to the latitude above the equatorial plane. The side adjacent to that angle is the ring radius, and since cosine is adjacent over hypotenuse, the cosine multiplied by the earth’s radius gives me the ring radius at that latitude. Pretty cool huh? That, my friend, is how the ground resolution formula works.
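To make this concrete, here is a small Python sketch of the ground resolution formula, split into the same two steps as above: the ring circumference first, then the division by the map width. The function and constant names are my own, and the 256-pixel tile size and earth radius are the values quoted from the Bing article.

```python
import math

EARTH_RADIUS_M = 6378137  # earth radius used in the quoted formula
TILE_SIZE_PX = 256        # tile width/height in pixels

def ring_circumference(latitude_deg: float) -> float:
    """Circumference in meters of the ring at the given latitude."""
    ring_radius = math.cos(math.radians(latitude_deg)) * EARTH_RADIUS_M
    return 2 * math.pi * ring_radius

def ground_resolution(latitude_deg: float, level: int) -> float:
    """Meters on the ground represented by one pixel at this latitude and zoom level."""
    map_width_px = TILE_SIZE_PX * 2 ** level
    return ring_circumference(latitude_deg) / map_width_px

print(round(ground_resolution(0, 0), 2))   # 156543.03 at the equator
print(round(ground_resolution(60, 0), 2))  # 78271.52, half of that at 60 degrees
```

Notice that at 60 degrees latitude the value is half of what it is at the equator, because cos(60 degrees) is 0.5. Same zoom level, same map width, but the ring being stretched across it is only half as long.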
Map Scale
The map scale is a lot easier to understand compared to ground resolution. To quote the Bing Maps Tile System page:
The map scale indicates the ratio between map distance and ground distance, when measured in the same units. For instance, at a map scale of 1 : 100,000, each inch on the map represents a ground distance of 100,000 inches. Like the ground resolution, the map scale varies with the level of detail and the latitude of measurement. It can be calculated from the ground resolution as follows, given the screen resolution in dots per inch, typically 96 dpi:
map scale = 1 : ground resolution * screen dpi / 0.0254 meters/inch
= 1 : (cos(latitude * pi/180) * 2 * pi * 6378137 * screen dpi) / (256 * 2 ^ level * 0.0254)
First of all, I need to mention that they use the term “dpi”. But they really mean “ppi”, I believe, since “dpi” and “ppi” are not the same thing. Refer to this article for the difference between the two.
Ground resolution tells you how many meters are represented per pixel. Screen ppi tells you how many pixels are in an inch on your display device’s screen. So
ground resolution * screen dpi
basically tells you how many meters are represented in an inch on your screen. But since we need to make the units the same, we need to convert the meters to inches. Therefore we divide that value by the number of meters in an inch:
ground resolution * screen dpi / (0.0254 meters/inch)
The second display of the same map scale formula is just the actual numbers plugged in:
= 1 : (cos(latitude * pi/180) * 2 * pi * 6378137 * screen dpi) / (256 * 2 ^ level * 0.0254)
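Here is a rough Python sketch of the map scale calculation. It returns just the number on the right side of the “1 :”. The function names are my own, the ground resolution helper repeats the formula from the previous section, and the default of 96 for the screen resolution is the “typical” value from the quote, not something you should assume about a real device.

```python
import math

EARTH_RADIUS_M = 6378137
TILE_SIZE_PX = 256
METERS_PER_INCH = 0.0254

def ground_resolution(latitude_deg: float, level: int) -> float:
    """Meters per pixel at the given latitude and zoom level."""
    circumference = math.cos(math.radians(latitude_deg)) * 2 * math.pi * EARTH_RADIUS_M
    return circumference / (TILE_SIZE_PX * 2 ** level)

def map_scale_denominator(latitude_deg: float, level: int, screen_ppi: float = 96) -> float:
    """The N in a map scale of 1 : N."""
    meters_per_screen_inch = ground_resolution(latitude_deg, level) * screen_ppi
    return meters_per_screen_inch / METERS_PER_INCH

# At the equator, zoom level 1, 96 ppi: roughly 1 : 295.8 million.
print(round(map_scale_denominator(0, 1)))  # 295829355
```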
Pixel Coordinates
The pixel coordinate of a (lat, long) location on earth is the (x, y) pixel value of that location projected onto the square map using Mercator projection at a given level of detail. It’s basically a way to represent a lat and long location in pixels in the projected map.
To quote the Bing Maps Tile System page:
Having chosen the projection and scale to use at each level of detail, we can convert geographic coordinates into pixel coordinates. Since the map width and height is different at each level, so are the pixel coordinates. The pixel at the upper-left corner of the map always has pixel coordinates (0, 0). The pixel at the lower-right corner of the map has pixel coordinates (width-1, height-1), or referring to the equations in the previous section, (256 * 2^level - 1, 256 * 2^level - 1). For example, at level 3, the pixel coordinates range from (0, 0) to (2047, 2047), like this:
Given latitude and longitude in degrees, and the level of detail, the pixel XY coordinates can be calculated as follows:
sinLatitude = sin(latitude * pi/180)
pixelX = ((longitude + 180) / 360) * 256 * 2 ^ level
pixelY = (0.5 - log((1 + sinLatitude) / (1 - sinLatitude)) / (4 * pi)) * 256 * 2 ^ level
The formula for pixelX is fairly easy to understand. Longitude -180 is the starting point, and 180 is the ending point. So
(longitude + 180) / 360
gives us a good ratio to work with for the location with regard to longitude. Then we multiply this ratio by the width of the map to get the x pixel coordinate.
The formula for pixelY is NOT at all straightforward. It involves a ton of math that I don’t want to get into right now as a software developer. If I were a math whiz like I was back in the day, then maybe I wouldn’t mind so much. But not anymore. Now I do JIT learning, or Just-In-Time learning: I learn just enough to do my job because I’ve got so much to do.
Anyway, here I’m just going to trust that the formula for pixelY is really doing the same thing as pixelX, just a lot fancier: it works out how far down the map the latitude falls as a ratio, then multiplies that ratio by the height of the map. Either way, it gives me the y pixel coordinate I need for a lat and long location.
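Even if you take the pixelY math on faith like I do, it is easy to put both formulas into code and poke at them. Below is a minimal Python sketch. The function name is mine, and it deliberately skips input validation, so it assumes the latitude stays within roughly -85.05 to 85.05 degrees (at exactly 90 or -90 the formula divides by zero).

```python
import math

TILE_SIZE_PX = 256

def lat_lon_to_pixel(latitude: float, longitude: float, level: int) -> tuple[float, float]:
    """Project a (lat, long) in degrees to pixel XY coordinates at a zoom level."""
    map_size = TILE_SIZE_PX * 2 ** level
    sin_latitude = math.sin(math.radians(latitude))
    pixel_x = (longitude + 180) / 360 * map_size
    pixel_y = (0.5 - math.log((1 + sin_latitude) / (1 - sin_latitude)) / (4 * math.pi)) * map_size
    return pixel_x, pixel_y

# Latitude 0, longitude 0 lands in the exact middle of the 2048 x 2048 level 3 map.
print(lat_lon_to_pixel(0, 0, 3))  # (1024.0, 1024.0)
```

A quick sanity check you can do yourself: a location in the northern hemisphere should give a pixelY smaller than 1024 at level 3, since y grows downward from the top-left corner.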
Tile Coordinates
Once we have the projected map, we cut it up into 256×256 pixel tiles for each zoom level. The bigger the zoom level, the more tiles we will have. The map width and height in the tile coordinate system is as follows
map width = map height = 2 ^ level tiles
The formula for the number of tiles we have at a zoom level is then
2 ^ level * 2 ^ level tiles
Each tile is given XY coordinates ranging from (0, 0) in the upper left to (2^level - 1, 2^level - 1) in the lower right. For example, at level 3 the tile coordinates range from (0, 0) to (7, 7) as follows:
The tile coordinates let us identify each tile for fast tile retrieval and display at each zoom level. The tile coordinates are easy to calculate once we have the pixel coordinates.
tileX = floor(pixelX / 256)
tileY = floor(pixelY / 256)
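In Python, the tile formulas above might look something like this. The function names are my own, and the tile count helper just restates the 2 ^ level * 2 ^ level formula from earlier in this section.

```python
import math

TILE_SIZE_PX = 256

def pixel_to_tile(pixel_x: float, pixel_y: float) -> tuple[int, int]:
    """Return the XY coordinates of the tile that contains the given pixel."""
    return math.floor(pixel_x / TILE_SIZE_PX), math.floor(pixel_y / TILE_SIZE_PX)

def tile_count(level: int) -> int:
    """Total number of tiles covering the whole map at a zoom level."""
    return 2 ** level * 2 ** level

print(pixel_to_tile(1024.0, 1024.0))  # (4, 4)
print(pixel_to_tile(2047, 2047))      # (7, 7), the lower-right tile at level 3
print(tile_count(3))                  # 64
```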
Mapbox/Leaflet Tile Loading Using Tile Coordinates
Mapbox/Leaflet uses tile coordinates at a zoom level to fetch tiles. It uses a url template of the following form:
http://{s}.somedomain.com/blabla/{z}/{x}/{y}.png
{s} means one of the available subdomains (used sequentially to help with browser parallel requests per domain limitation; subdomain values are specified in options; a, b or c by default, can be omitted),
{z} — zoom level,
{x} and {y} — tile coordinates.
So when you interact with a Leaflet map, panning or zooming, it knows how to generate the correct URL (with the correct zoom level, x and y tile coordinates) to fetch the correct tiles so that your mapping experience is seamless. Once the tiles are loaded from the server, they will be cached by your browser so the next time you need them they are already there. Pretty cool how it works.
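To get a feel for what that URL generation involves, here is a rough Python imitation. To be clear, this is not Leaflet’s code (Leaflet is JavaScript), the function name tile_url is made up, and picking the subdomain from (x + y) is an assumption I’m making purely for illustration.

```python
def tile_url(template: str, z: int, x: int, y: int, subdomains: str = "abc") -> str:
    """Fill a {s}/{z}/{x}/{y} URL template for one tile."""
    # Assumption: spread requests over subdomains using the tile coordinates.
    subdomain = subdomains[(x + y) % len(subdomains)]
    return template.format(s=subdomain, z=z, x=x, y=y)

template = "http://{s}.somedomain.com/blabla/{z}/{x}/{y}.png"
print(tile_url(template, z=3, x=4, y=4))  # http://c.somedomain.com/blabla/3/4/4.png
```

The point is simply that the zoom level plus the tile coordinates are all the client needs to ask the server for exactly the right image.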
Concluding Remarks
In my experience, when I did map development without a solid understanding of these concepts, I found myself stumbling over and puzzled by mapping-related code or behavior from time to time. I believe taking the time to understand these concepts is critically important and will save you headaches later on. I hope this blog post has helped you better understand some of the core online mapping concepts and will make your life easier as a map developer. Happy mapping!