Positioning strategies – first steps

Relocalization, which ensures that AR content stays in place between sessions and, most importantly, is easy to find again when you start a new session nearby, is a problem under active research.

Below are some interesting statements I found on this topic.

ARKit doesn’t have any features for tracking device position or placing content in “absolute” geospatial coordinates. Actually doing that (and doing it well) is sort of a hard problem… but there are a few things to help you on your way.

First, check out the ARSessionConfiguration worldAlignment setting. With the default gravity option, x and z directions are relative to the device’s original heading, as of when you started the session. Getting from there to geospatial coordinates is next to impossible.

But with the gravityAndHeading option, the axes are fixed to north/south and east/west — the position of the ARKit coordinate system’s origin is still relative to where the device is at the beginning of the session, but the directions are absolute. That gives you a basis for converting to/from geospatial coordinates.
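To make the "basis for converting" idea a bit more concrete, here is a minimal sketch of the math, written in Python only because it is easy to try out. It assumes that with gravityAndHeading the axes are +x = east, +y = up and +z = south, that you know the lat/long of the session origin, and that a flat-earth (equirectangular) approximation is good enough over a few hundred meters. The function names and the spherical Earth radius are my own choices for illustration, not anything ARKit provides.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation (assumption)


def geo_to_arkit(lat, lon, origin_lat, origin_lon, up=0.0):
    """Convert a lat/long (degrees) to (x, y, z) in meters.

    Assumed frame: +x east, +y up, +z south, origin at the place
    where the AR session started (origin_lat, origin_lon).
    """
    north = math.radians(lat - origin_lat) * EARTH_RADIUS_M
    east = (math.radians(lon - origin_lon) * EARTH_RADIUS_M
            * math.cos(math.radians(origin_lat)))
    return (east, up, -north)  # north is the negative z direction


def arkit_to_geo(x, z, origin_lat, origin_lon):
    """Inverse of geo_to_arkit for the horizontal plane."""
    north = -z
    lat = origin_lat + math.degrees(north / EARTH_RADIUS_M)
    lon = origin_lon + math.degrees(
        x / (EARTH_RADIUS_M * math.cos(math.radians(origin_lat))))
    return (lat, lon)
```

A point 0.001 degrees north of the origin ends up roughly 111 m along negative z. The weak spot is the origin itself: its lat/long has to come from somewhere, and that is where the precision question comes in.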

But there’s still a question of precision. ARKit tracks features up to a few meters away, down to a precision of a couple millimeters. Core Location tracks the device to a precision of several meters. So, if you have a real-world feature and you want to put virtual content on top of it… you could convert a lat/long to a position in ARKit space, but then you’re likely to find that your content doesn’t really line up close enough.
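A quick back-of-the-envelope calculation shows why this does not line up. The sketch below is my own illustration, not something from the quoted answer: it bounds how far content can land from its target when the session origin is off by the GPS error and the compass heading is off by some angle. The heading error parameter is an assumption I added, since the quote only talks about position.

```python
import math


def worst_case_offset(gps_error_m, heading_error_deg, distance_m):
    """Upper bound (meters) on the misplacement of content distance_m away.

    The origin error shifts everything by up to gps_error_m; a heading
    error rotates the content around the device, moving it by the chord
    length 2 * d * sin(angle / 2). The two are added as a worst case.
    """
    chord = 2.0 * distance_m * math.sin(math.radians(heading_error_deg) / 2.0)
    return gps_error_m + chord


# Hypothetical numbers: 5 m GPS error, 10 degrees heading error, content 3 m away.
offset = worst_case_offset(5.0, 10.0, 3.0)  # roughly 5.5 m
```

With these made-up but plausible numbers the content can end up more than 5 m from an object that is only 3 m away, while ARKit itself tracks that object to within millimeters. The GPS term dominates at short range, so a better heading alone does not help much.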

It’s not an unsolvable problem, but not an easy one either. Good luck!


This seems to be an area of active research in the iOS developer community — I met several teams trying to figure it out at WWDC last week, and nobody had even begun to crack it yet. So I’m not sure there’s a “best way” yet, if even a feasible way at all.

Feature points are positioned relative to the session, and aren’t individually identified, so I’d imagine correlating them between multiple users would be tricky.

The session alignment mode gravityAndHeading might prove helpful: that fixes all the directions to a (presumed/estimated to be) absolute reference frame, but positions are still relative to where the device was when the session started. If you could find a way to relate that position to something absolute — a lat/long, or an iBeacon maybe — and do so reliably, with enough precision… Well, then you’d not only have a reference frame that could be shared by multiple users, you’d also have the main ingredients for location based AR. (You know, like a floating virtual arrow that says turn right there to get to Gate A113 at the airport, or whatever.)

Another avenue I’ve heard discussed is image analysis. If you could place some real markers — easily machine recognizable things like QR codes — in view of multiple users, you could maybe use some form of object recognition or tracking (a ML model, perhaps?) to precisely identify the markers’ positions and orientations relative to each user, and work back from there to calculate a shared frame of reference. Dunno how feasible that might be. (But if you go that route, or similar, note that ARKit exposes a pixel buffer for each captured camera frame.)
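The "work back from there" step can be sketched with a bit of geometry. The following is only an illustration under strong assumptions: both sessions are gravity aligned (so their frames differ only by a rotation around the vertical y axis plus a translation), and each device has somehow already estimated the marker's position and yaw in its own session frame. How to get that estimate is exactly the open part mentioned above. All names are hypothetical.

```python
import math


def rotate_y(v, angle_rad):
    """Rotate vector v = (x, y, z) around the vertical y axis."""
    x, y, z = v
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x + s * z, y, -s * x + c * z)


def session_b_to_a(point_b, marker_pos_a, marker_yaw_a, marker_pos_b, marker_yaw_b):
    """Map a point from session B's frame into session A's frame.

    Both sessions saw the same physical marker. A point with marker-local
    coordinates m satisfies p_a = R(yaw_a) m + pos_a and
    p_b = R(yaw_b) m + pos_b, hence
    p_a = R(yaw_a - yaw_b) (p_b - pos_b) + pos_a.
    Yaw angles are in radians, using the same convention as rotate_y.
    """
    rel = tuple(p - m for p, m in zip(point_b, marker_pos_b))
    rotated = rotate_y(rel, marker_yaw_a - marker_yaw_b)
    return tuple(r + m for r, m in zip(rotated, marker_pos_a))


# Hypothetical observations of one marker from two devices.
pos_a, yaw_a = (1.0, 0.0, -2.0), math.radians(90)
pos_b, yaw_b = (0.0, 0.0, -1.0), 0.0
shared = session_b_to_a((1.0, 0.0, -1.0), pos_a, yaw_a, pos_b, yaw_b)
```

The marker itself maps onto its own position in session A, and distances to the marker are preserved. If the sessions are not gravity aligned, or the marker's tilt matters, the same idea needs full 4x4 transforms instead of a single yaw angle.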

Good luck!


Obviously you want to build AR apps that permanently place augmented reality content in precise locations, indoors and outdoors. You can use GPS, markers or beacons for geolocation. Or you can try to use computer vision systems to provide an additional layer of mapping and persistent visual positioning to mobile AR apps.

Plane detection is the basic feature that ARKit provides. Below is a good introduction to this topic.


Object detection in general could be an approach to positioning. You can see how well object detection works by trying out this project (and you will immediately see its limitations as well):


As a side note, another awesome page, this time just for ARKit:


There is another article about how to integrate the Placenote SDK (which has already been covered in a previous post).


It also explains how the SDK works.

Placenote SDK uses a custom feature detection algorithm that looks for “smart” features in the environment that it thinks can be found persistently across AR sessions. This feature detection is independent of ARKit feature tracking and runs in a parallel thread as you move your phone around.

The collection of smart features forms a 3D feature map of the space that can be saved and uploaded to the Placenote Cloud.

It looks like the current approaches revolve around object detection.

Playing around with both object detection and Placenote shows that each currently has some limitations. One thing is for sure: you can have a pretty amusing time, especially with the object detection example program! It detects stethoscopes all around!