Use the information provided by a face tracking AR session to place and animate 3D content.
The iPhone X’s TrueDepth front-facing camera may yield better tracking results.
And some random notes on floor plans. An interesting post about exactly that:
There are products in this area as well:
General IoT cloud providers explained.
On a side note: Rolex:
A bunch of interesting new capabilities have been introduced with iOS 11.3.
Detect known 2D images in the user’s environment, and use their positions to place AR content.
Follow best practices for visual feedback, gesture interactions, and realistic rendering in AR experiences.
Discovery phase: AR state of the art research.
Implement manual re-positioning for ARchi VR.
Investigate ways to achieve better automatic registration of an existing scene in AR, something that could compete with available solutions (e.g. Placenote).
After the investigation, implement automatic registration in ARchi VR.
Implement a use case based on automatic registration.
An interesting document about 3D point cloud registration:
One of the authors even explains it in a video.
For our research only the first part is of interest: how to find an affine transformation between two point clouds. The first method is keypoint detection; the second is plane-based registration.
Numerous point cloud coarse registration methods have been developed, yet coarse registration remains an open challenge with much room for improvement. In the Fast Point Feature Histogram [9, 10] (FPFH) algorithm, a histogram-based descriptor is calculated for each point within the point cloud, over multiple scales. Salient persistent histograms over multiple scale calculations are labeled as keypoints, which are then matched to find the registration between the point clouds.
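To make the keypoint-matching idea concrete, here is a much-simplified sketch in Python/NumPy. To be clear, this is my own illustrative stand-in, not FPFH: the toy descriptor below only histograms distances to the k nearest neighbours (FPFH histograms normal-angle relations, over multiple scales), but the matching stage, comparing per-point histograms and keeping mutual nearest matches as correspondences, follows the same pattern.

```python
import numpy as np

def descriptor(cloud, i, k=8, bins=6, r_max=3.5):
    """Toy per-point descriptor: a normalized histogram of the distances
    to the k nearest neighbours (rotation/translation invariant)."""
    d = np.linalg.norm(cloud - cloud[i], axis=1)
    nn = np.sort(d)[1:k + 1]                     # skip the point itself
    h, _ = np.histogram(nn, bins=bins, range=(0.0, r_max))
    return h / max(h.sum(), 1)

def match_keypoints(A, B, k=8):
    """Match each point in A to the point in B with the most similar
    descriptor; keep only mutual nearest matches as correspondences."""
    da = np.array([descriptor(A, i, k) for i in range(len(A))])
    db = np.array([descriptor(B, j, k) for j in range(len(B))])
    cost = ((da[:, None, :] - db[None, :, :]) ** 2).sum(axis=2)
    a2b, b2a = cost.argmin(axis=1), cost.argmin(axis=0)
    return [(i, j) for i, j in enumerate(a2b) if b2a[j] == i]
```

The resulting correspondences would then feed a rigid-transform solver to obtain the coarse registration.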
A different approach based on linear plane matching was developed for the coarse registration of airborne LIDAR point clouds. By relying on the presence of linear structures, this approach is limited to specific dataset classes.
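Assuming plane correspondences are already known (for instance from matching detected planes by size and orientation), a coarse transform can be recovered in closed form. The following Python/NumPy sketch is a generic construction of my own, not the method from the cited paper: rotation from the matched unit normals via SVD, translation from the plane offsets.

```python
import numpy as np

def registration_from_planes(planes_src, planes_ref):
    """Coarse registration from matched planes. Each plane is (n, d) with
    unit normal n and offset d, such that n·x = d for points x on the plane.
    For a rigid map x -> M x + t, a source plane (n, d) becomes the plane
    (M n, d + (M n)·t), so the rotation comes from aligning the normals
    (Kabsch/SVD) and the offsets give a linear system for the translation
    (needs >= 3 planes with linearly independent normals)."""
    Ns = np.array([n for n, _ in planes_src])
    Nr = np.array([n for n, _ in planes_ref])
    H = Ns.T @ Nr                              # 3x3 normal cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    M = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # (M n_i) · t = d_i_ref - d_i_src  ->  least squares for t
    A = Ns @ M.T
    b = np.array([dr - ds for (_, ds), (_, dr) in zip(planes_src, planes_ref)])
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return M, t
```

This is attractive for ARKit because plane detection already supplies the planes; the hard part in practice is deciding which source plane corresponds to which reference plane.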
The problem of fine registration between point clouds has been intensively studied, and high-quality solutions now exist for online applications such as SLAM [3, 4]. The solutions revolve around the Iterative Closest Point (ICP) algorithm and its improvements. A noteworthy fine registration method that is based on the correlation of Extended Gaussian Images in the Fourier domain was proposed as an alternative to ICP, although its final stage again relied on iterations of ICP for fine-tuning. Fine registration is not the focus of this research, although to achieve end-to-end registration the standard ICP algorithm is utilized in its final stages.
An algorithm like ICP could be interesting to investigate, given that we could detect a couple of planes and then try to match them across two sessions. More precisely, given two point clouds, R (the reference) and S (the source), ICP tries to find the best rigid (or similarity) transform T so that T(S) ≈ R.
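As a sanity check on this idea, here is a minimal ICP sketch in Python/NumPy (my own illustration, not code from any of the libraries mentioned here): brute-force nearest-neighbour correspondences plus the SVD-based Kabsch solution for the best rigid transform.

```python
import numpy as np

def best_rigid_transform(S, R):
    """Least-squares rigid transform (Kabsch): returns M, t such that
    M @ S[i] + t ≈ R[i], assuming rows are already in correspondence."""
    cs, cr = S.mean(axis=0), R.mean(axis=0)
    H = (S - cs).T @ (R - cr)                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    M = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return M, cr - M @ cs

def icp(S, R, iterations=20):
    """Minimal ICP: pair each source point with its nearest reference
    point, solve for the best rigid transform, apply it, repeat."""
    S = S.copy()
    M_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iterations):
        d2 = ((S[:, None, :] - R[None, :, :]) ** 2).sum(axis=2)
        M, t = best_rigid_transform(S, R[d2.argmin(axis=1)])
        S = S @ M.T + t
        M_total, t_total = M @ M_total, M @ t_total + t
    return M_total, t_total   # T(S) = S @ M_total.T + t_total ≈ R
```

Real implementations replace the brute-force O(n·m) nearest-neighbour search with a k-d tree and add outlier rejection; the sketch only shows the core loop, and like all ICP variants it needs a reasonable initial alignment to converge.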
The following library could be interesting:
Or maybe just something like that:
Or even an existing C++ implementation:
Relocalization, which ensures that AR content stays in place between sessions and, most importantly, is easy to find again when you start a new session nearby, is a problem under active research.
Below are some interesting statements I found around this topic.
ARKit doesn’t have any features for tracking device position or placing content in “absolute” geospatial coordinates. Actually doing that (and doing it well) is sort of a hard problem… but there are a few things to help you on your way.
First, check out the worldAlignment setting. With the default gravity option, x and z directions are relative to the device’s original heading, as of when you started the session. Getting from there to geospatial coordinates is next to impossible.

But with the gravityAndHeading option, the axes are fixed to north/south and east/west. The position of the ARKit coordinate system’s origin is still relative to where the device is at the beginning of the session, but the directions are absolute. That gives you a basis for converting to/from geospatial coordinates.
But there’s still a question of precision. ARKit tracks features up to a few meters away, down to a precision of a couple millimeters. Core Location tracks the device to a precision of several meters. So, if you have a real-world feature and you want to put virtual content on top of it… you could convert a lat/long to a position in ARKit space, but then you’re likely to find that your content doesn’t really line up close enough.
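To illustrate that conversion (and why the precision gap matters), here is a small sketch in Python. It assumes the session runs with worldAlignment set to gravityAndHeading, so that +x points east and +z points south, and that the session origin’s lat/long comes from Core Location, with its several-metre error baked in.

```python
import math

EARTH_RADIUS = 6_371_000.0  # metres, spherical approximation

def geo_to_arkit(lat, lon, origin_lat, origin_lon):
    """Convert a lat/long (degrees) to an ARKit world position, assuming
    worldAlignment = gravityAndHeading (+x east, +y up, +z south) and a
    session origin at (origin_lat, origin_lon). Uses a local
    equirectangular approximation, which is fine at the few-hundred-metre
    scale AR cares about."""
    d_lat = math.radians(lat - origin_lat)
    d_lon = math.radians(lon - origin_lon)
    north = d_lat * EARTH_RADIUS
    east = d_lon * EARTH_RADIUS * math.cos(math.radians(origin_lat))
    return (east, 0.0, -north)   # ARKit z points south, hence -north
```

The arithmetic is millimetre-exact over short distances, but the origin itself is only known to Core Location’s accuracy, so the resulting AR position inherits those several metres of error — exactly the mismatch described above.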
It’s not an unsolvable problem, but not an easy one either. Good luck!
This seems to be an area of active research in the iOS developer community — I met several teams trying to figure it out at WWDC last week, and nobody had even begun to crack it yet. So I’m not sure there’s a “best way” yet, if even a feasible way at all.
Feature points are positioned relative to the session, and aren’t individually identified, so I’d imagine correlating them between multiple users would be tricky.
The session alignment mode gravityAndHeading might prove helpful: that fixes all the directions to a (presumed/estimated to be) absolute reference frame, but positions are still relative to where the device was when the session started. If you could find a way to relate that position to something absolute (a lat/long, or an iBeacon maybe) and do so reliably, with enough precision… well, then you’d not only have a reference frame that could be shared by multiple users, you’d also have the main ingredients for location-based AR. (You know, like a floating virtual arrow that says turn right there to get to Gate A113 at the airport, or whatever.)
Another avenue I’ve heard discussed is image analysis. If you could place some real markers — easily machine recognizable things like QR codes — in view of multiple users, you could maybe use some form of object recognition or tracking (a ML model, perhaps?) to precisely identify the markers’ positions and orientations relative to each user, and work back from there to calculate a shared frame of reference. Dunno how feasible that might be. (But if you go that route, or similar, note that ARKit exposes a pixel buffer for each captured camera frame.)
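The “work back from there” step is plain matrix algebra. Here is a Python/NumPy sketch (the pose helper is a hypothetical stand-in for whatever marker-tracking would supply): if both users observe the same physical marker’s pose as a 4×4 rigid transform in their own session frames, the marker cancels out and yields the transform between the two frames.

```python
import numpy as np

def shared_frame_transform(marker_in_a, marker_in_b):
    """Given the same marker's pose (4x4 rigid transform) in user A's
    session frame and in user B's session frame, return the transform
    mapping B-frame coordinates into A's frame:
        T_a_from_b = T_a_marker @ inv(T_b_marker)
    since both poses map marker-local coordinates to the same physical points."""
    return marker_in_a @ np.linalg.inv(marker_in_b)

def pose(rotation_z, tx, ty, tz):
    """Hypothetical helper: 4x4 pose from a rotation about z plus a translation."""
    c, s = np.cos(rotation_z), np.sin(rotation_z)
    return np.array([[c, -s, 0, tx],
                     [s,  c, 0, ty],
                     [0,  0, 1, tz],
                     [0,  0, 0, 1.0]])
```

Any content one user anchors can then be re-expressed in the other user’s session frame by multiplying its pose with the shared-frame transform; the accuracy of the whole scheme hinges on how precisely each user can estimate the marker’s pose.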
Obviously, you want to build AR apps that permanently place augmented reality content in precise locations, indoors and outdoors. You can use GPS, markers or beacons for geolocation, or you can try to use computer vision systems to provide an additional layer of mapping and persistent visual positioning for mobile AR apps.
Plane detection is the basic feature that ARKit provides. Below is a good introduction to this topic.
Object detection in general could be an approach for positioning. How well object detection works can be seen by using this project (which also immediately reveals its limitations):
As a side note, another awesome page, this time just for ARKit:
There is another article about how to integrate the Placenote SDK (which has already been covered in a previous post).
It also discloses how Placenote works.
Placenote SDK uses a custom feature detection algorithm that looks for “smart” features in the environment that it thinks can be found persistently across AR sessions. This feature detection is independent of ARKit feature tracking and runs in a parallel thread as you move your phone around.
The collection of smart features forms a 3D feature map of the space that can be saved and uploaded to the Placenote Cloud.
It looks like the current approaches center on object detection.
Playing around with both object detection and Placenote shows that they currently have some limitations. One thing is for sure: you can have a pretty amusing time, especially with the object detection example program! It detects stethoscopes all around!
To put it simply, Placenote lets you build AR apps that permanently place augmented reality content in precise locations, indoors and outdoors. Placenote does not need GPS, markers or beacons for geolocation. Instead, it uses an advanced computer vision system to provide an additional layer of mapping and persistent visual positioning to mobile AR apps built in Unity or native SceneKit.
I have registered and built the example by using their kit.
The sample allows you to position objects in an AR session and save the map to their service. In a subsequent AR session you can then load the saved map again. The sample seems to work pretty well and successfully re-positions the objects. The accuracy is not very high, though; objects can be a bit off the intended position. And if the starting position is very different (e.g. the other side of a room), it doesn’t place the objects at all.
The source code of the sample is also available on GitHub: https://github.com/Placenote.
We met in Zürich yesterday for the official kick-off of the master thesis work.
We came up with the following goals for the next week:
Until now the time was spent on discovery in the augmented reality (AR) space. An investigation into the state-of-the-art technologies has been conducted, from ARKit on iOS to ARCore on Android, as a basic deep dive into the technology.
I have also been given access to the source code of ARchi VR, which is located in a non-public repository on the ZHAW GitHub: https://github.engineering.zhaw.ch/acke/ARchiVR. I looked into the code, then built and slightly adapted the sources to get a feeling for the ARKit technology in use, as well as https://github.com/archilogic-com/3dio-js from https://3d.io/ by Archilogic, Zurich.
Some basic business-model considerations have also been looked at, such as uses in the maintenance environment (VR and AR, e.g. remote maintenance) or the area of the AR cloud, e.g. https://medium.com/super-ventures-blog/arkit-and-arcore-will-not-usher-massive-adoption-of-mobile-ar-da3d87f7e5ad.