Detecting Spherical Media Files | discretecosine.com

In many ways, VR is still a technological “wild west.” There are very few true standards, and those that do exist haven’t been widely implemented.

Recently, we’ve been looking at how to automatically identify spherical (equirectangular) photos and videos so they can be displayed properly in our Elevator digital asset management tool. “Why is this such a problem in the first place?” you may be wondering. Well, spherical photos and videos are packaged so that they resemble pretty much any other type of photo or video. At this point, we’re working primarily with images from our Ricoh Theta spherical cameras, which save photos as .JPG files and videos as .MP4 files. Our computers recognize these file types as being photo and video files – which they are – but don’t have an automatic way of detecting the “special sauce”: the fact that they’re spherical! You can open up these files in your standard photo/video viewer, but they look a little odd and distorted.

So, we clearly need some way of detecting if our photos and videos were shot with a spherical camera. That way, when we view them, we can automatically plop them into a spherical viewer, which can project our photos and videos into a spherical shape so they can be experienced as they were intended to be experienced! As it turns out, this gets a bit messy…

Let’s start by looking at spherical photos. We hypothesized that there must be metadata within the files to identify them as spherical. The best way to investigate a file in a case like this is with ExifTool, which extracts metadata from nearly every media format.

While there’s lots of metadata in an image file (camera settings, date and time information, etc.), our Ricoh Theta files had some very promising additional items:

Projection Type : equirectangular
Use Panorama Viewer : True
Pose Heading Degrees : 0.0
Pose Pitch Degrees : 5.8
Pose Roll Degrees : 2.8

Additional googling reveals that the UsePanoramaViewer attribute has its origins in Google Street View’s panoramic metadata extensions. This is somewhere in the “quasi-standard” category – there’s no standards body that has agreed on this as the way to flag panoramic images, but manufacturers have adopted it.
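The check on those tags can be sketched in a few lines of Python. This is a minimal sketch, not what Elevator actually does: it assumes ExifTool is installed and on the PATH, that its JSON output (the -j option) reports the tags under the keys ProjectionType and UsePanoramaViewer, and the function names read_metadata and is_spherical_photo are made up for illustration.

```python
import json
import subprocess


def read_metadata(path):
    """Run ExifTool on one file and return its metadata as a dict.

    Assumes the `exiftool` command is available on the PATH.
    """
    # -j asks ExifTool for JSON; it prints a list with one object per file.
    result = subprocess.run(
        ["exiftool", "-j", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)[0]


def is_spherical_photo(metadata):
    """Guess whether a metadata dict describes a spherical photo.

    Assumption: an equirectangular projection type is the deciding signal,
    unless the file explicitly says not to use a panorama viewer.
    """
    # Values may arrive as strings ("True") or JSON booleans (true),
    # so normalize everything to lowercase text before comparing.
    projection = str(metadata.get("ProjectionType", "")).strip().lower()
    use_viewer = str(metadata.get("UsePanoramaViewer", "")).strip().lower()
    return projection == "equirectangular" and use_viewer != "false"
```

Calling is_spherical_photo(read_metadata("photo.JPG")) would then tell a viewer whether to hand the file to a spherical player. Keeping the decision separate from the ExifTool call makes it easy to adjust the rule if other cameras turn out to write these tags differently.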

Video, on the other hand, is a little harder to deal with at the moment. Fortunately, it has the promise of becoming easier in the future. There’s a “request for comments” (RFC) proposing a standard for spherical video metadata. This RFC is specifically focused on storing that metadata in web-delivery files (WebM and MP4), using a special identifier (a “UUID”) and some XML.
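Until tools catch up, a crude way to spot this metadata is to scan the file for the XML itself. The sketch below assumes the RFC in question is Google’s Spherical Video proposal, whose XML marks a file with a GSpherical:Spherical element set to true; that element text, and the function name looks_like_spherical_video, are assumptions rather than anything confirmed here. It is a heuristic byte search, not a real MP4 or WebM parser.

```python
# Assumed marker: the element the Spherical Video RFC uses to flag a file.
SPHERICAL_MARKER = b"<GSpherical:Spherical>true</GSpherical:Spherical>"


def looks_like_spherical_video(path, chunk_size=1 << 20):
    """Return True if the marker appears anywhere in the file.

    Reads the file in chunks so large videos are never loaded whole.
    """
    # Keep the end of the previous chunk so a marker that straddles
    # a chunk boundary is still found.
    overlap = len(SPHERICAL_MARKER) - 1
    tail = b""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if SPHERICAL_MARKER in window:
                return True
            tail = window[-overlap:]
```

A search like this can miss files whose XML is formatted differently (extra whitespace, for example), so it is best treated as a stopgap until proper box-level parsing is available.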

Right now, reading that metadata is pretty problematic. None of the common video tools can display it. However, open source projects are moving quickly to adopt it, and Google is already leveraging this metadata with files uploaded to YouTube. In the case of the Ricoh cameras we use, their desktop video conversion tool has recently been updated to incorporate this type of metadata as well.

One of the most exciting parts of working in VR right now is that the landscape is changing on a week-by-week basis. Problems are being solved quickly, and new problems are being discovered just as quickly.
