The Progression of Holography into Business – An interview with Dr. V. Michael Bove, Jr., MIT Media Lab
ICT Insights: You have long been involved in advanced multimedia and video conferencing. How is it that you got interested in working to commercialize holography?
Dr. Bove: Holography at the Media Lab began with the late Professor Stephen Benton, who invented the white light holograms used on credit cards. Up until the late 1980s, these had to be recorded photographically with lasers. Now, with advances in computer graphics computation, it is possible to generate holograms cost-effectively for many specialized purposes.
Steve's students did pioneering work in the late 1980s and, in about 1989, built one of the first holographic video displays. My group at the time was working on computation for advanced TV and ended up building specialized parallel computers to generate the holographic data they needed.
We continued collaborating on hardware and algorithms through the 1990s, in part because of the very strong cool factor associated with having a holography lab in the building. Unfortunately, Steve became ill and we ended up moving all of the electronic holography work into my group when Steve passed away.
Since then my group decided to try to do holographic video as a consumer technology. We wanted to build things to work with PCs and the kinds of Graphics Processing Units (GPUs) that NVIDIA was putting into PCs. We wanted to work with traditional software, like OpenGL. We wanted to be able to send the video over the Internet and to drive the cost to where a holographic video monitor plugged into your PC will cost $500.
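At its core, generating a hologram on a GPU means computing an interference (fringe) pattern from a 3D scene description. The sketch below is purely illustrative and is not the Media Lab's actual pipeline: it models each scene point as a spherical-wave emitter and records its interference with an on-axis reference beam, in one dimension. All parameter values and names here are assumptions for the example.

```python
import numpy as np

WAVELENGTH = 633e-9   # red HeNe laser line, metres (illustrative)
PITCH = 1e-6          # hologram sample spacing, metres (illustrative)

def point_source_hologram(points, n_samples=1024):
    """Compute a 1-D fringe pattern for a list of (x, z) scene points.

    Each point contributes a spherical wave; the recorded intensity is
    the interference of the summed object field with a reference beam.
    """
    xs = (np.arange(n_samples) - n_samples / 2) * PITCH  # hologram plane
    field = np.zeros(n_samples, dtype=complex)
    for px, pz in points:
        r = np.sqrt((xs - px) ** 2 + pz ** 2)             # distance to point
        field += np.exp(2j * np.pi * r / WAVELENGTH) / r  # spherical wave
    reference = np.ones(n_samples, dtype=complex)         # on-axis plane wave
    return np.abs(field + reference) ** 2                 # recorded fringes

# Two points at different depths, a few centimetres behind the plane:
fringes = point_source_hologram([(0.0, 0.05), (100e-6, 0.06)])
```

The inner sum over scene points is exactly the kind of embarrassingly parallel arithmetic that maps well onto a GPU, which is why consumer graphics hardware made this practical.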
It was almost ten years before we achieved our goal of advancing the science and art of synthetic holography. For instance, we are the first people to have generated holographic video with a Kinect camera.
On top of the coolness and science-fiction factors, there is also an interest in 3D imagery for entertainment and gaming, and for serious applications like scientific visualization, teleoperation, and telemanipulation. For each of these situations it's important, first, to have 3D displays that don't require one to wear glasses, and second, to have 3D displays that are comfortable to use: for example, a surgeon shouldn't become fatigued from using a 3D display.
Further, it's important for maximum accuracy to have the ability to judge distances between two points – whether for manipulation or simply viewing complex data sets to assess relative distance in a three-dimensional volume. Holographic displays provide all of the visual cues to reality, unlike stereoscopic displays where your eyes are always focused on the distance to the screen.
ICT Insights: You are working on Object-Based Media – what does that really mean?
Dr. Bove: In the early 1990s, when the group got that name, we were working on making communication systems that recognized the world in terms of objects, not pixels. So, think about semantic video: in a conferencing system it's important for me to know which pixels are you and which pixels are the wall. Whether that's for interactive reasons or for limited network bandwidth, I should be concentrating on the bits that make your face look natural, not on making the wallpaper behind you look good.
So we developed visual and audio systems that knew the difference between voice and noise, face and wallpaper, and allowed the creation of more efficient data transmission and richer interactions. We continued to work on self-aware systems that allowed us to do very interesting context-aware and interactive applications.
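The bit-allocation idea can be sketched in a toy form. This is an illustrative sketch, not the group's actual codec: the person/background segmentation is assumed to be given as a mask, and the coder simply quantizes background pixels far more coarsely than foreground ones, spending its bits where the viewer is looking.

```python
import numpy as np

def quantize(frame, mask, fg_levels=256, bg_levels=16):
    """Quantize foreground pixels finely and background pixels coarsely.

    frame: 8-bit grayscale image; mask: True where the "person" is.
    """
    fg_step = 256 // fg_levels
    bg_step = 256 // bg_levels
    out = np.where(mask,
                   (frame // fg_step) * fg_step,   # keep face detail
                   (frame // bg_step) * bg_step)   # flatten the wallpaper
    return out.astype(np.uint8)

frame = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True            # pretend this region is the face
coded = quantize(frame, mask)
```

In a real object-based coder the mask would come from the vision system, and the savings would be taken in the entropy coder rather than by crude requantization, but the principle is the same.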
Along the way, we became interested in making the physical world smart and putting intelligence, interactivity, and context-awareness into things. And so, almost by accident, the name "Object-Based Media" remained relevant to our new work as well, so we kept it.
ICT Insights: What happens when self-aware content meets context-aware consumer electronics?
Dr. Bove: We are in a unique age right now. It was not so long ago that the computational power, connectivity, the amount of storage, and the kinds of interactions that were available to consumers were much less advanced than those available to industry or the military. Nowadays all of the really exciting stuff is happening in the consumer space and this shows up in a lot of different ways.
In terms of the context of self-aware content meeting context-aware electronics, what I'm really saying is you have a chain where the bits know something about themselves – they have rich metadata – and the electronics knows something about the environment and something about the users.
The ultimate goal is to make very rich communication systems. If we do it well, the human/computer interface disappears: I don't think I'm using a conferencing system, I just think I'm talking to you. I don't think I'm working with a computer program: I just have information in front of me and I can reach in and push it around and change how I look at it and share it with other people.
ICT Insights: So there still has to be something that creates that holographic image – some equipment?
Dr. Bove: Certainly. There are electronics, software, and a huge amount of cloud services computation in the background. And, when done right, it's all invisible. A good analogy is typography: if you notice the typography in a book you won't enjoy reading the book, unless maybe you are a typographer. And I think of much of Human-Computer Interaction (HCI) as being similar to that: if you notice it, it's wrong.
ICT Insights: Can you describe to us the project with Joi Ito [Media Lab Director] holding a hologram meeting with people from a distance, and expand on why that might be better than current telepresence?
Dr. Bove: The first goal is to make you forget there's a telepresence system. You know, there isn't a bunch of black boxes at one side of the room with a technician operating them; rather, there's a person sitting in a chair on the other side of the room. Now he may actually be in a hotel room on another continent but, subconsciously, I think there's a person sitting in that chair.
So how did we make that happen? We created a telepresence system in the form factor of an office chair. The mechanism we used is very old: a magician's trick called Pepper's Ghost, described as early as the 16th century and popularized in the mid-19th. If we take a half-silvered mirror and put it at an angle such that you can't see the mirror but can see what's reflected in it, we can make Joi appear reflected in the mirror: you see him in the chair when he's really in a remote place.
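The optics of Pepper's Ghost reduce to mirror geometry: the viewer perceives the hidden source at its mirror image across the plane of the half-silvered glass. The numbers below are illustrative, not the chair's real dimensions.

```python
import numpy as np

def reflect_point(p, plane_point, plane_normal):
    """Mirror point p across a plane given by a point on it and its normal."""
    n = np.asarray(plane_normal, dtype=float)
    n = n / np.linalg.norm(n)                        # ensure unit normal
    d = np.dot(np.asarray(p, dtype=float) - plane_point, n)
    return np.asarray(p, dtype=float) - 2 * d * n    # reflect across plane

# Glass tilted 45 degrees through the origin. A display hidden at
# (0, -1) -- say, below the seat -- is perceived at its mirror image
# (1, 0): floating where the sitter would be, not where the display is.
glass_normal = np.array([1.0, 1.0]) / np.sqrt(2)
image = reflect_point([0.0, -1.0], np.array([0.0, 0.0]), glass_normal)
```

Because the glass itself is nearly invisible at that angle, the reflected image is all the viewer registers, which is what sells the illusion.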
And this is a technology that resurfaces every so often; it's most recently been used to reconstruct deceased hip-hop artists on stage, and people call it a hologram: it's not. It's ancient and it's not even 3D, it's 2D. So the next thing that we did was create a 3D technology for electronic Pepper's Ghosts so we can make Joi appear in his chair as 3D. Now at the other end we can't just point a single camera at him: we have to point multiple cameras or a Kinect at him. But that's cheap and that's ubiquitous, and I can go to Target and buy everything I need at the capture end.
Now Joi appears as a 3D object sitting in his office chair, and it's even more natural.
The goal of building a holographic display is to make an object appear not inside a box but have it appear in space. Now it's hard for marketers to get their minds around "the best product is the one that you don't see." You know, in the future if that's your business, instead of having a Huawei logo on something, you might have to have a little bug in the corner the way broadcasters do on TV channels so they know it's your system, because there will be nothing else visible. Or maybe you could have Joi have a Huawei T-shirt on when he's in the system – you know, a virtual T-shirt.
The next step is a collaborative distributed experience that's better than what we could have had in the meeting room. How do we do that? First of all, of course, we need to bring all the people together. The second thing is we need to have a system that recognizes their goals and their intentions, that brings in additional resources, maybe from the cloud, maybe locally, so that we can collaborate about something.
ICT Insights: Do you already have all the technology that you need to accomplish all of this and move to the next steps of making this happen?
Dr. Bove: We have the technology. There are pieces of the chain that might be missing, so we might not have all of the protocols we need to send these things efficiently over existing networks, or we might not have certain pieces of software that need to be written. But if one threw enough money and enough people at it – which of course implies that you could come up with a business model – it could be done; certain parts of this are just straight engineering at this point.
There are places where people are doing holographic video by brute force: you can take a room, fill it with blade servers, and do the computations for an arbitrarily large display – yes, you need 10,000 watts of electricity to run it – and you can get huge quantities of the electro-optic hardware and make the display. So that's not going to be in your house any time soon...
Because we have decided we're trying to make these things with consumer-grade technology we have to use more finesse.
ICT Insights: So can you talk about the specific technologies that you use in your studies?
Dr. Bove: We use a huge range of technologies. On the input side, we use various combinations and types of range-sensing, machine-vision, gesture-recognition, and touch-sensing. We have a variety of algorithms for statistical pattern-recognition, machine learning, and so forth. We have a range of rendering algorithms just for working with images, doing interesting things with images, and turning image data into holograms in real-time. We have technologies for building holographic displays – we're actually making our own light-modulator chips on campus because we can't buy the kinds of chips we need for the display.
Some people would say that the whole interaction model is itself a technology. We have a variety of interaction models we work with, and we build various kinds of sensors, which are rather unusual and show up in rather unusual form factors. For instance, we provided the basketball nets for the Slam Dunk Competition at the NBA All-Star Game last year. Those were standard nets, but they could measure the energy of the dunked ball and transmit it to the broadcast system so it immediately showed the amount of energy behind the dunk.
It's an unusual form factor for a sensor, but it has the same characteristics as some of our other work. In that instance the sensor measures something and then makes it visible to a viewer – something they have not been able to see before. In the case of some of our context-aware work, we're figuring out what we can find out about a person or about an environment unobtrusively, and then using that to make technology behave in a more useful, appropriate, or helpful way.
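The net measurement described above amounts to reporting the ball's kinetic energy. The real sensor electronics and calibration aren't public; this toy version just assumes the net can infer the ball's speed and applies E = ½mv², with an illustrative ball mass.

```python
BALL_MASS_KG = 0.62   # roughly a regulation basketball (assumed value)

def dunk_energy_joules(speed_m_per_s, mass_kg=BALL_MASS_KG):
    """Kinetic energy carried by the ball through the net: E = 1/2 m v^2."""
    return 0.5 * mass_kg * speed_m_per_s ** 2

# A ball driven through the rim at 6 m/s:
energy = dunk_energy_joules(6.0)
```

Whatever the actual sensing mechanism, the broadcast-facing output is just this single number, which is what made the display so immediate for viewers.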