
I See Holograms Everywhere

I have seen the future, and it’s going to be even more amazing than our wildest dreams. Do I have your attention? Good. Then I’ll begin…

Have you seen The Time Machine movie? Well, I should say “movies” because there was the classic 1960 version starring Rod Taylor, Yvette Mimieux, and Alan Young, and then there was the more modern 2002 interpretation starring Guy Pearce, Orlando Jones, Samantha Mumba, Mark Addy, and Jeremy Irons.

Overall, I prefer the original 1960s film. Having said this, there are some scenes from the 2002 movie that remain stuck in my mind, two of which are directly related to this column. I’m talking about when the Time Traveler (played by Guy Pearce) interacts with an artificial intelligence (AI) in the form of a holographic librarian called Vox (played by Orlando Jones) in a futuristic library.

In the first encounter, the Traveler—who hails from 1900—visits the library in the year 2030, looking for answers about time travel and its implications. Vox, the AI holographic librarian, provides information about time travel theories and books but cannot fully address the Traveler’s existential questions.

The second encounter takes place much later, in 802,701 AD. Vox has survived over the centuries. In this meeting, Vox helps the Traveler—Alexander Hartdegen—understand the Eloi and Morlock societies, serving as a bridge between the advanced past and the degraded post-apocalyptic future.

Let’s remind ourselves that this film was released in 2002, which is only 23 years ago as I pen these words. If you had asked me at that time, I would have said that—while we would certainly be able to create AI-powered holograms like Vox at some point in the future—we were probably talking about a 2050 timeframe, not 2030 as in the movie.

Well, “Prediction is difficult—particularly when it involves the future,” as Mark Twain didn’t famously say (although he’s famously supposed to have said it). As a case in point, we have AI-powered Vox-level holographic displays today! “No!” I hear you gasp incredulously. “Yes!” I respond enthusiastically. Let me elaborate, explicate, and elucidate (fear not, I’m a professional).

I’ve said it before, and I’ll doubtless say it again: I’m incredibly lucky. In addition to my extraordinarily good looks, my ineffable wit (well, it’s not been “effed” thus far), and my internationally commented-on sense of style (my mom lives in England, and she often feels moved to comment on what I’m wearing… usually along the lines of, “You’re not going out dressed like that, are you?”), I get to talk with awe-inspiring people about their mindboggling technologies. (You may need to re-read the previous overlong sentence a couple of times. I could rewrite it, but I’m far too excited and enthused to do so.)

The reason I’m currently bouncing off the walls in excitement is that I was just chatting with Edward Ginis, who is Co-Founder and CTO at Proto. Formed in 2020, and currently boasting 65 employees, this company, I firmly believe, has the potential to explode onto the scene in more ways than one, including impinging on the collective consciousness to a greater extent even than ChatGPT (and that’s saying something, right there).

There are so many facets to this that it’s hard to know where to begin, but I think the photo below will provide a good starting point. This shows a life-sized 4K hologram of former professional baseball player George Brett, who spent his entire 21-season career (1973–1993) with the Kansas City Royals.

George Brett hologram (Source: Proto)

While on display at Kauffman Stadium in 2023, this holographic George surprised fans by telling them the story of the infamous Pine Tar Bat incident. The hologram was produced by Transcend Holographic Media and displayed using hardware and software from Proto.

Now, let’s take a step back, figuratively speaking. We’ll start with the hardware and the physical side of things. When we look at the image above, we see a box with depth. Everything about this is designed to trick our brains into thinking we are looking at a real-life, real-size, real-time, volumetric, “hologram in a box.” In fact, what we are really looking at here is a 4K image presented on a 2D flat screen that’s recessed into the front of the box, which is the only thing with depth.

Of course, there’s a lot more to this box than the display, but we’ll get to that later. The point we are interested in here is that the guys and gals at Proto design and manufacture this hardware. The original version cost $65,000. This price has already fallen to $28,000. And the price will plummet further in years to come. There’s also a desktop Proto M unit (the body of which is about 21” tall) as shown below. This is currently priced at $6,500, but I expect it won’t be long before this falls below $1,000, at which point it will firmly move into consumer space.

The desktop Proto M unit (Source: Proto)

Now look at the image of the lady in the photo above. Is this a video of a real-life person? No, it’s not. This is where we come to the software created and supplied by Proto. First and foremost, you don’t need a complicated multi-camera setup to do this. All you need is a single camera. Edward tells me that an iPhone or an iPad will do the job.

So, they record the person talking about something for a few minutes. From this they extract models that I think of as the “vocal clone” and the “physical clone.” The vocal clone is a digital representation of the person’s unique vocal characteristics that is extracted using algorithms that analyze pitch, tone, cadence, and other vocal patterns. The physical clone includes a 3D mesh of the face and body that captures 3D representations of geometry, proportions, and expressions, along with coupled expression models, pose models, texture maps, and a bunch of other stuff.
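Proto hasn’t published the algorithms behind its vocal clone, so purely for illustration, here’s a toy Python sketch of one of the vocal characteristics mentioned above—pitch—estimated by autocorrelation from a raw waveform. (Everything here is invented for demonstration; a synthetic sine wave stands in for a recorded voice, and a production system would use far more sophisticated signal processing.)

```python
import math

def estimate_pitch(samples, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate the fundamental frequency (pitch) via autocorrelation:
    find the lag at which the signal best matches a shifted copy of itself."""
    best_lag, best_score = 0, 0.0
    lo = int(sample_rate / fmax)   # shortest plausible period, in samples
    hi = int(sample_rate / fmin)   # longest plausible period, in samples
    for lag in range(lo, hi + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

# Synthetic 220 Hz tone standing in for a few seconds of recorded voice
sr = 8000
tone = [math.sin(2 * math.pi * 220 * n / sr) for n in range(2048)]
pitch = estimate_pitch(tone, sr)   # lands within a few Hz of 220
```

Real pipelines track pitch, tone, and cadence over time, not as single numbers, but the principle—extracting measurable vocal characteristics from the waveform—is the same.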

Proto’s AI-based software also automatically removes any background details leaving only the person and the clothes they are wearing. It extrapolates and interpolates any parts it cannot see to further the impression you are looking at a true 3D image, and it adds lighting and shadows to complete the illusion that you are looking at a 3D volumetric holographic representation.
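The background removal described above is done with AI-based segmentation, whose details Proto hasn’t published. As a stand-in, here’s a deliberately simple chroma-key-style sketch in Python—masking out pixels that match a known background color—just to illustrate the masking step (the function name and tolerance are invented for this example).

```python
def remove_background(pixels, bg, tol=30):
    """Keep only foreground pixels: any pixel within `tol` of the
    background colour `bg` is replaced with None (i.e., transparent)."""
    def close(a, b):
        return all(abs(x - y) <= tol for x, y in zip(a, b))
    return [None if close(p, bg) else p for p in pixels]

# Three RGB pixels against a green background: only the red one survives
pixels = [(0, 255, 0), (200, 40, 40), (10, 250, 5)]
foreground = remove_background(pixels, bg=(0, 255, 0))
# foreground == [None, (200, 40, 40), None]
```

An AI segmenter does the same job without needing a uniform background color—it learns what “person” looks like—but the output is conceptually the same mask.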

In addition to the “vocal clone” model and the “physical clone” model, we need some sort of “knowledge clone” model in the form of an LLM (the Proto software is LLM agnostic and will work with the LLM of your choice).

There are several ways to generate such a knowledge clone. One is to provide the LLM with some generic background information, and to then let it interview the subject, asking questions and listening to and analyzing the responses, which leads to more questions, often heading off in unexpected directions, much the same as when two humans are chatting and getting to know each other. Another way to generate the knowledge clone is to simply feed the LLM with a bunch of data. And, of course, you can combine both techniques.
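Since the Proto software is LLM agnostic, it’s worth seeing what “agnostic” means in practice. The toy Python sketch below (all class and function names are my invention, not Proto’s API) shows a knowledge clone that accepts any model as a plug-in callable and feeds it retrieved context along with the question—the “feed the LLM a bunch of data” approach in miniature.

```python
class KnowledgeClone:
    """A minimal LLM-agnostic 'knowledge clone': a pile of ingested
    documents plus any pluggable model (prompt in, text out)."""
    def __init__(self, complete_fn):
        self.complete = complete_fn   # any callable: prompt -> answer text
        self.docs = []

    def ingest(self, text):
        self.docs.append(text)

    def ask(self, question):
        # Naive keyword retrieval: rank documents by word overlap
        qwords = set(question.lower().split())
        ranked = sorted(self.docs,
                        key=lambda d: len(qwords & set(d.lower().split())),
                        reverse=True)
        context = "\n".join(ranked[:2])
        prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        return self.complete(prompt)

# Stand-in "LLM" that just echoes the most relevant context line
fake_llm = lambda prompt: prompt.split("Context:\n")[1].split("\n")[0]

clone = KnowledgeClone(fake_llm)
clone.ingest("Tribbles are born pregnant and multiply rapidly.")
clone.ingest("Vulcan mating rituals occur every seven years.")
answer = clone.ask("How quickly do Tribbles multiply?")
```

Swapping in a real model—ChatGPT, Claude, a local Llama—means changing only the `complete_fn` argument, which is exactly what makes the architecture LLM agnostic.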

Once you have your vocal, physical, and knowledge clones, you can have a conversation with the avatar. For example, they did all this with William Shatner, who—among many other roles—is famous for his portrayal of Captain James Tiberius Kirk in Star Trek: The Original Series.

After all these years, William has grown weary of answering endless questions about topics like Tribbles. Thus, in addition to the portion of the knowledge clone that was created by interviewing William, they also fed the LLM with the scripts and stage directions for all the Star Trek episodes. This means that while real William is busy doing things like going into space on a private Blue Origin flight, avatar William can be answering questions about things like Tribble diets and Vulcan mating rituals (I know which of these activities I’d rather do myself).

There’s so much more to this than meets the eye. For example, when you ask a question of a typical AI assistant like an Amazon Echo, following its recognition of the “Alexa” keyword, it waits for you to finish asking your question, then it packages everything up and beams it into the cloud, then it presents the response, all of which results in a perceptible delay, thereby making human-assistant interactions awkward and stilted.

That’s not the way Proto does things. If you and I were having a conversation and you were talking (assuming I gave you a chance to get a word in edgewise), then—much like a game of chess—I would be planning appropriate responses. Sometimes I may have several potential responses poised depending on where you take the conversation.

Similarly, when you are conversing with one of Proto’s avatars, it will be engaging in non-verbal communication (smiling, pursing its lips, blinking, etc.) while you are talking. Meanwhile, the cloud-based AI will be planning a variety of responses—commencing from the moment you utter your first word—and streaming all of them down to the avatar. Thus, as soon as you’ve finished talking (even before you’ve finished if the occasion demands), the avatar will commence its response.
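Proto hasn’t described its pipeline in code, but the “plan responses while the user is still talking” idea can be sketched as a speculative cache. In the toy Python below (class and method names invented for this illustration), a reply is pre-generated for each partial transcript as it streams in, so that when the utterance ends, the answer is already waiting.

```python
class SpeculativePlanner:
    """Toy sketch: pre-plan replies while the user is still talking,
    so the avatar can respond the instant they stop."""
    def __init__(self, responder):
        self.responder = responder     # any callable: partial text -> reply
        self.candidates = {}

    def hear(self, partial_utterance):
        # Speculate on a reply for the transcript so far, keyed by prefix
        self.candidates[partial_utterance] = self.responder(partial_utterance)

    def finish(self, final_utterance):
        # Zero extra latency if we already speculated on this utterance;
        # fall back to generating from scratch only on a cache miss
        return self.candidates.get(final_utterance) \
            or self.responder(final_utterance)

responder = lambda text: f"Reply to: {text}"
planner = SpeculativePlanner(responder)
for prefix in ("what", "what is", "what is a Tribble"):
    planner.hear(prefix)                        # streamed as the user speaks
reply = planner.finish("what is a Tribble")     # answer is already waiting
```

The real system streams multiple candidate responses from the cloud down to the box, but the latency trick is the same: do the thinking during, not after, the question.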

It’s also important to wrap your brain around the fact that, in addition to generating audio that sounds just like the original person’s voice, the AI will be generating the corresponding video on a frame-by-frame basis, with perfect lip-sync (and other facial musculature actions). The final rendering of the voice and video takes place in Proto’s box.
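The lip-sync portion of that frame-by-frame generation is commonly driven by mapping phonemes to visemes (the mouth shapes that correspond to sounds). The Python sketch below is a drastically simplified illustration of that idea—the mapping table and shape names are invented, and a real facial rig uses dozens of visemes with smooth blending between them.

```python
# Invented, drastically simplified phoneme-to-viseme table
PHONEME_TO_VISEME = {
    "AA": "open_jaw", "IY": "wide_smile", "UW": "rounded_lips",
    "M": "closed_lips", "B": "closed_lips", "F": "teeth_on_lip",
}

def viseme_track(phonemes, frames_per_phoneme=3):
    """Expand a phoneme sequence into a per-frame viseme track,
    i.e., which mouth shape to render on each video frame."""
    track = []
    for p in phonemes:
        track.extend([PHONEME_TO_VISEME.get(p, "neutral")] * frames_per_phoneme)
    return track

frames = viseme_track(["M", "AA", "M"])   # e.g., the word "mom"
# 9 frames: closed lips, open jaw, closed lips again
```

Because the track is derived from phonemes rather than from any particular language’s text, the same machinery delivers perfect lip-sync whether the avatar is speaking English or Dutch.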

There are so many aspects to this that it will make your head spin. For example, you can swap out the underlying knowledge clone, exchanging the Star Trek trivia portion of the William Shatner avatar’s LLM for a knowledge base associated with infectious diseases. This is equivalent to being able to swap the avatar’s “brain,” pulling the old one out and seamlessly popping a new one in.

Once the new “brain” is installed, you could ask your William Shatner avatar questions such as, “How have advancements in genomic sequencing transformed our ability to track and respond to emerging infectious diseases?” More importantly, you could receive an answer like, “Advancements in genomic sequencing have revolutionized infectious disease response by enabling rapid identification of pathogens, tracking their evolution and spread, and accelerating vaccine and treatment development” (as opposed to an answer like “The trouble with Tribbles is…”). 
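The “brain swap” works because the voice and physical clones are decoupled from the knowledge clone. Here’s a tiny Python illustration of that decoupling (the `Avatar` class and both stand-in brains are invented for this sketch; neither reflects Proto’s actual code):

```python
class Avatar:
    """Toy sketch: the voice/face identity stays fixed while the
    knowledge 'brain' is a replaceable component."""
    def __init__(self, name, brain):
        self.name = name
        self.brain = brain          # any callable: question -> answer

    def ask(self, question):
        return self.brain(question)

# Two interchangeable "brains" for the same avatar
trek_brain = lambda q: "The trouble with Tribbles is..."
epi_brain = lambda q: "Genomic sequencing enables rapid pathogen identification."

shatner = Avatar("William Shatner", trek_brain)
before = shatner.ask("Tell me about genomic sequencing.")
shatner.brain = epi_brain           # the "brain swap"
after = shatner.ask("Tell me about genomic sequencing.")
# Same face and voice; completely different expertise
```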

There’s also the fact that this system can support multiple languages. Suppose we stick with the original William Shatner avatar—the one whose knowledge clone encompasses all things related to Star Trek: The Original Series. Now suppose someone for whom Dutch is their first language walks up to the avatar and says, “Wat is het grootste probleem dat je hebt ervaren met Tribbles?” (which means, “What’s the biggest problem you’ve experienced with Tribbles?”).

The avatar might either burst into tears and reply, “O God, niet weer!” (“Oh God, not again!”), or it might restrain itself and—with William’s iconic grin—say something like, “Om te voorkomen dat ze zich sneller vermenigvuldigen dan het script aankan!” (“Keeping them from multiplying faster than the script could handle!”). Remember that the video is generated frame-by-frame on the fly, so the avatar’s lips and other facial muscular movements would be perfectly synchronized with the phonemes (the perceptually distinct units of sound) in this new language.

The most recent example I saw of this technology involved a 3-way conversation between a human and two avatars with expertise on some technical topic. All I can say is, “Wow!”

So, what are the potential uses for Proto’s AI-powered volumetric holographic displays? The more I think about this, the more I realize that the possibilities are boundless, and we truly are limited only by our imaginations. Take airports, for example. I can’t tell you how many times I’ve arrived somewhere, only to find the information desk empty with a sign saying, “Back in one hour” (if I’m lucky) or “Back at 7:00 am tomorrow morning” (if I’m not). Wouldn’t it be awesome to have an AI avatar that spoke to you in your own language and could answer all your questions, including having up-to-the-minute access to the weather situation, gate changes, and flight delays?

How about retail settings? If I had my choice, I would have one of these bodacious beauties at the end of every aisle in Walmart and in Home Depot, and that’s just the start. Remember that the avatar doesn’t just hear you talk, it can see your facial expressions (so it will know when you are happy, sad, confused, frustrated, or angry, allowing it to respond appropriately) and it can see what you are holding in your hands (so it will know when to duck). I can imagine a first-time home-plumbing hero holding up some “doodad” that they’d recently extracted from under the kitchen sink and asking, “What’s this called, what’s it for, and where can I find one?” and the avatar responding, “Bless your little cotton socks. That’s a P-trap. It prevents sewer gases from entering the home while allowing wastewater to flow out to the drainage system. You can find them on aisle 36 (halfway down the aisle on the bottom shelf on the right-hand side).”

Then there are medical applications. A lot of people are embarrassed about talking to doctors about certain conditions. They would be much happier conversing with an avatar. Furthermore, remember that the doctor avatar can both see and hear the patient. While chatting, the avatar might observe a variety of physical signs, like white flecks in fingernails, the color of the skin (e.g., a yellowish hue could indicate liver problems, while a bluish tone may signal oxygenation or heart issues), attributes of the eyes (e.g., drooping eyelids or uneven pupils, both of which may point to a neurological problem), trembling of the hands (which could indicate a problem like multiple sclerosis or Parkinson’s disease), breathing patterns (rapid or labored breathing can signal respiratory or cardiac issues), and… the list goes on and on and on. It would be like having your own House M.D. Even better, via its access to the internet, the doctor avatar would be 100% up-to-date with any recent occurrences (and their locations) of unusual medical conditions of which a human doctor might be unaware.

Another fantastic application would be in care facilities and nursing homes—especially for people who are cognitively challenged, elderly people who are lonely, and people suffering from some form of cognitive decline or dementia. For all these people, the AI avatar could be a friend, listening to the same stories over and over, and answering the same questions over and over, without becoming bored or tired or angry. This would dramatically ease the task of the nursing staff.

My dad passed away in January 2000. I was with him when he died. We were lucky enough to see the dawn of the new millennium together. I often wish I could chat with him while I still maintain a corporeal presence on this plane of existence. That’s no longer a possibility but—by means of Proto’s technology—it may well be that my grandchildren, my great-grandchildren, and my great-great-grandchildren will be able to chat with a 3D volumetric AI avatar of your humble narrator.

Many people love Ancestry.com and all the information it can make available. They find it fascinating to access old photographs and census records of their forebears as they trace their family tree. Now imagine what Ancestry.com might look like a hundred years or so in the future. Rather than seeing simple photographs of their predecessors, our descendants could engage their AI avatar ancestors in conversation.

You may think that all of this sounds weird. Maybe it is, but I’m reminded of an artist called Michelle Huang, who kept a diary from the age of 7. In this diary, Michelle captured her thoughts about herself, her friends, her fears, and her goals. In 2022, Michelle fed the contents of these diaries into a chatbot, after which she engaged in text-based conversations with her younger self. An article on CNET offers snippets from some of these conversations that proved to be deeply emotional, sometimes even cathartic.

I’ve saved the best for last. Well, it’s the best as far as I’m concerned because it’s all about me (as it should be). As I’ve mentioned before (and as I’ll almost certainly mention again), I’m currently in the process of writing a book about my formative years. My target audience is kids from around 8 to 12. The title of this book will be The Life of (a boy called) Clive (this is because “Clive” is my real name—Max the Magnificent is only my stage name). As it says (or as it will say) on the back cover:

Did you ever wonder what it was like to grow up in the swinging 1960s (specifically, to grow up as a young boy called Clive at 88 Springfield Road in the suburb of Millhouses in the city of Sheffield in the county of Yorkshire in the country of England)? If so, this book is for you!

This isn’t going to be a chapter book. It’s just a series of short stories with funny titles in chronological order that takes the reader from when I was born to my first day at the “big boys’ school” when I was 11 years old. I’ve been working on this for ages. It’s about 250 pages. I’m almost finished (thank goodness).

Well, Edward painted a picture of a possible future. He asked me to imagine being videoed and interviewed by Proto’s AI to create a first-pass avatar. Edward said that the next step would be to upload The Life of Clive into the LLM, and to then let the AI interview me, asking questions like, “What did you feel when you fell into the water barrel?” and “What did you feel when you fell out of the sixth-floor hotel window?” and “What did you feel when you fell off the cliff?”

The idea is that, just like when two people are chatting with each other, one thought leads to another, often kicking off entirely new topics. When all this was complete, readers could ask me (well, my avatar) endless questions about my childhood, and I (well, my avatar) would never get tired of answering them.

This could be a whole new market. Book stores could have Proto holographic avatars present to answer questions when new books come out. Libraries could do the same. I read lots of books. I would love to be able to chat with their authors. Maybe, one day soon, I will be able to do so.

What say you? As always, I’d love to hear your thoughts on all of this. For example, have you read anything here that made you say, “Wait, what?” And can you think of any other use cases for Proto’s 3D volumetric AI avatars that I failed to cover?
