How roboticists think about generative AI

Photo credit: Toyota Research Institute

[A version of this piece first appeared in TechCrunch’s robotics newsletter, Actuator. Subscribe here.]

The topic of generative AI comes up frequently in my newsletter Actuator. I admit that a few months ago I was a little hesitant to spend more time on the topic. Anyone who has been covering technology for as long as I have has lived through countless hype cycles and been burned. Covering technology requires a healthy dose of skepticism, hopefully tempered by some excitement about what can be done.

This time around, generative AI seemed to be waiting in the wings, biding its time until the inevitable cratering of crypto. As that category bled out, projects like ChatGPT and DALL-E stood ready to become the focus of the breathless reporting, hope, criticism, doomerism and all the various Kübler-Rossian stages of the tech hype bubble.

Those who follow my stuff know that I have never been particularly bullish on cryptocurrencies. However, the situation is different with generative AI. First of all, there is near universal agreement that artificial intelligence/machine learning will broadly play a more central role in our lives in the future.

Smartphones offer great insights here. I write about computational photography fairly regularly. There has been great progress in the area in recent years, and I think many manufacturers have finally struck a good balance between hardware and software, both in improving the end product and in lowering the barrier to entry. Google, for example, pulls off some really impressive tricks with editing features like Best Take and Magic Eraser.

Sure, they’re nice tricks, but they’re also useful and not features for features’ sake. However, going forward, the real trick will be integrating them seamlessly into the experience. In ideal future workflows, most users will have little to no idea what goes on behind the scenes. You’ll just be happy that it works. It’s the classic Apple playbook.

Generative AI offers a similar “wow” effect right from the start, which is another point of difference from its hype cycle predecessor. If your least tech-savvy relative can sit at a computer, type a few words into a dialog box, and then watch the black box spit out images and short stories, not much conceptualization is required. That’s one of the main reasons this has caught on so quickly: when everyday people are presented with cutting-edge technologies, they usually have to imagine what it might look like in five or ten years.

With ChatGPT, DALL-E and the rest, you can now experience it firsthand. The downside, of course, is how difficult it becomes to temper expectations. Just as people tend to imbue robots with human or animal intelligence, without a basic understanding of AI it is easy to project intentionality here. But that's how these things go now: we start with the attention-grabbing headline and hope people stick around long enough to read about the machinery behind it.

Spoiler alert: nine times out of ten they don’t, and suddenly we’re spending months and years trying to bring things back to reality.

One of the nice perks of my job is the ability to talk through these things with people who are much smarter than I am. They take the time to explain things, and hopefully I'm able to translate it well for readers (some attempts are more successful than others).

As it became clear that generative AI would play an important role in the future of robotics, I found ways to incorporate questions into conversations. I find that most people in this field agree with the statement in the previous sentence, and it’s fascinating to see what impact they think it will have.

For example, in my recent conversation with Marc Raibert and Gill Pratt, the latter explained the role that generative AI plays in their approach to robot learning:

One thing we’ve figured out how to do is use modern generative AI techniques that allow humans to demonstrate both position and force to essentially teach a robot from just a handful of examples. The code is not changed at all. The basis for this is something called diffusion policy. It’s work we did in collaboration with Columbia and MIT. We’ve taught 60 different skills so far.
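
At the risk of over-explaining for a newsletter, here is roughly what "diffusion policy" means in code: a network is trained to denoise action sequences conditioned on what the robot observes, so at run time it can start from noise and refine it into a trajectory. To be clear, this is an illustrative toy sketch, with placeholder dimensions, network and data; it is not TRI's implementation.

```python
# Toy sketch of the diffusion-policy idea: learn to predict the noise added
# to demonstrated action sequences, conditioned on the observation.
# Illustrative only -- not TRI's code; all sizes are placeholders.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, HORIZON, STEPS = 10, 7, 16, 50

class NoisePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM * HORIZON + 1, 256), nn.ReLU(),
            nn.Linear(256, ACT_DIM * HORIZON),
        )

    def forward(self, obs, noisy_actions, t):
        # Predict the noise that was mixed into the flattened action sequence.
        x = torch.cat([obs, noisy_actions.flatten(1), t.float().unsqueeze(1) / STEPS], dim=1)
        return self.net(x).view(-1, HORIZON, ACT_DIM)

betas = torch.linspace(1e-4, 0.02, STEPS)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def train_step(model, opt, obs, actions):
    """One denoising step on (observation, demonstrated action sequence) pairs."""
    t = torch.randint(0, STEPS, (obs.shape[0],))
    noise = torch.randn_like(actions)
    a_bar = alphas_bar[t].view(-1, 1, 1)
    noisy = a_bar.sqrt() * actions + (1 - a_bar).sqrt() * noise
    loss = nn.functional.mse_loss(model(obs, noisy, t), noise)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Stand-in demonstration data; in practice this comes from human teaching.
model = NoisePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
print(train_step(model, opt, torch.randn(32, OBS_DIM), torch.randn(32, HORIZON, ACT_DIM)))
```

At run time, the trained network is applied in reverse: start from random noise and repeatedly denoise it, conditioned on the current observation, to produce the action sequence the robot executes.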

When I asked Nvidia’s VP and GM of embedded and edge computing, Deepu Talla, last week why the company thinks generative AI is more than a fad, he told me:

I think it shows in the results. You can already see the productivity improvement. It can compose an email for me. It's not exactly right, but I don't have to start from zero. It gives me 70%. There are obvious things you can already see that are definitely a step better than before. Summarizing something isn't perfect; I'm not going to let it read and summarize things for me just yet. So you can already see some signs of productivity improvements.

During my recent conversation with Daniela Rus, the head of MIT CSAIL explained how researchers are using generative AI to actually design the robots:

It turns out that generative AI can be very powerful at solving even motion-planning problems. You get much faster solutions and much more fluid, human-like solutions for control than with model-predictive approaches. I think this is very impactful, because the robots of the future will be much less robotic. Their movements will be much more fluid and human-like.

We’ve also used generative AI for design. This is very powerful. It’s also very interesting, because it’s not just pattern generation for robots. It has to do something more than generate a pattern based on data: the machines have to make sense in the context of physics and the physical world. That’s why we connect the models to a physics-based simulation engine, to ensure the designs meet their required constraints.
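
The loop Rus describes is essentially "generate, then validate": a model proposes candidate bodies, and a physics-based simulation throws out anything that can't actually exist or function. The sketch below is a deliberately crude stand-in for that pattern; here the generator is random sampling and the "physics" is a single static check, whereas a real pipeline would sample from a trained model and call an actual simulation engine.

```python
# Toy generate-then-validate loop: propose candidate robot designs, keep only
# the ones that pass a (stand-in) physics-based feasibility check.
import random
from dataclasses import dataclass

@dataclass
class Design:
    limb_count: int
    limb_length_m: float
    body_mass_kg: float

def propose_design() -> Design:
    # Stand-in for sampling from a trained generative design model.
    return Design(
        limb_count=random.choice([2, 4, 6]),
        limb_length_m=random.uniform(0.05, 0.5),
        body_mass_kg=random.uniform(0.1, 5.0),
    )

def passes_physics_check(d: Design) -> bool:
    # Stand-in for a physics-based simulation: a crude static check that the
    # limbs can plausibly support the body's mass.
    max_supported_mass_kg = d.limb_count * d.limb_length_m * 10.0
    return d.body_mass_kg <= max_supported_mass_kg

candidates = [propose_design() for _ in range(1000)]
valid = [d for d in candidates if passes_physics_check(d)]
print(f"{len(valid)} of {len(candidates)} proposals satisfied the constraints")
```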

This week, a team from Northwestern University showcased its own research into AI-generated robot design. The researchers demonstrated how they designed, in a matter of seconds, a robot that successfully walks. It's not much to look at, but it's easy to see how, with additional research, the approach could be used to create more complex systems.

“We have discovered a very fast AI-driven design algorithm that bypasses the bottlenecks of evolution without resorting to the biases of human designers,” said research lead Sam Kriegman. “We told the AI we wanted a robot that could walk on land. Then we just pressed a button and off we went! It quickly created a blueprint for a robot that looks nothing like any animal that has ever lived on Earth. I call this process ‘instant evolution.’”

It was the AI program’s decision to give the small, soft robot legs. “This is interesting because we didn’t tell the AI that a robot should have legs,” Kriegman added. “It rediscovered that legs are a good way to move around on land. Legged locomotion is, in fact, the most efficient form of locomotion on land.”
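
One way to read "bypassing the bottlenecks of evolution" is that instead of mutating and selecting designs over many generations, the algorithm improves a design directly, which is why it can finish in seconds rather than eons. The snippet below contrasts the two ideas on a made-up, differentiable stand-in for "how well the body walks"; it is purely illustrative and not the Northwestern team's method.

```python
# Toy contrast: direct gradient-driven design vs. evolutionary search, on a
# hypothetical smooth fitness function standing in for walking ability.
import torch

def fitness(body: torch.Tensor) -> torch.Tensor:
    # Made-up surrogate that rewards one particular body layout.
    target = torch.linspace(0.0, 1.0, body.numel())
    return -((body - target) ** 2).sum()

# Direct optimization: follow the fitness gradient.
body = torch.rand(8, requires_grad=True)
opt = torch.optim.Adam([body], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    (-fitness(body)).backward()
    opt.step()
print("directly optimized fitness:", fitness(body).item())

# Evolutionary baseline: mutate, select, repeat -- many more evaluations.
population = [torch.rand(8) for _ in range(32)]
for _ in range(200):
    population.sort(key=lambda b: -fitness(b).item())
    parents = population[:16]
    population = parents + [b + 0.05 * torch.randn(8) for b in parents]
population.sort(key=lambda b: -fitness(b).item())
print("evolved fitness:", fitness(population[0]).item())
```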

“In my view, generative AI and physical automation/robotics will change everything we know about life on Earth,” Formant founder and CEO Jeff Linnell told me this week. “I think we’re all aware that AI is a thing, and we expect that every one of our jobs, every company and every student will be affected by it. I think it’s symbiotic with robotics. You’re not going to have to program a robot. You’ll speak to the robot in English, request an action, and it will figure it out. That’s going to take a minute.”

Prior to Formant, Linnell founded and served as CEO of Bot & Dolly. The San Francisco-based company, best known for its work on the film Gravity, was acquired by Google in 2013, when the software giant set out to push the industry forward (the best-laid plans, etc.). He tells me his main takeaway from that experience is that it’s all about the software (given the folding of Intrinsic and Everyday Robots into DeepMind, I’m inclined to say Google agrees).