AI | August 8, 2024

NVIDIA is using Apple Vision Pro to build next-generation AI-powered humanoid robots



NVIDIA has revealed that mixed reality devices like Apple Vision Pro can be used to help developers accelerate the development of humanoid robots.

NVIDIA unveiled project GR00T in March, “a general-purpose foundation model for humanoid robots, designed to further its work driving breakthroughs in robotics and embodied AI.”

The “moonshot” initiative is an attempt to build a universal AI brain for humanoid robot platforms that definitely won’t rise up, take over the world, and kill us all. ‘What could possibly go wrong?’ aside, NVIDIA has revealed how devices like Apple Vision Pro can help power its new synthetic data generation pipeline to turn human movements into simulated commands a robot can follow.

iRobot

Video: NVIDIA Accelerating the Future of AI & Humanoid Robots (YouTube)

In a video, the company has revealed how a new set of tools for developers in the humanoid robot ecosystem will help them build their AI models better and more efficiently, thanks to the aforementioned new synthetic data generation pipeline.

NVIDIA says that the new platform can collect human demonstrations of movement using a device like Apple Vision Pro — making toast, for example. Once gathered, the data can be multiplied by “1000x or more” using NVIDIA’s simulation tools.
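
NVIDIA hasn’t published the internals of that pipeline, but the basic idea of multiplying a handful of recorded demonstrations into a much larger synthetic set can be sketched in a few lines of Python. Everything below, from the trajectory format to the noise-based perturbation, is an illustrative assumption rather than NVIDIA’s actual tooling:

```python
import numpy as np

def multiply_demonstrations(demo, n_copies=1000, noise_scale=0.01, seed=0):
    """Expand one recorded demonstration into many synthetic variants.

    `demo` is assumed to be a (T, D) array of end-effector poses sampled
    from a headset recording; each copy gets small Gaussian perturbations,
    standing in for the domain randomization a simulator would apply.
    """
    rng = np.random.default_rng(seed)
    demo = np.asarray(demo, dtype=np.float64)
    synthetic = []
    for _ in range(n_copies):
        jitter = rng.normal(0.0, noise_scale, size=demo.shape)
        synthetic.append(demo + jitter)
    return np.stack(synthetic)  # shape: (n_copies, T, D)

# Example: one 200-step, 7-DoF trajectory becomes 1,000 training trajectories.
recorded = np.zeros((200, 7))
dataset = multiply_demonstrations(recorded)
print(dataset.shape)  # (1000, 200, 7)
```

In practice the simulator would vary object positions, lighting, and robot embodiment rather than just adding noise, but the multiplication principle is the same.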

Once the data is gathered, it can then be reapplied and used to train a humanoid robot that definitely won’t turn on you and your family. NVIDIA says it’s “one step closer to solving the AI brain for humanoid robots,” and Apple Vision Pro is part of the solution.

It’s not the first time NVIDIA has leveraged Apple Vision Pro to give us a glimpse of the future. Earlier this year it unveiled its OpenUSD-based Omniverse enterprise digital twins for Apple Vision Pro. It demoed a full-fidelity digital twin of a car streamed directly to Apple Vision Pro, letting a designer toggle paint, choose trim, and even sit inside the vehicle, using spatial computing to blend 3D photorealistic environments with the physical world.



JAKARTA - Nvidia has introduced a new control service that lets developers work on humanoid robot projects in which the robots are controlled and monitored using the Apple Vision Pro.

Developing humanoid robots currently faces many challenges, one of which is controlling these highly technical devices. To help in this area, Nvidia has provided a number of tools for robotic simulation, including some that assist with control.

Nvidia is providing these tools to major robot manufacturers and software developers; the suite of models and platforms is meant to help train a new generation of humanoid robots.

The collection of tools includes what Nvidia calls NIM microservices and frameworks intended for simulation and learning. There is also the Nvidia OSMO orchestration service for handling multi-stage robotics workloads, as well as AI- and simulation-supported teleoperation workflows.

As part of this workflow, headsets and spatial computing devices like the Apple Vision Pro can be used not only to view data but also to control hardware.

“The next wave of AI is robotics and one of the most exciting developments is humanoid robots,” Nvidia CEO and founder Jensen Huang said. “We’re advancing the entire NVIDIA robotics stack, opening access for worldwide humanoid developers and companies to use the platforms, acceleration libraries and AI models best suited for their needs.”

The NIM microservices are prebuilt containers that use Nvidia’s inference software and are meant to reduce deployment times. Two of these microservices are designed to assist developers with simulation workflows for generative physical AI in Nvidia Isaac Sim, a reference application.

One of these microservices, MimicGen NIM, helps users control hardware using the Apple Vision Pro or other spatial computing devices. It generates synthetic motion data for robots based on “recorded teleoperated data,” translating movements captured by the Apple Vision Pro into movements for the robot to carry out.

Videos and images show that this is more than just moving a camera based on the headset’s motion; hand movements and gestures are also recorded and used, captured by the Apple Vision Pro’s sensors.

In effect, users can watch the robot’s movements and directly control its hands and arms, all using the Apple Vision Pro.
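
As a rough illustration of that teleoperation step, the sketch below maps a single tracked hand sample to a robot end-effector command. The field names and thresholds are hypothetical and stand in for whatever the headset’s hand-tracking data and the robot’s control interface actually provide:

```python
from dataclasses import dataclass

@dataclass
class HandSample:
    # Hypothetical fields standing in for what a headset's hand tracker reports.
    wrist_xyz: tuple       # wrist position in metres, headset frame
    pinch_strength: float  # 0.0 (open) .. 1.0 (fully pinched)

def retarget(sample: HandSample, scale: float = 1.0):
    """Map one tracked hand sample to a robot end-effector command.

    A minimal stand-in for the teleoperation step: the wrist position becomes
    the arm's Cartesian target and pinch strength drives the gripper.
    """
    x, y, z = sample.wrist_xyz
    return {
        "ee_target_xyz": (scale * x, scale * y, scale * z),
        "gripper_closed": sample.pinch_strength > 0.7,  # simple threshold
    }

print(retarget(HandSample(wrist_xyz=(0.3, 0.1, 0.5), pinch_strength=0.9)))
```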

While such a humanoid robot could try to mimic the movements exactly, a system like Nvidia’s can instead infer what the user wants to do. Since users have no tactile feedback for what the robot is holding, directly mimicking hand movements can be too dangerous.

Another teleoperation workflow demonstrated at SIGGRAPH also lets developers create large amounts of motion and perception data, all generated from a small number of demonstrations remotely captured by a human.

For this demonstration, the Apple Vision Pro is used to capture a person’s hand movements. These movements are then used to simulate recordings with the MimicGen NIM microservice and Nvidia Isaac Sim, which generate synthetic datasets.

Developers can then train a Project GR00T humanoid model with a combination of real and synthetic data, a process that helps reduce the cost and time of creating data from scratch.

“Developing humanoid robots is extremely complex, requiring an incredible amount of real data, tediously captured from the real world,” said Alex Gu, CEO of robotics platform maker Fourier. “NVIDIA’s new simulation and generative AI developer tools will help bootstrap and accelerate our model development workflows.”

The microservices, along with access to the models, the OSMO managed robotics service, and other frameworks, are all offered under the Nvidia Humanoid Robot Developer Program. The company provides access only to software developers, hardware makers, and humanoid robot manufacturers.



New Omniverse Cloud APIs to let developers stream interactive, industrial digital twins into Apple Vision Pro.

NVIDIA is bringing OpenUSD-based Omniverse enterprise digital twins to the Apple Vision Pro.

Announced today at NVIDIA GTC, a new software framework built on Omniverse Cloud APIs, or application programming interfaces, lets developers easily send their Universal Scene Description (OpenUSD) industrial scenes from their content creation applications to the NVIDIA Graphics Delivery Network (GDN), a global network of graphics-ready data centers that can stream advanced 3D experiences to Apple Vision Pro.
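
The Omniverse Cloud API calls themselves aren’t shown here, but the starting point of that hand-off, an OpenUSD scene authored in a content tool, can be sketched with the open-source USD Python bindings. The prim names and geometry below are placeholders; this only illustrates the kind of `.usda` stage a developer would then publish for streaming:

```python
# Requires the OpenUSD Python bindings (e.g. `pip install usd-core`).
from pxr import Usd, UsdGeom

# Author a trivial OpenUSD stage of the kind a content tool would hand off;
# the prim names and the cube stand-in are arbitrary placeholders.
stage = Usd.Stage.CreateNew("car_configurator.usda")
car = UsdGeom.Xform.Define(stage, "/Car")
body = UsdGeom.Cube.Define(stage, "/Car/Body")  # stand-in for real geometry
body.GetSizeAttr().Set(2.0)
stage.SetDefaultPrim(car.GetPrim())
stage.GetRootLayer().Save()
```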

In a demo unveiled at the global AI conference, NVIDIA presented an interactive, physically accurate digital twin of a car streamed in full fidelity to Apple Vision Pro’s high-resolution displays.

The demo featured a designer wearing the Vision Pro, using a car configurator application developed by CGI studio Katana on the Omniverse platform. The designer toggles through paint and trim options and even enters the vehicle — leveraging the power of spatial computing by blending 3D photorealistic environments with the physical world.

Bringing the Power of RTX Enterprise Cloud Rendering to Spatial Computing

Spatial computing has emerged as a powerful technology for delivering immersive experiences and seamless interactions between people, products, processes and physical spaces. Industrial enterprise use cases require incredibly high-resolution displays and powerful sensors operating at high frame rates to make manufacturing experiences true to reality.

This new Omniverse-based workflow combines Apple Vision Pro’s groundbreaking high-resolution displays with NVIDIA’s powerful RTX cloud rendering to deliver spatial computing experiences with just the device and an internet connection.

This cloud-based approach allows real-time physically based renderings to be streamed seamlessly to Apple Vision Pro, delivering high-fidelity visuals without compromising details of the massive, engineering-fidelity datasets.

“The breakthrough ultra-high-resolution displays of Apple Vision Pro, combined with photorealistic rendering of OpenUSD content streamed from NVIDIA accelerated computing, unlocks an incredible opportunity for the advancement of immersive experiences,” said Mike Rockwell, vice president of the Vision Products Group at Apple. “Spatial computing will redefine how designers and developers build captivating digital content, driving a new era of creativity and engagement.”

“Apple Vision Pro is the first untethered device which allows for enterprise customers to realize their work without compromise,” said Rev Lebaredian, vice president of simulation at NVIDIA. “We look forward to our customers having access to these amazing tools.”

The workflow also introduces hybrid rendering, a groundbreaking technique that combines local and remote rendering on the device. Users can render fully interactive experiences in a single application from Apple’s native SwiftUI and RealityKit with the Omniverse RTX Renderer streaming from GDN.

NVIDIA GDN, available in over 130 countries, taps NVIDIA’s global cloud-to-edge streaming infrastructure to deliver smooth, high-fidelity, interactive experiences. By moving heavy compute tasks to GDN, users can tackle the most demanding rendering use cases, no matter the size or complexity of the dataset.

Enhancing Spatial Computing Workloads Across Use Cases

The Omniverse-based workflow showed potential for a wide range of use cases. For example, designers could use the technology to see their 3D data in full fidelity, with no loss in quality or model decimation. This means designers can interact with trustworthy simulations that look and behave like the real physical product. This also opens new channels and opportunities for e-commerce experiences.

In industrial settings, factory planners can view and interact with their full engineering factory datasets, letting them optimize their workflows and identify potential bottlenecks.

For developers and independent software vendors, NVIDIA is building the capabilities that would allow them to use the native tools on Apple Vision Pro to seamlessly interact with existing data in their applications.

Learn more about NVIDIA Omniverse and GDN, and sign up to get notified about NVIDIA Omniverse streaming to Apple Vision Pro.



A robot being controlled by an Apple Vision Pro [Nvidia]

A new control service from Nvidia can allow developers to work on projects involving humanoid robotics, controlled and monitored using an Apple Vision Pro.

Developing humanoid robots has many challenges, with one being the nature of controlling the highly technical devices. To help in this area, Nvidia has made available a number of tools for robotic simulation, including some assisting in control.

Provided by Nvidia to major robot manufacturers and software developers, the suite of models and platforms are meant to help train a new generation of humanoid robots.

The collection of tools includes what Nvidia refers to as NIM microservices and frameworks, intended for simulation and learning. There’s also the Nvidia OSMO orchestration service for dealing with multi-stage robotics workloads, as well as AI and simulation-enabled teleoperation workflows.

As part of these workflows, headsets and spatial computing devices like the Apple Vision Pro can be employed to not only see data, but to control hardware.

“The next wave of AI is robotics and one of the most exciting developments is humanoid robots,” said Nvidia CEO and founder Jensen Huang. “We’re advancing the entire NVIDIA robotics stack, opening access for worldwide humanoid developers and companies to use the platforms, acceleration libraries and AI models best suited for their needs.”

Apple Vision Pro control

The NIM microservices are prebuilt containers that use Nvidia’s inference software, meant to reduce deployment times. Two of these microservices are designed to help developers with simulation workflows for generative physical AI within Nvidia Isaac Sim, a reference application.

One, the MimicGen NIM microservice, is basically used to help users control hardware using the Apple Vision Pro, or another spatial computing device. It generates synthetic motion data for the robot based on “recorded teleoperated data,” namely translating movements from the Apple Vision Pro into movements for the robot to make.
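
NIM microservices are packaged as containers that expose network endpoints, so a developer-side call might look roughly like the sketch below. The URL, route, and payload fields are hypothetical stand-ins for illustration, not MimicGen NIM’s documented API:

```python
import requests

# Hypothetical request against a locally running NIM container; the port,
# route, and payload fields are illustrative assumptions.
NIM_URL = "http://localhost:8000/v1/generate"

teleop_recording = {
    "frames": [
        {"t": 0.00, "wrist_xyz": [0.30, 0.10, 0.50], "pinch": 0.1},
        {"t": 0.05, "wrist_xyz": [0.31, 0.10, 0.49], "pinch": 0.8},
    ],
}

resp = requests.post(
    NIM_URL,
    json={"demonstration": teleop_recording, "num_synthetic": 1000},
    timeout=60,
)
resp.raise_for_status()
print(len(resp.json().get("trajectories", [])))
```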

Videos and images show that this is more than just moving a camera based on the headset’s movements. It is shown that hand motions and gestures are also recorded and used, based on the Apple Vision Pro’s sensors.

In effect, users could watch the robot’s movements and directly control hands and arms, all using the Apple Vision Pro.

While such humanoid robots could try to mimic the gestures exactly, systems like Nvidia’s could infer what the user wants to do instead. Since users don’t have tactile feedback for what the robot is holding, it could be too dangerous to directly mimic hand movements.
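
A toy illustration of that design choice: instead of replaying the raw hand motion on the robot, the controller can reduce each tracked gesture to a coarse, safe intent. The thresholds and labels here are invented for illustration:

```python
def infer_intent(pinch_strength: float, wrist_speed: float) -> str:
    """Classify a tracked gesture into a coarse intent instead of copying it.

    A toy stand-in for the idea in the text: rather than replaying raw hand
    motion on the robot, map it to a safe, discrete action. Thresholds are
    made up for illustration.
    """
    if pinch_strength > 0.7 and wrist_speed < 0.05:
        return "grasp"    # slow, deliberate pinch -> close the gripper gently
    if pinch_strength < 0.2:
        return "release"
    return "hold"         # ambiguous input -> keep the current state

print(infer_intent(pinch_strength=0.9, wrist_speed=0.02))  # grasp
```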

Another teleoperation workflow demonstrated at SIGGRAPH also allowed developers to create large amounts of motion and perception data, all created from a small number of demonstrations remotely captured by a human.

For these demonstrations, an Apple Vision Pro was used to capture the movements of a person’s hands. These were then used to simulate recordings using the MimicGen NIM microservice and the Nvidia Isaac Sim, which generated synthetic datasets.

Developers were then able to train a Project GR00T humanoid model with a combination of real and synthetic data. This process is thought to help cut down on the cost and time spent creating the data in the first place.
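
A minimal sketch of that last step, padding a small pool of real demonstrations with a much larger synthetic one before training, might look like this in Python. The dataset shapes and the 80/20 split are assumptions for illustration, not details from Nvidia:

```python
import numpy as np

def mix_datasets(real, synthetic, synthetic_fraction=0.8, seed=0):
    """Combine real and synthetic trajectories into one shuffled training set.

    `real` and `synthetic` are lists of trajectories; `synthetic_fraction`
    (assumed < 1.0) sets how much of the final set is simulator-generated,
    reflecting the idea of padding a few real demos with synthetic ones.
    """
    rng = np.random.default_rng(seed)
    n_syn = int(round(synthetic_fraction / (1.0 - synthetic_fraction) * len(real)))
    n_syn = min(n_syn, len(synthetic))
    picked = [synthetic[i] for i in rng.choice(len(synthetic), n_syn, replace=False)]
    mixed = list(real) + picked
    order = rng.permutation(len(mixed))
    return [mixed[i] for i in order]

real_demos = [np.zeros((200, 7)) for _ in range(20)]
synthetic_demos = [np.zeros((200, 7)) for _ in range(1000)]
print(len(mix_datasets(real_demos, synthetic_demos)))  # 100: 20 real + 80 synthetic
```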

“Developing humanoid robots is extremely complex — requiring an incredible amount of real data, tediously captured from the real world,” according to robotics platform maker Fourier’s CEO Alex Gu. “NVIDIA’s new simulation and generative AI developer tools will help bootstrap and accelerate our model development workflows.”

