When Supercomputers Meet Beer Pong

My head is currently swirling and whirling with a cacophony of conceptions. This maelstrom of meditations was triggered by NVIDIA’s recent announcement of their Jetson Orin Nano system-on-modules that deliver up to 80x the performance over the prior generation, which is, in their own words, “setting a new standard for entry-level edge AI and robotics.”

One of my contemplations centers on their use of the “entry level” qualifier in this context. When I was coming up, this bodacious beauty would have qualified as the biggest, baddest supercomputer on the planet.

I’m being serious. In 1975, which was the year I entered university, Cray Research announced their Cray-1 Supercomputer. Conceived by Seymour Cray, this was the first computer to successfully implement a vector processing architecture.

Seymour was, in many ways, a man ahead of his time. As one of his contemporaries, Joel S. Birnbaum, who used to be CTO at Hewlett-Packard, said of Cray: “It seems impossible to exaggerate the effect he had on the industry; many of the things that high performance computers now do routinely were at the farthest edge of credibility when Seymour envisioned them.” Since over 100 Cray-1 systems were eventually sold, this makes it arguably one of the most successful supercomputers in history.

3D rendering of a Cray-1 with two figures to scale (Image source: FlyAkwa/Wikipedia)

Observe the iconic design featuring a relatively small C-shaped cabinet with a ring of benches around the outside covering the power supplies and the cooling system.

The original Cray-1 boasted an 80 MHz clock and could support up to 32 MB of memory. Yes, you read this right. MB (megabytes). This was considered to be a ginormous (some might say “gargantuan,” but not me because I can’t spell it) amount of memory in those days of yore when I wore a younger man’s clothes. Featuring 12 independent pipelined execution units, it could perform three (count them, 3) floating-point operations per cycle.
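Taking the figures above at face value, the Cray-1's peak throughput is a quick back-of-the-envelope calculation (a sketch in Python; the three-ops-per-cycle number is the one quoted above, not an official benchmark result):

```python
# Back-of-the-envelope peak throughput for the Cray-1,
# using the clock and ops-per-cycle figures quoted above.
clock_hz = 80e6          # 80 MHz clock
flops_per_cycle = 3      # floating-point operations per cycle
peak_flops = clock_hz * flops_per_cycle
print(f"Cray-1 peak: {peak_flops / 1e6:.0f} MFLOPS")  # 240 MFLOPS
```

A couple of hundred megaflops: world-beating in 1975, and comfortably outrun by the phone in your pocket today.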

In turn, this reminds me of the Homebrew Cray-1A, which was created 35 years after the Cray-1 circa 2010 by cunning artificer Chris Fenton. Chris managed to implement pretty much the same functionality as the Cray-1A using a single Xilinx Spartan-3E 1600 FPGA. And, of course, the processing power of this FPGA is equivalent to a drop of water as compared to a bathtub full of dihydrogen monoxide in the form of a Jetson Orin Nano (I lay dibs on being the first person to compare NVIDIA’s Jetson Orin Nano to a bathtub—if anyone asks, remember that you saw it here first).

Speaking of which… what do you think they mean when they talk about “80x the performance over the prior generation”? What was the prior generation? Well, way back in the mists of time we used to call 2020, I penned a column: Feast Your Orbs on NVIDIA’s Jetson Nano 2GB Dev Kit. As I wrote in that column: “This bodacious beauty boasts a 128-core NVIDIA Maxwell graphics processing unit (GPU), a quad-core 64-bit ARM Cortex-A57 central processing unit (CPU), and 2 GB of 64-bit LPDDR4 25.6 GB/s memory. Additional storage is provided via a microSD card (I have a 64 GB card, but you can go much higher; for example, you can get microSD cards up to 1 TB, the thought of which makes my eyes water).”

That column included photos of me unboxing a Jetson Nano 2GB Dev Kit. I no longer recall why, but a couple of weeks later, the nice folks at NVIDIA sent me a second unit (maybe they thought I’d mess the first one up). This second Jetson Nano sat in the corner of my office making me feel increasingly guilty until…

…I went to Norway a couple of weeks ago to present the keynote at the FPGA Forum. As part of this trip, I was invited to give a guest lecture to the MSc students studying Embedded Computing at the Norwegian University of Science and Technology (NTNU) (on the off chance you are interested, I tallied the topics of my talks in two tortuous columns: Not Much Happened, or Did It? and All Change!).

While I was wandering around my office gathering some of the bits and pieces I wanted to take with me—like my single-digit BLUB Nixie tube clock (see Retro-Futuristic-Steampunk Technologies (Part 1))—the never-been-opened Jetson Nano 2GB Dev Kit caught my eye, and I decided this would make an awesome “giveaway” during my talk to the students. 

After arriving in Norway and having talked to Per Gunnar, who was the lecturer in charge, we decided to invite interested students to post descriptions of what they would do with the Jetson Nano and why they should be “the one.” In order to facilitate this process, Per Gunnar set up a submissions page on the university’s internal website. Later, at the FPGA Forum, he and I took some time away from the madding throng to peruse and ponder all of the submissions.

As it turned out, we were inundated with a profusion of proposals. We created a matrix of things like how interesting the project was and the likelihood of its success. We discarded the offering whose submitter had neglected to include their name (which was a shame because that was a good one). We eventually opted for the student who suggested creating an artificial intelligence (AI)-guided ping pong ball launcher to be used in Beer Pong. The idea here is that the Jetson Nano would be equipped with cameras to observe and learn the ball’s flight and bouncing characteristics. It would also have the ability to control the ball’s launch angle and speed. In addition to checking the “interesting” and “likely to succeed” boxes, I fear the judges were swayed by the closing argument that, if accompanied by this creation, the student in question and his companions would be able to get free beers at any bar in town. It’s hard to argue with logic like that.
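For a sense of the physics the launcher has to wrangle, here's a minimal drag-free projectile sketch (a hypothetical helper of my own devising; a real ping pong ball is so light that drag and spin dominate, which is exactly why the student proposed learning the trajectory from camera observations rather than computing it in closed form):

```python
import math

def launch_speed(distance_m, angle_deg, g=9.81):
    """Speed needed to land a drag-free projectile distance_m away
    when launched at angle_deg over level ground.
    From the range equation: d = v**2 * sin(2*theta) / g."""
    theta = math.radians(angle_deg)
    return math.sqrt(distance_m * g / math.sin(2 * theta))

# e.g., a 2 m table-length shot at a 45-degree launch angle
v = launch_speed(2.0, 45.0)
print(f"Launch at {v:.2f} m/s")
```

In practice the Jetson would start from an estimate like this and then let the cameras correct for everything the textbook equation ignores, bounces included.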

But we digress… if you think of all the things you could do with a Jetson Nano 2GB Dev Kit, just imagine the possibilities afforded by a Jetson Orin Nano.

Meet the Jetson Orin Nano (Image source: NVIDIA)

The Jetson Orin features an NVIDIA Ampere architecture GPU, Arm-based CPUs, next-generation deep learning and vision accelerators, high-speed interfaces, fast memory bandwidth, and multimodal sensor support. This performance and versatility will empower users to develop and commercialize products that once seemed impossible, from engineers deploying edge AI applications to Robot Operating System (ROS) developers building next-generation intelligent machines.

The Orin Nano modules will be available in two versions. The Orin Nano 4GB delivers up to 20 trillion operations per second (TOPS) with power configurable from 5W to 10W, while the 8GB version delivers up to 40 TOPS with power configurable from 7W to 15W.
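Dividing the quoted peak TOPS by the maximum power figures gives a rough efficiency comparison (a sketch using only the numbers above; real TOPS-per-watt depends on the workload and the configured power mode):

```python
# Rough efficiency comparison of the two Orin Nano modules,
# using the peak TOPS and maximum power figures quoted above.
modules = {
    "Orin Nano 4GB": {"tops": 20, "watts": 10},  # 5W-10W envelope
    "Orin Nano 8GB": {"tops": 40, "watts": 15},  # 7W-15W envelope
}
for name, m in modules.items():
    print(f"{name}: {m['tops'] / m['watts']:.1f} TOPS/W at max power")
```

Either way you slice it, that's a couple of TOPS per watt in a module you could lose in a coat pocket.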

The important thing to remember here is that, although the Jetson Orin Nano delivers up to 40 TOPS of AI performance in the smallest Jetson form factor, this is only the “entry level” product in the Orin family. The top-of-the-line AGX Orin can deliver 275 TOPS for applications like advanced autonomous machines.
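Just for fun, we can ask how many Cray-1s that is (a deliberately loose comparison: the Orin's TOPS are low-precision integer operations while the Cray-1's were 64-bit floating-point, so treat this as order-of-magnitude amusement, not a fair benchmark):

```python
# How many "Cray-1 equivalents" is 275 TOPS? Loose, apples-to-oranges
# arithmetic: INT8 TOPS on the Orin vs. 64-bit FLOPS on the Cray-1.
agx_orin_ops = 275e12        # 275 TOPS (AGX Orin peak, quoted above)
cray1_flops = 80e6 * 3       # 240 MFLOPS (80 MHz x 3 ops/cycle, quoted earlier)
ratio = agx_orin_ops / cray1_flops
print(f"Roughly {ratio:,.0f} Cray-1s")  # on the order of a million
```

Call it a million 1975-vintage supercomputers' worth of raw operations, minus the ring of upholstered benches.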

275 TOPS. My eyes are watering just thinking about this. If I could get my time machine working and take one of these bodacious beauties back to 1975 and show it to Seymour Cray, I bet his eyes would water also. How about you? Can you fight your way through your tears to post a comment communicating your thoughts on the amount of compute power afforded by NVIDIA’s Orin modules?
