AI Brief #18: AI-powered warplanes are coming

Plus: ChatGPT aims to secure the enterprise, and more

Today is August 29, 2023.

This week’s issue highlights a number of areas in AI beyond chatbots and generative AI that are seeing significant advancement.

First, we have an in-depth look at how the US Air Force is rapidly shifting its strategy toward a future where AI-powered warplanes will outnumber human pilots.

We also feature two interesting developments on the medical front as AI improvements increasingly offer hope to paralyzed patients, and overworked doctors stand to benefit from AI detection models serving as copilots in their work.

Read on!

In this issue:

  • 🪖 The US Air Force is embracing AI-powered warplanes

  • 🧠 AI helps paralyzed woman speak out loud via decoded brain signals

  • 🚀 OpenAI launches ChatGPT Enterprise, aiming to address privacy and security needs

  • 🤗 Hugging Face raises at a $4.5B valuation, Meta releases their coding assistant AI, and more

  • 🧪 The latest science experiments, including an AI model that generates college lectures and radio shows from text prompts

🪖 The US Air Force is embracing AI-powered warplanes

The US Air Force is leading the way on AI with its focus on next-generation AI-powered warplanes, as this comprehensive report from the New York Times (note: paywalled site) details.

Next-generation drones able to fly as AI wingmen to human pilots are currently in testing, and the Air Force plans to build as many as 2,000 of these drones at a target cost of $3M apiece – a fraction of what the F-35 fighter costs ($80M per plane).

The pilotless XQ-58A Valkyrie prototype. Photo Credit: NY Times
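
To put the cost figures above in perspective, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted in this issue (up to 2,000 drones, a $3M target cost, and $80M per F-35); the variable names are my own, and the real totals will depend on how many drones are actually built and at what price.

```python
# Figures quoted above; the drone count is an upper bound and the price is a target.
DRONE_COST_USD = 3_000_000
F35_COST_USD = 80_000_000
PLANNED_DRONES = 2_000

# Total cost if every planned drone is built at the target price.
fleet_cost = DRONE_COST_USD * PLANNED_DRONES

# How many whole drones fit in the budget of a single F-35.
drones_per_f35 = F35_COST_USD // DRONE_COST_USD

# How many F-35s the same money would buy.
f35_equivalent = fleet_cost / F35_COST_USD

print(f"Full drone fleet: ${fleet_cost / 1e9:.1f}B")
print(f"Whole drones per F-35 budget: {drones_per_f35}")
print(f"F-35s for the same money: {f35_equivalent:.0f}")
```

In other words, the full planned fleet would cost roughly what 75 F-35s do.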

Why this matters:

  • The focus on these “collaborative combat aircraft” shows that the Air Force is aware that future battlefields may require a large swarm of highly capable robots to dominate.

  • Increasingly sophisticated air defense systems from adversaries like China also make the concept of cheaper, unmanned AI-powered warplanes far more attractive.

At the same time, these moves are accelerating a debate within the military over what role humans should play in conflicts, and what limits we should place on software that enables machines to kill.

🧠 AI helps paralyzed woman speak out loud via decoded brain signals

A Canadian woman who was paralyzed after a stroke is now able to “speak” for the first time in nearly 20 years, thanks to an AI system that decodes her brain signals.

Why this matters:

  • As AI systems have grown increasingly sophisticated, brain-computer interfaces are a promising area that is likely to see significant advances.

  • Notably, the scientists behind her accomplishment didn’t use AI to decode thoughts (as some other AI models have shown is possible). Instead, they trained an AI model to recognize phonemes, the sound units that form the building blocks of speech.

  • To everyone’s surprise, the AI model only needed to learn 39 phonemes to decipher any word in English, allowing the model to perform far more efficiently than other AI approaches.
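
To make the phoneme idea above concrete, here is a minimal toy sketch of how a small, fixed set of sound units can be assembled into words with a lookup table. This is not the researchers' method: the ARPAbet-style symbols, the five-word lexicon, and the `decode_words` function are all assumptions I made up for illustration.

```python
# Toy illustration only. The lexicon, the phoneme symbols, and the function
# name are hypothetical and are not taken from the study.
LEXICON = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
    ("HH", "AW"): "how",
    ("AA", "R"): "are",
    ("Y", "UW"): "you",
}


def decode_words(phonemes, lexicon=LEXICON):
    """Greedily group a flat phoneme stream into words, longest match first."""
    max_len = max(len(key) for key in lexicon)
    words, i = [], 0
    while i < len(phonemes):
        # Try the longest possible chunk first, then shrink it.
        for size in range(min(max_len, len(phonemes) - i), 0, -1):
            chunk = tuple(phonemes[i:i + size])
            if chunk in lexicon:
                words.append(lexicon[chunk])
                i += size
                break
        else:
            # No word starts with this phoneme: mark it and move on.
            words.append("<unk>")
            i += 1
    return words


stream = ["HH", "AW", "AA", "R", "Y", "UW"]
print(" ".join(decode_words(stream)))
```

The takeaway is that the model only has to tell a few dozen sounds apart; turning those sounds into words can then be handled by a dictionary-style step rather than by learning every word separately.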

Brain-computer interfaces are an area that numerous private sector companies are pursuing as well, with the most notable being Elon Musk-founded Neuralink. Expect to see more promising developments here in the future.

🚀 OpenAI launches ChatGPT Enterprise, aiming to address privacy and security needs

ChatGPT’s adoption amongst knowledge workers has been impressive since its launch, but in recent months a multitude of companies have banned ChatGPT use by employees over privacy and security concerns.

OpenAI aims to change this with ChatGPT Enterprise, an enterprise-level offering focused specifically on the concerns large corporations have raised.

  • Privacy: OpenAI won’t use ChatGPT Enterprise conversations to train its models, and SOC 2 compliance and encryption are designed to reassure enterprises that this is a trustworthy offering.

  • Capability: ChatGPT Enterprise notably removes all usage caps, prioritizes speed (up to 2x faster), and gives users a 32k-token context window for processing longer inputs and files.

OpenAI may also be seeking to ward off competition posed by open-source AI models such as Meta’s Llama, which have touted their privacy and customizability as key advantages for enterprise-grade AI needs. How enterprise-grade adoption plays out, especially in light of OpenAI’s new managed service offering, will be interesting to observe over the next few years.

🔎 Quick Scoops

AI startup Hugging Face valued at $4.5B in latest funding round. Notably, Google, Nvidia, Intel and others participated, highlighting investor enthusiasm for AI and the ecosystem of partners supporting each other. (Bloomberg)

Google proposes a new watermarking approach to stop deepfakes, but the availability of open-source AI image-creating projects leaves its widespread adoption in doubt. (Washington Post)

AI models can read chest radiographs and pick up signs that doctors often miss, researchers from Osaka University revealed. AI copilots for human doctors could become increasingly common as improvements arrive. (SciTech Daily)

Meta releases Code Llama, their AI model designed to help write code. Similar to Llama 2, the model comes in multiple sizes and isn’t truly open-source, but is free to use for a wide variety of use cases. (Meta)

ElevenLabs comes out of beta and now supports AI speech in 28 languages. AI speech generation is gaining in sophistication, opening up numerous business possibilities but also raising challenges around its misuse. (ElevenLabs)

🧪 Science Experiments

DenseDiffusion: Dense Text-to-Image Generation with Attention Modulation

  • Existing text-to-image diffusion models struggle to synthesize realistic images when given dense captions. DenseDiffusion is a training-free method that offers far better outputs and control over scene layout when digesting dense captions.

  • Project page here, with comparisons between DenseDiffusion and Stable Diffusion.

Comparison of DenseDiffusion’s outputs vs. standard Stable Diffusion. Credit: GitHub

WavJourney: Compositional Audio Creation with LLMs

  • LLMs that can specialize in audio content creation remain relatively unexplored. WavJourney is an end-to-end system that leverages LLMs to excel at audio storytelling, all starting from a text prompt describing an auditory scene.

  • Project page here, including examples of a news report, college lecture, and more.

Meta releases SeamlessM4T, a multimodal AI model for speech translation

  • Seamless translation across speech and text for over 100 languages is possible across a variety of use cases, including speech-to-speech translation. According to Meta, this is the first model to achieve state-of-the-art results across so many languages and tasks.

  • See their project page here.

Credit: Meta

👋 How I can help

Here are other ways we can work together:

  • If you’re an employer looking to hire tech talent, my search firm (Candidate Labs) helps AI companies hire the best out there. We work on roles ranging from ML engineers to sales leaders, and we’ve worked with leading AI companies like Writer, EvenUp, Tome, Twelve Labs and more to help make critical hires. Book a call here.

  • If you would like to sponsor this newsletter, shoot me an email at [email protected] 

As always — have a great week!