Meta's open-source blitz, a new Beatles AI song, and AI content licensing progress

Plus: Google AI can now detect skin conditions

Happy Monday! Today is June 19, 2023.

The biggest news last week for me was Meta’s decision to launch its next open-source LLM with a commercial license, which could turn into a massive threat to the closed-source ecosystems OpenAI and Google are trying to build.

If and when that model comes out, expect the battle of open vs. closed source to grow even more intense.

In this issue:

  • 🧨 Meta’s next open-source LLM will have commercial license, putting pressure on Google and OpenAI

  • 🎸 Paul McCartney will release a “final Beatles song” using AI

  • 🗑️ AI-generated junk has taken over Etsy

  • 💰️ AI and media titans hash out the future of content licensing

  • 🔎 Google tells their own employees to not put code into chatbots

  • 🧪 The latest science experiments, including Google’s new skin condition detector and an avatar generator that uses pics + text

🧨 Meta’s next open-source LLM will have commercial license, putting pressure on Google and OpenAI

Meta’s Bay Area headquarters. Credit: The Verge

Last week, news broke that Meta intends to make its next set of open-source LLMs available for commercial use.

Why this matters:

  • Right now, Meta’s LLaMA LLM is only available for research use, preventing companies from using it as part of their commercial efforts.

  • This will likely tap into massive demand: Meta’s open-source AI tech is already hugely popular with researchers, who have found numerous ways to fine-tune and improve it.

  • OpenAI feels the heat and may release their own open-source model. Rumors say this won’t approach GPT-4’s power, but it would represent a sharp reversal from their closed-source approach of late.

Despite questions from the US Senate about the dangers of open-source AI, Meta's Chief AI Scientist Yann LeCun said in an interview last week that worries about AI posing dangers to humanity are "preposterously ridiculous."

🎸 Paul McCartney will release a “final Beatles song” using AI

AI is the key to releasing a “final Beatles song,” McCartney announced last week. Vocals from John Lennon, who died in 1980, were passed through an AI model and made “pure,” enabling the track to be assembled and mixed.

This raises the question: at what point does AI make it no longer a “real” Beatles song? We’ll have to see once the song releases, as McCartney shared few details on the track itself in his interview.

The use of AI to make songs is currently experiencing a watershed moment:

  • An AI-generated song featuring Drake’s voice went viral and was banned from Spotify and other streaming services at the request of record labels.

  • Tens of thousands of other tracks imitating famous artists continue to proliferate on social media.

  • Meanwhile, some artists are openly embracing AI music: Grimes has invited fans to use her voice in AI compositions and simply pay royalties.

🗑️ AI-generated junk has taken over Etsy

AI has made it easier than ever to generate artwork, but that has also led to a proliferation of AI-generated products on Etsy that is pushing out real artists, as The Atlantic reports in a fascinating deep dive (note: article is paywalled).

This is no different from text-based content on the web, which is now suffering from a massive increase in bland and generic AI content as content writers lose out on work and companies switch over to AI platforms.

In Etsy’s case, the problem really stems from two key drivers:

  • Etsy doesn’t forbid AI-generated content so long as “creativity” is involved. This very loose interpretation makes it possible for sellers of AI-generated goods to operate.

  • “Hustle culture” YouTubers are pitching side hustles that are low-effort, maximum-reward. These range from selling Midjourney artwork digitally on Etsy to putting thousands of variations of art on t-shirts and more.

I’ve heard from a lot of readers that one of your biggest fears is an age where it’s no longer easy to know if content was generated by human or machine. When I read articles like this, it feels like that future is arriving quite quickly.

💰️ AI and media titans hash out the future of content licensing

As AI has exploded into our lives, the value of the content used to train these AI models has become very clear. AI giants including OpenAI, Google and Microsoft may now pay media companies a fee of up to $20M per year in order to continue using their content for AI training, early reports indicate.

AI companies are already facing pressure from multiple angles:

  • Legislation like the EU's AI Act will require disclosure of copyrighted training data, and other countries are likely to follow. It will become harder and harder for AI companies to hide the data they use to train models.

  • Companies like Reddit and Stack Overflow have announced pricing tiers for their APIs, as they seek to prevent AI companies from simply sucking up their data for free.

  • Lawsuits alleging copyright violation are now targeting generative AI companies. While many seem ill-advised and frivolous, they are still a thorn in the side of AI tech firms.

Google, with its prior expertise in working with media companies (though not always successfully), is leading the way here in suggesting payment frameworks, and for once it appears all sides feel good about the likely outcome.

But what does this mean for the non-tech giants? Let’s say you wanted to train your own AI model – could you end up in a world where every piece of valuable content is locked down and requires licensing?

We’ll be watching this very closely, especially since open-source AI remains so promising and cost-effective compared to the closed-source approach.

🔎 Quick Scoops

Google warned its own employees not to enter confidential information into chatbots, including its own Bard. This restriction includes entering computer code. (Reuters)

Researchers worry AI models may “collapse” as they train on AI-generated content. As AI content proliferates, this doom loop could cause “irreversible defects”, researchers warn. (VentureBeat)

Mercedes-Benz introduces ChatGPT for in-car voice control, marking the start of a trend where language models may increasingly power a new generation of interfaces. (Mercedes)

France’s Mistral AI raises $113M to take on OpenAI. Founded by alums from DeepMind and Meta, the company intends to release its first generative AI models in 2024. (TechCrunch)

AMD jumps into the AI GPU race with a new class of GPUs. AI helped make Nvidia a trillion-dollar company, and AMD doesn’t want to be left out. (AnandTech)

🧪 Science Experiments

Google Lens can now identify skin conditions via uploaded photos

  • While Google cautions this shouldn’t replace your doctor, it’s a new adaptation of their powerful Lens image search technology for a novel use case.

  • Read their full blog here.

This isn’t going to replace your doctor, but it sure could help start the conversation.

AvatarBooth generates 3D human avatars from both text and images

  • Previous methods could only generate avatars from text descriptions, while this one is able to consume images and combine them with text prompts.

  • The quality is reminiscent of early N64 games, but the concept is nonetheless fascinating.

  • Full paper here. GitHub here.

Obama, Hillary, and Kobe, all generated from an image plus a text description.

13B parameter OpenLLaMA LLM released to public

  • This is an open-source reproduction of Meta’s LLaMA large language model, which means it isn’t limited to research-only purposes. Until Meta allows their LLMs to be used for commercial purposes, this may be the next-best alternative.

  • See it on Hugging Face here.

Researchers use new method to fine-tune a 65B parameter model on 8x RTX 3090 GPUs

  • A new optimizer combined with other efficiency techniques means fine-tuning can use just 11% of the memory that traditional techniques require.

  • As a result, a 65 billion parameter model was fine-tuned on just 8 RTX 3090s.

  • Full paper here.

😀 A reader’s commentary

Here’s what Christopher said about last week’s email. Glad we’re keeping folks informed!

That’s it! Have a great week!

And as always, we want to hear from you on how useful this is. The more feedback you provide, the better our newsletter gets! What else would you like to see covered?

So take the poll below — your signals are helpful!