A worthy Voice Mode substitute?
ALSO: Mapping out a business strategy with AI
Read time: under 4 minutes
Welcome back, Superhuman
If you’re still struggling to access ChatGPT’s Voice Mode, Hume's new EVI 2 could be a worthy substitute — and you can access it via your browser. Also: Learn how to analyze your business strategy with AI.
Today’s Insights
Is EVI 2 a worthy Voice Mode substitute?
Frontier: EU AI leader’s first multimodal model
Tutorial: Devising a business strategy with AI
Everything else you should know today
3 new AI tools to boost your productivity
AI-Generated Images: Wool houses
NEXT IN AI
A new voice bot can detect your emotions in real-time
Source: Hume
Those of us still waiting to try ChatGPT’s much-anticipated Voice Mode just got something to tide us over. The “empathic AI research lab” Hume has released EVI 2 (short for empathic voice interface) — a voice-to-voice model that can supposedly detect your emotions and respond accordingly.
We spent some time chatting with EVI’s characters, each of which has a unique voice and personality. The platform struggled to keep up with lengthy conversations, giving relatively simple responses and sometimes veering off-topic. Plus, there were some awkward silences as the AI formulated each of its replies.
But it’s leaps and bounds ahead of traditional chatbots:
Each voice sounds genuinely life-like, offering a much deeper level of immersion than you’d find with Siri or Alexa
It’s one thing to watch an LLM generate a personalized poem via text — but another thing entirely to hear that poem read aloud in an accent and tone of your choice
Also, as promised, it’s pretty good at detecting the emotions hidden in your voice and mirroring them in its own responses
What EVI 2 lacks in precision, it makes up for in accessibility: Experimenting with it is about as easy as hopping on a Zoom call. You navigate to the app, choose a character, and click “start call.” Then, you can turn on your mic and start chatting as you would during any other video call.
What’s next? A larger model is coming within weeks. Eventually, you’ll be able to create your own characters inside the app, too.
PRESENTED BY ASSEMBLY AI
Transcribe, summarize, and use Large Language Models on your audio with AssemblyAI’s no-code playground
Get more from your voice data with AssemblyAI's Speech AI models. Learn how to transcribe, ask questions, summarize, extract, and generate content from your audio data.
Upload an audio file to the playground (sales call, meeting recording, podcast, etc.)
Transcribe the file via the AssemblyAI API for free
Apply Speech AI models including Entity Detection, PII Redaction, Speaker Labels, and more
Leverage LLMs using LeMUR to ask questions about the audio, summarize into any format, generate content, and more.
Test it out in their no-code playground, or start building voice-driven products with $50 in API credits.
FROM THE FRONTIER
Europe’s AI leader releases its first multimodal model
Source: Mistral
Developers are working on LLMs that can seamlessly move between text and images, unlocking new capabilities. This mirrors how humans think — by integrating all of our senses to form a cohesive picture of the world.
Paris-based Mistral just joined the party with its first multimodal model, Pixtral 12B:
It’s competitive with rivals like GPT-4o while coming in a smaller form factor
It’s open-source for non-commercial uses: Researchers can use it to carry out experiments or build apps
It can spot how many times an item appears in a set of images, write photo captions, or generate charts that combine images and text
It could soon be used for more complex tasks, like integrating symptom databases, medical scans, and patient records to help diagnose patients
THE AI ACADEMY
How to do SWOT analysis with Mapify
Go to Mapify and log in to get credits.
Now go to New Map and select the sample prompt option on top.
Write your prompt for the SWOT analysis and click on the Mapify button at the bottom.
Sample Prompt: ‘SWOT analysis for [insert your business idea]’
It will generate a detailed SWOT analysis for you.
Once done, you can edit it, change its format, and try further prompts to get the required results.
Sample Prompt: ‘How can [insert business idea] maximize its competitive advantages?’
You can use Mapify to brainstorm ideas, make outlines, plan projects, develop strategies, explain concepts, analyze topics, create timelines, and more.
PROMPT OF THE DAY
Fostering Collaborative Learning
Prompt: [Insert context about learning project here.] Describe the project's goals and the importance of collaborative learning. Outline strategies to promote active participation, knowledge sharing, and problem-solving. Discuss how to manage group dynamics and the facilitator's role in guiding the collaborative process. Propose assessment methods to measure the project's success and the learning outcomes.
Source: PromptPal
PRESENTED BY GUIDDE
Train employees/customers at scale with AI How-To videos
Guidde is a useful AI that creates branded How-To videos from a screen recording or PDF. It handles all the work; you just pick the narrator's voice, caption language, and desired colors/theme. In sum, you can create a shareable guide in less than 5 minutes with no editing skills.
Try Guidde today at no cost.
AI & TECH NEWS
Everything else you need to know today
Source: Huawei
Access Granted: Anthropic has released a “Workspaces” feature that will let businesses set custom rate limits and manage their API usage with more precision.
Cloud Conquest: In a bid to expand its cloud services, Amazon will invest $10.5B in AI infrastructure projects in the UK over the next five years.
Audio Oracle: YouTube is officially dubbing its new AI radio feature “Ask Music.” A beta version is expected to roll out to Premium members over the next few weeks.
Bend, Don’t Break: Chinese tech company Huawei just unveiled what might be the world’s first tri-foldable smartphone.
😄 One Fun Thing: Interactive AI startup Infinite Reality is working on new 3D virtual experiences, like explorable digital replicas of sports venues, interactive shopping apps, and VR job fairs. Rachel Jacobson, Infinite Reality’s President, Global Business Ventures and Partnerships, says the startup is at the forefront of a “$35B immersive tech market ripe for brand engagement.”
🎥 Early Screening: Adobe offered new details about its Firefly text-to-video model, which is “only trained on public domain or licensed content” and is set to be released later this year. The model will let you generate B-roll with reference images and custom camera angles. You’ll also be able to extend clips to fill gaps in your footage.
PRODUCTIVITY
3 AI Tools to Supercharge Your Productivity
✅ Oscr AI: Transform any content into personalized, publish-ready blogs or social posts.
✅ Podcraftr: Instantly turn reports, emails, and other content into engaging podcasts.
✅ Genkin: Track your cash flows and get an in-depth analysis of your spending with AI.
* indicates a promoted tool, if any
AI-GENERATED IMAGES
Wool Felt Houses
Source: @m2z1.creative on Midjourney
Midjourney Prompt: The streets were lined with small shops and houses made of clouds and cotton candy, made of wool felt
--ar 3:4 --sref 680572301 --v 6.1 --stylize 1000 --personalize kzilt9y
Acquire new customers and drive revenue by partnering with us
Superhuman is the world’s biggest AI newsletter for businesses and professionals, with 600,000+ readers and 1.5 million followers on socials working at the world’s leading startups and enterprises. Companies like Amazon, Hubspot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.
🧞 Your wish is my command
What did you think of today's email? Your feedback helps me create better emails for you!
Got more feedback or just want to get in touch? Reply to this email and we’ll get back to you.
Thanks for reading.
Until next time!
Zain & the Superhuman AI team