Gemini goes multimodal
ALSO: Learn how to use ChatGPT Deep Research from scratch

Read time: under 4 minutes
Welcome back, Superhuman. Alphabet is already back for more. Hot on the heels of Gemma 3, it just became the first company to hand out access to a fully multimodal image generator. Keep reading for some examples of what you can create with the powerful new feature.
Today's Insights
Efficiency-focused AI, realistic ads, and hybrid models
Gemini gets a major overhaul
Tutorial: Learn to use ChatGPT's Deep Research from scratch
5 new AI tools to boost your productivity
News, memes, what's trending on socials, and more
TODAY IN AI

Captions' new tool lets you generate realistic influencer ads. Source: Captions
1. Cohere drops ultra-efficient model for businesses: Google's Gemma 3 is only a few days old, and there's already a new LLM giving it a run for its money. Cohere just launched Command A, which can run on only two Nvidia chips while being about as powerful as DeepSeek and OpenAI's latest offerings. The Toronto-based startup says the model is great for smaller businesses who don't have dozens of pricey GPUs at their fingertips.
2. New platform generates ads that look eerily realistic: NYC's Captions is rolling out a new feature called Mirage that lets you create marketing campaigns with AI-generated influencers. You can either write a script or upload audio and have the avatar automatically sync up with it. Natural-looking body language and "micro-expressions" make each AI influencer look far more convincing than what we've seen from similar tools, too. You can try it out here.
3. Boundary-pusher Nous Research drops two-in-one LLM: Known for its censorship-free models, Nous unveiled a preview version of DeepHermes, in 3B and 24B sizes. It lets you toggle between chain-of-thought and quick thinking, becoming "one of the first models in the world" to fuse both modes into a single interface. It'll also share its entire thought process, unlike most closed-source rivals. (Here's a link to try it out via the collective's Discord channel.)
PRESENTED BY GOOGLE
2025 will be "the most defining year" for startups. Learn why

Gain the upper hand on the year ahead with Google Cloud's Future of AI: Perspectives for Startups report.
It explains key AI predictions from 20+ industry experts from Google Cloud, Social Capital, and more:
Where startups should focus resources to beat the competition
Hidden opportunities to create immediate value
Steps to scale AI products efficiently
FROM THE FRONTIER
Gemini brings multimodal image generation to the masses

Alphabet is giving away more Gemini features at no cost. Source: Google
In an industry first, Alphabet just gave everyone access to Gemini's native image generation capabilities on AI Studio.
What makes it unique? Usually, your image prompts get fed through an LLM, and in the process, details can get lost in translation. Gemini 2.0 Flash, on the other hand, can move between different mediums on the fly, delivering much better speed and accuracy.
What can you do with it? You can generate a recipe and add photos for each step. Or embed entire words or sentences into your images. You can also quickly edit specific parts of an image without having to regenerate an entirely new one.
Here are some of our favorite examples:
Transforming the cover of a magazine, and swapping out its price tag
Quickly adding chocolate drizzle to a pile of croissants
Creating video game characters, then dropping them into a 3D world
Applying the style of one image to another
Gemini's also getting a major upgrade: Now, anyone can try out the platform's Deep Research feature, not just paid subscribers. Alphabet is also opening up access to Gems, customizable presets "like a translator, meal planner, or math coach." Finally, Gemini can now integrate into your search history and apps for a more personalized experience.
THE AI ACADEMY
Learn how to use ChatGPT Deep Research from scratch
ChatGPT's Deep Research is the best AI product I've used this year, and it might be the first one that surpasses humans in research ability. Watch the full tutorial here.
Go to ChatGPT and sign up. (Upgrade to Pro or Plus for Deep Research.)
Use the o3-mini-high or o3-mini reasoning model and select Deep Research.
Type a detailed, specific prompt with clear goals.
Answer any follow-up questions and wait 5-60 minutes for the generated report.
Review the report, including its sources and summary, and ask follow-up questions for further details.
Note: Deep Research was opened to ChatGPT Plus users shortly after I recorded this video, so you no longer need the Pro plan to use it.
PRESENTED BY VANTA
Is Compliance Holding You Back? Vanta Can Help

Navigating new compliance requirements can be a daunting task, but with the right automation tools, it doesn't have to be.
Whether you're a fast-growing startup or an established security team, Vanta can help you achieve continuous compliance (and more).
Join the live demo on April 3 to learn how Vanta can help you automate compliance for frameworks like SOC 2, ISO 27001, HIPAA, HITRUST, and ISO 42001 and build customer trust.
AI & TECH NEWS
Everything else you need to know today

Snap just introduced three new AI video lenses. Source: Snap
Open Sesame: Sesame just released the AI model powering their viral Maya assistant under an Apache 2.0 license, making it super easy to clone voices in under a minute - though some are worried about the lack of safeguards to prevent misuse.
Polyglot Pros: According to Bloomberg's Mark Gurman, Apple's working on adding a real-time live translation feature to AirPods, which could roll out later this year alongside iOS 19.
All in One: Alibaba announced its consumer chatbot will now be powered by its frontier Qwen models, featuring deep thinking, search, and other agentic capabilities, which can all be accessed from a single app.
Lens Leap: Social media platform Snap introduced new generative video "lenses" that let you do things like animate animals and add dynamic objects into your shot.
On the Case: The creator of "Law & Order" is launching an AI-generated murder mystery game, which will be updated with new mysteries on a daily basis.
PRODUCTIVITY
5 AI Tools to Supercharge Your Productivity
Duck AI: Anonymous access to popular AI models like GPT-4o mini and Claude 3.
Quadratic: Chat with your data, connect databases, and visualize results in a code-friendly all-in-one tool.
Innovating with AI*: Just welcomed 200 new students into The AI Consultancy Project, their new program that trains you to build a business as an AI consultant. Request early access now!
Greta: Ship any full-stack applications within seconds without writing a single line of code.
Whisper V3: Transcribe long-form YouTube videos with the click of a button.
* indicates a promoted tool, if any
PROMPT OF THE DAY
Improve Focus
Prompt: I need suggestions for techniques that can help improve focus and productivity during work hours. The techniques should be practical, easy to implement, and effective in helping individuals stay focused and avoid distractions while working.
Your task is to provide a list of proven techniques that can help improve focus, concentration, and productivity during work hours.
Work environment: [work environment]
Typical distractions: [typical distractions]
Source: promptadvanceclub
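If you reuse this prompt often, a small script can fill in the bracketed placeholders for you. The sketch below is only an illustration: the `fill_prompt` helper, the shortened template text, and the sample values are our own assumptions, not part of the original prompt or any particular tool.

```python
import re

# Shortened copy of the prompt above; the bracketed fields are the
# placeholders you are meant to replace before sending it to a chatbot.
PROMPT_TEMPLATE = (
    "I need suggestions for techniques that can help improve focus and "
    "productivity during work hours.\n"
    "Work environment: [work environment]\n"
    "Typical distractions: [typical distractions]"
)


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace every [placeholder] in the template with its value.

    Hypothetical helper: raises KeyError if a placeholder has no value,
    so a half-filled prompt is never produced silently.
    """
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder [{key}]")
        return values[key]

    # Match any text wrapped in a single pair of square brackets.
    return re.sub(r"\[([^\[\]]+)\]", substitute, template)


# Sample values (assumptions for illustration only).
filled = fill_prompt(
    PROMPT_TEMPLATE,
    {
        "work environment": "open-plan office",
        "typical distractions": "chat notifications and side conversations",
    },
)
print(filled)
```

Paste the printed text into the chatbot of your choice. Failing loudly on a missing value is a deliberate choice here, since a prompt that still contains a literal "[work environment]" tends to produce generic answers.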
SOCIAL SIGNALS
What's trending on socials today

Squawk Back: Watch what happens when a parrot tries to have a conversation with Bland's AI voice assistant. "Parrot-1B shows signs of promise in natural language understanding but has a very small context window," one commenter joked.
Thread the Needle: X user "s13k" vibe-coded a game that lets you try to land SpaceX's booster rocket into the now-famous "chopsticks." You can try to beat his record of 7.2 seconds here.
No Strings: Builder Catalin Pit shared a list of open-source alternatives to popular AI-powered apps like Bitly, Jira, and Docusign.
Fantasy Filter: AI video platform Pika just dropped a new batch of effects, including "museum me," "baby me," "hero me," and "princess me."
Bot Burnout: Anthropic CEO Dario Amodei explained why he thinks there should be an "I quit" button embedded in chatbots so they can let us know when they're getting too overwhelmed or stressed.
AI-GENERATED IMAGES
Springtime bliss

Source: ubudmdulc174 on Midjourney
Midjourney Prompt: Impressionist wood grain oil painting, parallel view, A photo of two cute [enter subject] wearing blue harnesses, lying on the grassy shore of Lake Lisa in Kitzbühel with a clear sky and mountains in the background. One [enter subject] is standing up while the other is sitting down. The scenery includes green meadows, lake water, and a distant mountain range under sunny skies. Bright sunlight highlights their fur colors and playful expressions, small strokes of oil painting, --stylize 250 --profile 6rtguen
Acquire new customers and drive revenue by partnering with us
Superhuman is the world's biggest AI newsletter for businesses and professionals with 1M+ readers and 2M+ followers on socials working at the world's leading startups and enterprises. Companies like Amazon, Hubspot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.
Your wish is my command
What did you think of today's email? Your feedback helps me create better emails for you!
Got more feedback or just want to get in touch? Reply to this email and we'll get back to you.
Thanks for reading.
Until next time!
Zain & the Superhuman AI team