12 Days of Open AI - mitchellmetz.com

I’m enjoying following AI news. I’ve been checking in on these videos, and summarizing them for my own consumption. I thought it would be fun to just copy-paste those summaries here as well.

Here’s a more concise, fresh take:

Day 12

OpenAI’s Latest Breakthrough: The 03 Model Family

OpenAI has unveiled its latest advancement in AI technology with the 03 model family, marking significant progress in AI reasoning and efficiency. Here are the key highlights:

Performance Milestones

Achieved 75.7% on ARC AI’s holdout set (scaling to 87.5% with increased computation)
Set new records on GPQ Diamond with 87.7% accuracy
Impressive 25% accuracy on Epic AI’s challenging Frontier Math benchmark
Breakthrough performance on mathematical reasoning and coding tasks

Introducing 03 Mini

The standout innovation is the 03 Mini, designed for practical deployment:

Three-tier reasoning system (low, medium, high) for flexible processing
Significantly reduced latency compared to predecessors
Developer-friendly features including function calling and structured outputs
Optimized balance between cost and performance

Real-World Impact

Enhanced problem-solving capabilities across various domains
Improved accessibility for developers and researchers
Strong focus on safety and responsible deployment
Scheduled for public release in January, following extensive safety testing

These developments represent a significant step forward in making advanced AI capabilities more accessible while maintaining high performance standards. The 03 family demonstrates OpenAI’s commitment to balancing innovation with practical usability and safety considerations.

Day 11

The AI Assistant Revolution: Making Daily Tasks Smarter

As we continue our “12 Days of AI” journey, let’s explore how AI is transforming our everyday workflow. From creative projects to technical tasks, AI assistants are becoming our indispensable digital companions.

Creative Companion

Imagine having a brainstorming partner available 24/7. Whether you’re crafting the perfect holiday playlist, designing marketing materials, or writing content, AI can suggest ideas, offer alternatives, and help refine your work. It’s like having a creative collaborator who never runs out of energy.

Coding Buddy

For developers, AI has become the ultimate pair programmer. Need to debug that tricky function? Looking for a more elegant solution? AI can analyze your code, suggest improvements, and even generate new code snippets. It’s particularly handy during those crucial moments before a demo or when facing tight deadlines.

Research & Writing Assistant

Gone are the days of tedious research and writer’s block. AI can now help gather information, fact-check your work, and even suggest ways to make your writing more engaging. Whether you’re drafting a blog post, creating documentation, or writing a report, AI tools can help streamline the process while maintaining your unique voice.

The Game Changer

What makes these AI assistants truly revolutionary is their ability to learn and adapt to your style. They’re not just tools; they’re collaborators that enhance your natural abilities rather than replace them. By handling the repetitive aspects of tasks, they free you to focus on the creative and strategic elements that truly matter.

The future of work isn’t about AI taking over – it’s about AI empowering us to work smarter, faster, and more creatively. As we continue to explore this technology, we’re discovering new ways to leverage AI assistants to enhance our daily productivity and spark innovation.

Day 10

Expanding Accessibility

Day 10 of the “12 Days of OpenAI” event focused on enhancing accessibility to AI technologies. OpenAI emphasized its mission to make AI beneficial to all by introducing exciting new features that broaden engagement with the ChatGPT platform.

Key Highlights:

Voice Interactions: OpenAI announced the availability of ChatGPT via voice, allowing users in the U.S. to engage directly with the AI through phone calls at 1-800-CHAT-GPT. This feature aims to provide a more interactive and personal experience.
WhatsApp Integration: Users globally can now message ChatGPT on WhatsApp, facilitating easy access through a platform that many are already familiar with. This integration supports users in receiving real-time answers, enhancing everyday interactions with the AI.
Seamless Experience: The day featured demonstrations of the voice and WhatsApp functionalities, showing how simple and effective it is to communicate with ChatGPT. Whether using a rotary phone or the latest smartphone, users can interact with the AI and receive assistance on various queries.
Community Engagement: OpenAI encouraged participants to add ChatGPT as a contact, reinforcing the accessibility and ease of use in the AI’s daily application.

These advancements reflect OpenAI’s commitment to reducing barriers and promoting widespread adoption of AI technologies, ensuring that users from all backgrounds can leverage the power of ChatGPT in their lives.

Day 9

Developer-Centric Evolution

OpenAI showcased major developer-focused advancements during Day 9 of their “12 Days of AI” celebration, highlighting their thriving community of over 2 million developers across 200+ countries.

The spotlight was on the official release of the O1 model from preview, bringing four game-changing features:

Function Calling for smooth backend API integration
Structured Outputs with JSON schema for cleaner application integration
Developer Messages for enhanced model behavior control
Vision Inputs enabling advanced visual processing capabilities

Performance improvements were significant, with the O1 model achieving 60% reduction in thinking tokens, translating to faster processing and lower costs. The Realtime API got a boost with WebRTC support, streamlining real-time voice applications with reduced latency.

A standout addition is the new preference fine-tuning feature, allowing developers to shape model responses based on user feedback – particularly valuable for customer service and content moderation applications.

Developer tools expanded with new SDKs for Go and Java, alongside a simplified API key signup process, making OpenAI’s technology more accessible than ever.

These updates reinforce OpenAI’s dedication to empowering developers and startups with robust, innovative tools that push the boundaries of AI application development.

Day 8

ChatGPT Unleashes Universal Search Access

In a groundbreaking move, OpenAI has democratized ChatGPT’s search capabilities, extending them to all logged-in users worldwide. This expansion marks a significant milestone in making advanced AI search accessible to everyone, not just Plus subscribers.

Breaking Down the Barriers

The rollout reaches hundreds of millions of users across all platforms – web, iOS, and Android. Whether you’re on your desktop or mobile device, ChatGPT’s enhanced search features are now at your fingertips, no premium subscription required.

Revolutionary Features

The update introduces two game-changing capabilities:

Web-Connected Intelligence: ChatGPT now taps into real-time web data, delivering current information while maintaining its signature conversational flow.
Voice Search Revolution: The new voice mode transforms how we interact with AI, enabling natural, spoken queries for web searches – a truly hands-free experience.

Enhanced User Journey

ChatGPT’s search experience has been refined for both speed and clarity:

Lightning-fast results for everything from travel planning to local event searches
Rich visual content integration, including images and videos
Intelligent conversation tracking that remembers context, making follow-ups feel natural

Browser Power-Up

The browser integration brings new convenience:

Set ChatGPT as your go-to search engine
Smart navigation that prioritizes direct website access when you know your destination
Streamlined interface for quicker web browsing

Mobile Magic

The mobile experience has been supercharged:

Seamless Apple Maps integration for iPhone users
Advanced location-based features for discovering local spots
Intuitive voice and mapping capabilities for on-the-go searches

This transformation represents more than just an update – it’s a reimagining of how we interact with AI-powered search. By maintaining the natural conversation flow while adding powerful search capabilities, ChatGPT continues to push the boundaries of what’s possible in AI assistance.

Keep watching as we unveil more innovations in our 12 Days of AI series!

Day 7

ChatGPT Projects – Organized Conversations & Enhanced Productivity

OpenAI has introduced Projects, a powerful new organizational feature for ChatGPT that allows users to better manage and group their conversations. This feature represents a significant upgrade in how users can interact with and organize their ChatGPT interactions.

Key Features:

Project Creation: Users can create dedicated projects with customizable titles and color coding
File Management: Ability to add relevant files and instructions to specific projects
Conversation Organization: Users can group related chats and add existing conversations to projects
Context Awareness: The AI maintains context across conversations within a project

The feature was demonstrated through practical use cases including:

A Secret Santa gift exchange organizer
Home maintenance tracking system
Personal documentation management

Availability:

Initially rolling out to Plus, Pro, and Teams users
Coming soon to free users
Enterprise and EDU versions planned for early 2025

This update addresses one of the most requested features from ChatGPT users, providing a more structured way to manage multiple conversations and maintain context across related interactions.

Day 6

Video, Screen Sharing, and Santa Mode Come to ChatGPT

OpenAI announced two major features today: video capabilities in Advanced Voice mode and a special holiday Santa mode. Here are the key highlights:

Video & Screen Sharing Features

Users can now share live video and screen content with ChatGPT in real-time
Rolling out gradually over the next week to:
- Teams users
- Most Plus and Pro subscribers
- (Coming early next year for Enterprise and edu plans)
- European Plus/Pro users will get access later

Advanced Voice Upgrades

Uses a multimodal model that directly processes audio input/output
Supports natural conversations in 50+ languages
More expressive with enhanced emotion and tone
Real-time visual interaction capabilities

Santa Mode

Available globally wherever ChatGPT voice mode works
Accessible via:
- Mobile apps
- Desktop apps
- chat.openai.com
Features Santa’s signature jolly voice
Usage limits reset for first-time Santa conversations
Can be found via:
- Snowflake icon on home screen
- ChatGPT settings page

The update demonstrates OpenAI’s push toward more interactive and engaging AI experiences, combining visual, voice, and themed interactions in time for the holiday season. The video feature was showcased through practical demonstrations including pour-over coffee instructions, highlighting the potential for real-time visual assistance and learning applications.

This development marks a significant step in making AI interactions more natural and versatile, while the Santa mode adds a festive touch that makes the technology more approachable and fun for users of all ages.

Day 5

ChatGPT x Apple Intelligence – Seamless Integration Across Devices

Day 5 of OpenAI’s “12 Days of AI” series celebrates the exciting integration of ChatGPT across Apple devices, including iPhones, iPads, and Mac computers. This update marks a significant step in making ChatGPT even more accessible and frictionless, directly connecting it to Apple’s ecosystem. Whether on the go with an iPhone or working on your MacBook, ChatGPT is now right at your fingertips.

Key Features Introduced

Siri Collaboration
ChatGPT can now be directly invoked through Siri. If Siri encounters a task too complex or requires deeper reasoning, it seamlessly hands it off to ChatGPT. Users maintain full control; they can confirm requests and decide whether or not to share specific content like screenshots or documents.
Writing Assistant Tools
ChatGPT enhances Apple’s writing tools, enabling users to refine documents, summarize lengthy texts, compose entire documents from scratch, and extract key points. This feature is perfect for users who need an efficient assistant to simplify complex projects, whether academic or professional.
Visual Intelligence on iPhone and Camera Control
The integration includes powerful visual intelligence features through Apple’s camera system. Users can access ChatGPT via their iPhone camera (available on iPhone 16 and later models) to “learn more about” what they’re looking at, such as analyzing documents or images in real time.
Cross-App Functionality
On all Apple devices, ChatGPT can be accessed from any application. For instance, typing to Siri on a Mac or using a simple shortcut on your iPhone allows ChatGPT to respond to queries, edit content, or analyze information seamlessly within the existing app. The “ChatGPT” button now even saves entire sessions for later continuation.

Fun, Festive Use Cases

To showcase its capabilities, a live demo during the Day 5 video highlighted creative and festive use cases. This included organizing a holiday party with ChatGPT, generating a holiday playlist (featuring Mariah Carey’s iconic music), and ranking participants in a Christmas sweater contest. These examples emphasize how ChatGPT’s usefulness goes beyond professional applications—it’s engaging, entertaining, and adaptable in festive and personal settings, too.

How It Works

The integration operates through Apple’s “Apple Intelligence” system, which is accessible in device settings. Once enabled, users can interact with ChatGPT via the Siri interface, system tools, or standalone apps. The rollout supports both anonymous use and linked accounts, ensuring personalization without sacrificing privacy.

This is a game-changing milestone in AI accessibility, blending the power of ChatGPT with Apple’s intuitive design to create a seamless, cross-platform experience for users. Whether the goal is productivity, creativity, or just having some fun, ChatGPT on Apple devices is unlocking new possibilities for AI-powered convenience.

Day 4

Unveiling Canvas – A Collaborative Writing and Coding Experience

On Day 4 of the OpenAI 12 Days of AI celebration, the spotlight was on Canvas, a new feature designed to enhance collaboration in writing and coding. Previously available only to Plus users in beta, Canvas is now accessible to everyone, seamlessly integrating into ChatGPT.

Key Features:

Collaborative Canvas:
Canvas allows users to create and edit documents side-by-side with ChatGPT. This dynamic interface provides a clear distinction between user input and AI-generated content, streamlining the editing process and making collaborative writing more intuitive.
Python Code Execution:
Users can now run Python code directly within Canvas. This feature provides immediate feedback on code, making it easier for programmers to debug and visualize their work with accessible output like text and graphics.
Interactive Feedback:
ChatGPT can now leave comments directly on specific portions of text or code, enabling more precise and relevant feedback. Users can apply suggested changes or fine-tune their content, thus enhancing the iterative writing and coding experience.
Integration with Custom GPTs:
Canvas has been integrated into custom GPTs, allowing users to create tailored experiences. This feature empowers creators to define specific tasks and styles, making it easier to generate draft responses or educational content.

With these enhancements, Canvas not only supports creative writing projects, like crafting stories for children, but also aids in technical writing and programming tasks, making it a versatile tool for a variety of users.

Join us as we explore these innovative features, unlocking new potential in collaborative, AI-assisted creativity, whether you’re writing a holiday story or developing complex code.

Day 3

Sora’s Advanced Video Generation Features

As part of OpenAI’s 12 Days of AI, they dove into one of the standout innovations from OpenAI: Sora. This groundbreaking tool revolutionizes video generation by combining state-of-the-art machine learning with intuitive human-computer interface design. Below is a detailed overview of its advanced features.

Sora’s Core Features:

Looping:

Sora offers the capability to create seamless loops by filling in frames between the start and end points of a video.
This feature is particularly useful for artists who want to craft repetitive, smooth transitions in their visual content effortlessly.

Blend:

With Blend, users can merge two distinct scenes into a cohesive new video.
This tool allows creative experimentation, such as blending robots with natural scenes, leading to innovative and unique outputs.

Storyboard:

One of Sora’s most powerful tools, Storyboard, allows users to direct videos through a sequence of actions and scenes.
It supports granular editing where users can define the environment, character movements, and actions, making the storytelling process highly customizable.

Recut:

This tool provides enhanced video editing capabilities by allowing users to trim and extend segments within a storyboard.
It simplifies creating new beginnings or endings for existing videos, offering flexibility in video narrative construction.

Remix:

The Remix feature enables users to describe changes they want in an existing video, and Sora will generate the new version.
It supports variations in details like adding wind or changing characters, providing a dynamic way to iterate on video content.

Explore:

Sora’s Explore section is a community-driven feed brimming with shared videos and creative techniques.
This platform for inspiration helps users learn and draw ideas from others, enhancing the collaborative nature of video creation.

Availability

Sora is set to go live in most of the world, with some restrictions in Europe and the UK, where roll-out will take a bit longer. OpenAI Plus subscribers will enjoy 50 generations per month, while Pro subscribers can access more with faster and higher resolution options.

Conclusion

OpenAI’s Sora is a testament to the potential of combining advanced AI with user-centric design, pushing the boundaries of visual content creation. As we continue the 12 Days of AI, stay tuned for more groundbreaking developments and tools that are set to transform the landscape of artificial intelligence.

For more information, check out OpenAI’s official announcements and explore the magic of Sora in action.

Day 2

Key Highlights

Introduction to 01 and Reinforcement Fine-Tuning (RFT):
- 01 is OpenAI’s latest model series, incorporating advanced reasoning capabilities.
- RFT allows customization by teaching models to reason effectively over specific tasks and domains using reinforcement learning techniques, rather than just mimicking input data.
Comparison with Supervised Fine-Tuning:
- Supervised fine-tuning modifies the model’s output style, tone, or structure.
- RFT enables models to learn entirely new reasoning methods for tasks using feedback (rewards) based on their performance.
Target Audience and Applications:
- Beneficial for domains requiring expertise, such as legal, finance, engineering, insurance, and scientific research.
- Example: A partnership with Thomson Reuters used RFT to develop a legal assistant, “Co-Counsel AI,” which aids analytical workflows.

How Reinforcement Fine-Tuning Works

Setup Process:
- Training Data: Users provide curated JSONL datasets (e.g., case reports in healthcare, legal documents) to teach models domain-specific reasoning.
- Graders: A key innovation, graders evaluate the model’s performance by assigning scores to its output (0-1 scale). Graders measure correctness and reasoning quality, providing feedback for learning.
Workflow:
- Upload a training dataset.
- Define validation data to ensure generalization rather than memorization.
- Use graders to guide the model during training.
- Launch training on OpenAI’s infrastructure, leveraging their reinforcement learning algorithms.
Example Use Case:
- A biomedical task involved diagnosing rare genetic diseases:
  - Data included patient symptoms, excluded symptoms, and known genetic mutations.
  - RFT trained a smaller model, 01 Mini, to identify likely genes responsible for diseases from symptom descriptions.
  - Validation ensured the model generalized beyond the training data.

Demonstration Results

Performance Metrics:
- Compared models on “top at 1” (correct answer ranked first), “top at 5,” and “top at max” (correct answer listed anywhere).
- RFT significantly improved 01 Mini’s performance:
  - “Top at 1” rose from 17% to 31%.
  - Demonstrated generalization across new data.
Visualized Outputs:
- The fine-tuned model provided ranked gene predictions with reasoning explanations, offering valuable insights for domain experts.

Potential Impact and Future Directions

Applications Across Domains:
- Scientific Research: E.g., genetic disease diagnostics, AI safety, legal workflows, healthcare applications.
- Hybrid approaches combining traditional tools with fine-tuned AI models hold promise for near-term advancements.
Accessibility:
- OpenAI is expanding its RFT research program for organizations tackling complex, expert-level tasks.
- Public launch planned for early next year.

Summary of Advantages

Custom Reasoning: Models learn to reason in new, domain-specific ways.
Efficiency: Small datasets (dozens of examples) suffice for customization.
Scalable: Leverages OpenAI’s infrastructure for ease of use.
Generalization: Models adapt to unseen tasks without memorizing specifics.

This presentation highlighted how RFT represents a leap in model customization, enabling AI to handle complex, expert tasks across diverse domains.

Day 1

Detailed Overview: OpenAI o1 and o1 Pro Mode Launch

Introduction:

OpenAI introduced the “12 Days of OpenAI,” unveiling a new product or feature daily over 12 weekdays.
Day 1 highlights two major updates: the launch of o1, the next evolution of OpenAI’s models, and a new subscription tier called ChatGPT Pro.

Key Announcements:

1. o1 Model:

Smarter and Faster:
- Significant performance upgrades compared to GPT-4 and o1 Preview.
- Enhanced capabilities for scientists, engineers, and coders, excelling in tasks like math, competition coding, and complex Q&A.
Multimodal Capabilities:
- o1 processes both text and images jointly, improving its versatility.
Improved Responsiveness:
- Faster response times tailored to query complexity (simple questions yield quicker answers; complex problems allow deeper thought).
- Evaluations show a 34% reduction in major mistakes and a 50% increase in response speed compared to o1 Preview.
Real-World Demo:
- Showcases of o1’s faster performance and higher accuracy in tasks like listing historical data (e.g., Roman emperors) and solving physics-related problems (e.g., space cooling systems).

2. ChatGPT Pro:

New Subscription Tier:
- Designed for “power users” pushing the limits of AI capabilities in math, science, programming, and complex writing tasks.
- Priced at $200/month.
Features:
- Unlimited access to all models, including advanced voice mode.
- o1 Pro Mode:
  - Uses additional compute resources for even more challenging problems.
  - Offers higher reliability and accuracy, making it ideal for complex workflows.
- Demonstration:
  - A challenging chemistry problem solved by o1 Pro Mode in under a minute, with detailed reasoning and correct answers.
Target Audience:
- Researchers, professionals, and users who need top-tier performance for intensive tasks.

Highlights from o1 Demonstrations:

General Performance:
- o1 processes tasks significantly faster and with fewer errors compared to previous models.
- Demonstration included answering historical queries and solving detailed physics problems involving radiative cooling in space.
Multimodal Reasoning:
- The model handles image-based inputs effectively, extracting relevant data and making accurate calculations (e.g., calculating cooling panel areas in a fictional space scenario).
- Strong performance on standard benchmarks (e.g., MMU, Math Vista).
o1 Pro Mode:
- Specifically optimized for tackling highly complex problems in math, science, and programming.
- Provides thought-out, structured reasoning for challenging queries with an extended compute process.

Future Plans:

For Pro Tier Users:
- Expanding capabilities for longer and more compute-intensive tasks.
API Enhancements:
- Features for developers such as structured outputs, function calling, developer messages, and API image understanding.
Tool Integrations:
- Adding functionalities like web browsing, file uploads, and advanced developer tools to o1.
Developer Focus:
- Promising advanced capabilities for building innovative applications.

Closing:

OpenAI expressed excitement about o1 and o1 Pro Mode, emphasizing improvements in user experience and capabilities.
Upcoming updates for developers and additional features will be showcased in subsequent days.
The launch concluded with a festive dad joke about Santa and AI, lightening the technical tone.

This launch marks a significant leap in OpenAI’s AI capabilities, catering to both professional power users and general audiences seeking improved performance, multimodal input handling, and reliability.