OpenAI 12 Days 2024 Announcements
OpenAI 12 Days 2024 Announcements
Day 1- Announcements
-
Launch of o1 Full Version: This is an upgraded model designed to be faster, smarter, and multimodal, responding better to instructions. It shows significant improvement over its predecessor, especially in coding and problem-solving tasks.
-
Introduction of ChatGPT Pro: A new subscription tier priced at $200/month offering unlimited access to OpenAI’s models, including advanced features like voice mode and o1 PR mode. The PR mode is intended for the most challenging problems, providing even higher performance capabilities.
-
Enhancements in ChatGPT:
- Multimodal capabilities allowing the model to understand and process both text and images.
- Improved speed and intelligence, providing faster and more accurate responses even to complex queries.
- Continued enhancements to make the model better suited for both everyday tasks and high-level computational challenges.
Day 2 - Announcements
-
Reinforcement Fine-Tuning (RFT): Unlike standard fine-tuning, RFT uses reinforcement learning to enhance the model’s ability to reason over new domains. It incentivizes correct responses and disincentivizes incorrect ones, allowing the model to adapt to specialized tasks with relatively few examples.
-
Applications in Various Fields: They features discussions on how RFT can be applied across different sectors such as legal, finance, engineering, and insurance, showcasing a partnership with Thompson Reuters to develop a legal assistant AI.
-
Scientific Research Applications: Justin Ree, a computational biologist, discusses the potential of RFT in aiding the study of rare genetic diseases by improving the model’s ability to interpret complex biomedical data and perform systematic reasoning.
-
Demonstration of RFT’s Capabilities: The team demonstrates the process of setting up and running an RFT training job on OpenAI’s platform, highlighting how the model learns to generalize from training to validation data sets effectively.
-
Enhanced Performance with Fewer Resources: The video shows how RFT can make smaller and less resource-intensive models (like 01 mini) perform as well or better than more extensive models on specific tasks.
Day 3 - Announcements
On Day 3 of OpenAI’s event, they announced the launch of Sora, their new video product. Here’s a summary of the key points from the announcement:
-
Product Introduction: Sora is introduced as a tool primarily for video generation, aiming to revolutionize creative processes with AI. It supports creative dynamics between AI and users, enhancing how creatives interact with technology.
- Capabilities and Features:
- Video Generation: Sora allows users to generate videos from text descriptions or image uploads, supporting different aspect ratios and resolutions up to 1080p, and video lengths from 5 to 20 seconds.
- Explore and Library: The platform includes an ‘Explore’ feature for inspiration and a ‘Library’ as a home base for users’ creations, with organizational tools like folders and bookmarks.
- Storyboard: A new feature called ‘Storyboard’ lets users direct videos with multiple actions across a sequence, using a timeline for detailed creative control.
- Advanced Tools: Other advanced tools include remixing options to modify existing videos, loop features for creating seamless repeats, and blend tools for merging scenes.
-
User Accessibility: Sora is integrated into existing ChatGPT Plus and Pro accounts without extra charges, making it broadly accessible. It offers a tiered usage model based on the type of OpenAI subscription.
-
Global Availability: Launching in the United States and internationally, with specific limitations in Europe and the UK due to pending approvals.
- Safety and Moderation: OpenAI emphasizes the importance of abuse prevention and safe creative expression, acknowledging the ongoing challenge of balancing moderation with creativity.
Day 4 - Announcements
Canvas is positioned as a tool that broadens the scope of what can be achieved with ChatGPT, from everyday creative writing to more complex programming tasks, offering an interactive and integrated platform for enhanced productivity and creativity.
-
Launch of Canvas: Initially available in beta for Plus users, Canvas is now being launched for all users. This tool integrates directly into OpenAI’s main model, enabling a seamless operation that combines ChatGPT’s capabilities with interactive documents.
- Features of Canvas:
- Integration with Python: Users can now run Python code directly within Canvas, allowing them to see outputs, whether text or graphics, in real-time alongside the ChatGPT conversation.
- Custom GPT Integration: Canvas functionality can be integrated into custom GPTs, enhancing them with the ability to generate and edit documents and code interactively.
- User Interface and Experience:
- Canvas provides a side-by-side view where users can interact with ChatGPT on one side and see the canvas document or code editor on the other.
- Users can initiate a canvas session within ChatGPT, enabling a flexible and dynamic space for creating documents or debugging code.
- Canvas supports direct editing by users, application of suggestions by ChatGPT, and the use of shortcuts for common tasks such as adjusting text length, adding polish, or embedding emojis.
- Real-Time Python Execution:
- Canvas includes a feature to run Python code within the interface, complete with syntax highlighting and error feedback.
- Users can interactively debug and improve their code with suggestions automatically applied by ChatGPT, illustrated by side-by-side comparisons of changes.
- Application to Various Content Types:
- Canvas is designed to support a wide range of content types, from creative writing projects like Christmas stories to technical tasks such as coding and data analysis.
- The integration of custom GPTs with Canvas allows for tailored responses and functionalities, enhancing the personalization of ChatGPT applications.
Day 5 - Announcements
On Day 5 of OpenAI’s event, the focus was on enhancing the accessibility and integration of ChatGPT across Apple devices. Here are the key features announced:
- Integration with iOS and macOS:
- ChatGPT is now integrated into iOS for iPhones and iPads, as well as macOS, making it accessible directly through the operating system. This integration aims to make ChatGPT as frictionless as possible for Apple device users.
- Key Integrations:
- Siri Integration: ChatGPT can be invoked by Siri when it identifies tasks that could benefit from ChatGPT’s capabilities. This seamless handoff aims to enhance user experience by integrating ChatGPT’s intelligent responses directly into Siri’s functionality.
- Writing Tools: Apple’s intelligent writing tools now include the capability to use ChatGPT for composing documents from scratch. This feature enhances the document creation process, providing users with sophisticated AI-driven text generation.
- Camera Control with Visual Intelligence: On iPhone 16, users can invoke visual intelligence through ChatGPT to gain insights about objects in their environment. This feature utilizes the camera to interact with ChatGPT visually, providing a richer interaction model.
- User Configuration and Privacy:
- Within the settings on Apple devices, users can enable ChatGPT and log into their accounts for a personalized experience. Privacy settings allow users to confirm ChatGPT requests, ensuring that user data is handled securely.
- Demonstration of Features:
- Demonstrations included using Siri to organize a Christmas party via ChatGPT, adjusting settings for personalized use, and using ChatGPT to create festive album art and manage Christmas playlists. Additionally, the integration allows for complex queries to be processed by ChatGPT directly from macOS applications, showcasing the ease of accessing ChatGPT’s capabilities across various contexts.
- Application Examples:
- Practical examples included creating visual content, sorting data, and enhancing productivity tasks directly from Apple devices, illustrating the versatility and utility of the ChatGPT integration in everyday tasks.
Day 6 - Announcements
The announcements from Day 6 highlighted OpenAI’s commitment to improving user experience by integrating more interactive and multimedia capabilities into ChatGPT, making it a more versatile and engaging tool for various applications.
- Apology for Downtime:
- The event started with an apology for a few hours of downtime experienced the previous day, assuring users that a detailed postmortem would be shared later to address and explain the incident.
- Introduction of Video to Advanced Voice Mode:
- OpenAI introduced video and live screen sharing capabilities to ChatGPT’s Advanced Voice Mode. This new feature allows users to engage in richer, more interactive conversations by sharing real-time visual content during their chats.
- Demonstration of New Features:
- A live demonstration showed how users could initiate video calls within ChatGPT, introducing team members and interacting through both voice and visual feedback. ChatGPT displayed its capability to remember and recall details about participants, enhancing the interaction quality.
- Real-time Interaction Enhancements:
- The demonstration extended to practical applications like making pour-over coffee, where ChatGPT guided the process step-by-step through video. This showcased the practical utility of combining voice, video, and instructional guidance in real-time.
- Screen Sharing Capabilities:
- The screen sharing feature was highlighted, showing how users can share their screens during a conversation. This was demonstrated by seeking assistance in crafting a response to a text message, illustrating how ChatGPT could offer contextual help based on visual cues from the user’s screen.
- Integration with Apple Devices:
- The integration of ChatGPT with iOS, demonstrated through Siri, allowed users to seamlessly transition queries to ChatGPT for more complex assistance, reflecting deep integration with mobile and desktop ecosystems.
- Holiday Themed Interaction:
- A festive touch was added with a “Talk to Santa” feature, allowing users to interact with a ChatGPT-modeled Santa Claus. This feature aimed to provide holiday cheer and a unique interactive experience, showcasing the versatility of ChatGPT in providing themed interactions.
Day 7 - Announcements
On Day 7 of OpenAI’s event, the focus was on launching a new feature in ChatGPT called Projects, designed to enhance organization and customization for users. The introduction of Projects marks a significant enhancement in how users can interact with and utilize ChatGPT, providing a more organized, customizable, and task-oriented approach to using the AI model.
- Update on Recent Rollouts:
- Updates were provided on the rollout of various features announced earlier in the week. Sora is now fully available to Plus and Pro users outside Europe. Live video and screen sharing in Advanced Voice Mode have been rolled out to Plus, Pro, and Teams users outside of Europe. Santa Mode is available globally.
- Introduction of Projects in ChatGPT:
- Projects in ChatGPT allow users to upload files, set custom instructions, and tailor ChatGPT for specific tasks within a project. This feature enhances the ability to organize and manage conversations and tasks within ChatGPT.
- Demonstration of Projects:
- A live demonstration showcased the creation of a new project, the customization of project settings, and the addition of files and instructions. This included handling specific tasks such as organizing a Secret Santa event, where ChatGPT helped to manage and execute gift exchanges based on user-inputted preferences.
- Use Cases for Projects:
- Projects can be used as smart folders to organize conversations or for more complex applications such as coding, documentation, and event planning. For example, one demonstration involved using ChatGPT to assist with home maintenance tasks by retrieving and using data from uploaded documents to provide actionable advice.
- Technical Demonstration:
- Further demonstrations included using Projects for personal website development, where ChatGPT assisted in integrating personal data into a website template and refining web content based on user feedback.
- Rollout Information:
- Projects are rolling out starting today to Plus, Pro, and Teams users, with plans to extend this feature to Free users and Enterprise and EDU users early in the New Year.
Day 8 - Announcements
On Day 8 of OpenAI’s event, Kevin Wheel announced several updates related to ChatGPT’s search capabilities, aimed at enhancing user experience and accessibility. Here are the key points from the announcement:
- Improvements to ChatGPT Search:
- OpenAI has enhanced the search functionality based on user feedback. Improvements include faster response times, optimized performance on mobile devices, and new mapping experiences.
- Integration of Search in Advanced Voice Mode:
- Search functionality is now integrated with advanced voice mode in ChatGPT. This allows users to perform web searches through voice commands, combining the convenience of voice interaction with the power of web search.
- Expansion of Search Access to All Users:
- ChatGPT search is now available to all logged-in free users globally, across all platforms where ChatGPT is used. This marks a significant expansion, making advanced search features accessible to a broader audience.
- Demonstrations of Search Capabilities:
- Demonstrations highlighted how users can initiate searches directly within ChatGPT by typing queries or using the dedicated search button for explicit web searches. Examples included searching for events in San Francisco and finding family-friendly activities in New York.
- New Default Search Engine Option:
- Users now have the option to set ChatGPT as their default search engine in their browsers, allowing for direct navigation to desired web pages from the search bar.
- Optimized Mobile Experience:
- The mobile experience for ChatGPT search has been improved, providing users with rich visual content and interactive map features directly within the app.
- Live Demo of Search in Voice Mode:
- A live demonstration showcased how users could ask for real-time information about events in Zurich and receive responses in a conversational manner, illustrating the integration of search with voice commands.
- Encouragement for User Feedback:
- OpenAI continues to encourage user feedback to further refine and enhance the search functionality.
- Account Benefits:
- While ChatGPT is accessible without an account, Kevin highlighted the benefits of creating a free account, such as access to premium features like search and canvas, and higher usage limits.
Day 9 - Announcements - Dev Day
These updates are part of OpenAI’s ongoing efforts to enhance their API offerings, making advanced AI tools more accessible and efficient for developers across various applications.
-
GPT-4.0 in the API: OpenAI announced the full launch of GPT-4.0 out of preview in the API, including new features such as function calling, structured outputs, developer messages, and vision inputs. They also introduced a new parameter called “reasoning effort,” which allows developers to manage computational resources more effectively depending on the complexity of the problem.
- Realtime API Enhancements:
- WebRTC Support: OpenAI added WebRTC support to the Realtime API, which facilitates building real-time voice experiences with better internet adaptability, including automatic bitrate adjustments and echo cancellation.
- Cost Reductions: The cost for GPT-4.0 audio tokens in the Realtime API has been reduced by 60%, and GPT-4.0 Mini audio tokens are now 10x cheaper.
- Python SDK: A new Python SDK was released to simplify integration of the Realtime API.
-
Preference Fine-Tuning: A new method called preference fine-tuning was introduced, using direct preference optimization to align models more closely with user preferences, especially useful in scenarios like customer support or content moderation.
- Additional Announcements:
- New SDKs: Official support for Go and Java SDKs was announced.
- Simplified API Key Access: An improved login and signup flow now allows developers to obtain an API key more quickly and easily.
- Educational Content: Talks from OpenAI’s global Dev Days have been made available on YouTube.
- Developer AMA: An Ask Me Anything session with OpenAI’s team was scheduled to answer developer questions.
- Bad Joke Closure: The session closed with a playful joke about structured outputs being on Santa’s naughty list because they were a “schema.”
Day 10 - Announcements
OpenAI has previously launched ChatGPT on various platforms including the web, iOS, Android, Mac, and Windows. Today, they are taking a significant step by making ChatGPT accessible via traditional telephone systems.
- Voice Calling: Users in the U.S. can now call 1-800-CHAT-GPT / (1-800-242-8478) to interact with ChatGPT by voice.
- WhatsApp Messaging: Users worldwide can now message ChatGPT on WhatsApp.
They demonstrate these features with several devices, including modern smartphones and older technology like a flip phone and a rotary phone, showcasing ChatGPT’s versatility in handling various forms of communication. The demonstrations cover scenarios like identifying a unique house while on a road trip, obtaining travel tips, and learning phrases in another language.
The WhatsApp demonstration highlights ChatGPT’s ability to adapt recipes to user preferences, such as making a pesto recipe vegan or meat-based, showcasing the AI’s responsiveness to dietary changes mentioned during the conversation.
Day 11 - Announcements
These features collectively aim to make ChatGPT a more versatile and practical tool across various professional and personal use cases, especially by enhancing its ability to interact intelligently with other software on a user’s computer.
-
Efficiency and Accessibility: The native ChatGPT desktop app for Mac is highlighted for being lightweight and resource-efficient. It operates in its own window and can be quickly accessed with a keyboard shortcut (Option-Space).
-
Seamless Integration with Other Apps: The desktop app can integrate directly with other applications on the user’s computer, allowing ChatGPT to pull context automatically. This eliminates the need for manual copying and pasting when users want to use ChatGPT alongside other apps.
-
Privacy Control: Users have full control over what information they share with ChatGPT. The app only accesses content from other applications after explicit user selection.
-
Enhanced Code Interaction: The desktop app can interact with development environments like Xcode, enabling users to get coding assistance directly within their IDE. ChatGPT can generate and suggest code based on the current context of the development project.
-
Visual Data Representation: ChatGPT can assist in transforming command-line data into visual formats, such as bar graphs, making it easier for users to interpret complex data.
-
Document Assistance: Integration with text and document management applications like Notion. ChatGPT can help users refine content, ensuring factual accuracy through web searches and adapting the tone to match existing materials.
-
Advanced Voice Interaction: The new advanced voice mode allows users to interact with ChatGPT using voice commands, enhancing the usability and accessibility of the application.
-
Support for Multiple IDEs: The desktop app supports a range of integrated development environments (IDEs), including Xcode, VS Code, and the JetBrains ecosystem, facilitating a broad range of coding tasks.
Day 12 - Announcements
OpenAI discussed their latest AI models, O3 and O3 Mini, marking the culmination of their 12-day series of announcements.
- Introduction of O3 and O3 Mini:
- O3 and O3 Mini are new frontier models with O3 described as “very smart” and O3 Mini as “incredibly smart with good performance and cost”.
- Public Safety Testing:
- The models were not launched publicly but were made available for public safety testing starting on the day of the announcement. This approach reflects OpenAI’s commitment to safety as the models increase in capability.
- Model Performance and Capabilities:
- O3 demonstrated superior performance on various benchmarks compared to previous models, particularly in coding and mathematical tasks.
- Notably, O3 shows significant improvements in software development tasks and competition-level problem solving.
- Significant Benchmarks:
- O3 achieved impressive scores on high-level, difficult benchmarks, showing advancements that allow it to perform tasks usually challenging for AI, like those involving complex reasoning and PhD-level science questions.
- Arc AGI Benchmark:
- O3 set a new state-of-the-art score on the Arc AGI benchmark, which tests AI on a variety of tasks that require distinct cognitive abilities, akin to human-like general intelligence.
- Efficiency and Accessibility of O3 Mini:
- O3 Mini, despite being a smaller model, retains high efficiency and cost-effectiveness. It offers adjustable reasoning efforts to handle tasks of varying complexity effectively.
- Demonstration of O3 Mini:
- A live demonstration showed O3 Mini executing tasks dynamically, including programming tasks that involved real-time problem solving and execution of complex commands.
- Safety and Security Research Applications:
- OpenAI invited safety and security researchers to apply to test the new models to ensure robust safety measures are in place before wider release.
- Future Plans and Availability:
- The planned public release of O3 Mini is slated for the end of January, with O3 to follow shortly after, contingent on the outcomes of the safety testing.
Note: O2 is a global brand name owned by the Spanish telecommunications company Telefónica. The company uses the O2 brand for its subsidiaries in the United Kingdom and Germany. That is why they didn’t have o2 after o1.
Overall Models and Capabilities
- ChatGPT 4o: Picture-DALL-E, Search on Web, Reasion-o1, Canvas (collaborate writing and coding). Attachment not allowed.
- ChatGPT o1 and o1-mini: Reason-o1, Search
- ChatGPT 4: Picture-DALL-E, Reason, Search on Web