LL-Mondays: Opening Grok's box; ChatGPT's got legs (literally); EU gets its (AI) act together; Jensen Huang on 'pain & suffering'
Also Devin, SIMA and more
Hi folks, welcome to LL-Mondays - your weekly roundup of the latest in AI and Deeptech sprinkled with reflection
Elon delivers
Grok-1 is a 314B-parameter Mixture-of-Experts (MoE) model (2 of 8 experts active on a given token), released open-source with open weights. GitHub
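For readers new to MoE, here is a minimal sketch of what "2 of 8 active on a given token" means in practice. It is a generic top-k gating example in plain Python, not xAI's code: the function names, the toy experts and the router scores are all made up for illustration, and normalising with a softmax over only the selected experts is an assumption (one common choice), not a confirmed detail of Grok-1.

```python
import math

NUM_EXPERTS = 8   # Grok-1 is described as having 8 experts
TOP_K = 2         # with 2 of them active on a given token

def top_k_route(router_logits, k=TOP_K):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    (Assumed normalisation; real implementations vary.)
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

def moe_forward(token, router_logits, experts):
    """Mix the outputs of the selected experts; the other experts never run."""
    return sum(w * experts[i](token) for i, w in top_k_route(router_logits))

# Hypothetical toy experts: expert i simply scales its input by (i + 1).
experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]
logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]  # made-up router scores

routes = top_k_route(logits)               # experts 1 and 4, weight 0.5 each
output = moe_forward(1.0, logits, experts)
```

The point of the design is that every token touches only two experts' worth of compute, even though all eight experts' weights have to be stored.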
Weights and architecture have been provided with the base model checkpoint. Pre-training concluded in October 2023, and the model is not fine-tuned for any particular task, e.g. conversation. Whilst an Apache 2.0 license enables commercial use, this drop doesn’t include any of the data used to train the model, nor does it allow for connecting to X for real-time data.
In November 2023, xAI mentioned that Grok-1 was “developed over the last four months” and is focused on code generation, creative writing, and question answering.
“It's more open source than other open weights models, which usually come with usage restrictions. It's less open source than Pythia, Bloom, and OLMo, which come with training code and reproducible datasets.” - Sebastian Raschka, PhD (Tweet)
Elon delivered on his promise, and whilst Grok-1 could be more open like Mistral/Falcon, it is ahead of most ‘open-weight’ and closed models/limited open license models. E.g. Meta may publish research openly; however, the Llama 2 license restricts developers from using the model's outputs to improve other LLMs, and companies with very large user bases (over 700 million monthly active users) need a separate license from Meta.
Bottom line: Grok-1 may push limited open license models to be ‘more open’ (might Llama 3 include some training data?)
Figure x OpenAI: GPT v-Humanoids
Figure and OpenAI announced their strategic partnership a couple of weeks back to enhance humanoid robots' language processing & reasoning capabilities so they can complete real-world tasks with full autonomy. This week a milestone demonstration showed Figure 01 conversing fluently & completing dexterous robot actions without any remote operation. Great technical thread here from Corey (AI@Figure, former GDM).
Bottom line: An inflection point for humanoid AI providers? Expect more shovels to enter this space and accelerate as hardware costs drop. One could draw an analogy to the proliferation of the autonomous vehicle software provider industry (e.g. Applied Intuition) as commercial LIDAR costs dropped
3D World Agents from Google DeepMind
Google DeepMind just dropped SIMA (Scalable Instructable Multiworld Agent), a generalist AI agent that can follow natural-language instructions to perform tasks in 3D virtual-world video games.
Astonishingly, SIMA uses just two inputs, (1) images from the game screen and (2) natural-language instructions, to control a game's central character by outputting keyboard & mouse actions across a range of gaming worlds.
SIMA represents a significant step change in creating autonomous AI agents. Experiments show superior performance over specialized agents trained on individual games, with SIMA generalizing effectively to new, unseen environments.
Bottom line: Expect the world of Agents to make significant strides in 2024. Fully generative virtual worlds are also not too far behind (GENIE).
Devin - The world’s first AI SWE
Cognition Labs just introduced the first AI software developer to the world – Devin. Devin can plan and execute complex software engineering tasks from start to finish using its own shell, code editor, and web browser.
Devin can learn how to build and deploy apps end-to-end, find and fix bugs, and lots more.
“Devin correctly resolves 13.86%* of the issues end-to-end, far exceeding the previous state-of-the-art of 1.96%. Even when given the exact files to edit, the best previous models can only resolve 4.80% of issues.”
Bottom line: 13.86% might not seem like much, but a roughly 7X gain on the previous SOTA (1.96%) is an impressive benchmark to build on.
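To put the figures quoted above in ratio terms, here is a quick back-of-the-envelope check. The numbers come straight from the Cognition quote; the variable names are just labels chosen for this sketch.

```python
# Resolution rates (%) as quoted by Cognition Labs
devin_unassisted = 13.86   # Devin, resolving issues end-to-end
prev_unassisted = 1.96     # previous state-of-the-art, end-to-end
prev_assisted = 4.80       # best previous models, given the exact files to edit

# Like-for-like gain: both systems working end-to-end
gain_unassisted = devin_unassisted / prev_unassisted   # about 7.1x

# Tougher comparison: Devin unassisted vs. previous models with file hints
gain_vs_assisted = devin_unassisted / prev_assisted    # about 2.9x

print(f"vs unassisted SOTA: {gain_unassisted:.1f}x")
print(f"vs assisted SOTA:   {gain_vs_assisted:.1f}x")
```

Even against the more generous baseline, where earlier models were told which files to edit, Devin still comes out nearly 3x ahead.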
EU AI Act finally lands
Last week the European Parliament passed the Artificial Intelligence Act, establishing the world's first comprehensive AI regulatory framework.
Expected to come into effect by May 2024, with phased implementation from 2025, the AI Act includes provisions such as bans on specific practices, governance rules for general-purpose AI, and obligations for high-risk systems.
It also focuses on transparency, labels for deepfakes, and support for SMEs through regulatory sandboxes.
Bottom line: In keeping with the EU's track record on innovation, most operators are concerned that they will be hampered by a framework mired in ideals and lacking in practicality. Meanwhile, the AI community in Paris is popping.
1 new learning last week
Marc Andreessen on Andy Rachleff’s ‘Onion Theory of Risk’ (video)
The way that I think about running a startup is also the way I think about raising money, which is it’s a process of peeling away layers of risk… you’re peeling away risk by achieving milestones.
…
You can think of a day 1 startup as having every conceivable kind of risk: founding team risk, product risk, technical risk, market acceptance risk, revenue risk, cost of sales risk, viral growth risk, etc. And as you achieve milestones, you're (1) making progress on your business and (2) justifying raising more capital.
My take: Whilst this is a great lens for both investors and operators, I love that it can be applied to just about anything you do, including spinning up a new project or making a speculative bet. Startups just happen to have extremely high and complex Beta, i.e. lots of risk layers, like a large onion.
1 interesting stat
72% of senior execs report they have reached AI maturity, and 69% of this group treat GenAI as a top priority, but only 12% have deployed it so far! Training data is the biggest adoption barrier for enterprise GenAI.
Source: LXT The Path to AI Maturity 2024 - an Executive Survey
1 interesting quote
Arguably the man of the moment, Jensen Huang (Founder & CEO, Nvidia) has been making further waves online with his recent words of wisdom for entrepreneurs and wantrepreneurs.
“Greatness comes from character, and character isn't formed out of smart people. It is formed out of people who have suffered. So for all of you Stanford students... I wish upon you ample doses of pain and suffering.”
Source: Keynote by NVIDIA CEO Jensen Huang at 2024 SIEPR Economic Summit