.Plan

Swift

  • Basic proof of concept for integrating SwiftUI into an Obj-C UIKit app.
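A minimal sketch of the usual bridging pattern: since Obj-C can't see SwiftUI types directly, a small `@objc`-exposed Swift factory wraps the SwiftUI view in a `UIHostingController` and hands back a plain `UIViewController`. The view and class names here (`GreetingView`, `SwiftUIBridge`) are hypothetical placeholders for the proof of concept.

```swift
import SwiftUI
import UIKit

// Hypothetical SwiftUI view used for the proof of concept.
struct GreetingView: View {
    var body: some View {
        Text("Hello from SwiftUI")
            .padding()
    }
}

// SwiftUI structs aren't visible to Obj-C, so expose a factory
// (NSObject subclass, @objc) that returns a plain UIViewController.
@objc public class SwiftUIBridge: NSObject {
    @objc public static func makeGreetingViewController() -> UIViewController {
        // UIHostingController is a UIViewController subclass that
        // hosts a SwiftUI view hierarchy inside UIKit.
        UIHostingController(rootView: GreetingView())
    }
}
```

From the Obj-C side, the call goes through the generated Swift interface header (name depends on the target, e.g. `"MyApp-Swift.h"`):

```objc
UIViewController *vc = [SwiftUIBridge makeGreetingViewController];
[self.navigationController pushViewController:vc animated:YES];
```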

AI Learning

Notes from Readings:

"I am again pivoting my career. I think a SWE → AI pivot is almost as much of a pivot as going from Finance → SWE, just in terms of superficial similarity while also requiring a tremendous amount of new knowledge and practical experience in order to get reasonably productive."

"Jeremy Howard’s fast.ai course started by getting developers into AI in 7 weeks in 2016. In 2022 he’s already taking people up to reimplementing Stable Diffusion in ten 90-minute lessons. Suhail Doshi took this course in June 2022 and launched Playground.ai by November."

"This is in part driven by the Transformer architecture, introduced in 2017, which has since taken over almost every field of AI, offering a baseline strong and flexible enough that knowledge of prior architectures becomes optional. So there aren't decades of research to catch up on; it's just the last 5 years."

"But even looking at top researcher careers gives you an idea of how much time it could take you to get to the very top. Yi Tay contributed to/led many of the major recent LLM advancements at Google, but you may be surprised to learn he just has about 3.3 years of experience since his PhD. Ashish Vaswani was 3 years out of his PhD when he published the Transformer paper, and Alec Radford was 2 years out of undergrad when he published the GPT and GPT-2 papers at OpenAI."

"Career trajectories like this do not happen in more mature fields like physics, mathematics, medicine, because their foom years were centuries ago. AI’s “foom” is clearly happening now."

"Many areas of contribution apart from becoming a professional ML Researcher"

"Prompt and Capabilities Research: Riley Goodside’s career blew up in 2022, going from being a data scientist at Grindr to becoming the world’s first Staff Prompt Engineer by tweeting GPT-3 tricks until finding and popularizing “prompt injection” as a major LLM security concern. Many others have since caught on that finding interesting use cases for GPT-3 and -4 does well on social media, which is great, and a layperson corollary of how academics do formal capabilities research."

"Software engineering: Whisper.cpp and LLaMA.cpp have recently opened many people's eyes to the future of running large models on-device, but I was surprised to listen to Georgi Gerganov’s interview on the Changelog and learn that he was a self-described “non-AI-believer” in September 2022 and merely ported Whisper to C++ for fun...Harrison Chase’s Langchain has captured a ton of mindshare by building the first developer-friendly framework for prompt engineering, blending both prompt and software improvements to pretrained LLM models. A raft of LLM tooling from Guardrails to Nat.dev has also helped bridge the gap for these models from paper to production."

"Productization: Speaking of Stable Diffusion, Emad Mostaque was a hedge fund manager right up to 2019, who didn’t seem to have any prior AI expertise beyond working on “literature review of autism and biomolecular pathway analysis of neurotransmitters” for his son. But his participation in the EleutherAI community in 2020 led him to realize that something like Stable Diffusion was possible, find Patrick and Robin of the CompVis group at Heidelberg University, and put up the ~$600k it took to train and deliver the (second) most important AI release of 2022. Nobody wants to cross-examine who did what, but it makes sense that a former hedge fund manager would add a lot of value by spotting opportunities and applying financial (and organizational) leverage to ideas whose time had come. More broadly, Nat Friedman has been vocal about the capability overhang from years of research not being implemented in enough startups, and it seems that founders willing to jump on the train early, like Dave Rogenmoser taking Jasper from 0 to $75m ARR in 2 years, will reap disproportionate rewards."

"...the way that both incumbents and startups across every vertical and market segment are embracing AI is showing us that the future is “AI-infused everything” - therefore understanding foundation models will more likely be a means to an end (making use of them) rather than an end in itself (training them, or philosophizing about safety and sentience). Perhaps it might be better to think of yourself and your potential future direction less like “pivoting INTO AI”, and rather “learning how to make use of it” in domains you’re already interested or proficient in."

"I’ve done the fast.ai course content, but also am following my curated Twitter list, and adding notes to my public GitHub AI repo and to the Latent Space Discord. Important new papers get read the week they come out, and I try to run through or read the code of highly upvoted projects and products. We’re also about to release “Fundamentals 101” episodes on the podcast where we cover AI basics, which has forced me to read the papers and understand the history of some of the things we take for granted today"

"The introduction of Dolly suggests that LLMs and other generative AI models are also “skilled,” and could be divided along the same crude lines that economists use to divide the labor force. Dolly is low-skill. You might not trust it to send an important email to your boss, but you’d probably be fine with it making a restaurant reservation for you. GPT-4 is high-skill—it might actually be good enough for that email to your boss. And companies will surely develop specialized models that extend GPT-4 (or GPT-5, or GPT-6S Plus) with specific training data, and are particularly good at creative writing, or molecular chemistry, or negotiating for higher salaries."

"This dynamic could appear everywhere. There could be expensive AI therapists that keep years of dialogue in their conversational memory, and there could be cheap ones that start from scratch every session. There could be good AP U.S. History tutors (and test-takers), and bad ones that hallucinate facts about American history. There could be AI lawyers that specialize in obscure corporate tax law—for the right price. People with money may be able to train models on lookalike groups of patients and get personalized medical care, while people without it may have to rely on ChatWebMD."

"The same could be true in the corporate world as well. Companies don’t pay a premium for McKinsey because of the quality of their consultants, but because McKinsey promises to train a dedicated model for all of their clients. Google builds a better writing assistant than anyone else because it’s trained on private data inside of Google Docs, which only Google has access to. HBO stops investing in the best writers, and starts investing in creative LLMs. Nike’s ads are developed by a team of experts and a fork of Midjourney that’s tuned to create inspiring and emotionally arresting imagery; Pepsi storyboards their ads with Microsoft Tay."

"Still, it’s also possible the ceiling for some of these tools is so high that there’s still a huge gap between what’s cheap and what’s expensive—it’s just that the expensive ones are capable of things we can barely fathom today. Or, even if the differences between models are small, the advantages they provide, like being able to ship code faster, could compound quickly. Employing a marginally better AI “workforce” might not make a discernible difference in how a company operates from day to day, but those benefits could accrue into huge gaps over time."

"...it feels like a lot of the assumptions we make today about how society works, including the basic underpinnings of how jobs work, are about to get upended.

Here’s to hoping that it doesn’t happen faster than we can handle—unless, of course, at the pace we’re going, it already happened four hours ago."