Today, the vast majority of tech companies run their generative AI models solely in the cloud. But as cloud costs climb and privacy challenges mount, many companies are looking at building their own AI models or running these workloads on-device. A recent report from Mark Gurman at Bloomberg reveals that Apple, for example, is building its own generative AI tools.
We're already seeing continued processing and memory performance advancements in mobile chips, which allow models like Stable Diffusion to run on a high-end smartphone with decent performance. Meanwhile, companies are also working to reduce the number of parameters a generative AI model needs to produce accurate results. Smaller models in the range of 1 to 10 billion parameters, such as Llama 2-7B, are closing the gap with much larger models like GPT-4 and will continue to improve.
This points to a future in which generative AI models with 10 billion parameters or more run on mobile devices, meaning most of us will soon have a small supercomputer in our pockets. And thanks to a recent report from Stephanie Palazzolo and Jessica Lessin at The Information, it looks like that future might be closer than we think.