Disclaimer: This could be absolute BS because I don’t know the depth of this stuff. If you’re looking for a story or some narration, welcome; but if you seek technical details, I suggest you look someplace else. Also, if you’re all set to read anyway, you’re advised to take everything with a grain of salt and not take any of the information for granted.

Bit of a backstory: the past year might have been full of AI models left, right, and center, but this year somehow feels different. Is GPT-4 (Generative Pre-trained Transformer) the limit? Are other models even that good? What’s the deal with Gemini, and with big tech handcuffing people to work for them for such high rewards but with risks nobody seems to acknowledge? And, biggest of them all, is AGI (Artificial General Intelligence) even achievable at this point, or has it been, and will it remain, a mirage all the way ahead?

My timeline of AI events (very rough)

ChatGPT took the world by storm in early 2023, sparking and paving the way for more of these large language models to be fine-tuned to do one thing or the other. Several competitors rose in conjunction with OpenAI’s remarkable models, yet none seemed to be doing that good of a job. Eventually, besides text, multimodal (text + audio + video) models were soon being seen in the market, just after standalone image-generation and video models peaked. Big tech companies went all in on the “AI” hype, incorporating it into each and every last bit of their products that could possibly need it. Why? Because AI was the cool new kid everyone wanted to interact with. These huge “transformers”, or so-called “models”, were under the hood nothing but a huge sea of neural nets, which predicted, in a sense, the next word of their response that made the most sense, humanly. Hence, one could infer the huge compute needed for these things to work. As a consequence, as models kept moving, compute needed to move just as fast, with GPUs becoming the key to exactly that. NVIDIA got the job done, selling shovels while everyone was out digging gold. Fast-forward to late 2023: we got major updates to the existing models, and a few new models finally entered production after months of development. Multimodal models were a reality now, with exceptional capabilities, seemingly stepping towards singularity.
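
To make that “predict the next word” idea a little more concrete, here’s a toy sketch of the autoregressive loop. Everything in it (the NEXT_WORD_PROBS table, the generate function) is made up purely for illustration; a real transformer replaces the lookup table with billions of learned parameters scoring tens of thousands of tokens, conditioned on the whole conversation so far:

```python
import random

# Hypothetical next-word probabilities; a stand-in for a trained model's output.
NEXT_WORD_PROBS = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.6, "sat": 0.4},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def generate(prompt: str, max_words: int = 5) -> str:
    """Repeatedly pick a next word given the last one: the autoregressive loop."""
    words = prompt.split()
    for _ in range(max_words):
        options = NEXT_WORD_PROBS.get(words[-1])
        if not options:
            break
        # Sample the next word in proportion to its probability; an LLM does this
        # same step, but over a full vocabulary and conditioned on all prior context.
        words.append(random.choices(list(options), weights=list(options.values()))[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the cat sat down"
```

Scale that one loop up to a vocabulary of tens of thousands of tokens and a context of thousands of words, and you start to see where the enormous compute bill comes from.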

Now, with all this software and stuff, did hardware stay the same? Glad you asked, because no, hardware also had its rightful share of the AI clout. Multiple new-age tech products were announced, be it cutting-edge GPUs, nearly-human robots, or highly configurable handheld AI devices; all of which would pave the way for a very hyper-realistic future. Once they become accessible to the general public, it will surely feel like a scene from a sci-fi movie, to say the least. Even automobiles had their own share, with interiors that configure themselves as you like and driving that switches from manual to all-out auto. The whole techware landscape changed, but only at the large scale. AI startups, extremely priced new AI products, premium AI chatbots, home assistants, what not; but only accessible to those at the top.

Job markets continued to recede, with layoffs knocking a huge chunk of the workforce off their desks, and new grads finding it difficult to land their first web-dev job. For sure, everyone was scared and on the edge of their seats, hoping every day that they’d wake up to see another day at the office. Ironically, COVID was a factor back then too, and remote jobs are still a consequence of it. Some people believed that the next generation of software engineers would just be English majors, prompting their way in and out based on the product managers’ demands, which in turn were just a reflection of what the customers actually wanted. No human overhead in between! Not to over-exaggerate things, but that sure might happen, and next thing we know, a batteries-included “prompting” framework governs the whole SDLC (Software Development Life Cycle) while the tech lead leads his way out the front door. But, but, but: some people beg to differ. It’s only once you’re proficient enough that you can honestly question your own skills and abilities. Code, as we know it at present, needs a human to get human needs incorporated into it, and an LLM can only do so much for your billion-dollar product. Still, even these people will ultimately admit that using a code assistant such as GitHub Copilot gets work done much faster and easier. Also, companies paying people their costly cut would just prefer to turn to a lifeless code monkey, one that doesn’t emotionally understand the semantics of a pay raise and doesn’t have “human” inadequacies to it; given it gets the job done, which it somewhat does.

In case you’re wondering: huh, that’s not what a timeline is? :/

Well yeah, idk, I just wrote what came to mind.

The current state and thoughts

Fast-forwarding to the current year, the AI hype that shook the world seems to have calmed down for the moment, with model progression going at a pace where it just seems like it’s plateauing or something. AI is not that big a deal now, because everything around us has conformed to the latest AI trend in the market; and it will be like this for a while now, I suppose. It’s like when computers started becoming a thing: everyone was kinda skeptical at first, but eventually they became a part of our lives. To no surprise, AI is already halfway down that road. Everyone’s living/coping with the fact that AGI might just take over the world someday, if we ever get there, which we will eventually; it’s just a matter of time. I’ll be completely honest: being a person who’s oblivious to most of the things under the hood, I thought AGI would hit very, very soon. But it is not that simple, which I realized almost as soon as 2024 began. You see, no big model updates, no major breakthroughs, and no more progress on AGI; just that plain old ChatGPT with the latest model and some cool, better or at-par rip-offs. Could be the fact that I expected much more drama coming into 2024, which certainly isn’t there yet.

Headline of the TLDR newsletter

And when I see a headline like the one above at the top of my Monday TLDR newsletter, it just makes me wonder:

If we need more compute than there is just to make the already existing models better, shouldn’t we question whether this is actually the right way to build such models? Aren’t we doing something wrong here? It’s about time a new solution to this problem prevailed over the existing ones.

Now, I’m well aware that better models need better and more data to be better at what they do, but what I’m trying to imply is: shouldn’t we be aware of the limit to all of this, as is the case with anything/everything else?

All we seem to hope for is to put in more compute and trust that someday the math will check itself out.