My learnings from Addy Osmani's article on how good AI is at coding React
These are my notes from the article "How Good Is AI at Coding React (Really)?" by Addy Osmani.
There's a lot of information packed into this article.
Addy says that AI is a force multiplier: "It amplifies everything: good requirements, good architecture, good taste".
AI is most useful for scenarios such as building isolated components, scaffolding, and implementing explicit requirements. It's less useful for scenarios such as multi-step integration, design taste, and complex state management.
We can generalize this: the higher the complexity, the less useful (productive) the LLM is. I made this same point in a presentation I gave to our tech leaders in October.
In fact, Addy says this explicitly later in the article: "If you remember nothing else from this article, remember this: AI handles simple tasks well and then falls off a cliff as complexity rises."
Some benchmarks hide this drop-off because they are limited: they use oversimplified problems, cover limited task complexity, and include solutions the models have seen before. These benchmarks/evals give a false impression of the models' abilities.
Other evals and benchmarks address these shortcomings and provide a more realistic picture of model performance:
- Next.js evals is one such benchmark; it shows a success rate of approximately 42% on its tests (https://nextjs.org/evals)
- SWE-Bench Pro, where models score a success rate of 43% or lower on tasks (https://scale.com/leaderboard/swe_bench_pro_public)
React apps are not just code; they're products with user experiences, which also include reliability, security, performance, and accessibility. React devs should pay attention to user experience quality, code quality, aesthetics, and more.
Addy references these eval sites:
- Design Arena (https://www.designarena.ai/), where users rank AI-generated content
- WebDev Arena (https://lmarena.ai/leaderboard/webdev), which shows how well models generate websites from prompts and follow-ups