Putting AI to the Test: Insights from Our Experiments

4.7.25

Niket Ashesh & Himanshu Jangra

Building from the Ground Up: Our API-First Approach

Before user-friendly tools come into the picture, there’s always the foundation—APIs. These are the backbone of modern technology, quietly powering most of what we rely on today. So, it made sense for our journey to begin here, diving into APIs and experimenting from the ground up.

At first, we worked directly with APIs—powerful but not the most intuitive. While they provided raw access to AI capabilities, the process was clunky and time-consuming. That’s when we saw an opportunity: what if we built a simple application to streamline experimentation with different AI models and providers?

So, we did just that—creating a tool that made testing faster, easier, and more insightful.

Our Tech Stack & Approach

  • ShadCN – For creating a clean and user-friendly UI.
  • Vercel’s AI SDK – To simplify API and UI integration.
  • LangChain – To streamline interactions with different AI models.
  • Anthropic and OpenAI – To compare hosted AI capabilities.
  • Ollama – For cost-efficient local deployment of open-source models.

By combining these tools, we created a platform that lets us test and learn from various AI providers without unnecessary friction. This setup gave us the flexibility to explore while staying mindful of efficiency and cost.
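As a rough sketch of how these pieces can fit together, here is a minimal provider registry in TypeScript. The model names, endpoints, and the `pickProvider` helper are illustrative placeholders, not our exact code:

```typescript
// Minimal sketch of a provider registry: each model a user can pick in the
// UI maps to a provider and a base URL. Names here are illustrative.
type Provider = "openai" | "anthropic" | "ollama";

interface ModelEntry {
  provider: Provider;
  baseUrl: string; // where requests for this model are sent
}

const registry: Record<string, ModelEntry> = {
  "gpt-4o": { provider: "openai", baseUrl: "https://api.openai.com/v1" },
  "claude-3-5-sonnet": { provider: "anthropic", baseUrl: "https://api.anthropic.com/v1" },
  "llama3": { provider: "ollama", baseUrl: "http://localhost:11434" },
};

// Resolve a user's model choice to its provider entry, or fail loudly.
function pickProvider(model: string): ModelEntry {
  const entry = registry[model];
  if (!entry) throw new Error(`Unknown model: ${model}`);
  return entry;
}
```

Keeping this mapping in one place is what lets the rest of the app treat "which model" as just a string, which pays off later when switching models mid-conversation.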

Building a Smarter AI Testing Ground

To truly understand AI models' strengths and weaknesses, we needed a flexible testing ground. So, we built a Chat UI that goes beyond basic conversation—offering hands-on, real-time model comparison. Here’s how:

Harnessing Open-Source Models with Ollama

We integrated Ollama to run open-source large language models locally. This gave us the flexibility to tap into a variety of models rather than being locked into just one.
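In practice, talking to a local Ollama instance is a plain HTTP call to its REST API. A minimal sketch (the `"llama3"` model name is a placeholder for whichever model you have pulled locally):

```typescript
// Sketch of calling a local Ollama server over its REST API.
// Ollama listens on http://localhost:11434 by default.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the JSON body Ollama's /api/chat endpoint expects.
function buildOllamaRequest(model: string, messages: ChatMessage[]) {
  return { model, messages, stream: false };
}

// Send the request (requires a running Ollama instance with the model pulled).
async function chatWithOllama(model: string, messages: ChatMessage[]) {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildOllamaRequest(model, messages)),
  });
  const data = await res.json();
  return data.message?.content as string; // the assistant's reply
}
```

Because everything runs on localhost, no conversation data ever leaves the machine, which is a large part of Ollama's appeal for privacy-sensitive experimentation.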

Letting Users Choose Their Model

Instead of a one-size-fits-all approach, we let users pick the model they wanted to interact with. The goal? To help our team members find the model that best suited their specific use cases.
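The selection logic itself can stay very small. A hedged sketch, with placeholder model names and a fallback default so a bad choice never breaks the chat:

```typescript
// Sketch of per-user model selection with a safe fallback.
// Model names are placeholders; in the real app the list would come
// from the provider registry that backs the UI dropdown.
const AVAILABLE_MODELS = ["llama3", "mistral", "gpt-4o", "claude-3-5-sonnet"];
const DEFAULT_MODEL = "llama3";

// Honor the user's choice if it's valid; otherwise fall back to the
// default rather than erroring out mid-session.
function selectModel(requested?: string): string {
  if (requested && AVAILABLE_MODELS.includes(requested)) return requested;
  return DEFAULT_MODEL;
}
```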

Switching Models Mid-Conversation

We even made it possible to change the model during a conversation. Why? To observe how switching models could influence the output and better understand their strengths and weaknesses in real-time.
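The trick that makes mid-conversation switching cheap is keeping the chat history in the app rather than in any one provider. Switching models then just means addressing the same accumulated messages to a different model on the next turn. A simplified sketch (types and helper names are illustrative):

```typescript
// Sketch of mid-conversation model switching: the UI owns the history,
// so changing models only changes where the next request is sent.
interface Turn {
  role: "user" | "assistant";
  content: string;
}

interface ChatSession {
  model: string;
  history: Turn[];
}

// Switch models while preserving the full conversation so far.
function switchModel(session: ChatSession, newModel: string): ChatSession {
  return { ...session, model: newModel };
}

// Build the next request: full history plus the new user turn,
// addressed to whichever model is currently selected.
function nextRequest(session: ChatSession, userMessage: string) {
  return {
    model: session.model,
    messages: [...session.history, { role: "user", content: userMessage }],
  };
}
```

Because both models see the identical history, any difference in the replies is attributable to the model itself, which is exactly what makes side-by-side comparison meaningful.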

By building these features, we created a tool that’s not just functional but also adaptable and insightful—perfect for experimenting with what works best for different needs.

Key Takeaways from Our AI Experiment

  • Specialized AI models excel at their niche tasks – If you need AI for a focused task (e.g., coding or image generation), models built specifically for that purpose outperform general-purpose models.
  • General-purpose AI thrives in flexible environments – Tasks like business analysis and strategic planning benefit from models that prioritize adaptability over specialization.
  • The right AI for the job matters – Think of it like hiring a specialist. If you want the best results, use a model designed for your specific need—focused expertise leads to better outcomes.
  • Bigger isn’t always better – While larger AI models are powerful, they demand significant computing resources. Even our high-end machines struggled to keep up.
  • Apple’s chips pack a punch – Nvidia’s GPUs are the go-to for AI, but we found Apple’s chips perform surprisingly well when running smaller AI models.
  • Open-source AI is a strong alternative – Especially when data privacy is a concern, open-source models provide more control and security.

Overall, each type of model has its strengths, but the key is knowing which one fits your specific needs.

👉 Continue reading to see how we moved from raw APIs to intelligent AI agents, and what that shift enabled us to build next.

What would you want to test out with a setup like this?

Next in This Series

Follow along as we document our AI experiments and insights.


🔹 AI Experiment Part 2

Start of an Agentic AI Journey: Exploring how AI can go beyond prompts to act autonomously.

🔹 AI Experiment Part 3

Automation and AI: Merging AI with automation tools to streamline workflows.

🔹 AI Experiment Part 4

Alpha's Vision for AI in Action: Envisioning how AI can streamline the software development lifecycle from idea to execution.

🔹 AI Experiment Part 5

AI-Powered Automation with n8n: We're experimenting with n8n to automate routine workflows and connect AI agents across tools and teams.


Missed the introduction? Read it here.