Spec driven development with Agents

Because of a lack of time and resources, I've been using AI agents for my personal projects. In professional projects, I'm also open to using LLMs for tasks that I don't like doing myself and whose complexity is not a problem.

Before, I was skeptical about using AI models for professional tasks, but since strong models such as Opus (version 4.5 and later) became available, I think: hmm, now AI can really work together with me.

In terms of experience, I use LangChain and OpenAI in a few projects, and I work with models such as Opus, Sonnet, Codex, ... and nowadays also MiniMax, Kimi, Big Pickle, ...

A problem with AI models is that they are too easy to use (even for non-technical people), which causes accuracy issues when reviewers cannot tell whether the generated content is slop or not.

Besides, managing contexts and memories is a challenge for people using AI models, because contexts and memories determine how much you pay (how many tokens you use) to do a task. Modern LLMs can hold large contexts and handle complex tasks. But if you don't provide enough context, the model often loses track and tries to collect some or all of the information again, even if it did the same work two minutes ago.

I see a lot of people using AI models for simple tasks without thinking about minimizing the related contexts and memories. That wastes money and time, and it also hurts the accuracy of the output. In this post, I'll explain how I use AI agents for professional tasks, and how I organize contexts and memories to avoid rolling contexts and losing contexts, so that I get the expected accuracy and quality at minimum cost.
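To make the cost side concrete, here is a minimal sketch that compares the rough token footprint of two sets of files. It assumes the common rule of thumb of about four characters per token, which is only an approximation because real tokenizers differ per model. The function names and the price parameter are hypothetical and not part of any tool mentioned in this post.

```python
from pathlib import Path

# Assumption: ~4 characters per token is a rough rule of thumb, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text (rounded up)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def estimate_dir_tokens(root: Path, pattern: str) -> int:
    """Sum the rough token estimates of all files under root that match pattern."""
    return sum(
        estimate_tokens(p.read_text(encoding="utf-8", errors="ignore"))
        for p in root.rglob(pattern)
        if p.is_file()
    )


def estimate_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token count into an input cost; the price is a value you supply."""
    return tokens / 1_000_000 * usd_per_million_tokens
```

Comparing, say, estimate_dir_tokens(Path("docs"), "*.md") with estimate_dir_tokens(Path("src"), "*.py") (both paths are placeholders) gives a feel for how much an agent reads when it works from documents versus from the whole source tree.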

Understand spec-driven-development

Spec-driven development is a software development methodology based on the idea that software should be built to meet the requirements of the product owner or user.

In the agentic AI era, this methodology suits a new development style because it lets developers focus on the user's requirements rather than the implementation details. Specs become the source of truth, and the development process with agents becomes more efficient and less error-prone.

A spec is a structured, behavior-oriented artifact - or a set of related artifacts - written in natural language that expresses software functionality and serves as guidance to AI coding agents. Each variant of spec-driven development defines its own approach to a spec’s structure, level of detail, and how these artifacts are organized within a project.

I think there is a useful distinction to be made between specs and the more general context documents for a codebase. General context covers things like rules files or high-level descriptions of the product and the codebase. Some tools call this context a memory bank, so that’s the term I will use here. These files are relevant across all AI coding sessions in the codebase, whereas specs are only relevant to the tasks that actually create or change that particular functionality.
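The distinction can be sketched as a small selection rule: memory bank files go into every session, while specs are added only for the task that touches them. This is an illustrative sketch, not how any particular tool works; the Task class, the MEMORY_BANK list, and the file paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A unit of work that names the specs it creates or changes."""
    task_id: str
    spec_paths: list[str] = field(default_factory=list)


# Hypothetical memory bank: files relevant to every session in the codebase.
MEMORY_BANK = ["AGENTS.md", "docs/architecture.md", "docs/database.md"]


def context_for(task: Task) -> list[str]:
    """Memory bank files always come first; specs are added only for this task."""
    seen: set[str] = set()
    ordered: list[str] = []
    for path in MEMORY_BANK + task.spec_paths:
        if path not in seen:  # skip duplicates while keeping the order
            seen.add(path)
            ordered.append(path)
    return ordered
```

A task without linked specs gets only the memory bank, which keeps the context for small jobs small.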

You can read more about spec-driven development here.

My implementation of spec-driven development using Agents

Keep general contexts as the high-level reference for agents; they are the memory bank. For this purpose I use architecture design documents, database design documents, and a rules file or agents.md (a file that contains all the rules of the project). Because they are high level, they are not specific to a particular task but relevant to all of them, and every task should follow the project's rules and architecture. Once the architecture is defined and reviewed, it rarely changes, so it is what agents should build on, remember, and follow from the start.
All requirements and specifications need to follow the general contexts so that they don't break the rules, goals, and architecture. If a spec or requirement shows that the architecture needs to change, update the general contexts (the related documents) right away, and then update the spec.
Specs should have a boundary so that they are not too broad. I always split specs into smaller pieces and organize them by the modules or features they relate to.
From the specs, following our development approach, I create plans and tasks. Plans are the high-level goals for each milestone or sprint, and tasks are the specific work items for each plan. Since a side project doesn't have many teams or members, I always plan by module or feature. That makes it easy to see the progress of the project and what the next step is, instead of looking at the whole project while every feature and module changes slowly. It also reduces dependencies between tasks.
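Planning by module is what makes progress easy to read off. As a minimal sketch, assuming each task records the module it belongs to and whether it is done (the TaskRecord class and its fields are hypothetical), progress per module could be computed like this:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Hypothetical task entry: an ID, the module it belongs to, and its status."""
    task_id: str
    module: str
    done: bool = False


def progress_by_module(tasks: list[TaskRecord]) -> dict[str, tuple[int, int]]:
    """Return {module: (done_count, total_count)} for the given tasks."""
    counts: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for task in tasks:
        counts[task.module][1] += 1      # total tasks in this module
        if task.done:
            counts[task.module][0] += 1  # finished tasks in this module
    return {module: (done, total) for module, (done, total) in counts.items()}
```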

While doing tasks, agents can follow the general contexts, which are always in their memory bank, so they don't need to roll contexts again. If they ever forget the contexts, reading the general context documents is much easier than trying to rebuild them from the source code, which costs tokens and gives lower accuracy.

For testing, debugging, and reviewing, we (humans and agents) can follow the module and feature designs because, in each task, the todos are linked to the feature and module specs, which are our source of truth.
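Because the links from tasks to specs carry so much weight, it can help to check them automatically. The sketch below assumes a convention that is not from this post: each task file names its specs on lines of the form "Spec: docs/features/some-spec.md". Under that assumption, it lists the linked specs and reports the ones that do not exist.

```python
import re
from pathlib import Path

# Assumption: a task file references its specs on lines like "Spec: <path>".
SPEC_LINE = re.compile(r"^Spec:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def linked_specs(task_text: str) -> list[str]:
    """Return the spec paths referenced in a task file, in order of appearance."""
    return SPEC_LINE.findall(task_text)


def missing_specs(task_text: str, repo_root: Path) -> list[str]:
    """Return the referenced spec paths that are not files under repo_root."""
    return [s for s in linked_specs(task_text) if not (repo_root / s).is_file()]
```

A task for which linked_specs returns an empty list has no source of truth to review against, which is worth flagging as well.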

This approach also works for microservices, but each service should have its own full set of contexts and memories. Even in a modular system, if a module is big enough, we can do the same thing. We then just need one overall context file to connect things together.

Example of organizing documents / specs

docs
├── architecture.md
├── database.md
├── features
│   ├── design-core-processing.md
│   └── ...
├── modules
│   ├── design-shared-components.md
│   └── ...
├── screens
│   ├── auth
│   │   ├── screen-login.md
│   │   ├── login.svg
│   │   └── forgot-password.svg
│   ├── item-management
│   │   ├── screen-item-management.md
│   │   ├── item-list.svg
│   │   ├── item-add.svg
│   │   ├── item-edit.svg
│   │   └── item-detail.svg
│   └── dashboard
│       ├── screen-dashboard-main.md
│       └── dashboard-charts.svg
├── plans
│   ├── README.md
│   ├── sprint-1-foundation-setup.md
│   ├── sprint-2-core-logic-implementation.md
│   ├── sprint-3-performance-tuning.md
│   └── sprint-4-dashboard-crud-ui.md
└── tasks
    ├── S1-001-input-validation.md
    ├── S1-002-localization-setup.md
    ├── S1-003-data-sanitization.md
    ├── S1-004-error-handling-config.md
    ├── S2-001-core-service-refactor.md
    ├── S2-002-api-request-optimization.md
    ├── S2-003-state-management-update.md
    ├── S2-004-cache-ttl-configuration.md
    ├── S2-005-global-exception-handler.md
    └── S2-006-database-schema-migration.md
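The task file names in this tree appear to follow the pattern S<sprint>-<sequence>-<slug>.md. Assuming that pattern holds (it is inferred from the example, not a rule of any tool), a short sketch can group task files by sprint so that each one lines up with its plan:

```python
import re
from collections import defaultdict

# Assumed naming pattern: S<sprint>-<sequence>-<slug>.md, e.g. S1-001-input-validation.md
TASK_NAME = re.compile(r"^S(\d+)-(\d+)-(.+)\.md$")


def group_by_sprint(filenames: list[str]) -> dict[int, list[str]]:
    """Group task file names by sprint number; names that don't match are skipped."""
    groups: dict[int, list[str]] = defaultdict(list)
    for name in sorted(filenames):
        match = TASK_NAME.match(name)
        if match:
            groups[int(match.group(1))].append(name)
    return dict(groups)
```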

Notes

For UI development, we can split the specs into a separate set of documents covering all components. I usually do so.
For diagrams in the documents, I usually use Mermaid; I find that agents can read it quite well.
If you work with multiple agents, consider using each agent for a specific kind of task; for simple tasks, you might not want to burn tokens on Opus.
Again, opencode is my choice of AI coding agent tool. It has a nice TUI, is easy to use with nice commands, offers multiple models, and prepaid billing with Zen suits me well.
This is only my implementation. It works for me, but it might not work for you. If you have any questions or recommendations, please feel free to let me know. Thanks!
