Hi,
I am NOT a coder, but I am a systems and networking guy with a lot of experience in architecture.
My initial interest is in expanding the areas of memory management and token management, but overall architecture questions are what occupy me most.
Given this, I worked the following up with ChatGPT, and I'd like to hear what others think. I see a lot of talk about features to add, but very little about architecture and a real roadmap. Have a look:
Architecture Summary for Memory and Token Management Framework
Goals
- Comprehensive Memory Management:
  - Store and retrieve chat histories, datasets, and metadata.
  - Support multiple storage backends, including databases (e.g., PostgreSQL) and vector databases (e.g., Pinecone, Weaviate).
  - Enable summarization, context generation, and seamless integration with chat flows.
- Efficient Token Management:
  - Manage tokenization, detokenization, and token limits dynamically.
  - Provide compatibility across multiple language models with configurable token strategies.
- Dynamic Dataset Handling:
  - Allow loading, querying, and semantic searching of structured datasets.
  - Enable integration with retrieval-augmented generation (RAG) pipelines.
- Agent Framework:
  - Support modular agents, including Swarm agents, MCP-compatible agents, and task-specific agents.
  - Facilitate multi-agent collaboration and communication.
- Extensibility:
  - Provide a plugin system for memory, token management, and dataset backends.
  - Allow external tools and APIs (e.g., LangChain, FAISS) to integrate seamlessly.
- Python Integration:
  - Leverage Python for advanced memory, RAG, and tokenization tasks.
  - Enable communication between Python microservices and the TypeScript app.
Architecture Components
1. Memory Management
- Core Interface:
  - Abstracts storage, retrieval, summarization, and key management (a minimal sketch implementation follows at the end of this section).
```ts
export interface MemoryManager {
  save(key: string, value: any): Promise<void>;
  load(key: string): Promise<any | null>;
  summarize(context: string[]): Promise<string>;
  delete(key: string): Promise<void>;
  listKeys(prefix?: string): Promise<string[]>;
}
```
- Backends:
  - Relational Databases (e.g., PostgreSQL, MySQL): store chat histories and metadata.
  - Vector Databases (e.g., Pinecone, Weaviate): handle embeddings and semantic searches.
  - Hybrid Memory: combine multiple backends for versatility.
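As a concrete (hypothetical) example of a backend, a minimal in-memory implementation of the MemoryManager interface might look like this; the summarize() body is only a placeholder, since real summarization would call a model or a Python service:

```ts
// Minimal in-memory sketch of a MemoryManager backend (illustration only).
export class InMemoryMemoryManager implements MemoryManager {
  private store = new Map<string, any>();

  async save(key: string, value: any): Promise<void> {
    this.store.set(key, value);
  }

  async load(key: string): Promise<any | null> {
    return this.store.has(key) ? this.store.get(key) : null;
  }

  async summarize(context: string[]): Promise<string> {
    // Placeholder: join and truncate; a real backend would call an LLM
    // or a Python summarization service.
    return context.join(" ").slice(0, 500);
  }

  async delete(key: string): Promise<void> {
    this.store.delete(key);
  }

  async listKeys(prefix?: string): Promise<string[]> {
    const keys = Array.from(this.store.keys());
    return prefix ? keys.filter((k) => k.startsWith(prefix)) : keys;
  }
}
```

A database or vector backend would implement the same interface, which is what lets the plugin system (section 5) swap them freely.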
2. Token Management
- Core Interface:
  - Handles token operations like counting, splitting, and enforcing token limits (a sketch follows at the end of this section).
```ts
export interface TokenManager {
  tokenize(input: string): string[];
  detokenize(tokens: string[]): string;
  countTokens(input: string): number;
  isWithinLimit(input: string): boolean;
}
```
- Backends:
  - OpenAI Token Manager: token handling for OpenAI models.
  - Custom Tokenizers: for Hugging Face, Anthropic, or other APIs.
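To show the shape of a backend, here is a crude sketch that approximates tokens by whitespace splitting; a real OpenAI Token Manager would wrap a proper tokenizer library instead, and the 4096 limit is just an assumed default:

```ts
// Crude TokenManager sketch: whitespace splitting only approximates real
// model tokenization. The 4096 default limit is an assumed example value.
export class WhitespaceTokenManager implements TokenManager {
  constructor(private maxTokens: number = 4096) {}

  tokenize(input: string): string[] {
    return input.split(/\s+/).filter((t) => t.length > 0);
  }

  detokenize(tokens: string[]): string {
    return tokens.join(" ");
  }

  countTokens(input: string): number {
    return this.tokenize(input).length;
  }

  isWithinLimit(input: string): boolean {
    return this.countTokens(input) <= this.maxTokens;
  }
}
```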
3. Dataset Management
- Core Interface:
  - Supports dataset loading, listing, and querying (a toy sketch follows at the end of this section).
```ts
export interface DatasetManager {
  listDatasets(): Promise<string[]>;
  loadDataset(datasetName: string): Promise<any[]>;
  queryDataset(datasetName: string, query: string): Promise<any[]>;
}
```
- Backends:
  - Relational Databases: handle tabular data storage and querying.
  - Python Integration: perform semantic searches via LangChain.
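For illustration only, a toy in-memory backend with naive substring matching could look like this; a relational or Python-backed implementation would replace queryDataset() with SQL or a semantic search call:

```ts
// Toy DatasetManager sketch: queryDataset() does naive substring matching
// over JSON-serialized rows. Real backends would use SQL or semantic search.
export class InMemoryDatasetManager implements DatasetManager {
  constructor(private datasets: Map<string, any[]> = new Map()) {}

  async listDatasets(): Promise<string[]> {
    return Array.from(this.datasets.keys());
  }

  async loadDataset(datasetName: string): Promise<any[]> {
    return this.datasets.get(datasetName) ?? [];
  }

  async queryDataset(datasetName: string, query: string): Promise<any[]> {
    const rows = await this.loadDataset(datasetName);
    const q = query.toLowerCase();
    return rows.filter((row) => JSON.stringify(row).toLowerCase().includes(q));
  }
}
```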
4. Agent Framework
- Core Interface:
  - Defines agents that respond to inputs and perform tasks (a sketch of the simplest type follows at the end of this section).
```ts
export interface Agent {
  id: string;
  name: string;
  description: string;
  respond(input: string): Promise<string>;
  performAction(action: string, data: any): Promise<any>;
}
```
- Agent Types:
  - Simple Agent: linear, task-based responses.
  - Swarm Agent: multi-agent collaboration for complex tasks.
  - MCP Agent: implements Anthropic's Model Context Protocol for advanced communication.
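As a sketch of the simplest agent type, a Simple Agent could delegate respond() to an injected handler function and performAction() to a lookup table of actions (both hypothetical illustrations):

```ts
// Sketch of a Simple Agent: linear responses via an injected handler,
// actions via a lookup table. Handler and action map are illustrative.
export class SimpleAgent implements Agent {
  constructor(
    public id: string,
    public name: string,
    public description: string,
    private handler: (input: string) => Promise<string>,
    private actions: Record<string, (data: any) => Promise<any>> = {}
  ) {}

  respond(input: string): Promise<string> {
    return this.handler(input);
  }

  async performAction(action: string, data: any): Promise<any> {
    const fn = this.actions[action];
    if (!fn) throw new Error(`Unknown action: ${action}`);
    return fn(data);
  }
}
```

Swarm and MCP agents would implement the same interface, so orchestration code never needs to know which agent type it is talking to.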
5. Plugin System
- Core Design:
  - Centralized plugin registration for memory, token, and dataset backends.
```ts
export class PluginManager<T> {
  private plugins: Map<string, T> = new Map();

  register(name: string, plugin: T): void {
    this.plugins.set(name, plugin);
  }

  get(name: string): T | undefined {
    return this.plugins.get(name);
  }

  list(): string[] {
    return Array.from(this.plugins.keys());
  }
}
```
- Use Cases:
  - Add custom memory or token managers dynamically.
  - Register dataset backends or RAG pipelines.
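Wiring it together might look like this (the registration names and the sketch classes from earlier sections are illustrative):

```ts
// One PluginManager per extension point; backends are resolved by name,
// e.g., from configuration. Names and classes here are illustrative.
const memoryPlugins = new PluginManager<MemoryManager>();
memoryPlugins.register("in-memory", new InMemoryMemoryManager());

const tokenPlugins = new PluginManager<TokenManager>();
tokenPlugins.register("whitespace", new WhitespaceTokenManager());

async function demo(): Promise<void> {
  const memory = memoryPlugins.get("in-memory");
  if (memory) {
    await memory.save("chat:123", { role: "user", content: "hello" });
    console.log(await memory.listKeys("chat:")); // ["chat:123"]
  }
  console.log(memoryPlugins.list()); // ["in-memory"]
}
```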
6. Python Integration
- Microservices:
  - Run Python services using FastAPI for:
    - RAG pipelines (e.g., LangChain).
    - Tokenization and summarization.
    - Dataset querying and semantic search.
- TypeScript Interfacing:
  - Communicate with Python services over HTTP APIs:
```ts
async function queryPythonService(endpoint: string, payload: any): Promise<any> {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return await response.json();
}
```
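For example, a RAG lookup from the TypeScript side might look like the following; the /rag/query route, port, and payload/response shapes are assumptions, not a defined API:

```ts
// Hypothetical call into a FastAPI RAG service. Endpoint URL, topK
// parameter, and the { answer } response shape are assumed for illustration.
async function ragQuery(question: string): Promise<string> {
  const result = await queryPythonService("http://localhost:8000/rag/query", {
    question,
    topK: 5, // assumed: number of retrieved chunks
  });
  return result.answer;
}
```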
Proposed Directory Structure
```
src/
├── lib/
│   ├── memory/                      # Memory managers
│   │   ├── InMemoryMemoryManager.ts
│   │   ├── VectorDatabaseMemoryManager.ts
│   │   └── DatabaseMemoryManager.ts
│   ├── tokens/                      # Token managers
│   │   ├── OpenAITokenManager.ts
│   │   └── CustomTokenizer.ts
│   ├── agents/                      # Agent logic
│   │   ├── MCPAgent.ts
│   │   ├── SwarmAgent.ts
│   │   └── SimpleAgent.ts
│   └── plugins/                     # Plugin system
│       └── PluginManager.ts
├── hooks/                           # React hooks
│   ├── useMemoryManager.ts
│   ├── useTokenManager.ts
│   └── useAgentManager.ts
├── utils/                           # Utilities
│   └── fetch.ts                     # API communication
└── services/                        # Python interop
    ├── memoryService.ts             # Communicates with Python memory services
    ├── datasetService.ts            # Handles dataset queries
    └── ragService.ts                # RAG pipeline integration
```
Next Steps
- Build Core Interfaces:
  - Define and implement the MemoryManager, TokenManager, and Agent interfaces.
- Implement Database and Vector Backends:
  - Create DatabaseMemoryManager for chat history.
  - Add VectorDatabaseMemoryManager for semantic retrieval.
- Integrate Python Microservices:
  - Use FastAPI for LangChain-based RAG and tokenization.
- Develop Plugin System:
  - Enable dynamic registration and loading of memory, token, and agent modules.
This architecture is meant to deliver scalability, modularity, and interoperability with Python-based resources like LangChain.
Even though I am not a coder, if the architecture were established and agreed upon, I could do what I do at work (sort of as an experiment, too): see whether zero coding experience, but vast IT experience otherwise, can result in some usable contribution to this effort.
Thoughts?