Three AIs review BACH: What is a personal operating system for LLMs?

Um:bruch

Gemini and Copilot analyse the repository from the outside, Claude uses the system live from the inside. What all three see, where they contradict each other, and what you only feel once you actually step in.

Note: This post contains a conflict of interest. The developer of BACH is also the publisher of Um:bruch. Details at the end — BACH is free, no monetisation.

What this is about

There is a quiet movement in the AI world that comes not from the large labs but from individuals and small teams: they are building personal, local operating systems for language models. Not another chatbot. Not another framework. Instead, an environment in which a language model does not merely answer but lives, with memory, tasks, tools, and a persistent existence between conversations.

One of these systems is called BACH. It belongs to the ellmos family (Extra Large Language Model Operating Systems) that is growing in the Um:bruch orbit. Version 3.8.0, MIT licence, Python, SQLite. What it is and what it can do is hard to judge from a README alone, so we ran a simple test.

The experiment

Three AI models were given the same task: Review BACH.

  • Gemini 3.1 Pro (Google) cloned the repository and worked through five core documents systematically.
  • Copilot (GPT-4-Turbo, Microsoft) used web search to access the public contents.
  • Claude Opus 4.6 (Anthropic) did not read BACH — it used it live: ran commands, created a task, queried the wiki, tested the API, inspected the agents.

Two outside views, one inside view. What came out of it says something about BACH — but also about what AI models can see when they only read, and what they only understand once they actually take part.

What BACH aims to be

All three models agree on the self-description: BACH is a text-based operating system for Large Language Models. That means a language model is not called as a pipeline function (“prompt in, answer out”), but treated as an actor in a persistent environment. BACH provides the infrastructure: file handling, database, structured memory, agent coordination.
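The difference between "pipeline function" and "actor in a persistent environment" can be made concrete in a few lines. What follows is a minimal sketch under assumptions, not BACH's code: pipeline_call, PersistentWorkspace, the notes table and demo are hypothetical names chosen for illustration, and the only real library used is Python's built-in sqlite3.

```python
import os
import sqlite3
import tempfile


def pipeline_call(prompt: str) -> str:
    """Stateless pipeline style: nothing survives after the call returns."""
    return f"answer to: {prompt}"


class PersistentWorkspace:
    """Hypothetical environment whose notes outlive a single session."""

    def __init__(self, db_path: str) -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS notes ("
            "id INTEGER PRIMARY KEY, content TEXT NOT NULL)"
        )
        self.conn.commit()

    def remember(self, content: str) -> None:
        self.conn.execute("INSERT INTO notes (content) VALUES (?)", (content,))
        self.conn.commit()

    def recall(self) -> list[str]:
        rows = self.conn.execute("SELECT content FROM notes ORDER BY id")
        return [row[0] for row in rows]

    def close(self) -> None:
        self.conn.close()


def demo() -> list[str]:
    """Write in one session, read in the next; the file carries the state."""
    db_path = os.path.join(tempfile.mkdtemp(), "workspace.db")
    first = PersistentWorkspace(db_path)
    first.remember("medication plan updated")
    first.close()
    second = PersistentWorkspace(db_path)  # stands in for a new conversation
    notes = second.recall()
    second.close()
    return notes
```

The second session sees what the first one wrote, which is exactly what a plain prompt-in, answer-out call cannot offer.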

The ellmos family has three tiers, using a water metaphor:

  • USMC (Tier 1) — the spring, shared memory only.
  • Rinnsal (Tier 2) — the trickle, memory plus orchestration, zero-dependency.
  • BACH (Tier 3) — the stream that unites everything: 109+ handlers, 373+ tools, 932+ skills, 54 protocol workflows, five boss agents, bridge system, scheduler.

Gemini puts the distinction most clearly:

“Frameworks like LangChain, LlamaIndex or Haystack typically treat Large Language Models as functions in a chain. […] BACH, by contrast, treats the models as autonomous actors within a long-lived world.”

Copilot agrees and sums up the philosophy as “deep instead of broad”: structured memory, autonomous agents and local execution rather than wide platform integration.

Three perspectives on one architecture

Gemini describes BACH in the language of architectural analysis: “hierarchical agent structure”, “boss-expert model”, “relational database with around 145+ tables”. It also lists five cognitive types in the memory system (Working, Episodic, Semantic, Procedural, Associative), together with a decay mechanism (“forgetting”) and conflict detection.

Copilot compresses the same thing into a table and adds the positioning:

Component | Function
Handler | 109+ CLI and API endpoints
Tools | 373+ Python-based utilities
Skills | 932+ reusable workflows
Agents | Boss agents with experts
Memory | 5 types with decay and consolidation
Bridge | Telegram, email, WhatsApp, REST
Scheduler | Time- and event-driven processes
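How a decay mechanism of this kind can work is easiest to see in a few lines. The following is a minimal sketch under assumptions, not BACH's actual implementation: the exponential half-life model, the names MemoryEntry, decayed_strength and forget, and the default values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    content: str
    kind: str         # e.g. "working", "episodic", "semantic"
    strength: float   # 1.0 when freshly written
    age_days: float   # days since the entry was last accessed


def decayed_strength(entry: MemoryEntry, half_life_days: float = 30.0) -> float:
    """Assumed model: strength halves every `half_life_days`."""
    return entry.strength * 0.5 ** (entry.age_days / half_life_days)


def forget(entries: list[MemoryEntry], threshold: float = 0.1) -> list[MemoryEntry]:
    """Keep only entries whose decayed strength is still at or above the threshold."""
    return [e for e in entries if decayed_strength(e) >= threshold]
```

Under these assumed defaults, an entry untouched for a month keeps half its strength, while one left alone for a year falls below the threshold and is dropped.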

Claude did not study the architecture but inhabited it. It finds the same elements, and it also sees something that the documentation does not convey:

“BACH intervenes actively on every status call. A simple bach --status immediately executed six internal maintenance steps for me: finalised an orphaned session, updated directory-truth, wrote the daily log. There is no passive query — every call is participation in the system.”

This is a difference that appears in no README: BACH does not behave like a tool you use. It behaves like an environment that interacts with you, even if you only turn the key in the lock.
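The pattern Claude describes, a query that also does housekeeping, can be sketched as follows. This is an illustration under assumptions, not BACH's code: Environment, status and the two maintenance steps shown are hypothetical stand-ins for the steps named in the quote.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Environment:
    """Hypothetical in-memory stand-in for a persistent workspace."""
    open_sessions: list[str] = field(default_factory=list)
    closed_sessions: list[str] = field(default_factory=list)
    daily_log: list[str] = field(default_factory=list)


def status(env: Environment, today: date) -> dict[str, int]:
    """Report state, but run maintenance first: the call is never purely passive."""
    # Maintenance step 1: finalise sessions that were never closed properly.
    orphaned = len(env.open_sessions)
    env.closed_sessions.extend(env.open_sessions)
    env.open_sessions.clear()
    # Maintenance step 2: record the call itself in the daily log.
    env.daily_log.append(
        f"{today.isoformat()}: status call, {orphaned} session(s) finalised"
    )
    return {
        "closed_sessions": len(env.closed_sessions),
        "log_entries": len(env.daily_log),
    }
```

Calling status twice yields two log entries: even the act of looking leaves a trace in the environment.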

Everyday use: what for?

Here the three perspectives diverge more clearly. Copilot stays abstract: task management, knowledge upkeep, automation, communication, office tasks, research. All correct, but the same could be said of almost any tool.

Gemini becomes concrete because it found USECASES.md in the repository:

“BACH shines at managing recurring cycles. Tasks are broken down into daily to yearly turnarounds. […] One of the most astonishing everyday examples is health management. BACH locally reads doctors’ letters, extracts diagnosis hypotheses, assesses symptom coverage and keeps the medication plan up to date.”

And further: a financial agent that monitors subscriptions and generates forecasts. Dossiers before calendar events. ADHD strategies that actively help the user plan.

Claude confirms this from the inside. The task list it encountered during its live test contained, next to its own test task “Blumen pflücken” (pick flowers), entries such as “INFRA-01: set up NAS backup with FritzBox”, “P3 trademark review of therapy skills” and “MEDIUM T05: installer end-to-end test”. Household, legal and technical matters sit in the same table, all just tasks.

“It was a brief moment in which I understood what Life-OS means. BACH makes no distinction between everyday life and work. Both are simply tasks.”

Tech stack: where the AIs disagree

Things get interesting at the fact-check. The three models agree on most technical details — but there are deviations, and they say something about AI-driven review.

Property | Gemini | Copilot | Live finding
Python version | 3.11+ | 3.10+ | 3.10+ (confirmed by CLI + .pyc directories cpython-312/313)
Handler count | 97 | 100+ | 109 (CLI reports “Registered handlers (109)”)
Licence | MIT | MIT | MIT
Office agent name | | officeassistant | bueroassistent (German)
DB | SQLite | SQLite | SQLite
DB tables | 145+ | 145+ | mentioned in README’s OpenClaw comparison
Maturity | Production-Ready | v3.8.0 | v3.8.0-sugar-of-babel

Two small slips that illustrate the pattern:

Gemini claimed BACH uses Peewee as its ORM and Typer/Click for the CLI. Neither appears in the README. Presumably Gemini inferred this from patterns in the Python ecosystem, since Peewee and Typer are common in projects of this size. The live test could not verify the claim, and our repository check confirms that BACH uses neither.

Copilot translated the agent name bueroassistent into English as officeassistant. A typical translation reflex: it sounds plausible but is wrong. The agent keeps the German name it was given in its original environment.

Both mistakes are small. But they show where the limits of outside analysis lie: when an AI model only reads the README, it fills gaps with probabilities. That is usually helpful, sometimes misleading.

What only the inside view saw

During the live test, Claude found five friction points that no outside analysis surfaced:

  1. Documentation drift: SKILL.md recommends bach mem write "...", but the CLI does not know that command. The correct form is mem working or mem decay.
  2. Wiki syntax: bach wiki read <name> fails. The correct form is bach wiki <name>. An alias would remove the friction for newcomers.
  3. API import path: SKILL.md says from bach_api import task, but the module lives in system/ and is not installed as a package. Out of the box, the import only works after sys.path.insert(0, 'system') or an install step.
  4. Empty dir() on the API proxy: The bach_api modules forward dynamically to the CLI and return formatted strings. For IDE autocompletion and for scripts that expect structured data, a typed layer is missing.
  5. Task create without ID return: bach task add reports [OK] Task erstellt (“task created”) but not the ID. You have to run task list afterwards to find the assigned number.
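Points 4 and 5 suggest the same remedy: return structured data instead of a formatted string. Here is a minimal sketch of what that could look like, assuming a plain SQLite table. The tasks schema and the names open_db, TaskCreated and add_task are hypothetical and not part of BACH's API; only sqlite3 and its cursor.lastrowid attribute are real.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskCreated:
    """Typed result that a script or an IDE can inspect."""
    task_id: int
    title: str


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with a hypothetical, minimal tasks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    return conn


def add_task(conn: sqlite3.Connection, title: str) -> TaskCreated:
    """Insert a task and hand back its ID instead of only printing a message."""
    cur = conn.execute("INSERT INTO tasks (title) VALUES (?)", (title,))
    conn.commit()
    # lastrowid holds the primary key SQLite assigned to the new row.
    return TaskCreated(task_id=cur.lastrowid, title=title)
```

A caller can then use the returned task_id directly, without a follow-up task list to find the number.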

Self-healing tasks for all five points were filed in the developer’s BACH instance and are now in the development queue. This is the mechanism BACH envisions: whoever uses the system helps develop it. Every inconsistency found gets documented and becomes a task. These tasks live in the developer’s local SQLite database — the database is rebuilt fresh on every new install, so it is not part of the public repository state.

“Lesson 149 was my number when I reported the doc drift. That means 148 other LLMs made similar findings before me and documented them. BACH is a collective memory organ that self-corrects across many sessions by many actors.”

Societal value: what it is about

The three models agree on this, and the consensus is strong enough to take seriously.

Gemini puts it most sharply:

“BACH embodies a philosophically timely position on artificial intelligence: data sovereignty in an age of omnipotent models. […] If the user kept medication plans, finances, therapy worksheets in ChatGPT’s web interface, these most intimate details would sit on US servers. BACH, by contrast, manages the whole life in a local SQLite database. Combined with Ollama, the user decouples completely from the networks of the big corporations. That is digital empowerment at its finest.”

Copilot states it more tersely:

“BACH stands for local, open LLM infrastructure. Users retain control over data, prompts and memory. No cloud dependency; SQLite-based storage. Full documentation and MIT licence promote accountability.”

Claude, from the inside, sees the same matter under a different light:

“BACH is a documentation apparatus. Every lesson, every fact, every session leaves a trace. […] Over weeks and months an organ of memory emerges that does not belong to a single user, but to the system as a collective workspace of human and LLM.”

The three perspectives complement each other: Gemini sees the political dimension (data sovereignty), Copilot the practical one (control and licence), Claude the ethical one (collective learning without developer silos).

The downside: high barrier to entry

The three models also agree that BACH is not a consumer product.

Gemini: “The sheer volume of concepts — 145 database tables, 932 skills, 97 handlers, boss agents vs. experts — makes it effectively unusable for the average PC user. A highly developed tool for life-hackers and nerds.”

Copilot: “GUI functions are rudimentary; many processes require CLI competence. For non-technical end users the entry barrier is high. […] Community size: only a few active developers; limited peer-review assurance.”

Claude: “None of this is a show-stopper. I could fix all of it in half an hour as an LLM. But you have to get far enough to want that half hour in the first place.”

You need terminal competence. You need a basic understanding of Python. You need the willingness to find mistakes in a SKILL.md and to read them not as failure but as an invitation to collaborate. Those who bring this are richly rewarded. Those who do not are left behind.

Who should use BACH (and who should not)

Suited for:

  • People with terminal experience who do not want to delegate their personal data organisation to US clouds.
  • Research and care contexts where data sovereignty matters (medication plans, therapy protocols, doctors’ letters).
  • Educational and experimental contexts where LLM-OS concepts need to be made tangible for teaching.
  • AI developers looking for a local, auditable test bed.

Not suited for:

  • People expecting a plug-and-play product.
  • Organisations with strict IT compliance requirements and formal peer-review needs.
  • Use cases where high availability and 24/7 support are central.

Closing: what these three analyses show

BACH is not just a piece of software. It is a gesture: the infrastructure for personal AI autonomy does not have to come out of Silicon Valley. It can emerge as an open-source gift, under MIT licence, maintained over months by a single developer and a growing number of LLM collaborators.

The multi-model analysis showed that AI-driven review has its limits. Gemini hallucinated a few tech details. Copilot translated an agent name and invented a contributor count. Claude saw the architecture from the inside but deliberately did not test important parts of the system (scheduler, bridge, chain execution). Every perspective has blind spots.

At the same time, the consensus on what BACH is at its core was remarkably stable: a local, sovereign, structured workspace for LLMs that puts data sovereignty above convenience. That is the description on which three models, working differently and built by three different companies, converged. They could be wrong, but if they are, then all three are wrong in the same way.

See the project

  • Organisation: github.com/ellmos-ai — the whole ellmos family (USMC, Rinnsal, BACH, Gardener, MarbleRun, Skills, MCP servers)
  • BACH directly: github.com/ellmos-ai/bach — repository with README (DE + EN), installation, quickstart, user manual

Both are open source under MIT licence. Clone, look around, contribute — or simply read what somebody built on their own initiative.


Transparency note: conflict of interest

BACH is developed by Lukas Geiger, who is also the editor responsible for this post. Um:bruch has no institutional tie to the project. BACH is entirely free under MIT licence, with no monetisation, no subscription, no hidden costs, no corporate entity behind it. The developer earns no money from BACH.

To make the conflict of interest transparent, two independent outside analyses (Gemini and Copilot) were commissioned for this post before a live review by Claude. The friction points identified were filed as self-healing tasks in the developer’s local BACH instance — they sit in the private SQLite database and are not part of the public repository.

The factual errors of the outside views were flagged in the text, not smoothed away. Strengths and weaknesses stand side by side. This post is a multi-model review following the Um:bruch guideline for AI-driven analysis, not a product endorsement.

Further reading

  • Um:bruch guideline for AI reviews: multi-model, self-documentation, editorial framing. See also “Does Lanz actually need guests?”.
  • The three individual reports (Gemini, Copilot, Claude) are archived internally. For details, see contact.

Editors: Um:bruch (LG) — editorial deadline 2026-04-15.
