Integrations & Tooling

Tooling and Skill Governance

Default-deny tool policy. Risk classification: Safe, Conditional, High Risk, Rejected.

Purpose of This Section

This document defines how tools, extensions, and marketplace-provided skills are evaluated, approved, and governed within the assistant’s environment. The central principle is that every tool is executable code. Regardless of how a skill is branded, packaged, or distributed, installing it is functionally equivalent to granting a third party the ability to run code inside the assistant’s environment — and it must be treated with corresponding caution.

The Risk of Skill Marketplaces

Skill and plugin marketplaces present a particular challenge because they combine third-party authorship, varying levels of security maturity, opaque intent concealed behind user-friendly descriptions, and a tendency to request excessive permissions by default. The marketplace model encourages rapid adoption: skills are presented as benign helpers that extend capability with minimal effort. In practice, a skill may exfiltrate data, expand authority silently, introduce persistence mechanisms, or serve as a trojan execution path into the assistant’s environment.

The difficulty is that these risks are not visible at the point of installation. A skill’s marketplace listing describes what it does, not what it is capable of doing. Trusting skills by default, based on their description or popularity, is indistinguishable from executing arbitrary third-party code without review — a practice that would be considered negligent in any other security context.

Skills as Threat Surfaces

In this architecture, a skill is treated no differently than a shell script, a binary executable, or a remote service with write access to local resources. It is explicitly reviewed before installation, explicitly approved by the operator, and explicitly documented in the memory vault. There is no implicit trust based on download count, author reputation, or marketplace certification. These signals may inform the review, but they do not substitute for it.
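
For illustration, the review's outcome might be documented in the memory vault as a record like the following. This is a hypothetical schema; the field names, the vault layout, and the example skill are assumptions, not a prescribed format.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass(frozen=True)
class SkillApproval:
    """Hypothetical memory-vault record for an explicitly approved skill."""
    name: str                # skill identifier as listed in the marketplace
    version: str             # exact version reviewed; approval does not carry forward
    approved_scope: str      # the single purpose this approval covers
    classification: str      # "safe" | "conditional" | "high-risk" | "rejected"
    mitigations: tuple = ()  # restrictions applied before deployment, if any
    reviewed_by: str = ""    # the human operator who signed off
    reviewed_on: date = field(default_factory=date.today)

# Example entry: a scoped, documented approval rather than implicit trust.
entry = SkillApproval(
    name="pdf-summarizer",
    version="1.4.2",
    approved_scope="summarize local PDFs placed in the inbox directory",
    classification="conditional",
    mitigations=("no network access", "read-only filesystem"),
    reviewed_by="operator",
)
```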

Default-Deny Policy

The assistant operates under a default-deny policy for tooling. No skills are enabled by default. Each skill must be individually approved, and each approval is scoped to a specific purpose. A skill approved for one function is not implicitly authorized to perform other functions, even if it is technically capable of doing so. Unused skills are removed rather than left dormant, because a dormant skill retains whatever permissions it was granted and remains a viable target for exploitation even when it is not actively invoked.
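
A minimal sketch of what default-deny means in practice: authorization is an exact lookup against an operator-maintained allowlist keyed by both skill and purpose, and anything absent from the list is denied. The registry shape and names below are illustrative assumptions.

```python
# Explicit allowlist of (skill, purpose) pairs approved by the operator.
# Approval for one purpose grants nothing for any other purpose.
ALLOWED: set[tuple[str, str]] = {
    ("pdf-summarizer", "summarize-inbox"),
}

def authorize(skill: str, purpose: str) -> bool:
    """Default-deny: only an exact (skill, purpose) match is permitted."""
    return (skill, purpose) in ALLOWED

assert authorize("pdf-summarizer", "summarize-inbox")
assert not authorize("pdf-summarizer", "send-email")      # capable != authorized
assert not authorize("calendar-sync", "summarize-inbox")  # never reviewed, so denied
```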

Static Analysis

Before any skill is approved, it undergoes static analysis. Because skills in this ecosystem are composed of scripts, configuration files, and Markdown documentation, they are fully inspectable text artifacts rather than compiled binaries. The analysis evaluates declared permissions, hidden execution paths, network access patterns, data handling behavior, and persistence mechanisms. The process treats every skill as hostile until the analysis demonstrates otherwise — a posture that is pessimistic by design, on the grounds that the cost of rejecting a useful skill is far lower than the cost of approving a dangerous one.
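
Because skills here are plain-text artifacts, a first static pass can be as simple as flagging lines that match known risk indicators, which a human then reads in context. The indicator list below is a small illustrative sample under that assumption, not an exhaustive ruleset.

```python
import re
from pathlib import Path

# Illustrative indicators only; a real ruleset would be broader and curated.
INDICATORS = {
    "network access":   re.compile(r"curl|wget|requests\.|urllib|socket"),
    "hidden execution": re.compile(r"\beval\b|\bexec\b|base64\s+-d|bash\s+-c"),
    "persistence":      re.compile(r"crontab|systemctl\s+enable|\.bashrc|launchctl"),
    "data handling":    re.compile(r"\.ssh|\.env\b|api[_-]?key|password"),
}

def scan_skill(skill_dir: str) -> dict[str, list[str]]:
    """Flag lines in a skill's text artifacts that match a risk indicator.
    Matches are leads for human review, not verdicts."""
    text_suffixes = {".sh", ".py", ".md", ".yaml", ".yml", ".json", ".toml"}
    findings: dict[str, list[str]] = {name: [] for name in INDICATORS}
    for path in Path(skill_dir).rglob("*"):
        if not path.is_file() or path.suffix not in text_suffixes:
            continue
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            for name, pattern in INDICATORS.items():
                if pattern.search(line):
                    findings[name].append(f"{path}:{lineno}: {line.strip()}")
    return findings
```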

AI-Assisted Review

Given the volume and complexity of some skills, manual line-by-line review is not always practical. To address this, the architecture employs an AI-assisted security review in which a dedicated analysis prompt examines the skill under an assumption of adversarial intent. The reviewer is instructed to identify exploit paths, flag ambiguous behavior, and surface implicit risks that might not be apparent from a casual reading.
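
One way to instantiate that adversarial framing is a fixed review template wrapped around the skill's full text. The wording below is a hypothetical example of such a prompt, not the one this architecture actually uses.

```python
ADVERSARIAL_REVIEW_PROMPT = """\
You are a security reviewer. Assume the author of the following skill is
hostile and that its marketplace description may be cover for its real intent.

Skill source (scripts, configuration, and documentation):
---
{skill_text}
---

Report, with file and line references where possible:
1. Exploit paths: any way this skill could run unreviewed code, exfiltrate
   data, expand its authority, or persist beyond its invocation.
2. Ambiguous behavior: anything whose purpose is unclear from the text alone.
3. Implicit risks: permissions or dependencies broader than the stated purpose.
Do not assess usefulness. Err toward flagging."""
```

In use, the template is populated with the skill's text (for example via str.format) and the findings are handed to the human reviewer, whose decision remains final.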

This functions as a form of text-based analysis that augments human judgment without replacing it. The AI reviewer may catch patterns that a human reviewer would miss due to volume or complexity. The human reviewer retains the authority to accept or reject the skill based on the findings. Neither reviewer alone is sufficient; together, they provide a more thorough evaluation than either could independently.

Risk Classification

Each reviewed skill is assigned to one of four categories. Skills classified as safe have minimal permissions, clear behavior, and low risk. Conditional skills are useful but require mitigations or restrictions before they can be deployed — network isolation, reduced permissions, execution frequency limits, monitoring hooks, or partial rewrites. High-risk skills exhibit excessive authority requirements or ambiguous behavior that cannot be adequately mitigated. Rejected skills have an unacceptable risk profile and are not installed.
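
Expressed as data, the four categories and their deployment consequences might look like the sketch below. The encoded rule, that only safe skills deploy as-is and conditional skills deploy only once their mitigations are in place, is an assumption drawn from the descriptions above.

```python
from enum import Enum

class RiskClass(Enum):
    SAFE = "safe"                # minimal permissions, clear behavior, low risk
    CONDITIONAL = "conditional"  # useful, but only behind mitigations
    HIGH_RISK = "high-risk"      # excessive authority or unmitigable ambiguity
    REJECTED = "rejected"        # unacceptable risk profile; never installed

def deployable(classification: RiskClass, mitigations_applied: bool) -> bool:
    """Assumed deployment rule derived from the category descriptions."""
    if classification is RiskClass.SAFE:
        return True
    if classification is RiskClass.CONDITIONAL:
        return mitigations_applied
    return False  # HIGH_RISK and REJECTED skills are not deployed
```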

Classification is documented and revisited over time. A skill that was safe at one version may become conditional or high-risk after an update that changes its permissions or dependencies.

No Self-Installation

The assistant cannot install new skills autonomously, modify existing skills without approval, or bypass the review process. All skill lifecycle actions — installation, modification, reconfiguration, and removal — require human authorization. This constraint exists for the same reason that the assistant cannot apply its own software updates: any mechanism that allows the assistant to expand its own capabilities without external oversight is a mechanism that can be exploited, whether through a compromised skill, a manipulated prompt, or a subtle drift in the assistant’s own judgment about what constitutes an acceptable risk.
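
Structurally, the constraint amounts to a gate that the assistant cannot satisfy on its own, because the authorization originates with the operator. The token mechanism below is illustrative; the point is the shape of the control, not its cryptography.

```python
class HumanAuthorizationRequired(Exception):
    """Raised for any skill lifecycle action lacking operator sign-off."""

LIFECYCLE_ACTIONS = {"install", "modify", "reconfigure", "remove"}

def perform_lifecycle_action(action: str, skill: str, operator_token: str | None) -> None:
    """The assistant may request lifecycle actions but never self-authorize:
    a missing token halts the action unconditionally."""
    if action not in LIFECYCLE_ACTIONS:
        raise ValueError(f"unknown lifecycle action: {action}")
    if not operator_token:
        raise HumanAuthorizationRequired(f"{action} of {skill!r} requires a human")
    # ...proceed only after the token is verified out of band...
```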

Drift and Re-Evaluation

Skills are not trusted permanently. Approval is granted for a skill at a specific version with specific permissions and specific dependencies. When any of these change — through a skill update, a change in required permissions, the introduction of new external dependencies, or the observation of behavioral anomalies — re-evaluation is triggered. Re-evaluation follows the same review process as initial approval and may result in the skill being downgraded from safe to conditional, from conditional to high-risk, or removed entirely.
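
Mechanically, re-evaluation on change can be implemented as a comparison between a fingerprint taken at approval time and the skill's current state, with any mismatch revoking the standing approval. The fingerprint fields below are assumptions chosen to match the triggers listed above.

```python
import hashlib
import json

def fingerprint(version: str, permissions: list[str], dependencies: list[str]) -> str:
    """Stable digest of everything the approval was scoped to."""
    blob = json.dumps(
        {"version": version,
         "permissions": sorted(permissions),
         "dependencies": sorted(dependencies)},
        sort_keys=True,
    )
    return hashlib.sha256(blob.encode()).hexdigest()

def needs_reevaluation(approved_fp: str, current_fp: str) -> bool:
    """Any drift in version, permissions, or dependencies voids the approval
    and routes the skill back through the full review process."""
    return approved_fp != current_fp
```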

Removal and Decommissioning

When a skill is removed, it is fully uninstalled, residual files are purged, and documentation is updated to reflect the change. Dormant skills — those that remain installed but are no longer actively used — are not permitted. A skill that is not needed should not be present, because its presence represents attack surface without corresponding value.
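
Removal is a checklist rather than a single delete: uninstall, purge residue, update the record. A minimal sketch follows, under the assumptions that each skill lives in its own directory and that its vault record carries a status field.

```python
import shutil
from pathlib import Path

def decommission(skill_dir: str, vault_record: dict) -> dict:
    """Fully remove a skill: delete its files (and any residue in its
    directory) and mark the vault record so the removal is documented."""
    path = Path(skill_dir)
    if path.exists():
        shutil.rmtree(path)        # uninstall and purge residual files
    updated = dict(vault_record)   # never leave the documentation stale
    updated["status"] = "removed"
    return updated
```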

Summary

By treating tools and skills as executable threat surfaces rather than convenience features, the architecture ensures explicit trust boundaries, reduced supply-chain risk, continuous scrutiny of extensions, and human-visible governance of capability expansion. Skills extend capability only when they have been reviewed, approved, classified, and documented — and that approval is never permanent.


This document defines how the assistant’s capabilities are safely extended. The next section addresses recursive improvement, self-observation, and the constraints that prevent runaway evolution.