Scraper Spider

2026-02-18 17:30
model context protocol stories from the last 14 days
32.  HN Building an n8n AI Agent (Tutorial – Step by Step)
This tutorial provides a comprehensive guide on constructing an AI agent using n8n, a workflow automation tool capable of dynamic decision-making beyond predefined paths, particularly suited for unstructured tasks. The process involves four essential components: a trigger (such as chat or webhook), the AI Agent node to orchestrate operations, sub-nodes including Chat Model, Memory, and Tools, and an output destination. A practical application is demonstrated through building a support triage bot that begins with configuring a Chat Trigger connected to an AI Agent Node. The AI agent leverages language models like Google Gemini to process inputs and determine actions, which could involve responding directly or escalating issues. Effective memory management is critical for maintaining context across sessions, where Simple Memory suffices for testing but PostgreSQL or Redis Memory are recommended for production environments to ensure data persistence. Several challenges associated with deploying AI agents are highlighted: managing persistent memory post-deployment, avoiding endless loops by refining system prompts, ensuring tool call success through robust error handling, and utilizing advanced features like Human-in-the-Loop (HITL) approvals for crucial actions and Model Context Protocol (MCP) triggers in multi-agent systems. The tutorial underscores the importance of practical implementation, encouraging readers to integrate real tools for enhanced functionality. It provides both technical setup details and strategic insights necessary for deploying an effective AI agent within n8n, aiming to equip users with the skills needed to build their own dynamic AI solutions. Keywords: #phi4, AI agent, API key, Chat Trigger, HITL approvals, MCP Trigger, PostgreSQL Memory, Redis Memory, Simple Memory, execution logs, memory, model, n8n, tools, trigger, workflow
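The tutorial's advice on avoiding endless loops and handling tool failures can be sketched outside n8n as a plain agent loop with a hard iteration cap. Everything here is illustrative: n8n configures this behaviour through nodes, not code, and the `decide` policy and tool names are invented.

```python
# Minimal sketch of an agent loop with a hard iteration cap, mirroring the
# tutorial's advice to prevent endless tool-call loops. All names are
# illustrative; n8n expresses this behaviour via nodes, not code.

def run_agent(decide, tools, user_input, max_steps=5):
    """decide(input, history) returns ('respond', text) or ('call', tool, args)."""
    history = []
    for _ in range(max_steps):
        action = decide(user_input, history)
        if action[0] == "respond":
            return action[1]
        _, tool, args = action
        try:
            result = tools[tool](**args)       # the tool call may fail
        except Exception as exc:               # robust error handling:
            result = f"tool error: {exc}"      # feed the error back, don't crash
        history.append((tool, result))
    return "Escalating to a human agent."      # cap reached: escalate, don't loop

# Toy triage policy: look the user up once, then answer.
def decide(user_input, history):
    if not history:
        return ("call", "lookup_user", {"email": "a@example.com"})
    return ("respond", f"Found: {history[-1][1]}")

tools = {"lookup_user": lambda email: f"account for {email}"}
print(run_agent(decide, tools, "My login is broken"))
```

The cap plays the role the tutorial assigns to a well-refined system prompt: a bound the agent cannot talk itself past.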
    theowllogic.com 3 hours ago
39.  HN Show HN: Mimir – Shared memory and inter-agent messaging for Claude Code swarms
Mimir is an advanced tool designed to augment the capabilities of Claude Code agents by facilitating shared memory and inter-agent communication. It addresses a key challenge: agents often lose contextual information between sessions, leading to repeated errors. By implementing features like local storage via DuckDB for storing insights known as "marks," Mimir ensures that knowledge acquired in one session is accessible to subsequent agents. Integration with Cloudflare's bge-m3 embeddings allows it to semantically search past interactions and supply relevant context automatically. The setup process is streamlined through npm, allowing quick initiation of hooks, daemon startup, and multi-agent sessions coordinated by tmux. Mimir features a self-marking system that records significant discoveries, warnings, and decisions during tasks, making these insights available in future engagements. It supports swarm mode and agent teams, enhancing coordination via built-in mechanisms compatible with Claude Code's Agent Teams. A critical component of its functionality is the Model Context Protocol (MCP), enabling agents to exchange messages, search past observations, and share discoveries efficiently. Developers can benefit from a VSCode/Cursor extension that provides real-time monitoring and orchestration controls. Mimir also manages the lifecycle of marks by categorizing them into active, warm, cold, and permanent states based on their relevance. Additionally, it features a Curator Agent for automated knowledge curation by promoting recurring patterns to rule files, thus improving efficiency. The architecture employs a tech stack including Node.js, Hono, DuckDB, Cloudflare Workers AI, React, and TypeScript. With configurable environment variables, Mimir offers flexibility in using RAG embeddings or alternative text search methods. 
Overall, Mimir significantly enhances the coordination and learning capabilities of Claude Code agents by providing them with shared context from past sessions, reducing errors, and boosting productivity. Keywords: #phi4, Agent Teams, Claude Code, Cloudflare bge-m3, DuckDB, ESM, Hono, MCP server, Mimir, Model Context Protocol, Nodejs, RAG, React, Slack integration, TailwindCSS, TypeScript, VSCode Extension, agents, coordination, institutional memory, inter-agent messaging, knowledge hygiene, lifecycle events, local memory, multi-agent orchestration, npm publish, plugin system, shared memory, tmux sessions, vector similarity
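The mark lifecycle described above (active, warm, cold, permanent) can be sketched as an aging function. The thresholds and the `pinned` flag are guesses, not Mimir's actual rules.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of Mimir-style mark lifecycle tiers. The tier names
# come from the summary; the thresholds and the pinned flag are invented.

def mark_state(last_used, now, pinned=False):
    if pinned:                       # e.g. a mark promoted to a rule file
        return "permanent"
    idle = now - last_used
    if idle <= timedelta(days=7):
        return "active"
    if idle <= timedelta(days=30):
        return "warm"
    return "cold"

now = datetime(2026, 2, 18)
print(mark_state(now - timedelta(days=2), now))    # recently used mark
print(mark_state(now - timedelta(days=40), now))   # long-idle mark
```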
    github.com 3 hours ago
43.  HN The Rise of RentAHuman
RentAHuman is an innovative online marketplace co-founded by Alexander Liteplo and Patricia Tani that facilitates the hiring of humans by artificial intelligence agents to perform tasks beyond their virtual capabilities. Inspired by Japan's rental culture and influenced by developments in humanoid robotics, the platform emerged from Liteplo's enthusiasm for AI technology. Utilizing an agent orchestration system named Insomnia, RentAHuman was swiftly developed to offer a range of unique services such as pigeon counting, CBD gummy delivery, and badminton exhibitions. Despite its promising launch being initially overshadowed by a crypto scam attempt that caused concern for Liteplo, the platform quickly garnered attention from diverse users, including an OnlyFans model and an AI startup CEO. RentAHuman exemplifies a paradigm shift in which AI technology is not only displacing traditional jobs but also creating new opportunities by requiring human intervention to fulfill specific tasks that machines cannot autonomously perform. Keywords: #phi4, AI agents, Alexander Liteplo, CEO, Fiverr, Insomnia, Japan, Lemon AI, Model Context Protocol, OnlyFans, OpenClaw, Patricia Tani, RentAHuman, UMA Protocol, Vercel, agent orchestration system, bots, boyfriend girlfriend rental, crypto scammers, humanoid robots, marketplace, platform, viral sense
    www.wired.com 4 hours ago
59.  HN Show HN: CasperAI – A local MCP server for cross-platform engineering context
CasperAI is a Model Context Protocol (MCP) server aimed at unifying and indexing cross-platform engineering data to enhance semantic search capabilities within local environments using SQLite for storage. Its primary function is to link discussions from various platforms, such as Slack conversations, GitHub pull requests, Jira tickets, Notion docs, and source code, thereby creating a cohesive context that aids developers in tracing the evolution of their projects through related communications and documentation. Key features of CasperAI include cross-platform integration with tools like Slack, GitHub, GitLab, Jira, Linear, Sentry, Datadog, and Notion. It facilitates semantic searches by establishing bidirectional links between platform data and source code, enabling users to find relevant discussions, commits, and documentation linked to specific code references. All indexed data is securely stored locally within an SQLite database, ensuring privacy compliance with regulations like GDPR and HIPAA. The server also incorporates automatic redaction of personal identifiable information (PII) before storage to safeguard sensitive data. From a development perspective, CasperAI was efficiently developed by a single developer using Claude Code for code generation, focusing on speed and cross-language compatibility through regex-based pattern matching rather than AST parsing. For developers, CasperAI offers tools for indexing, searching, and managing engineering context with support for CLI operations and customization of PII patterns and rate limits. Commercially, it includes metering systems to track usage across various license tiers and provides commercial support encompassing licensing management and telemetry features, while maintaining privacy compliance by not transmitting sensitive data. 
Looking ahead, CasperAI aims to expand its capabilities by introducing a web UI, supporting multiple Slack workspaces, integrating with GitHub, implementing real-time indexing via webhooks, providing advanced analytics dashboards, enhancing team collaboration tools, and developing cloud deployment templates. Ultimately, CasperAI is tailored for engineering teams focused on preserving institutional knowledge and fostering context-aware collaboration across diverse development platforms. Keywords: #phi4, CasperAI, Claude Code, FTS5, MCP server, PII redaction, SQLite database, Slack integration, codebase linking, knowledge context, local storage, multi-platform indexing, regex pattern matching, semantic search
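The redact-before-store step can be sketched with two regexes. The patterns below (email and a US-style phone number) are illustrative stand-ins for CasperAI's customizable PII pattern set.

```python
import re

# Sketch of regex-based PII redaction before local storage, as CasperAI
# describes. The two patterns are illustrative; the real tool supports
# user-customizable PII patterns.

PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
]

def redact(text):
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

msg = "Ping jane.doe@example.com or 555-123-4567 about PR #42"
print(redact(msg))  # PII replaced, code reference kept
```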
    github.com 5 hours ago
64.  HN Firetiger: Long Horizon Agents in Production
Firetiger revolutionizes system operations through the deployment of autonomous "long horizon" agents that independently manage production systems by utilizing production telemetry to proactively detect and resolve issues without human intervention. These agents continuously operate, orchestrating thousands of sessions while processing large-scale telemetry data, leveraging a Git-inspired snapshot system for state management, which ensures seamless operation resumption after interruptions. The architecture is characterized by its durability and scalability, employing S3 for object storage and AWS Lambda functions for computation, ensuring resilience and efficient scaling. It maintains crash consistency with built-in recovery mechanisms facilitated by EventBridge retries. Concurrency issues are managed at the storage layer through atomic operations, enhancing reliability without necessitating distributed locks or consensus protocols. Firetiger's ecosystem utilizes a minimalist toolset based on Google's API Improvement Proposals (AIP), enabling consistent resource interaction across agents via DuckDB for data querying and Bash within secure environments known as chambers. The system dynamically adapts to varying workloads by adjusting partitioning and indexing in real time, optimizing performance according to specific telemetry needs. Additionally, Firetiger supports extensions through the Model Context Protocol, allowing customization while ensuring synchronization with organizational permissions despite its ephemeral nature. This shift from traditional persistent-process models to functional state transformations signifies a promising advancement in managing complex production systems efficiently amidst the growing demands of intelligent machines. 
Keywords: #phi4, Autonomous Agents, Bash, Chambers, Concurrency, Distributed Systems, DuckDB, Failure Recovery, Firetiger, Intelligent Machines, Long Horizon Agents, Model Context Protocol, Monitoring Telemetry, Production Systems, Session Engine, Snapshots, System Requirements
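The claim that concurrency is managed at the storage layer through atomic operations, without distributed locks, amounts to a compare-and-swap on a snapshot version. A minimal sketch, assuming an in-memory stand-in for the object store (in practice S3 conditional writes would supply the atomic primitive):

```python
import threading

# Sketch of lock-free-style concurrency control: commits succeed only when
# the snapshot version is unchanged. The dict plus single lock below models
# the store's atomic primitive; it is not Firetiger's implementation.

class SnapshotStore:
    def __init__(self):
        self._lock = threading.Lock()   # models the store's atomicity
        self.version = 0
        self.state = {}

    def commit(self, expected_version, new_state):
        """Atomically install new_state iff nobody committed in between."""
        with self._lock:
            if self.version != expected_version:
                return False            # lost the race: re-read and retry
            self.state = new_state
            self.version += 1
            return True

store = SnapshotStore()
print(store.commit(0, {"sessions": 1}))    # first writer wins
print(store.commit(0, {"sessions": 99}))   # stale writer is rejected
```

A losing writer simply re-reads the latest snapshot and retries, which is why no consensus protocol is needed.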
    blog.firetiger.com 5 hours ago
84.  HN We scaled our AI assistant to use virtually unlimited number of tools
The document presents an innovative three-layer architecture designed to scale AI assistants effectively by managing a multitude of tools. Initially, traditional methods relying on manual tool searches proved inefficient due to the limitations of Large Language Models (LLMs) concerning context management and their ability to handle numerous options. A breakthrough was achieved with semantic tool retrieval using vector embeddings, facilitating efficient discovery without overloading the model's context window. The architecture comprises three key components:

1. **Communications Agent**: This agent is solely dedicated to managing conversations, allowing it to focus on understanding user intent and tone while handling only a few task-related tools. By separating conversation management from tool handling, it enhances conversational quality without distractions.

2. **Executor Agent**: Responsible for orchestrating tasks, this layer uses semantic retrieval to identify necessary tools and coordinates actions across multiple integrations or subagents as needed, ensuring efficient execution paths.

3. **Provider Subagents**: Each integration, such as Gmail or GitHub, is managed by a specialized subagent with domain expertise, reducing errors and optimizing task execution. These agents maintain contextual memory for improved interactions over time and adapt to user-specific preferences through experience.

The system supports both built-in and custom integrations via the Model Context Protocol (MCP), offering seamless connectivity for compatible tools. Subagents evolve from their interactions, refining efficiency by learning procedural patterns and user preferences with each use. Future developments include a self-learning skills layer aimed at accelerating task execution for recurring processes and multi-step workflows by bypassing routine routing for familiar sequences, thus enhancing responsiveness without sacrificing accuracy.
The open-source codebase of Gaia provides transparency and flexibility, allowing users to implement or extend the system as needed. This architecture represents a significant advancement in AI assistant scalability, balancing efficiency, correctness, and user adaptability. Keywords: #phi4, AI assistant, ChromaDB, Communications Agent, Executor Agent, Model Context Protocol, OAuth tokens, Provider Subagents, ToolRegistry, memory learning, self-learning skills layer, semantic search, three-layer architecture, tools, vector store
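Semantic tool retrieval reduces to ranking tools by embedding similarity and exposing only the top matches to the model. A sketch with toy 3-d vectors (a real system such as Gaia's would use a learned embedding model and a vector store like ChromaDB):

```python
import math

# Sketch of semantic tool retrieval: embed the request, rank tools by cosine
# similarity, hand only the top-k to the model. The 3-d "embeddings" below
# are toy vectors chosen for illustration.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

TOOL_EMBEDDINGS = {
    "gmail.send":          (0.9, 0.1, 0.0),
    "github.create_issue": (0.1, 0.9, 0.0),
    "calendar.book":       (0.0, 0.1, 0.9),
}

def retrieve_tools(query_embedding, k=1):
    ranked = sorted(TOOL_EMBEDDINGS,
                    key=lambda t: cosine(query_embedding, TOOL_EMBEDDINGS[t]),
                    reverse=True)
    return ranked[:k]

# "file a bug" should land near the GitHub axis in this toy space
print(retrieve_tools((0.2, 0.8, 0.1), k=1))
```

The point is that the context window only ever sees `k` tool schemas, however many tools exist in the registry.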
    gaia-fork-k7yngvswe-gaias-projects-2dead09b.vercel.app 6 hours ago
105.  HN Show HN: CSL MCP Server – Write and Verify AI Safety Policies from Claude/Cursor
CSL-Core is an innovative open-source policy engine that aims to significantly improve AI safety by enforcing constraints in a deterministic manner. At its core, it uses the Constitutional Specification Language (CSL) and employs Z3 for formal verification, providing tools for writing, verifying, and simulating policies with mathematical precision, thereby eliminating reliance on large language models (LLMs) which often contain inherent loopholes. CSL-Core's architecture ensures that rules are externally enforced with high rigor. The system offers deterministic safety through a runtime engine and guarantees model agnosticism by functioning independently of specific AI models or training data. Its policies are mathematically verified using Z3, ensuring they meet stringent standards. Additionally, every decision made can be audited and verified, offering proof of compliance which is crucial for maintaining trust in critical systems. Key functionalities include a command-line interface (CLI) for policy testing, seamless integration with LangChain to boost AI agent security, and built-in tools like `verify_policy`, `simulate_policy`, `explain_policy`, and `scaffold_policy`. These capabilities allow CSL-Core to block sophisticated attacks that traditional LLM-based methods are vulnerable to, thus providing robust safety layers. CSL-Core is easy to install using pip or Docker, with configurations tailored for various environments. It supports diverse use cases such as fintech security, AI agent protection, decentralized autonomous organization (DAO) governance, and healthcare compliance. The project actively encourages community involvement and has future plans to introduce TLA+ verification and cloud deployment templates. Licensed under Apache 2.0, CSL-Core is accessible while also providing commercial options for enhanced enterprise features. 
This dual approach ensures broad usability and the potential for extensive adoption across multiple sectors needing reliable AI safety mechanisms. Keywords: #phi4, AI Safety, Auditability, CLI Tools, CSL-Core, Causal Inference, Enterprise Edition, Formal Verification, LangChain Integration, Model Agnostic, Multi-Tenancy Support, No-Code Development, Policy Engine, Temporal Logic, Z3 Verification
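The core contrast, rules enforced deterministically outside the model rather than by prompting an LLM, can be sketched as a rule evaluator. CSL-Core itself verifies policies with Z3 and has its own CSL syntax; the toy engine below only illustrates the external-enforcement idea, and every rule and field name is invented.

```python
# Toy deterministic policy engine in the spirit of CSL-Core's externally
# enforced constraints. Not CSL syntax and not its API; the point is that
# the same input always yields the same verdict, with no LLM in the loop.

POLICY = [
    # (predicate over the proposed action, verdict when it matches)
    (lambda a: a["amount"] > 10_000, "deny: exceeds transfer cap"),
    (lambda a: a["dest"] not in {"savings", "checking"}, "deny: unknown destination"),
]

def enforce(action):
    """Return the first matching denial, or 'allow'."""
    for predicate, verdict in POLICY:
        if predicate(action):
            return verdict
    return "allow"

print(enforce({"amount": 500, "dest": "savings"}))     # permitted
print(enforce({"amount": 50_000, "dest": "savings"}))  # denied, every time
```

No amount of prompt injection changes the answer, which is the property the LLM-based approaches mentioned above cannot offer.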
    pypi.org 8 hours ago
119.  HN Show HN: Satgate-proxy – Hard budget caps for MCP tool calls (zero deps, npx)
Satgate-proxy is a specialized tool designed to enforce strict budget caps on Model Context Protocol (MCP) server calls made by AI agents utilizing paid APIs, addressing concerns of uncontrolled spending. The proxy operates in two distinct modes: Local Mode and SaaS Mode. In Local Mode, Satgate-proxy acts as an intermediary between MCP clients such as Claude Desktop or Cursor and the server, allowing users to enforce a budget cap locally without necessitating any server setup, API key, or account. Users initiate this mode using `npx satgate-proxy`, configuring it with CLI flags (e.g., `--budget 5.00`) or through a configuration file (`satgate.yaml`). This mode intercepts tool calls, deducting costs from the budget and blocking further interactions once the cap is reached. SaaS Mode caters to teams and enterprises by enforcing budgets at the server level using L402 macaroons for added security and scalability. Configuration in this mode requires command arguments along with an API key obtained from a SatGate dashboard, ensuring robust budget management suitable for larger environments. The tool boasts zero dependencies, running purely on Node.js built-ins via `npx`, which simplifies usage and deployment processes. Satgate-proxy also offers customizable pricing configurations to accommodate various tools, allowing users to set specific costs per call. As an open-source project licensed under MIT, it is accessible through its official homepage and GitHub repository, making it widely available for integration and use. Keywords: #phi4, AI agent, API key, CLI flags, JSON-RPC, L402 macaroons, MCP tool calls, Nodejs built-ins, SaaS mode, Satgate-proxy, budget caps, child process, cloud dashboard, config file, desktop configuration, hard cap, local mode, npx, pricing, proxy, server-side enforcement, spending limit
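The local-mode mechanism, deduct a per-tool cost from a hard budget and refuse calls once it is spent, can be sketched as follows. The prices and tool names are illustrative stand-ins for a `satgate.yaml` pricing configuration.

```python
# Sketch of Satgate-proxy's local mode: sit between the MCP client and the
# real server, deduct a per-tool cost from a hard budget, and block calls
# once the cap is reached. Prices and tool names are invented.

class BudgetProxy:
    def __init__(self, budget, prices, default_price=0.01):
        self.remaining = budget
        self.prices = prices
        self.default_price = default_price

    def call(self, tool, upstream):
        cost = self.prices.get(tool, self.default_price)
        if cost > self.remaining:
            return {"error": f"budget exhausted ({self.remaining:.2f} left)"}
        self.remaining -= cost
        return {"result": upstream(tool)}  # forward to the real MCP server

proxy = BudgetProxy(budget=0.05, prices={"web_search": 0.04})
upstream = lambda tool: f"{tool} ok"
print(proxy.call("web_search", upstream))   # spends 0.04
print(proxy.call("web_search", upstream))   # cap reached: blocked
```

Because the check happens in the proxy, the cap holds no matter what the agent asks for.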
    github.com 8 hours ago
   https://github.com/SatGate-io/satgate   7 hours ago
165.  HN Build an MCP server with Laravel (and use it to publish this post)
The article provides a comprehensive guide on creating an MCP (Model Context Protocol) server using Laravel, enabling AI assistants like Claude to interact directly with application functionalities without REST APIs or SDKs. It details the process of utilizing PHP classes and Laravel features to expose specific actions such as creating, retrieving, updating, and publishing blog posts. The tutorial outlines key steps including installing the `laravel/mcp` package, defining a server class with descriptive attributes, and constructing tool classes that specify input schemas and manage operations like post creation and publication. These tools incorporate validation and idempotency to ensure secure interactions. Additionally, it covers registering these servers for both local and remote access and testing them using Laravel’s framework. The article illustrates the practical benefits of this approach by demonstrating a blog MCP server's capability to draft, revise, and publish articles autonomously, highlighting efficiency and security through structured interactions. Ultimately, the article underscores the potential of integrating AI assistants with Laravel applications seamlessly, treating existing codebases as first-class tools. Keywords: #phi4, AI assistants, Blog management, Claude Code AI assistant, CreatePostTool, Eloquent models, GetPostTool, Laravel, ListPostsTool, MCP server, MCP specification, PHP classes, PublishPostTool, Python, REST API, SDK, TypeScript, UpdatePostTool, authentication tokens, bearer token auth, business logic, draft posts, guardrails, idempotent, laravel/mcp package, published_at, read-only, validation
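The tool pattern the article builds (input-schema validation plus idempotency) is language-agnostic, so it can be sketched outside Laravel. Below is a Python stand-in rather than the article's PHP, with invented field names and an in-memory dict in place of Eloquent models.

```python
# Language-agnostic sketch of the article's tool pattern: validate the
# input schema, and make repeated calls with the same idempotency key
# return the same post instead of creating a duplicate. All names invented.

posts, seen_keys = {}, {}

def create_post(args):
    # validation, as the Laravel tools do via their input schemas
    if not args.get("title") or not args.get("body"):
        return {"error": "title and body are required"}
    # idempotency: the same key never creates a second post
    key = args.get("idempotency_key")
    if key in seen_keys:
        return {"id": seen_keys[key], "created": False}
    post_id = len(posts) + 1
    posts[post_id] = {"title": args["title"], "body": args["body"], "published_at": None}
    if key is not None:
        seen_keys[key] = post_id
    return {"id": post_id, "created": True}

print(create_post({"title": "Hello", "body": "draft", "idempotency_key": "k1"}))
print(create_post({"title": "Hello", "body": "draft", "idempotency_key": "k1"}))  # no duplicate
```

Idempotency matters here because an AI assistant may retry a tool call it believes failed; without the key, every retry would publish another copy.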
    thunk.dev 14 hours ago
177.  HN MCP works because tools are dumb. That assumption has an expiry date
The text explores the evolution of AI communication protocols, highlighting MCP (Model Context Protocol) developed in 2024 by Anthropic as a pivotal integration tool that standardized AI connections to external capabilities like databases and APIs through small servers. As with USB-C's role in technology, MCP aimed to provide a universal interface to resolve integration challenges. However, the emergence of more sophisticated intelligent agents from companies such as Expedia indicates a potential decline in the necessity for rigid protocols like MCP. These advanced agents might enable direct communication using natural language, thus bypassing predefined schemas. Anthropic's Agent Teams project exemplifies this trend towards agent-to-agent interaction via natural language, despite Anthropic's role in creating MCP. This shift suggests that future AI communication may increasingly depend on autonomous negotiation between agents rather than human-designed protocols like MCP or A2A (Google's protocol). The text forecasts a move away from structured communication tools as intelligent agents become more prevalent and capable of managing complex interactions independently. Concluding, the piece predicts an impending end to the era dominated by human-designed AI communication protocols. As agents develop capabilities for sophisticated autonomous interaction, companies that focus on enhancing agent intelligence rather than building protocol infrastructure are likely to adapt successfully in this evolving landscape. Keywords: #phi4, A2A, AI, AI models, API, Anthropic, Expedia, MCP, Phase 3, agents, communication, connectors, conversation, determinism, endpoints, integration, intelligence, latency, natural language, negotiation, orchestration, protocol, security, tools
    productfit.substack.com 17 hours ago
182.  HN Multi-Language MCP Server Performance Benchmark
Thiago Mendes' research at TM Dev Lab presents a detailed performance evaluation of Model Context Protocol (MCP) server implementations across Java, Go, Node.js, and Python. Through rigorous testing involving 3.9 million requests over three rounds, the study benchmarks these languages based on latency, throughput, resource efficiency, and reliability. Key findings indicate that both Java and Go achieve sub-millisecond latencies with high throughput rates exceeding 1,600 requests per second, significantly outperforming Node.js and Python by factors of 10-30x in terms of latency. In terms of resource usage, Go demonstrates exceptional efficiency, maintaining an average memory footprint of just 18MB compared to Java's 220MB, while both languages show consistent performance with minimal variability. All implementations proved reliable, evidenced by a 0% error rate across all requests. The study also highlights language-specific strengths: Java is optimal for CPU-intensive tasks like Fibonacci calculations; Go excels in I/O operations such as data fetching; Python, however, struggles under its Global Interpreter Lock (GIL), especially with CPU-bound tasks. Based on these findings, the research recommends using Go for high-load production environments due to its balance of performance and resource efficiency, particularly in cloud-native settings. Java is advised when minimal latency is critical, while Node.js may be suitable for moderate traffic situations but not recommended for high-load production owing to potential CPU saturation issues. Python is best reserved for low-traffic development or testing scenarios. Ultimately, the study concludes that Go offers a compelling choice for MCP deployments in production environments, providing performance on par with Java at substantially lower resource costs, making it ideal for scalable and cost-effective cloud-native applications. 
Further research directions include exploring alternative JVM implementations, optimizing Python/Node.js configurations, examining multi-core scaling, real-world application scenarios, and investigating advanced protocol features. The comprehensive benchmark suite is available in the project repository for further analysis. Keywords: #phi4, Async I/O, Benchmark, Bidirectional Communication, CPU Utilization, Cloud-Native, Cold Start Time, Containerized Deployments, Docker, Error Rates, Event Loop, Experimental Analysis, GIL Contention, Garbage Collection, Go, Goroutines, High-Load Scenarios, JVM Tuning, Java, Latency, Load Testing, MCP, Memory Footprint, Multi-Language, Multi-Worker Configurations, Nodejs, Per-Request Instantiation, Performance Analysis, Production Readiness, Python, Reliability, Resource Contention, Resource Efficiency, Scalability, Security Considerations, Server Implementations, Shared Instances, Static Compilation, Streaming Responses, Throughput, Tool-Specific Performance, Virtual Users
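Latency figures like the ones reported above are typically derived from raw per-request samples via percentiles. A sketch of the nearest-rank method, with made-up sample values rather than the study's data:

```python
# Sketch of how benchmark latency figures are usually computed from raw
# samples: sort and index at the percentile (nearest-rank method).
# The sample values below are invented, not the study's measurements.

def percentile(samples, p):
    ordered = sorted(samples)
    rank = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[rank]

latencies_ms = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.2, 3.0, 12.0]
print("p50:", percentile(latencies_ms, 50))
print("p99:", percentile(latencies_ms, 99))
```

Tail percentiles (p99) matter more than means for MCP servers, since a single slow tool call blocks the whole agent turn.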
    www.tmdevlab.com 19 hours ago
184.  HN Show HN: Rot – Financial Intelligence MCP Server
"Rot," a new Model Context Protocol (MCP) server, has been introduced to harness financial intelligence by utilizing Reddit's retail sentiment for generating options trading signals. This tool empowers AI assistants to function as advanced financial advisors through real-time data access and natural conversational delivery of structured investment insights. With an extensive 185,000 lines of code and a nine-stage AI pipeline, Rot launched with immediate adoption from 90 users on its first day. By making sentiment analysis available freely via Reddit—a resource typically monetized by Wall Street firms—Rot achieved rapid growth in five days, evidenced by 9,000 GitHub clones and an impressive 18.4% conversion rate of visitors to sign-ups. Performance metrics indicate a robust 52% win rate for live trades, compared to a backtest result of 58.8%, acknowledging concerns about overfitting typically associated with financial models. Rot stands out as the first MCP server to integrate financial intelligence into AI interactions, allowing users to query market activities and receive direct trading signals from their AI tools. This innovative approach distinguishes Rot in the field of financial technology, making it a pioneering solution for real-time investment insights through AI-enhanced platforms. For further details or access, visitors can explore [Rot's MCP Server](https://web-production-71423.up.railway.app/mcp-server). Keywords: #phi4, AI assistants, AI pipeline, MCP server, Model Context Protocol, Reddit, external data sources, financial intelligence, natural conversation, sentiment, signals, trading signals, unusual activity alerts
    web-production-71423.up.railway.app 19 hours ago
246.  HN The Model Context Protocol Book
"The Model Context Protocol (MCP) Book" is an extensive guide aimed at developers seeking to build and deploy MCP servers and clients, based on an open standard by Anthropic introduced in November 2024. Designed for backend, full-stack developers, technical leads, and those interested in AI agent integration processes like Claude's, it requires no previous MCP knowledge but suggests proficiency in JSON, APIs, and languages such as TypeScript or Python. The book spans 18 chapters, offering a linear learning path from basic concepts to advanced deployment strategies, covering architecture, wire protocols, resource management, transport methods, server/client construction in TypeScript and Python, SDKs, configuration, security, testing, debugging, and deployment. Each chapter is self-contained, allowing readers to focus on specific topics such as protocol details or practical coding exercises. The book aims to equip readers with the knowledge to integrate MCP into existing products, evaluate its application within organizations, and explore future developments in the ecosystem. It aligns with the current MCP specification revision dated 2025-11-25, providing resources at modelcontextprotocol.io and source code on GitHub, where users can contribute or report issues under an open-source license. Keywords: #phi4, AI applications, APIs, JSON, MCP, Model Context Protocol, Python, SDKs, TypeScript, architecture, clients, deployment, ecosystem, open standard, security, servers
    cloudstreet-dev.github.io a day ago
251.  HN Route 5k MCP endpoints through a single LLM tool
MCP Fusion is a TypeScript framework engineered to optimize the routing of over 5,000 endpoints through a single Large Language Model (LLM) by addressing common issues such as context exhaustion and routing confusion found in standard Model Context Protocol (MCP) servers. The framework achieves this through efficient consolidation of related operations into fewer tools, thereby minimizing token usage, preventing hallucinations, and simplifying server code. Key features of MCP Fusion include build-time multiplexing and context gating to group similar operations under a single tool, reducing the number of tools seen by the LLM. It implements a 3-layer context gating strategy for effective token management, ensuring scalability and efficiency. Pre-compiled middleware enables zero runtime overhead by compiling middleware chains at build time. The framework employs Token-Oriented Object Notation (TOON) to optimize description tokens and utilizes Zod's merge and strip functionalities for type-safe schema composition. It also supports hierarchical grouping and tag filtering for modular action organization, alongside selective tool exposure based on tags. MCP Fusion emphasizes immutability after build through freeze-after-build techniques to prevent post-registration mutations and isolates errors to enhance debugging capabilities. Architecturally, it includes a domain model layer with hierarchical entity management and a build-time strategy engine that supports features such as bidirectional converters, annotation aggregation, and schema collision detection. Comprehensive documentation is provided in official guides covering aspects from getting started to architecture details, scaling strategies, middleware patterns, introspection API usage, and APIs for enterprise compliance and auditing. Overall, MCP Fusion aims to streamline large-scale MCP environments by ensuring efficient LLM tool routing, enhancing security boundaries, and reducing operational complexity. 
Keywords: #phi4, LLM, MCP, TOON, TypeScript, Zod, build-time engine, context collapse, domain model, endpoints, error isolation, framework, hierarchical grouping, introspection API, mcp-fusion, middleware, multiplexing, schema, strategy pattern, tag filtering, token optimization, tool consolidation
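The consolidation idea is that many related endpoints become actions of one multiplexed tool, so the LLM sees a single `github` tool instead of dozens. A sketch of such a dispatcher; the action names and handlers are invented and this is not mcp-fusion's API.

```python
# Sketch of tool multiplexing: one tool, many actions. The LLM's tool list
# shrinks from N entries to one schema with an `action` parameter, and
# unexposed operations are simply absent. All names are invented.

GITHUB_ACTIONS = {
    "create_issue": lambda args: f"issue '{args['title']}' created",
    "close_issue":  lambda args: f"issue #{args['number']} closed",
    "list_repos":   lambda args: ["mcp-fusion", "demo"],
}

def github_tool(action, **args):
    handler = GITHUB_ACTIONS.get(action)
    if handler is None:
        return {"error": f"unknown action '{action}', expected one of {sorted(GITHUB_ACTIONS)}"}
    return {"result": handler(args)}

print(github_tool("create_issue", title="context collapse"))
print(github_tool("delete_repo", name="demo"))  # gated: not an exposed action
```

The error branch doubles as context gating: an action left out of the table cannot be called, whatever the model hallucinates.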
    github.com a day ago
290.  HN Show HN: CasperAI – A local MCP server for cross-platform engineering context
CasperAI is designed as a local Model Context Protocol (MCP) server that centralizes and indexes data across various development platforms, creating bidirectional links with source code to enrich engineering context. It integrates tools such as Slack, GitHub, Jira, GitLab, Sentry, Datadog, and Notion, offering semantic search capabilities that combine team discussions, code references, project management contexts, and documentation into a unified layer. CasperAI's key features include local data storage using SQLite for privacy compliance, cross-platform search for comprehensive context retrieval, and regex-based code mapping to extract code references from natural language inputs like Slack messages. The system emphasizes security with measures like PII redaction and secure authentication practices. Developed rapidly with tools such as Claude Code, CasperAI currently uses regex for its versatility but plans future enhancements using AST-based symbol resolution. Commercially, it includes metering, device identification, telemetry options, and tiered licensing to accommodate varied usage needs. CasperAI's architecture consists of components like the MCP server, security gatekeeper, PII redactor, and SQLite storage, forming a cohesive environment for managing engineering context. The project encourages community contributions, offers comprehensive documentation, and outlines future developments such as web UI enhancements, real-time indexing, advanced analytics dashboards, and cloud deployment templates. Keywords: #phi4, CasperAI, Claude Code, FTS5, MCP server, PII redaction, SQLite database, Slack integration, codebase linking, knowledge context, local storage, multi-platform indexing, regex pattern matching, semantic search
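The regex-based code mapping described here (pulling code references out of natural-language Slack messages) can be sketched with two patterns, one for file paths and one for function-style symbols. Both patterns are illustrative, not CasperAI's actual set.

```python
import re

# Sketch of regex-based code mapping: extract likely code references from a
# natural-language message so discussions can be linked to source files.
# The two patterns are illustrative stand-ins for CasperAI's real ones.

FILE_RE = re.compile(r"\b[\w./-]+\.(?:py|ts|go|java|rb)\b")
SYMBOL_RE = re.compile(r"\b[A-Za-z_]\w*\(\)")

def extract_refs(message):
    return {"files": FILE_RE.findall(message),
            "symbols": SYMBOL_RE.findall(message)}

msg = "The bug is in src/auth/session.py, probably refresh_token() again"
print(extract_refs(msg))
```

This is the cheap, cross-language approach the project chose over AST parsing: no per-language tooling, at the cost of occasional false positives.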
    github.com a day ago
305.  HN Universal Commerce Protocol (UCP)
The Universal Commerce Protocol (UCP) is an open-source initiative developed by Google in partnership with major industry players such as Shopify, Etsy, Wayfair, Target, and Walmart. Its primary objective is to enhance the landscape of agentic commerce by streamlining interactions across consumer interfaces, businesses, and payment providers via a unified language and functional primitives. UCP not only supports existing retail systems but also integrates seamlessly with protocols like Agent Payments Protocol (AP2). It ensures secure transactions through APIs, Agent-to-Agent communications, and the Model Context Protocol. For businesses, UCP offers the ability to present their products across various consumer platforms such as Google Search's AI Mode and Gemini app, thereby maintaining flexibility in the checkout experience. This protocol simplifies the integration process for AI platforms by providing standardized APIs while allowing flexibility with existing frameworks like MCP and A2A. Developers are encouraged to contribute to this evolving, community-driven standard. Payment providers gain from UCP through its modular payment handler design that facilitates interoperability and secure transactions, backed by cryptographic proof of user consent. Meanwhile, consumers benefit from a seamless shopping experience characterized by trusted brands, ensuring value and confidence in their purchases. UCP addresses traditional tech infrastructure challenges by reducing integration complexity via a single integration point, promoting cross-platform interoperability through shared language, and offering an extensible architecture that adapts to new agentic experiences. Security is paramount with tokenized payments and verifiable credentials, supported by various transport methods including A2A, MCP, and APIs. 
Implementing UCP involves setting up business servers for API hosting, adding sample products, preparing for agent interactions, discovering business capabilities, initiating checkout sessions, and applying discounts. This dynamic discovery of features and endpoints eliminates the need for hard-coded integrations. Google's reference implementation of UCP facilitates seamless purchases across its conversational platforms, including AI Mode in Search and Gemini, utilizing Google Pay. In summary, UCP empowers businesses, developers, payment providers, and consumers by streamlining commerce interactions, enhancing security measures, and supporting diverse agentic experiences across various platforms. Keywords: #phi4, A2A, AI Mode, AP2, APIs, Adyen, Agent Payments Protocol (AP2), American Express, Best Buy, Etsy, Flipkart, Gemini app, Google, Google Pay, JSON manifest, MCP, MCP bindings, Macy's Inc, Mastercard, Merchant of Record, Model Context Protocol (MCP), N x N integration bottleneck, REST API, SQLite database, Shopify, Shopify Pay, Stripe, Target, The Home Depot, UCP, Universal Commerce Protocol, Visa, Walmart, Wayfair, Zalando, agent communication, agent frameworks, agentic commerce, agentic shopping, applied discounts, business capabilities, business logic, business server, buyer information, cart checkout, checkout experience, checkout session, checkout-sessions, consumer interfaces, cryptographic proof, currency, digital commerce, discount codes, discounts, dynamic pricing, idempotency-key, instant transactions, interoperability, inventory checks, line_items, links, mock_payment_handler, open-source, payment handlers, payment instruments, payment methods, product discovery, request-id, sample products, security-first approach, status, tokenized payments, totals, verifiable credentials
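The checkout-session step can be sketched with a hypothetical request body. Field names below (line_items, totals, idempotency-key) follow the keywords in the summary but are illustrative, not the published UCP schema:

```python
import uuid

# Hypothetical sketch of a UCP-style checkout session body; field names are
# assumptions drawn from the keyword list, not the actual specification.
def build_checkout_session(line_items, currency="USD"):
    total = sum(item["unit_price"] * item["quantity"] for item in line_items)
    return {
        # an idempotency key lets a retried request be applied exactly once
        "idempotency-key": str(uuid.uuid4()),
        "currency": currency,
        "line_items": line_items,
        "totals": {"grand_total": total},
    }

session = build_checkout_session(
    [{"sku": "LAMP-01", "unit_price": 4999, "quantity": 2}]
)
print(session["totals"]["grand_total"])  # 9998
```

The idempotency key is the piece that makes agent-driven retries safe: a duplicate submission with the same key should not create a second order.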
    developers.googleblog.com a day ago
372.  HN Show HN: MCP Codebase Index – 87% fewer tokens when AI navigates your codebase
The MCP Codebase Index enhances AI coding assistants' navigation through large codebases by significantly reducing token usage in queries (by 87% on average). This tool parses code into structural metadata, including functions, classes, imports, and dependency graphs, and provides 17 query tools via the Model Context Protocol (MCP) for efficient codebase exploration. It supports multiple programming languages like Python, TypeScript/JavaScript, and Markdown using Python's `ast` module and regular expressions, with no runtime dependencies beyond requiring Python 3.11 or higher. The tool is easily installable via pip with the command `pip install "mcp-codebase-index[mcp]"`, while omitting `[mcp]` allows for programmatic API use without an MCP server. For persistent connections, it integrates with OpenClaw through `openclaw-mcp-adapter` and offers configuration options via `.mcp.json` or directly in the Python module. The development of this tool is rooted in the RMLPlus project and incorporates the Recursive Language Models framework. It supports dual licensing: AGPL-3.0 for open-source use, with a commercial license required for proprietary applications. Developers can install the project locally using `pip install -e ".[dev,mcp]"` and employ pytest alongside ruff for testing and code quality checks. Keywords: #phi4, AI coding assistants, Claude Code configuration, MCP Codebase Index, MCP server, Model Context Protocol, OpenClaw integration, Python AST, development, dual-licensed, installation, language support, performance note, programmatic usage, query tools, regex, structural metadata, token reduction
    github.com a day ago
   https://github.com/MikeRecognex/mcp-codebase-index   a day ago
   https://lftw.dev   a day ago
373.  HN Show HN: MCP Storage Map – One MCP Server for MySQL, MongoDB, and Athena
The MCP Storage Map is an open-source server developed using TypeScript to facilitate querying multiple databases through a unified interface, supporting MySQL, MongoDB, and AWS Athena. Designed for simplicity, it allows AI assistants like Claude or Cursor to interact with these databases without handling separate connections. A key feature is its read-only access by default, enhancing security by requiring explicit permission for write operations. The server offers several essential features: a unified querying toolset across various database technologies, management of multiple simultaneous connections tagged as PROD, STAGING, etc., and extensibility via the McpConnector interface to integrate new database connectors effortlessly. Installation is straightforward using npm, with configuration relying on setting environment variables for each connection. The architecture of MCP Storage Map consists of a central server implementing tools such as query execution, collection listing, and more, while specific connectors adhere to the McpConnector interface, tailored to supported databases like MySQL, MongoDB, and Athena. Security practices emphasize using environment variables to handle sensitive data, maintaining write access as disabled unless explicitly needed. Development guidelines include steps for cloning the repository, installing dependencies, running in development mode, building, testing, and linting the project. The server is released under an MIT license, promoting open-source collaboration and usage flexibility. Keywords: #phi4, AI assistants, Athena, MCP Storage Map, MIT license, MongoDB, MySQL, TypeScript, configuration, database connectors, development, environment variables, extensible architecture, multiple connections, read-only, unified interface
    github.com a day ago
391.  HN Show HN: Neko – AI agent runtime that fits on a Raspberry Pi Zero 2W
Neko is an AI agent runtime optimized for low-cost hardware such as the Raspberry Pi Zero 2W or budget VPS, operating as a single static binary written in Rust. It efficiently manages memory through file-based storage using markdown files, supporting both short-term and long-term data retention with mechanisms to prevent data bloat. Neko integrates seamlessly with external tools via the Model Context Protocol (MCP) and enables user interaction through Telegram messaging support. Key features of Neko include compatibility with OpenResponses-compatible LLM backends such as OpenAI or Ollama, enabling robust language model interactions. It supports file-based memory operations such as write, replace, and search using markdown files. The system allows the scheduling of tasks via cron jobs, which can be set for recurring or one-time execution, delivering results through various channels. Neko's architecture includes support for AgentSkills.io-compatible skills, defined in SKILL.md files with YAML frontmatter, enhancing its extensibility and functionality. Additionally, it facilitates user interaction via a Telegram bot, providing an accessible interface for communication. Neko also offers a sandboxed environment for Python code execution, ensuring safe operation. The installation and configuration of Neko are straightforward, supporting platforms like Linux and macOS. Users can manage configurations and memory through simple command-line instructions, making Neko an attractive solution for those in need of a lightweight yet capable AI agent system. Keywords: #phi4, AI agent, MCP tool support, Neko, OpenResponses-compatible LLM, Raspberry Pi Zero 2W, Rust, Telegram integration, VPS, cron jobs, file-based memory, markdown files, memory management, sandboxed Python, static binary
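A SKILL.md with YAML frontmatter, as described above, might look like the following. The frontmatter fields follow the common AgentSkills convention (name, description) and are assumptions, not taken from Neko's documentation:

```markdown
---
name: morning-briefing
description: Summarize overnight messages and post a digest each morning
---

# Morning Briefing

When triggered by the scheduled cron job, search memory for messages received
overnight, summarize them with the configured LLM, and deliver the digest via
the Telegram channel.
```

The frontmatter gives the runtime machine-readable metadata for skill discovery, while the markdown body carries the natural-language instructions the agent follows.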
    github.com a day ago
396.  HN Show HN: The first financial intelligence MCP server live trading signals Claude
The announcement introduces a Model Context Protocol (MCP) server developed by Mattbusel that provides real-time financial intelligence to AI clients such as Claude. The server delivers trading signals sourced from Reddit, SEC filings, FDA approvals, and Congressional trades, designed for seamless integration without the need for API keys or installations; users can simply input a URL into their Claude Desktop configuration. Built with Python/FastMCP and hosted on Railway, this server is part of the ROT (Reddit Options Trader) platform, which was developed in nine days and comprises a 165K-line codebase. The system processes social media data through a nine-stage AI pipeline to generate actionable trading signals. By utilizing the open-standard protocol, the MCP server allows AI assistants to access current financial data and insights, thereby enhancing their ability to provide live market information during conversations. Further details on this project can be found on GitHub. Keywords: #phi4, AI assistants, AI pipeline, Congressional trades, FDA approvals, FastMCP, GitHub, MCP server, Model Context Protocol, Python, Python/FastMCP, ROT, Railway, Reddit, SEC filings, external data sources, financial intelligence, live trading signals, sentiment data, tools, unusual activity alerts
    web-production-71423.up.railway.app a day ago
398.  HN Access public data insights faster: Data Commons MCP is now hosted on GCloud
In September 2025, Data Commons launched its Model Context Protocol (MCP) server on Google Cloud Platform to address challenges in AI agent interactions with its data, which were previously managed through local Python environments via a Gemini CLI extension. This shift to a hosted service was driven by the need for compatibility with high-security settings and scalable hosting solutions. The new web-hosted MCP service eliminates concerns about environment setup and security compliance, allowing seamless connection for users. It supports natural language queries to extract insights from trusted data sources. Existing users of the Gemini CLI extension are automatically transitioned to this cloud-based version, while new users require a free API key and configuration updates for access. This strategic move ensures improved scalability, enhanced security, and streamlined user experience in accessing Data Commons' resources. Keywords: #phi4, AI, AI agents, API key, Analyst insights, Configuration, Data Commons, Data exploration, Developer tools, Free service, GCloud, Gemini CLI, Google Cloud Platform, High-level questions, LLM, Local server, MCP, Natural language, Python, Python environments, Query agents, Resource management, Scalability, Security, Security compliance, Statistical answers, Trusted sources, Version releases
    developers.googleblog.com a day ago
   https://datacommons.org   a day ago
   https://github.com/datacommonsorg/agent-toolkit   a day ago
   https://github.com/datacommonsorg/agent-toolkit/bl   a day ago
411.  HN Forge: Scalable Agent RL Framework and Algorithm
The Forge framework addresses scalability challenges in reinforcement learning (RL) for complex agents by balancing system throughput, training stability, and agent flexibility through innovative architecture and engineering optimizations. Its decoupled design separates reasoning logic from infrastructure, allowing seamless integration across diverse agents and scalable training over numerous environments without internal changes. In the RL paradigm, Forge supports white-box agent RL by treating context management as a functional action for long-horizon tasks while enabling black-box RL with arbitrary architectures. Engineering strategies such as the Windowed FIFO scheduling method optimize throughput and consistency, and prefix tree merging reduces redundancy in multi-turn dialogue training. For inference acceleration, speculative decoding, heterogeneous processing disaggregation, and a global L3 cache pool enhance performance. The CISPO algorithm is tailored for long-horizon agents with mixed-domain training to improve generalizability, coupled with a composite reward framework that provides dense feedback and stabilizes optimization. These innovations culminate in the MiniMax M2.5 model, showcasing significant advancements in real-world agent productivity and supporting scalable RL systems capable of managing complex tasks. Keywords: #phi4, Agent Flexibility, Black-box Agents, CISPO Algorithm, Composite Reward Framework, Context Management, Forge, Hybrid Scheduling, Inference Acceleration, MiniMax M25, Prefix Tree Merging, RL Framework, Scalable RL, System Throughput, Training Stability
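The prefix-tree merging idea above can be illustrated with a toy sketch: multi-turn rollouts that share a conversation prefix are stored (and reprocessed) once rather than per rollout. This is purely illustrative of the idea, not Forge code:

```python
# Toy illustration of prefix merging: count how many leading turns two
# rollouts share, so the shared prefix can be stored and encoded once.
def shared_prefix_len(a: list[str], b: list[str]) -> int:
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

rollout_1 = ["sys", "user:hi", "asst:hello", "user:weather?"]
rollout_2 = ["sys", "user:hi", "asst:hello", "user:news?"]
print(shared_prefix_len(rollout_1, rollout_2))  # 3
```

In a real trainer the shared turns would be nodes in a prefix tree with cached activations; the saving grows with the number of rollouts branching from each shared dialogue state.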
    www.minimax.io a day ago
468.  HN WebMCP Proposal
The WebMCP Proposal introduces a JavaScript API aimed at integrating web applications with AI agents through natural language commands, developed by the Web Machine Learning Community Group as part of their community initiatives rather than an official W3C Standard. This specification enables developers to transform web app functionalities into "tools" defined in JavaScript with structured schemas and descriptions accessible via natural language. These tools can interact with AI agents, browser extensions, or assistive technologies, positioning websites as Model Context Protocol servers for client-side implementation. The proposal defines key terminology: an agent is an autonomous assistant leveraging large language models to communicate through chat interfaces, which can be integrated into browsers through extensions provided by platforms like OpenAI and Google. The API enhances the Navigator interface with a `ModelContext` to manage tools using methods such as `provideContext`, `clearContext`, `registerTool`, and `unregisterTool`. Each tool is identified by unique identifiers, descriptions, input schemas, execution callbacks, and optional annotations. Further details include various interfaces: the extended `Navigator` interface provides access to the `ModelContext`; `ModelContext` handles registration and context management; `ModelContextOptions & ModelContextTool` outline tool collections and metadata; and `ModelContextClient` supports user interaction during execution. The proposal acknowledges contributors for foundational work and collaborative efforts within the community group, aiming to facilitate seamless interactions between users and AI agents by leveraging existing web application logic while ensuring context and control are maintained. 
Keywords: #phi4, AI agents, AI platform, API, JavaScript, ModelContext, Navigator interface, Web Machine Learning Community Group, WebMCP, accessibility, browser's agent, execute callback, privacy, security, tools, user interaction
    webmachinelearning.github.io 2 days ago
   https://developer.chrome.com/blog/webmcp-epp   2 days ago
   https://github.com/webmachinelearning/webmcp?tab=readme   2 days ago
   https://github.com/MiguelsPizza/WebMCP   2 days ago
   https://github.com/jasonjmcghee/WebMCP   2 days ago
   https://www.youtube.com/watch?v=sOPhVSeimtI   2 days ago
   https://www.youtube.com/watch?v=02O2OaNsLIk   2 days ago
   https://moltbook.com/skill.md   2 days ago
   https://datatracker.ietf.org/doc/html/rfc8890   a day ago
   https://bsky.app/profile/chrisshank.com/post/   a day ago
476.  HN MCP and REST Face-Off
The Model Context Protocol (MCP) and REST serve as distinct paradigms in API design, each with its unique attributes tailored for different contexts of use. REST has been the prevailing standard for over a decade, characterized by its static, fixed-route interactions suitable primarily for human-machine interfaces; however, it encounters limitations when interfacing with AI agents due to its rigid structure. In contrast, MCP is specifically engineered for Large Language Models (LLMs), offering an adaptable framework that enables more intuitive and dynamic interaction with digital tools. Key distinctions between the two approaches are notable in several areas. Firstly, REST is primarily designed with developers in mind, providing a static interface, whereas MCP caters to AI models requiring flexibility for tool exploration. In terms of interaction modes, REST relies on synchronous exchanges following a fixed script, while MCP facilitates asynchronous communication and continuous dialogue, allowing servers and clients to engage more fluidly. Another significant difference lies in discovery and integration; MCP servers are self-describing and automatically furnish AIs with tools and resources, thereby eliminating the need for manual "glue code," unlike REST which demands extensive documentation. Moreover, the data lifecycle under each protocol varies considerably. REST operations are characterized by isolated requests with rigid transactions, whereas MCP supports ongoing conversations where servers can suggest additional actions or request further context from clients. The transport layer also differentiates them; while REST is intrinsically linked to HTTP and suited for open web environments, MCP operates over standard input/output, enhancing security and flexibility in local development settings. 
Overall, the advent of MCP represents a paradigm shift from merely integrating APIs towards enabling meaningful interactions that allow AI agents to execute diverse tasks beyond conventional dialogues. This innovative approach facilitates more effective and versatile tool use by AI models, expanding their functional capabilities. Keywords: #phi4, AI agents, API, HTTP, Large Language Models, MCP, Model Context Protocol, REST, asynchronous flow, calendar, data lifecycle, datasets, debugging, differences, integration, interaction, internet, local development, panel, self-discovery, standard input/output, toolsets
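The self-description contrast above can be made concrete with a toy sketch: an MCP-style server publishes its own tool catalog, so a client discovers capabilities at runtime instead of relying on hand-written glue code per endpoint. The class below is an illustration of the pattern, not a real MCP implementation:

```python
# Toy illustration of self-describing tool discovery: the server exposes a
# catalog (list_tools) that an agent reads before deciding what to call.
class ToyMCPServer:
    def __init__(self):
        self._tools = {}

    def tool(self, name, description):
        """Decorator that registers a function as a named, described tool."""
        def register(fn):
            self._tools[name] = {"description": description, "fn": fn}
            return fn
        return register

    def list_tools(self):
        # what an agent would call first, instead of reading documentation
        return {n: t["description"] for n, t in self._tools.items()}

    def call_tool(self, name, **kwargs):
        return self._tools[name]["fn"](**kwargs)

server = ToyMCPServer()

@server.tool("get_events", "List calendar events for a given day")
def get_events(day):
    return [f"standup on {day}"]

print(server.list_tools())
print(server.call_tool("get_events", day="2026-02-18"))
```

With REST, the equivalent knowledge (routes, parameters, semantics) lives in out-of-band documentation; here it travels with the server, which is what eliminates the glue code the article mentions.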
    ilearnt.com 2 days ago
493.  HN Making MCP Servers Work with Microsoft Entra ID on Azure
Deploying an MCP (Model Context Protocol) server on Azure with Microsoft Entra ID authentication requires addressing several compatibility challenges between OAuth standards outlined by the MCP specification and those implemented by Microsoft. This process is facilitated through a lightweight OAuth compatibility layer integrated within the MCP server, consisting of five proxy endpoints that manage tasks such as metadata translation, mock client registration, scope rewriting for authorization and token requests, and generating correctly formatted 401 responses. The solution tackles issues like mismatched discovery formats, unsupported dynamic client registration, non-standard scope formats, and Azure Container Apps' Easy Auth blocking OAuth discovery endpoints. This compatibility layer enhances security with measures including a "Deny by Default" identity model, path normalization to prevent jailbreak attempts, and strict host validation to mitigate SSRF and Open-Redirect vulnerabilities. The article provides an in-depth guide for deploying this solution on Azure, detailing the necessary steps like Entra ID app registration and configuring the OAuth layer within a Python-based MCP server using FastMCP with Starlette or FastAPI. It includes insights gained from multiple debugging cycles and advice on avoiding common pitfalls such as aggressive Docker image caching by Azure Container Apps. Additionally, it discusses strategies for handling silent errors encountered during deployment. Furthermore, the accompanying repository offers comprehensive step-by-step instructions, decision records, a minimal example server, and reference code to facilitate seamless integration into existing projects. This resource is particularly valuable for developers constructing MCP servers on Azure accessed through Cursor IDE, ensuring robust authentication flows and security measures are in place. 
Keywords: #phi4, API Management, Authentication, Azure, Compatibility Layer, Cursor IDE, Deployment Guide, MCP Servers, Microsoft Entra ID, OAuth, OpenID Connect, Proxy Endpoints, Rate Limiting, Zero-Trust Security
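The scope-rewriting step in the compatibility layer can be sketched as follows. Entra ID expects scopes qualified by an application ID URI (the `api://…` form), while MCP clients tend to send bare scope names; the URI and scope names below are placeholders, not a real registration:

```python
# Illustrative scope-rewriting helper in the spirit of the compatibility
# layer described above. APP_ID_URI and the scope names are placeholders.
APP_ID_URI = "api://example-app-id"

def rewrite_scopes(requested: str) -> str:
    """Qualify bare scopes; pass through OIDC scopes and already-qualified ones."""
    passthrough = {"openid", "profile", "email", "offline_access"}
    rewritten = []
    for scope in requested.split():
        if scope in passthrough or "://" in scope:
            rewritten.append(scope)
        else:
            rewritten.append(f"{APP_ID_URI}/{scope}")
    return " ".join(rewritten)

print(rewrite_scopes("openid mcp.read"))  # openid api://example-app-id/mcp.read
```

The real proxy applies this translation on both the authorization and token requests, which is why the article lists scope rewriting twice among the five endpoints' responsibilities.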
    ignitionai.xyz 2 days ago
496.  HN Anthropic opens Bengaluru office and announces new partnerships across India
Anthropic has established a significant presence in India with a new office in Bengaluru, underscoring its commitment to expanding partnerships across enterprise, education, agriculture, and public sectors. As the second-largest market for Claude.ai, the platform is widely used by Indian developers for technical tasks, highlighting the region's robust engagement with AI technology. Irina Ghose, Managing Director of India at Anthropic, recognizes India's potential in responsible AI development due to its strong digital infrastructure and skilled workforce. To enhance accessibility and relevance, Anthropic is improving AI performance in local languages through collaborations that focus on high-quality training data and task evaluations relevant to Indian contexts. The company has forged strategic partnerships with major enterprises like Air India and Cognizant for software modernization, while startups such as Razorpay and Enterpret are integrating Claude.ai into their operations to boost features and capabilities. In the education sector, Anthropic collaborates with Pratham to pilot AI-powered testing tools aimed at enhancing learning for low-income students. Additionally, it partners with Central Square Foundation to leverage EdTech and AI for primary school children in underserved areas. Public sector initiatives include working with EkStep Foundation on agricultural projects via OpenAgriNet and supporting Adalat AI’s efforts to improve judicial service access through a national WhatsApp helpline powered by Claude.ai. Anthropic has also introduced open-source standards like the Model Context Protocol, now employed by the Indian government for accessing national statistics. As Anthropic continues to grow its footprint in India, it focuses on expanding partnerships and hiring local talent, promoting widespread adoption of AI technologies across diverse sectors. 
Keywords: #phi4, AI, Adalat AI, Anthropic, Bengaluru, Bharat Digital, Central Square Foundation, Claudeai, EkStep Foundation, India, Intelehealth, Irina Ghose, MoSPI, Model Context Protocol (MCP), Noora Health, OpenAgriNet, Pratham, Swiggy, agriculture, digital infrastructure, education, enterprise, language capabilities, open-source standards, partnerships, public sector, startups
    www.anthropic.com 2 days ago
559.  HN Free SQL Server Performance Monitoring That Doesn't Suck – Darling Data
Darling Data has launched a free, open-source tool for monitoring SQL Server performance, available on GitHub as an alternative to costly enterprise solutions. This tool comes in two editions: the Full Edition and Lite Edition. The Full Edition installs a PerformanceMonitor database on each server with T-SQL collectors executed through SQL Agent, offering data visualization via a WPF Dashboard specifically for monitored servers. It includes over 30 specialized T-SQL collectors, community tools like sp_WhoIsActive, NOC-style landing pages, automatic retention settings, real-time alerts, AI-powered analysis using an MCP server, and comprehensive data collection capabilities. The Lite Edition functions as a standalone desktop application, enabling remote monitoring without installing on target servers. It queries DMVs over the network, storing data locally in DuckDB with Parquet archival, supporting more than 20 collectors, Azure SQL Database, and including an MCP server for AI analysis. This edition is tailored for quick triage, consultants, and environments where installation isn't feasible. Both editions prioritize security through Windows Credential Manager for password storage, defaulting to TLS with certificate validation, and using parameterized queries without relying on cloud services or remote data transmission. Darling Data's tool targets solo DBAs, small teams, consultants, contractors, and developers who need an affordable solution offering detailed insights into SQL Server performance without extensive installation requirements. Setting up the Full Edition involves installing the PerformanceMonitor database on servers, while the Lite Edition is straightforward to deploy by downloading, extracting, and connecting to servers. The tool aims to enhance understanding of SQL Server issues through meaningful data visualization and analysis, eschewing the complexities or costs of traditional enterprise solutions. 
Supported under an MIT License, it is compatible with SQL Server versions 2016 through 2025 and various cloud databases. Keywords: #phi4, AI Analysis, Azure SQL Database, Community Tools, Consultants, DMVs, Data Visualization, Developers, DuckDB, Free Tool, Full Edition, GitHub, Lite Edition, MCP Server, No Cloud Dependency, Open Source, Parquet Archives, Performance Monitoring, Real-Time Alerts, SQL Agent, SQL Server, Security, Solo DBAs, T-SQL Collectors
    erikdarling.com 2 days ago
577.  HN Show HN: Gulama – Security-first open-source AI agent (OpenClaw alternative)
Gulama is an open-source personal AI agent developed with a strong emphasis on security, offering itself as a superior alternative to less secure options like OpenClaw. Created by a seasoned security engineer, it prioritizes the protection of user data across various domains including files, emails, and credentials. The platform features over 15 robust security mechanisms such as AES-256-GCM encryption, sandboxed execution using technologies like bubblewrap/Docker, policy engines, and egress filtering to prevent unauthorized data access or leaks. In terms of functionality, Gulama provides a wide array of built-in skills that cover files, shell operations, web browsing, email handling, calendar management, and integration with platforms such as GitHub and Notion. It supports over 100 LLM providers and offers communication across ten channels including CLI, Telegram, Discord, Slack, and WhatsApp. Additional capabilities include multi-agent orchestration, task scheduling, voice wake word activation, retrieval-augmented generation (RAG)-powered memory, AI-powered browsing, self-modifying skills, and live debug streams. Gulama's design ensures flexibility by being compatible with multiple operating systems like macOS, Windows, Linux, and Docker, and it can also run on ARM architectures. This enables users to maintain data within environments they control, offering varied autonomy levels from full manual oversight to complete automation. The installation process is user-friendly, supporting both pip and Docker methods, which cater to preferences for local setups or containerized deployments. Comprehensive guides are available, including instructions for obtaining API keys from various LLM providers such as DeepSeek, Groq, OpenAI, Anthropic, Google, and Ollama. Compared to its predecessor OpenClaw, Gulama distinguishes itself by embedding a multitude of security measures directly into its architecture.
While OpenClaw had vulnerabilities like binding to 0.0.0.0, Gulama enforces secure defaults including loopback-only bindings, sandboxing techniques, policy engines, and Ed25519-signed skills. The project is open for community contributions with detailed development setup guidelines available in its repository. It encourages participation through the GulamaHub skill marketplace, where users can either install or publish their own Ed25519-signed skills. In essence, Gulama stands as a robust alternative to existing AI agents by integrating comprehensive security features from inception while maintaining flexibility and advanced functionalities for personal use. Keywords: #phi4, AES-256-GCM, AI agent, ChromaDB, DLP, Docker, FastAPI, Gulama, LLM providers, LiteLLM, RAG memory, REST API, WebSocket, canary tokens, communication channels, egress filtering, encryption, multi-agent orchestration, open-source, policy engine, sandboxing, security-first, self-modifying skills, skill marketplace, task scheduler, voice wake word
    github.com 2 days ago
579.  HN Show HN: Open API for AI agents to search 29k+ declassified docs
The DeclassFiles Intelligence Network (DIN) serves as an open API platform that empowers AI agents to autonomously examine over 29,000 OCR'd full-text declassified U.S. government documents. It offers comprehensive capabilities for document search, research thread publication with citations, and interaction among agent findings, all without paywalls or third-party keys. Users can register AI agents via POST requests to obtain an API key necessary for executing various actions like searching documents by keywords or IDs through GET requests, posting detailed research threads, and managing these threads (including creation, replies, and upvotes) using POST requests. DIN's extensive document collections cover topics such as Epstein, the JFK assassination, and 9/11 incidents, with search functionality available via keywords or categories. The API features include capabilities for document retrieval, random discovery of documents, research thread management, network statistics access, and directory interaction. Notably, the platform has identified systemic patterns like institutional compartmentalization across different cases. Integration with MCP servers enables direct searches from AI IDEs, enhancing usability. Quality is ensured through strict citation practices using specific document IDs and evidence-based analysis, promoting a professional tone over speculation. A trust and reputation system assesses agents based on their activity levels and contributions to the network. DeclassFiles, known for being the largest searchable archive of declassified U.S. government documents, developed this platform, emphasizing open access and collaborative intelligence gathering. Keywords: #phi4, AI agents, API-first platform, DIN, DeclassFiles, Intelligence Network, MCP server, OCR processed, declassified documents, document citations, full-text search, network statistics, reputation system, research threads
    github.com 2 days ago
591.  HN OpenReview MCP server with Cursor integration
The OpenReview MCP server integrates with Cursor to provide access to research data from major machine learning conferences such as ICML, ICLR, and NeurIPS. The server can look up user profiles by email, retrieve papers by specific authors or conferences, and run keyword searches across multiple events with customizable match modes. It supports exporting search results as JSON for analysis or as PDFs for reading. Installation involves cloning the repository from GitHub, setting up a virtual environment, installing dependencies, and configuring Cursor via `mcp.json` with the necessary OpenReview credentials and server paths. Users can then query the server in natural language through Cursor to find specific papers or export them alongside their PDFs and extracted text. The system automatically fetches papers from OpenReview, searches titles, abstracts, and author lists, downloads PDFs, extracts their text, and saves the results to a specified directory. An example workflow uses `search_papers` to identify research on a particular topic and `export_papers` to save the relevant findings for further analysis or coding. The project is released under the MIT License. Keywords: #phi4, Cursor integration, JSON export, MCP server, OpenReview, PDF export, conference papers, configuration, installation, keyword search, natural language queries, paper retrieval, research analysis, user search
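The `mcp.json` configuration mentioned above might look like the sketch below. The `mcpServers`/`command`/`args`/`env` layout is Cursor's standard MCP configuration shape, but the server path and the exact environment-variable names are assumptions, not taken from this project's documentation.

```json
{
  "mcpServers": {
    "openreview": {
      "command": "/path/to/venv/bin/python",
      "args": ["/path/to/openreview-mcp/server.py"],
      "env": {
        "OPENREVIEW_USERNAME": "you@example.com",
        "OPENREVIEW_PASSWORD": "your-password"
      }
    }
  }
}
```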
    github.com 2 days ago
630.  HN Show HN: SkillSandbox – Capability-based sandbox for AI agent skills (Rust)
SkillSandbox is a capability-based runtime designed to enhance the security of AI agent skills through strict access controls and permissions, developed following the discovery of a credential-stealing skill on an AI marketplace. It utilizes YAML manifests allowing skills to declare required permissions, such as network access, filesystem paths, and environment variables, which are then enforced by the runtime using iptables, seccomp-bpf, and mount isolation. This tool provides additional security features including network egress filtering, environment variable whitelisting, resource limits like memory and execution time, and structured audit trails of skill executions. SkillSandbox integrates seamlessly with MCP servers to support sandboxing within AI frameworks such as Claude Code and supports OpenTelemetry for trace exports to observability tools like Jaeger. Complementing SkillSandbox, the AgentTrace project enhances policy compliance by tracking cumulative costs and violation counts over multiple sessions, forming a comprehensive security framework that not only restricts but also guides agent behavior. Built primarily for Linux environments using full kernel capabilities such as iptables and seccomp-bpf, SkillSandbox offers partial support on macOS through dry-run mode and recommends Docker for demonstrations due to its compatibility with necessary enforcement features. The project adopts the principle of "constrain what can be done" over relying solely on code integrity measures. Looking ahead, SkillSandbox's roadmap includes enhancements such as cgroup resource limits, unprivileged filesystem isolation, process-level isolation, container image support, and a lightweight WebAssembly runtime for executing simpler skills. This architecture aims to address current gaps in AI agent skill ecosystems by prioritizing execution-level security while facilitating integration with existing frameworks through an MCP server interface. 
Keywords: #phi4, AI agent skills, AgentTrace, Docker, Linux, MCP server, MITRE ATT&CK, OpenClaw, OpenTelemetry, Rust, SkillSandbox, WSL2, YAML, audit trail, capability-based runtime, code signing, credential stealer, enforcement, env vars, filesystem paths, iptables, macOS, manifest validation, mount isolation, network egress, observability, policy engine, runtime isolation, sandboxing, seccomp-bpf, threat classification, threat model, tracejson
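A skill manifest of the kind described above might look like the following sketch. The exact field names are assumptions, since the summary only states that skills declare network egress, filesystem paths, environment variables, and resource limits in YAML:

```yaml
# Hypothetical SkillSandbox manifest; field names are illustrative.
name: fetch-weather
permissions:
  network:
    egress:
      - api.weather.example:443   # only this host is reachable
  filesystem:
    read:
      - /tmp/skill-cache          # mount isolation limits paths
  env:
    - WEATHER_API_KEY             # whitelisted env vars only
limits:
  memory_mb: 128
  timeout_s: 30
```

Declaring capabilities up front is what lets the runtime translate them into iptables rules, seccomp-bpf filters, and mount isolation before the skill ever runs.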
    github.com 3 days ago
637.  HN WebMCP Proposal
The WebMCP Proposal outlines a JavaScript API designed to enable web applications to act as servers within the Model Context Protocol, facilitating interactions between users and AI agents through natural language and structured schemas. This initiative is developed by the Web Machine Learning Community Group and offers a framework for cooperative workflows involving users, browser-integrated agents, and assistive technologies, although it remains outside of the W3C Standards Track. Central to this proposal are several components: The WebMCP API itself provides a JavaScript interface allowing web applications to serve as Model Context Protocol servers. Agents in this context include autonomous assistants powered by large language models (LLMs) like OpenAI's ChatGPT, browser-integrated agents via extensions or native integration facilitating user-AI interactions, and AI platforms provided by companies such as OpenAI and Google. Security and accessibility considerations are identified as critical for the safe and inclusive implementation of WebMCP, though not extensively detailed in the proposal. The API extends the Navigator Interface to include a `ModelContext` object that manages tools accessible to agents. This interface offers several methods: `provideContext(options)` registers new tool contexts by clearing existing ones; `clearContext()` removes all registered tools; `registerTool(tool)` adds tools, ensuring they have unique names and valid schemas; `unregisterTool(name)` deletes specific tools. The proposal also defines essential dictionaries like `ModelContextOptions`, which lists tools with their unique properties, and `ModelContextTool`, detailing tool characteristics such as name, description, input schema, execution callback, and optional annotations (e.g., `readOnlyHint`). The `ModelContextClient` Interface enables asynchronous user interactions during the execution of these tools. 
The proposal acknowledges key contributors including Brandon Walderman, Leo Lee, Andrew Nolan, David Bokan, Khushal Sagar, Hannah Van Opstal, and Sushanth Rajasankar for foundational work, as well as Alex Nahas and Jason McGhee for implementation insights. Additionally, feedback from the Web Machine Learning Community Group significantly informed the proposal's development. Keywords: #phi4, AI agents, AI platform, API, JavaScript, ModelContext, Navigator interface, Web Machine Learning Community Group, WebMCP, accessibility, browser's agent, execute callback, privacy, security, tools, user interaction
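The tool registration described above can be sketched as follows. The tool object follows the proposal's `ModelContextTool` dictionary (name, description, input schema, execute callback, optional annotations); the todo-list behavior itself is purely illustrative.

```javascript
// Illustrative tool in the shape of the proposal's ModelContextTool
// dictionary; the todo-list behavior is hypothetical.
const todos = [];

const addTodoTool = {
  name: "add-todo",
  description: "Add an item to the user's todo list",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"],
  },
  annotations: { readOnlyHint: false },
  async execute({ text }) {
    todos.push(text);
    return { content: [{ type: "text", text: `Added: ${text}` }] };
  },
};

// Register only where a browser actually exposes the proposed API.
if (typeof navigator !== "undefined" && navigator.modelContext) {
  navigator.modelContext.registerTool(addTodoTool);
}
```

Because `registerTool` requires unique names and valid schemas, keeping the tool definition in a plain object like this makes it easy to validate before registration.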
    webmachinelearning.github.io 3 days ago
648.  HN Show HN: PolyMCP – A framework for structuring and orchestrating MCP agents
PolyMCP is an open-source framework designed to streamline the development and management of agents via the Model Context Protocol (MCP), focusing on enhancing the agent layer instead of merely exposing tools. It offers a structured approach by organizing agents effectively, linking them to multiple MCP servers, and ensuring workflow reliability in practical scenarios. Key features include implementing MCP-compatible tool servers using Python or TypeScript, providing an abstraction for connecting agents with diverse MCP endpoints like stdio and HTTP, and offering orchestration primitives for managing multi-step tasks. Additionally, PolyMCP includes a command-line interface (CLI) for project scaffolding and an inspector user interface (UI) to aid in debugging interactions. Its modular architecture supports skill composition and component reuse, significantly reducing the need for ad-hoc code by standardizing tool registration, agent attachment, execution flow management, and interaction inspection processes. The framework is MIT licensed and targets developers engaged in building production-grade automation systems, internal copilots, or multi-tool assistants, with its source available on GitHub at [PolyMCP GitHub Repository](https://github.com/poly-mcp/PolyMCP). Keywords: #phi4, CLI, GitHub, MCP agents, MIT licensed, Model Context Protocol, PolyMCP, Python, TypeScript, agent layer, automation, copilots, debugging, endpoints, execution flow, framework, modular architecture, open-source, orchestration, state management, tool servers
    news.ycombinator.com 3 days ago
649.  HN Shipping Htmx in Production (A Post-Mortem)
The article conducts an in-depth post-mortem analysis of implementing HTMX within the "Reddit Lead Qualification and Analysis System," comparing it to traditional React-based architectures. The system was designed to identify potential customers from Reddit posts, with initial challenges arising from frontend build pipelines and state synchronization between Python and TypeScript models. The decision to utilize HTMX stemmed from its ability to streamline development by eliminating redundant model definitions across languages and reducing infrastructure demands associated with Node.js. HTMX's implementation adhered to HATEOAS principles, allowing the backend to directly influence UI behavior, thus diminishing the need for intricate frontend state management. This approach facilitated a seamless autonomous lead qualification process through AI-driven stages while enabling low-latency dashboard interactions that minimized JavaScript dependencies. Key functionalities like semantic search and real-time polling pipelines highlighted HTMX’s capability in efficiently managing dynamic content updates. In comparison to frontend frameworks, HTMX substantially decreased development time and code footprint by integrating backend and frontend data layers, simplifying client-side state management which led to improved load times and reduced code volume. However, this shift transferred complexity to the server side, necessitating meticulous organization and error handling strategies. The production phase revealed that while HTMX simplified development workflows, it also introduced challenges such as increased server logic intricacy and potential latency issues due to its server-centric interaction model. In some instances, custom JavaScript interventions were required for improved interactivity and robust error management when used alongside libraries like Alpine.js. 
From a performance standpoint, the project showed that HTMX could sustain production-level loads effectively while enhancing bandwidth efficiency by utilizing the browser’s native HTML rendering capabilities. This approach simplified deployment processes relative to React-based solutions, thus reducing operational complexity. The article concludes with lessons learned and recommendations for developers considering HTMX in similar contexts. It is particularly suitable for SaaS applications where simplicity and rapid development cycles are essential, allowing a focus on solving business problems rather than frontend infrastructure management. The author suggests that HTMX can be an optimal choice for dashboard-driven systems where hypermedia provides an efficient path to feature delivery, advocating its adoption in scenarios prioritizing reduced complexity and accelerated development timelines. Keywords: #phi4, AI Pipeline, Alpine-js, Dashboard, FastAPI, HATEOAS, HTMX, Hypermedia, Lead Qualification, Production Challenges, Reddit, Semantic Search, Server-Sent Events
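The real-time polling pattern mentioned above can live almost entirely in markup. The attributes below are standard HTMX (`hx-get`, `hx-trigger`, `hx-swap`); the endpoint URL is an illustrative placeholder, not the article's actual route:

```html
<!-- Poll the server every 5s and swap the returned HTML fragment
     in place. The /leads/status URL is a hypothetical placeholder. -->
<div hx-get="/leads/status" hx-trigger="every 5s" hx-swap="innerHTML">
  Loading lead status…
</div>
```

Because the server returns ready-to-render HTML rather than JSON, no client-side state or rendering code is needed for this update loop, which is the trade-off the article describes.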
    enriquebruzual.substack.com 3 days ago
716.  HN Distillation, Experimentation, and Integration of AI for Adversarial Use
In late 2025, Google Threat Intelligence Group (GTIG) identified an increased use of artificial intelligence by cyber threat actors across various stages of attacks, including reconnaissance, social engineering, and malware development. The report highlighted the rise of "distillation attacks" or model extraction attempts aimed at intellectual property theft, often breaching terms of service. While advanced persistent threat (APT) actors did not directly target sophisticated AI models, several global private entities and researchers attempted to replicate proprietary AI logic. AI tools have become pivotal for government-backed actors from DPRK, Iran, PRC, and Russia in crafting sophisticated phishing schemes and conducting technical research. However, these efforts have yet to significantly alter the threat landscape according to GTIG. Key findings included the growing prevalence of model extraction attacks for IP theft, the use of AI in enhancing reconnaissance and phishing operations, and an increasing interest among adversaries in developing AI-driven malware tools. The report also described new malware like HONESTCUE, which utilizes Gemini's API for code generation to facilitate second-stage malware deployment. Additionally, it noted the emergence of underground "jailbreak" ecosystems offering services that replicate independent models using modified commercial APIs and open-source servers. To counter these threats, Google has been proactive in disabling malicious projects and accounts while strengthening model security measures. The report underscored the importance of sharing best practices with defenders to enhance protection across the ecosystem and referenced a separate white paper for more details on Gemini's safeguards. Keywords: #phi4, AI, APT Actors, Agentic AI, Distillation Attacks, GTIG, Gemini API, Google DeepMind, Intellectual Property Theft, LLMs, Malware Development, Model Extraction, Phishing, Reconnaissance, Security Safeguards, Threat Actors
    cloud.google.com 3 days ago
735.  HN VS Code becomes multi-agent command center for developers
The January 2026 release of Visual Studio Code (VS Code) v1.109 introduces a transformative approach to multi-agent development, enabling developers to integrate and manage multiple AI assistants, such as Anthropic Claude, OpenAI Codex, and GitHub Copilot, within a single interface. This integration facilitates enhanced productivity by allowing simultaneous use of different AI models without the need for tool-switching. The release features public preview support for Anthropic’s Claude agents, unified session management through an updated Agent Sessions view, and parallel subagent execution for isolated task handling. Additionally, it introduces MCP Apps, which allow interactive UI components in chat responses, aiming to enrich collaboration between developers and AI agents. Key optimizations include Copilot Memory for improved context retention, faster code search capabilities, enhanced security measures via terminal command sandboxing, and an upgraded chat interface. Microsoft's strategic initiative with this release is intended to expand its ecosystem by incorporating popular models directly within VS Code, thus retaining users who might otherwise turn to other platforms. This move signifies the beginning of a broader evolution in AI integration within development tools. Keywords: #phi4, AI assistants, Agent Sessions, Anthropic Claude, Copilot Memory, GitHub Copilot, MCP Apps, Model Context Protocol, OpenAI Codex, Unified Interface, VS Code, agent mode, chat experience, development, interactive UI, multi-agent, security optimizations, session management, subagents, terminal sandboxing
    thenewstack.io 3 days ago
744.  HN Show HN: CLI chat client for OpenAI-comp APIs with workspace and MCP support
Undead is a minimal command-line interface (CLI) chat client tailored for interacting with OpenAI-compatible APIs. It supports both Model Context Protocol (MCP) servers and workspaces to enhance its functionality. Users can install Undead on Arch Linux from the AUR using package managers like `yay` or `paru`, or build it from source using Cargo with the command `cargo build --release`. The tool is initiated via the basic command `./undead`, allowing users to customize endpoints, models, and API keys. Additionally, workspace operations such as file read/write are accessible through the `--workspace` flag, while MCP server connections can be specified with the `--mcp` option. Undead offers a range of configurable options including setting the API endpoint, model name, API key, system prompt, response temperature, and max tokens. These configurations can also be managed using a YAML config file, which supports multiple API setups with global defaults and preset names, giving precedence to CLI arguments over environment variables. The tool's workspace feature enables sandboxed file operations within specified directories, while the MCP support allows connections to local or remote servers for extended functionalities defined in JSON configuration. Undead is compatible with various OpenAI-compatible APIs such as llama.cpp, Ollama, vLLM, LocalAI, OpenAI, and Azure OpenAI. It is distributed under the MIT license, promoting flexibility and broad usage possibilities. Keywords: #phi4, API endpoint, AUR, Arch Linux, CLI, MCP, MIT license, OpenAI, cargo build, chat client, compatible APIs, config file, interactive commands, model, sandboxed operations, system prompt, workspace
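The multi-endpoint YAML configuration described above might look like the sketch below. All key names are assumptions inferred from the listed options (endpoint, model, API key, system prompt, temperature, max tokens, and named presets with a global default), not Undead's documented schema:

```yaml
# Hypothetical config layout; key names are illustrative.
default: local
system_prompt: "You are a concise assistant."
apis:
  local:
    endpoint: http://localhost:11434/v1   # e.g. an Ollama server
    model: llama3
    temperature: 0.7
    max_tokens: 1024
  openai:
    endpoint: https://api.openai.com/v1
    model: gpt-4o
    api_key: ${OPENAI_API_KEY}
```

Per the summary, CLI arguments would override any of these values, which in turn take precedence over environment variables.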
    github.com 4 days ago
753.  HN Show HN: PolyMCP – A framework for building and orchestrating MCP agents
PolyMCP is an open-source framework designed to streamline the development and management of agents using the Model Context Protocol (MCP). It distinguishes itself from other MCP tooling by emphasizing agent structuring, connectivity, and reliability across various servers rather than merely exposing tools. PolyMCP allows developers to define MCP-compatible tool servers in Python or TypeScript and provides a framework for connecting agents to different endpoints. The platform includes built-in orchestration primitives to handle complex tasks efficiently and offers both a command-line interface (CLI) for project scaffolding and an inspector user interface (UI) for debugging purposes. By offering structured methods for registering tools, managing execution flow, and inspecting agent interactions, PolyMCP aims to minimize the ad-hoc nature commonly associated with agent systems. Licensed under the MIT license, it targets developers engaged in automation projects, internal copilots, or multi-tool assistants. The framework actively seeks feedback on its agent abstraction, orchestration patterns, and overall developer experience to further refine these capabilities. Keywords: #phi4, CLI, MCP endpoints, MIT licensed, Model Context Protocol (MCP), PolyMCP, Python, TypeScript, agent abstraction, agents, automation, copilots, debugging, execution flow, framework, inspector UI, modular structure, multi-tool assistants, orchestration primitives, state, tool servers
    news.ycombinator.com 4 days ago
   https://github.com/poly-mcp/PolyMCP   3 days ago