Science & Technology

How to Use Markdown for Agents on Netfox for Optimized AI Crawling

Gilbert Whitman
13 days ago

As Artificial Intelligence continues to evolve, the way content is consumed by machines has become just as important as how it is presented to humans. To ensure that Large Language Models (LLMs) and AI agents can efficiently index our platform without the bloat of HTML tags and client-side scripts, Netfox has officially adopted a new protocol inspired by Cloudflare.

In this Q&A, we'll explain how AI developers and agents can leverage our Markdown for Agents feature to extract pristine, structured data while saving up to 80% on token consumption.


Q: What is "Markdown for Agents"?

A: Traditionally, when an AI web crawler visits a webpage, it receives raw HTML. This HTML contains styling classes, navigation menus, footers, and scripts that the AI doesn't actually need—wasting context window space and processing power (tokens).
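To make the overhead concrete, here is a rough, self-contained illustration. The ~4 characters-per-token heuristic is a common approximation, not Netfox's actual tokenizer, and the HTML snippet is invented for the example:

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate using the common ~4 characters-per-token heuristic."""
    return max(1, len(text) // 4)

# A contrived page: navigation, styling classes, and a footer wrap a tiny body.
html_page = (
    '<nav class="top-nav"><ul><li><a href="/">Home</a></li></ul></nav>'
    '<div class="article-body col-md-8 px-3"><p>Hello, agents!</p></div>'
    '<footer class="site-footer">Netfox</footer>'
)
# The same content as a bare Markdown payload.
markdown_page = "Hello, agents!"

html_tokens = estimate_tokens(html_page)
md_tokens = estimate_tokens(markdown_page)
savings = 1 - md_tokens / html_tokens
print(f"HTML ~{html_tokens} tokens, Markdown ~{md_tokens} tokens ({savings:.0%} saved)")
```

On this toy example the markup accounts for nearly all of the payload; real pages vary, which is why the savings figure is stated as "up to 80%".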

Markdown for Agents is a content delivery approach where our server detects when an AI agent is requesting a page and serves the content natively in Markdown instead of HTML. This provides a clean, distraction-free text payload complete with structured YAML frontmatter for metadata.

You can read more about the inspiration behind this feature in our recent news article: Markdown for Agents: Cloudflare's New AI Protocol Explained.

Q: How does Netfox implement this feature?

A: At Netfox, we use intelligent edge middleware to govern incoming traffic. When a request hits our core content routes—specifically News articles (/news/[slug]), Q&A threads (/qa/[id]), or Anti-Scam reports (/anti-scam/[id])—the middleware inspects the HTTP headers.

If the Accept: text/markdown header is present, the traffic is seamlessly routed to specialized API handlers that assemble the data (including top comments for Q&A) into a perfectly formatted Markdown response.
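The negotiation logic can be sketched as follows. The function and route patterns below are hypothetical, written only to mirror the description above; Netfox's actual middleware is not public:

```python
import re

# Hypothetical route patterns mirroring the content routes described above.
MARKDOWN_ROUTES = [
    re.compile(r"^/news/[^/]+$"),     # News articles: /news/[slug]
    re.compile(r"^/qa/\d+$"),         # Q&A threads:   /qa/[id]
    re.compile(r"^/anti-scam/\d+$"),  # Anti-Scam:     /anti-scam/[id]
]

def wants_markdown(path: str, headers: dict) -> bool:
    """Return True when a request should be routed to the Markdown handlers:
    the Accept header asks for text/markdown AND the path is a content route."""
    if "text/markdown" not in headers.get("accept", ""):
        return False
    return any(route.match(path) for route in MARKDOWN_ROUTES)

print(wants_markdown("/qa/422940324504010752", {"accept": "text/markdown"}))  # True
print(wants_markdown("/qa/422940324504010752", {"accept": "text/html"}))      # False
print(wants_markdown("/about", {"accept": "text/markdown"}))                  # False
```

Requests that fail either check simply fall through to the normal HTML response.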

(Architecture diagram: incoming request → edge middleware inspects the Accept header → matching content routes are served by the Markdown API handlers.)


Q: How can my AI bot or scraper use this?

A: Using the feature is incredibly simple. All you need to do is ensure your HTTP request includes the Accept: text/markdown header.

Here are practical curl examples you can use to test the outputs:

1. Fetching a News Article

curl -i "https://netfox.space/news/xz-utils-backdoor-how-jia-tan-infiltrated-linux" \
     -H "Accept: text/markdown"

2. Fetching a Q&A Thread (Includes Top Answers)

curl -i "https://netfox.space/qa/422940324504010752" \
     -H "Accept: text/markdown"

3. Fetching an Anti-Scam Report

curl -i "https://netfox.space/anti-scam/402885541743296512" \
     -H "Accept: text/markdown"

Expected Response

When successful, your bot will receive a payload similar to this, including custom headers specifying the estimated token count (x-markdown-tokens):

HTTP/1.1 200 OK
content-type: text/markdown; charset=utf-8
x-markdown-tokens: 1451
Content-Signal: ai-train=yes, search=yes, ai-input=yes

---
description: A short description of the content...
title: The Main Title of the Page
image: https://api.netfox.space/uploads/example.webp
---

# The Main Title of the Page

2024-10-31

* [![Author Name](avatar_url)](/u/author_id)
[Author Name](/u/author_id)

The pristine content body goes here, fully formatted in markdown without any HTML clutter...
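A consuming bot will typically want to split that payload into its YAML frontmatter and Markdown body. The helper below is a minimal sketch based on the sample above; it only handles simple `key: value` pairs, and a real client should use a proper YAML parser and tolerate missing fields:

```python
def parse_agent_markdown(payload: str) -> tuple[dict, str]:
    """Split a Markdown-for-Agents payload into (frontmatter dict, body text).
    Assumes the '---'-delimited frontmatter block shown in the sample above."""
    meta: dict[str, str] = {}
    body = payload
    if payload.startswith("---\n"):
        head, sep, rest = payload[4:].partition("\n---\n")
        if sep:  # closing delimiter found
            body = rest
            for line in head.splitlines():
                key, colon, value = line.partition(":")
                if colon:
                    meta[key.strip()] = value.strip()
    return meta, body.lstrip("\n")

sample = (
    "---\n"
    "description: A short description of the content...\n"
    "title: The Main Title of the Page\n"
    "---\n"
    "\n"
    "# The Main Title of the Page\n"
)
meta, body = parse_agent_markdown(sample)
print(meta["title"])         # The Main Title of the Page
print(body.splitlines()[0])  # # The Main Title of the Page
```

Note that the token estimate arrives in the `x-markdown-tokens` response header, not in the body, so it is read from the HTTP response rather than parsed here.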

Q: What are Netfox's crawling policies?

A: We embrace AI indexing! However, we require that any content synthesized from or built on our platform include proper attribution back to the original source URL on Netfox.

You can find our complete, machine-readable guidelines for Large Language Models in our llms.txt file: 👉 https://netfox.space/llms.txt


Netfox is committed to building an open, accessible, and highly optimized ecosystem for both humans and autonomous agents alike.

1 Answer

Kane Blackwood

Markdown for Agents is a game-changer. It’s refreshing to see a platform prioritizing AI-friendly infrastructure so effectively. Keep it up!

6 days ago