Files API Overview

The Files API allows you to upload, manage, and retrieve Markdown and MDX files within your projects. File content is stored in Google Cloud Storage (GCS), and file metadata is tracked in the database.

What is a File?

A file in MarkdownAPI.io represents a Markdown (.md) or MDX (.mdx) document with the following characteristics:

Project-scoped: Files exist within a specific project
Unique filenames: Each filename must be unique within its project
Content storage: File content stored in GCS, metadata in database
Hash tracking: SHA-256 content hash for integrity verification
Custom metadata: Optional JSON metadata for organization and search
Type detection: Automatic MIME type detection

File Properties

Property	Type	Description
`id`	UUID	Unique file identifier
`filename`	string	File name (must end in .md or .mdx)
`file_type`	string	File extension without dot (“md” or “mdx”)
`size_bytes`	integer	File size in bytes
`gcs_path`	string	Full GCS path to file content
`content_hash`	string	SHA-256 hash of file content
`mime_type`	string	MIME type (“text/markdown” or “text/mdx”)
`custom_metadata`	object | null	Optional JSON metadata
`is_active`	boolean	Whether file is active (false if soft-deleted)
`created_date`	datetime	When file was uploaded (UTC)
`updated_date`	datetime	When file was last modified (UTC)
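
For client code, the properties above map naturally onto a typed record. The sketch below is one possible shape, not part of the API: `FileRecord` and `from_dict` are hypothetical names, and it assumes the API returns these properties as a JSON object with ISO 8601 timestamps, which this page does not specify.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


def _parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp (assumed format), accepting a trailing 'Z'."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical client-side view of the file properties table."""

    id: str
    filename: str
    file_type: str
    size_bytes: int
    gcs_path: str
    content_hash: str
    mime_type: str
    custom_metadata: Optional[Dict[str, Any]]
    is_active: bool
    created_date: datetime
    updated_date: datetime

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "FileRecord":
        # Field names follow the table; the response envelope is an assumption.
        return cls(
            id=data["id"],
            filename=data["filename"],
            file_type=data["file_type"],
            size_bytes=int(data["size_bytes"]),
            gcs_path=data["gcs_path"],
            content_hash=data["content_hash"],
            mime_type=data["mime_type"],
            custom_metadata=data.get("custom_metadata"),
            is_active=bool(data["is_active"]),
            created_date=_parse_utc(data["created_date"]),
            updated_date=_parse_utc(data["updated_date"]),
        )
```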

Supported File Types

Only Markdown and MDX files are supported:

Extension	MIME Type	Description
`.md`	`text/markdown`	Standard Markdown files
`.mdx`	`text/mdx`	MDX (Markdown + JSX) files

Validation:

Filenames must end with .md or .mdx
Other extensions are rejected with 400 Bad Request
Case-insensitive validation (.MD, .MDX also accepted)
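
A client can apply the same check before uploading to avoid a round trip that ends in 400 Bad Request. This is a minimal sketch of the rules above; `mime_type_for` is a hypothetical helper name, and the server's actual implementation may differ.

```python
# Extension -> MIME type, copied from the supported file types table.
SUPPORTED_TYPES = {".md": "text/markdown", ".mdx": "text/mdx"}


def mime_type_for(filename: str) -> str:
    """Return the MIME type for a filename, or raise ValueError if unsupported."""
    # Lower-casing first makes the check case-insensitive (.MD, .MDX).
    lowered = filename.lower()
    for extension, mime_type in SUPPORTED_TYPES.items():
        if lowered.endswith(extension):
            return mime_type
    raise ValueError(f"unsupported file type: {filename!r}")
```

For example, `mime_type_for("GUIDE.MDX")` returns `"text/mdx"`, while `mime_type_for("article.txt")` raises `ValueError`.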

Custom Metadata

Files support optional JSON metadata for organization and search:

Example metadata:


{
  "tags": ["ai", "article", "published"],
  "author": "Agent-007",
  "version": "1.0",
  "category": "blog",
  "publish_date": "2025-01-09",
  "status": "draft"
}

Metadata rules:

Must be a valid JSON object (not an array or primitive)
Any structure allowed - no schema enforcement
Stored as JSONB in database for efficient querying
Can be updated independently of file content
Preserved when updating content (unless explicitly changed)
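
The first rule is easy to check on the client before sending a request. The sketch below assumes metadata arrives as a JSON string; `parse_custom_metadata` is a hypothetical helper, not an API function.

```python
import json
from typing import Any, Dict


def parse_custom_metadata(raw: str) -> Dict[str, Any]:
    """Parse metadata and require a JSON object at the top level."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"metadata is not valid JSON: {exc}") from exc
    # Arrays and primitives parse as valid JSON but are not allowed here.
    if not isinstance(value, dict):
        raise ValueError("metadata must be a JSON object, not an array or primitive")
    return value
```

Because no schema is enforced, the function deliberately checks nothing beyond the top-level type.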

Content Hashing

Every file has a SHA-256 content hash for integrity verification:

Purpose:

Detect content changes
Verify upload/download integrity
Identify duplicate content
Track file modifications

Behavior:

Generated automatically on upload
Recalculated on content update
Returned in file metadata responses
Not user-modifiable
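
To verify a download, a client can recompute the hash locally and compare it with `content_hash`. This sketch assumes the hash is the lowercase hexadecimal SHA-256 digest of the raw file bytes; this page does not state the encoding, so confirm it against a real response. `sha256_hex` and `matches_hash` are hypothetical helper names.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def matches_hash(content: bytes, expected_hash: str) -> bool:
    """Check downloaded bytes against the content_hash from file metadata."""
    # Compare case-insensitively in case the server returns uppercase hex.
    return sha256_hex(content) == expected_hash.lower()
```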

File Lifecycle


Upload → Active → [Update Content/Metadata] → Delete → Removed
            ↓
      [Multiple Updates]

Lifecycle Operations

Upload: File created with content and optional metadata
Read: File content and metadata retrieved
Update Metadata: Change custom metadata without touching content
Update Content: Replace file content, optionally update metadata
Delete: Permanently remove file from GCS and database

Storage Architecture

Files are stored in project-specific GCS buckets:

Storage path pattern:


gs://markdown-api-{user_id}-{project_id}/{filename}

Examples:


gs://markdown-api-user123-proj456/article.md
gs://markdown-api-user123-proj456/docs/guide.mdx
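
The path pattern can be reproduced with plain string formatting, which is useful when cross-checking the `gcs_path` property. The helper names below are hypothetical. The sketch splits a path only into bucket and object name, because a bucket name cannot be reliably split back into user and project IDs if either ID contains a hyphen.

```python
from typing import Tuple

GCS_PREFIX = "gs://"


def gcs_path(user_id: str, project_id: str, filename: str) -> str:
    """Build a path following the documented storage path pattern."""
    return f"{GCS_PREFIX}markdown-api-{user_id}-{project_id}/{filename}"


def split_gcs_path(path: str) -> Tuple[str, str]:
    """Split a gs:// path into (bucket, object name)."""
    if not path.startswith(GCS_PREFIX):
        raise ValueError(f"not a GCS path: {path!r}")
    # Only the first slash separates the bucket from the object name.
    bucket, _, object_name = path[len(GCS_PREFIX):].partition("/")
    return bucket, object_name
```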

Benefits:

Project isolation
Fast retrieval
Scalable storage
Automatic backups

Naming Rules

Filenames must follow these rules:

Length: 1-255 characters
Uniqueness: Must be unique within project
Extension: Must end with .md or .mdx
Characters: Any UTF-8 characters except \ (backslash)
Case-sensitive: file.md and File.md are different
Path separators: Use / for logical organization in name
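
These rules can be combined into a single client-side check. The sketch below is an illustration rather than the server's validator: `filename_problems` is a hypothetical name, length is counted in Unicode characters (an assumption, since the limit might be measured in bytes), and `/` is treated as an allowed separator, as the path-separator rule and the documented examples indicate. Uniqueness within a project cannot be checked locally.

```python
from typing import List


def filename_problems(filename: str) -> List[str]:
    """Return the naming rules a filename violates (empty list if none)."""
    problems = []
    if not 1 <= len(filename) <= 255:
        problems.append("length must be 1-255 characters")
    if "\\" in filename:
        problems.append("backslash is not allowed")
    # The extension check is case-insensitive; the rest of the name is not.
    if not filename.lower().endswith((".md", ".mdx")):
        problems.append("must end with .md or .mdx")
    return problems
```

Returning a list instead of raising on the first failure lets a caller report every problem at once.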

Valid filenames:


"article.md"
"blog/post-2025.md"
"docs/api/overview.mdx"
"文档.md" (Chinese)
"документ.mdx" (Cyrillic)

Invalid filenames:


"article.txt" (wrong extension)
"article" (no extension)
"article.MD.txt" (wrong final extension)
"article\\doc.md" (backslash not allowed)

Common Use Cases

1. Blog Article Management


Project: "Blog Articles 2025"
Files:
- "posts/ai-trends-2025.md" (metadata: {category: "ai", status: "published"})
- "posts/tech-review.md" (metadata: {category: "tech", status: "draft"})
- "drafts/future-post.md" (metadata: {status: "draft"})

2. Documentation


Project: "Product Docs"
Files:
- "getting-started.md"
- "api/overview.mdx"
- "api/authentication.mdx"
- "guides/quickstart.md"

3. AI Agent Memory


Project: "Agent Memory"
Files:
- "conversations/user123.md" (metadata: {user_id: "123", last_updated: "2025-01-09"})
- "context/system-prompt.md"
- "knowledge/facts.md" (metadata: {version: "1.0", verified: true})

4. Content Generation


Project: "Generated Content"
Files:
- "articles/article-001.md" (metadata: {generated_by: "gpt-4", date: "2025-01-09"})
- "summaries/summary-001.md" (metadata: {source: "article-001.md"})

Performance Characteristics

File Size Limits

Maximum file size: 10 MB per file (soft limit)
Recommended size: < 1 MB for optimal performance
Large files: 1-10 MB may have slower upload/download
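
A client can classify a file against these thresholds before uploading. In this sketch the tier names and `size_tier` are hypothetical, and 1 MB is taken as 1,000,000 bytes, which is an assumption; the API may count in units of 1,048,576 bytes instead.

```python
MB = 1_000_000  # Assumption: decimal megabytes.
RECOMMENDED_MAX_BYTES = 1 * MB
HARD_MAX_BYTES = 10 * MB


def size_tier(size_bytes: int) -> str:
    """Classify a file size as 'recommended', 'large', or 'over_limit'."""
    if size_bytes > HARD_MAX_BYTES:
        return "over_limit"  # Likely to be rejected with 413.
    if size_bytes >= RECOMMENDED_MAX_BYTES:
        return "large"  # Allowed, but may transfer more slowly.
    return "recommended"
```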

Operation Speed

Operation	Typical Duration	Notes
Upload file	100-500ms	Depends on file size
Download file	50-300ms	Depends on file size
List files	50-150ms	Scales with file count
Update metadata	50-100ms	Fast - no content transfer
Update content	100-500ms	Depends on new file size
Delete file	100-300ms	Includes GCS deletion

Optimization Tips

Batch uploads: Upload multiple files in parallel
Cache file lists: List operations are cacheable
Stream large files: Use streaming for files > 1 MB
Compress content: Markdown compresses well (gzip)
Use metadata: Store searchable data in metadata, not filename
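
The batch upload tip can be sketched with a thread pool from the standard library. The upload call itself is passed in as `upload_one`, a hypothetical function you would implement against the upload endpoint; nothing here is provided by the API, and the worker count is an arbitrary starting point.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


def upload_all(
    files: Dict[str, bytes],
    upload_one: Callable[[str, bytes], T],
    max_workers: int = 4,
) -> Dict[str, T]:
    """Upload files in parallel and return each result keyed by filename."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the uploads overlap.
        futures = {
            name: pool.submit(upload_one, name, content)
            for name, content in files.items()
        }
        # result() re-raises any exception raised inside upload_one.
        return {name: future.result() for name, future in futures.items()}
```

Threads suit this workload because uploads spend most of their time waiting on the network.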

API Endpoints

Available Operations

Upload File
List Files
Download File
Update Metadata
Update Content
Delete File

Quick Reference

Method	Endpoint	Description
POST	`/api/projects/{project_id}/files`	Upload new file with optional metadata
GET	`/api/projects/{project_id}/files`	List all files in project
GET	`/api/projects/{project_id}/files/{filename}`	Download file content
PUT	`/api/projects/{project_id}/files/{filename}`	Update file metadata
PUT	`/api/projects/{project_id}/files/{filename}/content`	Update file content
DELETE	`/api/projects/{project_id}/files/{filename}`	Delete file
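
The table translates directly into a small lookup that returns the method and URL for each operation. Several details in this sketch are assumptions: `BASE_URL` is a placeholder host, the operation keys and `endpoint` are hypothetical names, and filenames are percent-encoded with `/` left intact, which may not match how the API expects path separators in filenames to be encoded.

```python
from typing import Tuple
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # Placeholder, not the real host.

# Operation -> (HTTP method, path template), copied from the table above.
ENDPOINTS = {
    "upload": ("POST", "/api/projects/{project_id}/files"),
    "list": ("GET", "/api/projects/{project_id}/files"),
    "download": ("GET", "/api/projects/{project_id}/files/{filename}"),
    "update_metadata": ("PUT", "/api/projects/{project_id}/files/{filename}"),
    "update_content": ("PUT", "/api/projects/{project_id}/files/{filename}/content"),
    "delete": ("DELETE", "/api/projects/{project_id}/files/{filename}"),
}


def endpoint(operation: str, project_id: str, filename: str = "") -> Tuple[str, str]:
    """Return (method, URL) for an operation; raises KeyError if unknown."""
    method, template = ENDPOINTS[operation]
    path = template.format(
        project_id=quote(project_id, safe=""),
        # Assumption: '/' inside a filename is sent unescaped.
        filename=quote(filename, safe="/"),
    )
    return method, BASE_URL + path
```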

Error Scenarios

Common Errors

Error	Status	Cause	Solution
Duplicate filename	409	File with name exists	Use unique filename or update existing
Invalid file type	400	Not .md or .mdx	Use supported file types
File too large	413	Exceeds size limit	Reduce file size or split content
Not found	404	File doesn’t exist	Check filename and project_id
Invalid metadata	400	Malformed JSON metadata	Validate JSON format
Project not found	404	Project doesn’t exist	Verify project_id
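
Because 400 and 404 each cover two causes, a client that only sees the status code should present both possibilities. The sketch below encodes the table as a lookup; `describe_error` is a hypothetical helper, and it assumes nothing about the error response body, which this page does not describe.

```python
from typing import Dict, List

# Status -> possible causes with their suggested fixes, from the table above.
ERROR_CAUSES: Dict[int, List[str]] = {
    400: [
        "invalid file type (use a .md or .mdx filename)",
        "invalid metadata (validate the JSON format)",
    ],
    404: [
        "file not found (check filename and project_id)",
        "project not found (verify project_id)",
    ],
    409: ["duplicate filename (use a unique filename or update the existing file)"],
    413: ["file too large (reduce file size or split content)"],
}


def describe_error(status: int) -> str:
    """Summarize the likely causes for a status code."""
    causes = ERROR_CAUSES.get(status)
    if causes is None:
        return f"HTTP {status}: not listed in the common errors table"
    return f"HTTP {status}: " + "; or ".join(causes)
```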

Best Practices

Naming Conventions

Use descriptive, hierarchical names:


# Good
"blog/2025/01/ai-trends.md"
"docs/api/authentication.mdx"
"articles/published/post-001.md"
 
# Avoid
"file1.md"
"temp.md"
"test123.md"

Metadata Strategy

Structure metadata consistently:


{
  "type": "blog-post",
  "category": "ai",
  "tags": ["machine-learning", "gpt"],
  "status": "published",
  "author": "Agent-007",
  "created": "2025-01-09",
  "version": "1.0"
}

Content Organization

Organize files logically within projects:


Project: "Content Management"
├── published/
│   ├── article-001.md
│   └── article-002.md
├── drafts/
│   └── wip-article.md
└── templates/
    └── article-template.md

Security Considerations

Access control: Files are user-scoped - only the owner can access them
Content validation: Files must be .md or .mdx
Size limits: Prevent storage abuse
Hash verification: Detect tampering
Bucket isolation: Per-project GCS buckets

Next Steps

Upload File - Learn how to upload files
Download File - Learn how to retrieve files
Quick Start Guide - Complete tutorial
Files Concepts - Deep dive into file concepts