
Files API Overview

The Files API allows you to upload, manage, and retrieve Markdown and MDX files within your projects. Each file's content is stored in Google Cloud Storage (GCS), and its metadata is tracked in the database.

What is a File?

A file in MarkdownAPI.io represents a Markdown (.md) or MDX (.mdx) document with the following characteristics:

  • Project-scoped: Files exist within a specific project
  • Unique filenames: Each filename must be unique within its project
  • Content storage: File content stored in GCS, metadata in database
  • Hash tracking: SHA-256 content hash for integrity verification
  • Custom metadata: Optional JSON metadata for organization and search
  • Type detection: Automatic MIME type detection

File Properties

| Property | Type | Description |
| --- | --- | --- |
| id | UUID | Unique file identifier |
| filename | string | File name (must end in .md or .mdx) |
| file_type | string | File extension without dot ("md" or "mdx") |
| size_bytes | integer | File size in bytes |
| gcs_path | string | Full GCS path to file content |
| content_hash | string | SHA-256 hash of file content |
| mime_type | string | MIME type ("text/markdown" or "text/mdx") |
| custom_metadata | object or null | Optional JSON metadata |
| is_active | boolean | Whether file is active (false if soft-deleted) |
| created_date | datetime | When file was uploaded (UTC) |
| updated_date | datetime | When file was last modified (UTC) |
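
As a rough illustration, the properties above could be modeled on the client side as a typed record. The class name `FileRecord` and the `from_json` helper are hypothetical. Only the field names and types come from the table, and the ISO 8601 timestamp format is an assumption about how the API serializes datetimes.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional
from uuid import UUID


@dataclass(frozen=True)
class FileRecord:
    """Client-side view of one file's properties (hypothetical helper type)."""

    id: UUID
    filename: str
    file_type: str
    size_bytes: int
    gcs_path: str
    content_hash: str
    mime_type: str
    custom_metadata: Optional[Dict[str, Any]]
    is_active: bool
    created_date: datetime
    updated_date: datetime

    @classmethod
    def from_json(cls, data: Dict[str, Any]) -> "FileRecord":
        # Assumption: timestamps arrive as ISO 8601 strings with a UTC offset.
        return cls(
            id=UUID(data["id"]),
            filename=data["filename"],
            file_type=data["file_type"],
            size_bytes=int(data["size_bytes"]),
            gcs_path=data["gcs_path"],
            content_hash=data["content_hash"],
            mime_type=data["mime_type"],
            custom_metadata=data.get("custom_metadata"),  # may be null
            is_active=bool(data["is_active"]),
            created_date=datetime.fromisoformat(data["created_date"]),
            updated_date=datetime.fromisoformat(data["updated_date"]),
        )
```

A frozen dataclass is used here because server-managed fields such as `content_hash` are not meant to be changed by the client.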

Supported File Types

Only Markdown and MDX files are supported:

| Extension | MIME Type | Description |
| --- | --- | --- |
| .md | text/markdown | Standard Markdown files |
| .mdx | text/mdx | MDX (Markdown + JSX) files |

Validation:

  • Filenames must end with .md or .mdx
  • Other extensions are rejected with 400 Bad Request
  • Case-insensitive validation (.MD, .MDX also accepted)
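
A minimal sketch of how a client might mirror this check before uploading. The function name `detect_file_type` is hypothetical, and lowercasing the detected type for names like `.MD` is an assumption; the server's actual normalization is not documented here.

```python
from typing import Tuple

# Extension-to-MIME mapping taken from the table above.
MIME_TYPES = {"md": "text/markdown", "mdx": "text/mdx"}


def detect_file_type(filename: str) -> Tuple[str, str]:
    """Return (file_type, mime_type), or raise ValueError for unsupported names."""
    _, dot, ext = filename.rpartition(".")
    file_type = ext.lower()  # case-insensitive: .MD and .MDX are accepted
    if not dot or file_type not in MIME_TYPES:
        # The API is documented to answer 400 Bad Request in this case.
        raise ValueError(f"unsupported file type: {filename!r}")
    return file_type, MIME_TYPES[file_type]
```

Running the check locally avoids a round trip that would end in a 400 response.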

Custom Metadata

Files support optional JSON metadata for organization and search:

Example metadata:

{
  "tags": ["ai", "article", "published"],
  "author": "Agent-007",
  "version": "1.0",
  "category": "blog",
  "publish_date": "2025-01-09",
  "status": "draft"
}

Metadata rules:

  • Must be valid JSON object (not array or primitive)
  • Any structure allowed - no schema enforcement
  • Stored as JSONB in database for efficient querying
  • Can be updated independently of file content
  • Preserved when updating content (unless explicitly changed)
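
The first rule above (a JSON object, not an array or primitive) can be checked on the client before sending. This is only a sketch: `parse_custom_metadata` is a hypothetical helper name, and it assumes the metadata is available as a JSON string.

```python
import json
from typing import Any, Dict, Optional


def parse_custom_metadata(raw: Optional[str]) -> Optional[Dict[str, Any]]:
    """Parse metadata text, accepting only a JSON object (or no metadata at all)."""
    if raw is None:
        return None  # metadata is optional
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"metadata is not valid JSON: {exc}") from exc
    if not isinstance(value, dict):
        raise ValueError("metadata must be a JSON object, not an array or primitive")
    return value  # any structure is allowed inside the object
```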

Content Hashing

Every file has a SHA-256 content hash for integrity verification:

Purpose:

  • Detect content changes
  • Verify upload/download integrity
  • Identify duplicate content
  • Track file modifications

Behavior:

  • Generated automatically on upload
  • Recalculated on content update
  • Returned in file metadata responses
  • Not user-modifiable
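
To verify a download against the returned `content_hash`, a client can recompute the digest locally. The sketch below assumes the hash is the lowercase hex SHA-256 digest of the raw file bytes; the exact encoding is not stated in this overview, so treat that as an assumption. The function names are hypothetical.

```python
import hashlib


def compute_content_hash(content: bytes) -> str:
    """Hex-encoded SHA-256 digest of the file bytes (assumed encoding)."""
    return hashlib.sha256(content).hexdigest()


def is_intact(content: bytes, expected_hash: str) -> bool:
    """Compare downloaded bytes with the content_hash from the file metadata."""
    return compute_content_hash(content) == expected_hash.strip().lower()
```

The same function can flag duplicate content: two files with equal hashes hold identical bytes.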

File Lifecycle

Upload → Active → [Update Content/Metadata, repeatable] → Delete → Removed

Lifecycle Operations

  1. Upload: File created with content and optional metadata
  2. Read: File content and metadata retrieved
  3. Update Metadata: Change custom metadata without touching content
  4. Update Content: Replace file content, optionally update metadata
  5. Delete: Permanently remove file from GCS and database

Storage Architecture

Files are stored in project-specific GCS buckets:

Storage path pattern:

gs://markdown-api-{user_id}-{project_id}/{filename}

Examples:

gs://markdown-api-user123-proj456/article.md
gs://markdown-api-user123-proj456/docs/guide.mdx
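
For illustration, the path pattern can be expressed as a pair of small helpers. Both function names are hypothetical, and the service builds these paths itself; a client would normally just read `gcs_path` from the file metadata.

```python
from typing import Tuple


def build_gcs_path(user_id: str, project_id: str, filename: str) -> str:
    """Apply the documented pattern gs://markdown-api-{user_id}-{project_id}/{filename}."""
    return f"gs://markdown-api-{user_id}-{project_id}/{filename}"


def split_gcs_path(path: str) -> Tuple[str, str]:
    """Split a gs:// path into (bucket, object name)."""
    prefix = "gs://"
    if not path.startswith(prefix):
        raise ValueError(f"not a GCS path: {path!r}")
    bucket, _, object_name = path[len(prefix):].partition("/")
    return bucket, object_name
```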

Benefits:

  • Project isolation
  • Fast retrieval
  • Scalable storage
  • Automatic backups

Naming Rules

Filenames must follow these rules:

  • Length: 1-255 characters
  • Uniqueness: Must be unique within project
  • Extension: Must end with .md or .mdx
  • Characters: Any UTF-8 characters except \ (backslash)
  • Case-sensitive: file.md and File.md are different
  • Path separators: Use / within the name for logical organization
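
The rules above can be sketched as a client-side check. `validate_filename` is a hypothetical helper. It assumes the length limit counts characters rather than bytes, and it allows forward slashes, in line with the path-separator rule. Uniqueness within the project can only be checked by the server.

```python
def validate_filename(filename: str) -> None:
    """Raise ValueError if the name breaks a locally checkable naming rule."""
    if not 1 <= len(filename) <= 255:
        raise ValueError("filename must be 1-255 characters long")
    if "\\" in filename:
        raise ValueError("backslash is not allowed in filenames")
    # Extension check is case-insensitive (.MD and .MDX are accepted).
    if not filename.lower().endswith((".md", ".mdx")):
        raise ValueError("filename must end with .md or .mdx")
```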

Valid filenames:

"article.md"
"blog/post-2025.md"
"docs/api/overview.mdx"
"文档.md" (Chinese)
"документ.mdx" (Cyrillic)

Invalid filenames:

"article.txt" (wrong extension)
"article" (no extension)
"article.MD.txt" (wrong final extension)
"article\\doc.md" (backslash not allowed)

Common Use Cases

1. Blog Article Management

Project: "Blog Articles 2025"
Files:
- "posts/ai-trends-2025.md" (metadata: {category: "ai", status: "published"})
- "posts/tech-review.md" (metadata: {category: "tech", status: "draft"})
- "drafts/future-post.md" (metadata: {status: "draft"})

2. Documentation

Project: "Product Docs"
Files:
- "getting-started.md"
- "api/overview.mdx"
- "api/authentication.mdx"
- "guides/quickstart.md"

3. AI Agent Memory

Project: "Agent Memory"
Files:
- "conversations/user123.md" (metadata: {user_id: "123", last_updated: "2025-01-09"})
- "context/system-prompt.md"
- "knowledge/facts.md" (metadata: {version: "1.0", verified: true})

4. Content Generation

Project: "Generated Content"
Files:
- "articles/article-001.md" (metadata: {generated_by: "gpt-4", date: "2025-01-09"})
- "summaries/summary-001.md" (metadata: {source: "article-001.md"})

Performance Characteristics

File Size Limits

  • Maximum file size: 10 MB per file (soft limit)
  • Recommended size: < 1 MB for optimal performance
  • Large files: 1-10 MB may have slower upload/download
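
A small pre-upload check can apply these thresholds. This is a sketch under one loud assumption: 1 MB is taken to mean 1024 × 1024 bytes, which the documentation does not specify. The names are hypothetical.

```python
# Assumption: "MB" means 1024 * 1024 bytes; the docs do not say which unit is meant.
RECOMMENDED_MAX_BYTES = 1 * 1024 * 1024
HARD_MAX_BYTES = 10 * 1024 * 1024


def classify_size(size_bytes: int) -> str:
    """Return 'ok', 'large' (1-10 MB, may be slower), or 'too_large' (over 10 MB)."""
    if size_bytes > HARD_MAX_BYTES:
        return "too_large"
    if size_bytes >= RECOMMENDED_MAX_BYTES:
        return "large"
    return "ok"
```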

Operation Speed

| Operation | Typical Duration | Notes |
| --- | --- | --- |
| Upload file | 100-500 ms | Depends on file size |
| Download file | 50-300 ms | Depends on file size |
| List files | 50-150 ms | Scales with file count |
| Update metadata | 50-100 ms | Fast, no content transfer |
| Update content | 100-500 ms | Depends on new file size |
| Delete file | 100-300 ms | Includes GCS deletion |

Optimization Tips

  1. Batch uploads: Upload multiple files in parallel
  2. Cache file lists: List operations are cacheable
  3. Stream large files: Use streaming for files > 1 MB
  4. Compress content: Markdown compresses well (gzip)
  5. Use metadata: Store searchable data in metadata, not filename
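
Tips 1 and 4 can be sketched with the standard library. The `upload_one` callable is a stand-in for whatever HTTP client code performs a single upload and returns its status code; it is hypothetical and injected so the example stays independent of any particular client. Whether the API accepts gzip-encoded request bodies is not stated here, so `gzip_ratio` only measures how well content would compress.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def upload_all(
    files: Dict[str, bytes],
    upload_one: Callable[[str, bytes], int],
    max_workers: int = 4,
) -> Dict[str, int]:
    """Upload files concurrently and return a filename -> status code mapping."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(upload_one, name, body) for name, body in files.items()
        }
        # result() re-raises any exception raised inside upload_one.
        return {name: future.result() for name, future in futures.items()}


def gzip_ratio(content: bytes) -> float:
    """Compressed size divided by original size (1.0 for empty input)."""
    if not content:
        return 1.0
    return len(gzip.compress(content)) / len(content)
```

Threads suit this workload because uploads are I/O-bound; a small `max_workers` keeps the client from flooding the API.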

API Endpoints

Available Operations

Quick Reference

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/projects/{project_id}/files | Upload new file with optional metadata |
| GET | /api/projects/{project_id}/files | List all files in project |
| GET | /api/projects/{project_id}/files/{filename} | Download file content |
| PUT | /api/projects/{project_id}/files/{filename} | Update file metadata |
| PUT | /api/projects/{project_id}/files/{filename}/content | Update file content |
| DELETE | /api/projects/{project_id}/files/{filename} | Delete file |
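
A sketch of building these endpoint URLs in client code. `BASE_URL` is a placeholder host, not the real one, and `files_url` is a hypothetical helper. Leaving `/` unescaped inside the filename is an assumption, since this overview does not say how names containing path separators should be encoded.

```python
from typing import Optional
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # placeholder host (assumption)


def files_url(project_id: str, filename: Optional[str] = None, content: bool = False) -> str:
    """Build the URL for the files collection, a single file, or its content."""
    url = f"{BASE_URL}/api/projects/{quote(project_id, safe='')}/files"
    if filename is None:
        if content:
            raise ValueError("content=True requires a filename")
        return url
    # Percent-encode the name but keep "/" so hierarchical names stay readable.
    url += "/" + quote(filename, safe="/")
    return url + "/content" if content else url
```

The HTTP method then selects the operation: for example, `PUT` on `files_url(pid, name)` updates metadata, while `PUT` on `files_url(pid, name, content=True)` replaces the content.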

Error Scenarios

Common Errors

| Error | Status | Cause | Solution |
| --- | --- | --- | --- |
| Duplicate filename | 409 | File with name exists | Use unique filename or update existing |
| Invalid file type | 400 | Not .md or .mdx | Use supported file types |
| File too large | 413 | Exceeds size limit | Reduce file size or split content |
| Not found | 404 | File doesn't exist | Check filename and project_id |
| Invalid metadata | 400 | Malformed JSON metadata | Validate JSON format |
| Project not found | 404 | Project doesn't exist | Verify project_id |
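
One way a client might turn these status codes into actionable errors is shown below. The exception class and function are hypothetical, and the hints simply restate the table. The response body format is not documented in this overview, so only the status code is used.

```python
class FilesApiError(Exception):
    """Raised for non-2xx responses (hypothetical client-side exception)."""

    def __init__(self, status: int, hint: str) -> None:
        super().__init__(f"HTTP {status}: {hint}")
        self.status = status
        self.hint = hint


# Hints restate the error table; 400 and 404 each cover two causes.
ERROR_HINTS = {
    400: "Invalid file type or malformed metadata: use .md/.mdx and validate the JSON",
    404: "File or project not found: check filename and project_id",
    409: "Duplicate filename: use a unique filename or update the existing file",
    413: "File too large: reduce file size or split content",
}


def raise_for_status(status: int) -> None:
    """Do nothing for 2xx codes; otherwise raise FilesApiError with a hint."""
    if 200 <= status < 300:
        return
    raise FilesApiError(status, ERROR_HINTS.get(status, "Unexpected error"))
```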

Best Practices

Naming Conventions

Use descriptive, hierarchical names:

# Good
"blog/2025/01/ai-trends.md"
"docs/api/authentication.mdx"
"articles/published/post-001.md"

# Avoid
"file1.md"
"temp.md"
"test123.md"

Metadata Strategy

Structure metadata consistently:

{
  "type": "blog-post",
  "category": "ai",
  "tags": ["machine-learning", "gpt"],
  "status": "published",
  "author": "Agent-007",
  "created": "2025-01-09",
  "version": "1.0"
}

Content Organization

Organize files logically within projects:

Project: "Content Management"
├── published/
│   ├── article-001.md
│   └── article-002.md
├── drafts/
│   └── wip-article.md
└── templates/
    └── article-template.md

Security Considerations

  • Access control: Files are user-scoped; only the owner can access them
  • Content validation: Files must be .md or .mdx
  • Size limits: Prevent storage abuse
  • Hash verification: Detect tampering
  • Bucket isolation: Per-project GCS buckets
