FileGuard provides comprehensive file validation to ensure only legitimate, safe files are stored.

Validation Pipeline

  1. Extension Check: verify that the file extension is in the context’s allowed_extensions.
  2. Size Check: verify that the file size is within the context’s max_file_size_mb.
  3. Blank File Detection: check for meaningful content (if reject_blank_files is enabled).
  4. Corrupt File Detection: verify file integrity (if reject_corrupt_files is enabled).
  5. Virus Scanning: scan with ClamAV (if scan_for_viruses is enabled).
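
The ordering of these steps can be sketched in Python as follows. This is a minimal illustration and not FileGuard's actual implementation: the `Context` class, the `validate` function, the error wording for the first two steps, the 1 MB = 1024 * 1024 bytes conversion, and the stop-at-first-failure behavior are all assumptions. Only the setting names come from this page.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A check takes (filename, data) and returns an error message, or None on success.
Check = Callable[[str, bytes], Optional[str]]


@dataclass
class Context:
    """Hypothetical container; field names follow the settings named on this page."""
    allowed_extensions: set[str]
    max_file_size_mb: float
    reject_blank_files: bool
    reject_corrupt_files: bool
    scan_for_viruses: bool


def validate(filename: str, data: bytes, ctx: Context,
             blank_check: Check, corrupt_check: Check,
             virus_check: Check) -> list[str]:
    """Run the five steps in order, stopping at the first failure (assumed)."""
    # Step 1: extension check.
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension not in ctx.allowed_extensions:
        return [f"File extension not allowed: {extension or '(none)'}"]  # hypothetical wording

    # Step 2: size check (assumes 1 MB = 1024 * 1024 bytes).
    if len(data) > ctx.max_file_size_mb * 1024 * 1024:
        return [f"File exceeds the {ctx.max_file_size_mb} MB limit"]  # hypothetical wording

    # Steps 3-5: each runs only when its flag is enabled on the context.
    optional_steps = [
        (ctx.reject_blank_files, blank_check),
        (ctx.reject_corrupt_files, corrupt_check),
        (ctx.scan_for_viruses, virus_check),
    ]
    for enabled, check in optional_steps:
        if not enabled:
            continue
        error = check(filename, data)
        if error is not None:
            return [error]
    return []
```

Running the cheap checks first means an upload with a disallowed extension or excessive size never reaches the more expensive content checks or the ClamAV scan.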

Blank File Detection

When reject_blank_files is enabled, FileGuard detects and rejects files with no meaningful content.
File Type   What’s Considered “Blank”
---------   -------------------------
PDF         No text (3+ letter words) and no embedded images
Images      Single color or dimensions less than 2x2 pixels
Excel       No data rows in any worksheet
CSV         No data rows (or only headers)
Text        No words (3+ letters)
JSON        Empty {}, [], null, or no meaningful values
XML         Only empty root element, no children or text
DOCX        No paragraphs or tables with content
ZIP         Empty archive (no files)
MP3/MP4     Zero duration
Other       Only rejects 0-byte files
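
To make two of these rules concrete, here is a rough sketch of how the Text and JSON rows could be evaluated using only the standard library. The function names are hypothetical, and two interpretations are assumptions: "letters" is taken to mean any Unicode letter, and a JSON value counts as meaningful when it is a number, a boolean, or a non-whitespace string somewhere in the document.

```python
import json
import re

# Three or more consecutive letters (Unicode-aware; digits and "_" excluded).
_WORD = re.compile(r"[^\W\d_]{3,}")


def text_is_blank(text: str) -> bool:
    """True when the text contains no word of three or more letters."""
    return _WORD.search(text) is None


def _has_meaning(value) -> bool:
    """Recursively look for at least one meaningful value (assumed definition)."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, dict):
        return any(_has_meaning(item) for item in value.values())
    if isinstance(value, list):
        return any(_has_meaning(item) for item in value)
    return True  # numbers and booleans


def json_is_blank(raw: str) -> bool:
    """True for {}, [], null, and documents holding only empty values."""
    return not _has_meaning(json.loads(raw))
```

Under these assumptions, `json_is_blank('{"a": [null, ""]}')` is true while `json_is_blank('{"a": 0}')` is false, since zero is treated as a real value.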

Example Error Messages

{
  "errors": [
    "File appears to be blank/empty: PDF has no readable content"
  ]
}
{
  "errors": [
    "File appears to be blank/empty: Image contains no meaningful content (single color)"
  ]
}

Corrupt File Detection

When reject_corrupt_files is enabled, FileGuard validates file integrity.
File Type   Integrity Checks
---------   ----------------
PDF         Valid PDF structure, parseable
Images      Valid image data, decodable
Excel       Valid ZIP-based format
CSV         Valid UTF-8, parseable
JSON        Valid JSON syntax
XML         Valid XML structure
DOCX        Valid Word format
ZIP         Archive integrity, CRC check
MP3         Valid audio frames
MP4         Valid container structure
All         File signature (magic bytes) matches extension
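
As an illustration of the JSON and ZIP rows, the sketch below uses Python's built-in `json` and `zipfile` modules. The function names and the "Invalid ZIP archive" wording are hypothetical; the "File is corrupt or unreadable" prefix follows the error format used on this page. Whether FileGuard performs these checks in exactly this way is not stated here.

```python
import io
import json
import zipfile
from typing import Optional

PREFIX = "File is corrupt or unreadable: "


def check_json(data: bytes) -> Optional[str]:
    """Return an error message if the bytes are not valid UTF-8 JSON."""
    try:
        json.loads(data.decode("utf-8"))
    except UnicodeDecodeError:
        return PREFIX + "Invalid JSON format: not valid UTF-8"
    except json.JSONDecodeError as exc:
        # exc.msg is the parser's short description of the syntax error.
        return PREFIX + f"Invalid JSON format: {exc.msg}"
    return None


def check_zip(data: bytes) -> Optional[str]:
    """Return an error message if the archive is unreadable or fails its CRC check."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            # testzip() reads every member and returns the first name whose
            # CRC or header is bad, or None when all members are intact.
            bad_member = archive.testzip()
    except zipfile.BadZipFile as exc:
        return PREFIX + f"Invalid ZIP archive: {exc}"  # hypothetical wording
    if bad_member is not None:
        return PREFIX + f"Invalid ZIP archive: CRC check failed for {bad_member}"
    return None
```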

Magic Bytes Validation

FileGuard checks that file signatures match declared extensions:
Extension       Expected Magic Bytes
---------       --------------------
PDF             %PDF
JPEG            \xFF\xD8\xFF
PNG             \x89PNG
ZIP/DOCX/XLSX   PK
MP3             ID3 or \xFF\xFB
MP4             ftyp
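
A signature check of this kind can be sketched as a lookup table of (offset, bytes) pairs. The table below mirrors the one above; the function and variable names are hypothetical. Two details are assumptions rather than documented FileGuard behavior: the `ftyp` marker is matched at byte offset 4, where it normally sits in an MP4 container, and extensions with no listed signature are allowed through.

```python
# Each extension maps to one or more acceptable (offset, signature) pairs.
SIGNATURES: dict[str, list[tuple[int, bytes]]] = {
    "pdf": [(0, b"%PDF")],
    "jpg": [(0, b"\xFF\xD8\xFF")],
    "jpeg": [(0, b"\xFF\xD8\xFF")],
    "png": [(0, b"\x89PNG")],
    "zip": [(0, b"PK")],
    "docx": [(0, b"PK")],
    "xlsx": [(0, b"PK")],
    "mp3": [(0, b"ID3"), (0, b"\xFF\xFB")],
    "mp4": [(4, b"ftyp")],  # assumed offset: after the 4-byte box size
}


def signature_matches(extension: str, data: bytes) -> bool:
    """True when the data carries one of the signatures expected for the extension."""
    expected = SIGNATURES.get(extension.lower().lstrip("."))
    if expected is None:
        return True  # assumption: no signature listed, nothing to compare
    return any(
        data[offset:offset + len(magic)] == magic
        for offset, magic in expected
    )
```

For example, a ZIP archive renamed to `report.pdf` starts with `PK` rather than `%PDF`, so `signature_matches("pdf", data)` returns false for it.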

Example Error Messages

{
  "errors": [
    "File is corrupt or unreadable: Invalid PDF file: file signature does not match"
  ]
}
{
  "errors": [
    "File is corrupt or unreadable: Invalid JSON format: Expecting property name"
  ]
}

Virus Scanning (ClamAV)

When scan_for_viruses is enabled, files are scanned using ClamAV.

How It Works

  1. The file passes all other validations.
  2. The file content is sent to the ClamAV daemon.
  3. If a threat is detected, the upload is rejected immediately.
  4. Scan results are recorded in the API call logs.
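
For readers curious what "sent to the ClamAV daemon" can look like on the wire, the sketch below uses clamd's `INSTREAM` command over TCP: the command name, then length-prefixed chunks, then a zero-length chunk. This page does not say which clamd command or transport FileGuard uses, so the choice of `INSTREAM`, the host and port defaults, and all function names are assumptions.

```python
import socket
import struct
from typing import Optional


def frame_instream(data: bytes, chunk_size: int = 8192) -> bytes:
    """Build the byte stream for clamd's null-terminated INSTREAM command."""
    parts = [b"zINSTREAM\0"]
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Each chunk is prefixed with its length as a 4-byte big-endian integer.
        parts.append(struct.pack("!I", len(chunk)) + chunk)
    parts.append(struct.pack("!I", 0))  # zero-length chunk ends the stream
    return b"".join(parts)


def parse_reply(reply: bytes) -> Optional[str]:
    """Return the signature name if clamd reported a threat, else None."""
    text = reply.decode("utf-8", "replace").strip("\0\n ")
    if text.endswith(" FOUND"):
        return text.removeprefix("stream: ").removesuffix(" FOUND")
    if text.endswith("OK"):
        return None
    # Anything else (for example a size-limit error) is treated as a failure.
    raise RuntimeError(f"Unexpected clamd reply: {text!r}")


def scan(data: bytes, host: str = "127.0.0.1", port: int = 3310) -> list[str]:
    """Send the data to a clamd instance and return a list of error messages."""
    with socket.create_connection((host, port), timeout=30) as conn:
        conn.sendall(frame_instream(data))
        reply = b""
        while not reply.endswith(b"\0"):
            part = conn.recv(4096)
            if not part:
                break
            reply += part
    threat = parse_reply(reply)
    return [f"File contains malware: {threat}"] if threat else []
```

Raising on an unexpected reply, instead of treating it as clean, keeps the sketch in line with the fail-safe behavior described at the end of this page.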

Example Error Messages

{
  "errors": [
    "File contains malware: Eicar-Test-Signature"
  ]
}
{
  "errors": [
    "File contains malware: Win.Trojan.Generic-12345"
  ]
}

Testing Virus Scanning

Use the EICAR test file, a standard test string that ClamAV and other mainstream antivirus engines detect by design:
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*
The EICAR test file is harmless but will trigger virus detection.
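
One way to produce the test file is to write the string to disk byte for byte, with no trailing newline. The helper below is a small convenience sketch; the function name and the file name in the usage note are hypothetical. Be aware that antivirus software on your own machine may quarantine the file as soon as it is written.

```python
from pathlib import Path

# Raw bytes literal so the backslash in the string is preserved as-is.
EICAR = rb"X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*"


def write_eicar(path) -> Path:
    """Write the 68-byte EICAR test string to the given path and return it."""
    target = Path(path)
    target.write_bytes(EICAR)
    return target
```

Calling `write_eicar("eicar_test.txt")` creates a file you can upload to a context with scan_for_viruses enabled to confirm that the upload is rejected.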

Supported File Types

Category       Extensions                  Library
--------       ----------                  -------
Documents      pdf, docx                   pypdf, python-docx
Images         jpg, jpeg, png, gif, webp   Pillow
Spreadsheets   xlsx, csv                   openpyxl, built-in
Data           json, xml                   Built-in
Archives       zip                         Built-in
Media          mp3, mp4                    mutagen
Text           txt                         Built-in

Disabling Validation

To disable validation for a specific context, set the corresponding flags to false:
{
  "context_key": "raw_uploads",
  "reject_blank_files": false,
  "reject_corrupt_files": false,
  "scan_for_viruses": false
}
Disabling validation reduces security. Only do this for specific use cases like encrypted files or proprietary formats.

Fail-Safe Design

FileGuard uses a fail-safe approach:
  • If validation cannot be performed (e.g., missing library), the file is rejected
  • This ensures no potentially harmful files slip through due to errors
  • Errors are logged for debugging
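
The fail-safe behavior can be pictured as a wrapper around each check that converts any failure to run the check into a rejection. This is a sketch under assumptions: the function name, the logger name, and the rejection message are hypothetical, and FileGuard's real error wording for this case is not documented on this page.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("fileguard.validation")  # hypothetical logger name


def run_check(check: Callable[[str, bytes], Optional[str]],
              filename: str, data: bytes) -> list[str]:
    """Run one validation check, rejecting the file if the check itself fails."""
    try:
        error = check(filename, data)
    except Exception:
        # Covers a missing library (ImportError) as well as unexpected crashes.
        # The traceback is logged for debugging, and the file is rejected.
        logger.exception("Validation could not be performed for %s", filename)
        return ["File could not be validated"]  # hypothetical wording
    return [error] if error else []
```

The key design choice is that an exception never results in an empty error list: a check that cannot run is treated the same as a check that failed.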