Version: 0.2.3

OneDrive Connector

The OneDrive connector indexes files stored in your Microsoft OneDrive, including text files, PDFs, and other documents.

Prerequisites

Before using the OneDrive connector:

  1. Create a Microsoft app registration (shared with other Microsoft connectors)
  2. Add the Files.Read API permission to your app registration

Required Permission

The OneDrive connector requires this Microsoft Graph permission:

Files.Read

This permission provides read-only access to files. Sercha cannot modify, create, or delete files.

Capabilities

| Capability | Supported | Notes |
| --- | --- | --- |
| Full sync | Yes | Indexes all files matching configured filters |
| Incremental sync | Yes | Uses OneDrive Delta API to track modifications |
| Watch mode | No | Webhook integration not available in CLI |
| Hierarchy | Yes | Parent folder relationships preserved |
| Binary content | Partial | Text files and PDFs; images and videos are indexed as metadata only |
| Validation | Yes | Verifies credentials before sync |

What Gets Indexed

The connector indexes:

  • Text files (.txt, .md, .json, .xml, etc.)
  • PDF documents
  • Code files (.go, .py, .js, .ts, etc.)
  • Configuration files (.yaml, .toml, .ini)
  • File metadata (name, path, size, modification time)

File Types Support

| Category | Examples | Content Indexed |
| --- | --- | --- |
| Text | .txt, .md, .rtf | Full content |
| Code | .go, .py, .js, .ts, .java | Full content |
| Data | .json, .xml, .yaml, .csv | Full content |
| Documents | .pdf | Extracted text |
| Images | .png, .jpg, .gif | Metadata only |
| Media | .mp4, .mp3, .wav | Metadata only |
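
The table above can be read as a lookup from file extension to indexing mode. The following Python sketch illustrates that reading; it is not Sercha's implementation. The function name `content_mode` and the mode strings are hypothetical, and treating every unlisted extension as metadata only is an assumption.

```python
from pathlib import PurePosixPath

# Extension groups copied from the table above; the mode names are illustrative.
FULL_CONTENT = {".txt", ".md", ".rtf", ".go", ".py", ".js", ".ts", ".java",
                ".json", ".xml", ".yaml", ".csv"}
EXTRACTED_TEXT = {".pdf"}


def content_mode(filename: str) -> str:
    """Return how a file's content would be indexed, based on its extension."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in FULL_CONTENT:
        return "full"
    if ext in EXTRACTED_TEXT:
        return "extracted_text"
    # Assumption: anything else (images, media, unknown types) is metadata only.
    return "metadata_only"
```

For example, `content_mode("report.pdf")` yields `"extracted_text"`, while `content_mode("clip.mp4")` yields `"metadata_only"`.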

Configuration

These options control what gets indexed during sync.

| Option | Description | Default |
| --- | --- | --- |
| folder_ids | Comma-separated folder IDs to limit sync scope | All folders |
| mime_types | Sync only files matching these MIME types | None (all supported) |
| max_results | Page size for API requests | 100 |
| include_shared | Include files shared with you | false |

Folder Filter

Limit sync to specific folders using their IDs:

--config "folder_ids=ABC123DEF456"

To find a folder ID:

  1. Open OneDrive in a browser
  2. Navigate to the folder
  3. Copy the ID from the URL: https://onedrive.live.com/?id={folder_id}
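
If you would rather extract the ID programmatically, the step above amounts to reading the `id` query parameter. A minimal Python sketch, assuming the URL has exactly the shape shown above (the helper name `folder_id_from_url` is hypothetical):

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def folder_id_from_url(url: str) -> Optional[str]:
    """Return the value of the `id` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("id")
    return values[0] if values else None
```

URLs that do not follow the pattern above, for example ones that omit the `id` parameter, return `None` here.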

MIME Types Filter

Limit sync to specific file types:

# Only sync PDF files
--config "mime_types=application/pdf"

# Sync multiple types
--config "mime_types=text/plain,application/json,text/markdown"
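
To make the filter semantics concrete, here is a hedged Python sketch of how a comma-separated `mime_types` value could be parsed and applied. The helper names are hypothetical, and exact, case-insensitive matching is an assumption rather than documented behaviour.

```python
def parse_mime_types(config_value: str) -> set:
    """Split a comma-separated config value into a set of MIME types."""
    return {part.strip().lower() for part in config_value.split(",") if part.strip()}


def mime_allowed(mime_type: str, allowed: set) -> bool:
    """An empty filter means no restriction (the documented default)."""
    return not allowed or mime_type.lower() in allowed
```

With the second command above, `mime_allowed("text/plain", ...)` would be true and `mime_allowed("application/pdf", ...)` would be false under these assumptions.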

Include Shared Files

Include files shared with you by others:

--config "include_shared=true"

Document Structure

URI Pattern

Files are identified by URIs:

onedrive://files/{item_id}

Example: onedrive://files/ABC123DEF456

Parent Relationships

Files include a ParentURI linking to their parent folder:

onedrive://folders/{parent_id}
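
The two URI patterns can be built and parsed with a few lines of Python. This is an illustrative sketch only; the function names are hypothetical and the validation rules are an assumption.

```python
PREFIX = "onedrive://"


def file_uri(item_id: str) -> str:
    return f"{PREFIX}files/{item_id}"


def folder_uri(parent_id: str) -> str:
    return f"{PREFIX}folders/{parent_id}"


def parse_uri(uri: str) -> tuple:
    """Split a connector URI into (kind, id), where kind is 'files' or 'folders'."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a OneDrive URI: {uri!r}")
    kind, _, ident = uri[len(PREFIX):].partition("/")
    if kind not in ("files", "folders") or not ident:
        raise ValueError(f"malformed OneDrive URI: {uri!r}")
    return kind, ident
```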

MIME Types

The connector uses the file's actual MIME type as reported by OneDrive.

Metadata

Each file includes:

| Field | Description |
| --- | --- |
| file_id | OneDrive item ID |
| title | File name |
| path | Full path including parent folders |
| size | File size in bytes |
| web_link | Link to view in OneDrive |
| modified_time | Last modification timestamp |
| created_time | Creation timestamp |
| parent_id | Parent folder ID |
| drive_id | OneDrive drive ID |
| drive_type | Drive type (personal, business, documentLibrary) |
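
As a rough illustration, the fields above line up with properties of a Microsoft Graph `driveItem` JSON object. The mapping below is a sketch under that assumption, not the connector's actual code; in particular, deriving `path` from `parentReference.path` plus the file name is a guess at how the full path is assembled.

```python
def to_metadata(item: dict) -> dict:
    """Map a Graph driveItem (as a dict) onto the metadata fields listed above."""
    parent = item.get("parentReference", {})
    # parentReference.path looks like "/drive/root:/Documents";
    # keep only the part after "root:" (assumption about path assembly).
    parent_path = parent.get("path", "").partition("root:")[2]
    return {
        "file_id": item["id"],
        "title": item["name"],
        "path": f"{parent_path}/{item['name']}",
        "size": item.get("size"),
        "web_link": item.get("webUrl"),
        "modified_time": item.get("lastModifiedDateTime"),
        "created_time": item.get("createdDateTime"),
        "parent_id": parent.get("id"),
        "drive_id": parent.get("driveId"),
        "drive_type": parent.get("driveType"),
    }
```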

Sync Behaviour

Full Sync

Full sync uses OneDrive's Delta API to retrieve all files:

  1. Fetches all items from root (or specified folders)
  2. Filters by MIME type if configured
  3. Downloads content for supported file types
  4. Stores the delta link for incremental sync
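
Steps 1, 2, and 4 can be sketched as a paging loop. Microsoft Graph delta responses return items under `value`, link further pages with `@odata.nextLink`, and put `@odata.deltaLink` on the final page. The Python below is a simplified illustration, not Sercha's code: `collect_delta` and the injected `fetch_page` callable are hypothetical, and content download (step 3) is left out.

```python
from typing import Callable, Optional


def collect_delta(start_url: str,
                  fetch_page: Callable[[str], dict],
                  allowed_mime_types: Optional[set] = None):
    """Follow nextLink pages until a deltaLink appears; return (files, delta_link)."""
    files = []
    url = start_url
    while True:
        page = fetch_page(url)
        for item in page.get("value", []):
            file_facet = item.get("file")
            if file_facet is None:
                continue  # folders and other non-file items carry no content
            if allowed_mime_types and file_facet.get("mimeType") not in allowed_mime_types:
                continue
            files.append(item)
        next_link = page.get("@odata.nextLink")
        if next_link is None:
            # The last page carries the delta link to store for incremental sync.
            return files, page.get("@odata.deltaLink")
        url = next_link
```

Passing `fetch_page` in as a parameter keeps the loop independent of any particular HTTP client.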

Incremental Sync

Incremental sync continues from the stored delta link:

  1. Fetches changes since the stored delta token
  2. Processes file additions, modifications, and deletions
  3. Updates the delta link cursor
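
Step 2 can be pictured as applying a change list to a local index. In Microsoft Graph delta responses, removed items carry a `deleted` facet; everything else is an addition or a modification. A minimal sketch, where the in-memory `index` dict and the name `apply_changes` are hypothetical stand-ins for the real index:

```python
def apply_changes(index: dict, changes: list) -> None:
    """Apply delta items to an index keyed by item ID, in place."""
    for item in changes:
        if "deleted" in item:
            index.pop(item["id"], None)   # deletion: drop the item if present
        else:
            index[item["id"]] = item      # addition or modification: upsert
```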

Delta Token Expiration

If the delta token expires, Microsoft Graph responds with HTTP 410 Gone. In that case:

  1. The connector detects the expiration
  2. The connector returns an error indicating that a full sync is required
  3. You run a full sync to re-establish the delta token
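
The detection step could look roughly like the Python below: a 410 status is turned into a dedicated error so the caller knows a full sync is needed. The exception name `DeltaExpiredError` and the function `check_delta_status` are hypothetical; the actual error surfaced by Sercha may differ.

```python
class DeltaExpiredError(Exception):
    """Raised when the stored delta token is no longer valid."""


def check_delta_status(status_code: int) -> None:
    """Raise DeltaExpiredError on HTTP 410 Gone; do nothing otherwise."""
    if status_code == 410:
        raise DeltaExpiredError("delta token expired; run a full sync")
```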

File Size Limits

| Limit | Value |
| --- | --- |
| Maximum file size | 5 MB |

For files exceeding this limit, only metadata is indexed; their content is not downloaded.
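
Expressed as a check, the limit might look like the following sketch. Whether "5 MB" means 5 × 1024 × 1024 bytes or 5,000,000 bytes is not stated, so the constant below is an assumption, as is the name `should_download_content`.

```python
# Assumption: 1 MB = 1024 * 1024 bytes; the exact definition is not documented here.
MAX_CONTENT_BYTES = 5 * 1024 * 1024


def should_download_content(size_bytes: int) -> bool:
    """Files at or under the limit get content indexed; larger ones are metadata only."""
    return size_bytes <= MAX_CONTENT_BYTES
```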

Rate Limiting

Microsoft Graph throttles clients that send too many requests. To stay under those limits, the connector rate-limits its own requests:

| Setting | Value |
| --- | --- |
| Requests per second | 5 |
| Burst size | 10 |

When throttled (HTTP 429), the connector waits and retries with exponential backoff.
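
A rate of 5 requests per second with a burst of 10 is commonly implemented as a token bucket, and exponential backoff as a delay that doubles per attempt. The Python sketch below shows both under those assumptions; it is not the connector's implementation, and the base delay and cap in `backoff_delay` are illustrative values not taken from this page.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate: float = 5.0, burst: int = 10, clock=time.monotonic):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller must wait."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay in seconds before retry number `attempt` (0-based), doubling each time."""
    return min(cap, base * 2 ** attempt)
```

Injecting `clock` makes the bucket testable without real waiting. Microsoft Graph usually includes a `Retry-After` header on 429 responses, so a real client would typically prefer that value over a computed delay when it is present.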

Error Handling

| Error | Handling |
| --- | --- |
| Rate limit (429) | Wait and retry with backoff |
| Delta expired (410) | Report error; a full sync is required |
| File not found | Skip and continue |
| Download failed | Skip file content, continue |
| Authentication failure | Report error, stop sync |

Limitations

| Limitation | Description |
| --- | --- |
| Binary files | Image and video content is not indexed (metadata only) |
| File size | Content is indexed only for files up to 5 MB |
| SharePoint | SharePoint document libraries not yet supported |
| Watch mode | Not supported in CLI |
| Trashed files | Excluded from sync |

Example Usage

Create a OneDrive source with default settings:

sercha source add \
--type onedrive \
--name "My OneDrive" \
--auth "My Microsoft Account"

Create a source for a specific folder:

sercha source add \
--type onedrive \
--name "Work Documents" \
--auth "My Microsoft Account" \
--config "folder_ids=ABC123DEF456"

Create a source for PDFs only:

sercha source add \
--type onedrive \
--name "PDF Documents" \
--auth "My Microsoft Account" \
--config "mime_types=application/pdf"

Include shared files:

sercha source add \
--type onedrive \
--name "All Files" \
--auth "My Microsoft Account" \
--config "include_shared=true"

Sync the source:

sercha sync <source-id>
