Version: 0.2.3

Google Drive Connector

The Google Drive connector indexes files from your Google Drive, including Google Docs, Sheets, Slides, and regular uploaded files.

Prerequisites

Before using the Google Drive connector:

  1. Create a Google OAuth client (shared with other Google connectors)
  2. Enable the Google Drive API for your project

Enable Drive API

  1. Go to the Drive API page in the Google Cloud Console
  2. Select your project
  3. Click Enable

Required Scope

The Google Drive connector requires this OAuth scope:

https://www.googleapis.com/auth/drive.readonly

This scope provides read-only access to files and metadata. Sercha cannot modify, create, or delete files.

Capabilities

| Capability | Supported | Notes |
| --- | --- | --- |
| Full sync | Yes | Indexes all files matching configured filters |
| Incremental sync | Yes | Uses Drive Changes API to track modifications |
| Watch mode | No | Webhook integration not available in CLI |
| Hierarchy | Yes | Parent folder relationships preserved |
| Binary content | No | Text files and Google Workspace files only |
| Validation | Yes | Verifies credentials before sync |

Content Types

The connector supports different content types:

| Content Type | Description | Export Format |
| --- | --- | --- |
| files | Regular uploaded files (text only) | Original format |
| docs | Google Docs | Plain text |
| sheets | Google Sheets | CSV |

By default, all content types are enabled.

Google Workspace File Export

Google Workspace files (Docs, Sheets, Slides) are stored in Google's proprietary format and must be exported:

| Google Type | Exported As | MIME Type |
| --- | --- | --- |
| Google Docs | Plain text | text/plain |
| Google Sheets | CSV | text/csv |
| Google Slides | Plain text | text/plain |

The exported content is then processed by Sercha's normalizers for indexing.
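
The export table above can be read as a simple lookup. The sketch below is illustrative only: the `application/vnd.google-apps.*` identifiers are the MIME types Google Drive reports for native Workspace files, while `EXPORT_MIME_TYPES` and `export_mime_for` are hypothetical names and not Sercha's internals.

```python
from typing import Optional

# MIME types Google Drive reports for native Workspace files,
# mapped to the export formats listed in the table above.
EXPORT_MIME_TYPES = {
    "application/vnd.google-apps.document": "text/plain",
    "application/vnd.google-apps.spreadsheet": "text/csv",
    "application/vnd.google-apps.presentation": "text/plain",
}


def export_mime_for(drive_mime_type: str) -> Optional[str]:
    """Return the export MIME type for a Workspace file, or None for a regular file."""
    return EXPORT_MIME_TYPES.get(drive_mime_type)
```

A `None` result means the file is not a Workspace document in this sketch, so it would be downloaded in its original format instead of exported.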

Configuration

These options control what gets indexed during sync. They filter what Sercha stores locally, not what you can search for within your indexed documents.

| Option | Description | Default |
| --- | --- | --- |
| content_types | Comma-separated content types to sync | files,docs,sheets |
| mime_types | Sync only files matching these MIME types | None (all supported) |
| folder_ids | Sync only files within these folder IDs | All folders |
| max_results | Page size for API requests | 100 |

Content Types Filter

Limit which content types are indexed during sync:

# Only sync Google Docs
--config "content_types=docs"

# Sync Google Docs and Sheets
--config "content_types=docs,sheets"

# Only sync regular files (no Google Workspace files)
--config "content_types=files"
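
To make the `key=value` shape of these options concrete, here is a minimal sketch of how such a string could be split into a key and a list of values. `parse_config_option` is a hypothetical helper written for illustration; how Sercha actually parses `--config` is not described here.

```python
from typing import List, Tuple


def parse_config_option(option: str) -> Tuple[str, List[str]]:
    """Split a 'key=a,b,c' string into its key and a list of trimmed values."""
    key, sep, raw = option.partition("=")
    if not sep or not key.strip():
        raise ValueError(f"expected key=value, got {option!r}")
    # Drop empty entries so a trailing comma does not produce a blank value.
    values = [v.strip() for v in raw.split(",") if v.strip()]
    return key.strip(), values
```

For example, `parse_config_option("content_types=docs,sheets")` yields the key `content_types` with the values `docs` and `sheets`.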

MIME Types Filter

Limit sync to specific MIME types:

# Only sync Markdown files
--config "mime_types=text/markdown"

# Sync multiple types
--config "mime_types=text/plain,application/json"

Folder Filter

Limit sync to specific folders using their IDs:

--config "folder_ids=1a2b3c4d5e6f7g8h"

To find a folder ID, open the folder in Google Drive and copy the ID from the URL: https://drive.google.com/drive/folders/{folder_id}
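
If you collect folder IDs from many URLs, the copy step can be scripted. The following is a small sketch using only the Python standard library; `folder_id_from_url` is a hypothetical helper, and it assumes the ID is the path segment that directly follows `folders`.

```python
from typing import Optional
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> Optional[str]:
    """Return the path segment that follows 'folders' in a Drive URL, if any."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "folders" in segments:
        index = segments.index("folders")
        if index + 1 < len(segments):
            return segments[index + 1]
    return None
```

Because only the URL path is inspected, query parameters on a copied link do not end up in the ID.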

Document Structure

URI Pattern

Files are identified by URIs:

gdrive://files/{file_id}

Example: gdrive://files/1a2b3c4d5e6f7g8h9i0j
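
The pattern is simple enough to build and parse by hand. The helpers below are a sketch for illustration; `build_uri` and `file_id_from_uri` are hypothetical names, not part of Sercha.

```python
URI_PREFIX = "gdrive://files/"


def build_uri(file_id: str) -> str:
    """Build a document URI from a Google Drive file ID."""
    return URI_PREFIX + file_id


def file_id_from_uri(uri: str) -> str:
    """Extract the file ID from a gdrive:// URI, rejecting anything else."""
    if not uri.startswith(URI_PREFIX) or len(uri) == len(URI_PREFIX):
        raise ValueError(f"not a gdrive file URI: {uri!r}")
    return uri[len(URI_PREFIX):]
```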

MIME Types

The connector assigns MIME types based on content:

| Content | MIME Type |
| --- | --- |
| Google Docs | text/plain |
| Google Sheets | text/csv |
| Google Slides | text/plain |
| Regular files | Original MIME type |

Metadata

Each file includes:

| Field | Description |
| --- | --- |
| file_id | Google Drive file ID |
| title | File name |
| path | Path including parent folder |
| size | File size in bytes |
| web_link | Link to view in Google Drive |
| modified_time | Last modification timestamp |

Sync Behaviour

Full Sync

Full sync retrieves all files matching the configured filters:

  1. Lists all files accessible to the user
  2. Filters by content type and MIME type
  3. Downloads or exports file content
  4. Stores the start page token for incremental sync
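
Step 2 can be sketched as a predicate over a file's Drive MIME type. This is an illustration under stated assumptions, not the connector's code: `content_type_of` and `passes_filters` are hypothetical names, an empty `mime_types` set is assumed to mean "no MIME filter", and the MIME filter is assumed to apply only to regular files.

```python
from typing import Optional, Set

WORKSPACE_PREFIX = "application/vnd.google-apps."


def content_type_of(drive_mime_type: str) -> Optional[str]:
    """Map a Drive MIME type to one of the content types: files, docs, sheets."""
    if drive_mime_type == WORKSPACE_PREFIX + "document":
        return "docs"
    if drive_mime_type == WORKSPACE_PREFIX + "spreadsheet":
        return "sheets"
    if drive_mime_type.startswith(WORKSPACE_PREFIX):
        # The content types table lists only files, docs and sheets, so other
        # Workspace types (Slides, folders, ...) stay unclassified in this sketch.
        return None
    return "files"


def passes_filters(drive_mime_type: str, content_types: Set[str], mime_types: Set[str]) -> bool:
    """Return True if a file would be kept by the content type and MIME type filters."""
    kind = content_type_of(drive_mime_type)
    if kind is None or kind not in content_types:
        return False
    # Assumption: the MIME filter is only checked for regular files.
    if kind == "files" and mime_types and drive_mime_type not in mime_types:
        return False
    return True
```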

Incremental Sync

Incremental sync uses Google Drive's Changes API:

  1. Fetches changes since the stored page token
  2. Processes file additions, modifications, and deletions
  3. Updates the page token cursor
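
The loop behind these steps can be sketched as follows. The field names (`changes`, `fileId`, `removed`, `file`, `nextPageToken`, `newStartPageToken`) follow the response shape of the Drive v3 `changes.list` method; `fetch_page` is a hypothetical stand-in for the API call, and `index` is a plain dictionary standing in for the local store.

```python
from typing import Callable, Dict


def apply_changes(fetch_page: Callable[[str], Dict], page_token: str, index: Dict[str, Dict]) -> str:
    """Apply all pending changes to `index` and return the token to store for next time."""
    while True:
        page = fetch_page(page_token)
        for change in page.get("changes", []):
            file_id = change["fileId"]
            file = change.get("file") or {}
            if change.get("removed") or file.get("trashed"):
                # Deleted or trashed: drop it from the local index.
                index.pop(file_id, None)
            else:
                # Added or modified: store the latest version.
                index[file_id] = file
        if "nextPageToken" in page:
            # More pages of changes remain for this sync.
            page_token = page["nextPageToken"]
        else:
            # Last page: this token marks where the next sync starts.
            return page.get("newStartPageToken", page_token)
```

Treating trashed files as deletions matches the note under Limitations that trashed files are excluded from sync.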

File Size Limits

| Limit | Value |
| --- | --- |
| Maximum file size | 5 MB |
| Maximum export size | 5 MB |

Files exceeding these limits are indexed with metadata only; their content is skipped.

Rate Limiting

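The connector throttles its own requests to stay within Google Drive's per-user quota:

| Setting | Value |
| --- | --- |
| Requests per second | 8 |
| Burst size | 10 |

These limits apply per user. Google allows up to 10 requests per second per user, so the sustained rate of 8 leaves some headroom.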
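
A rate and a burst size are the two parameters of a token bucket, so the settings above can be illustrated with one. This is a generic sketch of that technique, assuming a token bucket is a fair model of the limiter; the source does not say which algorithm the connector uses, and `TokenBucket` is a hypothetical class.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow `burst` immediate requests, then refill at `rate` tokens per second."""

    def __init__(self, rate: float = 8.0, burst: int = 10,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False if the caller should wait."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The `clock` parameter exists only so the behaviour can be checked with a fake clock; by default the bucket uses `time.monotonic`.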

Error Handling

| Error | Handling |
| --- | --- |
| Rate limit (429) | Wait and retry with backoff |
| File not found | Skip and continue |
| Export failed | Skip file, log warning |
| Authentication failure | Report error, stop sync |
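
"Wait and retry with backoff" is commonly implemented as exponential backoff with jitter. The sketch below shows that pattern under assumptions: `call_with_backoff`, the attempt count, and the delay values are illustrative choices, not the connector's actual settings, and `request` is a hypothetical callable returning a status code and a body.

```python
import random
import time
from typing import Any, Callable, Tuple


def call_with_backoff(request: Callable[[], Tuple[int, Any]],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, Any]:
    """Call `request`, retrying on HTTP 429 with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        status, body = request()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles each attempt; jitter spreads out simultaneous retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    # Still rate limited after the final attempt: hand the 429 back to the caller.
    return status, body
```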

Supported File Types

The connector indexes text-based files:

| Category | Examples |
| --- | --- |
| Documents | .txt, .md, .rtf |
| Code | .go, .py, .js, .ts, .java |
| Data | .json, .xml, .yaml, .csv |
| Config | .env, .ini, .toml |
| Google Workspace | Docs, Sheets, Slides |

Binary files (images, PDFs, videos) are skipped.

Limitations

| Limitation | Description |
| --- | --- |
| Binary files | Not indexed (text content only) |
| File size | Maximum 5 MB per file |
| Shared drives | Accessible if user has permission |
| Watch mode | Not supported in CLI |
| Trashed files | Excluded from sync |

Example Usage

Create a Google Drive source with default settings:

sercha source add \
--type google-drive \
--name "My Drive" \
--auth "My Google Account"

Create a source for Google Docs only:

sercha source add \
--type google-drive \
--name "Google Docs" \
--auth "My Google Account" \
--config "content_types=docs"

Create a source for a specific folder:

sercha source add \
--type google-drive \
--name "Project Folder" \
--auth "My Google Account" \
--config "folder_ids=1a2b3c4d5e6f7g8h9i0j"

Sync the source:

sercha sync <source-id>
