Version: 0.2.3

Gmail Connector

The Gmail connector indexes emails from your Gmail account. It retrieves full RFC 2822 formatted messages including headers, body content, and metadata.

Prerequisites

Before using the Gmail connector:

  1. Create a Google OAuth client (shared with other Google connectors)
  2. Enable the Gmail API for your project

Enable Gmail API

  1. Go to Gmail API in Google Cloud Console
  2. Select your project (if prompted)
  3. Click Enable

Required Scope

The Gmail connector requires this OAuth scope:

https://www.googleapis.com/auth/gmail.readonly

This scope provides read-only access to emails, labels, and threads. Sercha cannot modify, send, or delete emails.

Capabilities

| Capability | Supported | Notes |
| --- | --- | --- |
| Full sync | Yes | Indexes all emails matching configured labels |
| Incremental sync | Yes | Uses Gmail History API to track changes |
| Watch mode | No | Webhook integration not available in CLI |
| Hierarchy | Yes | Threads are linked via parent URI |
| Binary content | No | Email attachments not indexed |
| Validation | Yes | Verifies credentials before sync |

What Gets Indexed

The connector indexes:

  • Email headers (From, To, Subject, Date)
  • Full email body (plain text and HTML)
  • Labels assigned to each message
  • Thread relationships
  • Message metadata (internal date, history ID)

Configuration

These options control what gets indexed during sync. They filter what Sercha stores locally, not what you can search for within your indexed documents.

| Option | Description | Default |
| --- | --- | --- |
| label_ids | Comma-separated label IDs to filter | INBOX |
| query | Gmail search syntax to filter which emails are indexed | None |
| max_results | Page size for API requests | 100 |
| include_spam_trash | Include spam and trash | false |

Label IDs

Gmail uses label IDs (not names) for filtering. Common system labels:

| Label | ID |
| --- | --- |
| Inbox | INBOX |
| Sent | SENT |
| Drafts | DRAFT |
| Starred | STARRED |
| Important | IMPORTANT |
| All Mail | (no filter) |

To sync all mail, set label_ids to an empty string.
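
As a rough illustration of how a comma-separated label_ids value could be interpreted, here is a minimal sketch. The function name parse_label_ids is hypothetical and not part of Sercha. Treating an empty value as "no label filter" follows the rule stated above.

```python
def parse_label_ids(raw: str) -> list[str]:
    """Split a comma-separated label_ids value into a list of label IDs.

    An empty (or whitespace-only) value yields an empty list, which this
    sketch treats as "no label filter", i.e. all mail.
    """
    return [part.strip() for part in raw.split(",") if part.strip()]


print(parse_label_ids("INBOX"))           # ['INBOX']
print(parse_label_ids("INBOX, STARRED"))  # ['INBOX', 'STARRED']
print(parse_label_ids(""))                # []
```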

Sync Filter

The query option accepts Gmail search syntax and restricts which emails are indexed during sync. It limits what Sercha stores locally and has no effect on searches within your indexed documents.

| Example | Description |
| --- | --- |
| from:alice@example.com | Only index emails from Alice |
| subject:meeting | Only index emails with "meeting" in the subject |
| after:2024/01/01 | Only index emails sent after January 1, 2024 |
| has:attachment | Only index emails with attachments |
| is:unread | Only index unread emails |

Filters can be combined: from:alice@example.com after:2024/01/01
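
To make the relationship between these options and the Gmail API more concrete, the sketch below maps a configuration onto the query parameters of the Gmail messages.list endpoint (labelIds, q, maxResults, includeSpamTrash). The function build_list_params is hypothetical, and whether Sercha performs exactly this mapping internally is an assumption.

```python
def build_list_params(config: dict[str, str]) -> dict[str, object]:
    """Translate connector options into Gmail messages.list parameters.

    Hypothetical helper: the option names and defaults come from the
    configuration table; the mapping itself is an assumption.
    """
    raw_labels = config.get("label_ids", "INBOX")
    label_ids = [part.strip() for part in raw_labels.split(",") if part.strip()]

    params: dict[str, object] = {
        "maxResults": int(config.get("max_results", "100")),
        "includeSpamTrash": config.get("include_spam_trash", "false").lower() == "true",
    }
    if label_ids:
        # An empty label list means "no label filter", so the key is omitted.
        params["labelIds"] = label_ids
    if config.get("query"):
        # Combined filters are passed through as a single search string.
        params["q"] = config["query"]
    return params


print(build_list_params({}))
print(build_list_params({
    "label_ids": "",
    "query": "from:alice@example.com after:2024/01/01",
}))
```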

Document Structure

URI Pattern

Emails are identified by URIs:

gmail://messages/{message_id}

Example: gmail://messages/18e5a7c8b9d4f123

Thread Relationships

Messages in the same thread are linked:

  • Each message has a thread_id in metadata
  • Messages reference their thread via ParentURI: gmail://threads/{thread_id}
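
The two URI shapes above can be built and parsed with a few lines of code. This is only a sketch: the helper names message_uri, thread_uri, and parse_gmail_uri are hypothetical, and only the gmail://messages/{message_id} and gmail://threads/{thread_id} patterns come from this page.

```python
from urllib.parse import urlsplit


def message_uri(message_id: str) -> str:
    """Build the document URI for a message."""
    return f"gmail://messages/{message_id}"


def thread_uri(thread_id: str) -> str:
    """Build the parent URI shared by all messages in a thread."""
    return f"gmail://threads/{thread_id}"


def parse_gmail_uri(uri: str) -> tuple[str, str]:
    """Return (kind, id), where kind is 'messages' or 'threads'."""
    parts = urlsplit(uri)
    resource_id = parts.path.lstrip("/")
    if parts.scheme != "gmail" or parts.netloc not in ("messages", "threads") or not resource_id:
        raise ValueError(f"not a Gmail document URI: {uri!r}")
    return parts.netloc, resource_id


print(parse_gmail_uri("gmail://messages/18e5a7c8b9d4f123"))
# ('messages', '18e5a7c8b9d4f123')
```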

MIME Type

All Gmail documents use:

message/rfc822

The content is the full RFC 2822 formatted email, including headers and body.
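
Because the content is a complete RFC 2822 message, standard mail libraries can read it. The sketch below uses Python's email package. The Gmail API delivers raw messages base64url-encoded, so the example decodes that form first; whether Sercha stores the decoded bytes or the encoded string is an assumption, and parse_raw_message is a hypothetical helper.

```python
import base64
from email import message_from_bytes, policy
from email.message import EmailMessage


def parse_raw_message(raw_b64url: str) -> EmailMessage:
    """Decode a base64url-encoded RFC 2822 message and parse it."""
    # Restore any '=' padding that was stripped from the encoded string.
    padded = raw_b64url + "=" * (-len(raw_b64url) % 4)
    return message_from_bytes(base64.urlsafe_b64decode(padded), policy=policy.default)


# Illustrative message, not real data.
sample = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: Meeting\r\n"
    b"\r\n"
    b"See you at 10.\r\n"
)
encoded = base64.urlsafe_b64encode(sample).decode("ascii").rstrip("=")

msg = parse_raw_message(encoded)
print(msg["From"], "|", msg["Subject"])
print(msg.get_body(preferencelist=("plain",)).get_content().strip())
```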

Metadata

Each email includes:

| Field | Description |
| --- | --- |
| message_id | Gmail message ID |
| thread_id | Thread ID for grouping |
| labels | Array of label IDs |
| snippet | Short preview of content |
| history_id | History ID for incremental sync |
| internal_date | Unix timestamp (milliseconds) |
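
For readers consuming this metadata in their own tooling, the fields can be modelled roughly as follows. GmailMetadata is a hypothetical container, the field types are assumptions, and the sample values are made up. The one firm detail is that internal_date is in milliseconds, so it must be divided by 1000 before conversion.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class GmailMetadata:
    """Hypothetical container mirroring the metadata fields listed above."""

    message_id: str
    thread_id: str
    labels: list[str]
    snippet: str
    history_id: str
    internal_date: int  # Unix timestamp in milliseconds

    def internal_datetime(self) -> datetime:
        """Convert internal_date (milliseconds) to a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.internal_date / 1000, tz=timezone.utc)


meta = GmailMetadata(
    message_id="18e5a7c8b9d4f123",
    thread_id="18e5a7c8b9d4f000",  # illustrative value
    labels=["INBOX", "STARRED"],
    snippet="See you at 10.",
    history_id="1000",
    internal_date=1704067200000,
)
print(meta.internal_datetime().isoformat())  # 2024-01-01T00:00:00+00:00
```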

Sync Behaviour

Full Sync

Full sync retrieves all messages matching the configured labels:

  1. Lists all message IDs matching the label filter
  2. Fetches each message in raw RFC 2822 format
  3. Stores the current history ID for incremental sync
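
The three steps above can be sketched as a small loop. This is not Sercha's implementation: list_page, fetch_raw, and get_history_id are hypothetical callables standing in for Gmail API calls, and the in-memory pages are fake data. The nextPageToken field does mirror how the Gmail messages.list response signals further pages.

```python
from typing import Callable, Iterator, Optional


def iter_message_ids(list_page: Callable[[Optional[str]], dict]) -> Iterator[str]:
    """Step 1: walk every result page and yield message IDs."""
    token: Optional[str] = None
    while True:
        page = list_page(token)
        for message in page.get("messages", []):
            yield message["id"]
        token = page.get("nextPageToken")
        if not token:
            break


def full_sync(
    list_page: Callable[[Optional[str]], dict],
    fetch_raw: Callable[[str], str],
    get_history_id: Callable[[], str],
) -> tuple[dict[str, str], str]:
    """Steps 2 and 3: fetch each raw message, then record the history cursor."""
    documents = {mid: fetch_raw(mid) for mid in iter_message_ids(list_page)}
    return documents, get_history_id()


# Fake two-page listing used only to exercise the loop.
pages = {
    None: {"messages": [{"id": "a"}, {"id": "b"}], "nextPageToken": "p2"},
    "p2": {"messages": [{"id": "c"}]},
}
docs, cursor = full_sync(lambda token: pages[token], lambda mid: f"raw-{mid}", lambda: "1000")
print(sorted(docs), cursor)
```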

Incremental Sync

Incremental sync uses Gmail's History API:

  1. Fetches changes since the stored history ID
  2. Processes message additions, deletions, and label changes
  3. Updates the history ID cursor

If the history ID expires (typically after 7 days without sync), a full sync is required.
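
As an illustration of step 2, the sketch below applies one Gmail history.list response to a local index that maps message IDs to label sets. The record keys (messagesAdded, messagesDeleted, labelsAdded, labelsRemoved, historyId) follow the Gmail API response shape. The index structure and the apply_history function are hypothetical and much simpler than a real document store.

```python
def apply_history(index: dict[str, set[str]], response: dict) -> str:
    """Apply history records to the index and return the new cursor."""
    for record in response.get("history", []):
        for added in record.get("messagesAdded", []):
            message = added["message"]
            index[message["id"]] = set(message.get("labelIds", []))
        for deleted in record.get("messagesDeleted", []):
            index.pop(deleted["message"]["id"], None)
        for change in record.get("labelsAdded", []):
            mid = change["message"]["id"]
            if mid in index:
                index[mid] |= set(change.get("labelIds", []))
        for change in record.get("labelsRemoved", []):
            mid = change["message"]["id"]
            if mid in index:
                index[mid] -= set(change.get("labelIds", []))
    # The response carries the history ID to store as the next cursor.
    return response["historyId"]


index = {"m1": {"INBOX", "UNREAD"}}
response = {
    "history": [
        {"messagesAdded": [{"message": {"id": "m2", "labelIds": ["INBOX"]}}]},
        {"labelsRemoved": [{"message": {"id": "m1"}, "labelIds": ["UNREAD"]}]},
        {"messagesDeleted": [{"message": {"id": "m2"}}]},
    ],
    "historyId": "2000",
}
cursor = apply_history(index, response)
print(index, cursor)
```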

Rate Limiting

Gmail has conservative quota limits, so the connector throttles its requests to:

| Setting | Value |
| --- | --- |
| Requests per second | 2 |
| Burst size | 5 |

These limits are per user, not per project. Large mailboxes may take time to sync initially.
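
A rate paired with a burst size is the usual parameterisation of a token bucket, so the sketch below shows how 2 requests per second with a burst of 5 behaves under that model. That the connector uses a token bucket is an assumption. The TokenBucket class is hypothetical, and its clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from typing import Callable


class TokenBucket:
    """Minimal token bucket: refills at `rate` tokens/second up to `burst`."""

    def __init__(self, rate: float = 2.0, burst: int = 5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full, so an initial burst is allowed
        self.clock = clock
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller must wait."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


now = [0.0]  # fake clock, advanced manually
bucket = TokenBucket(clock=lambda: now[0])
first_six = [bucket.try_acquire() for _ in range(6)]
print(first_six)  # five requests pass immediately, the sixth is refused

now[0] = 0.5  # half a second later, one token has been refilled
after_wait = [bucket.try_acquire(), bucket.try_acquire()]
print(after_wait)
```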

Error Handling

| Error | Handling |
| --- | --- |
| Rate limit (429) | Wait and retry with backoff |
| History expired (404) | Trigger full resync |
| Authentication failure | Report error, stop sync |
| Permission denied | Report error, stop sync |
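
The table above amounts to a small decision function. The sketch below is one way to express it. The 429 and 404 codes come from the table, while mapping authentication failure to 401 and permission denied to 403 is an assumption. The exponential backoff schedule (1 second base, 60 second cap) is also illustrative, since this page does not specify the backoff policy.

```python
from enum import Enum


class Action(Enum):
    RETRY = "wait and retry with backoff"
    FULL_RESYNC = "trigger full resync"
    STOP = "report error, stop sync"


def action_for_status(status: int, during_history_fetch: bool = False) -> Action:
    """Map an HTTP status to the handling described in the table."""
    if status == 429:
        return Action.RETRY
    if status == 404 and during_history_fetch:
        return Action.FULL_RESYNC
    if status in (401, 403):  # assumed codes for auth failure / permission denied
        return Action.STOP
    raise ValueError(f"status {status} is not covered by the table")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff in seconds (assumed schedule, not Sercha's)."""
    return min(cap, base * 2 ** attempt)


print(action_for_status(429).value, [backoff_delay(n) for n in range(4)])
print(action_for_status(404, during_history_fetch=True).value)
```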

Limitations

| Limitation | Description |
| --- | --- |
| Attachments | Not indexed (email body only)^ |
| Large emails | API limit of 25MB per message |
| History retention | ~7 days before full resync needed |
| Watch mode | Not supported in CLI |

^Attachment indexing is a known limitation and is planned for a future version.

Example Usage

Create a Gmail source with default settings (INBOX only):

sercha source add \
  --type gmail \
  --name "Work Gmail" \
  --auth "My Google Account"

Create a source for all mail with a query filter:

sercha source add \
  --type gmail \
  --name "Project Emails" \
  --auth "My Google Account" \
  --config "label_ids=,query=subject:project-alpha"

Sync the source:

sercha sync <source-id>
