Receipts Scanner Setup (Frappe + Google)

Historical Runbook

This runbook reflects the retired engine-era operating model and is preserved for reference only. It is not part of the current golden path.

Complexity: MEDIUM
Time Required: 20-30 minutes
Owner: Platform Team

This runbook covers the setup and configuration of the receipt scanning feature using Google Cloud Vision API and Google Drive integration.


Overview

The Receipts Scanner module enables:

  • OCR text extraction from receipt images
  • Automatic parsing of receipt data (date, total, tax)
  • Storage in Google Drive
  • Integration with Frappe ERP
  • Mobile app capture and submission
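
The parsing step listed above can be sketched as follows. This is a minimal illustration, not the Control Center implementation: the regular expressions, the helper names, and the month/day/year reading of slash-separated dates are all assumptions, and real receipts will need more patterns.

```python
import re
from datetime import datetime
from decimal import Decimal

# Hypothetical patterns for illustration; real receipts vary widely.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{4})\b")
TOTAL_RE = re.compile(r"^[ \t]*(?:grand[ \t]+)?total\b[^\d\n]*(\d+\.\d{2})", re.I | re.M)
TAX_RE = re.compile(r"^[ \t]*(?:sales[ \t]+)?tax\b[^\d\n]*(\d+\.\d{2})", re.I | re.M)


def _find_date(text):
    """Return the first recognizable date in the text, or None."""
    match = DATE_RE.search(text)
    if not match:
        return None
    value = match.group(1)
    # Assumption: slash-separated dates are month/day/year.
    fmt = "%Y-%m-%d" if "-" in value else "%m/%d/%Y"
    try:
        return datetime.strptime(value, fmt).date()
    except ValueError:
        return None


def _find_amount(pattern, text):
    """Return the first amount matched by the pattern as a Decimal, or None."""
    match = pattern.search(text)
    return Decimal(match.group(1)) if match else None


def parse_receipt_text(raw_text):
    """Extract receipt_date, total, and tax from OCR text; missing values are None."""
    return {
        "receipt_date": _find_date(raw_text),
        "total": _find_amount(TOTAL_RE, raw_text),
        "tax": _find_amount(TAX_RE, raw_text),
    }


# Made-up OCR text for illustration.
sample = "ACME HARDWARE\n2026-03-14\nSubtotal 115.00\nTax 10.50\nTOTAL $125.50\n"
parsed = parse_receipt_text(sample)
```

The "Subtotal" line is skipped because the total pattern must start at the beginning of a line. Fields that cannot be found stay None so they can be flagged for manual review.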

1) Create DocType in Frappe

EGI Receipt DocType

Create a DocType named EGI Receipt with the following fields:

Field Name     Field Type   Description
receipt_date   Date         Date from receipt
total          Currency     Total amount
tax            Currency     Tax amount
raw_text       Long Text    OCR extracted text
drive_link     Data         Google Drive file link
file_name      Data         Original filename
uploaded_at    Datetime     Upload timestamp

Additional Recommended Fields:

  • vendor_name (Data) - Merchant name
  • category (Select) - Expense category
  • project (Link → Project) - Associated project
  • expense_claim (Link → Expense Claim) - Link to claim
  • status (Select) - Processing status
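
To show how the required fields fit together, here is a minimal sketch that maps parsed values onto the EGI Receipt field names. The function name `build_receipt_doc`, the shape of the `parsed` dict, and the sample values are assumptions for illustration only.

```python
from datetime import date, datetime
from decimal import Decimal


def build_receipt_doc(parsed, raw_text, drive_link, file_name, uploaded_at):
    """Map parsed receipt values onto the EGI Receipt field names."""
    receipt_date = parsed.get("receipt_date")
    total = parsed.get("total")
    tax = parsed.get("tax")
    return {
        "doctype": "EGI Receipt",
        "receipt_date": receipt_date.isoformat() if receipt_date else None,
        "total": float(total) if total is not None else None,
        "tax": float(tax) if tax is not None else None,
        "raw_text": raw_text,
        "drive_link": drive_link,
        "file_name": file_name,
        # Frappe Datetime fields take "YYYY-MM-DD HH:MM:SS" strings.
        "uploaded_at": uploaded_at.strftime("%Y-%m-%d %H:%M:%S"),
    }


# Hypothetical values for illustration.
example = build_receipt_doc(
    parsed={"receipt_date": date(2026, 3, 14), "total": Decimal("125.50"), "tax": Decimal("10.50")},
    raw_text="ACME HARDWARE ...",
    drive_link="https://drive.google.com/file/d/abc123/view",
    file_name="receipt.jpg",
    uploaded_at=datetime(2026, 3, 14, 9, 30, 0),
)
```

Inside a Frappe site, a dict of this shape can be passed to `frappe.get_doc(...)` and saved with `.insert()`. The recommended fields (vendor_name, category, and so on) would be added the same way once they exist on the DocType.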

Configure Permissions

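# In Frappe console
import frappe
from frappe.permissions import add_permission, update_permission_property

# add_permission() takes (doctype, role, permlevel) and creates the rule;
# it does not accept read/write/create keywords, so set each right separately.
add_permission("EGI Receipt", "Employee", 0)
for ptype in ("read", "write", "create"):
    update_permission_property("EGI Receipt", "Employee", 0, ptype, 1)

frappe.db.commit()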

2) Google Cloud Setup

A. Enable APIs

  1. Go to Google Cloud Console
  2. Select or create project: egi-platform
  3. Enable APIs:
    • Cloud Vision API
    • Google Drive API
    • Google Sheets API (optional)

# Enable via gcloud CLI
gcloud services enable vision.googleapis.com
gcloud services enable drive.googleapis.com

B. Create Service Account

  1. Go to IAM & Admin → Service Accounts
  2. Click Create Service Account
  3. Name: egi-receipts-scanner
  4. Description: Service account for receipt scanning and Drive storage
  5. Click Create and Continue

C. Grant Permissions

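Grant access to the service account:

  • Cloud Vision: bind an IAM role on the project. The command below uses roles/vision.admin; confirm the role ID is available in your project before binding.
  • Google Drive: Drive access is not an IAM role. Share the target folder or shared drive with the service account email (see section 4) and request an OAuth scope such as https://www.googleapis.com/auth/drive.file in code.

# Via gcloud CLI
gcloud projects add-iam-policy-binding egi-platform \
  --member="serviceAccount:egi-receipts-scanner@egi-platform.iam.gserviceaccount.com" \
  --role="roles/vision.admin"

# No IAM binding is needed for Drive. "drive.file" is an OAuth scope, not a role,
# so a binding for roles/drive.file is rejected. Grant Drive access by sharing (section 4A).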

D. Generate and Download Key

  1. Click on service account
  2. Go to Keys tab
  3. Click Add Key → Create new key
  4. Select JSON format
  5. Download file: egi-receipts-scanner-credentials.json

⚠️ Security Note: Store this file securely. Do not commit to Git.
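
Before the key is loaded into a secret, it can help to confirm that the downloaded file really is a service account key for the expected project. The sketch below checks a few fields that service account key files contain. The function names and the sample values are hypothetical, and the check does not validate the private key itself.

```python
import json

REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def check_service_account_key(data, expected_project="egi-platform"):
    """Return a list of problems found in a parsed key file (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS - set(data)
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    if data.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    if data.get("project_id") != expected_project:
        problems.append("project_id does not match " + expected_project)
    return problems


def check_key_file(path, expected_project="egi-platform"):
    """Load a key file from disk and run the same checks."""
    with open(path, encoding="utf-8") as handle:
        return check_service_account_key(json.load(handle), expected_project)


# Hypothetical, truncated key content for illustration (not a real key).
good_key = {
    "type": "service_account",
    "project_id": "egi-platform",
    "private_key": "-----BEGIN PRIVATE KEY-----...",
    "client_email": "egi-receipts-scanner@egi-platform.iam.gserviceaccount.com",
}
problems = check_service_account_key(good_key)
```

Usage: `check_key_file("egi-receipts-scanner-credentials.json")` returns an empty list when the file looks sane.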


3) Create Kubernetes Secret

Store Google Credentials

# Create secret from JSON file
kubectl create secret generic control-center-google \
  --from-file=credentials.json=./egi-receipts-scanner-credentials.json \
  --namespace hq

# Verify secret created
kubectl get secret control-center-google -n hq

# Inspect key names and sizes (describe does not print the values)
kubectl describe secret control-center-google -n hq

Alternative: Create from Base64

# Base64 encode credentials (prints the encoded key to the terminal)
base64 -w 0 egi-receipts-scanner-credentials.json

# Create secret YAML (-w 0 disables line wrapping; GNU coreutils option)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: control-center-google
  namespace: hq
type: Opaque
data:
  credentials.json: $(base64 -w 0 < egi-receipts-scanner-credentials.json)
EOF

4) Configure Google Drive

A. Create Shared Drive (Optional)

For team access:

  1. Go to Google Drive
  2. Click Shared drives
  3. Click New
  4. Name: EGI Receipts
  5. Add service account email as member: egi-receipts-scanner@egi-platform.iam.gserviceaccount.com

B. Create Folder Structure

EGI Receipts/
├── 2026/
│   ├── January/
│   ├── February/
│   └── ...
├── Unprocessed/
└── Archive/
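
A small sketch of how a receipt date could be mapped onto this layout. The function name is hypothetical, and routing undated receipts to Unprocessed/ is an assumption; the runbook does not say how that folder is used.

```python
from datetime import date

# English month names are spelled out so the result does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def drive_folder_path(receipt_date, root="EGI Receipts"):
    """Return the year/month folder for a receipt, or Unprocessed if undated."""
    if receipt_date is None:
        # Assumption: receipts without a parsed date wait in Unprocessed/.
        return f"{root}/Unprocessed"
    return f"{root}/{receipt_date.year}/{MONTHS[receipt_date.month - 1]}"


march_path = drive_folder_path(date(2026, 3, 25))
```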

C. Get Folder IDs

  1. Navigate to target folder in Drive
  2. Copy folder ID from URL: https://drive.google.com/drive/folders/{FOLDER_ID}
  3. Note IDs for configuration
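
If folder URLs are collected in bulk, the ID can be pulled out programmatically. This is a minimal sketch based on the URL shape shown above; the function name and the sample IDs are hypothetical.

```python
import re
from urllib.parse import urlparse


def folder_id_from_url(url):
    """Return the folder ID from a Drive folder URL, or None if absent."""
    path = urlparse(url).path  # query strings such as ?usp=sharing are dropped
    match = re.search(r"/folders/([A-Za-z0-9_-]+)", path)
    return match.group(1) if match else None


# Hypothetical folder ID for illustration.
folder_id = folder_id_from_url("https://drive.google.com/drive/folders/1AbC_dEf-123?usp=sharing")
```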

5) Configure Control Center

Update Helm Values

Edit charts/control-center/values.yaml:

env:
  # Google Cloud Configuration
  googleProjectId: "egi-platform"
  googleDriveSharedDriveId: "{SHARED_DRIVE_ID}"  # Optional
  googleDriveFolderId: "{TARGET_FOLDER_ID}"

# Secret reference
googleCredentials:
  secretName: control-center-google
  keyName: credentials.json

Deploy Changes

# Upgrade Control Center
helm upgrade control-center ./charts/control-center \
  --set env.googleProjectId="egi-platform" \
  --set env.googleDriveFolderId="{FOLDER_ID}" \
  --set googleCredentials.secretName="control-center-google" \
  --namespace hq

# Restart to apply changes
kubectl rollout restart deployment/control-center -n hq

# Verify deployment
kubectl rollout status deployment/control-center -n hq

Verify Configuration

# Check environment variables
kubectl exec -n hq deployment/control-center -- env | grep -i google

# Check secret mounted
kubectl exec -n hq deployment/control-center -- ls -la /secrets/google/

# Check that the Vision client initializes with the mounted credentials (no API request is sent)
kubectl exec -n hq deployment/control-center -- python -c "
from google.cloud import vision
client = vision.ImageAnnotatorClient()
print('Vision API client initialized successfully')
"

6) Test Receipt Scanning

A. Via Mobile App

  1. Open EGI mobile app
  2. Navigate to Finance → Scan Receipt
  3. Take photo or select from gallery
  4. Submit receipt
  5. Verify:
    • Receipt appears in Frappe
    • File uploaded to Google Drive
    • OCR text extracted

B. Via API

# Upload receipt via API
# -F sends multipart/form-data and sets the Content-Type header (with boundary) itself
curl -X POST https://control-center.egintegrations.com/api/receipts/scan \
  -F "file=@receipt.jpg" \
  -F "user=test@example.com"

# Check response
# Should return: receipt_id, extracted_text, drive_link
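
A scripted smoke test can check the response for the three fields named above. The sketch below assumes the endpoint returns a flat JSON object; the runbook does not document the exact response shape, and the function name and sample body are hypothetical.

```python
import json

EXPECTED_FIELDS = ("receipt_id", "extracted_text", "drive_link")


def missing_scan_fields(response_body):
    """Return the expected fields that are absent or empty in a JSON response body."""
    payload = json.loads(response_body)
    if not isinstance(payload, dict):
        return list(EXPECTED_FIELDS)
    return [field for field in EXPECTED_FIELDS if not payload.get(field)]


# Hypothetical response body for illustration.
sample_body = json.dumps({
    "receipt_id": "RCPT-0001",
    "extracted_text": "ACME HARDWARE ...",
    "drive_link": "https://drive.google.com/file/d/abc123/view",
})
missing = missing_scan_fields(sample_body)
```

An empty list means all three fields came back. An empty extracted_text is reported as missing, which usually points at the OCR step.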

C. Verify in Frappe

# Check via Frappe API (the space in the DocType name must be URL-encoded)
curl -X GET "https://{frappe-instance}/api/resource/EGI%20Receipt" \
  -H "Authorization: token {api_key}:{api_secret}"
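
When the same check is scripted, the DocType name has to be percent-encoded and the token header assembled from the key pair. A minimal sketch follows; the function name, host name, and credential values are placeholders, not real ones.

```python
from urllib.parse import quote


def frappe_resource_request(base_url, doctype, api_key, api_secret):
    """Build the REST URL and auth header for listing documents of a DocType."""
    url = f"{base_url.rstrip('/')}/api/resource/{quote(doctype)}"
    headers = {"Authorization": f"token {api_key}:{api_secret}"}
    return url, headers


# Placeholder host and credentials for illustration.
url, headers = frappe_resource_request(
    "https://erp.example.com/", "EGI Receipt", "my_key", "my_secret"
)
```

The returned pair can be passed to any HTTP client, for example `urllib.request.Request(url, headers=headers)`.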

Monitoring & Analytics

Uptime Kuma Monitors

Create monitors for:

  1. Vision API Health

    • Type: Keyword
    • URL: Control Center health endpoint
    • Keyword: "vision_api_ok"
    • Alert: Slack #alerts-warning
  2. Drive API Health

    • Type: Keyword
    • URL: Control Center health endpoint
    • Keyword: "drive_api_ok"
    • Alert: Slack #alerts-warning
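
A keyword monitor passes only when the keyword appears in the response body, so the health endpoint has to emit both strings. The sketch below mirrors that check locally. The shape of the health response is an assumption; the runbook specifies only the two keywords.

```python
REQUIRED_KEYWORDS = ("vision_api_ok", "drive_api_ok")


def missing_health_keywords(body, keywords=REQUIRED_KEYWORDS):
    """Return the monitor keywords that do not appear in the health response body."""
    return [keyword for keyword in keywords if keyword not in body]


# Hypothetical health response bodies for illustration.
healthy_body = '{"status": "ok", "checks": ["vision_api_ok", "drive_api_ok"]}'
degraded_body = '{"status": "degraded", "checks": ["vision_api_ok"]}'

healthy_missing = missing_health_keywords(healthy_body)
degraded_missing = missing_health_keywords(degraded_body)
```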

PostHog Analytics

Track receipt scanning events:

// Receipt scanned
posthog.capture('receipt_scanned', {
  user: 'user@example.com',
  file_size_kb: 245,
  processing_time_ms: 1850
})

// OCR completed
posthog.capture('receipt_ocr_completed', {
  text_length: 450,
  confidence_score: 0.95,
  total_amount: 125.50
})

// Drive upload completed
posthog.capture('receipt_uploaded_to_drive', {
  file_id: 'abc123',
  folder: '2026/March'
})

Create Dashboard: "Receipt Scanner Analytics"

  • Receipts scanned per day
  • Average processing time
  • OCR accuracy rate
  • Upload success rate
  • Popular expense categories
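
Several of these metrics can be derived directly from the captured events. The sketch below assumes events are exported as dicts with `event` and `properties` keys; that export shape and the function name are assumptions, and a PostHog dashboard would normally compute the same figures with its own insights.

```python
from statistics import mean


def summarize_scans(events):
    """Compute scan count, average processing time, and upload success rate."""
    scanned = [e for e in events if e["event"] == "receipt_scanned"]
    uploaded = [e for e in events if e["event"] == "receipt_uploaded_to_drive"]
    times = [e["properties"]["processing_time_ms"] for e in scanned]
    return {
        "receipts_scanned": len(scanned),
        "avg_processing_time_ms": mean(times) if times else None,
        # Share of scanned receipts that also reached Drive.
        "upload_success_rate": len(uploaded) / len(scanned) if scanned else None,
    }


# Made-up events for illustration.
events = [
    {"event": "receipt_scanned", "properties": {"processing_time_ms": 1850}},
    {"event": "receipt_scanned", "properties": {"processing_time_ms": 2150}},
    {"event": "receipt_uploaded_to_drive", "properties": {"file_id": "abc123"}},
]
summary = summarize_scans(events)
```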

Slack Notifications

Configure alerts:

  • Receipt processing failed → #alerts-warning
  • Vision API quota exceeded → #alerts-critical
  • Drive storage limit reached → #alerts-warning
  • High OCR confidence receipts → #finance-receipts

Troubleshooting

Vision API Errors

"Unable to initialize Vision client"

# Check credentials are mounted (list the file rather than printing the private key)
kubectl exec -n hq deployment/control-center -- \
  ls -l /secrets/google/credentials.json

# Verify service account permissions
gcloud projects get-iam-policy egi-platform \
  --flatten="bindings[].members" \
  --filter="bindings.members:egi-receipts-scanner@"

# Test client initialization from pod (-i only: a heredoc on stdin is not a TTY)
kubectl exec -i -n hq deployment/control-center -- python3 <<EOF
import os
from google.cloud import vision

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/secrets/google/credentials.json'
client = vision.ImageAnnotatorClient()
print("Success!")
EOF

Drive Upload Fails

"Permission denied" errors

# Verify service account has Drive access
# In Google Drive: Share folder with service account email

# Test Drive API (-i only: a heredoc on stdin is not a TTY)
kubectl exec -i -n hq deployment/control-center -- python3 <<EOF
from googleapiclient.discovery import build
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    '/secrets/google/credentials.json',
    scopes=['https://www.googleapis.com/auth/drive']
)

service = build('drive', 'v3', credentials=credentials)
# Shared-drive items are only returned when both flags are set
results = service.files().list(
    pageSize=10,
    supportsAllDrives=True,
    includeItemsFromAllDrives=True,
).execute()
print(f"Found {len(results.get('files', []))} files")
EOF

OCR Quality Issues

Low confidence scores or incorrect text

  • Ensure receipt image is clear and well-lit
  • Minimum resolution: 1024x768
  • Supported formats: JPG, PNG, PDF
  • File size: less than 10MB
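
These limits can be enforced before an image is sent for OCR. The sketch below is illustrative only: the function name is hypothetical, 10MB is read as 10 × 1024 × 1024 bytes, and comparing the long and short sides (so portrait photos pass) is an assumption. Width and height are passed in by the caller because reading them requires an image library.

```python
import os

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" means 10 MiB
MIN_LONG_SIDE, MIN_SHORT_SIDE = 1024, 768


def receipt_upload_problems(file_name, size_bytes, width=None, height=None):
    """Return a list of reasons the file should be rejected (empty if acceptable)."""
    problems = []
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format: " + (extension or "none"))
    if size_bytes >= MAX_BYTES:
        problems.append("file must be smaller than 10 MB")
    if width is not None and height is not None:
        # Compare long and short sides so portrait photos are not rejected.
        if max(width, height) < MIN_LONG_SIDE or min(width, height) < MIN_SHORT_SIDE:
            problems.append("resolution below 1024x768")
    return problems


ok_result = receipt_upload_problems("receipt.jpg", 245 * 1024, 1600, 1200)
```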

Improve OCR accuracy:

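# In Control Center, adjust Vision API parameters
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Load the receipt image as raw bytes
with open("receipt.jpg", "rb") as image_file:
    image = vision.Image(content=image_file.read())

image_context = vision.ImageContext(
    language_hints=['en'],  # Add language hints
)

response = client.text_detection(
    image=image,
    image_context=image_context
)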

Maintenance Checklist

Daily:

  • Check receipt processing success rate in PostHog
  • Monitor Vision API quota usage
  • Verify Drive storage capacity

Weekly:

  • Review failed receipts in Frappe
  • Audit Drive folder organization
  • Check OCR accuracy metrics
  • Review Slack alerts

Monthly:

  • Archive old receipts
  • Review and optimize folder structure
  • Update service account keys (if policy requires)
  • Audit permissions
  • Review costs (Vision API usage, Drive storage)

Cost Monitoring

Google Cloud Pricing

Vision API:

  • First 1,000 images/month: Free
  • Additional: $1.50 per 1,000 images

Drive Storage:

  • 15 GB: Free
  • 100 GB: $1.99/month
  • 200 GB: $2.99/month
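
The Vision API tiers above translate into a simple monthly estimate. This sketch uses only the two figures listed here and assumes usage beyond the free tier is billed pro rata; the function name is hypothetical, and current pricing should be confirmed before budgeting.

```python
from decimal import Decimal

FREE_IMAGES_PER_MONTH = 1000
PRICE_PER_1000_IMAGES = Decimal("1.50")


def estimate_vision_cost(images_per_month):
    """Estimate the monthly Vision API cost in USD for a given image count."""
    billable = max(0, images_per_month - FREE_IMAGES_PER_MONTH)
    cost = Decimal(billable) / 1000 * PRICE_PER_1000_IMAGES
    return cost.quantize(Decimal("0.01"))


# 5,000 images: 4,000 billable at $1.50 per 1,000.
cost_5000 = estimate_vision_cost(5000)
```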

Monitor Usage

# Check Vision API usage
gcloud logging read "resource.type=cloud_vision" \
  --project=egi-platform \
  --format="table(timestamp, protoPayload.methodName)" \
  --limit=100

# Check Drive storage
# Drive usage does not appear under Cloud Storage in the Cloud Console.
# Check it in Google Drive (Storage) or via the Drive API: about.get(fields="storageQuota")

Set Billing Alerts

  1. Go to Billing → Budgets & Alerts
  2. Create budget: "Vision API Monthly"
  3. Set threshold: $50/month
  4. Alert: Email + Slack webhook

Security Considerations

CrowdSec Integration

Monitor for:

  • Excessive receipt upload attempts
  • Suspicious file types
  • Large file uploads (potential abuse)

# CrowdSec scenario: Receipt upload abuse
type: leaky
name: egi/receipt-upload-abuse
description: "Detect excessive receipt uploads"
filter: "evt.Meta.service == 'receipts-scanner'"
leakspeed: "60s"
capacity: 20
labels:
  service: receipts-scanner
  remediation: ban
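
To make the capacity and leakspeed values concrete, the sketch below simulates a leaky bucket in plain Python. It is a simplified model (continuous leak, overflow once the level exceeds capacity), not CrowdSec's implementation, and the function name is hypothetical.

```python
def first_overflow(timestamps, capacity=20, leak_seconds=60.0):
    """Return the timestamp (in seconds) at which the bucket overflows, or None."""
    level = 0.0
    previous = None
    for ts in sorted(timestamps):
        if previous is not None:
            # One event leaks out every leak_seconds.
            level = max(0.0, level - (ts - previous) / leak_seconds)
        level += 1
        previous = ts
        if level > capacity:
            return ts
    return None


# 21 uploads one second apart overflow on the last one.
burst_overflow = first_overflow(range(21))
# One upload per minute leaks as fast as it fills and never overflows.
steady_overflow = first_overflow(range(0, 6000, 60))
```

With these settings a client can sustain about one upload per minute indefinitely, while a burst of more than 20 uploads in quick succession triggers the scenario. Note that the scenario as written has no groupby key, so it appears to count uploads from all sources in one bucket; partitioning by source is usually what is wanted.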

Data Privacy

  • Receipts may contain sensitive information
  • Ensure proper access controls in Frappe
  • Implement data retention policy
  • Consider GDPR compliance for EU users
  • Encrypt Drive storage (Google encrypts by default)

Last Updated: 2026-03-25
Version: 1.1