Receipts Scanner Setup (Frappe + Google)
This runbook reflects the retired engine-era operating model and is preserved for reference only. It is not part of the current golden path.
Complexity: MEDIUM
Time Required: 20-30 minutes
Owner: Platform Team
This runbook covers the setup and configuration of the receipt scanning feature using Google Cloud Vision API and Google Drive integration.
Overview
The Receipts Scanner module enables:
- OCR text extraction from receipt images
- Automatic parsing of receipt data (date, total, tax)
- Storage in Google Drive
- Integration with Frappe ERP
- Mobile app capture and submission
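The parsing step listed above can be illustrated with a minimal sketch. It assumes US-style `MM/DD/YYYY` dates and simple `Total` / `Tax` lines; the regular expressions and the function name `parse_receipt_text` are hypothetical and are not taken from the Control Center code.

```python
import re
from datetime import date
from decimal import Decimal

# Hypothetical patterns: real receipts vary widely and need broader rules.
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")
TOTAL_RE = re.compile(
    r"^[ \t]*total\b[^\d\n]*(\d+\.\d{2})[ \t]*$", re.IGNORECASE | re.MULTILINE
)
TAX_RE = re.compile(
    r"^[ \t]*(?:sales[ \t]+)?tax\b[^\d\n]*(\d+\.\d{2})[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_receipt_text(raw_text: str) -> dict:
    """Extract receipt_date, total and tax from OCR text; missing values stay None."""
    parsed = {"receipt_date": None, "total": None, "tax": None}

    match = DATE_RE.search(raw_text)
    if match:
        month, day, year = (int(group) for group in match.groups())
        try:
            parsed["receipt_date"] = date(year, month, day)
        except ValueError:
            pass  # e.g. 13/45/2026 is not a valid calendar date

    match = TOTAL_RE.search(raw_text)
    if match:
        parsed["total"] = Decimal(match.group(1))

    match = TAX_RE.search(raw_text)
    if match:
        parsed["tax"] = Decimal(match.group(1))

    return parsed


sample = "ACME HARDWARE\n03/14/2026 10:42\nSubtotal 116.00\nTax 9.50\nTOTAL $125.50\n"
print(parse_receipt_text(sample))
```

Anchoring the `Total` pattern at the start of a line keeps a `Subtotal` line from being mistaken for the total.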
1) Create DocType in Frappe
EGI Receipt DocType
Create a DocType named EGI Receipt with the following fields:
| Field Name | Field Type | Description |
|---|---|---|
| receipt_date | Date | Date from receipt |
| total | Currency | Total amount |
| tax | Currency | Tax amount |
| raw_text | Long Text | OCR extracted text |
| drive_link | Data | Google Drive file link |
| file_name | Data | Original filename |
| uploaded_at | Datetime | Upload timestamp |
Additional Recommended Fields:
- vendor_name (Data) - Merchant name
- category (Select) - Expense category
- project (Link → Project) - Associated project
- expense_claim (Link → Expense Claim) - Link to claim
- status (Select) - Processing status
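As a rough illustration of how the required fields fit together, the sketch below builds a dictionary shaped like an EGI Receipt document. The helper name `build_receipt_doc` and the choice to serialize dates as ISO strings are assumptions; how Control Center actually writes to Frappe is not documented here.

```python
from datetime import date, datetime
from decimal import Decimal
from typing import Optional


def build_receipt_doc(
    raw_text: str,
    drive_link: str,
    file_name: str,
    uploaded_at: datetime,
    receipt_date: Optional[date] = None,
    total: Optional[Decimal] = None,
    tax: Optional[Decimal] = None,
) -> dict:
    """Map parsed receipt values onto the EGI Receipt field names."""
    return {
        "doctype": "EGI Receipt",  # assumed: payload is passed to Frappe as a doc dict
        "receipt_date": receipt_date.isoformat() if receipt_date else None,
        "total": float(total) if total is not None else None,
        "tax": float(tax) if tax is not None else None,
        "raw_text": raw_text,
        "drive_link": drive_link,
        "file_name": file_name,
        "uploaded_at": uploaded_at.strftime("%Y-%m-%d %H:%M:%S"),
    }


doc = build_receipt_doc(
    raw_text="TOTAL $125.50",
    drive_link="https://drive.google.com/file/d/abc123",  # placeholder link
    file_name="receipt.jpg",
    uploaded_at=datetime(2026, 3, 25, 9, 30, 0),
    receipt_date=date(2026, 3, 14),
    total=Decimal("125.50"),
    tax=Decimal("9.50"),
)
print(doc)
```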
Configure Permissions
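# In Frappe console
import frappe
from frappe.permissions import add_permission, update_permission_property

# Add a permission rule for the role at permlevel 0 (grants read by default)
add_permission("EGI Receipt", "Employee", 0)
# Enable write and create on the same rule
update_permission_property("EGI Receipt", "Employee", 0, "write", 1)
update_permission_property("EGI Receipt", "Employee", 0, "create", 1)
# Persist the changes made from the console session
frappe.db.commit()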
2) Google Cloud Setup
A. Enable APIs
- Go to Google Cloud Console
- Select or create project: egi-platform
- Enable APIs:
  - Cloud Vision API
  - Google Drive API
  - Google Sheets API (optional)
# Enable via gcloud CLI
gcloud services enable vision.googleapis.com
gcloud services enable drive.googleapis.com
B. Create Service Account
- Go to IAM & Admin → Service Accounts
- Click Create Service Account
- Name: egi-receipts-scanner
- Description: Service account for receipt scanning and Drive storage
- Click Create and Continue
C. Grant Permissions
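Assign roles to the service account:
- Cloud Vision AI Service Agent
- Drive File Admin (or a more restrictive custom role)
Note: Google Drive access is granted by sharing the folder or shared drive with the service account email (see step 4), not through project IAM. Confirm that the role IDs below exist in your project before running the commands.
# Via gcloud CLI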
gcloud projects add-iam-policy-binding egi-platform \
--member="serviceAccount:egi-receipts-scanner@egi-platform.iam.gserviceaccount.com" \
--role="roles/vision.admin"
gcloud projects add-iam-policy-binding egi-platform \
--member="serviceAccount:egi-receipts-scanner@egi-platform.iam.gserviceaccount.com" \
--role="roles/drive.file"
D. Generate and Download Key
- Click on the service account
- Go to Keys tab
- Click Add Key → Create new key
- Select JSON format
- Download file: egi-receipts-scanner-credentials.json
⚠️ Security Note: Store this file securely. Do not commit to Git.
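Before the key is loaded into a secret, a quick sanity check can catch a wrong or truncated download. The sketch below only inspects field names that appear in a standard service account key file; the function name `check_service_account_key` is hypothetical, and the check never prints the private key.

```python
import json

# Fields present in a standard Google service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path: str) -> list:
    """Return a list of problems found in the key file; an empty list means it looks sane."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)

    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if not data.get(key)]
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']}")
    return problems
```

Usage: `check_service_account_key("egi-receipts-scanner-credentials.json")` should return an empty list for a valid download.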
3) Create Kubernetes Secret
Store Google Credentials
# Create secret from JSON file
kubectl create secret generic control-center-google \
--from-file=credentials.json=./egi-receipts-scanner-credentials.json \
--namespace hq
# Verify secret created
kubectl get secret control-center-google -n hq
# Inspect secret metadata (describe lists key names and sizes, not the values)
kubectl describe secret control-center-google -n hq
Alternative: Create from Base64
# Base64 encode credentials
cat egi-receipts-scanner-credentials.json | base64
# Create secret YAML
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: control-center-google
  namespace: hq
type: Opaque
data:
  credentials.json: $(cat egi-receipts-scanner-credentials.json | base64 -w 0)
EOF
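The same manifest can be produced programmatically, which avoids the GNU-specific `base64 -w 0` flag. This is a sketch only: `build_secret_manifest` is a hypothetical helper, and the resulting JSON would still need to be applied with `kubectl apply -f`.

```python
import base64
import json


def build_secret_manifest(
    credentials: bytes,
    name: str = "control-center-google",
    namespace: str = "hq",
) -> dict:
    """Build an Opaque Secret manifest with the key file base64-encoded on one line."""
    encoded = base64.b64encode(credentials).decode("ascii")  # no line wrapping
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {"credentials.json": encoded},
    }


# Placeholder content; in practice read the bytes of the downloaded key file.
manifest = build_secret_manifest(b'{"type": "service_account"}')
print(json.dumps(manifest, indent=2))
```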
4) Configure Google Drive
A. Create Shared Drive (Optional)
For team access:
- Go to Google Drive
- Click Shared drives
- Click New
- Name: EGI Receipts
- Add service account email as member: egi-receipts-scanner@egi-platform.iam.gserviceaccount.com
B. Create Folder Structure
EGI Receipts/
├── 2026/
│ ├── January/
│ ├── February/
│ └── ...
├── Unprocessed/
└── Archive/
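The year/month layout above suggests a simple rule for choosing the target folder. The sketch below is one possible mapping; routing undated receipts to `Unprocessed/` is an assumption, and `drive_folder_path` is a hypothetical helper.

```python
from datetime import date
from typing import Optional

# Explicit English month names so the result does not depend on the system locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def drive_folder_path(receipt_date: Optional[date]) -> str:
    """Return the folder path under 'EGI Receipts' for a receipt date."""
    if receipt_date is None:
        # Assumption: receipts whose date could not be parsed wait in Unprocessed/.
        return "EGI Receipts/Unprocessed"
    return f"EGI Receipts/{receipt_date.year}/{MONTHS[receipt_date.month - 1]}"


print(drive_folder_path(date(2026, 3, 25)))
print(drive_folder_path(None))
```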
C. Get Folder IDs
- Navigate to target folder in Drive
- Copy folder ID from URL: https://drive.google.com/drive/folders/{FOLDER_ID}
- Note IDs for configuration
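If folder URLs are collected in bulk, the ID can be pulled out with a small helper instead of by hand. A minimal sketch, assuming the URL contains a `/folders/{FOLDER_ID}` path segment as shown above; `extract_folder_id` is a hypothetical name.

```python
import re
from typing import Optional
from urllib.parse import urlparse

FOLDER_RE = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def extract_folder_id(url: str) -> Optional[str]:
    """Return the Drive folder ID from a folder URL, or None if the URL has none."""
    match = FOLDER_RE.search(urlparse(url).path)  # query string is ignored
    return match.group(1) if match else None


print(extract_folder_id("https://drive.google.com/drive/folders/1AbC_dEf-123?usp=sharing"))
```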
5) Configure Control Center
Update Helm Values
Edit charts/control-center/values.yaml:
env:
  # Google Cloud Configuration
  googleProjectId: "egi-platform"
  googleDriveSharedDriveId: "{SHARED_DRIVE_ID}" # Optional
  googleDriveFolderId: "{TARGET_FOLDER_ID}"
# Secret reference
googleCredentials:
  secretName: control-center-google
  keyName: credentials.json
Deploy Changes
# Upgrade Control Center
helm upgrade control-center ./charts/control-center \
--set env.googleProjectId="egi-platform" \
--set env.googleDriveFolderId="{FOLDER_ID}" \
--set googleCredentials.secretName="control-center-google" \
--namespace hq
# Restart to apply changes
kubectl rollout restart deployment/control-center -n hq
# Verify deployment
kubectl rollout status deployment/control-center -n hq
Verify Configuration
# Check environment variables
kubectl exec -n hq deployment/control-center -- env | grep -i google
# Check secret mounted
kubectl exec -n hq deployment/control-center -- ls -la /secrets/google/
# Test Vision API connection
kubectl exec -n hq deployment/control-center -- python -c "
from google.cloud import vision
client = vision.ImageAnnotatorClient()
print('Vision API client initialized successfully')
"
6) Test Receipt Scanning
A. Via Mobile App
- Open EGI mobile app
- Navigate to Finance → Scan Receipt
- Take photo or select from gallery
- Submit receipt
- Verify:
- Receipt appears in Frappe
- File uploaded to Google Drive
- OCR text extracted
B. Via API
# Upload receipt via API
curl -X POST https://control-center.egintegrations.com/api/receipts/scan \
-H "Content-Type: multipart/form-data" \
-F "file=@receipt.jpg" \
-F "user=test@example.com"
# Check response
# Should return: receipt_id, extracted_text, drive_link
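A scripted smoke test can check the response for the three expected fields. The sketch assumes the endpoint returns a JSON object with those keys at the top level, which is not confirmed here; `missing_scan_fields` is a hypothetical helper.

```python
import json

EXPECTED_FIELDS = ("receipt_id", "extracted_text", "drive_link")


def missing_scan_fields(body: str) -> list:
    """Return the expected response fields that are absent from the JSON body."""
    payload = json.loads(body)
    return [field for field in EXPECTED_FIELDS if field not in payload]


# Placeholder response body for illustration only.
example_body = '{"receipt_id": "R-1", "extracted_text": "TOTAL 125.50"}'
print(missing_scan_fields(example_body))
```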
C. Verify in Frappe
# Check via Frappe API
curl -X GET "https://{frappe-instance}/api/resource/EGI%20Receipt" \
-H "Authorization: token {api_key}:{api_secret}"
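The DocType name contains a space, so it has to be percent-encoded in the resource URL. The sketch below builds the request without sending it; the base URL and credentials are placeholders, and `build_receipt_list_request` is a hypothetical helper.

```python
from urllib.parse import quote
from urllib.request import Request


def build_receipt_list_request(base_url: str, api_key: str, api_secret: str) -> Request:
    """Build a GET request for the EGI Receipt resource with token authentication."""
    url = f"{base_url.rstrip('/')}/api/resource/{quote('EGI Receipt')}"
    headers = {"Authorization": f"token {api_key}:{api_secret}"}
    return Request(url, headers=headers, method="GET")


request = build_receipt_list_request("https://erp.example.com", "key", "secret")
print(request.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call.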
Monitoring & Analytics
Uptime Kuma Monitors
Create monitors for:
- Vision API Health
  - Type: Keyword
  - URL: Control Center health endpoint
  - Keyword: "vision_api_ok"
  - Alert: Slack #alerts-warning
- Drive API Health
  - Type: Keyword
  - URL: Control Center health endpoint
  - Keyword: "drive_api_ok"
  - Alert: Slack #alerts-warning
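A keyword monitor passes when the configured string appears in the response body. The same check can be reproduced locally when debugging a failing monitor; the sketch assumes the health endpoint returns the keywords verbatim somewhere in its body, and `check_health_keywords` is a hypothetical helper.

```python
def check_health_keywords(
    body: str,
    keywords: tuple = ("vision_api_ok", "drive_api_ok"),
) -> dict:
    """Report, per keyword, whether it occurs in the health endpoint response body."""
    return {keyword: keyword in body for keyword in keywords}


# Placeholder body: the real health endpoint format is not documented here.
status = check_health_keywords('{"checks": ["vision_api_ok"]}')
print(status)
```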
PostHog Analytics
Track receipt scanning events:
// Receipt scanned
posthog.capture('receipt_scanned', {
user: 'user@example.com',
file_size_kb: 245,
processing_time_ms: 1850
})
// OCR completed
posthog.capture('receipt_ocr_completed', {
text_length: 450,
confidence_score: 0.95,
total_amount: 125.50
})
// Drive upload completed
posthog.capture('receipt_uploaded_to_drive', {
file_id: 'abc123',
folder: '2026/March'
})
Create Dashboard: "Receipt Scanner Analytics"
- Receipts scanned per day
- Average processing time
- OCR accuracy rate
- Upload success rate
- Popular expense categories
Slack Notifications
Configure alerts:
- Receipt processing failed → #alerts-warning
- Vision API quota exceeded → #alerts-critical
- Drive storage limit reached → #alerts-warning
- High OCR confidence receipts → #finance-receipts
Troubleshooting
Vision API Errors
"Unable to initialize Vision client"
# Check credentials are mounted (avoid printing the private key to the terminal)
kubectl exec -n hq deployment/control-center -- \
ls -l /secrets/google/credentials.json
# Verify service account permissions
gcloud projects get-iam-policy egi-platform \
--flatten="bindings[].members" \
--filter="bindings.members:egi-receipts-scanner@"
# Test API from pod
kubectl exec -i -n hq deployment/control-center -- python3 <<'EOF'
from google.cloud import vision
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/secrets/google/credentials.json'
client = vision.ImageAnnotatorClient()
print("Success!")
EOF
Drive Upload Fails
"Permission denied" errors
# Verify service account has Drive access
# In Google Drive: Share folder with service account email
# Test Drive API
kubectl exec -i -n hq deployment/control-center -- python3 <<'EOF'
from googleapiclient.discovery import build
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(
'/secrets/google/credentials.json',
scopes=['https://www.googleapis.com/auth/drive']
)
service = build('drive', 'v3', credentials=credentials)
results = service.files().list(pageSize=10).execute()
print(f"Found {len(results.get('files', []))} files")
EOF
OCR Quality Issues
Low confidence scores or incorrect text
- Ensure receipt image is clear and well-lit
- Minimum resolution: 1024x768
- Supported formats: JPG, PNG, PDF
- File size: less than 10MB
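These limits can be enforced before an image is sent for OCR. The sketch below is a minimal pre-check: it treats 10MB as 10 × 1024 × 1024 bytes, accepts `.jpeg` alongside `.jpg`, and applies the resolution limit regardless of orientation. All three are assumptions, as is the helper name `validate_upload`.

```python
import os

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}
MAX_BYTES = 10 * 1024 * 1024  # assumed meaning of "10MB"
MIN_LONG_SIDE, MIN_SHORT_SIDE = 1024, 768


def validate_upload(file_name: str, size_bytes: int, width=None, height=None) -> list:
    """Return a list of problems with an upload; an empty list means it passes."""
    problems = []

    extension = os.path.splitext(file_name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")

    if size_bytes >= MAX_BYTES:
        problems.append("file is 10MB or larger")

    # Dimensions are optional because PDFs have no single pixel size.
    if width is not None and height is not None:
        long_side, short_side = max(width, height), min(width, height)
        if long_side < MIN_LONG_SIDE or short_side < MIN_SHORT_SIDE:
            problems.append(f"resolution {width}x{height} is below 1024x768")

    return problems


print(validate_upload("receipt.jpg", 250_000, 1600, 1200))
print(validate_upload("scan.gif", 11 * 1024 * 1024, 640, 480))
```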
Improve OCR accuracy:
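# In Control Center, adjust Vision API parameters
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("receipt.jpg", "rb") as f:  # example path
    image = vision.Image(content=f.read())

image_context = vision.ImageContext(
    language_hints=["en"],  # Add language hints
)
response = client.text_detection(
    image=image,
    image_context=image_context,
)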
Maintenance Checklist
Daily:
- Check receipt processing success rate in PostHog
- Monitor Vision API quota usage
- Verify Drive storage capacity
Weekly:
- Review failed receipts in Frappe
- Audit Drive folder organization
- Check OCR accuracy metrics
- Review Slack alerts
Monthly:
- Archive old receipts
- Review and optimize folder structure
- Update service account keys (if policy requires)
- Audit permissions
- Review costs (Vision API usage, Drive storage)
Cost Monitoring
Google Cloud Pricing
Vision API:
- First 1,000 images/month: Free
- Additional: $1.50 per 1,000 images
Drive Storage:
- 15 GB: Free
- 100 GB: $1.99/month
- 200 GB: $2.99/month
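The Vision API figures above translate into a simple monthly estimate. The sketch assumes linear pricing beyond the free tier, using only the two numbers listed; actual billing tiers and rounding may differ, and both function names are hypothetical.

```python
from decimal import Decimal

FREE_IMAGES = 1000
PRICE_PER_1000 = Decimal("1.50")


def estimate_vision_cost(images_per_month: int) -> Decimal:
    """Estimate the monthly Vision API cost in USD for a given image volume."""
    billable = max(0, images_per_month - FREE_IMAGES)
    return (Decimal(billable) / 1000 * PRICE_PER_1000).quantize(Decimal("0.01"))


def images_within_budget(budget_usd: int) -> int:
    """Return roughly how many images per month fit inside a USD budget."""
    return FREE_IMAGES + int(Decimal(budget_usd) / PRICE_PER_1000 * 1000)


print(estimate_vision_cost(11_000))
print(images_within_budget(50))
```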
Monitor Usage
# Check Vision API usage
gcloud logging read "resource.type=cloud_vision" \
--project=egi-platform \
--format="table(timestamp, protoPayload.methodName)" \
--limit=100
# Check Drive storage
# Drive usage is not listed under Cloud Storage; check it in Google Drive (Storage) or the Google Workspace Admin console
Set Billing Alerts
- Go to Billing → Budgets & Alerts
- Create budget: "Vision API Monthly"
- Set threshold: $50/month
- Alert: Email + Slack webhook
Security Considerations
CrowdSec Integration
Monitor for:
- Excessive receipt upload attempts
- Suspicious file types
- Large file uploads (potential abuse)
# CrowdSec scenario: Receipt upload abuse
type: leaky
name: egi/receipt-upload-abuse
description: "Detect excessive receipt uploads"
filter: "evt.Meta.service == 'receipts-scanner'"
leakspeed: "60s"
capacity: 20
labels:
  service: receipts-scanner
  remediation: ban
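To get a feel for what `capacity: 20` and `leakspeed: "60s"` mean in practice, the sketch below simulates a simplified leaky bucket: each upload adds one unit, one unit drains every 60 seconds, and the bucket overflows once the level exceeds the capacity. This is an illustrative model only, not CrowdSec's implementation.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    capacity: int = 20
    leak_seconds: float = 60.0  # time for one event to drain
    level: float = 0.0
    last_ts: float = 0.0

    def pour(self, ts: float) -> bool:
        """Record one event at time ts (seconds); return True if the bucket overflows."""
        elapsed = max(0.0, ts - self.last_ts)
        self.level = max(0.0, self.level - elapsed / self.leak_seconds)
        self.last_ts = ts
        self.level += 1
        if self.level > self.capacity:
            self.level = 0.0  # start over after raising an alert
            return True
        return False


# A burst of one upload per second versus a steady one upload per minute.
burst = LeakyBucket()
burst_results = [burst.pour(second) for second in range(21)]

steady = LeakyBucket()
steady_results = [steady.pour(minute * 60) for minute in range(100)]

print(burst_results.index(True), any(steady_results))
```

In this model a rapid burst triggers on the 21st upload, while a client uploading once per minute never fills the bucket.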
Data Privacy
- Receipts may contain sensitive information
- Ensure proper access controls in Frappe
- Implement data retention policy
- Consider GDPR compliance for EU users
- Encrypt Drive storage (Google encrypts by default)
Last Updated: 2026-03-25
Version: 1.1