Powered by Amazon Bedrock & Claude AI

The AI Engine
Behind Our Games

Processing 14.9 billion tokens daily across 3 core AI scenarios, at 72% lower cost than base model pricing

Built on Claude Sonnet 4.6 and Haiku 4.5, our platform delivers enterprise-grade AI for coding productivity, multilingual customer service, and intelligent game operations — all running on AWS infrastructure with smart model routing.

7 Bedrock Agents
99.9% uptime SLA
Enterprise security
14.9B
Tokens/Day
AI inference scale
$200K
MRR
Platform revenue
72%
Cost Saved
vs base pricing
200+
Engineers
Using AI daily
Three Core Scenarios

AI Capability Overview

Our platform serves three distinct AI workloads, each optimized for its unique performance and cost requirements.

AI Coding & Dev Productivity

200+ engineers across Shenzhen · Singapore · Bangalore

Model
Claude Sonnet 4.6 (complex generation) + Haiku 4.5 (fast completion)
Tech Stack
Cocos Creator (TypeScript), Unity (C#), Go, Python
Smart Code Review
PR review time: 4.2h → ≤1.5h (↓64%)
Bug Diagnosis
Bug localization time: 3.5h → ≤1h (↓71%)
AI Code Generation
IDE completion for VS Code / JetBrains
Auto Test Generation
Unit & integration test scaffolding
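
To make the coding scenario concrete, here is a minimal sketch of how a capability such as the smart code review above could call Claude through Amazon Bedrock's Converse API. The model ID, prompt wording, and the review_pull_request helper are illustrative assumptions, not our production implementation; in practice the routing layer described later decides whether a request like this goes to Sonnet or Haiku.

```python
# Illustrative only: the model ID, prompts, and helper name are assumptions.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
SONNET = "anthropic.claude-sonnet-4-5-20250929-v1:0"  # placeholder model ID

def review_pull_request(diff: str) -> str:
    """Ask Claude for a structured review of a unified diff."""
    response = bedrock.converse(
        modelId=SONNET,
        system=[{"text": "You are a senior reviewer for TypeScript, C#, Go, and "
                          "Python game code. Flag bugs, performance issues, and "
                          "missing tests; suggest concrete fixes."}],
        messages=[{"role": "user", "content": [{"text": f"Review this diff:\n\n{diff}"}]}],
        inferenceConfig={"maxTokens": 2048, "temperature": 0.2},
    )
    return response["output"]["message"]["content"][0]["text"]
```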

Multilingual AI Customer Service & Localization

12 countries · 10+ languages · Real-time conversations

Model
Sonnet (complex dialogue/localization) + Haiku (moderation/QA)
Tech Stack
Real-time chat, content pipeline, moderation system
Intent Recognition
72% (Qwen) → 94%+ (Claude)
Auto Resolution Rate
≥70%, response <3s
Content Moderation
85 items/sec, 94.2% accuracy
Cost Savings
$150K/mo → ≤$60K/mo (↓60%+)
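
As a rough illustration of the moderation path, the sketch below classifies individual chat messages with Haiku and fans the calls out across a thread pool, which is where the items-per-second throughput comes from. The model ID, verdict schema, and helper names are assumptions for illustration, not the production pipeline.

```python
# Illustrative sketch; model ID, verdict schema, and worker count are assumptions.
import json
from concurrent.futures import ThreadPoolExecutor

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
HAIKU = "anthropic.claude-haiku-4-5-20251001-v1:0"  # placeholder model ID

def moderate_one(text: str) -> dict:
    """Classify a single chat message as allow / review / block."""
    resp = bedrock.converse(
        modelId=HAIKU,
        system=[{"text": 'Moderate player chat. Reply with JSON only: '
                          '{"verdict": "allow|review|block", "reason": "..."}'}],
        messages=[{"role": "user", "content": [{"text": text}]}],
        inferenceConfig={"maxTokens": 128, "temperature": 0},
    )
    return json.loads(resp["output"]["message"]["content"][0]["text"])

def moderate_batch(items: list[str], workers: int = 32) -> list[dict]:
    """Fan calls out across threads; throughput scales with the worker count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(moderate_one, items))
```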

Game Operations AI Agent & Analytics

Intelligent operations across all game titles

Model
Haiku 4.5 (high throughput) + Sonnet (deep analysis)
Tech Stack
Agent framework, analytics pipeline, recommendation engine
Card Balance Analysis
Adjustment cycle: 3 weeks → ≤3 days (↓86%)
Anti-Cheat System
False positive: 5% → 0.5% (↓90%)
Operations Copilot
Analysis time: 3 days → 5 min via NL→SQL→Report (see the sketch below)
NPC Dynamic Dialogue
3M calls/day, <800ms, 10+ languages
Personalized Push
Open rate: 12% → 18%+
Churn Prediction
Accuracy ≥75%
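
The Operations Copilot row above compresses a three-step loop: a natural-language question goes in, SQL comes out, and a report comes back. A minimal sketch of that loop is shown below; the schema string, the ask helper, and the model ID are assumptions, and model-generated SQL should always be validated before it is executed.

```python
# Illustrative NL -> SQL -> Report loop; schema, helper names, and model ID are
# assumptions. Always validate model-generated SQL before running it.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
SONNET = "anthropic.claude-sonnet-4-5-20250929-v1:0"  # placeholder model ID

SCHEMA = "Tables: daily_active_users(date, region, dau); purchases(date, sku, revenue)"

def _claude(prompt: str) -> str:
    resp = bedrock.converse(
        modelId=SONNET,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 1024, "temperature": 0},
    )
    return resp["output"]["message"]["content"][0]["text"]

def ask(question: str, run_sql) -> str:
    """Natural-language question -> SQL -> query results -> short report."""
    sql = _claude(f"{SCHEMA}\nWrite one read-only SQL query answering: {question}\n"
                  "Return SQL only, no explanation.")
    rows = run_sql(sql)  # caller supplies a read-only database executor
    return _claude(f"Question: {question}\nSQL: {sql}\nRows: {rows}\n"
                   "Summarize the answer as a short operations report.")
```
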
Benchmark Results

Claude vs Qwen Comparison

An April 2026 blind test across 1,200 multilingual samples, 450 code tasks, and additional diagnosis, moderation, and reasoning cases. Claude consistently outperformed Qwen-Max in every scenario.

| Test Scenario | Claude Sonnet 4.6 | Qwen-Max | Gap |
| --- | --- | --- | --- |
| Multilingual Intent Recognition (6 langs, 1,200 samples) | 92.6% | 73.7% | +18.9pp |
| Code Generation Accuracy (450 tasks) | 84.8% | 61.0% | +23.8pp |
| Long-Context Bug Diagnosis (20 cases) | 85.0% | 35.0% | +50.0pp |
| Content Moderation Throughput (10K items) | 85 items/s, 94.2% accuracy | 45 items/s, 78.5% accuracy | +89% throughput |
| Card Balance Reasoning (5 cases) | 4.2/5 | 2.8/5 | +1.4 pts |
| Weighted Overall Score | 92.3 | 56.5 | +35.8 |

Key Takeaway: Claude achieves a weighted overall score of 92.3 versus Qwen's 56.5, a 63% relative improvement that translates directly into better player experiences.

Infrastructure

Technical Architecture

A 5-layer architecture designed for high throughput, low latency, and cost efficiency at scale.

Access Layer
CloudFront + ALB (Dual AZ)
Routing Layer
GameAI Hub — Smart routing (95% → Haiku, 5% → Sonnet)
AI Layer
7 Bedrock Agents + Knowledge Bases (RAG) + OpenSearch
Cache Layer
ElastiCache Redis (response cache hit rate 45%+)
Data Layer
Aurora MySQL + S3
Dual AZ Deployment
High availability across availability zones
Smart Routing
95% requests to Haiku for cost efficiency
45% Cache Hit Rate
Redis response caching reduces AI calls
Cost Efficiency

72% Cost Reduction

Smart optimization strategies that reduce our AI infrastructure costs from $723K to $200K per month without sacrificing performance.

Cost Breakdown

Base Price (no optimization): $723K/mo
Prompt Cache (Sonnet 45% hit, Haiku 78% hit)
ElastiCache response cache (45% hit rate)
Smart model routing (95% requests → Haiku)
Batch processing for non-realtime workloads
Actual MRR: ~$200K/mo
Total Savings: $523K/month (72% reduction)

Optimization Strategies

Intelligent Model Routing

~60% savings

GameAI Hub routes 95% of requests to Haiku 4.5 ($0.25/MTok) and only escalates complex tasks to Sonnet 4.6 ($3/MTok).
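
A minimal sketch of that routing decision might look like the following; the task categories, token threshold, and model IDs are assumptions, since the real GameAI Hub criteria are not documented here. Because Haiku handles the long tail of short, simple requests, the blended per-token price stays close to Haiku's rate.

```python
# Illustrative routing heuristic; categories, threshold, and model IDs are assumptions.
HAIKU = "anthropic.claude-haiku-4-5-20251001-v1:0"     # placeholder model IDs
SONNET = "anthropic.claude-sonnet-4-5-20250929-v1:0"

COMPLEX_TASKS = {"bug_diagnosis", "card_balance_analysis", "localization_review"}

def pick_model(task_type: str, prompt_tokens: int) -> str:
    """Send short, simple requests to Haiku; escalate the rest to Sonnet."""
    if task_type in COMPLEX_TASKS or prompt_tokens > 8_000:
        return SONNET   # roughly the 5% escalation path
    return HAIKU        # roughly the 95% default path
```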

Multi-Layer Caching

~25% savings

Prompt caching reduces repeated context costs. Redis response cache eliminates redundant AI calls entirely.
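
The response-cache half of this can be sketched as a Redis lookup keyed by a hash of the model and normalized prompt; the endpoint, key format, and TTL below are illustrative assumptions rather than our production configuration.

```python
# Illustrative Redis response cache; endpoint, key prefix, and TTL are assumptions.
import hashlib

import redis

cache = redis.Redis(host="example-elasticache-endpoint", port=6379, decode_responses=True)

def cached_completion(model_id: str, prompt: str, call_model) -> str:
    """Serve repeated prompts from Redis and skip the Bedrock call on a hit."""
    key = "aicache:" + hashlib.sha256(f"{model_id}|{prompt}".encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return hit                      # cache hit: no model invocation at all
    answer = call_model(model_id, prompt)
    cache.setex(key, 3600, answer)      # cache for one hour; tune per workload
    return answer
```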

Batch Processing

~15% savings

Non-realtime workloads (analytics, reports, content generation) run in batch mode at a 50% discount.
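
One way to use that discounted path is Bedrock's batch inference API, sketched below. The job name, S3 URIs, role ARN, and model ID are placeholders, and the exact job configuration should be checked against the current Bedrock documentation.

```python
# Illustrative batch submission; names, ARNs, S3 URIs, and model ID are placeholders.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

job = bedrock.create_model_invocation_job(
    jobName="nightly-analytics-reports",
    modelId="anthropic.claude-haiku-4-5-20251001-v1:0",          # placeholder
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",   # placeholder
    inputDataConfig={"s3InputDataConfig": {"s3Uri": "s3://example-bucket/batch/input.jsonl"}},
    outputDataConfig={"s3OutputDataConfig": {"s3Uri": "s3://example-bucket/batch/output/"}},
)
print(job["jobArn"])  # poll job status until the results land in S3
```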

Token Optimization

~12% savings

Structured prompts, response compression, and context windowing minimize token usage per request.
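
Context windowing, for example, can be as simple as trimming conversation history to a token budget before each call. The sketch below uses a rough four-characters-per-token estimate, which is an assumption; a real tokenizer gives tighter bounds.

```python
# Illustrative context windowing; the chars-per-token estimate is a rough assumption.
def window_history(turns: list[dict], max_tokens: int = 2000) -> list[dict]:
    """Keep only the most recent turns that fit the token budget."""
    kept, used = [], 0
    for turn in reversed(turns):                   # walk from newest to oldest
        cost = max(1, len(turn["text"]) // 4)      # ~4 characters per token
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))                    # restore chronological order
```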

Roadmap

Project Milestones

A phased rollout strategy that minimizes risk while maximizing value delivery at each stage.

Phase 0
2026/05

AI Readiness Assessment

Phase 1 MVP · ~$50K
2026/06-07

AI Coding + India Customer Service

Phase 2 Expansion · ~$150K
2026/07-09

Full-language CS + Moderation + Anti-Cheat

Phase 3 Full Scale · ~$200K
2026/10-11

All scenarios live + optimization

Build With Our AI Platform

Whether you're looking to integrate AI into your game operations or explore partnership opportunities, let's discuss how our platform can accelerate your goals.