# E-commerce Analytics Conversation Example
This example shows how a business analyst might interact with an LLM to analyze e-commerce data using the PostgreSQL MCP Server.
## Conversation Flow
### 🧑‍💼 User
"I need to analyze our e-commerce performance for the last month. Can you help me understand our sales trends, top products, and customer behavior?"
### 🤖 Assistant
I'll help you analyze your e-commerce performance using your PostgreSQL database. Let me start by exploring the database structure and then gather the key metrics you need.
**First, let me check what tables are available:**
```json
{
"name": "list_tables",
"arguments": {
"schema": "public"
}
}
```
**Result:**
```
Tables in schema 'public':
- users (BASE TABLE)
- categories (BASE TABLE)
- products (BASE TABLE)
- addresses (BASE TABLE)
- orders (BASE TABLE)
- order_items (BASE TABLE)
- cart_items (BASE TABLE)
- reviews (BASE TABLE)
- coupons (BASE TABLE)
- order_coupons (BASE TABLE)
```
**Now let me get detailed information about the orders table to understand the data structure:**
```json
{
"name": "describe_table",
"arguments": {
"table_name": "orders",
"schema": "public"
}
}
```
**Result:**
```
Table 'public.orders' structure:
- id: integer NOT NULL
- order_number: character varying NOT NULL
- user_id: integer NULL
- status: character varying NOT NULL DEFAULT 'pending'
- subtotal: numeric NOT NULL
- tax_amount: numeric NULL DEFAULT 0
- shipping_amount: numeric NULL DEFAULT 0
- discount_amount: numeric NULL DEFAULT 0
- total_amount: numeric NOT NULL
- currency: character varying NULL DEFAULT 'USD'
- payment_status: character varying NOT NULL DEFAULT 'pending'
- payment_method: character varying NULL
- shipping_address_id: integer NULL
- billing_address_id: integer NULL
- notes: text NULL
- created_at: timestamp without time zone NULL DEFAULT CURRENT_TIMESTAMP
- updated_at: timestamp without time zone NULL DEFAULT CURRENT_TIMESTAMP
- shipped_at: timestamp without time zone NULL
- delivered_at: timestamp without time zone NULL
```
Perfect! Now let me analyze your sales performance for the last month. I'll start with overall sales metrics:
**Monthly Sales Overview:**
```json
{
"name": "query",
"arguments": {
"sql": "SELECT COUNT(*) as total_orders, SUM(total_amount) as total_revenue, AVG(total_amount) as average_order_value, COUNT(DISTINCT user_id) as unique_customers FROM orders WHERE created_at >= CURRENT_DATE - INTERVAL '30 days' AND status NOT IN ('cancelled', 'refunded')"
}
}
```
**Query Results (1 row):**
```json
[
{
"total_orders": 152,
"total_revenue": "45280.50",
"average_order_value": "297.90",
"unique_customers": 89
}
]
```
📊 Query Analysis:
⚡ Moderate complexity query (complexity: 3/10)
💡 Quick optimization tips:
• Run EXPLAIN ANALYZE to see the actual execution plan
• Consider adding an index on orders.created_at if queries are slow
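As a quick sanity check, the reported average order value should equal total revenue divided by total orders. Using the figures from the result above:

```python
# Sanity check: average order value = total revenue / total orders
# (figures taken from the query result above).
total_revenue = 45280.50
total_orders = 152

average_order_value = round(total_revenue / total_orders, 2)
print(average_order_value)  # 297.9, matching the reported $297.90
```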
**Daily Sales Trend:**
```json
{
"name": "query",
"arguments": {
"sql": "SELECT DATE(created_at) as sales_date, COUNT(*) as daily_orders, SUM(total_amount) as daily_revenue FROM orders WHERE created_at >= CURRENT_DATE - INTERVAL '30 days' AND status NOT IN ('cancelled', 'refunded') GROUP BY DATE(created_at) ORDER BY sales_date DESC LIMIT 10"
}
}
```
**Query Results (10 rows, first 5 shown):**
```json
[
{"sales_date": "2024-06-20", "daily_orders": 8, "daily_revenue": "2156.75"},
{"sales_date": "2024-06-19", "daily_orders": 6, "daily_revenue": "1892.40"},
{"sales_date": "2024-06-18", "daily_orders": 12, "daily_revenue": "3421.80"},
{"sales_date": "2024-06-17", "daily_orders": 5, "daily_revenue": "1654.25"},
{"sales_date": "2024-06-16", "daily_orders": 9, "daily_revenue": "2789.30"}
]
```
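Once the daily rows are out of the database, they can be aggregated with a few lines of client-side code. A minimal sketch over the five days shown above:

```python
# Aggregate the daily sales rows returned above (five most recent days shown).
daily_rows = [
    {"sales_date": "2024-06-20", "daily_orders": 8, "daily_revenue": "2156.75"},
    {"sales_date": "2024-06-19", "daily_orders": 6, "daily_revenue": "1892.40"},
    {"sales_date": "2024-06-18", "daily_orders": 12, "daily_revenue": "3421.80"},
    {"sales_date": "2024-06-17", "daily_orders": 5, "daily_revenue": "1654.25"},
    {"sales_date": "2024-06-16", "daily_orders": 9, "daily_revenue": "2789.30"},
]

# Revenue comes back as strings (numeric columns), so convert before summing.
total_revenue = round(sum(float(r["daily_revenue"]) for r in daily_rows), 2)
avg_daily_revenue = round(total_revenue / len(daily_rows), 2)
print(total_revenue, avg_daily_revenue)  # 11914.5 2382.9
```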
**Top-Selling Products:**
```json
{
"name": "query",
"arguments": {
"sql": "SELECT p.name, p.sku, SUM(oi.quantity) as units_sold, SUM(oi.total_price) as total_revenue, COUNT(DISTINCT oi.order_id) as number_of_orders FROM order_items oi JOIN products p ON oi.product_id = p.id JOIN orders o ON oi.order_id = o.id WHERE o.created_at >= CURRENT_DATE - INTERVAL '30 days' AND o.status NOT IN ('cancelled', 'refunded') GROUP BY p.id, p.name, p.sku ORDER BY total_revenue DESC LIMIT 5"
}
}
```
**Query Results (5 rows, top 3 shown):**
```json
[
{"name": "MacBook Pro 16\"", "sku": "MBP16-M3", "units_sold": 8, "total_revenue": "19992.00", "number_of_orders": 8},
{"name": "iPhone 15 Pro", "sku": "IPH15-PRO", "units_sold": 15, "total_revenue": "14985.00", "number_of_orders": 15},
{"name": "The Great Gatsby", "sku": "BOOK-GG-001", "units_sold": 45, "total_revenue": "584.55", "number_of_orders": 32}
]
```
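Dividing revenue by units sold recovers the effective per-unit price for each product, which is a useful cross-check on the aggregates (figures from the result above):

```python
# Cross-check: revenue / units sold should equal the per-unit price.
top_products = {
    "MBP16-M3": (19992.00, 8),
    "IPH15-PRO": (14985.00, 15),
    "BOOK-GG-001": (584.55, 45),
}

unit_prices = {
    sku: round(revenue / units, 2) for sku, (revenue, units) in top_products.items()
}
print(unit_prices)  # MacBook at $2499.00, iPhone at $999.00, book at $12.99
```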
### 🧑‍💼 User
"This is great! Can you also show me customer segmentation - like new vs returning customers, and identify any potential issues with abandoned carts?"
### 🤖 Assistant
Absolutely! Let me analyze your customer segmentation and cart abandonment patterns.
**Customer Segmentation (New vs Returning):**
```json
{
"name": "query",
"arguments": {
"sql": "WITH customer_first_order AS (SELECT user_id, MIN(created_at) as first_order_date FROM orders WHERE status NOT IN ('cancelled', 'refunded') GROUP BY user_id), monthly_orders AS (SELECT o.user_id, COUNT(*) as order_count FROM orders o WHERE o.created_at >= CURRENT_DATE - INTERVAL '30 days' AND o.status NOT IN ('cancelled', 'refunded') GROUP BY o.user_id) SELECT CASE WHEN cfo.first_order_date >= CURRENT_DATE - INTERVAL '30 days' THEN 'New Customer' ELSE 'Returning Customer' END as customer_type, COUNT(*) as customer_count, SUM(mo.order_count) as total_orders, AVG(mo.order_count) as avg_orders_per_customer FROM monthly_orders mo JOIN customer_first_order cfo ON mo.user_id = cfo.user_id GROUP BY customer_type"
}
}
```
**Query Results (2 rows):**
```json
[
{
"customer_type": "New Customer",
"customer_count": 34,
"total_orders": 42,
"avg_orders_per_customer": "1.24"
},
{
"customer_type": "Returning Customer",
"customer_count": 55,
"total_orders": 110,
"avg_orders_per_customer": "2.00"
}
]
```
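From these two rows, the customer mix and the order-frequency gap can be derived directly:

```python
# Derive customer-mix percentages and the order-frequency gap
# from the segmentation result above.
new_customers, returning_customers = 34, 55
total_customers = new_customers + returning_customers  # 89, matches the overview

new_share = round(100 * new_customers / total_customers)              # ~38%
returning_share = round(100 * returning_customers / total_customers)  # ~62%
frequency_ratio = round(2.00 / 1.24, 2)  # returning vs. new avg orders per customer

print(new_share, returning_share, frequency_ratio)  # 38 62 1.61
```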
**Cart Abandonment Analysis:**
```json
{
"name": "query",
"arguments": {
"sql": "SELECT COUNT(DISTINCT ci.user_id) as users_with_cart_items, COUNT(DISTINCT o.user_id) as users_who_ordered, COUNT(DISTINCT ci.user_id) - COUNT(DISTINCT o.user_id) as potential_abandoned_carts FROM cart_items ci LEFT JOIN orders o ON ci.user_id = o.user_id AND o.created_at >= CURRENT_DATE - INTERVAL '7 days' WHERE ci.created_at >= CURRENT_DATE - INTERVAL '7 days'"
}
}
```
**Query Results (1 row):**
```json
[
{
"users_with_cart_items": 67,
"users_who_ordered": 23,
"potential_abandoned_carts": 44
}
]
```
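The abandonment rate follows directly from these counts:

```python
# Approximate cart-abandonment rate from the counts above.
users_with_cart_items = 67
users_who_ordered = 23

abandoned = users_with_cart_items - users_who_ordered  # 44
abandonment_rate = round(100 * abandoned / users_with_cart_items, 1)
print(f"{abandonment_rate}%")  # 65.7%, i.e. roughly 66%
```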
**Let me also check for any performance issues in our queries:**
```json
{
"name": "validate_query",
"arguments": {
"sql": "SELECT * FROM orders WHERE user_id = 123 OR created_at LIKE '%2024%'",
"schema": "public"
}
}
```
**Query Analysis Report:**
```
==================================================
Valid: ❌ No
Complexity: 3/10
🔒 Security Issues:
WARNING: Potential SQL injection with OR condition
💡 Use parameterized queries instead
⚡ Performance Warnings:
WARNING: SELECT * can be inefficient
💡 Specify only needed columns instead of using SELECT *
WARNING: Functions in WHERE clause can prevent index usage
💡 Consider restructuring query to avoid functions on indexed columns
💡 Optimization Suggestions:
1. Run EXPLAIN ANALYZE to see the actual execution plan
2. Consider adding an index on orders.user_id if queries are slow
3. Avoid unnecessary type conversions by ensuring proper column data types
```
**Here's a better version of that query:**
```json
{
"name": "validate_query",
"arguments": {
"sql": "SELECT id, order_number, total_amount, status, created_at FROM orders WHERE user_id = $1 AND created_at >= '2024-01-01' AND created_at < '2025-01-01' ORDER BY created_at DESC",
"schema": "public"
}
}
```
**Query Analysis Report:**
```
==================================================
Valid: ✅ Yes
Complexity: 2/10
💡 Optimization Suggestions:
1. Run EXPLAIN ANALYZE to see the actual execution plan
2. For ORDER BY with LIMIT, ensure there's an index on the ORDER BY columns
```
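The `$1` placeholder above is PostgreSQL's native parameter syntax (used by drivers such as asyncpg; psycopg uses `%s` instead). The principle is the same in any driver: the value travels separately from the SQL text, so it can never be interpreted as SQL. A minimal, self-contained illustration using Python's built-in sqlite3 module (which uses `?` placeholders):

```python
import sqlite3

# Self-contained parameterized-query demo using the stdlib sqlite3 driver.
# sqlite3 uses `?` placeholders; PostgreSQL drivers use `$1` or `%s` instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, user_id INTEGER, total_amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 123, 99.50), (2, 123, 45.00), (3, 456, 10.00)],
)

# The user-supplied value is passed separately from the SQL text,
# so it is bound as data and never parsed as SQL.
user_id = 123
rows = conn.execute(
    "SELECT id, total_amount FROM orders WHERE user_id = ? ORDER BY id",
    (user_id,),
).fetchall()
print(rows)  # [(1, 99.5), (2, 45.0)]
```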
## 📊 Analysis Summary
Based on the data analysis:
### 📈 **Sales Performance (Last 30 Days)**
- **Total Orders:** 152 orders
- **Revenue:** $45,280.50
- **Average Order Value:** $297.90
- **Unique Customers:** 89
### 👥 **Customer Insights**
- **New Customers:** 34 (38% of customer base)
- **Returning Customers:** 55 (62% of customer base)
- **Returning customers** place about 1.6x more orders on average (2.00 vs 1.24)
### 🛒 **Cart Abandonment**
- **Abandonment Rate:** ~66% (44 out of 67 users with cart items didn't complete purchase)
- **Opportunity:** Focus on cart recovery campaigns
### 🏆 **Top Products**
1. **MacBook Pro 16"** - $19,992 revenue (8 units)
2. **iPhone 15 Pro** - $14,985 revenue (15 units)
3. **The Great Gatsby** - $585 revenue (45 units)
### 💡 **Recommendations**
1. **Implement cart abandonment email campaigns** - 66% abandonment rate suggests significant recovery opportunity
2. **Focus on customer retention** - Returning customers place about 1.6x more orders on average
3. **Optimize high-value product pages** - MacBook Pro drives 44% of revenue from just 8 sales
### 🧑‍💼 User
"Can you create a query to identify customers who might be interested in a premium product bundle based on their purchase history?"
### 🤖 Assistant
Great idea! Let me create a sophisticated query to identify high-value customers who would be good candidates for premium product bundles.
**First, let me validate the query approach:**
```json
{
"name": "validate_query",
"arguments": {
"sql": "SELECT u.id, u.email, u.first_name, u.last_name, COUNT(o.id) as total_orders, SUM(o.total_amount) as total_spent, AVG(o.total_amount) as avg_order_value, MAX(o.created_at) as last_order_date, STRING_AGG(DISTINCT c.name, ', ') as purchased_categories FROM users u JOIN orders o ON u.id = o.user_id JOIN order_items oi ON o.id = oi.order_id JOIN products p ON oi.product_id = p.id JOIN categories c ON p.category_id = c.id WHERE o.status = 'delivered' AND o.created_at >= CURRENT_DATE - INTERVAL '6 months' GROUP BY u.id, u.email, u.first_name, u.last_name HAVING COUNT(o.id) >= 2 AND SUM(o.total_amount) >= 500 ORDER BY total_spent DESC",
"schema": "public"
}
}
```
**Query Analysis Report:**
```
==================================================
Valid: ✅ Yes
Complexity: 7/10
⚡ Performance Warnings:
INFO: Consider adding LIMIT clause
💡 Add LIMIT clause to prevent large result sets
💡 Optimization Suggestions:
1. Run EXPLAIN ANALYZE to see the actual execution plan
2. Consider adding an index on orders.user_id if queries are slow
3. For complex JOINs, consider the join order - start with the most selective table
4. For ORDER BY with LIMIT, ensure there's an index on the ORDER BY columns
```
The query looks good but has high complexity. Let me execute it with a LIMIT for better performance:
**Premium Bundle Candidates:**
```json
{
"name": "query",
"arguments": {
"sql": "SELECT u.id, u.email, u.first_name, u.last_name, COUNT(o.id) as total_orders, SUM(o.total_amount) as total_spent, AVG(o.total_amount) as avg_order_value, MAX(o.created_at) as last_order_date, STRING_AGG(DISTINCT c.name, ', ') as purchased_categories FROM users u JOIN orders o ON u.id = o.user_id JOIN order_items oi ON o.id = oi.order_id JOIN products p ON oi.product_id = p.id JOIN categories c ON p.category_id = c.id WHERE o.status = 'delivered' AND o.created_at >= CURRENT_DATE - INTERVAL '6 months' GROUP BY u.id, u.email, u.first_name, u.last_name HAVING COUNT(o.id) >= 2 AND SUM(o.total_amount) >= 500 ORDER BY total_spent DESC LIMIT 10"
}
}
```
**Query Results (10 rows, top 2 shown):**
```json
[
{
"id": 42,
"email": "john.doe@email.com",
"first_name": "John",
"last_name": "Doe",
"total_orders": 5,
"total_spent": "3247.50",
"avg_order_value": "649.50",
"last_order_date": "2024-06-15T10:30:00",
"purchased_categories": "Electronics, Computers"
},
{
"id": 17,
"email": "sarah.tech@company.com",
"first_name": "Sarah",
"last_name": "Johnson",
"total_orders": 3,
"total_spent": "2890.00",
"avg_order_value": "963.33",
"last_order_date": "2024-06-18T14:22:00",
"purchased_categories": "Electronics, Computers, Smartphones"
}
]
```
📊 Query Analysis:
⚡ High complexity query (complexity: 7/10)
💡 Quick optimization tips:
• Run EXPLAIN ANALYZE to see the actual execution plan
• For complex JOINs, consider the join order - start with the most selective table
• For ORDER BY with LIMIT, ensure there's an index on the ORDER BY columns
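The per-customer averages in the result can be cross-checked directly: for each row, `avg_order_value` should equal `total_spent / total_orders`.

```python
# Cross-check avg_order_value = total_spent / total_orders for the two
# candidates shown above: (email, total_spent, total_orders, reported AOV).
candidates = [
    ("john.doe@email.com", 3247.50, 5, 649.50),
    ("sarah.tech@company.com", 2890.00, 3, 963.33),
]

for email, spent, orders, reported_aov in candidates:
    assert round(spent / orders, 2) == reported_aov, email
print("averages consistent")
```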
## 🎯 Premium Bundle Recommendations
### **Target Criteria Met:**
- **Minimum 2 orders** in the last 6 months
- **Minimum $500 total spent**
- **Delivered orders only** (confirmed satisfaction)
### **Top Candidates for Premium Bundles:**
1. **John Doe** (john.doe@email.com)
- 5 orders, $3,247 spent
- High AOV: $649.50
- Categories: Electronics, Computers
- **Bundle Suggestion:** "Pro Creator Bundle" (MacBook + Accessories)
2. **Sarah Johnson** (sarah.tech@company.com)
- 3 orders, $2,890 spent
- Very High AOV: $963.33
- Categories: Electronics, Computers, Smartphones
- **Bundle Suggestion:** "Complete Tech Bundle" (Laptop + Phone + Accessories)
### **Next Steps:**
1. **Personalized Email Campaign** with bundle recommendations
2. **Early Access** to new premium products
3. **Exclusive Discounts** on high-value bundles
4. **VIP Customer Program** enrollment
This analysis shows clear patterns of high-value customers who purchase across multiple technology categories - perfect candidates for premium bundles! 🚀
---
*This conversation demonstrates how the MCP server enables sophisticated business intelligence queries while providing real-time optimization suggestions and security validation.*