Files
pm-skills/pm-execution/commands/generate-data.md
T
Pawel Huryn 77dbdfa1b9 v1.0
2026-03-02 00:36:23 +01:00

2.7 KiB

description, argument-hint
description argument-hint
Generate realistic dummy datasets for testing — CSV, JSON, SQL inserts, or Python scripts <description of the data you need>

/generate-data -- Test Data Generator

Create realistic dummy datasets for development, testing, demos, or prototyping. Outputs as ready-to-use files in your preferred format.

Invocation

/generate-data 1000 users with names, emails, plan tier, signup date, and activity score
/generate-data E-commerce orders dataset: products, customers, timestamps, amounts
/generate-data Sample data matching this schema: [paste table definition]

Workflow

Step 1: Define the Dataset

Understand:

  • What entities? (users, orders, events, products, etc.)
  • What columns? (with data types and constraints)
  • How many rows?
  • Any relationships between tables?
  • Any specific distributions? (e.g., "80% should be on the free plan")
  • Any realistic constraints? (emails should be unique, dates should be chronological)

Step 2: Generate the Data

Apply the dummy-dataset skill:

  • Create a Python script that generates the dataset
  • Use realistic-looking data (not random strings): proper names, valid email formats, real-seeming dates
  • Respect constraints: unique IDs, foreign key relationships, chronological ordering
  • Apply specified distributions
  • Execute the script and produce the output file

Step 3: Deliver

Output in the requested format (or ask):

  • CSV: Most common, works everywhere
  • JSON: For API testing or frontend development
  • SQL INSERT: For populating test databases
  • Python script: For reproducible generation (user can tweak and re-run)
## Generated Dataset: [Description]

**Rows**: [count]
**Columns**: [list]
**Format**: [CSV / JSON / SQL / Python]

### Schema
| Column | Type | Constraints | Distribution |
|--------|------|-----------|-------------|

### Sample (first 5 rows)
[Preview of the data]

### Files
- [data file]
- [generator script, if applicable]

Save data file and generator script to the user's workspace.

Step 4: Offer Follow-ups

  • "Want me to add more columns or increase the dataset size?"
  • "Should I create related tables (e.g., orders for these users)?"
  • "Want me to write test scenarios that use this data?"
  • "Should I create SQL queries to analyze this dataset?"

Notes

  • Always provide the generator script so the user can regenerate with different parameters
  • For demo datasets, make the data tell a story (e.g., seasonal trends, a retention problem, a power user segment)
  • Respect realistic cardinality: 1000 users don't have 1000 unique cities
  • For financial data, use realistic price distributions — not uniform random
  • Never include real personal data — all names, emails, and identifiers must be fake