# AI/ML/LLM Python Guidelines

## General approach

- Start from a clear problem definition: inputs, outputs, constraints, evaluation.
- Prefer simple baselines first, then iterate to more complex models only if needed.
- Isolate model logic from IO, configuration, and orchestration.
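
As a rough illustration of the points above, the sketch below fits a majority-class baseline and keeps the model, the metric, and the wiring code in separate pieces. The names `MajorityBaseline` and `accuracy`, and the toy spam/ham data, are hypothetical and chosen only for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MajorityBaseline:
    """Predicts the most frequent training label, ignoring the input."""

    label: str

    @classmethod
    def fit(cls, labels: Sequence[str]) -> "MajorityBaseline":
        if not labels:
            raise ValueError("need at least one training label")
        # most_common(1) returns [(label, count)] for the most frequent label.
        return cls(label=Counter(labels).most_common(1)[0][0])

    def predict(self, inputs: Sequence[str]) -> list[str]:
        return [self.label for _ in inputs]


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Evaluation lives apart from the model so any model can reuse it."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and equal in length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


# Orchestration: the only place that wires data, model, and metric together.
# Toy data, made up for this sketch.
train_labels = ["spam", "ham", "ham", "ham"]
test_inputs = ["win a prize", "lunch at noon"]
test_labels = ["spam", "ham"]

baseline = MajorityBaseline.fit(train_labels)
baseline_accuracy = accuracy(baseline.predict(test_inputs), test_labels)
print(f"baseline accuracy: {baseline_accuracy:.2f}")
```

A more complex model only earns its place if it beats `baseline_accuracy` on the same evaluation function.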

## Libraries and tooling

- Use mainstream, well-supported libraries:
  - `numpy`, `pandas` for data handling
  - `torch` or `tensorflow` where heavy ML is required
  - `scikit-learn` for classical ML.
- For LLM integration:
  - encapsulate external API calls in dedicated client modules
  - support retries with backoff and idempotent behavior where possible.
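
One possible shape for such a client module is sketched below. It assumes the provider call is passed in as a plain function (`send`) and that retryable failures surface as a hypothetical `TransientLLMError`; neither name comes from a real SDK. Retrying is only safe when repeating the request has no side effects, which is why idempotent behavior matters here.

```python
import random
import time
from typing import Callable


class TransientLLMError(Exception):
    """Hypothetical error type for failures worth retrying (timeouts, rate limits)."""


class LLMClient:
    """Wraps a provider call so the retry policy lives in one place."""

    def __init__(
        self,
        send: Callable[[str], str],
        max_attempts: int = 3,
        base_delay: float = 0.5,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        self._send = send
        self._max_attempts = max_attempts
        self._base_delay = base_delay
        self._sleep = sleep  # injectable so tests do not actually wait

    def complete(self, prompt: str) -> str:
        for attempt in range(1, self._max_attempts + 1):
            try:
                return self._send(prompt)
            except TransientLLMError:
                if attempt == self._max_attempts:
                    raise
                # Exponential backoff (base, 2*base, 4*base, ...) with jitter
                # that scales each delay by a factor between 0.5 and 1.0.
                delay = self._base_delay * 2 ** (attempt - 1)
                self._sleep(delay * random.uniform(0.5, 1.0))
        raise AssertionError("unreachable: the loop returns or raises")


# Usage with a stub that fails twice, then succeeds.
calls = {"count": 0}


def flaky_send(prompt: str) -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientLLMError("simulated timeout")
    return f"echo: {prompt}"


delays: list[float] = []
client = LLMClient(flaky_send, sleep=delays.append)
result = client.complete("hello")
print(result, calls["count"], len(delays))
```

Injecting `send` and `sleep` keeps the retry logic testable without network access or real waiting.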

## LLM usage patterns

- Separate:
  - prompt construction
  - model invocation
  - parsing and validation of responses.
- Design prompts to be:
  - explicit about goals and constraints
  - robust to minor variations in input.
- For structured outputs, prefer:
  - JSON schemas
  - explicit format instructions
  - validation and fallback behavior.
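
The separation above might look like the following minimal sketch. The sentiment task, the field names, and the `FALLBACK` value are assumptions made for illustration, and the validation is hand-written rather than driven by a JSON Schema library.

```python
import json
from typing import Callable, Optional

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}
FALLBACK = {"sentiment": "neutral", "confidence": 0.0}


def build_prompt(review: str) -> str:
    """Prompt construction: goal, constraints, and output format stated explicitly."""
    return (
        "Classify the sentiment of the review below.\n"
        "Respond with JSON only, in the form "
        '{"sentiment": "positive" | "negative" | "neutral", '
        '"confidence": <number between 0 and 1>}.\n'
        f"Review: {review.strip()}"
    )


def parse_response(raw: str) -> Optional[dict]:
    """Parsing and validation: return a clean dict, or None if the reply is unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    sentiment = data.get("sentiment")
    confidence = data.get("confidence")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        return None
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    return {"sentiment": sentiment, "confidence": float(confidence)}


def classify(review: str, invoke: Callable[[str], str]) -> dict:
    """Model invocation is injected, so tests can pass a stub instead of a real API."""
    parsed = parse_response(invoke(build_prompt(review)))
    return parsed if parsed is not None else dict(FALLBACK)


# Usage with stubbed model replies: one well-formed, one free-text.
good = classify("Great product", lambda p: '{"sentiment": "positive", "confidence": 0.9}')
bad = classify("Great product", lambda p: "Sure! The sentiment is positive.")
print(good)
print(bad)
```

Because each stage is its own function, the prompt wording, the transport, and the validation rules can change independently.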

## Performance and cost awareness

- Minimize redundant calls to external LLMs:
  - cache deterministic or semi-deterministic sub-steps where possible
  - batch requests when APIs support it.
- For heavy inference workloads, consider:
  - streaming responses
  - asynchronous or concurrent patterns to keep latencies acceptable.
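
A minimal sketch of caching combined with bounded concurrency, using only `asyncio` from the standard library. The `CachedLLM` class and the `fake_invoke` stub are hypothetical; a real deployment would also need cache expiry, and caching by prompt is only sound when the call is deterministic enough for reuse.

```python
import asyncio
from typing import Awaitable, Callable


class CachedLLM:
    """Caches responses by prompt and caps the number of in-flight requests."""

    def __init__(
        self,
        invoke: Callable[[str], Awaitable[str]],
        max_concurrency: int = 4,
    ) -> None:
        self._invoke = invoke
        self._max_concurrency = max_concurrency
        self._cache: dict[str, str] = {}

    async def complete_many(self, prompts: list[str]) -> list[str]:
        # Created per call so it is bound to the currently running event loop.
        semaphore = asyncio.Semaphore(self._max_concurrency)

        async def fetch(prompt: str) -> None:
            async with semaphore:
                self._cache[prompt] = await self._invoke(prompt)

        # De-duplicate while preserving order, and skip prompts already cached.
        missing = [p for p in dict.fromkeys(prompts) if p not in self._cache]
        await asyncio.gather(*(fetch(p) for p in missing))
        return [self._cache[p] for p in prompts]


# Usage with a stub that records how often the "API" is really called.
call_log: list[str] = []


async def fake_invoke(prompt: str) -> str:
    call_log.append(prompt)
    await asyncio.sleep(0)  # stands in for network latency
    return prompt.upper()


llm = CachedLLM(fake_invoke, max_concurrency=2)
first = asyncio.run(llm.complete_many(["a", "b", "a"]))
second = asyncio.run(llm.complete_many(["b", "c"]))
print(first, second, len(call_log))
```

Five prompts are requested across the two calls, but the stub is invoked only three times: the duplicate `"a"` and the already cached `"b"` never reach it.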

## Evaluation and safety

- For ML/LLM components, propose evaluation strategies:
  - metrics, test datasets, golden test cases.
- Explicitly note limitations and potential failure modes.
- Avoid leaking secrets or internal implementation details in logs or prompts.
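
The sketch below shows one way to run golden test cases and to mask secrets before anything is logged. The cases, the `stub_model`, and the `sk-` key pattern are all assumptions made for this example; real secret formats vary, so the pattern would need adapting.

```python
import re
from typing import Callable

# Hypothetical golden cases: inputs paired with outputs a reviewer has approved.
GOLDEN_CASES = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("3 * 3", "9"),
]

# Assumed secret shape for illustration: "sk-" followed by 8 or more alphanumerics.
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{8,}")


def redact(text: str) -> str:
    """Mask anything that looks like an API key before it reaches a log line."""
    return SECRET_PATTERN.sub("[REDACTED]", text)


def evaluate(model: Callable[[str], str], cases: list[tuple[str, str]]) -> dict:
    """Exact-match accuracy plus the failing cases, so regressions are easy to inspect."""
    if not cases:
        raise ValueError("need at least one golden case")
    failures = []
    for prompt, expected in cases:
        actual = model(prompt).strip()
        if actual != expected:
            # Redact before storing, since failure reports often end up in logs.
            failures.append(
                {"prompt": redact(prompt), "expected": expected, "actual": redact(actual)}
            )
    passed = len(cases) - len(failures)
    return {"accuracy": passed / len(cases), "failures": failures}


def stub_model(prompt: str) -> str:
    """Stand-in for a real model: knows two of the three golden answers."""
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "unknown")


report = evaluate(stub_model, GOLDEN_CASES)
print(f"accuracy: {report['accuracy']:.2f}")
print(report["failures"])
print(redact("Authorization: sk-abc12345XYZ"))
```

Exact match is a deliberately strict metric; free-text outputs usually need a looser comparison, which is itself a limitation worth documenting.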