Week 7 Automating LLM Calls Tutorial
In this week’s tutorial you will:
- Write Python scripts to use the OpenAI API (basic setup)
- Use the text completion feature to generate content
- Integrate extracted code knowledge into prompts and automatically call the API to generate tests
- Format and save results
Tutorials are one hour. Work through the core activities first; extension topics (embedding, fine-tuning, etc.) are for self-study using the references.
Prerequisites
- Python 3.8+ — for running scripts
- OpenAI API key — from the OpenAI platform (or the provided API keys). Store it in an environment variable (e.g. OPENAI_API_KEY) and never commit it to version control.
- openai Python package — install with pip install openai
References: OpenAI API documentation, Ultimate guide to OpenAI Python library
Outline (1 hour)
| Part | Activity | Time (guide) |
|---|---|---|
| 1 | API setup and text completion | ~20 min |
| 2 | Integrate code knowledge into a prompt and generate tests | ~25 min |
| 3 | Format and save results | ~15 min |
| 4 | Extension: Embedding and fine-tuning (reference only) | — |
Activity 1: API setup and text completion (~20 min)
Task 1.1: Environment and first call
- Create a new Python file (e.g. llm_client.py).
- Set your API key (e.g. from the environment): api_key = os.getenv("OPENAI_API_KEY")
- Use the OpenAI client to call the chat completions (or text completion) API with a short prompt, e.g. “In one sentence, what is a unit test?”
- Print the model’s reply.
Example structure (adapt to your endpoint and model name):
```python
import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.chat.completions.create(
    model="gpt-4o-mini",  # or the model your course uses
    messages=[{"role": "user", "content": "In one sentence, what is a unit test?"}],
)
print(response.choices[0].message.content)
```
What would you need to change to send a system message (e.g. “You are a Java testing expert”) plus a user message?
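One possible answer, as a sketch: prepend an entry with role "system" to the messages list. The helper below only builds the list; build_messages is an illustrative name, and the actual API call (shown as a comment) works as in the example above.

```python
def build_messages(system_prompt: str, user_prompt: str) -> list:
    """Build a chat message list: a system instruction followed by a user message."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You are a Java testing expert.",
    "In one sentence, what is a unit test?",
)
# Then pass it to the API, e.g.:
# response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
```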
Task 1.2: Text completion with a simple test-generation prompt
Call the API with a prompt that asks for a single JUnit test method for a given method signature, e.g.
“Generate one JUnit 4 test method for: public int add(int a, int b). Return only the test code.”
Inspect the response: is it valid Java? Does it need trimming (e.g. markdown code fences)?
Responses may be wrapped in markdown code fences (three-backtick lines, often tagged java). Strip the fences and extract the code before saving or compiling.
Activity 2: Integrate code knowledge and generate tests (~25 min)
Here you use extracted code knowledge (e.g. from SootUp or your A2 pipeline) inside the prompt, then call the API to generate tests.
Task 2.1: Build a prompt that includes code context
Assume you have a string variable code_context containing, for example:
- Focal class and method signature
- Relevant field types and method names (e.g. from a simple code analysis or stub)
Write a template that combines:
- A short system or user instruction: “You are a Java testing expert. Generate JUnit 4 test methods.”
- The code context (class name, method signature, and any other knowledge).
- A clear user request: “Generate exactly one JUnit test method for the focal method. Return only the test code, no explanation.”
Example:
```python
def build_test_generation_prompt(class_name: str, method_signature: str, extra_context: str = "") -> str:
    instruction = "You are a Java testing expert. Generate JUnit 4 test methods."
    context = f"Class: {class_name}\nFocal method: {method_signature}\n{extra_context}"
    request = "Generate exactly one JUnit test method for the focal method. Return only the test code."
    return f"{instruction}\n\n{context}\n\n{request}"
```
Call the API with this prompt and capture the returned text.
Task 2.2: Automate one shot per focal method
Using your template, write a small loop (or single call) that, for one focal method (e.g. from a list or from your A2 output):
- Builds the prompt with that method’s code knowledge.
- Calls the OpenAI API.
- Extracts the raw response (and strips markdown if present).
- Stores the result in a dict (e.g. method_id -> generated_code).
Respect rate limits and cost. Use a small model or few calls during the tutorial.
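The loop can be sketched as below. The API call is passed in as a function, so the pipeline can be tried with a stub before spending any real calls; focal_methods, call_llm, generate_tests and strip_code_fences are illustrative names, not a required interface.

```python
import re


def strip_code_fences(text: str) -> str:
    """Remove a surrounding code fence (with an optional language tag), if present."""
    match = re.match(r"^```[a-zA-Z]*\n(.*?)\n?```$", text.strip(), re.DOTALL)
    return match.group(1).strip() if match else text.strip()


def generate_tests(focal_methods: dict, call_llm) -> dict:
    """Generate one test per focal method.

    focal_methods: method_id -> (class_name, method_signature)
    call_llm: function taking a prompt string and returning the raw model reply
    Returns: method_id -> cleaned generated test code
    """
    results = {}
    for method_id, (class_name, signature) in focal_methods.items():
        prompt = (
            "You are a Java testing expert. Generate JUnit 4 test methods.\n\n"
            f"Class: {class_name}\nFocal method: {signature}\n\n"
            "Generate exactly one JUnit test method for the focal method. "
            "Return only the test code."
        )
        results[method_id] = strip_code_fences(call_llm(prompt))
    return results


# With the real API, call_llm would wrap the client from Activity 1, e.g.:
# call_llm = lambda p: client.chat.completions.create(
#     model="gpt-4o-mini", messages=[{"role": "user", "content": p}]
# ).choices[0].message.content
```

During the tutorial, keep focal_methods to one or two entries so the stub-free run stays within rate limits and cost.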
Activity 3: Format and save results (~15 min)
Task 3.1: Normalise and format generated code
- From the raw API response, remove markdown code fences (the three-backtick lines, with or without a java tag) and leading/trailing whitespace.
- Optionally run a formatter (e.g. a Java formatter script or a “pretty-print” step) so saved tests look consistent.
- Decide on a simple convention: e.g. one file per focal method, or one file per class with multiple test methods.
Task 3.2: Save to files
Write the processed test code to disk:
- Use a clear naming scheme, e.g. Test_ClassName_methodName.java or ClassName_methodName_test.java.
- Save under a dedicated folder (e.g. generated_tests/) so you can run or inspect them later.
You now have a minimal pipeline: code context → prompt → API call → extract → format → save. You can extend this with more code knowledge (e.g. branch coverage goals) or multiple focal methods.
Extension (self-study)
- Embedding feature: Use the API to obtain embeddings for method names or code snippets; useful for retrieval or clustering. See the OpenAI embeddings guide (or equivalent).
- Fine-tuning: for custom behaviour on your codebase, see the OpenAI fine-tuning guide. Fine-tuning is not expected during the one-hour tutorial.