Apply Single-Case Experimental Designs: Testing What Works for Your Client
If you’re a BCBA or clinical supervisor, you’ve likely faced this question: How do I know this intervention actually caused the behavior change I’m seeing, not just coincidence or maturation? Single-case experimental designs (SCEDs) are the answer. They let you prove that an intervention caused a specific change in one client’s behavior—and they’re one of the most practical tools in your clinical toolkit.
This post walks you through what SCEDs are, why they matter in practice, and how to design, run, and interpret them ethically. You’ll see concrete examples, learn common pitfalls, and get clear guidance on choosing the right design for your situation.
One-Paragraph Summary
A single-case experimental design measures one individual’s behavior repeatedly over time, then systematically changes the intervention to see if the behavior changes too. The key ingredients are a baseline phase (measuring behavior before you intervene), one or more intervention phases (applying treatment while continuing to measure), and replication (repeating this cycle to show the effect isn’t a fluke). You analyze the resulting graph visually—looking at level, trend, variability, and how quickly the behavior shifts—to judge whether the intervention caused the change. SCEDs work best when you need data-backed decisions for a single client, want to validate a new treatment before scaling up, or need to monitor progress in real time. The ethical foundation is clear: define what you’ll measure, get informed consent, set stopping rules in advance, and avoid unnecessarily withholding effective treatment just to prove a point.
Clear Explanation of the Topic
What Is a Single-Case Experimental Design?
An SCED involves three elements working together. First, you measure a specific behavior repeatedly and reliably—often daily or several times per week—so you have a steady stream of data points. Second, you deliberately change the conditions (usually by adding, removing, or switching an intervention) and keep measuring. Third, you use the resulting graph to see whether the behavior shifts in line with your condition changes.
The person becomes their own control group. Instead of comparing one group receiving treatment to another that doesn’t, you compare the same person to themselves across different phases. That’s the logic of “single-case”—within-subject comparison, not between-subject.
The Four Core Components
Baseline (Phase A): Before introducing an intervention, you collect data on the target behavior. This isn’t a quick snapshot; it’s a period of stable measurement—often 5–10 data points or longer, depending on variability. The baseline answers: What is the behavior like naturally, without this intervention? Stability matters because if behavior is already trending up or down before you intervene, you can’t confidently attribute later changes to your intervention.
Intervention (Phase B): Once baseline is solid, you introduce your treatment—and define it clearly. If it’s Functional Communication Training, write down exactly what that looks like: which words the client will learn, how you’ll teach them, what reinforcer you’ll use, how often you’ll practice. This ensures anyone reading your data understands what actually changed.
Phase Changes: Depending on your design, you move from baseline to intervention, then possibly back to baseline, or stagger interventions across different settings or behaviors. Each condition change tests whether the behavior responds.
Replication: This separates a lucky observation from proof. If behavior improves during intervention, returns to baseline levels when you remove the intervention, and improves again when you reintroduce it, you’ve replicated the effect. Replication—whether across multiple phase cycles or different baselines—is your strongest evidence that the intervention caused the change.
Reading the Graph: Visual Analysis
When you graph SCED data, look for four visual features:
Level refers to the height of the data points. Did the behavior drop when you introduced the intervention, or rise? A sudden shift in level is strong evidence that something changed.
Trend is the slope—whether the line is moving up, down, or staying flat. A steady downward trend during intervention (for behavior you want to reduce) is good; a downward trend already happening during baseline is a red flag.
Variability is how scattered the data points are. High variability makes it harder to see real changes. If your baseline is all over the place, it’s harder to claim the intervention caused a tighter, more stable pattern.
Immediacy of effect asks: did the behavior change right when you introduced the intervention, or did it take weeks? A quick shift strengthens your causal claim.
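If you log session data electronically, you can also summarize these four features numerically to supplement (never replace) the graph. The sketch below is a minimal illustration in Python, assuming hypothetical daily counts stored as plain lists; the numbers and variable names are made up for the example.

```python
import statistics

# Hypothetical daily counts of the target behavior (illustration only)
baseline = [5, 4, 4, 5, 4, 5, 4, 4]      # Phase A
intervention = [2, 1, 1, 2, 1, 0, 1, 1]  # Phase B

def slope(data):
    """Least-squares slope: the average change in the behavior per session."""
    n = len(data)
    mean_x, mean_y = (n - 1) / 2, statistics.mean(data)
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(data))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

for name, phase in [("Baseline", baseline), ("Intervention", intervention)]:
    print(f"{name}: level = {statistics.mean(phase):.1f}, "
          f"trend = {slope(phase):+.2f}/session, "
          f"variability (SD) = {statistics.stdev(phase):.2f}")

# Immediacy: compare the last 3 baseline points to the first 3 intervention points
shift = statistics.mean(baseline[-3:]) - statistics.mean(intervention[:3])
print(f"Immediacy of effect: level dropped by {shift:.1f} at the phase change")
```

Numbers like these help you describe a graph consistently across team members, but the visual judgment still comes first.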
Single-Case Designs vs. Group Designs
In a group design, you randomly assign people to an intervention group or control group and compare average outcomes. In an SCED, you measure one person intensively across conditions. Group designs answer “Does this intervention work on average?” SCEDs answer “Did this intervention work for this person?”
Both are valuable, but SCEDs are better when you have one client who needs answers now, when group sizes are tiny, or when a treatment is new enough to warrant careful testing before scaling up.
Experimental Control vs. Mere Observation
An important line exists between monitoring (collecting repeated data) and experimenting (collecting repeated data and deliberately changing conditions). If you collect 12 weeks of behavior data but never change your intervention, that’s progress monitoring—valuable, but not experimental. To claim experimental control, you need to manipulate conditions and see the behavior respond to your changes, not just time passing.
SCEDs vs. Functional Analysis
SCEDs and functional analysis are related but serve different purposes. A functional analysis tests why a behavior occurs—what maintains it (attention, escape, sensory, tangible). An SCED tests whether your chosen intervention actually changes the target behavior. You might use FA results to design an intervention, then use an SCED to prove that intervention works.
Why This Matters
The Clinical Reality
In practice, behavior changes for lots of reasons. The client could be maturing, a family crisis could have ended, or the weather could have shifted. Without careful measurement and design, you might credit your intervention for changes it didn’t cause—or miss credit for changes it did. That’s not just a research problem; it’s a clinical ethics problem.
SCEDs let you answer with confidence: This intervention caused this change for this person. That confidence matters when deciding whether to continue an intervention, when families ask if treatment is working, when insurance demands proof of progress, or when training a team member to deliver the same intervention reliably.
Data-Driven Decisions
Beyond proving cause-and-effect, SCEDs give you quick, ongoing feedback. Because you’re graphing data frequently, you can see patterns emerge in real time. If an intervention isn’t working, a well-designed SCED will show you within days or weeks, not months. That feedback loop lets you adjust before time and resources are wasted.
Protecting Clients from Ineffective or Harmful Treatments
SCEDs also protect clients. If an intervention is ineffective, clear data shows it. If a treatment is unintentionally harmful—increasing aggression when it’s meant to reduce it—visual analysis will catch it. This is especially important in ABA, where behavior-reduction strategies can carry risk if poorly selected or implemented.
Key Features and Defining Characteristics
An SCED is defined by five hallmarks:
- Repeated, frequent measurement of the target behavior—multiple times per day or at least several times per week
- Clear baseline and intervention phases that everyone can identify
- Systematic manipulation of the independent variable according to a plan
- Replication of effects within the design—whether through ABAB reversal, multiple-baseline staggering, or alternating treatments
- Visual analysis as the primary tool—you draw a graph and look at it
Additional hallmarks include interobserver agreement checks and clear operational definitions of what you’re measuring.
Boundary Conditions and Limitations
SCEDs work wonderfully for behaviors you can measure frequently and reliably—aggression, on-task behavior, hand-raising, compliance, communication attempts. They’re less suited to very rare events (if a behavior happens once a month, you’ll collect baseline data for a year) or outcomes that change very slowly.
SCEDs are also ethically complicated when the behavior is serious and the intervention is known to work; it may be unethical to withhold effective treatment just to complete an ABAB design. That’s when you pivot to multiple-baseline or alternating-treatments designs, which demonstrate control without requiring withdrawal.
When You Would Use This in Practice
Decision Points for Choosing an SCED
Ask yourself: Do I need to test whether this intervention works for this specific person? If yes, an SCED is appropriate.
You might choose an SCED when:
- A client hasn’t responded to standard interventions and you’re piloting something new
- You’re introducing a program to a classroom and want to validate it with one student before rolling it out
- A family is investing significant time and money and deserves proof that treatment is working
- You can’t do a group design due to ethical or logistical constraints
- You need fast feedback—within days or weeks, not months
Real-World Scenarios
Imagine a third-grader displays aggressive outbursts in the classroom. You could run an ABAB design: baseline for two weeks, introduce Functional Communication Training for two weeks, temporarily reduce FCT coaching to see if aggression returns, then reinstate FCT. If aggression follows that pattern—high, down, up, down again—you’ve shown experimental control.
Or consider a teenager who struggles with task-following across three settings: classroom, home, and community vocational placement. You could use a multiple-baseline design: measure task-following in all three settings without intervention, then introduce your intervention in the classroom first while baselines continue elsewhere. After classroom task-following improves and holds, introduce the intervention at home, then in the community. Because each setting shows a level shift precisely when you introduce the intervention, you’ve demonstrated that the intervention—not time or other factors—drove the change.
Examples in ABA
Example 1: ABAB Reversal Design for Aggressive Outbursts
A 7-year-old child with autism displays aggressive outbursts (hitting, kicking) in the classroom, averaging 3–5 incidents per school day. You hypothesize the behavior is maintained by escape from difficult tasks. You operationalize Functional Communication Training: the child learns to say “break” when task demands feel overwhelming, and the teacher honors the request by pausing the task for 30 seconds.
Phase A (Baseline): For 10 school days, you record each outburst. Data shows a stable average of 4 per day.
Phase B (FCT Intervention): For 10 school days, the teacher implements FCT exactly as defined. Outbursts drop to an average of 1 per day, and communication attempts increase.
Phase A2 (Return to Baseline): You reduce FCT coaching temporarily. Outbursts creep back up to 3 per day.
Phase B2 (Intervention Reintroduced): FCT coaching resumes fully. Outbursts drop to 1 per day again.
The behavior followed your intervention changes in a predictable pattern. That replication across four phases is powerful evidence that FCT caused the change.
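If you want to see how a graph like this might be produced from logged counts, here is a minimal matplotlib sketch. The daily numbers below are hypothetical stand-ins, not real client data, and the plotting choices simply mirror common SCED graphing conventions (a data path broken at each phase-change line).

```python
import matplotlib.pyplot as plt

# Hypothetical daily outburst counts for each 10-day phase (illustration only)
phases = [
    ("A: Baseline",     [4, 5, 4, 4, 3, 5, 4, 4, 4, 3]),
    ("B: FCT",          [2, 1, 1, 0, 1, 1, 2, 1, 0, 1]),
    ("A2: Withdrawal",  [2, 3, 3, 4, 3, 3, 4, 3, 3, 3]),
    ("B2: FCT resumed", [1, 1, 0, 1, 1, 2, 1, 0, 1, 1]),
]

fig, ax = plt.subplots(figsize=(9, 4))
day = 1
for i, (label, counts) in enumerate(phases):
    days = range(day, day + len(counts))
    # Break the data path at each phase change, per SCED graphing convention
    ax.plot(days, counts, marker="o", color="black")
    ax.annotate(label, (day, 5.5), fontsize=8)
    day += len(counts)
    if i < len(phases) - 1:
        ax.axvline(day - 0.5, linestyle="--", color="gray")  # phase-change line

ax.set_ylim(0, 6)
ax.set_xlabel("School day")
ax.set_ylabel("Aggressive outbursts")
ax.set_title("ABAB reversal: FCT and aggression (hypothetical data)")
plt.tight_layout()
plt.show()
```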
Example 2: Multiple-Baseline Across Settings for Task-Following
A 15-year-old with an intellectual disability struggles to follow multi-step tasks in the classroom, at a community job site, and at home. You design a task-simplification and visual-schedule intervention and stagger its introduction across the three settings:
Weeks 1–3: Measure task-following in all three settings (classroom, community job site, home) without intervention. Baseline completion rates hover around 40% in each.
Weeks 4–6: Introduce visual schedules in the classroom only. Classroom completion jumps to 75%; community and home stay around 40%.
Weeks 7–9: Introduce the intervention at the community job site. Community completion rises to 70%; classroom holds at 75%; home still around 40%.
Weeks 10–12: Roll out to home. All three settings now show 70%+ completion.
You never withdraw a successful intervention, so it’s ethically cleaner than reversal. And because improvement in each setting happens only when you introduce the intervention there, the staggered pattern itself is your experimental control.
Examples Outside of ABA
Example 1: Physical Therapy Pain Management
A physical therapist works with a patient recovering from knee surgery. Daily pain ratings (0–10 scale) are collected during baseline—averaging 7 for two weeks. The therapist introduces a progressive resistance exercise program, and pain ratings drop to an average of 4 over the next two weeks. This simple AB design follows SCED logic, even though the outcome is pain rather than a behavioral target. Note, though, that a lone AB comparison suggests rather than proves causation; replication, such as a second baseline or a staggered start with another patient, would strengthen the claim.
Example 2: Education—Testing Reading Prompts
A special education teacher works with a fifth-grader who reads slowly and without comprehension. She uses an alternating-treatments design: On Mondays and Thursdays, she uses “prediction prompts” before each paragraph. On Tuesdays and Fridays, she uses “clarification prompts.” Reading fluency is measured each day.
Over four weeks, data shows prediction prompts consistently yield higher fluency. The teacher can now confidently recommend prediction prompts for this student—without creating a formal control condition where the student receives no help.
Common Mistakes and Misconceptions
Insufficient Baseline Data
A common trap is ending baseline too early. You might collect data for three days, see stability, and jump into intervention. The problem: three days isn’t enough to distinguish a true pattern from random noise. Collect at least five to ten data points and watch for stability visually. If baseline looks chaotic or is drifting, you need more data before moving on.
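Visual judgment is the standard here, but if you want a rough numeric screen, one heuristic sometimes used in practice is to ask whether about 80% of baseline points fall within 25% of the baseline median. The snippet below sketches that idea with made-up data; treat the thresholds as assumptions you can tighten or loosen, not fixed rules.

```python
import statistics

def looks_stable(data, band=0.25, threshold=0.80):
    """Rough screen: do at least `threshold` of points fall within
    `band` (here 25%) of the phase median? A prompt to look closer,
    not a verdict."""
    med = statistics.median(data)
    within = sum(abs(x - med) <= band * med for x in data)
    return within / len(data) >= threshold

print(looks_stable([4, 5, 4, 4, 3, 5, 4]))   # True  -> reasonable to move on
print(looks_stable([1, 6, 2, 8, 0, 7, 3]))   # False -> keep collecting baseline
```

Note that this screen only checks spread, not drift; a gently but consistently trending baseline can pass it and still be a poor launching point, which is why the graph stays primary.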
Changing Multiple Variables at Once
Imagine you implement FCT and increase reinforcement and add a visual schedule simultaneously. If behavior improves, which part worked? You’ve lost experimental control. Change one thing at a time. If multiple changes are necessary for safety, document that reasoning, but understand you’ve sacrificed clarity about what caused the effect.
Phases That Are Too Short
Short intervention phases can mislead you. Behavior needs time to stabilize. A two-day intervention phase won’t show you much. Aim for at least five to ten data points per phase, or long enough that stability or trend becomes visible.
Overrelying on Statistics and Ignoring Visual Analysis
SCEDs are visual. You draw a graph and look at it. Some clinicians run statistical tests without really examining the graph. Clear visual separation of phases is usually more persuasive than any statistical test. Use stats to supplement visual analysis if you wish, but never let them override what the graph shows.
Failing to Replicate Effects
One good phase change might be coincidence. Replication—showing the effect again in a second or third cycle, or across a second baseline—builds confidence.
Confusing Descriptive Monitoring with Experimental Design
You can track a client’s behavior every day without it being an SCED. If you measure progress but never deliberately manipulate conditions, that’s progress monitoring—important, but not experimental. An SCED requires you to systematically change something and watch the behavior respond.
Ethical Considerations
The Withdrawal Problem
The most ethically fraught moment is reversal. To show experimental control with an ABAB design, you withdraw an intervention that’s working. If that intervention prevents serious harm—like aggression causing injuries—deliberately stopping it can hurt the person you’re trying to help.
The solution isn’t to avoid SCEDs; it’s to choose designs that don’t require withdrawal. A multiple-baseline design demonstrates control without removing an effective intervention. An alternating-treatments design compares interventions side by side without fully stopping either. If reversal is your only option and withdrawal would cause harm, include built-in stopping rules: “If aggression exceeds X incidents per day, we reinstate the intervention immediately.”
Informed Consent and Transparency
Before you begin, explain the plan in plain language to the client (if age-appropriate) and their family. Walk them through: What behavior are we measuring? How will we measure it? What does the intervention involve? What are the risks? When will we decide to stop? Get written consent.
Measurement Reliability and Data Integrity
Ensure that whoever collects data is trained and reliable. Interobserver agreement—having two people independently measure the same behavior and comparing their scores—is the gold standard. Aim for 80%+ agreement. Guard against measurement drift and observer bias.
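For frequency data, the simplest version of this check is total-count IOA: for each session, divide the smaller observer's count by the larger, then average across sessions. A minimal sketch, using made-up counts for two hypothetical observers:

```python
# Hypothetical per-session frequency counts from two independent observers
observer_1 = [4, 5, 3, 4, 6]
observer_2 = [4, 4, 3, 5, 6]

# Total-count IOA per session: smaller count divided by larger count
per_session = [
    100.0 if max(a, b) == 0 else min(a, b) / max(a, b) * 100
    for a, b in zip(observer_1, observer_2)
]
mean_ioa = sum(per_session) / len(per_session)

print([f"{x:.0f}%" for x in per_session])        # ['100%', '80%', '100%', '80%', '100%']
print(f"Mean total-count IOA: {mean_ioa:.1f}%")  # 92.0%, above the 80% guideline
```

Total-count agreement is the most lenient IOA method; interval-by-interval or exact-count-per-interval approaches are stricter and worth using when the stakes are high.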
Defining Stopping Criteria in Advance
Before you start, decide when you’ll consider the intervention successful and when you’ll try something else. Predefined stopping criteria protect the client and prevent cherry-picking when to end based on results you like.
Social Validity and Client Dignity
Beyond the data, ask: Do the client and family believe the intervention is worthwhile? Do they notice improvement in areas that matter to them? Use informal check-ins to stay grounded in the client’s lived experience, not just numbers on a graph.
Practice Questions
Question 1: Strengthening Causal Claims
Scenario: A BCBA collects baseline data for five days and sees stable performance. She introduces an intervention and observes immediate improvement. What would most strengthen her causal claim?
Correct answer: Replication of the effect across phases (e.g., ABAB) or across settings/behaviors (multiple-baseline design).
Why it’s correct: One phase change could be coincidence. When behavior follows your intervention changes in a predictable pattern, coincidence becomes far less plausible. Replication is the hallmark of experimental control.
Question 2: Ethics and Withdrawal
Scenario: A BCBA wants to use ABAB reversal to test a behavior-reduction intervention. However, withdrawal would mean removing an intervention that significantly reduces self-injurious behavior. What’s the better choice?
Correct answer: Multiple-baseline across settings, behaviors, or participants; or alternating-treatments design.
Why it’s correct: Safety comes first. These designs demonstrate experimental control without removing a protective treatment.
Question 3: Baseline Trends and Interpretation
Scenario: A graph shows a gradual decline in target behavior during baseline and a continued (but slightly steeper) decline after intervention begins. What’s the most likely interpretation?
Correct answer: Cannot confidently conclude intervention caused change; the pre-existing baseline trend undermines the inference.
Why it’s correct: If behavior was already improving before intervention, the post-intervention improvement might just be a continuation of that trend.
Question 4: Alternating-Treatments Design Pitfall
Scenario: A teacher uses alternating treatments to compare two reading interventions. She alternates them daily for four weeks. After two weeks, she notices one is working better and starts using it more often. What’s the main problem?
Correct answer: She abandoned the planned, counterbalanced alternation partway through, so the two conditions are no longer delivered on a balanced schedule and sequence effects can bias the results.
Why it’s correct: An alternating-treatments comparison is only fair if both conditions run on the pre-planned, balanced schedule for the full comparison period. Switch to the apparent winner after the comparison is complete, not during it.
Question 5: Documentation and Ethics
Scenario: A BCBA is about to start an SCED with a new client. What should she document before beginning?
Correct answer: Informed consent, clear operational definitions, measurement method, specific phase-change criteria, and predefined stopping rules.
Why it’s correct: These elements protect the client, ensure transparency, enable replication, and document ethical practice.
Related Concepts
Functional analysis identifies why a behavior occurs. Results often inform the intervention you test with an SCED.
Multiple-baseline design staggers intervention introduction across settings, behaviors, or participants, demonstrating control without withdrawal.
Reversal (ABAB) design returns to baseline conditions after intervention, then reinstates intervention. Powerful experimentally but ethically constrained when withdrawal causes harm.
Alternating-treatments design rapidly compares two or more interventions applied to the same person.
Visual analysis is the primary method for interpreting SCED graphs—examining level, trend, variability, and immediacy of effect.
Interobserver agreement measures whether two independent observers agree on their data, assuring measurement reliability.
Social validity captures whether stakeholders believe the intervention is worthwhile and notice real-world improvements.
FAQs
What is a single-case experimental design?
A systematic approach where you repeatedly measure one person’s behavior over time, deliberately change conditions, and measure again to see if behavior changed in response. It uses the person as their own control and relies on visual analysis to judge whether the intervention caused the observed change.
How many baseline data points do I need?
Aim for at least five to ten data points, or long enough to see a clear pattern. If baseline is chaotic, collect more. The rule: enough to see the pattern clearly.
When should I use ABAB versus multiple-baseline design?
Use ABAB when withdrawing the intervention is safe and you can reliably replicate the effect. Use multiple-baseline when withdrawal is unethical, impractical, or you’re comparing across settings or behaviors.
Do I need statistical tests for SCEDs?
Visual analysis is primary and usually sufficient. If you add statistics, choose methods designed for single-case data and report both visual and statistical findings. Never let a statistic override what the graph shows.
How do I show experimental control?
Demonstrate that behavior changes only when you change conditions. Use immediacy, magnitude, replication, and consistency. Replication is your strongest card.
Is it ethical to stop a study early if the client is improving?
Yes—if you predefined stopping rules. Stopping early to protect a client’s welfare is ethically sound. Document the decision.
Can SCEDs be used for skill acquisition and behavior reduction?
Absolutely. SCEDs work for increasing desired behaviors and decreasing unwanted ones. The measurement and design logic is the same.
Key Takeaways
Single-case experimental designs answer the question: Did my intervention cause this change? They rest on three pillars: repeated, reliable measurement; systematic manipulation of an intervention; and replication showing the behavior responds to your changes.
Choose your design based on what’s ethical and feasible. ABAB reversal designs are powerful but risky if withdrawal would cause harm. Multiple-baseline and alternating-treatments designs offer strong experimental control without that risk.
Visual analysis—looking at level, trend, variability, and immediacy of change—is your primary tool. Measurement reliability matters; ensure clear definitions, trained data collectors, and agreement checks.
An SCED is not just a research exercise. It protects clients from ineffective treatments, documents progress transparently, and enables data-informed decisions in real time. Used thoughtfully, SCEDs transform individual client care from guesswork into science.