AI-powered Assessment Generator
From 90 Minutes to 5: Designing an AI Workflow that Cut Assessment Creation Time by 95%

Project Overview & Context
My Role
Lead Product Designer
Team
2 Product Managers,
5+ Software Engineers,
1 AI/ML Specialist
Product Focus
A client-facing, B2B SaaS platform for creating skills assessments for various purposes
Core Business Goal
Reduce Ops dependency and the time to create assessments by building a self-serve, AI-powered platform focused specifically on the test-creation workflow
Foundational Research and Problem Quantification
The First Impression
Our product was failing at the most critical moment: a user's first impression.
The manual assessment builder was so complex that it was actively driving away new business and frustrating our loyal users.
Undertaking Research
Contextual Inquiry
Strategic focus: I observed users as they attempted the task in the current system.
Key insight gained: Established the baseline metrics and exposed the problem areas of the workflow.
Post-task Survey
Strategic focus: A quick survey capturing how much effort users spent completing the task.
Key insight gained: Turned the raw user experience into quantifiable metrics to reference in later stages.
Semi-Structured Interviews
Strategic focus: I asked users open-ended questions about their experience, frustrations, and workarounds.
Key insight gained: Qualitative insights into the current system that surfaced further problem areas rooted in real usage.
Competitive Analysis (with PMs)
Strategic focus: Reviewed organisations offering AI-powered solutions in all shapes and forms to map the market.
Key insight gained: Product-level insights into the benefits and drawbacks of each competitor's approach.
The Problem: Quantified
I ran a baseline study with 10 participants (a mix of new and experienced users), combining observation with qualitative follow-ups, to quantify the exact operational cost of the workflow and understand the "why" behind the numbers.
Time-to-First-Value (TTFV)
Baseline: ~90 minutes | Success criteria: ~5 minutes
Impact on strategic goal: A new user needed an average of 90 minutes to get any value from the product, evidence of a steep learning curve blocking adoption.
Expert User Time on Task
Baseline: ~25 minutes | Success criteria: ~5 minutes
Impact on strategic goal: This proved it was not a training issue; even power users needed ~25 minutes for a task they performed regularly.
Zero-Intervention Success Rate
Baseline: 40% | Success criteria: 90%
Impact on strategic goal: Only 2 of 5 new users completed the task without help, proving the self-service model was fundamentally broken.
User Effort Score
Baseline: 6.2/7 | Success criteria: < 2.0/7 (Very Easy)
Impact on strategic goal: Survey scores showed users found the task "Very Difficult."
Support Ticket Volume
Baseline: ~40% of new-user tickets | Success criteria: reduce by > 75%
Impact on strategic goal: This one broken workflow was responsible for ~40% of all new-user support tickets, a significant and unnecessary operational cost.
A note on sampling: our 10 participants were split into two key segments: 5 new users (to measure our baseline TTFV) and 5 experienced users (to measure expert time on task). This allowed us to quantify both the onboarding failure and the ongoing inefficiency of the old system.
To summarise
The data and metrics showed we had a classic "Leaky Bucket": we were spending money to acquire new clients, only for them to be unable to use the core feature immediately, because our core assessment workflow was sub-optimal.
This baseline data was the foundation for my design strategy.
The goal was no longer to just "make it better," but to radically reduce Time-to-First-Value and create a frictionless self-service experience.
Finding the Right Solution
Exploration & The Process
My process was not linear.
It involved rapid exploration, using low-fidelity designs to reveal critical constraints, and negotiating a hybrid solution that was best for the user and the business.
Ideation: From Sketches to Two Conflicting Paths
I began with broad exploration, creating numerous paper sketches and user flows. This work helped my team visualize the possibilities, and our ideas quickly converged into two conflicting paths.
To move the debate from abstract to concrete, I created low-fidelity wireframes for both potential solutions.
Path A: The "In-line Assistant". An AI assistant embedded inside the new manual test-creation flow, offering contextual suggestions at each step.
Path B: The "Generator". A completely separate workflow where a user enters a prompt and the AI generates a complete, finished test.
Review & Decision: Using Design to Reveal Constraints
Tech / Engineering (artifact: a quick proof-of-concept)
Implementation issue: The In-line Assistant was technically too complex to implement deeply within our codebase and current resources.
Product / Business (artifact: validation with clients & the product roadmap)
Strategic issue: The Generator would compete with, and largely negate, our newly developed manual test-creation platform, which was not a wise product-growth strategy.
We were at an impasse. Tech had vetoed Path A, and Product had vetoed Path B.
Crucially, my own competitive analysis confirmed that Path B was a flawed user experience anyway, as competitors with 'Full Generators' all suffered from the same user complaint: zero control.
My Proposal (The "Blueprint")
This is where I proposed a third, hybrid solution based on what we had learned. We could get the speed of the "Generator" (which Tech approved) without negating our other product (which satisfied Product's concern).
My proposal was "The Blueprint Generator."
This was the right decision because it solved all three problems at once:
User Problem: It gives users the 95% speed boost they need, while also providing the control that competitors were missing.
Business Problem: It doesn't compete with the manual platform; it acts as a "super-powered on-ramp" to it, enhancing its value.
Technology Problem: It was feasible to build in parallel with the new manual workflow.
To summarise
I began by creating low-fidelity wireframes for two potential paths: an "In-line Assistant" and a "Full Generator." My designs helped reveal that the In-line Assistant was technically unfeasible, while competitive analysis and product strategy showed the Generator was a flawed user experience.
I successfully navigated this impasse by proposing a third, hybrid "Blueprint" solution that was technically feasible, strategically aligned with our other products, and offered the ideal balance of speed and user control.
The Solution
The Solution: The "Blueprint" Generator
Based on our strategic decision, I designed the "Blueprint Generator."
This solution transforms the user's role from a manual builder into a strategic reviewer, solving our 90-minute onboarding crisis.
The entire experience is powered by our in-house AI model, which was trained on over 1000 of our own high-quality assessments. This gave us the unique ability to accurately extract skills from a job description and build a relevant test structure - a key technical enabler that made this design possible.
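To make that enabler concrete, here is a minimal sketch of the skill-extraction step. This is an illustration only: the taxonomy, the function name (extract_skills), and the keyword matching are assumptions standing in for the in-house model, which performs this with a trained classifier rather than string matching.

```python
import re

# Hypothetical skill taxonomy: the real system derives skills with a model
# trained on 1,000+ in-house assessments, not from a keyword list.
SKILL_TAXONOMY = ["python", "sql", "data modeling", "rest apis", "communication"]

def extract_skills(job_description: str) -> list[tuple[str, int]]:
    """Return (skill, mention_count) pairs found in the text, most-mentioned first."""
    text = job_description.lower()
    found = [
        (skill, len(re.findall(r"\b" + re.escape(skill) + r"\b", text)))
        for skill in SKILL_TAXONOMY
    ]
    return sorted([(s, n) for s, n in found if n > 0], key=lambda p: p[1], reverse=True)

jd = "We need a backend engineer strong in Python and SQL. Python is a must."
print(extract_skills(jd))  # [('python', 2), ('sql', 1)]
```

Whatever the underlying model, the design depends only on this input/output shape: a job description goes in, a ranked, editable list of skills comes out.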
Key Design Decisions
To make this successful, my design was guided by a few core principles:
Speed to Value Above All
Take the user from a blank page to a usable blueprint in under 5 minutes. This demanded a simple, single-prompt interface.
Build Trust Through Transparency
The user must be able to see what the AI did (e.g., "Skills identified: Python, SQL"), stay in control, and always be able to edit the output. This was critical for user trust.
Empower, Don't Trap
The user must never feel stuck in an automated flow. The "Seamless Handoff" was designed to feel like an empowerment step, not a restrictive one.
Increased Visibility of Collateral
The user should see the Pre-built and Custom Assessments already in their account at all times, so they can reuse an existing one instead of creating a new one every time.
Blueprint Generator: The Three-Step Flow
Step 1: Provide Input
Action: The user pastes a job description or types a simple prompt.
Benefit: Minimal effort is required from the user to get started.
Step 2: Generate Blueprint
Action: The AI reads the input, identifies key skills, and builds a recommended 'blueprint.' It also surfaces existing 'Recommended Tests' from the user's account to prevent duplicate work.
Benefit: Eliminates the 90-minute "blank page" problem and does 90% of the work.
Step 3: Seamless Handoff
Action: The user reviews the blueprint and is seamlessly transitioned into the manual workflow, now pre-populated with the AI's blueprint.
Benefit: The user gets the speed of a generator and the full control of the manual builder (a rough sketch of this handoff follows below).
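As a rough sketch of the data contract behind this three-step flow: every name below (Section, Blueprint, handoff_to_manual_builder) is an illustrative assumption, not our production schema.

```python
from dataclasses import dataclass, field

# Illustrative data contract for the Blueprint flow; all names and fields
# here are assumptions for this sketch, not the production schema.
@dataclass
class Section:
    skill: str            # e.g. "Python", as identified in Step 2
    question_count: int   # AI-recommended number of questions
    difficulty: str       # e.g. "intermediate"

@dataclass
class Blueprint:
    title: str
    sections: list[Section] = field(default_factory=list)
    # Existing tests surfaced from the user's account to prevent duplicate work
    recommended_tests: list[str] = field(default_factory=list)

def handoff_to_manual_builder(blueprint: Blueprint) -> dict:
    """Step 3: translate the approved blueprint into the manual builder's
    pre-populated state, so the user keeps full editing control."""
    return {
        "title": blueprint.title,
        "sections": [vars(s) for s in blueprint.sections],
        "mode": "manual",  # the user lands in the normal manual workflow
    }

bp = Blueprint("Backend Engineer Screen", [Section("Python", 10, "intermediate")])
print(handoff_to_manual_builder(bp)["sections"][0]["skill"])  # Python
```

The key design choice this sketch captures is that the handoff emits the manual builder's own state, not a finished test: the AI proposes, the user disposes.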
The Results
Design Validation (Testing the Solution)
Since the final product is still in development, we could not measure post-launch business metrics. The critical next step was to validate our "Blueprint" hypothesis and prove, with data, that the design was ready for development.
I ran a final round of moderated usability testing with our high-fidelity prototype. We used the same 10-participant split (5 new, 5 expert) to measure our new design directly against the "before" metrics.
The results proved that our "Blueprint" strategy was a success.
Final Result: Quantified Impact
Time-to-First-Value (TTFV)
Baseline: ~90 minutes | Validation test: ~4.5 minutes
Impact: A 95% reduction in time for new users. We turned a 90-minute crisis into a 5-minute "wow" moment.
Expert User Time on Task
Baseline: ~25 minutes | Validation test: ~4.5 minutes
Impact: An 82% reduction in time for expert users, proving the new flow was the superior path for all users, not just novices.
Zero-Intervention Success Rate
Baseline: 40% | Validation test: 90% (9 of 10 users)
Impact: We met our 90% goal, proving the design was truly intuitive and solved our self-service problem.
User Effort Score
Baseline: 6.2/7 | Validation test: 1.8/7 (Very Easy)
Impact: We flipped user sentiment from "frustration" to "delight." The task was no longer a pain point.
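For transparency, the headline reductions follow directly from the raw timings in the table above; a quick check:

```python
# Quick check of the headline reductions, using the values from the table above.
for label, before, after in [("TTFV", 90, 4.5), ("Expert time on task", 25, 4.5)]:
    print(f"{label}: {(before - after) / before:.0%} reduction")
# TTFV: 95% reduction
# Expert time on task: 82% reduction
```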
The data was clear: the Blueprint solution successfully solved our critical onboarding crisis and met or exceeded every success criterion we set.
What Users Said (Qualitative Feedback)
The numbers showed us the design was a success, but the user quotes told us why.
On Speed & Ease of Use
"This was magical. I used to spend 25 minutes just getting the structure right. With AI help, I can just focus on content, not the structure."
"Can I really have a full-blown usable assignment in 10 mins? Amazing!"
On Trust & Control
"I've used other AI tools that just spit out a final thing, and I hate it. I loved this because it showed me the blueprint, let me agree with it, and then I could still make my own changes. I actually trust this."
"I love that AI model we are using to get the blueprint right is trained directly on our content library which we painstaking populated for last 5-7 years"
Conclusion
Key Challenges & Learnings
This project was not just a design challenge; it was a strategic one. Navigating the following challenges was critical to the project's success.
Building User Trust in AI
Description: Our early research showed that expert users were skeptical of AI; they didn't believe a "black box" could build a high-quality, usable assessment for them.
Learning: For AI products, transparency is more important than speed. My "Blueprint" solution succeeded because it was transparent: it let the user review and approve the AI's work at every step before committing, which built the trust necessary for them to adopt the feature.
Balancing User Needs vs. Business Strategy
Description: We were stuck at an impasse: Tech vetoed the "In-line Assistant," and Product vetoed the "Full Generator" because it competed with our other new platform.
Learning: The best solution is often a hybrid that re-frames the problem. My "Blueprint" solution did more than just compromise; it created an outcome superior to either of our original paths.
Proving the Final Business Value
Description: Our validation testing proved the usability of the design, but the product has not yet launched, so the business impact is still unproven.
Learning: A designer's job isn't done at the validation stage. The ultimate measure of this design's success is not just the 95% time reduction, but a measurable decrease in customer churn. My immediate next step is to partner with our data team to track these business metrics post-launch and prove the design's true ROI.
This design has been validated to solve our core usability crisis, and I am confident that its launch will drive the significant business results we set out to achieve.