Software Test Lead

Job Overview

We are seeking a Software Test Lead to drive both software quality assurance and Large Language Model (LLM) evaluation, leading a team of QA engineers and model evaluators across test automation, capability benchmarking, and quality reporting.

Key Responsibilities

Software Testing & QA Leadership

  • Design, review, and lead the implementation of test plans, test cases, and test strategies for various software components (APIs, services, UI).
  • Oversee test automation script development using tools such as PyTest, Selenium, Playwright, or Postman (a minimal PyTest sketch follows this list).
  • Maintain and optimize test automation pipelines, integrating with CI/CD tools (e.g., Jenkins, GitLab CI, Azure DevOps).
  • Lead functional, regression, smoke, and performance testing efforts to validate system readiness.
  • Ensure traceability from requirements to test cases and bug reports.
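
For illustration, here is a minimal PyTest sketch of the kind of API-level smoke test this role oversees; the base URL, payload fields, and requirement marker are hypothetical placeholders, not an actual stack.

  import pytest
  import requests

  BASE_URL = "https://api.example.com"  # hypothetical service under test

  # Custom marker for requirements traceability (register it in pytest.ini
  # to avoid unknown-marker warnings).
  @pytest.mark.requirement("REQ-101")
  def test_get_user_returns_expected_fields():
      # Smoke-level check: the endpoint responds and the payload carries required keys.
      response = requests.get(f"{BASE_URL}/users/1", timeout=10)
      assert response.status_code == 200
      assert {"id", "name", "email"} <= response.json().keys()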

LLM Evaluation & Benchmarking

  • Lead a team responsible for the evaluation of Large Language Model (LLM) outputs.
  • Design capability-based evaluation benchmarks (e.g., summarization, reasoning, math, code generation).
  • Guide the development and execution of auto-evaluation scripts using LLM-as-a-judge, rule-based, and metric-based methods (see the sketch after this list).
  • Build and maintain evaluation pipelines to track model accuracy, hallucination rates, robustness, and other quality signals.
  • Collaborate closely with AI Engineers and Data Scientists to align evaluations with development priorities.
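
For illustration, a minimal sketch of the rule-based and metric-based side of such an auto-evaluation pass; the EvalCase structure and the hallucination heuristic are simplified assumptions, and an LLM-as-a-judge step would run alongside these checks.

  import re
  from dataclasses import dataclass

  @dataclass
  class EvalCase:
      prompt: str
      reference: str  # gold answer
      response: str   # model output under evaluation

  def exact_match(case: EvalCase) -> bool:
      # Metric-based check: normalized exact match against the reference.
      return case.response.strip().lower() == case.reference.strip().lower()

  def unsupported_number(case: EvalCase) -> bool:
      # Crude rule-based hallucination heuristic: flag digits in the response
      # that appear in neither the prompt nor the reference.
      allowed = set(re.findall(r"\d+", case.prompt + " " + case.reference))
      return any(n not in allowed for n in re.findall(r"\d+", case.response))

  def run_suite(cases: list[EvalCase]) -> dict:
      # Aggregate per-case checks into suite-level quality metrics.
      total = len(cases)
      return {
          "accuracy": sum(exact_match(c) for c in cases) / total,
          "hallucination_rate": sum(unsupported_number(c) for c in cases) / total,
      }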

Team Leadership & Technical Coaching

  • Mentor and support a team of QA engineers and model evaluators.
  • Allocate tasks, define sprint goals, and ensure timely and high-quality delivery of testing and evaluation artifacts.
  • Foster a culture of test-first thinking, technical quality, and continuous improvement.
  • Communicate evaluation insights and quality reports to product managers and stakeholders.

Required Qualifications

  • Bachelor's or Master's degree in Computer Science, Software Engineering, AI, or a related field.
  • At least 5 years of experience in software testing, including time as a Senior QA Engineer or Test Lead.
  • Strong experience in test case writing, test scenario design, and test automation scripting.
  • Proficiency in languages such as Python, JavaScript, or Java for test automation.
  • Experience with tools such as PyTest, Selenium, JUnit, Playwright, and Postman.
  • Familiarity with LLMs (e.g., DeepSeek, Mistral, LLaMA) and AI evaluation metrics such as BLEU, ROUGE, and accuracy (a simplified ROUGE example follows this list).
  • Experience in building or maintaining benchmark datasets for AI evaluation.
  • Understanding of prompt engineering, response validation, and error case analysis.
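
As a simplified illustration of one metric named above, here is a hand-rolled ROUGE-1 F1 (unigram overlap); real evaluation work would normally use an established library such as rouge-score or sacrebleu.

  from collections import Counter

  def rouge1_f1(reference: str, candidate: str) -> float:
      # ROUGE-1: unigram overlap between reference and candidate, reported as F1.
      ref = Counter(reference.lower().split())
      cand = Counter(candidate.lower().split())
      overlap = sum((ref & cand).values())  # clipped unigram matches
      if overlap == 0:
          return 0.0
      precision = overlap / sum(cand.values())
      recall = overlap / sum(ref.values())
      return 2 * precision * recall / (precision + recall)

  print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))  # ≈ 0.83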

Preferred Skills

  • Experience with LLM evaluation libraries/tools such as OpenAI Evals, TruLens, LangChain Eval, or custom scripts.
  • Experience working with MLOps or AI pipelines and integrating tests within them.
  • Familiarity with dataset labeling platforms or human-in-the-loop evaluation systems.
  • Strong data analysis and reporting skills using Excel, Python (Pandas/Matplotlib), or dashboards (a small Pandas sketch follows this list).
  • Ability to define and customize evaluation logic per customer or business domain.
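
As a sketch of the reporting side, the snippet below pivots invented per-capability evaluation scores into a model-by-model report with Pandas and charts it with Matplotlib; the models, capabilities, and numbers are illustrative only.

  import pandas as pd
  import matplotlib.pyplot as plt

  # Hypothetical per-case scores exported by an evaluation pipeline.
  results = pd.DataFrame({
      "model": ["A", "A", "B", "B"],
      "capability": ["summarization", "reasoning", "summarization", "reasoning"],
      "score": [0.82, 0.61, 0.79, 0.70],
  })

  # Pivot into a model-by-capability report (mean score per cell).
  report = results.pivot_table(index="model", columns="capability", values="score")
  print(report)

  report.plot.bar(title="Mean score by capability")
  plt.tight_layout()
  plt.savefig("eval_report.png")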
