QA Engineer - Load Testing Specialist (2-month contract)

Monolith AI

Greater London

On-site

GBP 50,000 - 70,000

Full time

Today

Job summary

A technology company in Greater London is hiring an experienced QA Engineer to lead load testing efforts for critical system releases. The role involves designing automated load tests, analyzing system performance, and collaborating with teams to establish measurable performance criteria. Candidates should have 4+ years in QA roles, hands-on experience with distributed systems, strong Python skills, and familiarity with AWS services. Immediate start is available for a focused engagement lasting 3 months.

Qualifications

  • 4+ years of experience in QA/performance testing roles.
  • Experience with load testing distributed systems in AWS environments.
  • Strong proficiency in Python for automation tasks.

Responsibilities

  • Design and implement an automated load testing framework.
  • Monitor system performance and identify bottlenecks.
  • Collaborate on defining measurable non-functional requirements.

Skills

Performance testing
Load testing tools
Python
AWS services
REST API testing

Tools

JMeter
k6
Grafana

Job description

Position Overview

Monolith AI is seeking an experienced QA Engineer to lead load testing efforts for a critical system release focused on improving concurrency and high request load handling.

This fast‑paced, short‑term engagement requires someone who can quickly understand complex distributed systems, design comprehensive load tests, and work collaboratively with a rapidly growing engineering team to ensure our new environment meets performance requirements.

Primary Responsibilities
  • Design and Implement Automated Load Testing Framework
    • Develop comprehensive load tests for FastAPI endpoints, Temporal workflows/activities, and AWS service interactions.
    • Create realistic test scenarios simulating concurrent workflow execution patterns, including graph‑based workflow orchestration.
    • Build automated test suites that measure system behavior under varying concurrency levels and request loads.
  • Performance Analysis and Bottleneck Identification
    • Monitor and analyze system performance across the entire stack (API layer, Temporal workers, AWS services).
    • Identify concurrency limitations in Temporal workflow execution, AWS service limits (Athena, ECS), and inter‑component communication.
    • Document performance characteristics including response times, throughput limits, and failure modes under load.
  • Collaborate on Non‑Functional Requirements (NFR) Definition
    • Work with Customer Success and Product teams to understand business requirements and translate them into measurable performance criteria.
    • Iterate on acceptable concurrency thresholds, latency targets, and throughput requirements.
    • Validate that proposed NFRs are realistic and achievable given architectural constraints.
  • System Documentation and Knowledge Extraction
    • Understand the existing system through code review, discussions with the development team, and exploratory testing.
    • Create clear documentation of test methodologies, results, and recommendations for future testing.
  • Recommendation and Optimization Guidance
    • Provide actionable recommendations for removing identified bottlenecks.
    • Suggest configuration optimizations for Temporal (worker pools, task queues) and AWS services (Athena concurrency, ECS capacity).
  • Rapid Communication and Status Reporting
    • Maintain daily/frequent communication with the Tech Lead regarding project progress, blockers, and findings.
    • Quickly escalate issues that could impact the aggressive timeline.
    • Present findings and recommendations to technical and non‑technical stakeholders.
  • Cross‑Component Integration Testing
    • Test complex scenarios involving graph execution triggering node workflows across multiple system boundaries.
    • Validate S3 read/write operations under concurrent load.
    • Ensure inter‑component communication (API → Temporal, Temporal Activity → API triggers) performs reliably at scale.

Key Performance Indicators
  1. Test Coverage and Execution
    • Complete automated load test suite covering all critical components within the first 3 weeks.
    • Execute baseline and progressive load tests identifying maximum sustainable concurrency levels.
  2. Bottleneck Identification and Impact
    • Identify and document top 5–7 performance bottlenecks with clear impact analysis.
    • Provide actionable remediation recommendations with estimated effort and impact for each bottleneck.
  3. NFR Definition and Validation
    • Collaborate with stakeholders to define measurable NFRs within the first 2 weeks.
    • Validate that the system meets the agreed NFR criteria, or document the gaps, by project end.
  4. Documentation and Knowledge Transfer
    • Deliver comprehensive test documentation, results analysis, and system performance characteristics.
    • Conduct knowledge transfer sessions ensuring the team can maintain and extend the testing framework.
  5. Project Velocity and Communication
    • Meet weekly milestone targets in the fast‑paced 2‑month engagement.
    • Maintain a proactive communication rhythm (daily standups, weekly detailed reports to the Tech Lead).

Required Qualifications

Experience:

  • 4+ years of experience in QA/performance testing roles.
  • 2+ years of hands‑on experience with load testing distributed systems and microservices architectures.
  • Proven experience with load testing tools (e.g., k6, JMeter, Locust, Gatling, Artillery).
  • Experience testing workflow orchestration systems (Temporal, Airflow, Prefect, or similar).
  • Demonstrated ability to test systems integrating with AWS services (particularly Athena, ECS, S3).

Technical Skills:

  • Strong proficiency in Python (required for test automation and working with FastAPI/Temporal).
  • Experience with REST API testing and performance validation.
  • Understanding of distributed systems concepts: concurrency, queueing, backpressure, rate limiting.
  • Familiarity with AWS infrastructure and service limits.
  • Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar).
  • Proficiency with Git and CI/CD pipelines.
  • Ability to read and understand code in order to design effective tests.

Immediate Availability:

  • Ability to start in early January 2025 and commit to a focused 3‑month engagement.
  • Availability for full‑time contract work during the project duration.

Preferred Qualifications
  • Direct experience with Temporal.io (workflows, activities, workers).
  • Experience with containerized workloads and Docker/ECS.
  • Prior work in fast‑paced startup or scale‑up environments.
  • Experience with infrastructure‑as‑code (Terraform, CloudFormation).
  • Background in Site Reliability Engineering (SRE) or DevOps practices.
  • Familiarity with data processing pipelines and analytics systems.
  • Previous contract/consulting experience with rapid knowledge acquisition.
  • Experience with graph‑based workflow systems or DAG execution engines.
  • Knowledge of AWS service limits and optimization strategies.

Essential Soft Skills

Self‑Direction and Initiative:

  • Ability to operate independently in an ambiguous, fast‑moving environment with minimal documentation.
  • Proactive problem‑solving mindset; doesn’t wait for perfect information before taking action.
  • Comfortable making pragmatic decisions quickly in a time‑constrained project.

Communication and Collaboration:

  • Exceptional communication skills for extracting knowledge through conversations with existing team members.
  • Ability to translate technical findings into clear, actionable recommendations for diverse audiences.
  • Comfortable asking clarifying questions and challenging assumptions respectfully.
  • Strong written communication for documentation and status updates.

Adaptability and Learning Agility:

  • Quick learner who can rapidly understand complex, poorly documented systems.
  • Flexible and comfortable with changing priorities in a 15‑person team that’s doubling in size.
  • Thrives in fast‑paced environments with aggressive timelines.

Pragmatism and Results Orientation:

  • Focused on delivering practical, actionable outcomes within tight timeframes.
  • Understands the balance between thoroughness and speed in a 2‑month engagement.
  • Comfortable with "good enough" when perfection isn’t achievable within constraints.

Stakeholder Management:

  • Skilled at managing expectations with technical leadership about realistic timelines and trade‑offs.
  • Diplomatic when delivering difficult news about performance limitations or bottlenecks.
  • Collaborative approach when working with Customer Success and Product on NFR definition.

Key Challenges in This Role
  1. Rapid Knowledge Acquisition with Limited Documentation
    • The existing system lacks comprehensive documentation, requiring you to quickly build understanding through code review, system exploration, and frequent discussions with the development team.
    • Success requires comfort with ambiguity and strong investigative skills.
  2. Aggressive Timeline with High Impact
    • A 3‑month timeline to design tests, execute comprehensive load testing, identify bottlenecks, and deliver actionable recommendations is extremely tight.
    • Must balance thoroughness with pragmatism; prioritize ruthlessly to ensure critical areas are covered.
  3. Complex Distributed System with Multiple Integration Points
    • The system involves multiple layers (FastAPI, Temporal, AWS services) with complex inter‑component communication patterns (graph → node workflows).
    • Must understand the entire stack sufficiently to design realistic, comprehensive load tests that expose real‑world bottlenecks.