Quality Engineering Strategy: Building Confidence in Our Code

1. Introduction: Quality is Everyone's Responsibility

At XOPS, "Quality Engineering" is not the sole responsibility of a separate QA team; it's a cultural imperative embedded in every stage of our development lifecycle. We aim to build reliable, robust, and performant software. This strategy outlines our comprehensive approach to ensuring quality, from initial development through to production monitoring.

Our philosophy is to shift quality left—detecting and preventing defects as early as possible—and to maintain a high standard of quality throughout the software's life.

Core Mission: To establish and maintain an exceptionally high standard of software quality, ensuring reliability, security, and performance through integrated testing, automation, and continuous feedback loops.


2. Pillars of Our Quality Strategy

Our approach to quality is multi-faceted, encompassing various types of testing and engineering practices.

  • Developer-Led Quality: Developers are the first line of defense for quality. They are responsible for writing clean code, unit tests, and comprehensive integration tests.
  • Automated Testing: A robust suite of automated tests runs on every commit and PR, providing fast feedback.
  • Manual & Exploratory Testing: For complex features and user experience validation, manual and exploratory testing is performed by QA engineers and product teams.
  • Production Monitoring & Feedback: Quality doesn't end at deployment. We continuously monitor production for issues and feed learnings back into the development process.
  • AI-Assisted Review: As configured in our GitHub Actions workflows, AI (Claude) performs initial code and security reviews to catch common issues.

3. Automated Testing Strategy

Automated testing is the backbone of our fast, reliable CI/CD pipeline.

Unit & Integration Testing

  • Tools:
    • Vitest: For JavaScript/TypeScript projects (frontend and backend Node.js services). Vitest is chosen for its speed and modern feature set.
    • Pytest: For Python projects (backend services, data pipelines).
  • Philosophy:
    • Unit Tests: Every function and module should have comprehensive unit tests covering edge cases. Aim for >80% code coverage.
    • Integration Tests: Tests that verify the interaction between components or services (e.g., testing an API endpoint that interacts with the Knowledge Graph). These run against a local or ephemeral test environment.
  • Execution: These tests run as the first step in our GitHub Actions CI pipeline. A PR is blocked from merging if any test fails.
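
As an illustration of the unit-testing philosophy above, here is a minimal Pytest-style sketch. The `slugify` helper and its edge cases are assumptions for the example, not actual XOPS code:

```python
# Hypothetical example: a small helper plus Pytest-style unit tests.
# The function and its edge cases are illustrative only.
import re

def slugify(title: str) -> str:
    """Lowercase a title and replace runs of non-alphanumerics with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def test_slugify_basic():
    assert slugify("Quality Engineering Strategy") == "quality-engineering-strategy"

def test_slugify_edge_cases():
    # Edge cases: empty input, punctuation-only input, repeated separators.
    assert slugify("") == ""
    assert slugify("!!!") == ""
    assert slugify("CI  /  CD") == "ci-cd"
```

Because Pytest discovers plain `test_*` functions with bare `assert` statements, tests like these stay readable and count directly toward the coverage target.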

End-to-End (E2E) Testing

  • Tools:
    • Playwright: For testing our front-end applications (Control Center, Experience Center) from a user's perspective. Playwright allows us to script browser interactions and verify UI states.
  • Strategy:
    • E2E tests are run against the staging environment after a successful deployment from the CI pipeline.
    • These tests focus on critical user journeys and business flows.
    • We aim for a focused, high-value set of E2E tests to keep execution times reasonable.

4. Test Case Management & Orchestration

  • Tool: Qase is our primary platform for managing manual and automated test cases.
  • Usage:
    • All test cases, whether manual or automated, are documented in Qase.
    • For automated tests (Vitest, Playwright), Qase is integrated to report test run results. This provides a single dashboard for all quality metrics.
    • Manual test cases for exploratory testing and user acceptance testing (UAT) are written and executed within Qase.
  • Linkage: Test runs in GitHub Actions are configured to report results back to Qase, linking pipeline runs to specific test cases.
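
The linkage concept can be pictured with a minimal stand-in decorator. The real Qase reporters for Pytest and Playwright provide their own annotations and configuration, so the `qase_case` helper below is purely illustrative, not the actual integration API:

```python
# Illustrative only: a minimal stand-in showing how automated tests can
# be tagged with Qase case IDs. Real Qase reporter plugins provide their
# own decorators; this sketch just demonstrates the linkage idea.

def qase_case(case_id: int):
    """Attach a Qase case ID to a test function for later reporting."""
    def wrap(fn):
        fn.qase_id = case_id
        return fn
    return wrap

@qase_case(42)  # hypothetical case ID
def test_user_can_search_knowledge_graph():
    assert "graph" in "knowledge graph"
```

With IDs attached this way, a CI reporter can map each pipeline run back to the documented test case in Qase.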

5. Quality Engineering in the Lifecycle

Quality is not confined to a single phase; it's integrated throughout.

  • Development: Developers write unit and integration tests using Vitest/Pytest. They also document manual test cases in Qase for complex logic.
  • Code Review: Human reviewers and AI (Claude) perform code reviews, checking for bugs, security issues, and adherence to best practices.
  • CI Pipeline: GitHub Actions automatically runs Vitest/Pytest, static analysis (ruff), security scans (semgrep), and dependency checks (FOSSA).
  • Staging Deployment: Playwright E2E tests run against the staging environment. Chaos engineering experiments are also run to test resilience.
  • Production Monitoring: New Relic, Sentry, and Cerebro monitor for errors, performance degradation, and anomalies. Feedback from production issues is used to create new Jira tickets and improve tests.
  • Incident Post-Mortems: Blameless post-mortems identify systemic quality issues and lead to improvements in our testing strategies and documentation.
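
The CI stage above could be wired together roughly as follows. This is a hedged sketch of a GitHub Actions workflow: the job names, action versions, and exact commands are assumptions rather than our actual configuration:

```yaml
# Illustrative CI workflow only; names and versions are assumptions.
name: ci
on: [pull_request]

jobs:
  test-and-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Unit & integration tests (Python)
        run: pytest
      - name: Unit & integration tests (TypeScript)
        run: npx vitest run
      - name: Static analysis
        run: ruff check .
      - name: Security scan
        run: semgrep scan --config auto
```

Running the fast unit tests first gives the quickest feedback; the slower scans follow in the same pipeline so a PR surfaces all findings in one pass.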

6. Security Testing Integration

Security is a critical aspect of quality.

  • Static Analysis: semgrep and bandit are run in CI to catch common vulnerabilities.
  • AI Security Review: Claude is used for deeper analysis of code for subtle security flaws.
  • Dependency Scanning: FOSSA ensures no vulnerable or non-compliant open-source libraries are introduced.
  • Penetration Testing: Regularly scheduled penetration tests, as detailed in the Platform Operations Guide, provide an external security validation.
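
Beyond curated rulesets, teams can add custom checks to the static-analysis stage. A minimal custom semgrep rule might look like this; the rule ID and message are illustrative, not an existing rule in our repository:

```yaml
# Illustrative custom semgrep rule; id and message are assumptions.
rules:
  - id: python-no-eval
    pattern: eval(...)
    message: Avoid eval(); it enables arbitrary code execution.
    severity: ERROR
    languages: [python]
```

Custom rules like this let us codify lessons from post-mortems and security reviews so the same class of flaw is caught automatically in CI.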

7. Conclusion

By embedding quality engineering practices at every step—from developer-led testing and AI-assisted reviews to robust automated pipelines and continuous production monitoring—we aim to build a platform that is not only functional but also exceptionally reliable, secure, and performant. This holistic approach ensures that quality is a shared responsibility and a fundamental part of our engineering culture.