Qualytics: How I Built It and How You Can Contribute
An in-depth discussion about Qualytics so that you can contribute to the project
Introduction
Qualytics is an open-source VS Code extension that analyzes TypeScript projects for code quality metrics. This blog post provides a comprehensive technical overview of the extension's architecture, key algorithms, performance considerations, and visualization techniques. Whether you're a curious user or a potential contributor, this guide will give you a deep understanding of Qualytics' internals and hopefully inspire you to contribute to the project on GitHub.
1. Extension Architecture
Qualytics is built using TypeScript and leverages the VS Code Extension API. Let's break down the main components and their interactions:
1.1 Entry Point: extension.ts
The extension.ts file serves as the entry point for the extension. It registers the command qualytics.showMetrics and defines the activate function, which is called when the extension is activated.
The activate function sets up the command and drives the main flow of the extension:
- Analyze the workspace
- Calculate metrics
- Visualize the results
1.2 File Utilities: file-utils.ts
The file-utils.ts module provides functions for traversing the workspace and identifying TypeScript files. Its traversal function recursively walks directories, skips those excluded in the configuration, and collects all TypeScript files.
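The recursive collection described here can be sketched roughly as follows. This is not the code from file-utils.ts: to keep the example runnable anywhere, it walks a hypothetical in-memory DirTree instead of the real file system, and the names collectTypeScriptFiles and excludedDirs are assumptions. Treating both .ts and .tsx as TypeScript files is also an assumption.

```typescript
// Hypothetical in-memory stand-in for a directory: a string value is a file's
// content, a nested object is a subdirectory.
interface DirTree {
  [name: string]: string | DirTree;
}

function collectTypeScriptFiles(
  tree: DirTree,
  excludedDirs: ReadonlySet<string>,
  prefix = ""
): string[] {
  const files: string[] = [];
  for (const [name, entry] of Object.entries(tree)) {
    const fullPath = prefix ? `${prefix}/${name}` : name;
    if (typeof entry === "string") {
      // Assumption: both .ts and .tsx count as TypeScript files.
      if (/\.tsx?$/.test(name)) {
        files.push(fullPath);
      }
    } else if (!excludedDirs.has(name)) {
      // Recurse only into directories that are not excluded.
      files.push(...collectTypeScriptFiles(entry, excludedDirs, fullPath));
    }
  }
  return files;
}

const workspace: DirTree = {
  src: { "extension.ts": "", "metrics.ts": "" },
  node_modules: { lib: { "index.ts": "" } },
  "README.md": "",
};
const found = collectTypeScriptFiles(workspace, new Set(["node_modules", "out"]));
```

With this sample tree, only the two files under src are collected, because node_modules is excluded before the recursion ever enters it.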
1.3 Metrics Calculation: metrics.ts
The metrics.ts file contains the core logic for calculating various code quality metrics. It uses the @typescript-eslint/typescript-estree parser to generate an Abstract Syntax Tree (AST) for each TypeScript file.
1.4 AST Utilities: ast-utils.ts
The ast-utils.ts module provides utility functions for working with the AST. Its traverseAST function is crucial: it walks through the AST so that the various metric calculations can be applied to each node.
1.5 Visualization: webview.ts
The webview.ts file handles the creation of a webview that displays the calculated metrics using charts. It creates a new webview panel and populates it with HTML content that includes the metrics data and references to the necessary scripts and styles.
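As a rough illustration of the HTML-population step, the sketch below builds the page as a plain string. It is not the code from webview.ts: the function name buildWebviewHtml, the global window.metricsData, and the canvas id are all hypothetical. In a real extension, the script and style URIs would typically come from webview.asWebviewUri, and a Content-Security-Policy would normally be added as well.

```typescript
function buildWebviewHtml(
  metrics: object[],
  scriptUri: string,
  styleUri: string
): string {
  // Escape "<" so that metric data (for example, a file name) cannot close
  // the surrounding <script> tag early.
  const json = JSON.stringify(metrics).replace(/</g, "\\u003c");
  return `<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <link rel="stylesheet" href="${styleUri}">
</head>
<body>
  <canvas id="complexityChart"></canvas>
  <script>window.metricsData = ${json};</script>
  <script src="${scriptUri}"></script>
</body>
</html>`;
}

const html = buildWebviewHtml(
  [{ file: "</script>", complexity: 1 }],
  "main.js",
  "style.css"
);
```

Embedding the data as JSON in an inline script is only one option; posting it to the webview with a message after load is another common design.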
2. Key Algorithms and Metric Calculations: A Line-by-Line Analysis
Let's dive deep into the core algorithms of Qualytics, examining the most significant lines of code and explaining their purpose and impact.
2.1 Parsing the TypeScript AST
- import { TSESTree, parse, AST_NODE_TYPES } from "@typescript-eslint/typescript-estree";
  This line imports specific utilities from the TypeScript ESTree parser. TSESTree provides type definitions for the AST nodes, parse is the function used to generate the AST, and AST_NODE_TYPES is an enum of all possible node types.
- ast = parse(code, { ... });
  This line parses the TypeScript code into an AST. The options passed to parse are crucial:
  - loc: true
    Adds location information to each node, which is essential for mapping metrics back to specific lines of code.
  - range: true
    Adds start and end character indices to each node, useful for precise code manipulation if needed.
  - comment: true
    Includes comments in the AST, allowing for potential comment analysis in future iterations.
  - tokens: true
    Includes a list of tokens in the AST, which could be used for more detailed analysis in the future.
  - sourceType: "module"
    Treats the code as an ECMAScript module, allowing for import/export statements.
  - ecmaFeatures: { jsx: true }
    Enables parsing of JSX syntax, crucial for React projects.
2.2 Traversing the AST
- enter(node);
  This line calls the enter function for the current node before processing its children. This allows for pre-order traversal operations.
- for (const key in node) { ... }
  This loop iterates over all enumerable properties of the node. Because for...in also visits properties inherited through the prototype chain, it is paired with the ownership check in the next line.
- if (Object.prototype.hasOwnProperty.call(node, key)) { ... }
  This check ensures we're only processing own properties of the node, not inherited ones from the prototype chain.
- if (Array.isArray(child)) { ... } else if (isASTNode(child)) { ... }
  This conditional handles two cases:
  - If the child is an array (e.g., for statement bodies), it recursively traverses each item in the array.
  - If the child is a single AST node, it recursively traverses that node.
- if (leave) { leave(node); }
  This line calls the leave function after processing all of a node's children, allowing for post-order traversal operations.
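Put together, the lines discussed in this section suggest a traversal along the following lines. This is a minimal sketch rather than the actual ast-utils.ts source: AstNode is a hand-written stand-in for the parser's node types, and the body of isASTNode is an assumption (any object with a string type field).

```typescript
// Minimal stand-in for an AST node: any object with a string `type` field.
interface AstNode {
  type: string;
  [key: string]: unknown;
}

function isASTNode(value: unknown): value is AstNode {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { type?: unknown }).type === "string"
  );
}

function traverseAST(
  node: AstNode,
  enter: (node: AstNode) => void,
  leave?: (node: AstNode) => void
): void {
  enter(node); // pre-order hook
  for (const key in node) {
    if (!Object.prototype.hasOwnProperty.call(node, key)) continue;
    const child = node[key];
    if (Array.isArray(child)) {
      for (const item of child) {
        if (isASTNode(item)) traverseAST(item, enter, leave);
      }
    } else if (isASTNode(child)) {
      traverseAST(child, enter, leave);
    }
  }
  if (leave) leave(node); // post-order hook
}

// Hand-built tree for: if (x) {}
const tree: AstNode = {
  type: "Program",
  body: [
    {
      type: "IfStatement",
      test: { type: "Identifier", name: "x" },
      consequent: { type: "BlockStatement", body: [] },
    },
  ],
};
const entered: string[] = [];
const left: string[] = [];
traverseAST(tree, (n) => entered.push(n.type), (n) => left.push(n.type));
```

In this example, entered records parents before children (Program first), while left records children before parents (Program last).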
2.3 Calculating Cyclomatic Complexity
- let complexity = 0;
  Initializes the complexity counter. We start at 0 and increment it for each decision point in the code.
- traverseAST(ast, (node) => { ... });
  Uses our custom AST traversal function to visit each node in the tree.
- switch (node.type) { ... }
  This switch statement is the core of the complexity calculation. It checks for specific node types that represent decision points or branching in the code.
- complexity++;
  This line increments the complexity for each decision point found. It's called for control flow statements, conditionals, and certain logical expressions.
- if ((node as TSESTree.SwitchCase).test !== null) { complexity++; }
  This line is specifically for switch cases. It only increments complexity for cases with a test condition (i.e., not the default case).
- if (["&&", "||", "??"].includes((node as TSESTree.LogicalExpression).operator)) { complexity++; }
  This line handles short-circuit logical expressions. These are considered decision points because they can alter the flow of execution.
- return complexity + 1;
  The final complexity is the count of decision points plus one. This "+1" represents the single entry point to the code, ensuring that even linear code has a complexity of at least 1.
This function approximates the McCabe Cyclomatic Complexity, which quantifies the number of linearly independent paths through a program's source code. Higher complexity indicates more complex control flow and potentially harder-to-maintain code. Note that we do not build a control-flow graph and apply the McCabe formula here; we approximate the value by counting decision points.
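To make the counting concrete, here is a small self-contained sketch that applies the same rules to a hand-built AST. It is not the metrics.ts implementation: the walk helper stands in for traverseAST, and the exact set of node types in BRANCHING_TYPES is an assumption, since the text only says "control flow statements, conditionals, and certain logical expressions".

```typescript
interface AstNode {
  type: string;
  [key: string]: unknown;
}

// Compact pre-order walk over every nested node (stand-in for traverseAST).
function walk(value: unknown, visit: (node: AstNode) => void): void {
  if (Array.isArray(value)) {
    value.forEach((item) => walk(item, visit));
  } else if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    if (typeof record.type === "string") visit(record as AstNode);
    Object.keys(record).forEach((key) => walk(record[key], visit));
  }
}

// Assumption: which statement types count as decision points.
const BRANCHING_TYPES = new Set([
  "IfStatement",
  "ForStatement",
  "ForInStatement",
  "ForOfStatement",
  "WhileStatement",
  "DoWhileStatement",
  "CatchClause",
  "ConditionalExpression",
]);

function calculateCyclomaticComplexity(ast: AstNode): number {
  let complexity = 0;
  walk(ast, (node) => {
    switch (node.type) {
      case "SwitchCase":
        // Only cases with a test count; `default` has test === null.
        if (node.test !== null && node.test !== undefined) complexity++;
        break;
      case "LogicalExpression":
        if (["&&", "||", "??"].includes(String(node.operator))) complexity++;
        break;
      default:
        if (BRANCHING_TYPES.has(node.type)) complexity++;
    }
  });
  return complexity + 1;
}

// Hand-built tree for: if (a && b) {} else { while (c) {} }
const sampleAst: AstNode = {
  type: "Program",
  body: [
    {
      type: "IfStatement",
      test: {
        type: "LogicalExpression",
        operator: "&&",
        left: { type: "Identifier", name: "a" },
        right: { type: "Identifier", name: "b" },
      },
      consequent: { type: "BlockStatement", body: [] },
      alternate: {
        type: "BlockStatement",
        body: [
          {
            type: "WhileStatement",
            test: { type: "Identifier", name: "c" },
            body: { type: "BlockStatement", body: [] },
          },
        ],
      },
    },
  ],
};
const sampleComplexity = calculateCyclomaticComplexity(sampleAst);
```

The sample contains three decision points (the if, the && and the while), so sampleComplexity is 4, while an empty program yields 1.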
2.4 Calculating Halstead Metrics
- const operators = new Set<string>(); and const operands = new Set<string>();
  These lines initialize Sets to store unique operators and operands. Using Sets ensures we only count each unique operator/operand once, which is crucial for Halstead metrics.
- let operatorCount = 0; and let operandCount = 0;
  These variables keep track of the total number of operators and operands, including duplicates.
- The switch statement categorizes different AST node types:
  - For expressions (binary, logical, assignment, etc.), it adds the operator to the operators Set and increments operatorCount.
  - For identifiers and literals, it adds them to the operands Set and increments operandCount.
  - For function calls, it treats the function name as an operator.
  - For member expressions (e.g., object.property), it treats the property as an operand.
  - The ternary operator (?:) and new keyword are treated as operators.
- const n1 = operators.size; and const n2 = operands.size;
  These lines get the count of unique operators and operands.
- const N1 = operatorCount; and const N2 = operandCount;
  These represent the total count of operators and operands.
- const vocabulary = n1 + n2;
  This calculates the program vocabulary, which is the sum of unique operators and operands.
- const length = N1 + N2;
  This calculates the program length, which is the sum of total operators and operands.
- const volume = vocabulary > 0 ? length * Math.log2(vocabulary) : 0;
  This calculates the Halstead Volume metric. It represents the information content of the program. The conditional check avoids evaluating Math.log2(0), which is -Infinity and would turn the product into NaN for an empty file (where the length is also 0).
The Halstead Volume metric provides a measure of the program's size in terms of its operators and operands. It's useful for estimating the amount of information a reader might need to understand the code fully.
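The arithmetic can be tried in isolation with a small helper. In this sketch the operators and operands are passed in as pre-extracted token lists rather than collected from an AST; the function name halstead and its signature are hypothetical, while the formulas are the ones quoted above.

```typescript
interface HalsteadMetrics {
  vocabulary: number;
  length: number;
  volume: number;
}

function halstead(operatorTokens: string[], operandTokens: string[]): HalsteadMetrics {
  const n1 = new Set(operatorTokens).size; // unique operators
  const n2 = new Set(operandTokens).size; // unique operands
  const N1 = operatorTokens.length; // total operators
  const N2 = operandTokens.length; // total operands
  const vocabulary = n1 + n2;
  const length = N1 + N2;
  const volume = vocabulary > 0 ? length * Math.log2(vocabulary) : 0;
  return { vocabulary, length, volume };
}

// Tokens for: a = a + b; b = a + b;
const halsteadSample = halstead(
  ["=", "+", "=", "+"],
  ["a", "a", "b", "b", "a", "b"]
);
```

For this sample, n1 = 2, n2 = 2, N1 = 4 and N2 = 6, so the vocabulary is 4, the length is 10, and the volume is 10 * log2(4) = 20.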
2.5 Calculating Maintainability Index
- const volumeLog = volume > 0 ? Math.log(volume) : 0;
  This line calculates the natural logarithm of the Halstead Volume. The conditional check avoids Math.log(0), which would result in -Infinity. If the volume is 0 or negative (which shouldn't happen in practice), it sets volumeLog to 0.
- const locLog = linesOfCode > 0 ? Math.log(linesOfCode) : 0;
  Similarly, this calculates the natural logarithm of the Lines of Code, with the same safeguard against non-positive values.
- const mi = 171 - 5.2 * volumeLog - 0.23 * complexity - 16.2 * locLog;
  This is the core of the Maintainability Index calculation. The formula is based on empirical studies and combines three metrics:
  - Halstead Volume (represented by volumeLog)
  - Cyclomatic Complexity
  - Lines of Code (represented by locLog)
  The coefficients (171, 5.2, 0.23, 16.2) are derived from these studies and calibrate the impact of each metric on the final score.
- return Math.max(0, (mi * 100) / 171);
  This line normalizes the Maintainability Index to a 0-100 scale. The Math.max(0, ...) ensures the result is never negative. Dividing by 171 and multiplying by 100 converts the raw score to a percentage, where higher values indicate better maintainability.
This function effectively combines multiple metrics into a single score, providing a holistic view of code maintainability. It's particularly useful for comparing the relative maintainability of different modules or tracking how maintainability changes over time.
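Assembled into one function, the lines quoted in this section look roughly like this. The statements inside the body are the ones discussed above; the function name and parameter order are assumptions.

```typescript
function calculateMaintainabilityIndex(
  volume: number,
  complexity: number,
  linesOfCode: number
): number {
  // Guard against Math.log(0) === -Infinity for empty files.
  const volumeLog = volume > 0 ? Math.log(volume) : 0;
  const locLog = linesOfCode > 0 ? Math.log(linesOfCode) : 0;
  const mi = 171 - 5.2 * volumeLog - 0.23 * complexity - 16.2 * locLog;
  // Rescale from the raw 0-171 range to 0-100 and clamp negatives to 0.
  return Math.max(0, (mi * 100) / 171);
}

// A small file: Halstead volume 100, complexity 3, 20 lines of code.
const smallFileMi = calculateMaintainabilityIndex(100, 3, 20);
// A very large, complex file ends up clamped at 0.
const hugeFileMi = calculateMaintainabilityIndex(Math.exp(20), 50, Math.exp(10));
```

The small file scores a little above 57, an empty file scores 100, and the deliberately extreme second example is clamped to 0. Because the logarithms flatten quickly, file length and volume dominate the score far more than cyclomatic complexity does.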
2.6 Analyzing Class Structure
- let classCount = 0;
  This variable keeps track of the total number of classes encountered.
- if (node.type === AST_NODE_TYPES.ClassDeclaration && node.id) { ... }
  This condition checks if the current node is a class declaration and has an identifier (name). This ensures we're only processing named classes.
- let depth = 1;
  Initially sets the inheritance depth to 1 for each class, assuming it doesn't inherit from any other class.
- if (node.superClass && node.superClass.type === AST_NODE_TYPES.Identifier) { ... }
  This checks if the class has a superclass (i.e., it extends another class) and if that superclass is identified by a simple identifier (not a computed expression).
- depth = (inheritanceMap.get(superClassName) || 1) + 1;
  Calculates the inheritance depth of the current class. It looks up the depth of the superclass in the inheritanceMap and adds 1 to it. If the superclass isn't in the map (which could happen if classes are defined out of order), it assumes a depth of 1 for the superclass.
- inheritanceMap.set(node.id.name, depth);
  Stores the calculated depth for the current class in the inheritanceMap.
- const maxInheritanceDepth = Math.max(...inheritanceMap.values(), 0);
  After processing all classes, this line finds the maximum inheritance depth. The ... spread operator is used to pass all values in the map to Math.max(). The 0 is included to handle the case where no classes were found (Math.max() with no arguments returns -Infinity).
- return { classCount, maxInheritanceDepth };
  Returns an object with the total number of classes and the maximum inheritance depth.
This function provides insights into the class structure of the codebase, which can be useful for identifying overly complex inheritance hierarchies or assessing the overall object-oriented design of the project.
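The depth bookkeeping can be illustrated without an AST at all. In the sketch below, classes are given as a flat list in declaration order; the ClassInfo shape and the function name analyzeClasses are hypothetical, and Array.from is used in place of the spread shown above purely to avoid depending on compiler iteration settings.

```typescript
interface ClassInfo {
  name: string;
  superClassName?: string;
}

function analyzeClasses(
  classes: ClassInfo[]
): { classCount: number; maxInheritanceDepth: number } {
  const inheritanceMap = new Map<string, number>();
  let classCount = 0;
  for (const cls of classes) {
    classCount++;
    let depth = 1;
    if (cls.superClassName !== undefined) {
      // Unknown superclasses (not seen yet) are assumed to have depth 1.
      depth = (inheritanceMap.get(cls.superClassName) || 1) + 1;
    }
    inheritanceMap.set(cls.name, depth);
  }
  // The leading 0 covers the case where no classes were found.
  const maxInheritanceDepth = Math.max(0, ...Array.from(inheritanceMap.values()));
  return { classCount, maxInheritanceDepth };
}

const inOrder = analyzeClasses([
  { name: "Base" },
  { name: "Middle", superClassName: "Base" },
  { name: "Leaf", superClassName: "Middle" },
]);
const outOfOrder = analyzeClasses([
  { name: "Leaf", superClassName: "Middle" },
  { name: "Middle", superClassName: "Base" },
  { name: "Base" },
]);
```

Here inOrder reports a maximum depth of 3, while outOfOrder reports only 2 for the same hierarchy, because each superclass is still unknown when its subclass is visited. A single-pass approach like this one therefore underestimates depth when subclasses appear before their parents.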
2.7 Analyzing Function Structure
- let functionCount = 0;
  Initializes a counter for the total number of functions encountered.
- traverseAST(ast, (node) => { ... });
  Uses the custom AST traversal function to visit each node in the tree.
- The condition in the if statement checks for four different types of function-like constructs:
  - AST_NODE_TYPES.FunctionDeclaration: Standalone function declarations (e.g., function foo() { ... })
  - AST_NODE_TYPES.FunctionExpression: Function expressions (e.g., const foo = function() { ... })
  - AST_NODE_TYPES.ArrowFunctionExpression: Arrow functions (e.g., const foo = () => { ... })
  - AST_NODE_TYPES.MethodDefinition: Methods in classes (e.g., class Foo { bar() { ... } })
- functionCount++;
  Increments the function count each time one of these function types is encountered.
- return { functionCount };
  Returns an object with the total count of functions.
This function provides a simple but useful metric for understanding the structure of the codebase in terms of how many functions it contains. A high number of functions might indicate a well-modularized codebase, but could also potentially signal over-engineering if the number is excessively high relative to the project's size and complexity.
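One subtlety is worth seeing in a runnable form. In the ESTree format, a MethodDefinition node holds its body in a value property that is itself a FunctionExpression, so a traversal that visits every node and counts all four types may count each class method twice. Whether that applies to Qualytics depends on implementation details not shown here; the sketch below uses a hand-built AST, a stand-in walk helper, and a hypothetical dedupeMethods flag to show both behaviours.

```typescript
interface AstNode {
  type: string;
  [key: string]: unknown;
}

// Compact pre-order walk over every nested node (stand-in for traverseAST).
function walk(value: unknown, visit: (node: AstNode) => void): void {
  if (Array.isArray(value)) {
    value.forEach((item) => walk(item, visit));
  } else if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    if (typeof record.type === "string") visit(record as AstNode);
    Object.keys(record).forEach((key) => walk(record[key], visit));
  }
}

const FUNCTION_TYPES = new Set([
  "FunctionDeclaration",
  "FunctionExpression",
  "ArrowFunctionExpression",
  "MethodDefinition",
]);

function countFunctions(ast: AstNode, dedupeMethods: boolean): number {
  const methodBodies = new Set<unknown>();
  let functionCount = 0;
  walk(ast, (node) => {
    if (!FUNCTION_TYPES.has(node.type)) return;
    // Remember the FunctionExpression that belongs to this method.
    if (node.type === "MethodDefinition") methodBodies.add(node.value);
    if (dedupeMethods && methodBodies.has(node)) return;
    functionCount++;
  });
  return functionCount;
}

// Hand-built tree for: class Foo { bar() {} }  function baz() {}
const functionSample: AstNode = {
  type: "Program",
  body: [
    {
      type: "ClassDeclaration",
      id: { type: "Identifier", name: "Foo" },
      body: {
        type: "ClassBody",
        body: [
          {
            type: "MethodDefinition",
            key: { type: "Identifier", name: "bar" },
            value: { type: "FunctionExpression", params: [], body: { type: "BlockStatement", body: [] } },
          },
        ],
      },
    },
    { type: "FunctionDeclaration", id: { type: "Identifier", name: "baz" }, params: [] },
  ],
};
const naiveCount = countFunctions(functionSample, false);
const dedupedCount = countFunctions(functionSample, true);
```

The source contains two functions; naiveCount is 3 and dedupedCount is 2.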
By analyzing both class and function structures, Qualytics provides developers with insights into the overall architecture of their codebase. This can be particularly useful for identifying areas that might benefit from refactoring or for understanding the general coding style used in the project.
3. Performance Considerations: A Closer Look
Let's examine how Qualytics handles performance for large projects. The analysis routine applies several performance optimizations:
- Progress Reporting: It uses VS Code's progress API to show a progress bar, keeping the user informed during long-running analyses.
- Batch Processing: Files are processed in batches (default size of 10) to balance between parallelism and system resource usage.
- Asynchronous File Reading: Files are read asynchronously to prevent blocking the main thread.
- Parallel Processing: Within each batch, files are processed concurrently using Promise.all.
- Error Handling: Errors in processing individual files don't stop the entire analysis. They're logged to an output channel for later review.
These optimizations allow Qualytics to handle large projects efficiently, processing files in parallel while avoiding overwhelming the system or becoming unresponsive.
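The batching pattern itself is independent of VS Code and can be sketched in a few lines. This is an illustration under assumptions, not the extension's code: processInBatches, the analyze callback, and the onProgress hook are hypothetical names, and collecting errors in an array stands in for writing them to an output channel.

```typescript
type Outcome<R> =
  | { ok: true; value: R }
  | { ok: false; file: string; message: string };

async function processInBatches<R>(
  files: string[],
  batchSize: number,
  analyze: (file: string) => Promise<R>,
  onProgress?: (done: number, total: number) => void
): Promise<{ results: R[]; errors: { file: string; message: string }[] }> {
  const size = Math.max(1, Math.floor(batchSize)); // guard against a zero or negative step
  const results: R[] = [];
  const errors: { file: string; message: string }[] = [];
  for (let i = 0; i < files.length; i += size) {
    const batch = files.slice(i, i + size);
    // Files within a batch run concurrently; a failure is captured per file
    // so that one bad file does not reject the whole Promise.all.
    const outcomes = await Promise.all(
      batch.map(async (file): Promise<Outcome<R>> => {
        try {
          return { ok: true, value: await analyze(file) };
        } catch (err) {
          const message = err instanceof Error ? err.message : String(err);
          return { ok: false, file, message };
        }
      })
    );
    for (const outcome of outcomes) {
      if (outcome.ok) results.push(outcome.value);
      else errors.push({ file: outcome.file, message: outcome.message });
    }
    if (onProgress) onProgress(Math.min(i + size, files.length), files.length);
  }
  return { results, errors };
}

const progressLog: number[] = [];
const batchDemo = processInBatches(
  ["a.ts", "broken.ts", "c.ts"],
  2,
  async (file) => {
    if (file === "broken.ts") throw new Error("parse error");
    return file.length;
  },
  (done) => progressLog.push(done)
);
```

In this demo, the broken file lands in errors while the other two files still produce results. Note that Promise.all gives concurrency for I/O-bound work such as file reads; CPU-bound parsing in a single JavaScript thread still runs one file at a time.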
4. Theory Behind Code Quality Metrics
Qualytics uses several well-established code quality metrics:
- Cyclomatic Complexity: Measures the number of linearly independent paths through a program's source code. Higher complexity indicates more difficult-to-maintain code.
- Halstead Metrics: Based on the number of operators and operands in the code. They provide various sub-metrics like program volume, difficulty, and effort.
- Maintainability Index: A composite metric that aims to give an overall score for how maintainable the code is. It combines other metrics like cyclomatic complexity, Halstead volume, and lines of code.
- Lines of Code: While simple, it's still a useful metric for understanding the size and potential complexity of a codebase.
- Depth of Inheritance: Measures the maximum depth of the inheritance tree. Deep inheritance hierarchies can make code harder to understand and maintain.
- Class and Method Counts: These provide insight into the structure and organization of the code.
Each of these metrics provides a different perspective on code quality, and when used together, they can give a comprehensive view of a codebase's health.
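For reference, the standard definitions behind these metrics can be written compactly. The symbols follow section 2 where possible (n1, n2, N1, N2 for Halstead counts); E_G, N_G and P denote the edges, nodes and connected components of the control-flow graph, and d denotes the number of decision points counted in section 2.3. The last line is the normalized form used in section 2.5.

```latex
\begin{align*}
M &= E_G - N_G + 2P
  && \text{(McCabe cyclomatic complexity)} \\
M &\approx d + 1
  && \text{(decision-point counting, as in section 2.3)} \\
n &= n_1 + n_2, \quad N = N_1 + N_2
  && \text{(Halstead vocabulary and length)} \\
V &= N \log_2 n
  && \text{(Halstead volume)} \\
D &= \frac{n_1}{2} \cdot \frac{N_2}{n_2}, \quad E = D \cdot V
  && \text{(Halstead difficulty and effort)} \\
\mathrm{MI} &= \max\!\left(0,\ \frac{100}{171}
  \left(171 - 5.2 \ln V - 0.23\, M - 16.2 \ln \mathrm{LOC}\right)\right)
  && \text{(normalized Maintainability Index)}
\end{align*}
```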
5. Visualization Techniques
Qualytics uses Chart.js to create interactive visualizations of the calculated metrics. The main visualization logic is in the media/main.js file, which renders:
- A bar chart of cyclomatic complexity per file
- A bar chart of maintainability index per file
- A scatter plot of lines of code vs. cyclomatic complexity
The visualizations are interactive, allowing users to hover over data points for more information and zoom in/out of the charts.
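As a sketch of how per-file metrics might be turned into chart definitions, the helpers below build plain Chart.js configuration objects for a bar chart and a scatter plot. They are not taken from media/main.js: the FileMetrics field names and both function names are hypothetical. The resulting objects follow the usual Chart.js shape (type, data, options) and would be passed to new Chart(canvas, config) in the webview.

```typescript
// Hypothetical shape of the per-file metrics sent to the webview.
interface FileMetrics {
  file: string;
  complexity: number;
  maintainabilityIndex: number;
  linesOfCode: number;
}

function buildBarConfig(
  metrics: FileMetrics[],
  label: string,
  pick: (m: FileMetrics) => number
) {
  return {
    type: "bar" as const,
    data: {
      labels: metrics.map((m) => m.file),
      datasets: [{ label, data: metrics.map(pick) }],
    },
    options: { responsive: true },
  };
}

function buildScatterConfig(metrics: FileMetrics[]) {
  return {
    type: "scatter" as const,
    data: {
      datasets: [
        {
          label: "Lines of code vs. cyclomatic complexity",
          // Scatter charts take {x, y} points instead of labels.
          data: metrics.map((m) => ({ x: m.linesOfCode, y: m.complexity })),
        },
      ],
    },
    options: { responsive: true },
  };
}

const chartMetrics: FileMetrics[] = [
  { file: "extension.ts", complexity: 4, maintainabilityIndex: 71, linesOfCode: 60 },
  { file: "metrics.ts", complexity: 18, maintainabilityIndex: 48, linesOfCode: 240 },
];
const complexityChart = buildBarConfig(chartMetrics, "Cyclomatic complexity", (m) => m.complexity);
const scatterChart = buildScatterConfig(chartMetrics);
```

The sample numbers are made up for illustration. Keeping configuration building separate from rendering makes the chart logic easy to unit test without a browser.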
6. Setting Up the Development Environment
To contribute to Qualytics, follow these steps:
- Clone the repository:
  git clone https://github.com/aritra741/Qualytics
- Install dependencies:
  cd Qualytics
  npm install
- Open the project in VS Code:
  code .
- Build the project:
  npm run compile
- Run the extension in debug mode:
  - Press F5 or select "Run and Debug" from the sidebar
  - Choose "Run Extension" from the dropdown
- Try the extension:
  - Open the folder you want to analyze in VS Code.
  - Press cmd+shift+P (macOS) or ctrl+shift+P (Windows or Linux) and select “Qualytics: Show Code Metric”. It will analyze the whole folder and show you a table and charts.
Project Structure
- src/: Contains the TypeScript source files
- out/: Contains the compiled JavaScript files (generated after building)
- media/: Contains static assets for the webview
Key Files for Contributors
- src/extension.ts: The main entry point of the extension
- src/metrics.ts: Contains the core logic for calculating metrics
- src/ast-utils.ts: Utilities for working with the Abstract Syntax Tree
- src/webview.ts: Handles the creation and population of the webview
- media/main.js: Contains the visualization logic using Chart.js
7. Potential Features and Enhancements
Here are some ideas for features and enhancements that contributors could work on:
-
Trend Analysis: Implement a feature to track and visualize how code metrics change over time. This could involve storing historical data and creating line charts to show metric trends.
-
Custom Metric Thresholds: Allow users to set custom thresholds for each metric and highlight files that exceed these thresholds.
-
Integration with Version Control: Analyze metrics for changed files in the current Git branch or pull request.
-
Code Smell Detection: Implement algorithms to detect common code smells based on the calculated metrics and AST analysis.
-
Metric Explanations: Provide detailed explanations and suggestions for improvement for each metric.
-
Export Functionality: Allow users to export the metrics and visualizations as PDF reports or CSV files.
-
Support for More Languages: Extend the analysis to support other popular languages like JavaScript, Python, or Java.
8. Contributing Guidelines
When contributing to Qualytics, please follow these guidelines:
- Code Style: Follow the existing code style. The project uses TypeScript's strict mode and ESLint for linting.
- Documentation: Add JSDoc comments to new functions and update the README.md file if you're adding new features or changing existing ones.
- Pull Requests: Fork the repo, create a new branch for your feature or bug fix, and submit a pull request with a clear description of your changes.
- Performance: Keep performance in mind, especially when working with large codebases. Use asynchronous operations where appropriate and consider the impact of your changes on analysis time.
- Compatibility: Ensure your changes work within a VS Code extension and remain compatible with the VS Code Extension API.
Conclusion
Qualytics is a powerful tool for visualizing code quality metrics in TypeScript projects, and there's great potential for expanding its capabilities. By understanding its architecture and inner workings, you're now equipped to contribute to its development.
Whether you're improving existing features, adding new metrics, or enhancing the visualization capabilities, your contributions can help developers worldwide better understand and improve their code quality.
I look forward to your contributions and ideas to make Qualytics even better. Happy coding!