In this lab, you will explore Java’s multithreading capabilities by implementing concurrent word counting operations. You’ll compare the performance of single-threaded and multi-threaded approaches for processing large text files.
Introduction
A developer is working for a company that processes large volumes of text data daily, such as analyzing customer reviews, indexing books, or processing legal documents. As the data grows, the need for faster and more efficient processing becomes critical. In this lab, you will step into the role of this developer tasked with optimizing a word counting application.
You will start by implementing a simple single-threaded solution to count words in text files. Then, you will explore the power of multithreading to divide the workload across multiple threads, enabling faster processing by utilizing modern multi-core processors. Finally, you will compare the performance of both approaches, analyzing metrics such as execution time, CPU utilization, and memory usage. This hands-on activity will give you practical experience with Java’s threading capabilities and help you understand how concurrency can improve application performance.
Prerequisites
Java Development Kit (JDK) 17 or later
Your preferred Java IDE
Git for version control
Getting Started
If your instructor is using GitHub Classroom, accept the assignment using the link at the bottom of this page, clone your auto-generated repository, and import it as a project into your IDE.
Task 1: Single-Threaded Word Counter
Implement the single-threaded word counter in the SingleThreadedWordCounter class by completing the skeleton below:
import java.io.*;

public class SingleThreadedWordCounter {

    /**
     * Counts the number of words in the given file using a single thread
     * @param filePath Path to the file to count words in
     * @return the number of words in the file
     */
    public static long countWords(String filePath) throws IOException, IllegalArgumentException {
        if (filePath == null || filePath.isEmpty()) {
            throw new IllegalArgumentException("File path cannot be null or empty");
        }
        long totalWords = 0;
        // TODO: read the file and add up the words found in it
        return totalWords;
    }

    /**
     * Counts words in multiple files using a single thread (sequentially)
     * @param filePaths Array of file paths to count words in
     * @return Array of word counts corresponding to each file
     */
    public static long[] countWordsInFiles(String[] filePaths) throws IOException, IllegalArgumentException {
        if (filePaths == null || filePaths.length == 0) {
            throw new IllegalArgumentException("File path cannot be null or empty");
        }
        // One slot per file; the array must be allocated before it is returned
        long[] wordCounts = new long[filePaths.length];
        // TODO: count the words in each file, one after another
        return wordCounts;
    }

    /**
     * Helper method that counts words in a given text
     */
    private static long countWordsInText(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        // TODO: count the words in the text
        return 0;
    }
}
Run the unit test at src/test/java/cpit305/fcit/kau/edu/sa/SingleThreadedWordCounterTest.java to verify your implementation.
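As a starting point for Task 1, here is a minimal sketch of one way to count words. It assumes that a "word" is any run of non-whitespace characters and that the files are UTF-8 encoded; the lab's unit tests may define a word differently, so check them first. The class name WordCountSketch and the Path-based signature are illustrative and are not part of the lab skeleton.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the lab repository.
class WordCountSketch {

    // Counts whitespace-separated tokens in a piece of text.
    // Assumption: a word is any run of non-whitespace characters.
    static long countWordsInText(String text) {
        if (text == null) {
            return 0;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        // "\\s+" treats a run of spaces, tabs, or newlines as one separator
        return trimmed.split("\\s+").length;
    }

    // Reads the file line by line and sums the per-line counts,
    // so the whole file never has to sit in memory at once.
    static long countWords(Path file) throws IOException {
        long total = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                total += countWordsInText(line);
            }
        }
        return total;
    }
}
```

Counting line by line gives the same result as counting the whole text at once, because a line break is itself whitespace and therefore never falls inside a word.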
Task 2: Multi-Threaded Word Counter
Implement a multi-threaded word counter at src/main/java/cpit305/fcit/kau/edu/sa/MultiThreadedWordCounter.java that:
Counts words in a single file:
Divides the file into equal chunks (number of lines / number of threads)
Each thread counts words in its assigned chunk
Counts words in multiple files:
Uses one thread per file
Each thread processes its entire file independently.
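The chunking scheme for a single file can be sketched as follows. This is only an illustration under stated assumptions: the lines are already loaded into a List, the chunk size is rounded up so that no line is left over, and each thread writes to its own slot of a shared array so that no locking is needed. The class name ChunkedCountSketch and its method signatures are hypothetical and do not match the lab skeleton exactly.

```java
import java.util.List;

// Hypothetical helper class, not part of the lab repository.
class ChunkedCountSketch {

    // Assumption: a word is any run of non-whitespace characters.
    static long countWordsInText(String text) {
        if (text == null) {
            return 0;
        }
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    static long countWords(List<String> lines, int numThreads) throws InterruptedException {
        if (numThreads <= 0) {
            throw new IllegalArgumentException("Number of threads must be positive");
        }
        int n = lines.size();
        // Ceiling division: the last chunk may be shorter, but no line is dropped
        int chunkSize = (n + numThreads - 1) / numThreads;
        long[] partial = new long[numThreads];
        Thread[] threads = new Thread[numThreads];

        for (int t = 0; t < numThreads; t++) {
            final int index = t;
            final int start = Math.min(n, t * chunkSize);
            final int end = Math.min(n, start + chunkSize);
            threads[t] = new Thread(() -> {
                long count = 0;
                for (int i = start; i < end; i++) {
                    count += countWordsInText(lines.get(i));
                }
                // Each thread writes only to its own slot
                partial[index] = count;
            });
            threads[t].start();
        }

        long total = 0;
        for (int t = 0; t < numThreads; t++) {
            // join() waits for the thread and makes its write visible here
            threads[t].join();
            total += partial[t];
        }
        return total;
    }
}
```

When there are more threads than lines, the extra threads receive an empty range and contribute zero, so the total does not depend on the thread count.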
package cpit305.fcit.kau.edu.sa;

import java.io.*;

public class MultiThreadedWordCounter {

    /**
     * Helper method to read file content
     */
    private static String readFileContent(String filePath) throws IOException {
        StringBuilder content = new StringBuilder();
        // TODO: read the file and append its text to content
        return content.toString();
    }

    /**
     * Helper method that counts words in a given text
     */
    private static long countWordsInText(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        // TODO: count the words in the text
        return 0;
    }

    /**
     * Counts words in a single file by dividing it into chunks and processing each chunk in a separate thread
     * @param filePath Path to the file to count words in
     * @param numThreads Number of threads to use
     * @return Total word count in the file
     */
    public static long countWords(String filePath, int numThreads) throws IOException, InterruptedException {
        if (filePath == null || filePath.isEmpty()) {
            throw new IllegalArgumentException("File path cannot be null or empty");
        }
        if (numThreads <= 0) {
            throw new IllegalArgumentException("Number of threads must be positive");
        }
        String content = readFileContent(filePath);
        // TODO: split content into chunks, count each chunk in its own thread, and sum the results
        return 0;
    }

    /**
     * Counts words in multiple files using multiple threads (one thread per file)
     * @param filePaths Array of file paths to count words in
     * @return Array of word counts corresponding to each file
     */
    public static long[] countWordsInFiles(String[] filePaths) throws IOException, InterruptedException {
        if (filePaths == null) {
            throw new IllegalArgumentException("File paths array cannot be null");
        }
        long[] wordCounts = new long[filePaths.length];
        Thread[] threads = new Thread[filePaths.length];
        for (int i = 0; i < filePaths.length; i++) {
            // TODO: create and start a thread that stores the count for filePaths[i] in wordCounts[i]
        }
        // TODO: wait for every thread to finish before returning
        return wordCounts;
    }
}
Run the unit tests at src/test/java/cpit305/fcit/kau/edu/sa/MultiThreadedWordCounterTest.java to verify your implementation.
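For the multiple-file case, one point the skeleton leaves open is how a thread should report an IOException, since Runnable.run() cannot throw checked exceptions. The sketch below shows one possible approach, assuming UTF-8 files: each thread records the first failure in an AtomicReference, and the calling thread rethrows it after all threads have been joined. The class name PerFileThreadsSketch is hypothetical, and this is not necessarily the approach the unit tests expect.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper class, not part of the lab repository.
class PerFileThreadsSketch {

    // Assumption: a word is any run of non-whitespace characters.
    static long countWordsInText(String text) {
        if (text == null) {
            return 0;
        }
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    static long[] countWordsInFiles(String[] filePaths) throws IOException, InterruptedException {
        long[] counts = new long[filePaths.length];
        Thread[] threads = new Thread[filePaths.length];
        // Holds the first I/O failure seen by any worker thread
        AtomicReference<IOException> failure = new AtomicReference<>();

        for (int i = 0; i < filePaths.length; i++) {
            final int index = i;
            threads[i] = new Thread(() -> {
                try {
                    String text = Files.readString(Path.of(filePaths[index]));
                    counts[index] = countWordsInText(text);
                } catch (IOException e) {
                    failure.compareAndSet(null, e);
                }
            });
            threads[i].start();
        }

        // Wait for all workers before reading the results
        for (Thread thread : threads) {
            thread.join();
        }
        IOException e = failure.get();
        if (e != null) {
            throw e;
        }
        return counts;
    }
}
```

Starting every thread before joining any of them is what lets the files be processed in parallel; joining inside the first loop would run them one at a time.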
Task 3: Performance Comparison
We will assess the CPU utilization and memory usage for the single-threaded and multi-threaded implementations and compare the performance of both implementations.
In the main class App.java, run the performance metrics utility in src/main/java/cpit305/fcit/kau/edu/sa/PerformanceMetrics.java and the performance comparison utility in src/main/java/cpit305/fcit/kau/edu/sa/PerformanceComparison.java:
import java.io.*;
import java.nio.file.*;
import java.net.URL;

public class PerformanceComparison {

    /**
     * Downloads text files for benchmarking and returns their local paths
     */
    public static String[] downloadTestFiles() throws IOException {
        String[] urls = {
            "https://www.gutenberg.org/files/1342/1342-0.txt", // Pride and Prejudice
            "https://www.gutenberg.org/files/84/84-0.txt",     // Frankenstein
            "https://www.gutenberg.org/files/2701/2701-0.txt", // Moby Dick
            "https://www.gutenberg.org/files/98/98-0.txt"      // A Tale of Two Cities
        };
        String[] localPaths = new String[urls.length];
        for (int i = 0; i < urls.length; i++) {
            String fileName = "book" + (i + 1) + ".txt";
            Path filePath = Paths.get(fileName);
            // Download the file
            URL url = new URL(urls[i]);
            try (InputStream in = url.openStream()) {
                Files.copy(in, filePath, StandardCopyOption.REPLACE_EXISTING);
                localPaths[i] = filePath.toString();
            }
        }
        return localPaths;
    }

    /**
     * Runs performance tests for single-threaded vs multi-threaded word counting
     */
    public static void runPerformanceComparison() throws IOException, InterruptedException {
        // 1. Download test files
        System.out.println("Downloading test files...");
        String[] filePaths = downloadTestFiles();

        // 2. Measure single-threaded performance
        System.out.println("\nRunning single-threaded tests...");
        long startTimeSingle = System.currentTimeMillis();
        long beforeMemorySingle = PerformanceMetrics.getMemoryUsage();
        double beforeCpuSingle = PerformanceMetrics.getCpuUsage();
        long[] singleThreadedResults = SingleThreadedWordCounter.countWordsInFiles(filePaths);
        long endTimeSingle = System.currentTimeMillis();
        long afterMemorySingle = PerformanceMetrics.getMemoryUsage();
        double afterCpuSingle = PerformanceMetrics.getCpuUsage();

        // 3. Measure multi-threaded performance
        System.out.println("\nRunning multi-threaded tests...");
        long startTimeMulti = System.currentTimeMillis();
        long beforeMemoryMulti = PerformanceMetrics.getMemoryUsage();
        double beforeCpuMulti = PerformanceMetrics.getCpuUsage();
        long[] multiThreadedResults = MultiThreadedWordCounter.countWordsInFiles(filePaths);
        long endTimeMulti = System.currentTimeMillis();
        long afterMemoryMulti = PerformanceMetrics.getMemoryUsage();
        double afterCpuMulti = PerformanceMetrics.getCpuUsage();

        // 4. Print comparison results
        System.out.println("\nPerformance Comparison Results:");
        System.out.println("--------------------------------");
        System.out.println("Single-Threaded:");
        System.out.println("Total Time: " + (endTimeSingle - startTimeSingle) + " ms");
        System.out.println("Average Time per File: " + ((endTimeSingle - startTimeSingle) / filePaths.length) + " ms");
        System.out.println("Memory Usage: " + (afterMemorySingle - beforeMemorySingle) + " MB");
        System.out.println("CPU Usage: " + afterCpuSingle + "%");
        System.out.println("\nMulti-Threaded:");
        System.out.println("Total Time: " + (endTimeMulti - startTimeMulti) + " ms");
        System.out.println("Average Time per File: " + ((endTimeMulti - startTimeMulti) / filePaths.length) + " ms");
        System.out.println("Memory Usage: " + (afterMemoryMulti - beforeMemoryMulti) + " MB");
        System.out.println("CPU Usage: " + afterCpuMulti + "%");

        // Calculate and print accuracy
        System.out.println("\nAccuracy Check:");
        boolean accurate = true;
        for (int i = 0; i < filePaths.length; i++) {
            if (singleThreadedResults[i] != multiThreadedResults[i]) {
                accurate = false;
                System.out.println("Mismatch in file " + filePaths[i]);
                System.out.println("Single-threaded count: " + singleThreadedResults[i]);
                System.out.println("Multi-threaded count: " + multiThreadedResults[i]);
            }
        }
        System.out.println("Word count accuracy: " + (accurate ? "100%" : "Mismatch detected"));
    }

    public static void runCPUMemoryUsage() throws IOException, InterruptedException {
        long beforeMemory = PerformanceMetrics.getMemoryUsage();
        double beforeCpu = PerformanceMetrics.getCpuUsage();
        // Run your word counting code
        long afterMemory = PerformanceMetrics.getMemoryUsage();
        double afterCpu = PerformanceMetrics.getCpuUsage();
        System.out.println("Memory usage: " + (afterMemory - beforeMemory) + " MB");
        System.out.println("CPU usage: " + afterCpu + "%");
    }
}
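The comparison code above calls PerformanceMetrics.getMemoryUsage() and PerformanceMetrics.getCpuUsage(), but their implementation is not shown here. The sketch below is a guess at how such methods could be written, not the repository's actual code. It assumes that memory is reported as used heap in whole megabytes and that CPU usage is the JVM process load as a percentage, read through com.sun.management.OperatingSystemMXBean, which is available on common JDK distributions but is not guaranteed on every JVM. The class name PerformanceMetricsSketch is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical stand-in for the lab's PerformanceMetrics class.
class PerformanceMetricsSketch {

    // Used heap in megabytes: memory the JVM has reserved minus the part still free
    static long getMemoryUsage() {
        Runtime runtime = Runtime.getRuntime();
        long usedBytes = runtime.totalMemory() - runtime.freeMemory();
        return usedBytes / (1024 * 1024);
    }

    // CPU load of this JVM process as a percentage, or -1 when it is unavailable
    static double getCpuUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            double load = ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuLoad();
            // getProcessCpuLoad() returns a value in [0.0, 1.0], or a negative number if not available
            return load < 0 ? -1 : load * 100.0;
        }
        return -1;
    }
}
```

Both readings are coarse. Used heap changes whenever the garbage collector runs, so the difference between two readings can be negative, and the CPU load is a recent sample rather than an average over the measured interval. Treat the numbers in your table as rough indicators and repeat each run a few times.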
Performance Comparison Results
Run the included unit tests to verify your solution. Additionally, run the PerformanceComparison class to see the real-world performance difference between single-threaded and multi-threaded approaches and complete the table below.
| Metric | Single-Threaded | Multi-Threaded |
|---|---|---|
| Total Execution Time (ms) | | |
| Average Time per File (ms) | | |
| Memory Usage (MB) | | |
| CPU Utilization (%) | | |
| Word Count Accuracy (%) | | |
| Number of Threads Used | 1 | |
Deliverables and Submission
Please push your code to GitHub for auto-grading and submit a PDF file with:
Screenshots showing your implementation
A table showing the performance comparison results
A brief analysis (1 paragraph) explaining the performance differences and why they occur
If your instructor is using GitHub Classroom, click on your class submission link, link your GitHub username to your name if you have not already done so, accept the assignment, clone the repository into your local development environment, and push the code to the remote repository on GitHub. Please make sure that your written answers are included in either a README (Markdown) file or a PDF file.
Lab due dates are listed on GitHub Classroom unless otherwise noted.
If your instructor is using GitHub Classroom, your submission will be auto-graded by running the included unit tests and manually graded for correctness, style, and quality.
How to submit your lab to GitHub Classroom
The video below demonstrates how to submit your work to GitHub Classroom.