Lab 5: Java Threads and Word Counting

In this lab, you will explore Java’s multithreading capabilities by implementing concurrent word counting operations. You’ll compare the performance of single-threaded and multi-threaded approaches for processing large text files.

Introduction

Imagine you are a developer at a company that processes large volumes of text data every day: analyzing customer reviews, indexing books, or processing legal documents. As the data grows, faster and more efficient processing becomes critical. In this lab, you take on that role and optimize a word counting application.

You will start by implementing a simple single-threaded solution to count words in text files. Then, you will explore the power of multithreading to divide the workload across multiple threads, enabling faster processing by utilizing modern multi-core processors. Finally, you will compare the performance of both approaches, analyzing metrics such as execution time, CPU utilization, and memory usage. This hands-on activity will give you practical experience with Java’s threading capabilities and help you understand how concurrency can improve application performance.

Prerequisites

  • Java Development Kit (JDK) 17 or later
  • Your preferred Java IDE
  • Git for version control

Getting Started

If your instructor is using GitHub Classroom, accept the assignment using the link at the bottom of this page, clone your auto-generated repository, and import it as a project into your IDE.

If your instructor is not using GitHub Classroom, clone and import the template project at https://github.com/cpit305-spring-25-IT1/lab-05.

Task 1: Single-Threaded Word Counter

Implement a single-threaded word counting solution in src/main/java/cpit305/fcit/kau/edu/sa/SingleThreadedWordCounter.java:

SingleThreadedWordCounter.java

package cpit305.fcit.kau.edu.sa;

import java.io.*;

public class SingleThreadedWordCounter {
    /**
     * Counts the number of words in the given file using a single thread
     * @param filePath Path to the file to count words in
     * @return the number of words in the file
     */
    public static long countWords(String filePath) throws IOException, IllegalArgumentException {
        if(filePath == null || filePath.isEmpty()) {
            throw new IllegalArgumentException("File path cannot be null or empty");
        }
        long totalWords = 0;



        return totalWords;
    }

    /**
     * Counts words in multiple files using a single thread (sequentially)
     * @param filePaths Array of file paths to count words in
     * @return Array of word counts corresponding to each file
     */
    public static long[] countWordsInFiles(String[] filePaths) throws IOException, IllegalArgumentException {
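        if(filePaths == null || filePaths.length == 0) {
            throw new IllegalArgumentException("File paths array cannot be null or empty");
        }

        // One slot per input file, in the same order as filePaths
        long[] wordCounts = new long[filePaths.length];

        // TODO: count the words in each file and store the result in wordCounts[i]

        return wordCounts;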
    }

    /**
     * Helper method that counts words in a given text
     */
    private static long countWordsInText(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }




        return 0;
    }
}

Run the unit test at src/test/java/cpit305/fcit/kau/edu/sa/SingleThreadedWordCounterTest.java to verify your implementation.
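As a starting point for Task 1, the sketch below shows one possible way to count words: read the input line by line and count whitespace-separated tokens. It assumes that a "word" is any run of non-whitespace characters, which may differ from the definition the unit tests expect. The class name WordCountSketch and the in-memory sample text are hypothetical and are not part of the lab template.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical sketch class; not part of the lab template.
class WordCountSketch {

    // Counts whitespace-separated tokens in a piece of text (assumed definition of a "word").
    static long countWordsInText(String text) {
        if (text == null) {
            return 0;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        // \\s+ treats a run of spaces, tabs, or newlines as a single separator.
        return trimmed.split("\\s+").length;
    }

    // Sums the per-line counts so the whole input never has to be held in memory at once.
    static long countWords(BufferedReader reader) throws IOException {
        long total = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            total += countWordsInText(line);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        String sample = "the quick  brown fox\n\njumps over\tthe lazy dog\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(countWords(reader)); // prints 9
        }
    }
}
```

For a real file, the same loop could be driven by a BufferedReader wrapped around a FileReader for the given path.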

Task 2: Multi-Threaded Word Counter

Implement a multi-threaded word counter at src/main/java/cpit305/fcit/kau/edu/sa/MultiThreadedWordCounter.java that:

  1. Counts words in a single file:

    • Divides the file's lines into roughly equal chunks (about number of lines / number of threads lines per chunk)
    • Each thread counts the words in its assigned chunk, and the per-thread counts are summed
  2. Counts words in multiple files:

    • Uses one thread per file
    • Each thread processes its entire file independently
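The chunking idea for a single file can be sketched as follows. This is a minimal illustration, not the required solution: it assumes the lines are already in memory as a list, that numThreads is positive, and that a word is a whitespace-separated token. The class ChunkedCountSketch and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch class; not part of the lab template.
class ChunkedCountSketch {

    static long countWordsInLine(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Splits the lines into numThreads contiguous chunks and counts each chunk in its own thread.
    // Assumes numThreads > 0.
    static long countWords(List<String> lines, int numThreads) throws InterruptedException {
        long[] partial = new long[numThreads];   // one slot per thread, so no locking is needed
        Thread[] threads = new Thread[numThreads];
        // Ceiling division: the last chunks may be shorter or empty when lines do not divide evenly.
        int chunkSize = (lines.size() + numThreads - 1) / numThreads;

        for (int t = 0; t < numThreads; t++) {
            final int id = t;
            final int start = Math.min(id * chunkSize, lines.size());
            final int end = Math.min(start + chunkSize, lines.size());
            threads[t] = new Thread(() -> {
                long count = 0;
                for (int i = start; i < end; i++) {
                    count += countWordsInLine(lines.get(i));
                }
                partial[id] = count;
            });
            threads[t].start();
        }

        long total = 0;
        for (int t = 0; t < numThreads; t++) {
            threads[t].join();   // after join() returns, the thread's write to partial[t] is visible
            total += partial[t];
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> lines = Arrays.asList("one two three", "", "four five", "six", "seven eight nine ten");
        System.out.println(countWords(lines, 3)); // prints 10
    }
}
```

Giving each thread its own slot in the partial array avoids a shared counter, so no synchronization is needed beyond join().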

MultiThreadedWordCounter.java

package cpit305.fcit.kau.edu.sa;

import java.io.*;
public class MultiThreadedWordCounter {

    /**
     * Helper method to read file content
     */
    private static String readFileContent(String filePath) throws IOException {
        StringBuilder content = new StringBuilder();






        return content.toString();
    }

    /**
     * Helper method that counts words in a given text
     */
    private static long countWordsInText(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }




        return 0;
    }

    /**
     * Counts words in a single file by dividing it into chunks and processing each chunk in a separate thread
     * @param filePath Path to the file to count words in
     * @param numThreads Number of threads to use
     * @return Total word count in the file
     */
    public static long countWords(String filePath, int numThreads) throws IOException, InterruptedException {
        if (filePath == null || filePath.isEmpty()) {
            throw new IllegalArgumentException("File path cannot be null or empty");
        }
        
        if (numThreads <= 0) {
            throw new IllegalArgumentException("Number of threads must be positive");
        }

        String content = readFileContent(filePath);
       
        return 0;
    }

    /**
     * Counts words in multiple files using multiple threads (one thread per file)
     * @param filePaths Array of file paths to count words in
     * @return Array of word counts corresponding to each file
     */
    public static long[] countWordsInFiles(String[] filePaths) throws IOException, InterruptedException {
        if (filePaths == null) {
            throw new IllegalArgumentException("File paths array cannot be null");
        }
        
        long[] wordCounts = new long[filePaths.length];
        Thread[] threads = new Thread[filePaths.length];
        
        
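        // TODO: create and start one thread per file; each thread stores its count in wordCounts[i]
        for (int i = 0; i < filePaths.length; i++) {

        }

        // TODO: join every thread before returning so that all counts are complete

        return wordCounts;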
    }

}

Run the unit tests at src/test/java/cpit305/fcit/kau/edu/sa/MultiThreadedWordCounterTest.java to verify your implementation.
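The one-thread-per-file approach can be illustrated with the sketch below. It is an assumption-laden example rather than the expected solution: the class PerFileThreadsSketch is hypothetical, files are read with Files.readString (UTF-8), and an I/O failure inside a thread is recorded and rethrown after all threads finish, because a thread's run() method cannot throw checked exceptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical sketch class; not part of the lab template.
class PerFileThreadsSketch {

    static long countWordsInText(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    static long[] countWordsInFiles(String[] filePaths) throws IOException, InterruptedException {
        long[] wordCounts = new long[filePaths.length];
        Thread[] threads = new Thread[filePaths.length];
        IOException[] failures = new IOException[filePaths.length];

        // Start every thread first so the files are processed concurrently.
        for (int i = 0; i < filePaths.length; i++) {
            final int index = i;
            threads[i] = new Thread(() -> {
                try {
                    String content = Files.readString(Path.of(filePaths[index]));
                    wordCounts[index] = countWordsInText(content);
                } catch (IOException e) {
                    failures[index] = e; // recorded here, rethrown by the calling thread below
                }
            });
            threads[i].start();
        }

        // Only then wait for all of them; joining inside the first loop would serialize the work.
        for (Thread thread : threads) {
            thread.join();
        }
        for (IOException failure : failures) {
            if (failure != null) {
                throw failure;
            }
        }
        return wordCounts;
    }

    public static void main(String[] args) throws Exception {
        Path first = Files.createTempFile("sketch", ".txt");
        Path second = Files.createTempFile("sketch", ".txt");
        Files.writeString(first, "one two three");
        Files.writeString(second, "four\nfive");
        long[] counts = countWordsInFiles(new String[] { first.toString(), second.toString() });
        System.out.println(Arrays.toString(counts)); // prints [3, 2]
    }
}
```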

Task 3: Performance Comparison

In this task, you will measure the execution time, CPU utilization, and memory usage of the single-threaded and multi-threaded implementations and compare the results.
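To make the measurements concrete, here is a small, hedged sketch of how elapsed time and heap usage can be sampled around a piece of work. It uses System.nanoTime(), which is intended for measuring elapsed time, and the Runtime memory counters. The class TimingSketch and the placeholder workload are hypothetical; in the lab, the workload would be a call to one of your word counters.

```java
// Hypothetical sketch class; not part of the lab template.
class TimingSketch {

    // Elapsed time of a task in milliseconds, based on the monotonic nanoTime clock.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Heap currently in use by this JVM, in bytes.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        long heapBefore = usedHeapBytes();
        long elapsed = timeMillis(() -> {
            // Placeholder workload standing in for a word counting call.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1_000_000; i++) {
                sb.append(i);
            }
        });
        long heapAfter = usedHeapBytes();

        System.out.println("Elapsed: " + elapsed + " ms");
        // The difference can be negative if the garbage collector ran during the task.
        System.out.println("Heap change: " + (heapAfter - heapBefore) / (1024 * 1024) + " MB");
    }
}
```

Single runs are noisy because of JIT warm-up and garbage collection, so it is reasonable to repeat each measurement a few times before filling in the results table.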

In the main class App.java, run the performance metrics utility in src/main/java/cpit305/fcit/kau/edu/sa/PerformanceMetrics.java and the performance comparison utility in src/main/java/cpit305/fcit/kau/edu/sa/PerformanceComparison.java:

App.java

package cpit305.fcit.kau.edu.sa;

public class App {
    public static void main(String[] args) {





    }
}

PerformanceMetrics.java

package cpit305.fcit.kau.edu.sa;

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class PerformanceMetrics {
    public static long getMemoryUsage() {
        Runtime runtime = Runtime.getRuntime();
        long totalMemory = runtime.totalMemory();
        long freeMemory = runtime.freeMemory();
        return (totalMemory - freeMemory) / (1024 * 1024); // Convert to MB
    }

    public static double getCpuUsage() {
        OperatingSystemMXBean osBean = ManagementFactory.getOperatingSystemMXBean();
        if (osBean instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) osBean).getProcessCpuLoad() * 100;
        }
        return -1;
    }
}

PerformanceComparison.java


package cpit305.fcit.kau.edu.sa;

import java.io.*;
import java.nio.file.*;
import java.net.URL;

public class PerformanceComparison {
    /**
     * Downloads text files for benchmarking and returns their local paths
     */
    public static String[] downloadTestFiles() throws IOException {
        String[] urls = {
                "https://www.gutenberg.org/files/1342/1342-0.txt", // Pride and Prejudice
                "https://www.gutenberg.org/files/84/84-0.txt",     // Frankenstein
                "https://www.gutenberg.org/files/2701/2701-0.txt", // Moby Dick
                "https://www.gutenberg.org/files/98/98-0.txt"      // A Tale of Two Cities
        };

        String[] localPaths = new String[urls.length];

        for (int i = 0; i < urls.length; i++) {
            String fileName = "book" + (i + 1) + ".txt";
            Path filePath = Paths.get(fileName);

            // Download the file
            URL url = new URL(urls[i]);
            try (InputStream in = url.openStream()) {
                Files.copy(in, filePath, StandardCopyOption.REPLACE_EXISTING);
                localPaths[i] = filePath.toString();
            }
        }

        return localPaths;
    }

    /**
     * Runs performance tests for single-threaded vs multi-threaded word counting
     */
    public static void runPerformanceComparison() throws IOException, InterruptedException {
        // 1. Download test files
        System.out.println("Downloading test files...");
        String[] filePaths = downloadTestFiles();

        // 2. Measure single-threaded performance
        System.out.println("\nRunning single-threaded tests...");
        long startTimeSingle = System.currentTimeMillis();
        long beforeMemorySingle = PerformanceMetrics.getMemoryUsage();
        double beforeCpuSingle = PerformanceMetrics.getCpuUsage();

        long[] singleThreadedResults = SingleThreadedWordCounter.countWordsInFiles(filePaths);

        long endTimeSingle = System.currentTimeMillis();
        long afterMemorySingle = PerformanceMetrics.getMemoryUsage();
        double afterCpuSingle = PerformanceMetrics.getCpuUsage();

        // 3. Measure multi-threaded performance
        System.out.println("\nRunning multi-threaded tests...");
        long startTimeMulti = System.currentTimeMillis();
        long beforeMemoryMulti = PerformanceMetrics.getMemoryUsage();
        double beforeCpuMulti = PerformanceMetrics.getCpuUsage();

        long[] multiThreadedResults = MultiThreadedWordCounter.countWordsInFiles(filePaths);

        long endTimeMulti = System.currentTimeMillis();
        long afterMemoryMulti = PerformanceMetrics.getMemoryUsage();
        double afterCpuMulti = PerformanceMetrics.getCpuUsage();

        // 4. Print comparison results
        System.out.println("\nPerformance Comparison Results:");
        System.out.println("--------------------------------");
        System.out.println("Single-Threaded:");
        System.out.println("Total Time: " + (endTimeSingle - startTimeSingle) + " ms");
        System.out.println("Average Time per File: " + ((endTimeSingle - startTimeSingle) / filePaths.length) + " ms");
        System.out.println("Memory Usage: " + (afterMemorySingle - beforeMemorySingle) + " MB");
        System.out.println("CPU Usage: " + afterCpuSingle + "%");

        System.out.println("\nMulti-Threaded:");
        System.out.println("Total Time: " + (endTimeMulti - startTimeMulti) + " ms");
        System.out.println("Average Time per File: " + ((endTimeMulti - startTimeMulti) / filePaths.length) + " ms");
        System.out.println("Memory Usage: " + (afterMemoryMulti - beforeMemoryMulti) + " MB");
        System.out.println("CPU Usage: " + afterCpuMulti + "%");

        // Calculate and print accuracy
        System.out.println("\nAccuracy Check:");
        boolean accurate = true;
        for (int i = 0; i < filePaths.length; i++) {
            if (singleThreadedResults[i] != multiThreadedResults[i]) {
                accurate = false;
                System.out.println("Mismatch in file " + filePaths[i]);
                System.out.println("Single-threaded count: " + singleThreadedResults[i]);
                System.out.println("Multi-threaded count: " + multiThreadedResults[i]);
            }
        }
        System.out.println("Word count accuracy: " + (accurate ? "100%" : "Mismatch detected"));
    }


    public static void runCPUMemoryUsage() throws IOException, InterruptedException {
        long beforeMemory = PerformanceMetrics.getMemoryUsage();
        double beforeCpu = PerformanceMetrics.getCpuUsage();

        // Run your word counting code

        long afterMemory = PerformanceMetrics.getMemoryUsage();
        double afterCpu = PerformanceMetrics.getCpuUsage();

        System.out.println("Memory usage: " + (afterMemory - beforeMemory) + " MB");
        System.out.println("CPU usage: " + afterCpu + "%");
    }
}

Performance Comparison Results

Run the included unit tests to verify your solution. Additionally, run the PerformanceComparison class to see the real-world performance difference between single-threaded and multi-threaded approaches and complete the table below.

Metric                     | Single-Threaded | Multi-Threaded
---------------------------|-----------------|---------------
Total Execution Time (ms)  |                 |
Average Time per File (ms) |                 |
Memory Usage (MB)          |                 |
CPU Utilization (%)        |                 |
Word Count Accuracy (%)    |                 |
Number of Threads Used     | 1               |

Deliverables and Submission

Please push your code to GitHub for auto-grading and submit a PDF file with:

  1. Screenshots showing your implementation
  2. A table showing the performance comparison results
  3. A brief analysis (1 paragraph) explaining the performance differences and why they occur