
The word count command (wc) is a classic utility that reports the number of lines, words, and bytes in files or standard input.
It’s a staple tool for anyone working with text files on Unix-like systems. But have you ever wondered how such a tool is designed and implemented?
In this practice exercise, we’ll build a simple implementation of the wc tool using Rust. By the end, we’ll have a functional utility that can count lines, words, and bytes in files.
Setting Up Our Project
First, let’s set up a new Rust project and add clap as a dependency:
cargo new rust-wc
cd rust-wc
cargo add clap --features derive
Defining the Command-Line Interface
Below is our initial structure for the application:
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use clap::Parser;
#[derive(Parser)]
#[command(name = "wc")]
#[command(about = "Count lines, words, and bytes in a file")]
struct Cli {
#[arg(short = 'l', long = "lines")]
lines: bool,
#[arg(short = 'w', long = "words")]
words: bool,
#[arg(short = 'c', long = "bytes")]
bytes: bool,
#[arg(name = "FILE")]
file: Option<String>,
}
#[derive(Default)]
struct Counts {
lines: usize,
words: usize,
bytes: usize,
// Display flags, filled in by print_counts before printing
show_all: bool,
show_lines: bool,
show_words: bool,
show_bytes: bool,
}
fn main() -> io::Result<()> {
let args = Cli::parse();
// Show everything if no specific counts are requested
let show_all = !(args.lines || args.words || args.bytes);
let mut counts = match args.file {
Some(ref path) => count_file(path)?,
None => count_stdin()?,
};
print_counts(&mut counts, show_all, args.lines, args.words, args.bytes);
Ok(())
}
This code sets up our tool’s foundation. We use the clap library to parse command-line options into the Cli struct, and the Counts struct stores our results. If none of -l, -w, or -c is given, show_all is true and all three counts are printed.
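The default-to-everything rule is easy to check in isolation. Here is a minimal sketch using only the standard library; selected_columns is a hypothetical helper introduced for illustration (the article computes show_all inline in main):

```rust
/// Decide which columns to print, as (lines, words, bytes).
/// Hypothetical helper, not part of the article's code.
fn selected_columns(lines: bool, words: bool, bytes: bool) -> (bool, bool, bool) {
    // No flag given means "show everything", like the show_all check in main.
    let show_all = !(lines || words || bytes);
    (show_all || lines, show_all || words, show_all || bytes)
}

fn main() {
    // No flags: all three columns are selected.
    println!("{:?}", selected_columns(false, false, false));
    // Only -l: just the line count is selected.
    println!("{:?}", selected_columns(true, false, false));
}
```

Keeping this decision in one small function would make it possible to unit test the flag handling without going through clap.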
Counting Files and Input
Now let’s add the counting functions:
fn count_file(path: &str) -> io::Result<Counts> {
let file = File::open(path)?;
let reader = BufReader::new(file);
count_reader(reader)
}
fn count_stdin() -> io::Result<Counts> {
let stdin = io::stdin();
let reader = stdin.lock();
count_reader(reader)
}
fn count_reader<R: BufRead>(reader: R) -> io::Result<Counts> {
let mut counts = Counts::default();
for line in reader.lines() {
let line = line?;
counts.lines += 1;
counts.words += line.split_whitespace().count();
counts.bytes += line.len() + 1;
}
Ok(counts)
}
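These functions work together to count lines, words, and bytes from any input source. Because count_reader accepts any type that implements the BufRead trait, files and standard input share the same counting code. Note that lines() strips the line terminator, so we add 1 byte per line to make up for it. This assumes every line, including the last one, ends with a single newline, so the byte count is slightly off for input without a trailing newline or with Windows-style \r\n line endings.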
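If exact byte counts matter, one option is to read with read_line, which keeps the line terminator and returns the number of bytes consumed. The sketch below is an assumption about how that could look, not the article's implementation; count_exact is a hypothetical name, and it counts only newline-terminated lines, so its line count differs from count_reader for input without a trailing newline:

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical variant of count_reader returning (lines, words, bytes).
fn count_exact<R: BufRead>(mut reader: R) -> io::Result<(usize, usize, usize)> {
    let (mut lines, mut words, mut bytes) = (0usize, 0usize, 0usize);
    let mut buf = String::new();
    loop {
        buf.clear();
        // read_line keeps the terminator and returns the bytes it consumed.
        let n = reader.read_line(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        bytes += n;
        words += buf.split_whitespace().count();
        // Count a line only when it actually ends with a newline.
        if buf.ends_with('\n') {
            lines += 1;
        }
    }
    Ok((lines, words, bytes))
}

fn main() -> io::Result<()> {
    let (lines, words, bytes) = count_exact(Cursor::new("hello world\nrust is great"))?;
    // prints: 1 5 25
    println!("{lines} {words} {bytes}");
    Ok(())
}
```

Reusing one String buffer across iterations also avoids allocating a new string per line, which lines() does.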
Display Format
Let’s create a clean output format:
impl std::fmt::Display for Counts {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
let mut output = String::new();
if self.show_all || self.show_lines {
output.push_str(&format!("{:>8} ", self.lines));
}
if self.show_all || self.show_words {
output.push_str(&format!("{:>8} ", self.words));
}
if self.show_all || self.show_bytes {
output.push_str(&format!("{:>8}", self.bytes));
}
write!(f, "{}", output.trim_end())
}
}
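fn print_counts(
counts: &mut Counts,
show_all: bool,
show_lines: bool,
show_words: bool,
show_bytes: bool,
) {
// Record which columns to show, then let Display do the formatting
counts.show_all = show_all;
counts.show_lines = show_lines;
counts.show_words = show_words;
counts.show_bytes = show_bytes;
println!("{}", counts);
}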
This code creates neat, aligned output that matches the style of traditional Unix tools.
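The alignment comes from the {:>8} format specifier, which right-aligns a value in a column at least 8 characters wide. A small standalone sketch of the effect follows; format_row is a hypothetical helper used only for this illustration:

```rust
/// Hypothetical helper: right-align each value in an 8-character column,
/// separating columns with a single space.
fn format_row(values: &[usize]) -> String {
    values
        .iter()
        .map(|v| format!("{:>8}", v))
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    // Brackets make the leading padding visible in the terminal.
    println!("[{}]", format_row(&[2, 5, 26]));
    println!("[{}]", format_row(&[120, 987, 6543]));
}
```

The width is a minimum, so a count with more than 8 digits is still printed in full and simply pushes the following columns to the right.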
Testing Our Tool
Let’s add some tests to make sure everything works:
#[cfg(test)]
mod tests {
use super::*;
use std::io::Cursor;
#[test]
fn test_empty_input() {
let input = Cursor::new("");
let counts = count_reader(input).unwrap();
assert_eq!(counts.lines, 0);
assert_eq!(counts.words, 0);
assert_eq!(counts.bytes, 0);
}
#[test]
fn test_single_word() {
let input = Cursor::new("hello");
let counts = count_reader(input).unwrap();
assert_eq!(counts.lines, 1);
assert_eq!(counts.words, 1);
assert_eq!(counts.bytes, 6);
}
#[test]
fn test_multiple_words() {
let input = Cursor::new("hello world\nrust is great");
let counts = count_reader(input).unwrap();
assert_eq!(counts.lines, 2);
assert_eq!(counts.words, 5);
// 11 + 1 bytes for the first line, 13 + 1 for the second
assert_eq!(counts.bytes, 26);
}
#[test]
fn test_multiple_spaces() {
let input = Cursor::new("hello   world");
let counts = count_reader(input).unwrap();
assert_eq!(counts.words, 2);
}
#[test]
fn test_unicode_characters() {
let input = Cursor::new("Hello 世界");
let counts = count_reader(input).unwrap();
assert_eq!(counts.words, 2);
// "Hello" (5) + space (1) + two 3-byte characters (6) + 1 for the newline
assert_eq!(counts.bytes, 13);
}
}
Using the Tool
We can build and run our program with:
cargo build --release
echo "Hello World" | ./target/release/rust-wc
./target/release/rust-wc some_file.txt
To run the tests:
cargo test
Future Improvements
We can make our tool even better by:
- Adding support for multiple files at once
- Improving how we handle different text encodings
- Adding better error messages for invalid text
- Making it faster with parallel processing
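As a starting point for the first item, here is a rough sketch of how per-file counts could be summed into a total row. Tally, tally, and total are hypothetical names introduced for this example, in-memory strings stand in for files, and the counting loop mirrors count_reader from earlier (including its one-byte-per-line assumption):

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical stand-in for Counts, without the display flags.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Tally {
    lines: usize,
    words: usize,
    bytes: usize,
}

/// Same counting logic as count_reader in the article.
fn tally<R: BufRead>(reader: R) -> io::Result<Tally> {
    let mut t = Tally::default();
    for line in reader.lines() {
        let line = line?;
        t.lines += 1;
        t.words += line.split_whitespace().count();
        t.bytes += line.len() + 1; // assumes a single trailing newline
    }
    Ok(t)
}

/// Sum the per-input tallies into one total.
fn total(tallies: &[Tally]) -> Tally {
    tallies.iter().fold(Tally::default(), |acc, t| Tally {
        lines: acc.lines + t.lines,
        words: acc.words + t.words,
        bytes: acc.bytes + t.bytes,
    })
}

fn main() -> io::Result<()> {
    // In the real tool these would be files opened from the command line.
    let inputs = ["one two\n", "three\nfour five\n"];
    let tallies = inputs
        .iter()
        .map(|s| tally(Cursor::new(*s)))
        .collect::<io::Result<Vec<_>>>()?;
    for t in &tallies {
        println!("{:>8} {:>8} {:>8}", t.lines, t.words, t.bytes);
    }
    let sum = total(&tallies);
    println!("{:>8} {:>8} {:>8} total", sum.lines, sum.words, sum.bytes);
    Ok(())
}
```

Supporting this in the CLI would also mean changing the file argument from Option<String> to a list of paths.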
This project shows how we can use Rust’s features to build reliable command-line tools that work as well as traditional Unix utilities. As always, if you disagree or want to chat about this post, find me on Bluesky!
Complete Code
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use clap::Parser;
#[derive(Parser)]
#[command(name = "wc")]
#[command(about = "Count lines, words, and bytes in a file")]
struct Cli {
#[arg(short = 'l', long = "lines")]
lines: bool,
#[arg(short = 'w', long = "words")]
words: bool,
#[arg(short = 'c', long = "bytes")]
bytes: bool,
#[arg(name = "FILE")]
file: Option<String>,
}
#[derive(Default)]
struct Counts {
lines: usize,
words: usize,
bytes: usize,
show_all: bool,
show_lines: bool,
show_words: bool,
show_bytes: bool,
}
impl std::fmt::Display for Counts {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
let mut output = String::new();
if self.show_all || self.show_lines {
output.push_str(&format!("{:>8} ", self.lines));
}
if self.show_all || self.show_words {
output.push_str(&format!("{:>8} ", self.words));
}
if self.show_all || self.show_bytes {
output.push_str(&format!("{:>8}", self.bytes));
}
write!(f, "{}", output.trim_end())
}
}
fn count_file(path: &str) -> io::Result<Counts> {
let file = File::open(path)?;
let reader = BufReader::new(file);
count_reader(reader)
}
fn count_stdin() -> io::Result<Counts> {
let stdin = io::stdin();
let reader = stdin.lock();
count_reader(reader)
}
fn count_reader<R: BufRead>(reader: R) -> io::Result<Counts> {
let mut counts = Counts::default();
for line in reader.lines() {
let line = line?;
counts.lines += 1;
counts.words += line.split_whitespace().count();
counts.bytes += line.len() + 1;
}
Ok(counts)
}
fn print_counts(
counts: &mut Counts,
show_all: bool,
show_lines: bool,
show_words: bool,
show_bytes: bool,
) {
counts.show_all = show_all;
counts.show_lines = show_lines;
counts.show_words = show_words;
counts.show_bytes = show_bytes;
println!("{}", counts);
}
fn main() -> io::Result<()> {
let args = Cli::parse();
// Show everything if no specific counts are requested
let show_all = !(args.lines || args.words || args.bytes);
let mut counts = match args.file {
Some(ref path) => count_file(path)?,
None => count_stdin()?,
};
print_counts(&mut counts, show_all, args.lines, args.words, args.bytes);
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::Cursor;
#[test]
fn test_empty_input() {
let input = Cursor::new("");
let counts = count_reader(input).unwrap();
assert_eq!(counts.lines, 0);
assert_eq!(counts.words, 0);
assert_eq!(counts.bytes, 0);
}
#[test]
fn test_single_word() {
let input = Cursor::new("hello");
let counts = count_reader(input).unwrap();
assert_eq!(counts.lines, 1);
assert_eq!(counts.words, 1);
assert_eq!(counts.bytes, 6);
}
#[test]
fn test_multiple_words() {
let input = Cursor::new("hello world\nrust is great");
let counts = count_reader(input).unwrap();
assert_eq!(counts.lines, 2);
assert_eq!(counts.words, 5);
// 11 + 1 bytes for the first line, 13 + 1 for the second
assert_eq!(counts.bytes, 26);
}
#[test]
fn test_multiple_spaces() {
let input = Cursor::new("hello   world");
let counts = count_reader(input).unwrap();
assert_eq!(counts.words, 2);
}
#[test]
fn test_unicode_characters() {
let input = Cursor::new("Hello 世界");
let counts = count_reader(input).unwrap();
assert_eq!(counts.words, 2);
// "Hello" (5) + space (1) + two 3-byte characters (6) + 1 for the newline
assert_eq!(counts.bytes, 13);
}
}
All these pieces work together to create a complete word counting tool. We’ve included command-line options, file handling, counting functions, and unit tests for the counting logic.