Node.js Streams

2023-08-16

When working with large files or large amounts of data in Node.js, loading everything into memory at once is often impractical. If we read a 2 GB file with fs.readFile, the whole file is buffered in memory before our code sees any of it, which can exhaust available memory and crash the application or slow it down significantly. Streams solve this by processing the data in small chunks.

What Are Streams

A stream is a sequence of data that is made available over time. Instead of waiting for all the data to be ready, we process it piece by piece as it arrives. This is similar to watching a video online. We do not wait for the entire video to download before watching it. The video plays while the rest is still loading.

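Node.js has four types of streams: Readable (a source we read from), Writable (a destination we write to), Duplex (both readable and writable, such as a TCP socket), and Transform (a Duplex stream that modifies data as it passes through).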

Readable Streams

A readable stream is a source of data. We can read from it chunk by chunk. A common example is reading a file:

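const fs = require("fs");

// With an encoding set, each chunk arrives as a string rather than a Buffer.
const readStream = fs.createReadStream("largefile.txt", { encoding: "utf8" });

readStream.on("data", (chunk) => {
  // chunk.length is the string length here, not a byte count.
  console.log("Received chunk:", chunk.length, "characters");
});

readStream.on("end", () => {
  console.log("Finished reading");
});

// Without an error handler, a missing file would crash the process.
readStream.on("error", (err) => {
  console.error("Read failed:", err.message);
});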

The data event fires every time a new chunk is available. For fs.createReadStream the default chunk size is 64 KiB, and we can change it with the highWaterMark option, which is given in bytes.
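To make the effect of highWaterMark concrete, here is a minimal sketch that counts how many chunks a file is delivered in. The sample file, its location in the temp directory, and the countChunks helper are assumptions made for illustration only.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical sample data: a 100 KiB file in the temp directory.
const samplePath = path.join(os.tmpdir(), "hwm-sample.txt");
fs.writeFileSync(samplePath, "a".repeat(100 * 1024));

// Count how many "data" events a read stream emits for a given highWaterMark.
function countChunks(filePath, highWaterMark) {
  return new Promise((resolve, reject) => {
    let chunks = 0;
    fs.createReadStream(filePath, { highWaterMark })
      .on("data", () => {
        chunks += 1;
      })
      .on("end", () => resolve(chunks))
      .on("error", reject);
  });
}

// Each chunk holds at most 16 KiB, so the file arrives in several pieces.
countChunks(samplePath, 16 * 1024).then((n) => {
  console.log("Chunks with a 16 KiB highWaterMark:", n);
});
```

A smaller highWaterMark means more, smaller chunks and lower peak memory use; a larger one means fewer events at the cost of bigger buffers.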

Writable Streams

A writable stream is a destination where we can send data. Writing to a file is a common example:

const fs = require("fs");

const writeStream = fs.createWriteStream("output.txt");

writeStream.write("First line\n");
writeStream.write("Second line\n");
// end() writes an optional final chunk and then closes the stream.
writeStream.end("Last line\n");
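When writing many chunks, it can help to respect the return value of write(), which is false once the stream's internal buffer is full, and to wait for the finish event before assuming the data is on disk. The following is a minimal sketch under those assumptions; writeLines is a hypothetical helper, not part of the fs module.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const { once } = require("events");

// Write each line to a file, pausing whenever the internal buffer is full.
async function writeLines(filePath, lines) {
  const writeStream = fs.createWriteStream(filePath);
  for (const line of lines) {
    // write() returns false when the buffered data reaches highWaterMark.
    if (!writeStream.write(line + "\n")) {
      // "drain" signals that it is safe to write again.
      await once(writeStream, "drain");
    }
  }
  writeStream.end();
  // "finish" fires once all data has been flushed to the underlying file.
  await once(writeStream, "finish");
}

// Example usage with a hypothetical output path in the temp directory.
writeLines(path.join(os.tmpdir(), "lines-demo.txt"), ["First line", "Second line", "Last line"])
  .then(() => console.log("All lines written"))
  .catch((err) => console.error("Write failed:", err.message));
```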

Piping

The real power of streams comes from piping, which connects a readable stream to a writable stream. Data flows from the source to the destination automatically:

const fs = require("fs");

const readStream = fs.createReadStream("input.txt");
const writeStream = fs.createWriteStream("output.txt");

// pipe() pauses the source whenever the destination's buffer is full.
readStream.pipe(writeStream);

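This copies the file from input.txt to output.txt without loading the entire file into memory. One caveat: pipe does not forward errors from one stream to the next, so each stream in the chain needs its own error handler.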
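The same copy can also be written with pipeline from stream/promises, which rejects if any stream in the chain fails and destroys the remaining streams. This is a minimal sketch, assuming Node.js 15 or later; copyWithPipeline is a hypothetical helper name.

```javascript
const fs = require("fs");
const { pipeline } = require("stream/promises");

// Copy a file chunk by chunk; the promise rejects if either stream errors.
async function copyWithPipeline(source, destination) {
  await pipeline(
    fs.createReadStream(source),
    fs.createWriteStream(destination)
  );
}
```

Calling copyWithPipeline("input.txt", "output.txt") inside a try/catch block gives one place to handle a missing source file or a failed write.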

Transform Streams

Transform streams sit between a readable and a writable stream, modifying the data as it passes through. A practical example is compressing a file:

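const fs = require("fs");
const zlib = require("zlib");

// createGzip() returns a Transform stream that compresses whatever is written to it.
const gzip = zlib.createGzip();

fs.createReadStream("input.txt")
  .pipe(gzip)
  .pipe(fs.createWriteStream("input.txt.gz"));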

The data flows from the file, through the gzip transform, and into the compressed output file.
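The gzip stream is a built-in transform, but we can also write our own. As a rough sketch, the transform below upper-cases text as it passes through. The names createUpperCaseTransform and upperCaseFile are hypothetical, and the code assumes plain ASCII input, since a multi-byte character split across two chunks would need extra handling.

```javascript
const fs = require("fs");
const { Transform } = require("stream");
const { pipeline } = require("stream/promises");

// Build a Transform stream that upper-cases each chunk of text.
function createUpperCaseTransform() {
  return new Transform({
    transform(chunk, encoding, callback) {
      // Assumes ASCII text; the first callback argument is reserved for errors.
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

// Read a file, upper-case its contents, and write the result to another file.
async function upperCaseFile(source, destination) {
  await pipeline(
    fs.createReadStream(source),
    createUpperCaseTransform(),
    fs.createWriteStream(destination)
  );
}
```

The transform function receives one chunk at a time and hands the modified chunk to the callback, so memory use stays flat regardless of file size.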

When to Use Streams

Streams are useful whenever we deal with large data: reading or writing large files, processing CSV or JSON data row by row, handling HTTP request and response bodies for file uploads or downloads, and piping data between services. For small data that fits comfortably in memory, the simpler fs.readFile and fs.writeFile methods are fine. But when the data size is unpredictable or potentially large, streams are the safer choice.
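As an illustration of row-by-row processing, the sketch below counts the non-empty lines of a file by feeding a read stream into the built-in readline module. The countRows helper is hypothetical, and a real CSV parser would also need to handle quoted fields that contain newlines.

```javascript
const fs = require("fs");
const readline = require("readline");

// Count non-empty lines without ever holding the whole file in memory.
async function countRows(filePath) {
  const rl = readline.createInterface({
    input: fs.createReadStream(filePath),
    // Treat \r\n as a single line break.
    crlfDelay: Infinity,
  });

  let rows = 0;
  for await (const line of rl) {
    if (line.trim() !== "") {
      rows += 1;
    }
  }
  return rows;
}
```

Only one line is held in memory at a time, so the same function works for a file of a few kilobytes or several gigabytes.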