Ivan Derlich – My Blog

Understanding Backpressure: Application Layer vs. Transport Layer

Post author By admin
Post date September 26, 2024

In modern systems, efficient data handling is critical to building robust, scalable applications. One of the key concepts in data flow management is backpressure, which ensures that a fast producer doesn’t overwhelm a slower consumer by adjusting the data flow rate. However, backpressure operates at multiple layers of communication, most notably the application layer (e.g., Node.js streams) and the transport layer (e.g., TCP).

In this article, we’ll break down the differences between how backpressure is handled in these two layers, why each layer has its own mechanism, and how they work together to keep data flowing smoothly in complex systems.

What is Backpressure?

Backpressure arises when a system producing data (the producer) generates it faster than the system consuming it (the consumer) can process it. When this happens, data starts to accumulate in buffers, and if the buildup is not managed properly, it can lead to performance degradation, memory exhaustion, or even application crashes.

To prevent this, mechanisms at various layers of the communication stack signal the producer to slow down or pause until the consumer is ready to process more data.

Application Layer Backpressure

At the application layer, backpressure is managed by the application itself, typically through mechanisms provided by frameworks and programming environments. In the case of Node.js, streams are a common way to handle data flow between producers and consumers.

In a writable stream, for example, once the amount of data in the internal buffer reaches the threshold set by highWaterMark, stream.write() returns false, signaling that the consumer needs to catch up before more data is written. The return value is advisory: the stream still accepts and buffers further writes, so a producer that ignores it simply lets memory usage grow. A well-behaved producer pauses and waits for the “drain” event before resuming.

Here’s a simplified version of how it works:

The producer generates data and writes it to a writable stream.
The writable stream has an internal buffer. Once this buffer reaches highWaterMark, write() returns false and backpressure is applied.
The producer pauses itself and waits for the “drain” event, which signals that the buffer has emptied.
The producer resumes writing when the buffer has capacity again.

This type of backpressure is highly customizable and gives developers fine control over the data flow at the application level.

Application Layer Backpressure Example

const { Writable } = require("stream");

const outStream = new Writable({
  highWaterMark: 25, // Threshold in bytes; kept tiny so backpressure shows up quickly
  write(chunk, encoding, callback) {
    setTimeout(() => {
      console.log("write in the outstream: ", chunk.toString());
      callback(); // This is crucial to signal the stream is ready for more data
    }, 10); // Simulate processing time
  },
});

outStream.on("drain", () => {
  console.log("Drain event in consumer");
});

outStream.on("finish", () => {
  console.log("Finish event in consumer");
});

async function produceData() {
  const limit = 100;

  for (let i = 0; i < limit; i++) {
    const chunk = Buffer.from(`Chunk ${i}`);
    console.log("Writing chunk:", chunk.toString());
    await new Promise((resolve) => setTimeout(resolve, 1)); // Simulate time spent producing
    const canWrite = outStream.write(chunk); // Returns false once the buffer reaches highWaterMark
    console.log(`Write result for chunk ${i}:`, canWrite);
    if (!canWrite) {
      console.log("Backpressure applied, waiting for drain...");
      await new Promise((resolve) => outStream.once("drain", resolve));
    }
  }

  outStream.end(); // Signal that no more data will be written
}

produceData();

More about the previous code in this repository

In this example, the writable stream’s buffer fills up if the producer generates data too quickly. The producer pauses when write() returns false and resumes when the “drain” event is emitted.
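
The manual write()/“drain” loop above is not the only option. Node.js can do the same bookkeeping for you: pipe() and stream.pipeline() pause the source whenever the destination reports that its buffer is full. The following is a minimal sketch of that approach; the generator, the slow sink, and the chunk count are hypothetical stand-ins for a real producer and consumer.

```javascript
const { Readable, Writable, pipeline } = require("stream");

const received = [];

// Hypothetical fast producer: yields 50 chunks as fast as it is asked for them.
function* generateChunks() {
  for (let i = 0; i < 50; i++) yield `Chunk ${i}`;
}

// Hypothetical slow consumer: takes about 2 ms per chunk.
const slowSink = new Writable({
  objectMode: true,
  highWaterMark: 5, // In object mode this counts chunks, not bytes
  write(chunk, encoding, callback) {
    setTimeout(() => {
      received.push(chunk);
      callback(); // Ready for the next chunk
    }, 2);
  },
});

// pipeline() wires the streams together, applies backpressure between them,
// and reports completion or failure through a single callback.
const done = new Promise((resolve, reject) => {
  pipeline(Readable.from(generateChunks()), slowSink, (err) => {
    if (err) return reject(err);
    console.log(`Pipeline finished, ${received.length} chunks consumed`);
    resolve(received);
  });
});
```

The producer never checks a return value or listens for “drain” here; the stream machinery only pulls more chunks from the generator when the sink has room.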

Transport Layer Backpressure

At the transport layer, protocols like TCP also handle backpressure, but in a different way. TCP ensures reliable data transmission across a network by managing the flow of data between sender and receiver. It uses a mechanism called flow control to prevent the sender from overwhelming the receiver.

TCP uses a sliding window protocol, where the receiver advertises how much buffer space it has available. This window size dictates how much data the sender can send before it needs an acknowledgment from the receiver. If the receiver’s buffer is full (e.g., the application layer isn’t reading data fast enough), the receiver will reduce the window size or even set it to zero, signaling the sender to stop transmitting more data until the buffer is freed up.

Here’s what happens at the transport layer:

The sender transmits data in segments, adhering to the window size advertised by the receiver.
The receiver acknowledges the data and updates the window size based on its current buffer availability.
If the receiver’s buffer is full, it reduces the window size, causing the sender to slow down or stop sending data.
When the receiver processes more data and frees up buffer space, it increases the window size, signaling the sender to resume transmission.
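
The steps above can be sketched as a toy simulation. This is not real TCP: it ignores segments, acknowledgment timing, retransmission, and congestion control, and the function name and parameters are made up for illustration. It only models the one rule that matters here, namely that the sender never sends more than the free buffer space the receiver advertises.

```javascript
// Toy model of receiver-driven flow control (all names are hypothetical).
function simulateFlowControl({ totalBytes, bufferSize, sendRate, readRate }) {
  let unsent = totalBytes; // Bytes the sender still has to transmit
  let buffered = 0;        // Bytes sitting in the receiver's buffer
  let consumed = 0;        // Bytes the receiving application has read
  let maxBuffered = 0;     // High-water mark actually reached in the buffer
  let throttledTicks = 0;  // Ticks in which the window held the sender back
  let ticks = 0;

  while (consumed < totalBytes) {
    // The receiver advertises its free buffer space as the window.
    const windowSize = bufferSize - buffered;

    // The sender would like to send this much, but is capped by the window.
    const wanted = Math.min(unsent, sendRate);
    const sent = Math.min(wanted, windowSize);
    if (sent < wanted) throttledTicks++;
    unsent -= sent;
    buffered += sent;
    maxBuffered = Math.max(maxBuffered, buffered);

    // The application reads from the buffer, which frees space
    // and therefore reopens the window for the next tick.
    const read = Math.min(buffered, readRate);
    buffered -= read;
    consumed += read;
    ticks++;
  }

  return { ticks, throttledTicks, maxBuffered, consumed };
}

const result = simulateFlowControl({
  totalBytes: 100,
  bufferSize: 20,
  sendRate: 10,
  readRate: 5,
});
console.log(result);
```

With a sender twice as fast as the reader, the buffer fills within a few ticks and the sender is then limited to the reader's pace, while the buffer never grows past its fixed size.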

Separation of Concerns: How Application and Transport Layers Interact

The transport layer (e.g., TCP) doesn’t directly know what the application is doing. It only knows whether the application is consuming data or not. If the application stops reading data (due to its own backpressure handling), this indirectly affects the transport layer because the buffer on the receiving side fills up. As a result, TCP will reduce the sender’s transmission rate based on the sliding window protocol.

To put it simply:

The transport layer backpressure is concerned with managing data flow over the network between sender and receiver. It makes sure that the sender doesn’t overwhelm the receiver’s buffer by slowing down transmission when necessary.
The application layer backpressure is about controlling the data flow within your application, making sure that the consumer (e.g., a writable stream) isn’t overwhelmed by the producer.
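
In Node.js the two layers meet at the socket. The sketch below, which runs a sender and a receiver over the loopback interface, is one way to see this; the payload size, chunk size, and processing delay are arbitrary assumptions. On the receiving side, pipe() stops reading from the socket while the slow consumer is busy, so unread bytes pile up in the operating system's receive buffer and TCP can advertise a smaller window. On the sending side, the client simply respects the return value of write().

```javascript
const net = require("net");
const { Writable } = require("stream");

const TOTAL_BYTES = 1024 * 1024; // Hypothetical payload size: 1 MiB
let consumedBytes = 0;

// Slow application-level consumer on the receiving side.
function createSlowConsumer() {
  return new Writable({
    write(chunk, encoding, callback) {
      consumedBytes += chunk.length;
      setTimeout(callback, 1); // Simulate slow processing
    },
  });
}

const done = new Promise((resolve, reject) => {
  const server = net.createServer((socket) => {
    // pipe() pauses the socket whenever the consumer's buffer is full.
    socket.pipe(createSlowConsumer()).on("finish", () => {
      server.close();
      resolve(consumedBytes);
    });
  });
  server.on("error", reject);

  // Port 0 asks the operating system for any free port.
  server.listen(0, "127.0.0.1", () => {
    const client = net.connect(server.address().port, "127.0.0.1");
    client.on("error", reject);
    const chunk = Buffer.alloc(64 * 1024);
    let sent = 0;

    function send() {
      while (sent < TOTAL_BYTES) {
        sent += chunk.length;
        if (!client.write(chunk)) {
          // The socket's buffer is full: wait before writing more.
          client.once("drain", send);
          return;
        }
      }
      client.end(); // All data written; close the sending side
    }
    send();
  });
});

done.then((bytes) => console.log(`Receiver consumed ${bytes} bytes`));
```

Note that a false return from client.write() only says that Node's own socket buffer is full. Whether that was caused by a closed TCP window or by the local buffer filling faster than the kernel accepts data is not visible from application code.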

Conveyor Belt Analogy: Transport Layer vs. Application Layer

Imagine a conveyor belt system in a factory that connects two rooms:

In the first room, a machine (the producer) places packages onto the conveyor belt.
In the second room, a worker (the consumer) picks up the packages and processes them.
The conveyor belt transports the packages from the machine to the worker.

Now, let’s break down the roles of the transport layer and the application layer in this analogy:

The Transport Layer: Controlling the Flow of Packages

The transport layer is like the conveyor belt itself, controlling the speed and flow of packages between the two rooms. It doesn’t know exactly how the worker in the second room processes the packages, but it ensures that the packages keep moving from the machine to the worker as long as there’s space on the belt.

If the worker is processing packages slowly, the conveyor belt (transport layer) will start to get crowded. At this point, the conveyor belt automatically slows down, ensuring that no more packages are sent into the second room than the worker can handle.

Transport Layer’s Job: Manage the flow of packages (data) between the producer and consumer to avoid overwhelming the worker (consumer) in the second room.
Sliding Window: The transport layer uses something like a “sliding window” to adjust how many packages (data chunks) can be sent at a time. If the worker’s buffer is full, the conveyor belt (transport layer) slows down or stops, preventing overflow.

The Application Layer: Processing the Packages

The application layer is like the worker in the second room, responsible for processing the packages. The worker can only handle so many packages at a time, and if they fall behind, packages start to pile up on the conveyor belt.

In this case, the worker doesn’t send the conveyor belt an explicit signal. They simply stop taking packages off it, the belt backs up, and that is what makes it slow down. Once the worker processes some of the packages, the belt can resume its normal speed.

Application Layer’s Job: Process the packages (data) being delivered, but if it falls behind, it applies backpressure to slow down the producer.
Backpressure in the Application Layer: When the worker (application layer) can’t keep up, it stops picking up packages. The conveyor belt (transport layer) notices only that the packages are no longer being removed and slows down until the worker catches up.

How the Two Work Together

In this analogy:

The transport layer (conveyor belt) ensures smooth, controlled data flow between the rooms.
The application layer (worker) processes the data but has the ability to slow down the flow if it can’t handle more data.
If the worker stops processing packages, the conveyor belt (transport layer) slows down to prevent packages from piling up and overflowing.

Key Takeaway:

The transport layer manages data flow across the network, ensuring that the sender doesn’t overwhelm the receiver. If the application layer (worker) falls behind in processing, it stops reading, the receive buffer fills up, and the transport layer adjusts the data flow rate accordingly. This separation of concerns ensures efficient communication between systems without overloading resources at any layer.

Why the Separation is Important

By separating concerns between the transport layer and application layer, each layer can focus on what it does best:

Transport layer backpressure ensures reliable, efficient data transmission over the network.
Application layer backpressure ensures that the system doesn’t run out of memory or processing power when handling large volumes of data.

This layered approach provides flexibility, scalability, and robustness in managing data flow, especially in systems where both network performance and application-level resource management are critical.

Real-World Applications

Understanding and properly handling backpressure at both the transport and application layers has important real-world implications:

File Transfers: When uploading or downloading large files over the network, both transport layer and application layer backpressure mechanisms ensure that neither the client nor the server is overwhelmed.
Streaming Services: Video and audio streaming services rely on backpressure to avoid buffering issues and to scale efficiently as more clients request data.
API Servers: In high-load environments, API servers can apply application-layer backpressure to handle client requests efficiently without overwhelming the underlying system resources.
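
For the API server case, one simple form of application-layer pushback is to cap the number of requests being processed at once and reject the rest, so that clients slow down or retry later. The sketch below is only an illustration of that idea: the class name, the limit of 2, and the 503 status are assumptions, and a real server would plug this into its HTTP framework.

```javascript
// Hypothetical limiter that caps the number of in-flight requests.
class InFlightLimiter {
  constructor(maxInFlight) {
    this.maxInFlight = maxInFlight;
    this.inFlight = 0;
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryAcquire() {
    if (this.inFlight >= this.maxInFlight) return false;
    this.inFlight++;
    return true;
  }

  // Must be called once for every successful tryAcquire().
  release() {
    if (this.inFlight > 0) this.inFlight--;
  }
}

// Runs `work` only if there is capacity; otherwise answers with 503.
async function withLimit(limiter, work) {
  if (!limiter.tryAcquire()) return { status: 503 };
  try {
    return { status: 200, body: await work() };
  } finally {
    limiter.release();
  }
}

// Usage: three simultaneous requests against a limit of two.
const limiter = new InFlightLimiter(2);
const slowWork = () =>
  new Promise((resolve) => setTimeout(() => resolve("ok"), 10));

Promise.all([1, 2, 3].map(() => withLimit(limiter, slowWork))).then(
  (results) => console.log(results.map((r) => r.status))
);
```

The first two calls take the available slots, so the third is rejected immediately with 503 instead of queuing up and consuming memory.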

Conclusion

Backpressure is a vital concept for managing data flow in modern applications. Both the transport layer and application layer implement backpressure, using separate mechanisms that are coupled through the receiver’s buffer: when the application stops reading, the buffer fills and the transport layer pushes back on the sender. The transport layer focuses on network data transmission, while the application layer focuses on managing system resources like memory and CPU.

By understanding how these two layers handle backpressure and how they interact, developers can build more scalable, efficient, and resilient applications that handle large volumes of data gracefully, whether it’s streaming media, processing API requests, or transferring files.

Uncategorized

The Backbone of Modern JavaScript Development

Post author By admin
Post date May 17, 2023

Introduction

Node.js, the open-source, cross-platform JavaScript runtime built on Chrome’s V8 JavaScript engine, has been a game-changer in the world of web development. Since its inception in 2009, it has revolutionized JavaScript, enabling developers to use the same language for both front-end and back-end development. As we stand today, Node.js remains a dominant force in the JavaScript ecosystem, powering some of the world’s largest companies and applications.

Efficient Performance

One of the key advantages of Node.js is its efficient performance. Built on Google Chrome’s V8 engine, Node.js compiles JavaScript into machine code, leading to faster runtime execution. Moreover, it uses an event-driven, non-blocking I/O model, making it lightweight and efficient. This architecture makes it an excellent choice for data-intensive real-time applications.

NPM: The Robust Package Manager

Node.js ships with npm, the Node Package Manager, whose registry is the world’s largest software registry. With over a million packages, npm provides reusable components that can significantly reduce development time. From utility functions to complex libraries like Express.js, you can find almost anything you need. This vast ecosystem is one of Node.js’s most powerful features, and it continues to grow each day.

Real-time Web Applications

Node.js is an ideal choice for building real-time web applications, such as collaborative tools, chat rooms, and live-streaming platforms. Its event-driven architecture allows it to handle multiple client requests simultaneously, making it perfect for applications that require real-time updates and two-way connections.

Scalability

Scalability is a built-in feature of Node.js. It provides tools to scale your applications both horizontally and vertically. The cluster module, for example, lets you spread the load across multiple CPU cores. These features make Node.js a preferred runtime for high-traffic websites and services.

Full-Stack JavaScript

Perhaps the most influential aspect of Node.js is its introduction of full-stack JavaScript development. By allowing JavaScript to run on the server-side, developers can now write both the front-end and back-end of web applications in the same language. This not only reduces context-switching but also makes team collaboration easier as everyone is speaking the same language.

Community Support

Node.js has an active and vibrant community of developers who continually contribute to its development and improvement. This ensures that Node.js is regularly updated with the latest features and security patches. It also means that if you encounter a problem, there’s a good chance someone else has already found a solution.

Conclusion

Node.js has undeniably left an indelible mark on the field of web development. Its performance, scalability, and vast package ecosystem make it a robust platform for building a wide variety of applications. While newer technologies may come and go, the impact of Node.js is a testament to its design and capabilities. It continues to be an excellent choice for many developers and businesses and will likely remain a cornerstone of web development for years to come.

Graphic Design

Create your own logo for free

Post author By admin
Post date February 3, 2022

I want to share this story so you can be as happy as I was when I discovered this tool.

I was making a favicon for my blog and I went to an online logo generator.

I visited favicon.io and I wasn’t expecting much. I thought I was just going to create a boring favicon with my initials.

Then I saw that I could choose the shape of the logo, the color of the text, and the background.

I was happy.

Then I discovered that I could also change the font with Google Fonts.

I was happier!

But as soon as I downloaded the zip file with the different formats, I saw that one file was a very sharp 512 x 512 PNG:

I had discovered my logo, folks!

You can’t imagine how happy I was, because I had always wanted my own logo and I didn’t have the time to go through a logo discovery process with a designer.

And of course, I was also happy because I was saving money. I don’t know how much a designer would charge on Upwork, but for sure it isn’t a penny.

So with all this, I’m considering the possibility of creating a similar logo for my portfolio.

Software Development

Don’t underestimate the power of a bash script

Post author By Ivan Derlich
Post date February 3, 2022

A while ago, I was working on the front end for a company.

I was building a standalone app that worked with magic links. Those links were associated with a token. A token expired at the end of the happy path. Each time I had to test the app end-to-end I did it through the happy path.

At the end of the happy path, the token expired and the magic link became invalid. I couldn’t make that token valid again because I didn’t have access to the back-end, so to test the system again, I had to generate another token to have a new magic link.

This was tiresome.

So I took time off the issue I was working on to create a mock server on Postman. In that server, I created an “infinite magic link”, which was a fixed endpoint that responded OK to a fixed request body. That way I could test the happy path without the token expiring at the end.

I was very happy and I’m happy each time I remember this.

Once I started feeling the power of the mock server, I started creating different mock responses for different types of magic links.

Suddenly I could test a dozen test cases outside the happy path.

I needed to document all those magic links somewhere.

So instead of creating a boring Google Doc or something like that, I created a bash script that opened the browser at the specific magic link I wanted to work with.

The bash script was simple: given a flag, it started the development server with ‘npm start’ and opened the browser at http://localhost:3000 with the magic link after the port number.

I was happy again.

But the story doesn’t end there: the project had the problem that the deployment server was changing its URI all the time, and before each deployment, we had to run tests, check the Node version, and make a build.

This was also tiresome.

So I decided to incorporate all those steps into the previous script. And each time I deployed, I could see the app deployed locally and in the staging server with a magic link that never expired.

I was engulfed with joy at that time.

My colleagues were reluctant at first about me spending time on the script, but they ended up using the script for their own deployments.

The moral of the story is that if you find yourself repeating bash commands and opening certain URIs for end-to-end testing, don’t hesitate to consider writing a bash script.

Uncategorized

Zendesk Platform Issue

DISCLAIMER: I am publishing this article because I haven’t found a way to report this publicly, whether by creating an issue in a GitHub repository or in any other way. I need to share this issue with my company because a communication problem is hurting my relationship with them, and if I report it privately to Zendesk I have no way to show this. I have no intention to degrade Zendesk in any way, shape, or form. I think Zendesk provides a lot of value to the developer community and the community in general.

I log in with a Google account:

I choose ivanderlich@gmail.com:

I access the panel. (I have changed my profile picture to a “forbidden” image so that other people avoid this account.)
I click on “My activities” to see all the tickets I have:

Unexpectedly the platform doesn’t want me to see the messages until I confirm my email.
I comply and click “Click here to verify your email”:

Here is something interesting:
My Google account is ivanderlich@gmail.com
The Email field is ivanderlich.com
But it’s sending verification emails to e@ivanderlich.com
I click on “Click here to resend verification email”:

It shows me this message:

I go to my email inbox:
(You can see that I’ve tried to do this many times, failing each time.)

I open the email:

I go to the link and it shows me this:

And then, suddenly and unexpectedly I get this message:

And then the cycle repeats, telling me that I can’t see my activities because I have an unconfirmed email.