Deploying with Docker, how is it done?

Should you pull from Github and build a Docker image on the production server? Or should you push the image to the container registry at the same time you push to Github?

And btw, how do you automate all this?! Do you poll every x seconds/minutes on the production server and check for changes? That doesn't seem efficient.

Surely there must be a more elegant way to deploy Docker applications 🤔.

Spoiler alert: Yes, there is!

There are several ways to automate Docker deployments. Today you're going to learn a simple and straightforward approach.

You don't need to be an experienced sysadmin/DevOps person to follow along. If you're a frontend/backend person and new to servers, this tutorial is for you.

By the end of this tutorial, your application will be automatically deployed on every push to the master branch, with no manual steps involved. If you have tests, those will run as well, and if any of them fail, the deployment won't proceed.

We won't be using expensive or complicated infrastructure. Therefore, this approach works great for hobby projects and small-scale applications.

Goals
We're going to have automated deployments based on the master branch. We'll automate all the steps between pushing your code to the repository and deploying the updated version of your application.

This will make sure the code on the master branch is the same code that's running on the production server, at all times.

On each commit to the master branch, the following will happen:

  • Trigger a build in the CI provider
  • Run tests, if any, and proceed if all tests pass
  • Build and tag a Docker image
  • Push image to the container registry
  • Pull the image from the registry on the production server
  • Stop the current container and start a new one from the latest image

Overview
A high-level overview of the steps we're going to take:

  1. Configure the CI/CD provider
  2. Write a deploy script that will:
    • Build and upload a Docker image to the container registry
    • Deploy image on the production server via remote SSH

In my examples, I'm going to use the following services:

  • CircleCI as CI/CD provider
  • Docker Hub as the container registry

Feel free to use whatever you're using already. It shouldn't be a problem to follow along. I'll explain the general concepts so that you can apply this to your setup.

If you're missing a service, I'll link to resources on how to get started with each one of them.

Requirements
To be able to follow along, there are some things you'll need:

  • A containerised application. If you're using NodeJS, I wrote an article on how to build a Docker image with NodeJS
  • A server with SSH access and basic shell knowledge
  • Experience with running containers in Docker

With that out of the way, let's get started!

Continuous Integration and Continuous Deployment

What we're going to accomplish today is called Continuous Deployment (CD), and is usually coupled with Continuous Integration (CI) — automated testing. CI precedes CD in the automation pipeline to make sure broken code doesn't make it into production.

Therefore, it's sensible to have at least a basic test suite that makes sure the application starts and the main features work correctly before implementing automated deployments. Otherwise, you could quickly break production by pushing code that doesn't compile or has a major bug.

If you're working on a non-critical application, such as a hobby project, then you can implement automated deployments without a test suite.

Configure the CI/CD provider

Getting started with a CI/CD provider

If you already have a CI/CD provider connected to your repository, then you can head over to the next section.

CI/CD providers (or CI providers) sit between your code repository and your production server. They are the middlemen doing all the heavy lifting of building your application, running tests and deploying to production. You can even run cron jobs on them and do things that are not part of the CI or CD pipeline.

The most important thing to know is that a CI provider gives you configurable, short-lived servers you can use. You pay for how long you're using one or multiple servers in parallel.

If you're not using a CI provider, I recommend starting with GitHub Actions. It's built into GitHub, so it's easy to get started with, and it has a very generous free plan. Other popular providers are CircleCI and TravisCI. Since I'm more familiar with CircleCI, I'll be using it in my examples.

Configure the CI provider

We want the CI provider to run on each commit to the master branch. The provider should build our application, run tests, and if all tests have passed, execute our deploy script.

The configuration differs between providers, but the general approach is similar. You want to have a job triggered by a commit to the master branch, build the application and run the test suite, and as the last step, execute the deploy script.

In CircleCI, there are jobs and workflows. Jobs are a series of steps run on the server. A workflow runs and coordinates several jobs in parallel and/or in sequence. In jobs, you specify how to do something, and workflows describe when those jobs should run.

I've added a deploy job that runs after the build-and-test job. It checks out the code and runs the deploy script. We'll get to the internals of the script in the next section, but for now, you can add a simple hello world in a file named deploy.sh sitting at the root of your project. This will allow us to test if the job runs properly.
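
Something as small as this sketch is enough for now:

    #!/bin/sh

    # Placeholder used only to verify that the CI job executes the script
    echo "Hello world from deploy.sh"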

CircleCI looks at a configuration file in the following path: .circleci/config.yml. Let's add it with the following contents:
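
The original file isn't reproduced here, so below is a minimal sketch of what it could look like. The job names (build-and-test, deploy) and the npm commands are assumptions based on the description that follows; adapt them to your project.

    version: 2.1

    jobs:
      build-and-test:
        docker:
          - image: circleci/node:12.15.0-stretch
        steps:
          - checkout
          - run: npm ci
          - run: npm test

      deploy:
        docker:
          - image: circleci/node:12.15.0-stretch
        steps:
          - checkout
          # Gives the job access to a Docker engine for building and pushing images
          - setup_remote_docker
          - run: bash ./deploy.sh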

The build-and-test job describes a common way of installing dependencies and running tests in a NodeJS project. If you want to skip tests, you can remove the test command.

With circleci/node:12.15.0-stretch we specify which server image the CI provider should use to run our commands. I'm using node:12.15.0-stretch in my Dockerfile, so this image mimics the production environment. It's a CircleCI-specific image that adds a few commonly used CI/CD utilities, such as git and docker.

Let's add the workflow that coordinates when the jobs should run. We'll append the following section to .circleci/config.yml:
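
A sketch of such a workflow, assuming the job names from the previous snippet, could look like this:

    workflows:
      build-test-deploy:
        jobs:
          - build-and-test
          - deploy:
              requires:
                - build-and-test
              filters:
                branches:
                  only: master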

The tests will run on all branches/PRs, but we'll only deploy on the master branch.

Deploy script

Once you've confirmed that the CI provider runs the deploy script on each commit to master after all the tests have passed, we can move on to the deployment section.

Getting started with a container registry

In the deploy script, we'll use a container registry to push the image so we can pull it from the production server.

A container registry is for containers what Github is for repositories and NPM is for NodeJS modules. It's a central place to store and manage container images.

If you're new to the Docker ecosystem, the easiest option is to use the Docker Hub container registry. It's free for public repositories, and you get one free private repository.

The Docker CLI uses Docker Hub as the default container registry. Therefore, it will work out of the box.

Build a Docker image and push to the container registry

The first thing we'll do in the deploy script is to build a new Docker image of the application. We give the image a name and a unique tag. A good way to generate a unique tag is to use the git hash of the latest commit. We also tag the image with the latest tag.

The image name should follow this format: [registry/]username/repository. It has to match the username and repository name of the container registry you're going to push the image to in the next step. If you're using Docker Hub, that's the default registry, so you don't have to include it in the image name.

Let's replace the hello world example in deploy.sh with the following:
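
Here's a sketch of what that could look like; the image name my-username/my-app is a placeholder for your own registry username and repository:

    # Image name: must match your registry username and repository
    IMAGE_NAME="my-username/my-app"

    # Unique tag derived from the git hash of the latest commit
    COMMIT_HASH=$(git rev-parse --short HEAD)

    # Build the image and tag it with both the commit hash and "latest"
    docker build -t "$IMAGE_NAME:$COMMIT_HASH" -t "$IMAGE_NAME:latest" .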

Next up, we want to upload the image to the container registry. We authenticate first using docker login. If you're using a different registry, you pass it as an argument (e.g. docker login my-registry).

We provide the username and password through environment variables set in the CI provider's dashboard. This is a safe way to work with credentials in CI/CD pipelines because they will be hidden in the output logs, and we don't have to commit them as code.

We append this to the deploy.sh file:
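
A sketch, assuming the credentials are stored in environment variables named DOCKER_USERNAME and DOCKER_PASSWORD:

    # Log in to the container registry (Docker Hub by default)
    echo "$DOCKER_PASSWORD" | docker login --username "$DOCKER_USERNAME" --password-stdin

    # Upload both tags to the registry
    docker push "$IMAGE_NAME:$COMMIT_HASH"
    docker push "$IMAGE_NAME:latest"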

The --password-stdin flag lets us provide the password to Docker CLI in a non-interactive/manual way. It also prevents the password from appearing in the shell's history or log files. In a CI environment, this is not an issue because the server environment is thrown away after the job finishes. However, I've included it anyway since people tend to copy/paste code in all sorts of places 🤷🏼‍♂️.

Deploy the image to production server via remote SSH

We have the new image pushed to the container registry, and we're ready to deploy it on the production server. We'll do that by executing several commands remotely through the SSH agent.

Authenticating with the SSH agent

Before we get to the deploy commands, we first need to make sure the SSH agent has access to the production server and works without manual interference.

With CircleCI, there are two ways you can add a private key to the CI server: through environment variables, or using a job step unique to CircleCI. I'm going to use an environment variable so you can take the same steps with your own CI provider. It also makes it easier to switch providers, because you're not relying on provider-specific configuration.

To make it easier to store a multiline SSH key in an environment variable, we'll encode it into a base64 string. Assuming your private key is stored at ~/.ssh/id_rsa, you can do this with:
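
For example (the -w 0 flag is GNU-specific and disables line wrapping; macOS's base64 doesn't wrap by default and doesn't need it):

    base64 -w 0 ~/.ssh/id_rsa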

You should see a long string output:

Save this as an environment variable in the dashboard of your CI provider. Remember, the SSH key shouldn't have a passphrase. Otherwise, the CI job will require manual input and will break the automation.

In the deploy script, we'll decode it and save it to a file. We also change the file permission to be more strict because the SSH agent won't accept private keys with loose permissions. In code, it looks like this:
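
A sketch, assuming the variable is named SSH_PRIVATE_KEY in the CI dashboard:

    # Recreate the private key from the environment variable
    mkdir -p ~/.ssh
    echo "$SSH_PRIVATE_KEY" | base64 -d > ~/.ssh/id_rsa

    # The SSH agent rejects private keys with loose permissions
    chmod 600 ~/.ssh/id_rsa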

When the SSH agent tries to connect to a server it hasn't seen before, it asks if you trust the server and want to remember it in the future. This feature prevents man-in-the-middle attacks by confirming the server is who it claims to be.

Let's automate this manual step by adding the server's public key to ~/.ssh/known_hosts in the CI server. If you have used SSH before to connect to the production server, you'll find the public key stored in the same location on your laptop.

We'll use the same technique of encoding to base64:
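
One way to do that is with ssh-keyscan, which fetches the server's host keys directly (you could equally grep the matching line out of ~/.ssh/known_hosts on your laptop):

    ssh-keyscan [IP address] | base64 -w 0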

Replace [IP address] with the IP address of the production server, and you should get a similar string output as before. Add it as an environment variable in your CI provider.

Let's add the following to the script:
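
A sketch, assuming the variable is named SERVER_PUBLIC_KEY:

    # Add the server's host key so SSH connects without prompting
    echo "$SERVER_PUBLIC_KEY" | base64 -d >> ~/.ssh/known_hosts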

Run deploy commands

Finally, we execute several deploy commands remotely through SSH.

We pull the image from the container registry first. If the repository is private, you'll have to authenticate with docker login in the production server before you can pull the image.

Then, we stop and remove the currently running container. docker restart won't work here since it will stop and restart the same container. We want to start another container based on the new image we just downloaded.

Next, we start a container based on the new image with the relevant flags added to the docker run command. Adjust this as you see fit for your project.

Lastly, we clean up unused Docker objects to free space on the server. Docker is notorious for quickly taking up a lot of space.

Here's the last addition to the script:
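
A sketch of those remote commands; SSH_USER, SERVER_IP, the container name my-app, and the published port mapping are placeholders to adjust for your setup:

    # Variables inside the heredoc are expanded locally before being sent,
    # so $IMAGE_NAME and $COMMIT_HASH come from earlier in the script
    ssh "$SSH_USER@$SERVER_IP" /bin/bash << EOF
      docker pull $IMAGE_NAME:$COMMIT_HASH
      docker stop my-app || true
      docker rm my-app || true
      docker run --detach --name my-app --publish 80:3000 $IMAGE_NAME:$COMMIT_HASH
      docker system prune --force
    EOF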

Final script

The final deploy.sh script looks like this:
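
Pieced together from the sketches above (same placeholder names and environment variables), it could look roughly like this:

    #!/bin/sh
    set -e

    IMAGE_NAME="my-username/my-app"
    COMMIT_HASH=$(git rev-parse --short HEAD)

    # Build and tag the image
    docker build -t "$IMAGE_NAME:$COMMIT_HASH" -t "$IMAGE_NAME:latest" .

    # Push it to the container registry
    echo "$DOCKER_PASSWORD" | docker login --username "$DOCKER_USERNAME" --password-stdin
    docker push "$IMAGE_NAME:$COMMIT_HASH"
    docker push "$IMAGE_NAME:latest"

    # Give the SSH agent access to the production server
    mkdir -p ~/.ssh
    echo "$SSH_PRIVATE_KEY" | base64 -d > ~/.ssh/id_rsa
    chmod 600 ~/.ssh/id_rsa
    echo "$SERVER_PUBLIC_KEY" | base64 -d >> ~/.ssh/known_hosts

    # Deploy the new image on the production server
    ssh "$SSH_USER@$SERVER_IP" /bin/bash << EOF
      docker pull $IMAGE_NAME:$COMMIT_HASH
      docker stop my-app || true
      docker rm my-app || true
      docker run --detach --name my-app --publish 80:3000 $IMAGE_NAME:$COMMIT_HASH
      docker system prune --force
    EOF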

I've added set -e at the top of the file to stop script execution at the first command that returns with an error. Since we're running commands in a sequence, we'll run into weird errors if the script continues.

Final words

If you've got this far without hiccups — Congratulations 🎉!

More realistically though, you've probably faced some issues along the way or were confused at some point. I always find it helpful to see a fully finished and working example. I made an example project based on this article. You can use it as a guideline. You can also reach out to me on Twitter if you're stuck.

I'd also love to hear from you what you've accomplished!

As a developer, you have probably heard of Docker at some point in your professional life. And you're likely aware that it has become important tech for any application developer to know.

If you have no idea of what I'm talking about, no worries – that's what this article is for.

We'll go on a journey to discover what this Docker everyone is talking about actually is and what you can do with it. By the end, we'll also create, publish, and run our first Docker image.

But first, let's lay the foundation for our story. I'll be using this amazing article by Rani Osnat that explains the whole history of containers in more depth. And I'll summarize it here so we can focus on the important parts.

A Little Bit of Container History

Docker is a container runtime. A lot of people think that Docker was the first of its kind, but this is not true: container technology on Unix-like systems has existed since the 1970s.

Docker is important to both the development community and container community because it made using containers so easy that everyone started doing it.

What are containers?

Containers, or Linux Containers, are a technology that allows us to isolate certain kernel processes and trick them into thinking they're the only ones running in a completely new computer.

Unlike virtual machines, containers can share the kernel of the operating system, while each loads only its own binaries and libraries.

In other words, you don't need to have a whole different OS (called the guest OS) installed inside your host OS. You can have several containers running within a single OS without installing several different guest OSes.

This makes containers much smaller, faster, and more efficient. While a VM can take about a minute to spin up and can weigh several gigabytes, even the biggest containers weigh only around 400 to 600 MB.

They also take only seconds to spin up. This is mostly because they don't have to spin up a whole operating system before running the process.

And this all began with six characters.

The beginning of containers

The history of containers begins in 1979 with Unix v7. At that time, I wasn't even born, and my father was 15 years old. Did containers already exist in 1979? No!

In 1979, Unix version 7 introduced a system call called chroot, which was the very beginning of what we know today as process virtualization.

The chroot call allowed the kernel to change the apparent root directory of a process and its children.

In short, the process thinks it's running alone in the machine, because its file system is segregated from all other processes. This same syscall was introduced in BSD in 1982. But it was only two decades later when we had the first widespread application of it.

In 2000, a hosting provider was searching for better ways to manage their customers' websites, since they were all installed in the same machine and competed for the same resources.

The solution they arrived at was called jails, and it was one of the first real attempts to isolate things at the process level. Jails allowed any FreeBSD user to partition the system into several independent, smaller systems (called jails). Each jail can have its own IP configuration and system configuration.

Jails were the first solution to expand the uses of chroot to allow not only segregation at the filesystem level, but also virtualization of users, the network, subsystems, and so on.

In 2008, LXC (LinuX Containers) was launched. It was, at the time, the first and most complete implementation of a container management system. It used control groups, namespaces, and much of what had been built up to that point. Its greatest advancement was that it ran straight on a vanilla Linux kernel and didn't require any patches.

Docker

Finally, in 2010, Solomon Hykes and Sebastien Pahl created Docker as part of the Y Combinator startup incubator program. In 2011, the platform was launched.

Originally, Hykes started the Docker project in France as part of an internal project within dotCloud, a PaaS company that was shut down in 2016.

Docker didn't add much to the container runtimes of the time; its greatest contribution to the container ecosystem was awareness. Its easy-to-use CLI and concepts democratized the use of containers for everyday developers, not just the deeply specialized companies that needed containers for some reason.

After 2013, several companies started adopting Docker as their default container runtime because it standardized the use of containers worldwide. In 2013, Red Hat announced a Docker collaboration; in 2014, Microsoft, AWS, Stratoscale, and IBM followed.

In 2016, the first version of Docker for a different OS than Linux was announced. Windocks released a port of Docker's OSS project designed to run on Windows. And, by the end of the same year, Microsoft announced that Docker was now natively supported on Windows through Hyper-V.

In 2019, Microsoft announced WSL 2, which made it possible for Docker to run on Windows without needing a virtualized machine on Hyper-V. Docker is now natively multiplatform while still leveraging Linux's container approach.

Finally, in 2020, Docker became the worldwide choice for containers. This happened not necessarily because it's better than others, but because it unifies all the implementations under a single easy-to-use platform with a CLI and a Daemon. And it does all of this while using simple concepts that we'll explore in the next sections.

How Does Docker Work?

Docker packages an application and all its dependencies in a virtual container that can run on any Linux server. That's why we call them containers: they have all the necessary dependencies contained in a single piece of software.

Docker is composed of the following elements:

  • a Daemon, which is used to build, run, and manage the containers
  • a high-level API which allows the user to communicate with the Daemon,
  • and a CLI, the interface we use to make this all available.

Docker Containers

Containers are abstractions of the app layer. They package all the code, libraries, and dependencies together. This makes it possible for multiple containers to run in the same host, so you can use that host's resources more efficiently.

Each container runs as an isolated process in the user space and takes up less space than a regular VM thanks to its layered architecture.

These layers are called intermediate images, and they are created every time you run a new command in the Dockerfile. For instance, if you have a Dockerfile like this:
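
The Dockerfile itself isn't shown here, but a hypothetical Node.js one could look roughly like this:

    # Base image: every following instruction adds a layer on top of it
    FROM node:stable
    # Copy the application files into the image
    COPY . /usr/src/app
    WORKDIR /usr/src/app
    # Install dependencies; cached as its own layer
    RUN npm install
    # Default command when a container starts from this image
    CMD ["npm", "start"]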

At each command like COPY or RUN, you create another layer on top of your container image. This allows Docker to split each command into a separate piece. So if you use this node:stable base image again later, Docker won't need to pull all of its layers, because you have already downloaded them.

Also, all layers are hashed, which means Docker can cache those layers and optimize build times for layers that didn't change across builds. You won't need to rebuild and re-copy all the files if the COPY step hasn't changed, which greatly reduces the amount of time spent in build processes.

At the end of the build process, Docker creates a new empty layer on top of all the others called the thin writable layer. This is the layer you access when using docker exec -it <container>. This way you can perform interactive changes in the image and commit them using docker commit, just like you'd do with a Git-tracked file.

This hash-diffed layer architecture is possible because of the AuFS file system. This is a layered FS that allows files and directories to be stacked as layers one upon another.

AuFS poses some problems when dealing with DinD (Docker in Docker), but that's a subject for another article! You can check out a deeper explanation in this article.

Layers can be hash-diffed among versions. This way Docker can check if a layer has changed when building an image and decide whether to rebuild it, saving a lot of time.

So, if you already have the Ubuntu image downloaded on your computer, and you're building a new image which relies on one or more layers of that image, Docker won't build them again. It'll just reuse the same layers.

If you want to dig deeper into layers, this article gives a lot of detail on how to find, list, and manage them.

Why Docker containers are great

You have probably heard the iconic phrase 'It works on my machine'. Well, why don't we give that machine to the customer?

That's exactly the problem Docker and containers solve in general. A Docker container is a packaged collection of all the app's libraries and dependencies already prebuilt and ready to be executed.

A lot of companies have migrated over from VMs to containers not only because they're much lighter and faster to spin up, but also because they're extremely easy to maintain.

A single container can be versioned using its Dockerfile (we'll get to images in the next section), which makes it quite easy for one developer (or even a small team of developers) to run and maintain a whole ecosystem of containers. On the other hand, you'd need an infrastructure person just to be able to run and housekeep VMs.

Does this mean we don't need VMs anymore? No, on the contrary: VMs are still very much needed if you want a whole operating system for each customer, or if you simply need the whole environment as a sandbox. VMs are usually used as a middle layer when you have a big server rack and several customers who'll be using it.

The ease of use and maintainability leads us to another important aspect of why containers are so great: it's way cheaper for a company to use containers than fully fledged VMs.

This is not because the infrastructure or hardware is cheaper, but because you need fewer people to housekeep containers, which means you can better organize your team to focus on the product instead of focusing on housekeeping.

Still related to savings, a single medium-sized VM can run about 3 to 8 containers. It depends on how many resources your containers use and how much of the underlying OS it needs to boot before running the whole application.

Some languages, like Go, allow you to build an image with only the compiled binary and nothing else. This means the Docker container will have much less to load and therefore will use fewer resources. This way you can spin up more containers per VM and use your hardware more efficiently.

Since containers are made to be ephemeral, this means all data inside them is lost when the container is deleted. This is great, because we can use containers for burstable tasks like CI.

The use of containers has brought a whole new level of DevOps advancements. Now you can simply spin up lots of containers, each one doing one small step of your deployment pipeline, and then just kill them without worrying if you've left something behind.

The stateless nature of containers makes them the perfect tool for fast workloads.

Now that we've seen how containers are awesome, let's understand how we can build one of them!

What are Docker Images?

Docker images are built from instructions written in a special file called a Dockerfile. This file has its own syntax and defines the steps Docker will take to build your container.

Since containers are only layers upon layers of changes, each new command you create in a Docker image will create a new layer in the container.

The last layer is what we call the thin writable layer: an empty layer that can be changed by the user and committed using the docker commit command.

This is an example of a simple image for a Node.js application:
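
The original snippet isn't included here; a minimal sketch (the tag, paths, and port are illustrative) might be:

    FROM node:lts
    WORKDIR /usr/src/app
    COPY . .
    RUN npm install
    EXPOSE 3000
    CMD ["npm", "start"]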

In this simple example, we're creating a new image. All images are based either on an existing image or on a scratch image (which I explain in my blog articles, in Portuguese).

These images are downloaded from a Container Registry, a repository for storing images of containers. The most common of them is the Docker Hub, but you can also create a private one using cloud solutions like Azure Container Registry.

When you run docker build . in the same directory as the Dockerfile, the Docker daemon will start building the image and packaging it so you can use it. Then you can run docker run <image name> to start a new container.

Notice that we expose certain ports in the Dockerfile. Docker allows us to separate networks within our own OS, which means you can map ports from your computer to the container and vice-versa. Also, you can execute commands inside containers with docker exec.

Let's put this knowledge into practice.

How to Deploy your Dockerized Application

This will be a simple and easy walkthrough on how to create a basic Docker image using a Node.js server and make it run on your computer.

First, start a new project in a directory of your choosing, and run npm init -y to create a new package.json file. Now let's create another directory called src. In this directory we'll create a new file called server.js.
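
The server code isn't reproduced here; a minimal sketch using only Node's built-in http module could be (port 8089 is an assumption, chosen to match the port mapping used later in this walkthrough):

    // src/server.js
    const http = require('http');

    const server = http.createServer((req, res) => {
      res.end('Hello from inside a Docker container!');
    });

    // Port 8089 is assumed; it's the container port we'll map later
    server.listen(8089, () => console.log('Listening on port 8089'));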

Now, in your package.json file, change the main key to src/server.js. Also, delete the test script that was created and replace it with "start": "node src/server.js". Your file should look like this:
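
Roughly like this (the name and version come from npm init defaults and may differ on your machine):

    {
      "name": "simple-node-image",
      "version": "1.0.0",
      "main": "src/server.js",
      "scripts": {
        "start": "node src/server.js"
      }
    }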

Now, create a file called Dockerfile (no extension). Let's write our image!
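
A sketch that matches the explanation below; the node tag and the exposed port are assumptions:

    # 1. Base image pulled from Docker Hub
    FROM node:lts
    # 2. Copy everything from the current directory into the image
    COPY . /usr/src/app
    # 3. Run subsequent commands from the application's root directory
    WORKDIR /usr/src/app
    # 4. Expose the port the server listens on
    EXPOSE 8089
    # 5. Command executed when the container starts
    CMD ["npm", "start"]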

Let's explain this:

  1. First, we're getting the node image from Docker Hub. Since images are saved by their names, we differentiate them by their tags. You can check all the available tags on Docker Hub.
  2. Next, we use COPY to copy all files in the current dir (using .) to a new dir in the container called /usr/src/app. The directory is created automatically. This is necessary because we need all our application files in there.
  3. Now we change our start directory to the /usr/src/app directory, so we can run things from the root directory of our application.
  4. We expose our port,
  5. And we say that, as soon as our container runs, we'll execute 'npm start'.

Let's build the image by running docker build . -t simple-node-image. This way we'll tag our image and give it a name.

You'll see that it's going to create and download the image, along with all the necessary layers. Let's run this image with the following command:
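
For example (the container name is arbitrary):

    docker run -d --name simple-node-app -p 80:8089 simple-node-image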

We're mapping our port 80 to the port 8089 inside the container. We can check that by typing docker ps like this:

Now try to access localhost:80, and see what happens:

What is Docker Used For?

Now that we've seen how to build a Docker container, let's jump into some practical uses of Docker and how you can get the most out of it.

Ephemeral databases

Have you ever tried to develop an application that requires a database to run? Or worse, tried to run someone else's application that needs a database that you don't have installed?

The old solution was to install the database first, then run the application. With Docker you just need to run the database container. Let's run a simple MongoDB container:
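
For example (the container name is a placeholder):

    # Start MongoDB in the background and publish its default port
    docker run -d --name my-mongo -p 27017:27017 mongo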

That's it! Now you can access your database from your computer through port 27017, just like you'd do normally.

Persistent databases

The problem with the previous example is that, if you remove the container, all your data will be lost. So, what happens if you want to run a local database without needing to install it, but keep the data after it's deleted? You can bind Docker to a volume!

When you bind Docker to a local volume, you're essentially mounting your filesystem into the container or vice-versa. Let's see:
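
A sketch using the paths and container name referenced below:

    # Persist MongoDB's data directory on the host filesystem
    docker run -d --name my-persistent-db -p 27017:27017 \
      -v /home/my/path/to/db:/data/db mongo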

In this command, we're mounting the host directory /home/my/path/to/db onto /data/db inside the container. Now, if we run docker stop my-persistent-db and docker rm my-persistent-db, all our data will still be stored there.

Later, if we need the database again, we can mount it using the same command, and all the data will be back.

One-use tools

Another thing that all devs do: we install applications that we only use once. For example, that simple CLI to access that old database, or that simple GUI to some CI server.

Many tools already have Docker containers, and you can use them like this, so you don't have to install yet another tool in your notebook.

The best example is Redis. Its command-line client, redis-cli, can run in a separate container, so you don't need to install redis-cli in your shell if you hardly ever use it.

Let's imagine you spin up a Redis instance with docker run -d --name redis redis --bind 127.0.0.1 bound to the localhost interface. You can access it through another container using:
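
The exact command isn't shown here; one way to do it is to share the server container's network namespace, so that 127.0.0.1 points to the same interface Redis is bound to:

    docker run --rm -it --network container:redis redis redis-cli -h 127.0.0.1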

The --rm flag tells Docker that it should remove the container as soon as it's stopped, while the -it flags tell it we want an interactive session (with a shell) and we'll need a TTY.

Run entire stacks

If you need to test an app that relies on another app, how would you do it? Docker makes it easy by providing docker-compose. It's another tool in your toolbox that allows you to code a docker-compose.yml file which describes your environment.

The file looks like this:
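
A sketch matching the description below; the build path is an assumption:

    version: "3"
    services:
      web:
        # Path to a directory containing the web app's Dockerfile
        build: .
        ports:
          - "5000:5000"
      redis:
        image: redis
        ports:
          - "6379:6379"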

As you can see, we're defining two services. One is called web and is built from the path given in its build key, which points to a directory containing a Dockerfile.

After that, it exposes port 5000 both in the host and in the container. The other service is redis, which pulls and runs the redis image on port 6379.

The best part is that the network layer is shared; in other words, you can access redis from the web service simply by using the hostname redis and the port.

You can start this file with a simple docker-compose up, and see the magic happening.

Conclusion

That's it! This is the history of Docker, how it came to be, and how it works in 3000 words. I hope you liked it, and I hope this has made your advancement with Docker a bit easier.

As you've seen, most uses of Docker are about making life easier for devs when developing applications. But there are many other uses, such as infrastructure layers and making the housekeeping of your apps a lot easier.

If you ever want to reach out to me, just ping me on any of my social networks on my website.

Cover photo by Georg Wolf on Unsplash




