[vagrant@centos ~]$ curl -v https://puppet:8140/puppet-ca/v1/certificate/ca -k
* About to connect() to puppet port 8140 (#0)
*   Trying 192.168.50.100...
* Connected to puppet (192.168.50.100) port 8140 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* NSS: client certificate not found (nickname not specified)
on puppet:
[root@puppet ~]# puppetserver ca list
Requested Certificates:
    centos.example.com   (SHA256)  42:79:9E:79:A7:17:92:88:4C:E9:DE:A2:75:F0:03:5E:6A:7C:C4:D0:6B:AF:0E:F4:69:38:B8:9D:0F:E5:AE:5E
[root@puppet ~]# puppetserver ca sign --cert centos.example.com
Successfully signed certificate request for centos.example.com
[root@puppet environments]# ls
production  sandbox
[root@puppet environments]# cd sandbox/
[root@puppet sandbox]# ls
data  environment.conf  hiera.yaml  manifests  modules
[root@puppet sandbox]# cd manifests/
[root@puppet manifests]# ls
site.pp
[root@puppet manifests]# vi site.pp
[root@puppet manifests]# puppet agent -t --environment sandbox
Info: Using configured environment 'sandbox'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Caching catalog for puppet.example.com
Info: Applying configuration version '1593430390'
Notice: This is the sandbox environment
Notice: /Stage[main]/Main/Node[default]/Notify[This is the sandbox environment]/message: defined 'message' as 'This is the sandbox environment'
Notice: Applied catalog in 0.01 seconds
[root@puppet manifests]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Caching catalog for puppet.example.com
Info: Applying configuration version '1593430423'
Notice: This is production, only approved code is permitted
Notice: /Stage[main]/Main/Node[default]/Notify[This is production, only approved code is permitted]/message: defined 'message' as 'This is production, only approved code is permitted'
Notice: Applied catalog in 0.01 seconds
[root@puppet manifests]# vi /etc/puppetlabs/puppet/puppet.conf
add:
[agent]
environment=sandbox
[root@puppet manifests]# puppet agent -t
Info: Using configured environment 'sandbox'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Caching catalog for puppet.example.com
Info: Applying configuration version '1593430508'
Notice: This is the sandbox environment
Notice: /Stage[main]/Main/Node[default]/Notify[This is the sandbox environment]/message: defined 'message' as 'This is the sandbox environment'
Notice: Applied catalog in 0.01 seconds
$disclaimer = @(END)
-----------------------------------------------------------
This system is for the use of authorized users only.
Individuals using this computer system without authority, or
in excess of their authority, are subject to having all of
their activities on the system monitored and recorded. This
information will be shared with law enforcement should any
wrongdoing be suspected.
-----------------------------------------------------------
END
Each has a type (service) and a name or title (puppet). Each resource is unique within a manifest, and can be referenced by the combination of its type and title, such as Service['puppet']. A resource comprises a list of zero or more attributes. An attribute is a key-value pair, such as enable => false.
Puppet differentiates between two kinds of attributes: parameters and properties. Parameters describe the way that Puppet should deal with a resource type. Properties describe a specific setting of a resource. Certain parameters are available for all resource types (metaparameters), and some names are just very common, such as ensure. The service type supports the ensure property, which represents the status of the managed process. Its enable property, on the other hand, relates to the system boot configuration (with respect to the service in question).
The provider parameter tells Puppet that it needs to interact with the upstart subsystem to control its background service.
The difference between parameters and properties is that a parameter merely indicates how Puppet should manage the resource, not what its desired state is. Puppet will only take action on property values. In this example, these are ensure => 'stopped' and enable => false. For each such property, Puppet will perform the following tasks:
Test whether the resource is already in sync with the target state
If the resource is not in sync, it will trigger a sync action
Properties can be out of sync, whereas parameters cannot.
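Tying this together, a minimal sketch of such a resource (the provider value is illustrative and only relevant on systems that actually use upstart):

service { 'puppet':
  ensure   => 'stopped',   # property: desired state of the running process
  enable   => false,       # property: whether the service starts at boot
  provider => 'upstart',   # parameter: how Puppet should manage the resource
}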
Dry testing your manifest
puppet apply puppet_service.pp --noop
Using variables
Any variable name is always prefixed with the $ sign:
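For example (the variable name and value are illustrative):

$role = 'webserver'

notify { "This node is a ${role}": }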
Puppet has core data types and abstract data types. The core data types are the most commonly used types of data, such as String or Integer, whereas abstract data types allow for more sophisticated type validation, such as Optional or Variant.
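A small sketch of what type validation looks like in practice (the class and parameter names are illustrative):

class profile::base (
  String           $motd,                  # required, must be a string
  Integer          $max_clients = 10,      # core type with a default
  Optional[String] $proxy       = undef,   # abstract type: allows undef in addition to String
) {
  # ...
}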
Adding control structures in manifests
if/else block:
if 'mail_lda' in $needed_services {
  service { 'dovecot':
    enable => true
  }
} else {
  service { 'dovecot':
    enable => false
  }
}
case statement:
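A minimal case example (the package names are illustrative):

case $facts['os']['family'] {
  'RedHat': { $apache_package = 'httpd' }
  'Debian': { $apache_package = 'apache2' }
  default:  { fail("Unsupported OS family: ${facts['os']['family']}") }
}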
How to add a new module in Puppet
1. Modify the Puppetfile
mod 'puppetlabs-lvm', '1.4.0'
2. SSH to the Puppet master
cd /etc/puppetlabs/code/environments/production/
librarian-puppet install --verbose
3. Modify data/common.yaml and base.pp
# data/common.yaml
profile::base::enable_lvm: false

# base.pp
if $enable_lvm {
  include lvm
}
Browsers expose page performance metrics via the Navigation Timing API.
Application Monitoring
Application Performance Monitoring (APM) Tools
StatsD is a tool used to emit metrics from inside your code.
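For example, a counter can be pushed to a StatsD daemon over UDP (assuming the daemon listens on its default port 8125; the metric name is illustrative):

$ echo -n "myapp.logins:1|c" | nc -u -w1 localhost 8125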
Telegraf, InfluxDB + Grafana
Grafana: Grafana is “The open platform for beautiful analytics and monitoring.” It makes it easy to create dashboards for displaying data from many sources, particularly time-series data. It works with several different data sources such as Graphite, Elasticsearch, InfluxDB, and OpenTSDB. We’re going to use this as our main front end for visualizing our network statistics.
InfluxDB: InfluxDB is “…a data store for any use case involving large amounts of timestamped data.” This is where we’re going to store our network statistics. It is designed for exactly this use-case, where metrics are collected over time.
Telegraf: Telegraf is “…a plugin-driven server agent for collecting and reporting metrics.” This can collect data from a wide variety of sources, e.g. system statistics, API calls, DB queries, and SNMP. It can then send those metrics to a variety of datastores, e.g. Graphite, OpenTSDB, Datadog, Librato. Telegraf is maintained by InfluxData, the people behind InfluxDB. So it has very good support for writing data to InfluxDB.
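A minimal telegraf.conf sketch wiring this together (the database name and interval are illustrative; it assumes InfluxDB is listening locally on its default port 8086):

[agent]
  interval = "10s"

[[inputs.cpu]]
[[inputs.mem]]
[[inputs.net]]

[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "telegraf"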
Movement: a group of people working together to advance their shared political, social, or artistic ideas.
Early agile methodologies
1995: Scrum
Iterative cycles of work
Emphasis on cross-functional collaboration
Mid-90s: Crystal
Emphasis on adaptability and “stretch-to-fit” process
1990: XP
Short, iterative cycles of work
Emphasis on collaboration between developers
Agile Values
Increase cross-functional collaboration
Minimize work in progress (specs, etc.)
Work in iterative cycles to incorporate change and get frequent feedback
Make time to reflect on how you work
Agile Development
Born-on-the-cloud model: cloud as the primary platform for both consumed and delivered services
Agility to get projects up and running quickly
Streamlined process by reducing transitions from Development to Operations
Java EE application with a Minimum Viable Product (MVP); set up the cloud infrastructure and toolchain using the born-on-the-cloud approach
Hypothesis-driven development
Ranked backlog, user stories, story points; tracking and delivering in timeboxed iterations
The backlog is a prioritized list of features (user stories) waiting to be scheduled and implemented.
Note that, among other things, one of the most important responsibilities of the product owner role in Agile development is to keep the backlog ranked by priority; he or she can set the priorities by simply dragging the issues into order, regardless of when they were added to the backlog.
User stories: who, what, why; story points and planning poker
A user story is an informal, natural-language description of a chunk of functionality that is of value from an end-user perspective: "As a <role>, I want <capability> so that <benefit>."
Story Points are estimates of effort as influenced by the amount of work, complexity, risk, and uncertainty.
Planning Poker is a consensus-based, gamified technique for estimating. Timeboxing refers to the act of putting strict time boundaries around an action or activity.
Timeboxed iterations, along with story points, are important for measuring the team's velocity over time. For example, track the total number of story points the team delivers each iteration; this lets you determine the average team velocity, in other words, how many story points the team is able to deliver on average.
Architecture breakdown
User interface (html/css/javascript)
CRUD service (REST APIs with JAX-RS)
Database (NoSQL)
Test-Driven Development (TDD) is a technique for building software by writing automated test cases before writing any code.
Why: focus on the outcome; helps to manage risk; enables fast iterations and continuous integration; keeps the code clear, simple, and testable; greater confidence to make application changes; documentation of how the system works.
How: add a test -> run all tests and see if the new one fails -> write the code -> run tests -> refactor the code -> repeat.
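A minimal sketch of the first step of this cycle, assuming a hypothetical Cart class and JUnit 5 (all names are illustrative):

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class CartTest {
    @Test
    void newCartHasZeroTotal() {
        // Written before Cart exists: the test fails (or does not even compile) first,
        // then we write just enough code to make it pass, then we refactor.
        Cart cart = new Cart();
        assertEquals(0, cart.total());
    }
}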
A delivery pipeline is a sequence of automated stages that retrieve input and run jobs, such as builds, tests, and deployments. It automates the continuous deployment of a project, with minimum or no human intervention.
A stage retrieves input and runs jobs, such as builds, tests, and deployments.
Jobs run in discrete working directories in containers that are created for each pipeline run
By default, stages run sequentially after changes are delivered to source control
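A purely illustrative sketch of such a pipeline, written as generic YAML rather than the syntax of any particular CI tool:

stages:
  - name: Build          # triggered when changes are delivered to source control
    jobs:
      - mvn package
  - name: Test           # runs only after Build succeeds
    jobs:
      - mvn verify
  - name: Deploy         # pushes the tested artifact to the target environment
    jobs:
      - ./deploy.sh staging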
Showcase the application: at the end of the iteration, before the retrospective meeting, there is a showcase meeting, also known as a demo meeting, where the work done is demonstrated to the stakeholders and feedback is obtained. The showcase meeting starts by reviewing what we committed to in the form of user stories. Finally, we demonstrate what we have accomplished in the form of working software, so we can get feedback on the product we are building.
Retrospective meeting: how to conduct it
1st way: What went well? What could be improved?
2nd way: Start doing / Stop doing / Continue doing, for example:
Start doing: continuous delivery
Stop doing: timeboxed iterations
Continue doing: ranked backlog
Common Problems
Value/Focus

| Problem | Solution |
| --- | --- |
| Not understanding the market | Hypothesis-driven development |
| Putting wrong products into the market | Minimum Viable Product (MVP) |
| Lack of focus on value | Timeboxing/Iterations |

Productivity/Speed/Cost

| Problem | Solution |
| --- | --- |
| Too long to get projects up and running | Born on the Cloud |
| Slow to put products into the market | MVP/Delivery pipeline/Continuous delivery |
| Too costly projects | Automation/Toolchain/Delivery pipeline |

Communication/Visibility

| Problem | Solution |
| --- | --- |
| Priorities aren't clear | Ranked backlog |
| Requirements not reflecting the user's needs | User stories |
| Poor team communication | Daily standup |
| Lack of visibility into the progress | Iteration wall/Showcase meeting |

Quality/Control

| Problem | Solution |
| --- | --- |
| Low-quality products | Test-driven development/Delivery pipeline |
| Lack of process improvement | Retrospective meeting/Continuous improvement |
| Bad requirements estimations | Story points/Planning poker |
| Not honoring our past commitments | Track the team's velocity |
Scrum Sprints
The sprint goal is the high-level goal of each timeboxed sprint, written as concisely as possible
The development team then plans how to convert the sprint goal into a product increment
The sprint backlog is a subset of the product backlog and contains the items selected for the sprint, plus additional items added by the development team
The scrum pillars (TIA)
Transparency
Inspection
Adaptation
Scrum roles: Product Owner
Anyone on the Scrum team may change the product backlog, but must do so with the product owner's knowledge
Product owner is accountable for the product backlog
Scrum roles: The development team
The development team is self-organized and operates with minimal input from external sources
The development team owns the entire sprint backlog (selected product backlog items, plus development tasks)
There are no managers or team leads; it is a flat hierarchy
Exposure: Aperture, ISO speed, and shutter speed are the three core controls that manipulate exposure.
Camera metering: The engine that assesses light and exposure.
Depth of field: An important characteristic that influences our perception of space.
Understanding exposure
Exposure triangle
Shutter speed: Controls the duration of the exposure
The shutter speed, or exposure time, refers to how long this light is permitted to enter the camera.
Range of Shutter Speeds

| SHUTTER SPEED | TYPICAL EXAMPLE |
| --- | --- |
| 1 to 30+ seconds | To take specialty night and low-light photos on a tripod |
| 1/2 to 2 seconds | To add a silky look to flowing water landscape photos on a tripod for enhanced depth of field |
| 1/30 to 1/2 second | To add motion blur to the background of moving-subject, carefully taken, handheld photos with stabilization |
| 1/250 to 1/50 second | To take typical handheld photos without substantial zoom |
| 1/500 to 1/250 second | To freeze everyday sports/action, in moving-subject, handheld photos with substantial zoom (telephoto lens) |
| 1/8000 to 1/1000 second | To freeze extremely fast, up-close subject motion |
Note that the range in shutter speeds spans a 100,000× ratio between the shortest exposure and longest exposure, enabling cameras with this capability to record a wide variety of subject motion.
Aperture: Controls the area through which light can enter your camera
A camera's aperture setting controls the width of the opening that lets light into your camera lens. We measure a camera's aperture using an f-stop value, which can be counterintuitive because the area of the opening increases as the f-stop decreases. For example, when photographers say they're "stopping down" or "opening up" their lens, they're referring to increasing and decreasing the f-stop value, respectively.
ISO speed: Controls the sensitivity of your camera’s sensor to a given amount of light
Service level objectives (SLOs) specify a target level for the reliability of your service. SLOs are key to making data-driven decisions about reliability; they're at the core of SRE practices. For example, an SLO might state that 99.9% of requests succeed over a rolling 30-day window.
SLOs are a tool to help determine what engineering work to prioritize.
Effective team incident reports
It's essential to have a team incident report for each incident. These artifacts help the team spread knowledge and prevent situations where knowledge gained during an incident stays with only a few individuals. Store these artifacts in a central place. Depending on the organization, these artifacts may be useful to other teams.
Each artifact may have slightly different content based on the nature of the incident. The readers of a team incident report are the individuals on the team, so these reports can be longer and more detailed than external or executive briefings.
An example template for a team incident report:
Title
Date
Author(s)
Summary of the incident
Incident participants and their role(s)
Impact
Timeline
Include graphs and logs that help support the facts described in the timeline.
Lessons learned about what went well, and what needs improvement.
Action Items - These should include who, what, type of action, and when. Others outside of the incident response team might think of additional action items after reviewing the narrative.
While one person should have the responsibility of being the note taker during the meeting and initial author of the report, everyone on the incident response team should review and update the team report.
Containers are specially encapsulated and secured processes running on the host system.
Containers leverage a lot of features and primitives available in the Linux OS. The most important ones are
namespaces
cgroups
All processes running in containers share the same Linux kernel of the underlying host operating system. This is fundamentally different compared with VMs, as each VM contains its own full-blown operating system.
Differences:

| | Containers | VMs |
| --- | --- | --- |
| Startup times | milliseconds | several seconds |
| Goal | ephemeral | long-living |
On the lower part of the preceding figure, we have the Linux operating system with its cgroups, namespaces, and layer capabilities, as well as other functionality that we do not need to explicitly mention here. Then, there is an intermediary layer composed of containerd and runc. On top of all that sits the Docker engine. The Docker engine offers a RESTful interface to the outside world that can be accessed by any tool, such as the Docker CLI, Docker for Mac and Docker for Windows, or Kubernetes, to name just a few.
Namespaces
A namespace is an abstraction of global resources such as filesystems, network access, process tree (also named PID namespace) or the system group IDs, and user IDs.
The PID namespace is what keeps processes in one container from seeing or interacting with processes in another container. A process might have the apparent PID 1 inside a container, but if we examine it from the host system, it would have an ordinary PID, say 334:
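A quick way to see this (a sketch; the container name and output are illustrative):

$ docker container run -d --name sleeper alpine sleep 1000
$ docker container exec sleeper ps
PID   USER     TIME  COMMAND
    1 root      0:00 sleep 1000
$ ps -ef | grep 'sleep 1000'   # on the host, the same process shows up under an ordinary host PID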
Control groups (cgroups)
Linux cgroups are used to limit, manage, and isolate the resource usage of collections of processes running on a system. Resources are CPU time, system memory, network bandwidth, or combinations of these resources, and so on.
Using cgroups, admins can limit the resources that containers can consume. With this, one can avoid, for example, the classical noisy-neighbor problem, where a rogue process running in a container consumes all CPU time or reserves massive amounts of RAM and, as such, starves all the other processes running on the host, whether they're containerized or not.
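For example (a sketch; the limits, name, and image are illustrative):

$ docker container run -d --name limited --cpus 0.5 --memory 256m nginx
$ docker container inspect limited --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}'
268435456 500000000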
Union filesystem (UnionFS)
The UnionFS forms the backbone of what is known as container images. UnionFS is mainly used on Linux and allows files and directories of distinct filesystems to be overlaid to form a single coherent filesystem. In this context, the individual filesystems are called branches. Contents of directories that have the same path within the merged branches will be seen together in a single merged directory, within the new, virtual filesystem. When merging branches, the priority between the branches is specified. In that way, when two branches contain the same file, the one with the higher priority is seen in the final FS.
Container plumbing
The foundation on top of which the Docker engine is built, also called the container plumbing, is formed by two components: runc and containerd.
Runc
Runc is a lightweight, portable container runtime. It provides full support for Linux namespaces as well as native support for all security features available on Linux, such as SELinux, AppArmor, seccomp, and cgroups.
Runc is a tool for spawning and running containers according to the Open Container Initiative (OCI) specification.
Containerd
Runc is a low-level implementation of a container runtime; containerd builds on top of it, and adds higher-level features, such as image transfer and storage, container execution, and supervision, as well as network and storage attachments.
Creating and managing container images
The layered filesystem
The layers of a container image are all immutable. Immutable means that once generated, the layer cannot ever be changed. The only possible operation affecting the layer is the physical deletion of it.
Each layer only contains the delta of changes in regard to the previous set of layers. The content of each layer is mapped to a special folder on the host system, which is usually a subfolder of /var/lib/docker/.
Since layers are immutable, they can be cached without ever becoming stale. This is a big advantage.
The writable container layer
The container layer is marked as read/write. Another advantage of the immutability of image layers is that they can be shared among many containers created from this image. All that is needed is a thin, writable container layer for each container.
This technique, of course, results in a tremendous reduction of resources that are consumed. Furthermore, this helps to decrease the loading time of a container since only a thin container layer has to be created once the image layers have been loaded into memory, which only happens for the first container.
Copy-on-write
Docker uses the copy-on-write technique when dealing with images. Copy-on-write is a strategy of sharing and copying files for maximum efficiency. If a layer uses a file or folder that is available in one of the low-lying layers, then it just uses it. If, on the other hand, a layer wants to modify, say, a file from a low-lying layer, then it first copies this file up to the target layer and then modifies it.
The second layer wants to modify File 2, which is present in the base layer. Thus, it copied it up and then modified it. Now, let’s say that we’re sitting in the top layer of the preceding figure. This layer will use File 1 from the base layer and File 2 and File 3 from the second layer.
Creating images
There are three ways to create a new container image on your system.
1. Interactive image creation
Start with a base image that we want to use as a template and run a container from it interactively.
$ docker container run -it --name sample alpine /bin/sh
#By default, the alpine container does not have the ping tool installed. Let's assume we want to create a new custom image that has ping installed.
# apk update && apk add iputils
# ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.060 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.054 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.059 ms
^C
$ docker container ls -a | grep sample
# If we want to see what has changed in our container in relation to the base image
$ docker container diff sample
C /var
C /var/cache
C /var/cache/apk
A /var/cache/apk/APKINDEX.00740ba1.tar.gz
A /var/cache/apk/APKINDEX.d8b2a6f4.tar.gz
C /bin
C /bin/ping
C /bin/ping6
A /bin/traceroute6
C /lib
C /lib/apk
C /lib/apk/db
C /lib/apk/db/triggers
C /lib/apk/db/installed
C /lib/apk/db/scripts.tar
C /usr
C /usr/sbin
C /usr/sbin/arping
A /usr/sbin/clockdiff
A /usr/sbin/rarpd
A /usr/sbin/setcap
A /usr/sbin/tracepath
A /usr/sbin/tracepath6
A /usr/sbin/ninfod
A /usr/sbin/getcap
A /usr/sbin/tftpd
A /usr/sbin/getpcaps
A /usr/sbin/ipg
A /usr/sbin/rdisc
A /usr/sbin/capsh
C /usr/lib
A /usr/lib/libcap.so.2
A /usr/lib/libcap.so.2.27
C /etc
C /etc/apk
C /etc/apk/world
C /root
A /root/.ash_history
# In the preceding list, A stands for added, and C for changed. If we had any deleted files, then those would be prefixed with D.
# Use the docker container commit command to persist our modifications and create a new image from them:
$ docker container commit sample my-alpine
sha256:31da02222ee7102dd755692bd498c6b170f11ffeb81ca700a4d12c0e5de9b913
## If we want to see how our custom image has been built
$ docker image history my-alpine
IMAGE         CREATED         CREATED BY                                     SIZE    COMMENT
31da02222ee7  38 seconds ago  /bin/sh                                        1.76MB
961769676411  4 weeks ago     /bin/sh -c #(nop) CMD ["/bin/sh"]              0B
<missing>     4 weeks ago     /bin/sh -c #(nop) ADD file:fe64057fbb83dccb9…  5.58MB
# The first layer in the preceding list is the one that we just created by adding the iputils package.
2. Using Dockerfiles
The Dockerfile is a text file that is usually literally called Dockerfile. It contains instructions on how to build a custom container image. It is a declarative way of building images.
FROM python:2.7
RUN mkdir -p /app
WORKDIR /app
COPY ./requirements.txt /app/
RUN pip install -r requirements.txt
CMD ["python", "main.py"]
The FROM keyword
Every Dockerfile starts with the FROM keyword. With it, we define which base image we want to start building our custom image from. If we build starting with CentOS 7:
FROM centos:7
If we really want to start from scratch:
FROM scratch
FROM scratch is a no-op in the Dockerfile, and as such does not generate a layer in the resulting container image.
The RUN keyword
The argument for RUN is any valid Linux command, such as the following:
RUN yum install -y wget
If we use more than one line, we need to put a backslash (\) at the end of each line to indicate to the shell that the command continues on the next line.
The COPY and ADD keywords
COPY and ADD add some content to an existing base image to make it a custom image. Most of the time, these are a few source files of, say, a web application or a few binaries of a compiled application.
These two keywords are used to copy files and folders from the host into the image that we’re building. The two keywords are very similar, with the exception that the ADD keyword also lets us copy and unpack TAR files, as well as provide a URL as a source for the files and folders to copy.
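Five such statements might look like this (the file names and the URL are illustrative):

COPY . /app
COPY ./web /app/web
COPY sample.txt /data/my-sample.txt
ADD sample.tar /app/bin/
ADD http://example.com/sample.txt /data/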
The first line copies all files and folders from the current directory recursively to the /app folder inside the container image
The second line copies everything in the web subfolder to the target folder, /app/web
The third line copies a single file, sample.txt, into the target folder, /data, and at the same time, renames it to my-sample.txt
The fourth statement unpacks the sample.tar file into the target folder, /app/bin
Finally, the last statement copies the remote file, sample.txt, into the target file, /data
Wildcards are allowed in the source path. For example, the following statement copies all files starting with sample to the mydir folder inside the image:
COPY ./sample* /mydir/
From a security perspective, it is important to know that by default, all files and folders inside the image will have a user ID (UID) and a group ID (GID) of 0. The good thing is that for both ADD and COPY, we can change the ownership that the files will have inside the image using the optional --chown flag, as follows:
ADD --chown=11:22 ./data/files* /app/data/
The preceding statement will copy all files from the ./data folder whose names start with files and put them into the /app/data folder in the image, and at the same time assign user 11 and group 22 to these files.
Instead of numbers, one could also use names for the user and group, but then these entities would have to be already defined in the root filesystem of the image at /etc/passwd and /etc/group respectively, otherwise the build of the image would fail.
The WORKDIR keyword
The WORKDIR keyword defines the working directory or context that is used when a container is run from our custom image. So, if I want to set the context to the /app/bin folder inside the image, my expression in the Dockerfile would have to look as follows:
WORKDIR /app/bin
All activity that happens inside the image after the preceding line will use this directory as the working directory. It is very important to note that the following two snippets from a Dockerfile are not the same:
RUN cd /app/bin
RUN touch sample.txt
Compare the preceding code with the following code:
WORKDIR /app/bin
RUN touch sample.txt
The former will create the file in the root of the image filesystem, while the latter will create the file at the expected location in the /app/bin folder. Only the WORKDIR keyword sets the context across the layers of the image. The cd command alone is not persisted across layers.
The CMD and ENTRYPOINT keywords
While all other keywords defined for a Dockerfile are executed at the time the image is built by the Docker builder, these two are actually definitions of what will happen when a container is started from the image we define. When the container runtime starts a container, it needs to know what the process or application is that has to run inside that container. That is exactly what CMD and ENTRYPOINT are used for: to tell Docker what the start process is and how to start it.
$ ping 8.8.8.8 -c 3
ping is the command and 8.8.8.8 -c 3 are the parameters to this command.
ENTRYPOINT is used to define the command of the expression while CMD is used to define the parameters for the command. Thus, a Dockerfile using alpine as the base image and defining ping as the process to run in the container could look as follows:
FROM alpine:latest
ENTRYPOINT ["ping"]
CMD ["8.8.8.8", "-c", "3"]
For both ENTRYPOINT and CMD, the values are formatted as a JSON array of strings, where the individual items correspond to the tokens of the expression that are separated by whitespace. This is the preferred way of defining CMD and ENTRYPOINT. It is also called the exec form.
$ docker image build -t pinger .
$ docker container run --rm -it pinger
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=56 time=5.612 ms
64 bytes from 8.8.8.8: seq=1 ttl=56 time=4.389 ms
64 bytes from 8.8.8.8: seq=2 ttl=56 time=6.592 ms
# The beauty of this is that I can now override the CMD part that I have defined in the Dockerfile.
# This will now cause the container to ping the loopback for 5 seconds:
$ docker container run --rm -it pinger -w 5 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: seq=0 ttl=64 time=0.077 ms
64 bytes from 127.0.0.1: seq=1 ttl=64 time=0.087 ms
64 bytes from 127.0.0.1: seq=2 ttl=64 time=0.090 ms
64 bytes from 127.0.0.1: seq=3 ttl=64 time=0.082 ms
64 bytes from 127.0.0.1: seq=4 ttl=64 time=0.081 ms
# If we want to override what's defined in the ENTRYPOINT in the Dockerfile, we need to use the --entrypoint parameter in the docker container run expression.
$ docker container run --rm -it --entrypoint /bin/sh pinger
/ # exit
FROM alpine:latest
CMD wget -O - http://www.google.com
## If you leave ENTRYPOINT undefined, then it will have the default value of /bin/sh -c, and whatever is the value of CMD will be passed as a string to the shell command.
## The preceding definition would thereby result in the following process running inside the container:
/bin/sh -c "wget -O - http://www.google.com"
Consequently, /bin/sh is the main process running inside the container, and it will start a new child process to run the wget utility.
Multistage builds
With a size of 176 MB, the resulting image is way too big. In the end, it is just a Hello World application. The reason for it being so big is that the image not only contains the Hello World binary, but also all the tools to compile and link the application from the source code. But this is really not desirable when running the application, say, in production. Ideally, we only want to have the resulting binary in the image and not a whole SDK.
It is precisely for this reason that we should define Dockerfiles as multistage. We have some stages that are used to build the final artifacts and then a final stage where we use the minimal necessary base image and copy the artifacts into it. This results in very small images. Have a look at this revised Dockerfile:
FROM alpine:3.7 AS build
RUN apk update && apk add --update alpine-sdk
RUN mkdir /app
WORKDIR /app
COPY . /app
RUN mkdir bin
RUN gcc hello.c -o bin/hello

FROM alpine:3.7
COPY --from=build /app/bin/hello /app/hello
CMD /app/hello
Here, we have a first stage with an alias, build, that is used to compile the application; the second stage uses the same base image, alpine:3.7, but does not install the SDK, and only copies the binary from the build stage, using the --from parameter, into this final image.
docker image ls | grep hello-world
hello-world-small   latest   46bb8c275fda   About a minute ago   4.22MB
hello-world         latest   75339d9c269f   7 minutes ago        178MB
We have been able to reduce the size from 178 MB down to about 4 MB. This is a reduction in size by a factor of roughly 40. A smaller image size has many advantages, such as a smaller attack surface for hackers, reduced memory and disk consumption, faster startup times for the corresponding containers, and a reduction in the bandwidth needed to download the image from a registry, such as Docker Hub.
Dockerfile best practices
First and foremost, we need to consider that containers are meant to be ephemeral. By ephemeral, we mean that a container can be stopped and destroyed and a new one built and put in place with an absolute minimum of setup and configuration. That means that we should try hard to keep the time that is needed to initialize the application running inside the container at a minimum, as well as the time needed to terminate or clean up the application.
We should order the individual commands in the Dockerfile so that we leverage caching as much as possible. Building a layer of an image can take a considerable amount of time, sometimes many seconds or even minutes. While developing an application, we will have to build the container image for our application multiple times. We want to keep the build times at a minimum.
When we’re rebuilding a previously built image, the only layers that are rebuilt are the ones that have changed, but if one layer needs to be rebuilt, all subsequent layers also need to be rebuilt. This is very important to remember. Consider the following example:
1 2 3 4 5 6
FROM node:9.4 RUN mkdir -p /app WORKDIR /app COPY . /app RUN npm install CMD ["npm", "start"]
In this example, the npm install command on line five of the Dockerfile usually takes the longest. A classical Node.js application has many external dependencies, and those are all downloaded and installed in this step. This can take minutes until it is done. Therefore, we want to avoid running npm install each time we rebuild the image, but a developer changes their source code all the time during development of the application. That means that line four, the result of the COPY command, changes all the time and this layer has to be rebuilt each time. But as we discussed previously, that also means that all subsequent layers have to be rebuilt, which in this case includes the npm install command. To avoid this, we can slightly modify the Dockerfile and have the following:
1 2 3 4 5 6 7
FROM node:9.4 RUN mkdir -p /app WORKDIR /app COPY package.json /app/ RUN npm install COPY . /app CMD ["npm", "start"]
What we have done here is that, on line four, we only copy the single file that the npm install command needs as a source, which is the package.json file. This file rarely changes in a typical development process. As a consequence, the npm install command also has to be executed only when the package.json file changes. All the remaining, frequently changed content is added to the image after the npm install command.
A further best practice is to keep the number of layers that make up your image relatively small. The more layers an image has, the more the graph driver needs to work to consolidate the layers into a single root filesystem for the corresponding container. Of course, this takes time, and thus the fewer layers an image has, the faster the startup time for the container can be.
The easiest way to reduce the number of layers is to combine multiple individual RUN commands into a single one—for example, say that we had the following in a Dockerfile:
RUN apt-get update
RUN apt-get install -y ca-certificates
RUN rm -rf /var/lib/apt/lists/*
We could combine these into a single concatenated expression, as follows:
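For example, the three RUN commands above could be chained like this:

RUN apt-get update && \
    apt-get install -y ca-certificates && \
    rm -rf /var/lib/apt/lists/*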
The former will generate three layers in the resulting image, while the latter only creates a single layer.
The next three best practices all result in smaller images. Why is this important? Smaller images reduce the time and bandwidth needed to download the image from a registry. They also reduce the amount of disk space needed to store a copy locally on the Docker host and the memory needed to load the image. Finally, smaller images also mean a smaller attack surface for hackers. Here are the best practices mentioned:
The first best practice that helps to reduce the image size is to use a .dockerignore file. We want to avoid copying unnecessary files and folders into an image to keep it as lean as possible. A .dockerignore file works in exactly the same way as a .gitignore file, for those who are familiar with Git. In a .dockerignore file, we can configure patterns to exclude certain files or folders from being included in the context when building the image (a sample .dockerignore is sketched after this list).
The next best practice is to avoid installing unnecessary packages into the filesystem of the image. Once again, this is to keep the image as lean as possible.
Last but not least, it is recommended that you use multistage builds so that the resulting image is as small as possible and only contains the absolute minimum needed to run your application or application service.
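A sample .dockerignore (the entries are illustrative and depend on the project):

.git
node_modules
build
*.log
.env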
3. Saving and loading images
The third way to create a new container image is by importing or loading it from a file. A container image is nothing more than a tarball. To demonstrate this, we can use the docker image save command to export an existing image to a tarball:
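For example (the file name is illustrative; my-alpine is the image built earlier):

$ docker image save -o my-alpine.tar my-alpine
$ docker image load -i my-alpine.tar    # re-import the tarball, e.g. on another Docker host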
Blue-green deployments
In blue-green deployments, the current version of the application service, called blue, handles all the application traffic. We then install the new version of the application service, called green, on the production system. The new service is not yet wired with the rest of the application.
Once green is installed, one can execute smoke tests against this new service and, if those succeed, the router can be configured to funnel all traffic that previously went to blue to the new service, green. The behavior of green is then observed closely and, if all success criteria are met, blue can be decommissioned. But if, for some reason, green shows some unexpected or unwanted behavior, the router can be reconfigured to return all traffic to blue. Green can then be removed and fixed, and a new blue-green deployment can be executed with the corrected version:
Canary releases
Canary releases are releases where we have the current version of the application service and the new version installed on the system in parallel. As such, they resemble blue-green deployments. At first, all traffic is still routed through the current version. We then configure a router so that it funnels a small percentage, say 1%, of the overall traffic to the new version of the application service. The behavior of the new service is then monitored closely to find out whether or not it works as expected. If all the criteria for success are met, then the router is configured to funnel more traffic, say 5% this time, through the new service. Again, the behavior of the new service is closely monitored and, if it is successful, more and more traffic is routed to it until we reach 100%. Once all traffic is routed to the new service and it has been stable for some time, the old version of the service can be decommissioned.
System management
Pruning unused resources
Listing resource consumption
$ docker system df
in verbose mode
$ docker system df -v

Pruning containers
regain unused system resources
$ docker container prune
# removes all containers from the system that are not in running status
$ docker container prune -f
# skips the confirmation step
To remove all containers from the system, even the running ones:
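One way to do this (a sketch; it force-removes every container, so use with care):

$ docker container rm -f $(docker container ls -aq)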
Sometimes we do not just want to remove orphaned image layers but all images that are not currently in use on our system. For this, we can use the -a (or --all) flag:
$ docker image prune --force --all

Pruning volumes
$ docker volume prune
A useful flag when pruning volumes is the --filter flag, which allows us to specify the set of volumes that we're considering for pruning:
$ docker volume prune --filter 'label=demo'

Pruning networks
Pruning networks removes the networks to which no container or service is currently attached.
$ docker network prune

Pruning everything
$ docker system prune