
Brief GDB Basics

In this post I would like to go through some of the very basic cases in which gdb can come in handy. I’ve seen people avoid using gdb, saying it is a CLI tool and therefore it would be hard to use. Instead, they opted for this:

std::cout << "qwewtrer" << std::endl;
DEBUG("stupid segfault already?");

That’s just stupid. In fact, printing a backtrace in gdb is as easy as typing two letters. I don’t appreciate lengthy debugging sessions that much either, but debugging is something you simply cannot avoid in software development. What you can do to speed things up is know the right tools and use them efficiently. One of them is the GNU debugger.

Example program

All the examples in the text refer to the following short piece of code. I have it stored as segfault.c and it’s basically a program that calls a function which causes a segmentation fault. The code looks like this:

/* Just a segfault within a function. */

#include <stdio.h>
#include <unistd.h>

void segfault(void)
{
	int *null = NULL;
	*null = 0;
}

int main(void)
{
	printf("PID: %d\n", getpid());
	fflush(stdout);

	segfault();

	return 0;
}

Debugging symbols

One more thing, before we proceed to gdb itself. Well, two actually. In order to get anything more than a bunch of hex addresses you need to compile your binary without stripping symbols and with debug info included. Let me explain.

Symbols (in this case) can be thought of simply as variable and function names. You can strip them from your binary either during compilation/linking (by passing the -s argument to gcc) or later with the strip(1) utility from binutils. People do this because it can significantly reduce the size of the resulting object file. Let’s see how it works exactly. First, compile the code while stripping the symbols:

[astro@desktop ~/MyBook/code]$ gcc -s segfault.c

Now let’s fire up gdb:

[astro@desktop ~/MyBook/code]$ gdb ./a.out
GNU gdb (GDB) Fedora (7.3.1-48.fc15)
Reading symbols from /mnt/MyBook/code/a.out...(no debugging symbols found)...done.

Notice the last line of the output. gdb is complaining that it didn’t find any debugging symbols. Now, let’s try to run the program and display a stack trace after it crashes:

(gdb) run
Starting program: /mnt/MyBook/code/a.out 
PID: 21568

Program received signal SIGSEGV, Segmentation fault.
0x08048454 in ?? ()
(gdb) bt
#0  0x08048454 in ?? ()
#1  0x0804848d in ?? ()
#2  0x4ee4a3f3 in __libc_start_main (main=0x804845c, argc=1, ubp_av=0xbffff1a4, init=0x80484a0, fini=0x8048510, 
    rtld_fini=0x4ee1dfc0 , stack_end=0xbffff19c) at libc-start.c:226
#3  0x080483b1 in ?? ()

As you can imagine, this won’t help you very much with debugging. Now let’s see what happens when the code is compiled with symbols, but without the debuginfo.

[astro@desktop ~/MyBook/code]$ gcc segfault.c 
[astro@desktop ~/MyBook/code]$ gdb ./a.out 
GNU gdb (GDB) Fedora (7.3.1-48.fc15)
Reading symbols from /mnt/MyBook/code/a.out...(no debugging symbols found)...done.
(gdb) run
Starting program: /mnt/MyBook/code/a.out 
PID: 21765

Program received signal SIGSEGV, Segmentation fault.
0x08048454 in segfault ()
(gdb) bt
#0  0x08048454 in segfault ()
#1  0x0804848d in main ()

As you can see, gdb still complains about the symbols in the beginning, but the results are much better. The program crashed while it was executing the segfault() function, so we can start looking for problems there. Now let’s see what we get when debuginfo gets compiled in.

[astro@desktop ~/MyBook/code]$ gcc -g segfault.c 
[astro@desktop ~/MyBook/code]$ gdb ./a.out 
GNU gdb (GDB) Fedora (7.3.1-48.fc15)
Reading symbols from /mnt/MyBook/code/a.out...done.
(gdb) run
Starting program: /mnt/MyBook/code/a.out 
PID: 21934

Program received signal SIGSEGV, Segmentation fault.
0x08048454 in segfault () at segfault.c:9
9		*null = 0;

That’s more like it! gdb printed the exact line of code that caused the program to crash! So every time you use gdb to get some useful directions for debugging, make sure that you don’t strip symbols and that you have debuginfo available!

Start, Stop, Interrupt, Continue

These are the basic commands to control your application’s runtime. You can start a program by writing

(gdb) run

When a program is running, you can interrupt it with the usual Ctrl-C, which will send SIGINT to the debugged process. When the process is interrupted, you can examine it (this is described later in the post) and then either stop it completely or let it continue. To stop the execution, write

(gdb) kill

If you’d like to let your program carry on executing, use

(gdb) continue

I should point out that in gdb you can abbreviate most of the commands to as little as a single character. For instance, r can be used for run, k for kill, c for continue and so on :).

Stack traces

Stack traces are very powerful when you need to locate the point of failure. A stack trace will point you directly to the function that caused your program to crash. If your project is small or you keep your functions short and straightforward, this could be all you’ll ever need from a debugger. You can display a stack trace after a segmentation fault or, in general, any time the program is interrupted. The stack trace is displayed by the backtrace (or bt) command:

(gdb) bt
#0  0x08048454 in segfault () at segfault.c:9
#1  0x0804848d in main () at segfault.c:17

You can see that the program stopped (more precisely, it received a SIGSEGV signal from the kernel) at line 9 of the segfault.c file while it was executing the segfault() function. The segfault() function was called directly from the main() function.

Listing source code

When the program is interrupted (and was compiled with debuginfo), you can list the code directly by using the list command. It shows the precise line of code (with some context) where the program was interrupted. This can be more convenient, because you don’t have to go back to your editor and search for the place of the crash by line number.

(gdb) list
4	#include <unistd.h>
5	
6	void segfault(void)
7	{
8		int *null = NULL;
9		*null = 0;
10	}
11	
12	int main(void)
13	{

We know (from the stack trace) that the program stopped at line 9. This command shows you exactly what is going on around there.

Breakpoints

Up to this point, we only interrupted the program manually with Ctrl-C. This is not very useful in practice, though. In most cases, you will want the program to stop at some exact place during execution, so that you can inspect what is going on and what values the variables have, and possibly step further through the program manually. To achieve this, you can use breakpoints. By attaching a breakpoint to a line of code, you tell the debugger to interrupt the program every time it is about to execute that particular line and to wait for your instructions.

A breakpoint can be set with the break command (before the program is run) like this:

(gdb) break 8
Breakpoint 2 at 0x4005c0: file segfault.c, line 8.

I’m using a line number to specify where to put the breakpoint, but you can also use a function name (break segfault) or a file name with a line number (break segfault.c:8). The break command accepts several other forms of arguments as well.

You can list the breakpoints you have set up by writing info breakpoints:

(gdb) info breakpoints 
Num     Type           Disp Enb Address            What
1       breakpoint     keep n   0x00000000004005d8 in main at segfault.c:14
2       breakpoint     keep y   0x00000000004005c0 in segfault at segfault.c:8

To disable a breakpoint, use the disable <Num> command with the number you find in the info breakpoints listing.

Stepping through the code

When gdb stops your application, you can resume execution manually, step by step. There are several commands to help you with that. You can use the step and next commands to advance to the following line of code. However, these two commands are not entirely the same. Next will ‘jump’ over function calls and run each of them in one go. Step, on the other hand, will descend into the called function and let you execute it line by line as well. When you decide you’ve had enough of stepping, use the continue command to resume execution uninterrupted until the next breakpoint.

Breakpoint 1, segfault () at segfault.c:8
8		int *null = NULL;
(gdb) step
9		*null = 0;

There are many things you can do while stepping through a running program. You can dump the values of variables using the print command (for example, print null) and even assign new values to variables (using the set variable command). And this is definitely not all. Gdb is great! It really can save a lot of time and lets you focus on the important parts of software development. Think of it the next time you try to bisect errors in a program with improvised debug messages :-).

DRY Principle

I read a couple of books on software development lately and I stumbled upon some more principles of software design that I want to talk about. And the first and probably the most important one is this:

Don’t repeat yourself.

Well, this is new … I mean, as soon as any programmer learns about functions and procedures, he knows that it is way better to split things up into smaller reusable pieces. The thing is, this principle should be applied in much broader terms. As in: NEVER EVER EVER repeat any information in a software project.

The long version of the DRY principle, which was authored by Andy Hunt and David Thomas, states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

The key is that it doesn’t apply only to code. Every single time you have something stored in two places simultaneously, you can be almost certain that it will cause pain at some point in the future of your project. It will sneak up on you from behind and hit you with a baseball bat. And then keep kicking you while you’re down. This is one of those cases in which foretelling the future works damn well.

The authors of The Pragmatic Programmer show this with a great example involving database schemas. At one point in your project you make up a database schema, usually on paper. Then you store it with your project somewhere in plain text or something. The people responsible for writing code will look in the file, write scripts that create the database and start putting various queries in the code.

What happened now? You have two definitions of the database schema in your system. One in the text file and another as a database script. After a while, the customer shows up and demands some additional functionality that requires altering the schema. Well, that shouldn’t be that much of a problem. You simply change the database script, alter the class that handles queries and go get some lunch. Everything works fine, but after a year or two, you might want to change the schema a bit further. But you don’t remember a thing about the project, so you will probably want to look at the design first, to catch up. Or you hire someone new, who will look at the schema definition. And it will be wrong.

Storing something multiple times is painful for a number of reasons. First, you have to sync changes between the representations. When you change the schema, you have to change the design too and vice versa. That’s extra work, right? And as soon as you forget to alter both, it’s a problem.

The solution to this particular problem is code generation. You can keep a single database definition and a very simple script that turns it into the database script. Here’s a wonderful illustration (by Bruno Oliveira) of how that works :).

Repetitive tasks figure
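
To make the code generation idea a bit more concrete, here is a minimal sketch in Python. It is not taken from the book; the SCHEMA dictionary, its table and column names and both helper functions are made up for illustration. The point is only that the schema is written down once and both the database script and the human-readable design are generated from it.

```python
# Hypothetical single source of truth for the schema; the table and
# column names are invented for this example.
SCHEMA = {
    "customers": [
        ("id", "INTEGER PRIMARY KEY"),
        ("name", "TEXT NOT NULL"),
    ],
    "orders": [
        ("id", "INTEGER PRIMARY KEY"),
        ("customer_id", "INTEGER NOT NULL"),
        ("total", "REAL"),
    ],
}


def generate_sql(schema):
    """Turn the schema definition into CREATE TABLE statements."""
    statements = []
    for table, columns in schema.items():
        body = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
        statements.append(f"CREATE TABLE {table} (\n{body}\n);")
    return "\n\n".join(statements)


def generate_docs(schema):
    """Render the very same definition as a plain-text design overview."""
    lines = []
    for table, columns in schema.items():
        lines.append(f"Table {table}: " + ", ".join(name for name, _ in columns))
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_sql(SCHEMA))
    print()
    print(generate_docs(SCHEMA))
```

When the customer asks for a new column, you edit SCHEMA and regenerate both outputs, so the design overview cannot drift away from the database script.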

The Pragmatic Programmer

Another great piece of computer literature I found in our campus’ library! I’m talking about The Pragmatic Programmer by Andy Hunt and David Thomas. And yes, it’s gooood :)!

The Pragmatic Programmer

Figure 1: The Pragmatic Programmer cover

The title of the book (in its Czech version) states: “How to become a better programmer and create high-quality software.” Right? I want that!

It’s a sort-of-a compilation of advice on software development from the practical point of view based on the experience of the authors. A lot of books come with a load of theory which is good too, but when you’re digging through the mounds of formal methods, it’s very easy to forget about the practical side of software development.

The very first chapter talks about the career of a programmer or a software developer. The authors say you should treat your career choices as investments in your future. A pragmatic programmer should invest often and in a wide range of technologies. I don’t like the investment metaphor, but I like the thought. The computer train is moving fast and it will run you over at some point if you don’t jump on.

What I liked most about this book is the emphasis on automating routine tasks through scripting and on the DRY principle. Having good knowledge of the environment and tools you work with is key in any profession. But programmers (including myself) often tend to focus on what we are doing and on the final results rather than on how we do it. And frankly, every time I stop and think about what I could do better or automate, I find some weak spot.

The process of programming, as in actually writing the code, should not be dismissed as trivial. You can save yourself a lot of stress by being creative in this area. The DRY principle is somewhat connected to this. If you repeat yourself, you not only work inefficiently (you’re doing stuff twice), but you also set a trap for yourself, one you are bound to step into later in the project.

Bear trap

Figure 2: Set up for lazy programmers

Overall the book is great and I can definitely recommend it. It’s a little over 200 pages, so it shouldn’t take a year to read. It’s also very well written and full of jokes, which makes it fun to read!

Errors as Part of Interface

I was writing this code the other day. It’s a very small program, a POP3 client that downloads messages. And I just couldn’t come up with an easy and consistent way to report errors. I wanted something lightweight, but something that actually makes sense. I was looking through some code, hoping that someone else had a good strategy I could rip off. From what I saw, the most common strategy is none whatsoever. Well, I didn’t like that one bit …

But worry no longer! Steve McConnell came to aid a coder in distress once again. I looked into my new copy of Code Complete and here’s what I found:

Throw exceptions at the right level of abstraction.

This statement makes a very interesting point. The errors that can occur in your code, regardless of whether they are thrown as exceptions or returned as status codes, should be at the same level of abstraction as the unit, class or even routine that they happen in. For example, imagine that a function called downloadAndPrintReport() exits with MALLOC_FAILED. You see, this just isn’t right. The malloc failure is the cause of the problem, not the problem itself, and you (or the user) cannot react to it appropriately. I mean, which malloc() call failed? Does it mean the report wasn’t even downloaded, or that it was downloaded but wasn’t printed? What the hell is malloc anyway? The user doesn’t know!

Conclusion

Your error reports should be informative and useful to the receiver (which can be either a user or some parent code that deals with the error). By sticking to the current abstraction, your chances of delivering a good report rapidly grow. When downloadAndPrintReport() returns with UNABLE_TO_DOWNLOAD_REPORT, you can try to reopen the connection and try again later. In case of UNABLE_TO_PRINT_REPORT you can store it somewhere in a file instead of printing it.
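
Here is a minimal sketch of that idea in Python, with exceptions standing in for the status codes. Everything in it is hypothetical: the exception classes, download_and_print_report() and the run() helper are my own illustration (with names adapted to Python conventions), not code from Code Complete or from the POP3 client. The low-level failure is kept as the cause, while the caller only sees errors at the level of "report".

```python
class ReportError(Exception):
    """Base class for errors at the 'report' level of abstraction."""


class UnableToDownloadReport(ReportError):
    pass


class UnableToPrintReport(ReportError):
    def __init__(self, report):
        super().__init__("unable to print report")
        self.report = report  # kept so the caller can store it elsewhere


def download_and_print_report(download, print_report):
    """Hypothetical routine; both arguments are callables supplied by the caller."""
    try:
        report = download()
    except (OSError, MemoryError) as cause:
        # The low-level failure is the cause, not the problem we report.
        raise UnableToDownloadReport("unable to download report") from cause
    try:
        print_report(report)
    except (OSError, MemoryError) as cause:
        raise UnableToPrintReport(report) from cause


def run(download, print_report, save_to_file):
    """React to each error at the level where a reaction makes sense."""
    try:
        download_and_print_report(download, print_report)
        return "printed"
    except UnableToDownloadReport:
        return "retry later"
    except UnableToPrintReport as error:
        save_to_file(error.report)
        return "saved to file"
```

The original cause is still reachable through the exception's __cause__ attribute, so a developer can dig deeper while the caller never has to know what malloc is.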

Best Practices in Error Handling

According to Murphy’s law, “Anything that can go wrong will go wrong”. And if Mr. Murphy were also a software engineer, he would certainly add “and anything that cannot go wrong will go wrong as well”. Wise man, that Murphy, but what does it mean for us, the programmers out there in the trenches?

Error handling and reporting is a programming nightmare. It’s extra work, it pollutes the happy path of your code with a whole bunch of weird if statements and it forces you to return sets of mysterious error codes from functions. God, I hate error reporting (more than I hate New Jersey).

It might not seem very important, but it’s crucial to set an error handling strategy and stick with it through the whole project. The error reporting code will be literally everywhere. If you choose a poor strategy in the beginning, all of your code will be condemned to be ugly and inconsistent even before you start writing it.

There are multiple problems that you need to address in error reporting. The most important thing is to deliver a useful report to the user. The error message should say what happened and why it happened. A stack trace can help you find exactly what happened, but it generally won’t make the user very happy. My personal favorite format for reporting errors in terminal apps looks like this:

<program_name>: <what_happened>: <why_it_happened>

It’s inspired by the error reporting format of GNU coreutils. The first section is always the program name, so the user knows who the message is coming from. The second section says what happened or what the error prevented from happening (e.g. “Cannot load configuration” or “Unable to establish remote connection”). Finally, the last section informs the user about the cause of the inconvenience, for instance “File ‘configuration.txt’ not found” or “Couldn’t resolve remote address”.

This format gives the user complete insight into what happened, yet it won’t scare him off with too much programming detail. In fact, revealing too much about your errors (stack traces, memory dumps etc.) might be a potential security risk.
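
The three-part format is easy to wrap in a tiny helper. The following Python sketch is only an illustration; format_error(), report_error() and the program name "pop3fetch" are hypothetical names I picked for the example.

```python
import sys


def format_error(program, what, why):
    """Build a '<program_name>: <what_happened>: <why_it_happened>' line."""
    return f"{program}: {what}: {why}"


def report_error(what, why, program="pop3fetch", stream=None):
    """Write the formatted message to stderr (or to the given stream)."""
    if stream is None:
        stream = sys.stderr
    print(format_error(program, what, why), file=stream)


if __name__ == "__main__":
    report_error("Cannot load configuration",
                 "File 'configuration.txt' not found")
```

Keeping the formatting in one function means every message in the program comes out in the same shape, whichever module reports it.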

Another criterion for evaluating an error reporting strategy is how well it blends with the code. Generally, there are two approaches: centralized and decentralized error handling.

Centralized

The centralized way involves some sort of central database or storage of errors (usually a module or a header file) where all the error messages are stored. In the code, you pass an error code to some function and it does all the work for you.

A big plus of this approach is that you have everything in one place. If you want to change something in error reporting, you know exactly where to go. On the other hand, everything in your software will depend on this one component. When you decide to reuse some code, you’ll need to take the error handling code with it. Also, as your program grows, the number of errors will grow as well, which can result in a huge pile of code in one place that is very vulnerable to mistakes (since everyone will want to edit it to add his own errors).
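
A centralized scheme can be as small as one module with a table of codes and messages. This is a rough Python sketch under my own assumptions; the ErrorCode values, the MESSAGES table and describe() are invented for illustration.

```python
from enum import Enum


class ErrorCode(Enum):
    # Hypothetical error codes for a small POP3 client.
    CONFIG_NOT_FOUND = 1
    CONNECTION_FAILED = 2
    OUT_OF_MEMORY = 3


# The single place where every message lives.
MESSAGES = {
    ErrorCode.CONFIG_NOT_FOUND: "Cannot load configuration: file not found",
    ErrorCode.CONNECTION_FAILED: "Unable to establish remote connection",
    ErrorCode.OUT_OF_MEMORY: "Out of memory",
}


def describe(code):
    """Look up the message for an error code."""
    return MESSAGES.get(code, "Unknown error")
```

Both the benefit and the drawback are visible here: changing a message is a one-line edit, but every module that reports an error has to import this one and add its codes to it.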

Decentralized

The decentralized approach to error reporting puts errors in the places where they can happen. They’re part of the interface of the respective modules. In C, every module (sometimes even every function) would have its own set of error codes. In C++, a class would have a set of exceptions associated with it.

In my opinion, it’s a little harder to maintain and keep consistent than the centralized way, but if you have the discipline to stick with it, it results in elegant and independent code. Somebody could say that there will be a lot of duplicates of the (let’s say) 5 most common errors, like IO failures and memory errors. Well, this is a problem of decentralized error reporting. You can minimize it by keeping your errors in context with the abstraction of the interface they belong to. For instance, class Socket will throw exceptions like ConnectionError or ReceivingError, not MallocError, FileError or even UnknownError. A malloc failure is the cause that resulted in a problem with reading data, so from the point of view of the Socket class it’s a reading error.
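
In Python, one way to make the errors part of a class interface is to nest them in the class itself. The sketch below is an assumption of mine rather than an established recipe; the Socket class, its nested exceptions and the transport object with a read() method are all hypothetical.

```python
class Socket:
    """Hypothetical socket wrapper that owns its set of errors."""

    class Error(Exception):
        """Base class for everything this class can raise."""

    class ConnectionError(Error):
        pass

    class ReceivingError(Error):
        pass

    def __init__(self, transport):
        # `transport` is any object with a read(size) method (an assumption).
        self._transport = transport

    def receive(self, size):
        try:
            return self._transport.read(size)
        except (OSError, MemoryError) as cause:
            # From the Socket's point of view this is a receiving problem,
            # whatever the low-level cause was.
            raise Socket.ReceivingError(f"cannot receive {size} bytes") from cause
```

A caller that only cares whether the socket worked can catch Socket.Error, and the class can be reused elsewhere without dragging a shared error module along.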

These are the two basic ways of error handling. I will write separate posts about a few concrete common strategies that I know and find useful or at least good to know (exceptions, error codes etc.).

Introduction to Computer Science

In the upcoming semester I’ll be taking a class called Theoretical Computer Science, which is said to be the hardest thing you can attend here at BUT. Only half the people pass the bar every year. It’s brutal. And since I’d really like to be a part of the lucky half, I thought I could dig into the theory a little earlier and see how bad it is.

So Computer Science, right? What the hell is all that about? Theoretical computer science, or theoretical information technology (as some people call it), is a formal foundation for the things we like to call computers. What do computers do? They essentially compute stuff. Current computer hardware is built and programmed to work with numbers. Who cares about a bunch of numbers? But the numbers represent some sort of information. In broader terms, computers work with information. A computer takes some information and transforms it into other information. Here’s what Wikipedia says:

Computer science deals with the theoretical foundations of information, computation, and with practical techniques for their implementation and application. Computer scientists invent algorithmic processes that create, describe, and transform information and formulate suitable abstractions to model complex systems.

Right, so we have information. How do people work with it anyway? People share their thoughts through languages. We talk and sometimes listen, we also read and sometimes write. Other people like to draw, play charades or sing. In all of these cases the information is encoded into some sort of language that others can understand. Sure, we’re people, it comes naturally to us, but what about the machines? A coffeemaker will most certainly not learn to talk to you or even understand your needs. Machines are dumb. That’s where we (the nerdy guys from the engineering department) come in with formal languages, a substantial part of computer science. By formalizing common languages we make the machines understand our instructions.

Ever heard of a programming language? A programming language is basically a formal notation for the sequences of instructions we use to tell computers what to do. It’s a language that we use when we talk to computers. It sounds a little weird, but that’s it. Computer science sets up some ground rules so that computers can algorithmically analyze the instructions and process them. It also explores the possibilities of computers: what can be processed by a computer? Is there anything that computers can’t solve, and why? A bunch of interesting stuff.

Interesting, but sometimes very hard to understand. I wanted to put some basics here too, but I don’t want to scare you off too early. I got stuck for a while with the very basics at first (I figured it out eventually). The math will come in a stand-alone post shortly after. Stay tuned ;-).

Code Complete!

I got a copy of this awesome book today and I’m so excited, I have to write a post about it :-)!

Yeah, dude, you got yourself a book, whatever.

But it’s in English! I read this holy bible of programming already. But it was only the crappy Czech translation, which is not only a lot worse, it’s actually more expensive than the original version.

The only thing that sucks about this book is that it’s a paperback. Why the hell does anyone ship a thousand-page book in a paperback version only? The explanation is very simple. In fact, it’s written right on the cover. Let’s see if you can figure it out! Here’s a couple of pictures:

Code Complete by Steve McConnell  Code Complete by Steve McConnell

You got it! Microsoft Press. Who else could possibly screw up this otherwise perfect piece of literature? Even the Czech version comes in hardcover edition. Yeah, I’m kind of a Microsoft hater. Other than that the book is just amazing. I recommend it to everyone who is serious about software development.

Design Patterns: Bridge

Today I’m going to write some examples of Bridge. The design pattern, not the game. Bridge is a structural pattern that decouples an abstraction from the implementation of some component so that the two can vary independently. The bridge pattern can also be thought of as two layers of abstraction [3].

The Bridge pattern is useful when you need to switch between multiple implementations at runtime. Another great case for using Bridge is when you need to couple a pool of interfaces with a pool of implementations (e.g. 5 different interfaces for different clients and 3 different implementations for different platforms). You need to make sure that there’s a solution for every type of client on each platform. Done naively, this leads to a very large number of classes in the inheritance hierarchy (up to 5 × 3 = 15 in this example) doing virtually the same thing. With Bridge, the implementation of the abstraction is moved one step back and hidden behind another interface. This allows you to outsource the implementation into another (orthogonal) inheritance hierarchy. The original inheritance tree uses the implementation through the bridge interface. Let’s have a look at the diagram in Figure 1.

Bridge pattern UML diagram

Figure 1: Bridge pattern UML diagram

As you can see, there are two orthogonal inheritance hierarchies. The first one is behind ImplementationInterface. This implementation is injected using aggregation through Bridge class into the second hierarchy under the AbstractInterface. This allows having multiple cases coupled with multiple underlying implementations. The Client then uses objects through AbstractInterface. Let’s see it in code.

C++

#include <iostream>

/* Implemented interface. */
class AbstractInterface
{
    public:
        virtual void someFunctionality() = 0;
};

/* Interface for internal implementation that Bridge uses. */
class ImplementationInterface
{
    public:
        virtual void anotherFunctionality() = 0;
};

/* The Bridge */
class Bridge : public AbstractInterface
{
    protected:
        ImplementationInterface* implementation;

    public:
        Bridge(ImplementationInterface* backend)
        {
            implementation = backend;
        }
};

/* Different special cases of the interface. */

class UseCase1 : public Bridge
{
    public:
        UseCase1(ImplementationInterface* backend)
          : Bridge(backend)
        {}

        void someFunctionality()
        {
            std::cout << "UseCase1 on ";
            implementation->anotherFunctionality();
        }
};

class UseCase2 : public Bridge
{
    public:
        UseCase2(ImplementationInterface* backend)
          : Bridge(backend)
        {}

        void someFunctionality()
        {
            std::cout << "UseCase2 on ";
            implementation->anotherFunctionality();
        }
};

/* Different background implementations. */

class Windows : public ImplementationInterface
{
    public:
        void anotherFunctionality()
        {
            std::cout << "Windows" << std::endl;
        }
};

class Linux : public ImplementationInterface
{
    public:
        void anotherFunctionality()
        {
            std::cout << "Linux!" << std::endl;
        }
};

int main()
{
    AbstractInterface *useCase = 0;
    ImplementationInterface *osWindows = new Windows;
    ImplementationInterface *osLinux = new Linux;

    /* First case */
    useCase = new UseCase1(osWindows);
    useCase->someFunctionality();

    useCase = new UseCase1(osLinux);
    useCase->someFunctionality();

    /* Second case */
    useCase = new UseCase2(osWindows);
    useCase->someFunctionality();

    useCase = new UseCase2(osLinux);
    useCase->someFunctionality();

    return 0;
}

Download complete source file from github.

Python


class AbstractInterface:

    """ Target interface.

    This is the target interface that clients use.
    """

    def someFunctionality(self):
        raise NotImplementedError()

class Bridge(AbstractInterface):

    """ Bridge class.

    This class forms a bridge between the target
    interface and background implementation.
    """

    def __init__(self, implementation):
        # A single underscore keeps the attribute visible to subclasses;
        # a double underscore would be name-mangled per class.
        self._implementation = implementation

class UseCase1(Bridge):

    """ Variant of the target interface.

    This is a variant of the target Abstract interface.
    It can do something a little differently and it can
    also use various background implementations through
    the bridge.
    """

    def someFunctionality(self):
        print("UseCase1: ", end="")
        self._implementation.anotherFunctionality()

class UseCase2(Bridge):
    def someFunctionality(self):
        print("UseCase2: ", end="")
        self._implementation.anotherFunctionality()

class ImplementationInterface:

    """ Interface for the background implementation.

    This class defines how the Bridge communicates
    with various background implementations.
    """

    def anotherFunctionality(self):
        raise NotImplementedError()

class Linux(ImplementationInterface):

    """ Concrete background implementation.

    A variant of background implementation, in this
    case for Linux!
    """

    def anotherFunctionality(self):
        print("Linux!")

class Windows(ImplementationInterface):
    def anotherFunctionality(self):
        print("Windows.")

def main():
    linux = Linux()
    windows = Windows()

    # Couple of variants under a couple
    # of operating systems.
    useCase = UseCase1(linux)
    useCase.someFunctionality()

    useCase = UseCase1(windows)
    useCase.someFunctionality()

    useCase = UseCase2(linux)
    useCase.someFunctionality()

    useCase = UseCase2(windows)
    useCase.someFunctionality()

if __name__ == "__main__":
    main()

Download complete source file from github.

Summary

The Bridge pattern is very close to the Adapter in its structure, but there’s a huge difference in semantics. Bridge is designed up-front to let the abstraction and the implementation vary independently. Adapter is retrofitted to make unrelated classes work together [1].

Sources

  1. http://sourcemaking.com/design_patterns/bridge
  2. http://www.oodesign.com/bridge-pattern.html
  3. http://en.wikipedia.org/wiki/Bridge_pattern

Design Patterns: Adapter

And back to design patterns! Today it’s time to start with structural patterns, since I have finished all the creational patterns. What are those structural patterns anyway?

In Software Engineering, Structural Design Patterns are Design Patterns that ease the design by identifying a simple way to realize relationships between entities.

The first among the structural design patterns is Adapter. The name for it is totally appropriate, because it does exactly what any other real-life thing called adapter does. It converts some attribute of one device so it is usable together with another one. Most common adapters are between various types of electrical sockets. The adapters usually convert the voltage and/or the shape of the connector so you can plug-in different devices.

The software adapters work exactly like the outlet adapters. Imagine having a (possibly third-party) class or module you need to use in your application. It’s poorly coded and it would pollute your nicely designed code. But there’s no other way, you need its functionality and don’t have time to write it from scratch. The best practice is to write your own adapter and wrap the old code inside of it. Then you can use your own interface and therefore reduce your dependence on the old ugly code.

This is especially true when the code comes from a third-party module you have no control over whatsoever. Its authors could change something, which would break your code in many places. That’s just unacceptable.

Adapter pattern example

UML example of adapter pattern

Here is an example class diagram of adapter use. You see there is some old interface which the adapter uses. On the other end, there is a new target interface that the adapter implements. The client (i.e. your app) then uses the daisy-fresh new interface. For more explanation, see the source code examples below.

C++

#include <iostream>

typedef int Cable; // wire with electrons

/* Adaptee (source) interface */
class EuropeanSocketInterface
{
    public:
        virtual int voltage() = 0;

        virtual Cable live() = 0;
        virtual Cable neutral() = 0;
        virtual Cable earth() = 0;
};

/* Adaptee */
class Socket : public EuropeanSocketInterface
{
    public:
        int voltage() { return 230; }

        Cable live() { return 1; }
        Cable neutral() { return -1; }
        Cable earth() { return 0; }
};

/* Target interface */
class USASocketInterface
{
    public:
        virtual int voltage() = 0;

        virtual Cable live() = 0;
        virtual Cable neutral() = 0;
};

/* The Adapter */
class Adapter : public USASocketInterface
{
    EuropeanSocketInterface* socket;

    public:
        void plugIn(EuropeanSocketInterface* outlet)
        {
            socket = outlet;
        }

        int voltage() { return 110; }
        Cable live() { return socket->live(); }
        Cable neutral() { return socket->neutral(); }
};

/* Client */
class ElectricKettle
{
    USASocketInterface* power;

    public:
        void plugIn(USASocketInterface* supply)
        {
            power = supply;
        }

        void boil()
        {
            if (power->voltage() > 110)
            {
                std::cout << "Kettle is on fire!" << std::endl;
                return;
            }

            if (power->live() == 1 && power->neutral() == -1)
            {
                std::cout << "Coffee time!" << std::endl;
            }
        }
};

int main()
{
    /* Objects live on the stack, so nothing needs to be deleted. */
    Socket socket;
    Adapter adapter;
    ElectricKettle kettle;

    /* Plugging in. */
    adapter.plugIn(&socket);
    kettle.plugIn(&adapter);

    /* Having coffee */
    kettle.boil();

    return 0;
}

Download example from Github.

Python

# Adaptee (source) interface
class EuropeanSocketInterface:
    def voltage(self): pass

    def live(self): pass
    def neutral(self): pass
    def earth(self): pass

# Adaptee
class Socket(EuropeanSocketInterface):
    def voltage(self):
        return 230

    def live(self):
        return 1

    def neutral(self):
        return -1

    def earth(self):
        return 0

# Target interface
class USASocketInterface:
    def voltage(self): pass

    def live(self): pass
    def neutral(self): pass

# The Adapter
class Adapter(USASocketInterface):
    __socket = None

    def __init__(self, socket):
        self.__socket = socket

    def voltage(self):
        return 110

    def live(self):
        return self.__socket.live()

    def neutral(self):
        return self.__socket.neutral()

# Client
class ElectricKettle:
    __power = None

    def __init__(self, power):
        self.__power = power

    def boil(self):
        # print() calls keep the example runnable on Python 3
        if self.__power.voltage() > 110:
            print("Kettle on fire!")
        elif (self.__power.live() == 1 and
              self.__power.neutral() == -1):
            print("Coffee time!")
        else:
            print("No power.")

def main():
    # Plug in
    socket = Socket()
    adapter = Adapter(socket)
    kettle = ElectricKettle(adapter)

    # Make coffee
    kettle.boil()

    return 0

if __name__ == "__main__":
    main()

Download example from Github.

Summary

The adapter uses the old rusty interface of a class or a module and maps its functionality to a new interface that is used by the clients. It’s a kind of wrapper for the crappy code so it doesn’t get your code dirty.

UML Class Diagram

The class diagram is a very important part of UML. It’s a structure diagram and its purpose is to display the classes in a system with all the relationships between them. In my opinion it’s the most popular type of diagram in software development.

Drawing a class diagram of your design really helps you see the problem in broader terms. By writing it down you free space in your head for new ideas :-). It is also easier for others to understand when you want to discuss the problem with them. The thing is, I often find myself wondering about the syntax when I read someone else’s diagrams. That’s why I decided to make a little cheat sheet here to remind me.

Class

The class is the key component of a class diagram. Classes are shown as nodes, usually drawn as boxes. Here is an example of one. Each class can have methods and attributes defined. The convention is shown in Figure 1.

UML diagram: A class

Figure 1: A class

Inheritance

In UML terms, class inheritance is a relationship of generalization. It represents an “is a” relationship at the class level. Figure 2 shows how to portray generalization.

UML diagram: Inheritance

Figure 2: Inheritance
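
In code, generalization is plain class inheritance. Here is a minimal Python sketch of it. The class names `Animal` and `Duck` are made up for illustration and do not come from the figure.

```python
# Hypothetical classes, chosen only to illustrate generalization.
class Animal:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an animal"


class Duck(Animal):  # a Duck "is a" Animal
    def describe(self):
        # Reuse the parent's behaviour and extend it.
        return super().describe() + " that quacks"


duck = Duck("Donald")
print(isinstance(duck, Animal))
print(duck.describe())
```

Anywhere an `Animal` is expected, a `Duck` can be used as well, which is exactly what the generalization arrow expresses.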

Realization

UML has a different relationship for interfaces. When you inherit from an interface you implement it, which in UML terms is a relationship of realization. Its visual appearance is similar to inheritance, but the line is dashed. Also, the interface class should be marked as abstract (its name written in italics). See Figure 3.

UML diagram: Realization

Figure 3: Realization
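
Python has no dedicated interface keyword, so one common way to approximate realization is an abstract base class from the standard `abc` module. The sketch below assumes a hypothetical `Shape` interface and a `Square` class that realizes it.

```python
from abc import ABC, abstractmethod


class Shape(ABC):  # hypothetical interface: declares, does not implement
    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Square(Shape):  # Square realizes (implements) Shape
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


print(Square(3).area())
```

Trying to instantiate `Shape` directly raises a `TypeError`, because its abstract method has no implementation.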

Association

Another form of relationship in a class diagram is association. It’s an object-level relationship (i.e. it happens between objects of the associated classes), so the whole relationship represents a family of links. There are also stronger, more specific types of association (aggregation and composition).

UML diagram: Association

Figure 4: Association
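
A plain association usually ends up in code as one object holding references to others. In this rough Python sketch, `Teacher` and `Student` are hypothetical classes, and each stored reference is one link of the association.

```python
# Hypothetical classes illustrating a plain association.
class Student:
    def __init__(self, name):
        self.name = name


class Teacher:
    def __init__(self, name):
        self.name = name
        self.students = []  # links to associated Student objects

    def teach(self, student):
        # Creating a link between this teacher and one student.
        self.students.append(student)


teacher = Teacher("Ms. Smith")
bob = Student("Bob")
teacher.teach(bob)
print([s.name for s in teacher.students])
```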

Aggregation

Aggregation is a stronger and more specific form of association. It’s a “has a” relationship. The graphical representation of aggregation is shown in Figure 5.

UML diagram: Aggregation

Figure 5: Aggregation
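
One way to read aggregation in code: the whole receives its parts from outside and only keeps references to them. The `Team` and `Player` classes below are assumptions made for this sketch.

```python
# Hypothetical classes illustrating aggregation ("has a").
class Player:
    def __init__(self, name):
        self.name = name


class Team:
    def __init__(self, players):
        # The team does not create its players, it only refers to them.
        self.players = list(players)


alice = Player("Alice")
team = Team([alice])
del team           # the team is gone ...
print(alice.name)  # ... but the player is still around
```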

Composition

An even stronger form of aggregation is composition. Instead of “has a” it represents “owns a”. It’s suited to relationships where one object can only exist as a part of another. For example, if a plane has a wing, it’s a composition. What would you do with a wing alone, right? The plane owns it. But when a pond has some ducks in it, it’s an aggregation. The ducks will survive without the pond (though probably not as happily). And a pond will still be a pond with or without ducks. The graphical representation of composition is virtually the same as for aggregation, only the diamond is filled (see Figure 6).

UML diagram: Composition

Figure 6: Composition
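
In code, composition typically means the owner creates its parts itself and does not hand them out. Python does not enforce ownership, so treat the following as a convention-based sketch of the plane and wing example. The class and method names are my own.

```python
# Sketch of composition ("owns a"); names are illustrative only.
class Wing:
    def __init__(self, side):
        self.side = side


class Plane:
    def __init__(self):
        # The plane builds its own wings; they exist only as its parts.
        self._wings = (Wing("left"), Wing("right"))

    def wing_sides(self):
        return [wing.side for wing in self._wings]


plane = Plane()
print(plane.wing_sides())
```

When the `Plane` object goes away, nothing else refers to its wings, so they go away with it.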

Dependency

The last type of relationship is dependency. It’s weaker than association and it says that a class uses another one and is therefore dependent on it. Dependency is appropriate, for example, when an instance of a class is stored as a local variable inside another class’s method, or when only some static methods are used, so the classes are not associated but one depends on the other.

UML diagram: Dependency

Figure 7: Dependency
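
The static-method case can be sketched like this in Python. `Formatter` and `Report` are hypothetical names. Note that `Report` never stores a `Formatter`; it only calls it from inside a method.

```python
# Hypothetical classes illustrating a dependency.
class Formatter:
    @staticmethod
    def shout(text):
        return text.upper() + "!"


class Report:
    def __init__(self, title):
        self.title = title

    def render(self):
        # Report depends on Formatter, but holds no reference to it.
        return Formatter.shout(self.title)


print(Report("done").render())
```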

Cheat Sheet

I did all the examples in an open-source diagram editor called Dia. I recommend it by the way. And because it’s such a wonderful editor, here’s a complete cheat sheet (if you’d like to print it).

UML Class Diagram Cheat Sheet
