The white paper about Porting to Linux is below. You can start with the table of contents, the beginning, or the technical issues. Separately, you can request a similar white paper, about porting of commercial software, by requesting “Software Porting Best Practices.”
If you are considering a port of a commercial software system, whether or not you are considering outside help, we’d be pleased to discuss the software engineering and planning issues. Talk with the author of the white papers, or other architects and senior software engineers with analogous experiences.
Feedback and suggestions are welcome.
Linux has emerged as an important target and deployment platform for software vendors. There is no longer any question that Linux-based systems can deliver reliability, performance, and efficiency, and that Linux can be used for servers, desktops, and embedded systems. Consequently, software vendors and developers can no longer ignore Linux; customers are demanding it, and the market for Linux systems is expanding rapidly.
With reliability and performance increasingly seen as fully competitive with Windows and UNIX, Linux owes its popularity primarily to the fact that it is free, with no licensing fees. It enjoys wide industry support. As hardware price-performance ratios continue to improve, Linux can minimize commercial software licensing fees that would otherwise become a dominant percentage of system costs.
Software vendors and developers are finding it essential to develop plans and strategies to migrate their software from UNIX, Windows, and proprietary systems to Linux in order to meet market demand and to exploit the Linux opportunity. This white paper, after quickly exploring the current Linux environment and factors behind its success, describes the project and technical challenges that must be considered when migrating to Linux, and provides solutions to those challenges.
Linux is no longer the domain of “hackers” alone; it is a serious and complete operating system that can take on the most demanding challenges. Not only is Linux widely supported by major system vendors and application suppliers, but it can also provide fully competitive performance and reliability, along with reduced costs and platform independence, in nearly every conceivable application area.
Note: The term Linux, as used here, refers not just to the core operating system, commands, and utilities. It also refers to the large array of open source software that Linux distributors, such as Red Hat, integrate, distribute, and support.
A few factors will illustrate the wide industry support for Linux.
· Major system vendors and integrators, including IBM, HP, Compaq, and now even Sun, support Linux on their hardware platforms. For example, you can acquire an IBM server with Linux as the operating system, and IBM will provide the same full support that you would have if you acquired MVS or AIX.
· Linux is available for Intel systems as well as for systems based on Sun, IBM, Compaq, HP, and other processors.
· Software drivers are available for nearly all important devices.
· The open source software movement provides a wide array of Linux development tools, middleware, database systems, web servers, and other essential system software components. These components are maturing and improving rapidly and are, in many cases, competitive with their commercial counterparts.
There are numerous Linux trade shows, magazines, journals, and discussion groups, all providing support, information, and stimulus to the growing Linux market.
Linux and, more generally, open source software have matured to the point where quality, reliability, and performance are no longer serious concerns.
Linux, even in the latest releases, still contains occasional bugs. The same, however, is true of nearly all vendor systems. Furthermore, the open source community can, in many cases, provide rapid fixes and workarounds, so that it is not necessary to wait for the next vendor patch or release to obtain a bug fix.
Other factors contribute to Linux and open source software quality, including:
· Major system software developers, including IBM, Sun, and HP, now participate in and contribute to the open source process, combining the best aspects of experience, professionalism, and academic and industrial research with the “hacker” enthusiasm, vision, and skill that created the open source movement in the first place.
· Extensive documentation is available online at a large number of locations, and a quick web search will find information on nearly any conceivable technical topic.
Software developers can obtain the actual Linux source code, which allows them either to produce fixes or, at least, develop workarounds to problems. This is not possible with proprietary vendor software. With access to the source code, developers can also develop strategies to exploit the system to improve performance.
Linux systems are now running demanding applications around the clock, every day of the year. Desktop systems run continuously, without restarts. Many advocates would even say that Linux is more reliable than many proprietary systems.
Applications using the latest Linux releases can generally achieve the same performance levels as they can running under Windows or UNIX on comparable hardware. This is true even for server-based applications that depend on efficient multithreading and symmetric multiprocessing (SMP).
The open source community provides extensive online documentation and commercial publications. For example, O’Reilly, a leading publisher of UNIX-related books, is a leader in the Linux community and publishes a complete set of titles.
Equally important, Linux is, for all practical purposes, UNIX and complies with relevant POSIX and other standards. Programmers, users, and administrators do not need to learn new APIs, utilities, and commands. This conformance to familiar standards is not only a convenience; it is also another factor that contributes to total quality.
There is one additional advantage that is often overlooked. Linux, and hence its applications, operates on systems from all major vendors. In the past, developers have spent extensive resources porting applications to multiple platforms, including multiple UNIX platforms, where system differences, large and small, create a large development, maintenance, and support burden. A single Linux port, on the other hand, can be compiled to run on nearly any hardware platform.
2.3 Linux Efficiencies
Users demand Linux for one primary reason: it is free. Not only is the operating system itself free, but there is a vast array of software development tools, utilities, database systems, and even desktop applications that can be downloaded over the Internet or acquired inexpensively from Linux distributors such as Red Hat and SuSE.
Hardware prices continue to decline, but system vendors continue to charge significant fees for the operating system itself as well as software development tools, database systems, support, and more. These fees can overwhelm the hardware costs, so license-free software can be irresistible.
ArrAy has recently completed several client engagements that required porting large C++ systems from several UNIX versions, including Sun Solaris, and Windows NT (Win32) to Linux. These projects revealed that:
· Solaris and other UNIX code ports require some specialized code modification and conditional compilation to account for Linux and Solaris behavior differences. The effort is comparable to porting to another vendor’s UNIX.
· Acceptance and performance tests passed, and Linux performance easily met or exceeded goals established on the basis of NT and Solaris performance on similar hardware.
· Achieving performance goals can, however, require some Linux-specific code as well as small amounts of assembly language code.
· Solaris actually required more compiler and operating system workarounds than Linux, and there were several extremely complex Solaris compiler bugs.
· Linux is fully capable of handling complex, multithreaded, distributed applications running in a heterogeneous environment.
· Linux is, in many respects, as scalable as the commercial systems, although there are limitations, as described next.
For all these reasons, Linux is an attractive target platform for software developers. Nonetheless, there are some difficulties, as well as opportunities. Expertise and experience are necessary for a successful porting project, even if your code is already ported to NT and several UNIX versions.
1. As usually distributed, Linux has some significant resource limitations, such as a limited number of processes and threads, which can hinder scalability.
2. While it is usually possible to overcome problems and limitations by rebuilding the kernel or other components, most vendors and their customers will prefer to use standard distributions as shipped on CDs. These constraints can narrow the applicability of Linux solutions and can also cause customer support issues.
3. Development can require careful tradeoffs in compiler and tool version selection.
4. There are some areas where the standards are not fully or correctly implemented, or where nonstandard techniques are required to access system features.
5. You do not have a direct commercial relationship with the software vendor, so there is no leverage when asking for enhancements, improvements, or bug fixes.
6. Other required middleware, development tools, databases, etc. may not be available, although Linux is rapidly gaining widespread support from both the commercial and open source communities.
Therefore, a successful Linux project and product will require customer support and communication.
Before starting a Linux porting project, it is important to establish project objectives. This section discusses some of the important decisions when setting project objectives.
In most cases, developers will not wish to limit their software exclusively to Linux. Most porting projects are ports to Linux, not ports away from Windows and UNIX. UNIX and Windows will probably remain as target platforms, so it is necessary to minimize the amount of target-specific code. Determining the current and future target platforms, such as other UNIX systems, is essential to success.
There are a number of questions regarding “which Linux” to use.
Currently, the Linux kernel version is 2.4, and customers will expect software to run on this kernel. Determine up front whether or not operation is required on older kernels.
Several vendors distribute Linux, including Red Hat (the most popular distribution in the US), SuSE (which is common in Europe), Caldera, and Debian. Different vendors support different distributions. For example, HP supports Debian, and Compaq supports Red Hat, Caldera, and SuSE. To complicate matters even more, some users will download and install Linux themselves, bypassing the distributors.
Assuming that there is no single distribution that you can require of your users, then it is best to specify the required kernel version and qualify the software on several distributions. If kernel, runtime library, and other requirements are met, there should be no difficulty supporting additional distributions.
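A required kernel version can also be checked at run time, so that an unsupported installation is reported clearly rather than failing in obscure ways. The following is a minimal sketch, assuming the POSIX uname() call and a release string that begins with "major.minor"; the function names parse_kernel_release and kernel_at_least are hypothetical and chosen only for illustration.

```cpp
#include <cstdio>
#include <sys/utsname.h>

// Parse the leading "major.minor" of a release string such as "2.4.18-3".
// Returns true only if both numbers were found.
bool parse_kernel_release(const char* release, int* major_out, int* minor_out) {
    return std::sscanf(release, "%d.%d", major_out, minor_out) == 2;
}

// Returns true if the running kernel reports at least the required version.
// Returns false if the version cannot be determined.
bool kernel_at_least(int required_major, int required_minor) {
    struct utsname info;
    if (uname(&info) != 0) {
        return false;
    }
    int major_version = 0;
    int minor_version = 0;
    if (!parse_kernel_release(info.release, &major_version, &minor_version)) {
        return false;
    }
    if (major_version != required_major) {
        return major_version > required_major;
    }
    return minor_version >= required_minor;
}
```

An installer or startup routine could call kernel_at_least(2, 4) and refuse to continue, with a clear message, if the result is false.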
Linux has been ported to nearly all modern processor architectures, and the GNU compilers have back-end code generators for the different processors. While Intel (including IA-64) is the most common, you can run Linux on Sun SPARC, Compaq Alpha, and so on. The target processors should be identified as early as possible in the project, and separate builds and qualification are required for each.
Linux operates on 64-bit processors as well. Determine whether there is a requirement for the ported software to run in 64-bit mode. If the software has already been adapted for 64-bit UNIX operation, then there shouldn’t be a need for any additional changes.
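One common precaution for code that must build in both 32-bit and 64-bit modes is to use fixed-width integer types wherever the size of a data item matters, and to let the compiler verify the assumptions. The sketch below assumes a compiler that provides <cstdint> and static_assert; the RecordHeader structure and the fits_in_int32 helper are hypothetical examples, not part of any particular system.

```cpp
#include <cstdint>
#include <limits>

// A record header whose field sizes do not depend on the build mode.
struct RecordHeader {
    std::int32_t record_id;
    std::int32_t flags;
    std::int64_t byte_offset;
};

// Fail the build, rather than corrupt data, if the layout is not as expected.
static_assert(sizeof(RecordHeader) == 16,
              "RecordHeader layout differs on this platform");

// Returns true if a 64-bit value can be stored in a 32-bit field without loss.
bool fits_in_int32(std::int64_t value) {
    return value >= std::numeric_limits<std::int32_t>::min() &&
           value <= std::numeric_limits<std::int32_t>::max();
}
```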
In nearly all cases, it is best to preserve component interfaces, data formats, etc. and to maintain interoperability between software versions running on different platforms.
Technically, the principal barriers to achieving this objective will be byte ordering and data alignment.
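Both barriers can be sidestepped by reading and writing shared data one byte at a time in an agreed order. The following minimal sketch assumes big-endian ("network") order as the agreed format; the function names store_u32_be and load_u32_be are hypothetical. Because only individual bytes are accessed, the buffer position does not need to be aligned.

```cpp
#include <cstdint>

// Write a 32-bit value to 'out' in big-endian order, one byte at a time.
// Works regardless of host byte order and of the alignment of 'out'.
void store_u32_be(unsigned char* out, std::uint32_t value) {
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(value & 0xFF);
}

// Read a 32-bit big-endian value from 'in', one byte at a time.
std::uint32_t load_u32_be(const unsigned char* in) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
           static_cast<std::uint32_t>(in[3]);
}
```

The same pattern extends to 16-bit and 64-bit fields. It trades a small amount of speed for data files and messages that remain interchangeable across processors.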
In performing the port, there will be numerous opportunities to simplify Win32 and other system-specific code and exploit new levels of standardization and new open source libraries.
Existing code, particularly Windows code, frequently has a large number of proprietary vendor dependencies. While it may not be possible to eliminate such code completely, it is possible to make some decisions that will reduce the amount of platform-specific code. If an existing system currently runs only on Windows, then it is possible to avoid at least some platform-specific code without sacrificing performance, as discussed in the section on Windows technical challenges.
Complex client/server systems may not need to be ported to Linux in one large project. In many cases, it is sufficient to port one or several components, such as a server. Client components can be ported as a separate project, and, of course, this is much easier to do if backward compatibility and interoperability are maintained.
It might even be possible to port individual server components in stages, depending upon the system architecture. For example, a business logic component might be ported to Linux in the first stage while the database server component remains on NT.
The Linux porting plan, when implemented, should result in a system that is at least as reliable, testable, and efficient as the original UNIX and Windows versions. Therefore, all existing tests and test procedures should be ported to Linux as required.
Where possible and appropriate, decisions should be consistent with industry directions. For instance, web services could be a consideration in some aspects of the overall architecture. A Linux port should not in any way impede architectural changes in future projects, but such “modernization” should usually be a separate project so as to keep the individual projects manageable.
A Linux porting project will be simplest if the software currently runs on one or more UNIX versions. If the software runs only on Windows, then the challenges will be greater.
The first section below covers source language issues that will occur regardless of whether the software currently runs on UNIX, Windows, or both. The second section covers some major issues that occur with UNIX software, regardless of whether or not it also runs on Windows. The final section deals with Windows-only software, where there may be extensive dependency on proprietary Microsoft technologies.
This section assumes that the software is written in C, C++, or both. Fortran and other languages present similar problems, and the C/C++ issues will illustrate what to expect for other languages.
The GNU C compiler, GCC (technically, the “GNU Compiler Collection,” as there are numerous versions and front ends), is the compiler of choice for Linux. In order to get a clean, warning-free build, you may find it necessary to make a number of minor source code changes. These changes, however, are no more difficult than what is required to go from one UNIX version to another or to port to or from Windows. Some of the many potential source code changes, at this time, are:
1. GCC is generally stricter than many vendor compilers in requiring #include statements for normal header files, such as stdlib.h and math.h.
2. GCC occasionally requires additional casting to remove warning, or even error, messages.
3. GCC has some minor defects with simple workarounds. For example, the Standard Template Library (STL) at() method is not supported, but the [] indexing operator can be used instead. Likewise, object comparisons, such as if (O1 != O2), may not compile and can be replaced with if (!(O1 == O2)).
4. Linux functions are thread-safe, whereas many UNIX versions require special reentrant functions, not provided by Linux, such as gethostbyname_r(). Conditional compilation is required in such situations.
5. fwprintf() (used for wide character string printing) is not available under Linux, so fputwc() will be used instead to print individual string characters.
6. Linux supports 64-bit applications, and, if you do build in 64-bit mode, be careful to account for Windows differences. For example, a Windows long int data item is 32 bits rather than 64 bits as in Linux and UNIX.
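The workarounds in item 3 can be sketched as follows. This is an illustration only: the Point type and the helper names checked_element and points_differ are hypothetical. Note that the [] operator does not check bounds, so code that relied on at() for range checking needs an explicit test such as the one shown.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

struct Point {
    int x;
    int y;
};

// Only operator== is defined; the inequality test below is built on top of it.
inline bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

// Stand-in for vector::at(): operator[] plus an explicit bounds check.
template <typename T>
const T& checked_element(const std::vector<T>& items, std::size_t index) {
    if (index >= items.size()) {
        throw std::out_of_range("checked_element: index out of range");
    }
    return items[index];
}

// Stand-in for (a != b), written as !(a == b).
inline bool points_differ(const Point& a, const Point& b) {
    return !(a == b);
}
```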
GCC is sometimes used on UNIX systems and even on Windows, so if your software has already been built using GCC, most of the required source changes will already be in place.
Software that was developed for UNIX or which has been ported to UNIX is the easiest to port to Linux. In principle, you should not need to change any software source code, make files, tool usage, or anything else, since all UNIX vendors and Linux adhere to the same standards. Unfortunately, there are nearly always some changes required for even the most carefully developed software, and this section discusses some of the most likely changes.
This section also applies to software that has been ported or developed for both Linux and Windows. There is no distinction here between different vendors’ (Sun, IBM, HP, Compaq, etc.) UNIX versions.
Server-based systems, in particular, depend on multiple threads for concurrency, performance, and programming simplicity.
UNIX and Linux programs almost always use the POSIX Pthreads API. Some older software, however, still uses vendor-specific threading libraries that predate the Pthreads standards. Since Linux provides strong Pthreads support, it is a good idea to replace non-standard thread management and synchronization libraries with Pthreads during the porting project.
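For readers replacing a vendor threading library, the core Pthreads pattern is small: create threads with pthread_create() and wait for them with pthread_join(). The sketch below is a minimal illustration, not a recommended design; SumTask, sum_worker, and parallel_sum are hypothetical names, and the work is split across two threads purely for demonstration.

```cpp
#include <pthread.h>
#include <cstddef>

// Work description and result slot for one thread.
struct SumTask {
    const int* values;
    std::size_t count;
    long total;
};

// Thread entry point: sums the slice described by the task.
extern "C" void* sum_worker(void* arg) {
    SumTask* task = static_cast<SumTask*>(arg);
    task->total = 0;
    for (std::size_t i = 0; i < task->count; ++i) {
        task->total += task->values[i];
    }
    return nullptr;
}

// Sums 'values' using two threads. Returns 0 on success, or the
// pthread_create() error code on failure.
int parallel_sum(const int* values, std::size_t count, long* result) {
    const std::size_t half = count / 2;
    SumTask tasks[2] = {
        {values, half, 0},
        {values + half, count - half, 0}
    };
    pthread_t threads[2];
    for (int i = 0; i < 2; ++i) {
        int rc = pthread_create(&threads[i], nullptr, sum_worker, &tasks[i]);
        if (rc != 0) {
            // Wait for any thread already started before reporting the error.
            for (int j = 0; j < i; ++j) {
                pthread_join(threads[j], nullptr);
            }
            return rc;
        }
    }
    for (int i = 0; i < 2; ++i) {
        pthread_join(threads[i], nullptr);
    }
    *result = tasks[0].total + tasks[1].total;
    return 0;
}
```

Each thread writes only to its own SumTask, so no locking is needed here; shared data would require a mutex.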
Recent experience shows that multithreaded Linux applications perform as well as their UNIX and Windows counterparts. This was not the case several years ago, but recent Linux kernel work has eliminated performance differences.
Linux does have a few limitations, however. Three examples will illustrate some special considerations that often require conditional compilation and present an opportunity to improve the UNIX code.
1. High-contention locking, using Pthreads mutexes, especially on multiprocessor (SMP) systems, can seriously degrade performance. This is a problem with UNIX as well, and Linux supplies a set of inline assembly language “atomic” macros that minimize the performance impact. Similar version-specific macros can be introduced into existing UNIX code as well.
2. Resource limitations and the fact that Linux uses “clone processes” for threads constrain the number of threads in a process. The standard Red Hat Linux 7.2 installation, for example, allows only 255 threads within a process, and a kernel rebuild is required to change this limit.
3. Linux provides recursive mutexes, but not as specified by the Pthreads standard, and only statically allocated mutexes can be made recursive. Fortunately, recursive mutexes can be emulated easily if required, and it is usually straightforward to avoid using them. Furthermore, the emulation is valid for UNIX as well and can improve performance.
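One way to emulate a recursive mutex, as mentioned in item 3, is to combine an ordinary mutex and a condition variable with an owner and a nesting depth. The following is a sketch under that assumption, not the emulation used in any particular project; the RecursiveMutex type and the rm_ function names are hypothetical.

```cpp
#include <pthread.h>

// Emulated recursive mutex built only from standard, non-recursive primitives.
struct RecursiveMutex {
    pthread_mutex_t guard;     // protects the fields below
    pthread_cond_t  released;  // signaled when the lock becomes free
    pthread_t       owner;     // meaningful only while 'owned' is true
    bool            owned;
    int             depth;     // number of nested locks held by the owner
};

void rm_init(RecursiveMutex* m) {
    pthread_mutex_init(&m->guard, nullptr);
    pthread_cond_init(&m->released, nullptr);
    m->owner = pthread_self();
    m->owned = false;
    m->depth = 0;
}

void rm_lock(RecursiveMutex* m) {
    pthread_mutex_lock(&m->guard);
    const pthread_t self = pthread_self();
    if (m->owned && pthread_equal(m->owner, self)) {
        ++m->depth;  // nested acquisition by the current owner
    } else {
        while (m->owned) {
            pthread_cond_wait(&m->released, &m->guard);
        }
        m->owner = self;
        m->owned = true;
        m->depth = 1;
    }
    pthread_mutex_unlock(&m->guard);
}

// Returns 0 on success, -1 if the caller does not hold the lock.
int rm_unlock(RecursiveMutex* m) {
    pthread_mutex_lock(&m->guard);
    if (!m->owned || !pthread_equal(m->owner, pthread_self())) {
        pthread_mutex_unlock(&m->guard);
        return -1;
    }
    if (--m->depth == 0) {
        m->owned = false;
        pthread_cond_signal(&m->released);
    }
    pthread_mutex_unlock(&m->guard);
    return 0;
}
```

The guard mutex is held only briefly in each call, so a thread that blocks does so on the condition variable rather than while holding the guard.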
UNIX command and utility usage is rarely an issue. In particular, make files will work under Linux. However, it is nearly always necessary to add a “target” variable that can be set from the make command line to indicate that Linux is the target platform.
Systems that are currently delivered on multiple UNIX versions probably already have a target variable and it is only necessary to add a new value and to specify the compiler and its options.
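On the source side, the make-level target typically appears as a preprocessor symbol that selects platform-specific code. The sketch below assumes a hypothetical symbol TARGET_LINUX that the make file would pass with a -D compiler option, and otherwise falls back on macros that compilers commonly predefine (__linux__, _WIN32, __sun). The TARGET_NAME macro and target_name function are illustrative only.

```cpp
// TARGET_LINUX is a hypothetical symbol that a make file might define,
// for example with -DTARGET_LINUX, when the target variable selects Linux.
#if defined(TARGET_LINUX) || defined(__linux__)
#  define TARGET_NAME "linux"
#elif defined(_WIN32)
#  define TARGET_NAME "win32"
#elif defined(__sun)
#  define TARGET_NAME "solaris"
#else
#  define TARGET_NAME "unknown"
#endif

// Returns the name of the platform this translation unit was built for.
const char* target_name() {
    return TARGET_NAME;
}
```

Keeping such tests in one header, rather than scattered through the code, helps to minimize the amount of target-specific code.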
GCC is the normal choice for a Linux C/C++ compiler, but the correct version number can be confusing. At this time, GCC 2.96 is provided by most Linux distributors. GCC 3.0.0, 3.0.2 and other versions are coming into use, and they fix some errors in GCC 2.96. Different versions can be downloaded, and new versions become available from time to time. This situation is not much different from selecting a vendor compiler version and vendor patches.
In performing the port, there will be numerous opportunities to simplify Win32 and other system-specific code and exploit new levels of standardization and new open source libraries. In some cases, it is possible to reduce dependency on Microsoft proprietary technology by going to a common technology for both Linux and Windows, as well as UNIX; in other cases, conditional compilation will be required.
This section describes a number of important methods that can be used in a Windows to Linux port, but the list is not complete.
Microsoft Foundation Classes (MFC) technology is commonly used in Windows programs to manage “containers” of objects. A porting project is an excellent opportunity to eliminate proprietary MFC usage in favor of the Standard Template Library (STL), which is fast, efficient, standard, and available on Windows, every UNIX, and Linux.
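As an illustration of the target style, the sketch below uses only STL containers in a place where Windows-only code might have used MFC collection classes such as CArray or CMap. The function name count_occurrences is hypothetical, and the example assumes a compiler that supports range-based for loops.

```cpp
#include <map>
#include <string>
#include <vector>

// Counts how many times each name appears. std::vector and std::map take
// the place of MFC collection classes, so the same source builds on
// Windows, UNIX, and Linux.
std::map<std::string, int> count_occurrences(const std::vector<std::string>& names) {
    std::map<std::string, int> counts;
    for (const std::string& name : names) {
        ++counts[name];  // operator[] inserts a zero count on first use
    }
    return counts;
}
```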
Note: MFC technology is used for user interface and other purposes; user interface options are explained later.
Server-based systems, in particular, depend on multiple threads for concurrency, performance, and programming simplicity.
Most multithreaded Windows applications use the Win32 API, which has a slightly different synchronization model than the Pthreads model. Windows-only software will not use any Pthreads code, and most multi-platform software depends upon conditional compilation.
However, a high-quality open source Pthreads library is available, and can be downloaded without cost. All Win32 thread management and synchronization functions can now be removed, and Pthreads can be used throughout for Windows, Linux, and UNIX. The Pthreads library has been found to be as reliable and efficient as the Win32 thread management and synchronization functions.
Consequently, a Linux porting project is an excellent opportunity to eliminate proprietary thread management and synchronization altogether.
The Win32 API provides various forms of asynchronous I/O, including overlapped I/O, extended I/O, and I/O completion ports, along with a set of proprietary Windows Sockets functions. None of these techniques are available for Linux and UNIX, and they are all complex to program. In most cases, they can be replaced by using multiple, independent threads via Pthreads, and the resulting code will work with Linux and UNIX.
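The thread-based replacement can be as simple as moving a blocking call onto its own thread and collecting the result later. The following minimal sketch assumes POSIX read() on a file descriptor; ReadRequest, start_read, and finish_read are hypothetical names, and error handling is reduced to return codes.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <cstddef>

// One outstanding read: the inputs, plus a slot for the result.
struct ReadRequest {
    int fd;
    char* buffer;
    std::size_t capacity;
    ssize_t result;  // bytes read, or -1 on error
};

// Thread entry point: performs an ordinary blocking read.
extern "C" void* blocking_read_worker(void* arg) {
    ReadRequest* request = static_cast<ReadRequest*>(arg);
    request->result = read(request->fd, request->buffer, request->capacity);
    return nullptr;
}

// Starts the read on its own thread; the caller is free to do other work.
// Returns 0 on success, or the pthread_create() error code.
int start_read(pthread_t* thread, ReadRequest* request) {
    return pthread_create(thread, nullptr, blocking_read_worker, request);
}

// Waits for the read to complete and returns its result.
ssize_t finish_read(pthread_t thread, const ReadRequest* request) {
    pthread_join(thread, nullptr);
    return request->result;
}
```

The caller must keep the ReadRequest and its buffer alive until finish_read() returns, much as overlapped I/O requires the buffer to remain valid until completion.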
Microsoft WinSock sockets are source-level compatible with industry-standard sockets, except for the “WSA” (“Windows Sockets Asynchronous”) functions. Windows sockets also require special initialization that UNIX and Linux do not require. Otherwise, WSA functions are usually used for asynchronous I/O and can be replaced by threads.
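The special initialization can be isolated behind a small wrapper so that the rest of the socket code is identical on all platforms. The sketch below is one possible arrangement: WSAStartup(), WSACleanup(), and closesocket() are the Windows calls, while the wrapper names net_startup, net_cleanup, and net_close and the socket_handle type are hypothetical.

```cpp
#ifdef _WIN32
#  include <winsock2.h>
typedef SOCKET socket_handle;
#else
#  include <sys/socket.h>
#  include <unistd.h>
typedef int socket_handle;
#endif

// Call once before using sockets. Returns 0 on success.
int net_startup() {
#ifdef _WIN32
    WSADATA data;
    return WSAStartup(MAKEWORD(2, 2), &data);  // request Winsock 2.2
#else
    return 0;  // UNIX and Linux need no socket initialization
#endif
}

// Call once when the program has finished with sockets.
void net_cleanup() {
#ifdef _WIN32
    WSACleanup();
#endif
}

// Closes a socket with the call appropriate to the platform.
int net_close(socket_handle s) {
#ifdef _WIN32
    return closesocket(s);
#else
    return close(s);
#endif
}
```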
Linux provides client and server versions of the two dominant file access protocols: NFS, the most common UNIX file access protocol, and Server Message Block (SMB, provided on Linux by “Samba”), the normal Windows protocol. Therefore, file access, in both directions, between Linux and Windows systems is not a problem.
As a client, Linux can access popular database servers using open source ODBC libraries. Linux can also be a database server, using either open source software or Linux versions provided by some of the database vendors.
If the software uses Win32 independent heaps, either an emulation can be developed or the independent heaps can be eliminated in favor of standard memory management. POSIX also provides file memory-mapping functions, so Win32 memory mapping can be emulated.
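On the POSIX side, file mapping is done with mmap() and munmap(), which play the role that CreateFileMapping() and MapViewOfFile() play under Win32. The following is a minimal read-only sketch; the function names map_file_readonly and unmap_file are hypothetical, and an empty file is treated as an error for simplicity.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstddef>

// Maps an entire file read-only. On success, returns the mapped address and
// stores the file length in *length_out; on failure, returns nullptr.
const char* map_file_readonly(const char* path, std::size_t* length_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return nullptr;
    }
    struct stat info;
    if (fstat(fd, &info) != 0 || info.st_size == 0) {
        close(fd);
        return nullptr;
    }
    const std::size_t length = static_cast<std::size_t>(info.st_size);
    void* address = mmap(nullptr, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (address == MAP_FAILED) {
        return nullptr;
    }
    *length_out = length;
    return static_cast<const char*>(address);
}

// Releases a mapping returned by map_file_readonly().
void unmap_file(const char* address, std::size_t length) {
    munmap(const_cast<char*>(address), length);
}
```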
Windows provides a full C2 security system, with access control lists (ACLs), for files and all other shared objects within a single system.
In nearly all cases, the UNIX security model, which Linux supports, is adequate, and Linux is regarded as being as secure as other UNIX systems.
There is open source support for all three of these popular middleware solutions.
Only the Distributed Computing Environment (DCE) remote procedure calls (RPCs) are supported directly by Microsoft, and there are proprietary extensions. However, these extensions are primarily for asynchronous operation and can be removed by using threads.
GUI conversion can be a major issue for Windows-only software. Since Linux supports the X Window System, the GUI is not a problem for UNIX applications.
If the user interface must be ported to Linux from Windows, then there are several alternatives:
1. Use a Windows compatibility library; there are several products that support the Windows GUI API on Linux.
2. Convert to a web-based interface. This strategy has the advantage of making it possible to manage, configure, monitor, and interact with an application from a remote console.
3. Port the GUI code to X and Motif. Conversion tools are available that will simplify this task.
4. Some software, especially server systems, has a character-only interface or is controlled through a separate “management” application. It is frequently appropriate to maintain the management interface either on Windows or as a web interface, so there is no need to port the GUI.
Several IDEs are available for Linux, and there are plug-ins that allow a Microsoft Visual Studio user to control development on a Linux target.
Similarly, CVS, ClearCase, and other source code control and version management systems are available for Linux and can replace Visual SourceSafe.
There are strong business reasons for considering a Linux port for your software systems. A successful porting project, however, demands that the requirements be set carefully and that appropriate technical strategies be put in place. The challenges are greatest when the current system only runs on Windows, but even UNIX software porting is a significant task when dealing with a large body of code.
A well-planned and well-executed Linux port can provide the expected levels of reliability and performance that allow your customers to benefit from Linux economies.