Why Learn Assembly

Learning assembly language programming offers benefits in the following areas:

Program Efficiency and Speed

The superiority of assembly language in generating compact code that runs fast is well documented. Assembly code is ideal for time-critical tasks that have to be completed within a certain time period. Likewise, some systems, such as portable computers, phones, and onboard systems found in aircraft and spacecraft, require compact application code. Again, assembly code excels in compactness.

Access to System Hardware

Time-critical applications often require direct control over system hardware, which programmers are insulated from when using high-level languages such as Ada, BASIC, C, Java, or Pascal. Example applications include operating systems, assemblers, compilers, linkers, device drivers, and network interfaces. Assembly language programming is often the most direct route when low-level access is required.

Language Limitations

Sometimes programmers find that their high-level programming language has serious limitations that prevent them from exploiting certain kinds of microprocessor capabilities. Examples include text and bit manipulation, MMX technologies, streaming SIMD extensions (SSE), Advanced Vector Extensions (AVX), and multi-core and concurrent operations. Typically, only assembly language gives you access to these machine-level capabilities.

Programming Skills

Assembly language is central to computer science. Learning assembly language has both practical and educational purposes. A strong foundation in assembly language programming can help improve your awareness of why high-level languages are structured the way they are and improve your understanding of the underlying computer system.

Personal Satisfaction

Although learning assembly language programming is more difficult than learning a high-level language such as Ada, BASIC, or C, there is a certain personal satisfaction that comes with learning something new and complex. You also become aware of the power of assembly language. The insights assembly language programming gives you make the time spent learning assembly well worth your while.


Inline Assembly Language Programming

Project 1: Learn How to Write Assembly Language Routines

This project explores 32-bit mixed-mode programming techniques. The primary focus is on combining low-level inline assembly language instructions with the high-level program source code of the IWBasic programming language. It is necessary to understand how to do this in preparation for writing a compiler and reusable component libraries. Excerpts in the accompanying project guide are taken from the book Inline Assembly Language Programming, Volume 1, First Edition.

NOTE: The Inline Assembly Language Programming book and WebHelp file are currently undergoing technical and editorial reviews and are not yet available. The accompanying PDF project guide is a work in progress.

Overview

To understand how to design and develop a compiler, you need to have a good appreciation for the underlying architecture of the target processor and be able to program it at the machine level. This requires familiarity with the microprocessor and its assembly language instruction set. In the case of the Intel/AMD x86 processor family, understanding its assembly mnemonics and how they operate is paramount.

I highly recommend you download the Intel 64 and IA-32 Architectures Software Developer's Manuals. They usually come in a set of five or more freely available PDF files. The Intel manuals are detailed and comprehensive, while AMD's counterpart manuals are easier to read and understand.

Pure Assembly Language Coding

Let me start off by saying that if you are new to assembly language and want to write pure assembly code effectively, it will take a significant amount of hard work and persistent practice to become even moderately familiar with it. In my opinion, a good way to learn assembly language is not to try to learn every last instruction or every detailed nuance, but instead to learn enough to be conversant in how data are stored in memory, passed to subroutines on the stack, and manipulated in the general-purpose and SIMD registers available in modern multi-core processors. This approach will shorten your learning curve considerably. Let's take a look at a pure assembly language program.

Example of a pure assembly language program:

;---------------------------------------------------------------------------------
; Pure assembly language program (32-bit)
; Assemble: nasm -f win32 -o file.obj file.asm
; Link: golink file.obj /console
;---------------------------------------------------------------------------------
segment .data          ; Define a data memory segment to hold variables
  x dd 15              ; Declare a dword integer variable and assign it a value of 15

segment .bss           ; Define a section for storing variables with unassigned values
  total resd 1         ; Set aside storage space for one dword

segment .text          ; Define a code segment to hold instructions
  global main          ; Make main visible to the linker as the starting point
main:
  mov eax, 1000        ; Move 1000 into eax
  add [x], eax         ; Add 1000 to x (operand size is implied by eax)
  push dword [x]       ; Push result onto the stack (size must be stated)
  pop dword [total]    ; Pop result from the stack into memory for later access
  xor eax, eax         ; Zero out eax register to inform caller of success
  ret                  ; Return to calling routine
;----------------------------------------------------------------------------------

As you can see, unlike high-level languages such as Ada, BASIC, or C, assembly language programming requires a thorough understanding of the underlying architectural aspects of the target microprocessor. It also requires knowledge of the assembler and linker software tools before any useful code can be written.

Inline Assembly Coding

Another helpful approach to learning assembly language is to learn how to embed assembly instructions right in the source code of a high-level language. Embedded instructions are called inline assembly code. The benefit of using inline assembly code versus linked assembly code is that inline assembly instructions reduce the overall workload. The reason for this is that many mixed-mode capable high-level languages allow assembly instructions to take advantage of functions and included libraries that contain various routines, which are automatically included when a program is compiled.

Additionally, you don't have to set up the program and pass it through a separate assembler and linker. This aspect makes inline assembly extremely convenient, powerful, and readily available. It also avoids the chore of having to construct numerous support libraries to make programs work.

This project uses inline assembly language instructions to accomplish the following tasks:

In my view, the true test of a compiler is that it should be written exclusively in its own language. This is because a compiler written completely in its own language is immeasurably easier to understand. It also helps expose potential shortcomings in the language's grammar and syntax. However, before you can begin building your own compiler, you have to start with a genesis programming language, preferably one with well-supported inline assembly features for mixed-mode programming.

Mixed-Mode Programming

Mixed-mode programming is the process of writing programs in which the source code is written in two or more programming languages. For our purposes, mixed-mode programming means combining a high-level language with inline assembly code instructions and accessing Windows API and C Runtime (CRT) library functions located in multiple external libraries. Although mixed-mode programming presents additional challenges for the programmer, it is worthwhile because it enables high-level languages to be extended, provides access to the CPU at the hardware level, improves program size and performance, and affords access to external library routines.

There are numerous programming languages capable of incorporating inline assembly instructions, but most are some variant of C or another complex high-level language. For the most part, these languages come with steep learning curves, and if you don't already program in any of them, it may take many additional hours to develop an adequate understanding of how to employ them properly and effectively. Additionally, I have a personal preference against using visually oriented languages and environments such as Delphi, Smalltalk, Visual Studio, and VRML, or interpreted languages such as JavaScript, Perl, and Python, for this effort. I also want to remain within the Windows environment, where all of my current users do the majority of their programming.

Genesis Programming Language

In choosing a first-rate compiler, I looked at a number of currently available and affordable high-level languages that compile natively to object code. Significantly, many of these languages require a UNIX, Linux, BSD, or Apple/Mac operating system. Many also come with visually oriented IDEs or are object-oriented, such as C#, C++, Java, and VB. For now, I want to steer clear of the .NET languages offered by Visual Studio and the like, as these languages require a completely different knowledge base.

In the end, my selection boiled down to a practical requirement to maintain backward compatibility with the preponderance of existing client programs. For this reason, I settled on the non-visually-oriented 32-bit BASIC compiler already installed at a number of my client locations.

The selected compiler is called IWBasic, Version 3.0 from IonicWind. It is a very powerful BASIC, has one of the best assembler interfaces found in any 32-bit programming language in its class, and is currently well-supported and frequently updated. Together, IWBasic and NASM will serve as our genesis programming environment. Now, let's turn to an example mixed-mode program.

Here is an example program using IWBasic and inline assembly code:

REM -------------------------------------------------------
REM Demonstrate IWBasic's inline assembly code features
REM   Meta code: MyInt = MyInt + 50
REM Print MyInt to stdout. Build as a Console EXE
REM -------------------------------------------------------
INT MyInt = 100        REM Declare an integer variable

_asm                   ; Block of inline assembly code
  mov eax, 50          ; Move constant into register
  add [$MyInt], eax    ; Add it to MyInt
_endasm

PRINT "Value of MyInt = ", MyInt  REM Print 150 to stdout
DO: UNTIL INKEY$ <> "": END
REM -------------------------------------------------------

Notice that I did not have to specify segments or perform other setup chores. All this was done by the IWBasic compiler. The compiler also automatically takes care of assembling and linking the source code. The point is, mixed-mode programming provides the best of both worlds when it comes to developing applications.

Assemblers and Linkers

Speaking of assemblers, you should download the latest version of the Netwide Assembler (NASM). Having NASM on hand helps programmers develop and test stand-alone assembly programs and update the assembler used in IWBasic. NASM is also well-supported and frequently updated with the latest instructions offered by Intel and AMD on their new processors.

Keep in mind that when writing stand-alone assembly language programs using NASM, a linker is needed to turn the assembled object files into an executable. One linker, ALink (Anthony's Linker), was written as a companion to NASM. It is free, but currently only works with 32-bit object files. Another highly regarded linker is Jeremy Gordon's GoLink. It links both 32-bit and 64-bit object files. In my opinion, GoLink contains comprehensive features not found in ALink, so I would go with this one. It is constantly updated and well-supported.

As an aside, the YASM project is a complete rewrite of the NASM assembler. It currently supports x86-64 instruction sets, accepts NASM and GAS assembler syntaxes, outputs to several binary formats, and generates source debugging information for a number of debugging tools. YASM is easily integrated into IWBasic for assembling NASM or GAS syntax code in Win32 object file formats.

In testing YASM, I found that if the main executable is renamed to nasmw.exe and placed in the idbdev/bin directory of IWBasic, it becomes a one-to-one replacement for NASM. And, if you are into GAS syntax, YASM will process GAS instructions contained inside IWBasic's _asm/_endasm directives. The downside to YASM is that it is not maintained or updated nearly as often as NASM.


© 1997-2017 Transtar Management Services, Inc. All rights reserved. Terms of Use