Difference between revisions of "Assembly language"

(top: Spelling, Grammar, and General Cleanup, typos fixed: etc) → etc.))
(10 intermediate revisions by 8 users not shown)
Line 1: Line 1:
An '''assembly language''' is a low-level [[programming language]] that is specific to a given [[CPU]]'s instruction set. For instance, the [[Zilog Z80]] assembly language is different from the [[Intel 8080]] assembly language, despite the similarities in the underlying instruction sets.
+
An '''assembly language''' is a low-level [[programming language]] that is specific to a given [[CPU]]'s instruction set in an one-to-one relationship. For instance, the [[Zilog Z80]] assembly language is different from the [[Intel 8080]] assembly language, despite the similarities in the underlying instruction sets. Each assembly instruction maps directly to a machine language instruction and vice versa. This also means that machine language instructions can easily be converted ("disassembled") to assembly language for, for example, purposes of debugging and reverse engineering. In other words, assembly language is "just" a human-comprehensible way of displaying processor instructions.
  
A program, called an '''assembler''' converts source code written in assembly language into the CPU's machine code.  Each CPU instruction is assigned a short mnemonic code (traditionally 3 or 4 characters in length).  Additional syntax supports registers, addressing modes, labels and other symbolic names, comments, and directives.  Assembly languages supporting macro processing (substitution, etc) used to be called macro assembly languages, but as most assembly languages now support macros, the "macro" prefix has been dropped from common usage.
+
A program, called an '''assembler''' converts source code written in assembly language into the CPU's machine code.  Each CPU instruction is assigned a short mnemonic code (traditionally 3 or 4 characters in length).  Additional syntax supports registers, addressing modes, labels and other symbolic names, comments, and directives.  Assembly languages supporting macro processing (substitution, etc.) used to be called macro assembly languages, but as most assembly languages now support macros, the "macro" prefix has been dropped from common usage.
  
 
The instruction mnemonics are typically defined by the manufacturer of the CPU.  Although anyone could assign their own arbitrary set of mnemonics for a CPU's instruction set, in practice this is seldom done.
 
The instruction mnemonics are typically defined by the manufacturer of the CPU.  Although anyone could assign their own arbitrary set of mnemonics for a CPU's instruction set, in practice this is seldom done.
Line 20: Line 20:
  
 
==Sample Assembly Source Code==
 
==Sample Assembly Source Code==
The following is a sample of assembly code for the Intel 8086 CPU:
+
The following is a sample program written in assembly for an ARM [[CPU]] running Linux kernel with [[ARM EABI]] system call conventions. The program will print "hello world" to standard output.
  
 +
<pre>
 +
# Arguments are passed in registers r0-6, the system call number in r7
  
; The purpose of this code is to output a new line
+
# Write msglen bytes starting from address msg to standard output (file descriptor 1)
 +
mov r0, #1
 +
adr r1, msg
 +
mov r2, #msglen
 +
mov r7, #4
 +
swi #0
  
.DEFINE OUTR
+
# Exit with status 0
 +
mov r0, #0
 +
mov r7, #1
 +
swi #0
  
      MOV  AX, 02H
+
# Assembler macros that specify the data
  
      INT  21H
+
msg:
 
+
.ascii "hello world\n"
.ENDDEF
+
msglen = . - msg
 
+
</pre>
CRLF:  PUSH  DX
+
 
+
      PUSHF
+
 
+
      MOV  DL, _CR
+
 
+
      OUTR
+
 
+
      MOV  DL, _LF
+
 
+
      OUTR
+
 
+
      POPF
+
 
+
      POP  DX
+
 
+
      RET
+
 
+
      END
+
  
 
==References==
 
==References==
Line 58: Line 49:
 
*Programming the Z80, Rodnay Zaks, Sybex, ISBN 0-89588-069-5
 
*Programming the Z80, Rodnay Zaks, Sybex, ISBN 0-89588-069-5
  
[[Category:Information technology]]
+
[[Category:Programming Languages]]
[[category:Programming Languages]]
+

Revision as of 15:47, July 1, 2016

An assembly language is a low-level programming language that is specific to a given CPU's instruction set, with which it has a one-to-one relationship. For instance, the Zilog Z80 assembly language is different from the Intel 8080 assembly language, despite the similarities in the underlying instruction sets. Each assembly instruction maps directly to a machine language instruction and vice versa. This also means that machine language instructions can easily be converted ("disassembled") back to assembly language, for example for debugging and reverse engineering. In other words, assembly language is "just" a human-comprehensible way of displaying processor instructions.
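The one-to-one mapping can be sketched with a toy lookup table. The Python below is only an illustration, not a real assembler: it covers a handful of single-byte Zilog Z80 instructions, and the names OPCODES, MNEMONICS, assemble, and disassemble are made up for this example.

```python
# Toy table for a few single-byte Zilog Z80 instructions.
# Real instruction sets are far larger; this is only an illustration.
OPCODES = {
    "NOP": 0x00,
    "INC B": 0x04,
    "INC C": 0x0C,
    "HALT": 0x76,
    "RET": 0xC9,
}

# Because the mapping is one-to-one, it can simply be inverted.
MNEMONICS = {code: text for text, code in OPCODES.items()}


def assemble(lines):
    """Translate a list of mnemonic lines into machine code bytes."""
    # Normalise case and whitespace so "inc  c" matches "INC C".
    return bytes(OPCODES[" ".join(line.upper().split())] for line in lines)


def disassemble(code):
    """Translate machine code bytes back into mnemonic lines."""
    return [MNEMONICS[byte] for byte in code]


program = ["inc c", "inc c", "ret"]
machine_code = assemble(program)
print(machine_code.hex())          # 0c0cc9
print(disassemble(machine_code))   # ['INC C', 'INC C', 'RET']
```

Assembling and then disassembling returns the original instructions (in normalised spelling), which is the property that makes disassembly useful for debugging.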

A program called an assembler converts source code written in assembly language into the CPU's machine code. Each CPU instruction is assigned a short mnemonic code (traditionally 3 or 4 characters in length). Additional syntax supports registers, addressing modes, labels and other symbolic names, comments, and directives. Assembly languages supporting macro processing (substitution, etc.) used to be called macro assembly languages, but as most assembly languages now support macros, the "macro" prefix has been dropped from common usage.
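How an assembler deals with labels and comments can be sketched as a two-pass process: the first pass records the address of every label, and the second pass emits bytes with the label addresses filled in. The Python below is a minimal sketch under several assumptions: it understands only four Z80 instructions (NOP, INC C, JP nn, HALT), uses ";" for comments and "name:" for labels, and every function name in it is hypothetical.

```python
# Size in bytes of each supported instruction (toy subset of the Z80).
SIZES = {"NOP": 1, "INC": 1, "HALT": 1, "JP": 3}


def parse(source):
    """Return (label, mnemonic, operand) tuples, dropping ';' comments."""
    items = []
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()
        if not line:
            continue
        label = None
        if ":" in line:
            label, line = line.split(":", 1)
            label, line = label.strip(), line.strip()
        parts = line.split(None, 1)
        mnemonic = parts[0].upper() if parts else None
        operand = parts[1].strip() if len(parts) > 1 else None
        items.append((label, mnemonic, operand))
    return items


def assemble(source, origin=0):
    items = parse(source)

    # Pass 1: walk the program and record the address of every label.
    symbols, address = {}, origin
    for label, mnemonic, _ in items:
        if label:
            symbols[label] = address
        if mnemonic:
            address += SIZES[mnemonic]

    # Pass 2: emit machine code, substituting label addresses.
    out = bytearray()
    for _, mnemonic, operand in items:
        if mnemonic == "NOP":
            out.append(0x00)
        elif mnemonic == "HALT":
            out.append(0x76)
        elif mnemonic == "INC":
            if operand is None or operand.upper() != "C":
                raise ValueError("this sketch only supports INC C")
            out.append(0x0C)
        elif mnemonic == "JP":
            target = symbols[operand]
            # JP nn is 0xC3 followed by the address, low byte first.
            out += bytes([0xC3, target & 0xFF, target >> 8])
    return bytes(out)


SOURCE = """
start:  INC C       ; bump the counter
        JP done     ; forward reference to a label defined later
        NOP
done:   HALT
"""

print(assemble(SOURCE).hex())   # 0cc305000076
```

The forward reference to the label done is the reason for the two passes: its address (5) is not known when the JP instruction is first seen.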

The instruction mnemonics are typically defined by the manufacturer of the CPU. Although anyone could assign their own arbitrary set of mnemonics for a CPU's instruction set, in practice this is seldom done.

Advantages and Disadvantages of Assembly Language

Assembly languages do not provide many of the useful abstractions of high-level languages, such as memory management, object and other complex data structure support, or string manipulation. Such features are often available through libraries of assembly code, though.

Assembly language is essential when a new CPU is developed, since it allows the development of higher-level language compilers for that CPU. Assembly is also required for access to unique and low-level features of a CPU, which is why portions of most operating systems must be written using assembly language. Because assembly imposes no abstractions, carefully written assembly code can run faster than equivalent code written in a high-level programming language. However, modern optimizing compilers are often better at generating efficient machine code than a human writing assembly by hand. Finally, writing assembly code is more tedious (and therefore more error-prone) than using a high-level language.

Macros

Macros provide shorthand for assembly programmers. The most common macro feature is the substitution macro. This allows the programmer to define a new mnemonic that stands for several preexisting mnemonics. When the macro is used, the assembler substitutes the corresponding mnemonics in place of the macro, as if the programmer had included those mnemonics at that point in the source code. Another form of macro is the iteration macro, which allows the programmer to repeat something without having to duplicate it manually numerous times. Thus macros can reduce the size and complexity of the assembler source code.
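Both kinds of macro can be sketched in a few lines of Python. This is only a model of the idea, not the syntax of any particular assembler; the macro name INC_C_TWICE and the helper functions expand and repeat are hypothetical.

```python
def expand(lines, macros):
    """Substitution macro: replace each macro name with its body."""
    out = []
    for line in lines:
        name = line.strip()
        # Lines that are not macro names pass through unchanged.
        out.extend(macros.get(name, [line]))
    return out


def repeat(body, count):
    """Iteration macro: duplicate a block of lines count times."""
    return list(body) * count


# A hypothetical macro standing for two Z80 instructions.
macros = {"INC_C_TWICE": ["INC C", "INC C"]}

source = ["INC_C_TWICE", "RET"]
print(expand(source, macros))   # ['INC C', 'INC C', 'RET']
print(repeat(["NOP"], 3))       # ['NOP', 'NOP', 'NOP']
```

The expanded list is what the assembler then translates, exactly as if the programmer had typed those lines directly.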

Example

Without an assembler, a person who wanted to increment the Zilog Z80 "C" register would have to look up the numeric code (in this case 0C hexadecimal or 00001100 binary). Assembly allows this operation to be specified with the mnemonic:

INC C

which is much easier to remember. Modern CPUs have even more complicated instruction sets, in which a single instruction can be four or more times as long as this one-byte instruction. Because assembly replaces raw numeric codes with mnemonics, some refer to machine language as a first-generation language, and assembly as a second-generation language. High-level programming languages are called third-generation languages.
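The numeric code for an 8-bit register increment on the Z80 follows a fixed bit pattern, 00 rrr 100, where rrr is a three-bit register code. The short Python sketch below computes it; the function name encode_inc is made up for this illustration, and the table covers only the plain 8-bit registers.

```python
# Three-bit register codes used in Z80 instruction encodings.
REGISTER_CODES = {"B": 0, "C": 1, "D": 2, "E": 3, "H": 4, "L": 5, "A": 7}


def encode_inc(register):
    """Return the opcode of INC r, laid out in bits as 00 rrr 100."""
    return (REGISTER_CODES[register] << 3) | 0b100


opcode = encode_inc("C")
print(f"{opcode:02X} {opcode:08b}")   # 0C 00001100
```

The same pattern gives 04 hexadecimal for INC B, which shows how easily two neighbouring numeric codes can be confused when working without mnemonics.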

Sample Assembly Source Code

The following is a sample program written in assembly for an ARM CPU running the Linux kernel with ARM EABI system call conventions. The program prints "hello world" to standard output.

# Arguments are passed in registers r0-r6, the system call number in r7

# Program entry point, exported so that the linker can find it
.text
.global _start
_start:

# Write msglen bytes starting from address msg to standard output (file descriptor 1)
mov r0, #1
adr r1, msg
mov r2, #msglen
mov r7, #4
swi #0

# Exit with status 0
mov r0, #0
mov r7, #1
swi #0

# Assembler directives that specify the data

msg:
.ascii "hello world\n"
msglen = . - msg

References

  • The 8086 Family User's Manual, Intel Corporation, 1979
  • VAX 11 Structured Assembly Language Programming, Robert W. Sebesta, Benjamin/Cummings Publishing, ISBN 0-8053-7001-3
  • Programming the Z80, Rodnay Zaks, Sybex, ISBN 0-89588-069-5