This file is part of the Perl 6 Archive

To see what is currently happening visit http://www.perl6.org/

NAME

docs/pdds/pdd06_pasm.pod - Parrot Assembly Language

ABSTRACT

This PDD describes the format of Parrot's bytecode assembly language.

DESCRIPTION

Parrot's bytecode can be thought of as a form of machine language for a virtual super CISC machine. It makes sense, then, to define an assembly language for it, for those who need to generate bytecode directly rather than indirectly via Perl (or any other language).

IMPLEMENTATION

Parrot opcodes take the format of:

  code destination[dest_key], source1[source1_key], source2[source2_key]

The brackets do not denote optional arguments as such--they are real brackets. They may be left out entirely, however. If any argument has a key, the assembler substitutes the null key for every argument that lacks one.
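The operand format above can be illustrated with a small parser. This is only a sketch: the regular expression, the double-quoted form for string keys, the use of None as the null key, and the names parse_operand and parse_operands are all assumptions made for illustration, not part of the assembler.

```python
import re

# Assumed operand shape: a register (P, S, I, or N plus a number), optionally
# followed by a bracketed key that is an integer or a double-quoted string.
OPERAND = re.compile(r'([PSIN])(\d+)(?:\[(\d+|"[^"]*")\])?')

NULL_KEY = None  # hypothetical stand-in for the assembler's null key


def parse_operand(text):
    """Return (register_type, register_number, key) for one operand."""
    match = OPERAND.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"malformed operand: {text!r}")
    reg_type, number, key = match.groups()
    if key is None:
        return reg_type, int(number), NULL_KEY   # no brackets: null key
    if key.startswith('"'):
        return reg_type, int(number), key[1:-1]  # string key
    return reg_type, int(number), int(key)       # integer key


def parse_operands(operand_text):
    """Parse a comma-separated operand list.

    Limitation of this sketch: a string key containing a comma would be
    split incorrectly.
    """
    return [parse_operand(part) for part in operand_text.split(",")]
```

For example, parse_operands('P0["name"], P1[3], I2') yields one string-keyed operand, one integer-keyed operand, and one operand carrying the null key.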

Conditional branches take the format:

  code boolean[bool_key], true_dest

The key parameters are optional, and each may be either an integer or a string. A key is associated with the parameter to its left, and is assumed to be either an array/list entry number or a hash key. Any time a source or destination can be a PMC register, there may be a key.

Destinations for conditional branches are an integer offset from the current PC.
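As a rough illustration of relative branch destinations, the sketch below turns a label into an offset. It assumes the offset is measured from the address of the branch instruction itself, which the text does not specify; the function name and the label table are hypothetical.

```python
def resolve_branch(label_addresses, branch_pc, target_label):
    """Return the signed offset from the branch instruction to the label.

    Assumption: "current PC" means the address of the branch op itself.
    """
    return label_addresses[target_label] - branch_pc
```

With labels at addresses 4 and 20 and a branch at address 10, a backward branch gets the offset -6 and a forward branch gets 10.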

All registers have a type prefix of P, S, I, or N, for PMC, string, integer, and number respectively.

Assembly Syntax

All assembly opcodes contain only ASCII lowercase letters, digits, and the underscore.

Upper case names are reserved for assembler directives.

Labels all end with a colon. They may have ASCII letters, numbers, and underscores in them. Labels that begin with a dollar sign (the only valid spot in a label a dollar sign can appear) are private to the subroutine they appear in.
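The label rules can be expressed as a single pattern. The following is a sketch only; the function name classify_label and its return shape are assumptions made for illustration.

```python
import re

# An optional leading dollar sign, then ASCII letters, digits, and
# underscores, then the terminating colon. A dollar sign anywhere else
# makes the token invalid.
LABEL = re.compile(r'(\$?)([A-Za-z0-9_]+):')


def classify_label(token):
    """Return (name, is_private) for a valid label token, else None."""
    match = LABEL.fullmatch(token)
    if match is None:
        return None
    dollar, body = match.groups()
    return dollar + body, dollar == "$"
```

Under these rules "$loop:" is a label private to its subroutine, "done:" is an ordinary label, and "bad$name:" is rejected.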

Namespaces are noted with the NAMESPACE directive. It takes a single parameter, the name of the namespace. Multilevel namespaces are supported, and the namespaces should be double-colon separated.

Subroutine names are noted with the SUB directive. It takes a single parameter, the name of the subroutine, which is added to the namespace's symbol table. Sub names may contain any valid Unicode alphanumeric characters and the underscore.
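A sketch of how the two directives might populate per-namespace symbol tables follows. The "DIRECTIVE name" line layout, the treatment of code before the first NAMESPACE directive as belonging to a root namespace, and the name collect_subs are assumptions; the text only says that each directive takes a single parameter.

```python
def collect_subs(lines):
    """Map each namespace path (a tuple of levels) to the subs declared in it."""
    table = {}
    current = ()  # assumed root namespace before any NAMESPACE directive
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # not a one-parameter directive; ignored by this sketch
        directive, name = parts
        if directive == "NAMESPACE":
            # Multilevel namespaces are double-colon separated.
            current = tuple(name.split("::"))
        elif directive == "SUB":
            table.setdefault(current, []).append(name)
    return table
```

Feeding it "NAMESPACE Foo::Bar" followed by "SUB init" records init under the two-level path ("Foo", "Bar").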

Constants don't need to be named and put in a separate section of the assembly source. The assembler will take care of putting them in the appropriate part of the generated bytecode.

OPCODE LIST

In the following list, there may be multiple (but unlisted) versions of an opcode. If an opcode takes a register that might be keyed, the keyed version of the opcode has a _k suffix. If an opcode might take multiple types of registers for a single parameter, the opcode function really has a _x suffix, where x is either P, S, I, or N, depending on whether a PMC, string, integer, or numeric register is involved. The suffix is not required (though using it is not an error), as the assembler can infer the information from the code.

In those cases where an opcode can take several types of registers, and more than one of the sources or destinations is of variable type, the register is passed in extended format. An extended format register number is of the form:

     register_number | register_type

where register_type is 0x100, 0x200, 0x400, or 0x800 for PMC, string, integer, or number respectively. So N19 would be 0x813.
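The extended format is a bitwise OR of the register number and its type flag. The sketch below encodes and decodes such values; it assumes the register number fits in the low eight bits, and the function names are hypothetical.

```python
# Type flags as listed above, keyed by register prefix.
REGISTER_TYPE_BITS = {"P": 0x100, "S": 0x200, "I": 0x400, "N": 0x800}


def encode_register(name):
    """Encode a register name such as "P3" as register_number | register_type."""
    prefix, number = name[0], int(name[1:])
    return number | REGISTER_TYPE_BITS[prefix]


def decode_register(value):
    """Recover the register name from an extended format value."""
    for prefix, bits in REGISTER_TYPE_BITS.items():
        if value & 0xF00 == bits:
            # Assumption: the register number occupies the low eight bits.
            return f"{prefix}{value & 0xFF}"
    raise ValueError(f"no single register type flag set in {value:#x}")
```

For instance, P3 encodes to 0x103 and S10 to 0x20A, and decoding 0x405 gives I5.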

Note: Instructions tagged with a * will call a vtable method to handle the instruction if used on PMC registers.

In all cases, the letters x, y, and z refer to register numbers. The letter t refers to a generic register (P, S, I, or N). A lowercase p, s, i, or n means either a register or constant of the appropriate type (PMC, string, integer, or number).

Control flow

The control flow opcodes check conditions and manage program flow.

Data manipulation

These ops manipulate the data in registers.

Transcendental operations

These opcodes handle the transcendental math functions. The destination register here must always be either a numeric or a PMC register.

Register and stack ops

These opcodes deal with registers and stacks.

Names, pads, and globals

These operations are responsible for finding names in lexical or global scopes, as well as storing data into those slots. A static scope is captured by a scratchpad. The current dynamic scope is represented by the state of the lexical stack (which contains scratchpads). For more detail on these ops see the inline POD documentation in ops/var.ops.
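One way to picture name lookup is as a search through the lexical stack of scratchpads. This is a simplified model, not the behaviour of the ops in ops/var.ops: the innermost-first search order, the fallback to a global table, and the name find_name are all assumptions.

```python
def find_name(pad_stack, global_table, name):
    """Look a name up in the scratchpads, innermost first, then in globals.

    pad_stack is a list of dicts (one per scratchpad) whose last element is
    the innermost scope; global_table is a dict of global names.
    """
    for pad in reversed(pad_stack):
        if name in pad:
            return pad[name]
    if name in global_table:  # assumed fallback; not stated in the text
        return global_table[name]
    raise KeyError(name)
```

With an outer pad holding x = 1 and an inner pad holding x = 2, a lookup of x returns 2, because the inner scratchpad shadows the outer one.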

Exceptions

These opcodes deal with exception handling at the lowest level. Exception handlers are dynamically scoped, and any exception handler set in a scope will be removed when that scope is exited.
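Dynamic scoping of handlers can be modelled with a stack that is trimmed whenever a scope exits. The sketch below is a conceptual model only; handler_stack, scope, set_handler, and demo are hypothetical names and do not correspond to actual opcodes.

```python
from contextlib import contextmanager

handler_stack = []  # innermost handler is last


@contextmanager
def scope():
    """Run a block, then drop every handler that was set inside it."""
    depth = len(handler_stack)
    try:
        yield
    finally:
        del handler_stack[depth:]


def set_handler(handler):
    """Install a handler in the current scope."""
    handler_stack.append(handler)


def demo():
    """Record the handler stack inside two nested scopes and after both exit."""
    snapshots = []
    with scope():
        set_handler("outer")
        with scope():
            set_handler("inner")
            snapshots.append(list(handler_stack))
        snapshots.append(list(handler_stack))
    snapshots.append(list(handler_stack))
    return snapshots
```

Calling demo() shows the inner handler disappearing when the inner scope exits and the stack emptying once the outer scope exits.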

Object things

These opcodes deal with PMCs as objects, rather than as opaque data items.

Module handling

These opcodes deal with loading in bytecode or executable code libraries, and fetching info about those libraries. This is all dealing with precompiled bytecode or shared libraries.

I/O operations

The read and write ops read and write records, for some value of record.

Threading ops

Interpreter ops

Garbage collection

Key operations

Keys are used to get access to individual elements of an aggregate variable. This is done to allow for opaque, packed, and multidimensional aggregate types.

A key entry may be an integer, string, or PMC. Integers are used for array lookups, strings for hash lookups, and PMCs for either.
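The following sketch walks a multi-entry key through a nested aggregate, using integer entries for array lookups and string entries for hash lookups. Python lists and dicts stand in for Parrot aggregates, PMC key entries are not modelled, and the name fetch_keyed is hypothetical.

```python
def fetch_keyed(aggregate, key_entries):
    """Follow each key entry in turn and return the element it reaches."""
    value = aggregate
    for entry in key_entries:
        if isinstance(entry, int):
            if not isinstance(value, list):
                raise TypeError("integer key entry used on a non-array")
        elif isinstance(entry, str):
            if not isinstance(value, dict):
                raise TypeError("string key entry used on a non-hash")
        else:
            raise TypeError("PMC key entries are not modelled in this sketch")
        value = value[entry]
    return value
```

Given a hash whose "rows" entry is a two-dimensional array, the key ("rows", 1, 0) selects the first element of the second row.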

Properties

Properties are a sort of runtime note attached to a PMC. Any PMC can have properties on it. Properties live in a flat namespace, and they are not in any way associated with the class of the PMC that they are attached to.

Properties may be used for runtime notes on variables, or other metadata that may change. They are not for object attributes.
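As a toy model of the distinction, the class below keeps a per-instance property table that is independent of anything the class defines. The class names, the method names set_property and get_property, and the choice to return None for a missing property are assumptions made for illustration.

```python
class PMC:
    """Toy stand-in for a PMC: a value plus a per-instance property table."""

    def __init__(self, value):
        self.value = value
        self._properties = {}  # flat: one level of name -> value, no nesting

    def set_property(self, name, value):
        self._properties[name] = value

    def get_property(self, name):
        # A missing property yields None here; the real behaviour is not
        # specified in the text.
        return self._properties.get(name)


class IntegerPMC(PMC):
    """A subclass adds nothing to the property table of its instances."""
```

Setting a "tainted" property on one IntegerPMC instance leaves every other instance of the class untouched, since properties belong to the individual PMC rather than to its class.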

Symbolic support for HLLs

Foreign library access

These are the ops we use to load and interface with non-Parrot libraries.

Runtime compilation

These opcodes deal with runtime creation of bytecode and compilation of source code.

ATTACHMENTS

None.

REFERENCES

None.

VERSION

None.

CURRENT

    Maintainer: Dan Sugalski
    Class: Internals
    PDD Number: 6
    Version: 1.8
    Status: Developing
    Last Modified: 02 December 2002
    PDD Format: 1
    Language: English

HISTORY

CHANGES