From 84adefa331c4159d432d22840663c38f155cd4c1 Mon Sep 17 00:00:00 2001 From: Erlang/OTP Date: Fri, 20 Nov 2009 14:54:40 +0000 Subject: The R13B03 release. --- lib/stdlib/doc/src/beam_lib.xml | 507 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 507 insertions(+) create mode 100644 lib/stdlib/doc/src/beam_lib.xml (limited to 'lib/stdlib/doc/src/beam_lib.xml') diff --git a/lib/stdlib/doc/src/beam_lib.xml b/lib/stdlib/doc/src/beam_lib.xml new file mode 100644 index 0000000000..f2a9c2a671 --- /dev/null +++ b/lib/stdlib/doc/src/beam_lib.xml @@ -0,0 +1,507 @@ + + + + +
+ + 20002009 + Ericsson AB. All Rights Reserved. + + + The contents of this file are subject to the Erlang Public License, + Version 1.1, (the "License"); you may not use this file except in + compliance with the License. You should have received a copy of the + Erlang Public License along with this software. If not, it can be + retrieved online at http://www.erlang.org/. + + Software distributed under the License is distributed on an "AS IS" + basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See + the License for the specific language governing rights and limitations + under the License. + + + + beam_lib + Hans Bolinder + + 1999-10-30 + PA1 +
+ beam_lib + An Interface To the BEAM File Format + +

beam_lib provides an interface to files created by + the BEAM compiler ("BEAM files"). The format used, a variant of + "EA IFF 1985" Standard for Interchange Format Files, divides data + into chunks.

+

Chunk data can be returned as binaries or as compound terms. + Compound terms are returned when chunks are referenced by names + (atoms) rather than identifiers (strings). The names recognized + and the corresponding identifiers are:

+ + abstract_code ("Abst") + attributes ("Attr") + compile_info ("CInf") + exports ("ExpT") + labeled_exports ("ExpT") + imports ("ImpT") + indexed_imports ("ImpT") + locals ("LocT") + labeled_locals ("LocT") + atoms ("Atom") + +
+ +
+ + Debug Information/Abstract Code +

The option debug_info can be given to the compiler (see + compile(3)) + in order to have debug information in the form of abstract code + (see The Abstract Format + in ERTS User's Guide) stored in the abstract_code chunk. + Tools such as Debugger and Xref require the debug information to + be included.

+ +

Source code can be reconstructed from the debug information. + Use encrypted debug information (see below) to prevent this.

+
+

The debug information can also be removed from BEAM files + using strip/1, + strip_files/1 and/or + strip_release/1.

+

Reconstructing source code

+

Here is an example of how to reconstruct source code from + the debug information in a BEAM file Beam:

+ + {ok,{_,[{abstract_code,{_,AC}}]}} = beam_lib:chunks(Beam,[abstract_code]). + io:fwrite("~s~n", [erl_prettypr:format(erl_syntax:form_list(AC))]). +

Encrypted debug information

+

The debug information can be encrypted in order to keep + the source code secret, but still being able to use tools such as + Xref or Debugger.

+

To use encrypted debug information, a key must be provided to + the compiler and beam_lib. The key is given as a string and + it is recommended that it contains at least 32 characters and + that both upper and lower case letters as well as digits and + special characters are used.

+

+

The default type -- and currently the only type -- of crypto + algorithm is des3_cbc, three rounds of DES. The key string + will be scrambled using erlang:md5/1 to generate + the actual keys used for des3_cbc.

+ +

As far as we know by the time of writing, it is + infeasible to break des3_cbc encryption without any + knowledge of the key. Therefore, as long as the key is kept + safe and is unguessable, the encrypted debug information + should be safe from intruders.

+
+

There are two ways to provide the key:

+ + +

Use the compiler option {debug_info,Key}, see + compile(3), + and the function + crypto_key_fun/1 + to register a fun which returns the key whenever + beam_lib needs to decrypt the debug information.

+

If no such fun is registered, beam_lib will instead + search for a .erlang.crypt file, see below.

+
+ +

Store the key in a text file named .erlang.crypt.

+

In this case, the compiler option encrypt_debug_info + can be used, see + compile(3).

+
+
+

.erlang.crypt

+

beam_lib searches for .erlang.crypt in the current + directory and then the home directory for the current user. If + the file is found and contains a key, beam_lib will + implicitly create a crypto key fun and register it.

+

The .erlang.crypt file should contain a single list of + tuples:

+ + {debug_info, Mode, Module, Key} +

Mode is the type of crypto algorithm; currently, the only + allowed value thus is des3_cbc. Module is either an + atom, in which case Key will only be used for the module + Module, or [], in which case Key will be + used for all modules. Key is the non-empty key string.

+

The Key in the first tuple where both Mode and + Module matches will be used.

+

Here is an example of an .erlang.crypt file that returns + the same key for all modules:

+ 7}|pc/DM6Cga*68$Mw]L#&_Gejr]G^"}].]]> +

And here is a slightly more complicated example of an + .erlang.crypt which provides one key for the module + t, and another key for all other modules:

+ 7}|pc/DM6Cga*68$Mw]L#&_Gejr]G^"}].]]> + +

Do not use any of the keys in these examples. Use your own + keys.

+
+
+ +
+ DATA TYPES + +beam() -> Module | Filename | binary() + Module = atom() + Filename = string() | atom() +

Each of the functions described below accept either the module + name, the filename, or a binary containing the beam module.

+ +chunkdata() = {ChunkId, DataB} | {ChunkName, DataT} + ChunkId = chunkid() + DataB = binary() + {ChunkName, DataT} = + {abstract_code, AbstractCode} + | {attributes, [{Attribute, [AttributeValue]}]} + | {compile_info, [{InfoKey, [InfoValue]}]} + | {exports, [{Function, Arity}]} + | {labeled_exports, [{Function, Arity, Label}]} + | {imports, [{Module, Function, Arity}]} + | {indexed_imports, [{Index, Module, Function, Arity}]} + | {locals, [{Function, Arity}]}]} + | {labeled_locals, [{Function, Arity, Label}]}]} + | {atoms, [{integer(), atom()}]} + AbstractCode = {AbstVersion, Forms} | no_abstract_code + AbstVersion = atom() + Attribute = atom() + AttributeValue = term() + Module = Function = atom() + Arity = int() + Label = int() +

It is not checked that the forms conform to the abstract format + indicated by AbstVersion. no_abstract_code means + that the "Abst" chunk is present, but empty.

+

The list of attributes is sorted on Attribute, and each + attribute name occurs once in the list. The attribute values + occur in the same order as in the file. The lists of functions + are also sorted.

+ +chunkid() = "Abst" | "Attr" | "CInf" + | "ExpT" | "ImpT" | "LocT" + | "Atom" + +chunkname() = abstract_code | attributes | compile_info + | exports | labeled_exports + | imports | indexed_imports + | locals | labeled_locals + | atoms + +chunkref() = chunkname() | chunkid() +
+ + + chunks(Beam, [ChunkRef]) -> {ok, {Module, [ChunkData]}} | {error, beam_lib, Reason} + Read selected chunks from a BEAM file or binary + + Beam = beam() + ChunkRef = chunkref() + Module = atom() + ChunkData = chunkdata() + Reason = {unknown_chunk, Filename, atom()} +   | {key_missing_or_invalid, Filename, abstract_code} +   | Reason1 -- see info/1 +  Filename = string() + + +

Reads chunk data for selected chunks refs. The order of + the returned list of chunk data is determined by the order + of the list of chunks references.

+
+
+ + chunks(Beam, [ChunkRef], [Option]) -> {ok, {Module, [ChunkResult]}} | {error, beam_lib, Reason} + Read selected chunks from a BEAM file or binary + + Beam = beam() + ChunkRef = chunkref() + Module = atom() + Option = allow_missing_chunks + ChunkResult = {chunkref(), ChunkContents} | {chunkref(), missing_chunk} + Reason = {missing_chunk, Filename, atom()} +   | {key_missing_or_invalid, Filename, abstract_code} +   | Reason1 -- see info/1 +  Filename = string() + + +

Reads chunk data for selected chunks refs. The order of + the returned list of chunk data is determined by the order + of the list of chunks references.

+

By default, if any requested chunk is missing in Beam, + an error tuple is returned. + However, if the option allow_missing_chunks has been given, + a result will be returned even if chunks are missing. + In the result list, any missing chunks will be represented as + {ChunkRef,missing_chunk}. + Note, however, that if the "Atom" chunk if missing, that is + considered a fatal error and the return value will be an error + tuple.

+
+
+ + version(Beam) -> {ok, {Module, [Version]}} | {error, beam_lib, Reason} + Read the BEAM file's module version + + Beam = beam() + Module = atom() + Version = term() + Reason -- see chunks/2 + + +

Returns the module version(s). A version is defined by + the module attribute -vsn(Vsn). If this attribute is + not specified, the version defaults to the checksum of + the module. Note that if the version Vsn is not a list, + it is made into one, that is {ok,{Module,[Vsn]}} is + returned. If there are several -vsn module attributes, + the result is the concatenated list of versions. Examples:

+
+1> beam_lib:version(a). % -vsn(1).
+{ok,{a,[1]}}
+2> beam_lib:version(b). % -vsn([1]).
+{ok,{b,[1]}}
+3> beam_lib:version(c). % -vsn([1]). -vsn(2).
+{ok,{c,[1,2]}}
+4> beam_lib:version(d). % no -vsn attribute
+{ok,{d,[275613208176997377698094100858909383631]}}
+
+
+ + md5(Beam) -> {ok, {Module, MD5}} | {error, beam_lib, Reason} + Read the BEAM file's module version + + Beam = beam() + Module = atom() + MD5 = binary() + Reason -- see chunks/2 + + +

Calculates an MD5 redundancy check for the code of the module + (compilation date and other attributes are not included).

+
+
+ + info(Beam) -> [{Item, Info}] | {error, beam_lib, Reason1} + Information about a BEAM file + + Beam = beam() + Item, Info -- see below + Reason1 = {chunk_too_big, Filename, ChunkId, ChunkSize, FileSize} +   | {invalid_beam_file, Filename, Pos} +   | {invalid_chunk, Filename, ChunkId} +   | {missing_chunk, Filename, ChunkId} +   | {not_a_beam_file, Filename} +   | {file_error, Filename, Posix} +  Filename = string() +  ChunkId = chunkid() +  ChunkSize = FileSize = int() +  Pos = int() +  Posix = posix() -- see file(3) + + +

Returns a list containing some information about a BEAM file + as tuples {Item, Info}:

+ + {file, Filename} | {binary, Binary} + +

The name (string) of the BEAM file, or the binary from + which the information was extracted.

+
+ {module, Module} + +

The name (atom) of the module.

+
+ {chunks, [{ChunkId, Pos, Size}]} + +

For each chunk, the identifier (string) and the position + and size of the chunk data, in bytes.

+
+
+
+
+ + cmp(Beam1, Beam2) -> ok | {error, beam_lib, Reason} + Compare two BEAM files + + Beam1 = Beam2 = beam() + Reason = {modules_different, Module1, Module2} +   | {chunks_different, ChunkId} +   | Reason1 -- see info/1 +  Module1 = Module2 = atom() +  ChunkId = chunkid() + + +

Compares the contents of two BEAM files. If the module names + are the same, and the chunks with the identifiers + "Code", "ExpT", "ImpT", "StrT", + and "Atom" have the same contents in both files, + ok is returned. Otherwise an error message is returned.

+
+
+ + cmp_dirs(Dir1, Dir2) -> {Only1, Only2, Different} | {error, beam_lib, Reason1} + Compare the BEAM files in two directories + + Dir1 = Dir2 = string() | atom() + Different = [{Filename1, Filename2}] + Only1 = Only2 = [Filename] + Filename = Filename1 = Filename2 = string() + Reason1 = {not_a_directory, term()} | -- see info/1 + + +

The cmp_dirs/2 function compares the BEAM files in + two directories. Only files with extension ".beam" are + compared. BEAM files that exist in directory Dir1 + (Dir2) only are returned in Only1 + (Only2). BEAM files that exist on both directories but + are considered different by cmp/2 are returned as + pairs {Filename1, Filename2} where + Filename1 (Filename2) exists in directory + Dir1 (Dir2).

+
+
+ + diff_dirs(Dir1, Dir2) -> ok | {error, beam_lib, Reason1} + Compare the BEAM files in two directories + + Dir1 = Dir2 = string() | atom() + Reason1 = {not_a_directory, term()} | -- see info/1 + + +

The diff_dirs/2 function compares the BEAM files in + two directories the way cmp_dirs/2 does, but names of + files that exist in only one directory or are different are + presented on standard output.

+
+
+ + strip(Beam1) -> {ok, {Module, Beam2}} | {error, beam_lib, Reason1} + Removes chunks not needed by the loader from a BEAM file + + Beam1 = Beam2 = beam() + Module = atom() + Reason1 -- see info/1 + + +

The strip/1 function removes all chunks from a BEAM + file except those needed by the loader. In particular, + the debug information (abstract_code chunk) is removed.

+
+
+ + strip_files(Files) -> {ok, [{Module, Beam2}]} | {error, beam_lib, Reason1} + Removes chunks not needed by the loader from BEAM files + + Files = [Beam1] +  Beam1 = beam() + Module = atom() + Beam2 = beam() + Reason1 -- see info/1 + + +

The strip_files/1 function removes all chunks except + those needed by the loader from BEAM files. In particular, + the debug information (abstract_code chunk) is removed. + The returned list contains one element for each given file + name, in the same order as in Files.

+
+
+ + strip_release(Dir) -> {ok, [{Module, Filename]}} | {error, beam_lib, Reason1} + Removes chunks not needed by the loader from all BEAM files of a release + + Dir = string() | atom() + Module = atom() + Filename = string() + Reason1 = {not_a_directory, term()} | -- see info/1 + + +

The strip_release/1 function removes all chunks + except those needed by the loader from the BEAM files of a + release. Dir should be the installation root + directory. For example, the current OTP release can be + stripped with the call + beam_lib:strip_release(code:root_dir()).

+
+
+ + format_error(Reason) -> Chars + Return an English description of a BEAM read error reply + + Reason -- see other functions + Chars = [char() | Chars] + + +

Given the error returned by any function in this module, + the function format_error returns a descriptive string + of the error in English. For file errors, the function + file:format_error(Posix) should be called.

+
+
+ + crypto_key_fun(CryptoKeyFun) -> ok | {error, Reason} + Register a fun that provides a crypto key + + CryptoKeyFun = fun() -- see below + Reason = badfun | exists | term() + + +

The crypto_key_fun/1 function registers a unary fun + that will be called if beam_lib needs to read an + abstract_code chunk that has been encrypted. The fun + is held in a process that is started by the function.

+

If there already is a fun registered when attempting to + register a fun, {error, exists} is returned.

+

The fun must handle the following arguments:

+ + CryptoKeyFun(init) -> ok | {ok, NewCryptoKeyFun} | {error, Term} +

Called when the fun is registered, in the process that holds + the fun. Here the crypto key fun can do any necessary + initializations. If {ok, NewCryptoKeyFun} is returned + then NewCryptoKeyFun will be registered instead of + CryptoKeyFun. If {error, Term} is returned, + the registration is aborted and crypto_key_fun/1 + returns {error, Term} as well.

+ + CryptoKeyFun({debug_info, Mode, Module, Filename}) -> Key +

Called when the key is needed for the module Module + in the file named Filename. Mode is the type of + crypto algorithm; currently, the only possible value thus is + des3_cbc. The call should fail (raise an exception) if + there is no key available.

+ + CryptoKeyFun(clear) -> term() +

Called before the fun is unregistered. Here any cleaning up + can be done. The return value is not important, but is passed + back to the caller of clear_crypto_key_fun/0 as part + of its return value.

+
+
+ + clear_crypto_key_fun() -> {ok, Result} + Unregister the current crypto key fun + + Result = undefined | term() + + +

Unregisters the crypto key fun and terminates the process + holding it, started by crypto_key_fun/1.

+

The clear_crypto_key_fun/1 either returns + {ok, undefined} if there was no crypto key fun + registered, or {ok, Term}, where Term is + the return value from CryptoKeyFun(clear), see + crypto_key_fun/1.

+
+
+
+
+ -- cgit v1.2.3