| author | John Wiegley <johnw@newartisans.com> | 2008-08-29 02:42:58 -0400 |
|---|---|---|
| committer | John Wiegley <johnw@newartisans.com> | 2008-08-29 02:42:58 -0400 |
| commit | 7e5230b8ffe32cfe7c1ec31d37c40684893aa787 (patch) | |
| tree | 182101ebdd0429321339ca54a49aae0f559a5fe2 /doc/EPD.txt | |
| parent | 9cd4c61d4ddbe87f461e402c754ea37782674bfd (diff) | |
Changed to using an autoconf/automake setup for building. This precipitated
many changes to the code, including:
- documentation has been moved into doc/
- the chess-eco opening moves are pre-generated from chess-eco.ps into
chess-eco.fen, so users don't have to wait around for it to build
- no longer using lispdoc to auto-gen function stubs in chess.texi,
this means that chess-maint.el and lispdoc.el are gone
Diffstat (limited to 'doc/EPD.txt')
| -rw-r--r-- | doc/EPD.txt | 1317 |
1 files changed, 1317 insertions, 0 deletions
diff --git a/doc/EPD.txt b/doc/EPD.txt new file mode 100644 index 0000000..906826c --- /dev/null +++ b/doc/EPD.txt @@ -0,0 +1,1317 @@ +EPD_Spec: Extended Position Description Specification + +Revised: 1995.11.26 + +Technical contact: sje@mv.mv.com (Steven J. Edwards) + +1: Introduction + +EPD is "Extended Position Description". It is a standard for describing +chess positions along with an extended set of structured attribute +values using the ASCII character set. It is intended for data and +command interchange among chessplaying programs. It is also intended +for the representation of portable opening library repositories and for +problem test suites. + +EPD is an open standard and is freely available for use by both research +and commercial programs without charge. The only requirement for use is +that any proposed extensions be coordinated through the technical +contact given at the start of this document. + +A single EPD record uses one text line of variable length composed of +four data fields followed by zero or more operations. A text file +composed exclusively of EPD data records should have a file name with +the suffix ".epd". + +2: History + +EPD was created in 1993 and is based in part on the earlier FEN standard +(Forsyth-Edwards Notation) for representing chess positions. Compared +to FEN, EPD has added extensions for use with opening library +preparation and also for general data and command interchange among +advanced chess programs. EPD was developed by John Stanback and Steven +Edwards; its first implementation was in Stanback's commercial +chessplaying program Zarkov and its second implementation was in +Edwards' research chessplaying program Spector. So many programs have +since adopted EPD that no one knows the exact sequence thereafter. + +EPD is employed for storing test suites for chessplaying programs and +for recording the results of programs running these test suites. 
+Example test suites are available for researchers via anonymous ftp from +the chess.onenet.net site in the pub/chess/Tests directory. The ASCII +text file pub/chess/Tests/Manifest gives descriptions of the contents of +the various test suite files. + +EPD is used to provide a linkage mechanism between chessplaying programs +and position database programs to support the automated direction of +analysis generation. + +3: EPD tools and applications + +To encourage development of EPD capable applications, a free EPD tool +kit is available for program authors working with the ANSI C language. +To further encourage usage of EPD, a number of free applications are +also available. + +3.1: The EPD Kit + +Work is currently in progress on developing an EPD Kit. This tool kit +is a collection of portable ANSI C source code files that provide +routines to create and manipulate EPD data for arbitrarily complex +records. It is designed to handle all common EPD related tasks so as to +assist chess program developers with EPD implementation. A secondary +goal is to ensure that every implementation of EPD processing have the +same set of operational semantics. + +The EPD Kit will be made freely available to all chess software authors +without charge and can be used in both research and commercial +applications. As with EPD itself, the only requirement for use is that +any proposed extensions be coordinated through the technical contact +given at the start of this document. + +3.2: Argus, the automated tournament referee + +Work is currently in progress on developing Argus, an automated +tournament referee program for computer chess events. Argus uses IP +(Internet Protocol) communications to act as a mediator for multiple +pairs of chessplaying programs and to provide an interactive interface +for a human tournament supervisor. 
Argus uses the EPD Kit along with +other routines to perform the following tasks: + +1) Starting chessplaying programs (IP clients) with proper +initialization data; + +2) Relaying position/move data (using EPD) from each program to its +opponent; + +3) Providing all chess clock data as part of the relay process; + +4) Record all games using PGN (Portable Game Notation) to assist in the +production of the tournament final report; + +5) Record all moves and other transmitted data in log files for later +analysis; + +6) Detect and report time forfeit conditions; + +7) Mediate draw offers and responses between each pair of opponents; + +8) Recognize and handle game termination conditions due to draws, +resignations, time forfeits, and checkmates; + +9) Allow for chessplaying program restart and game resumption as +directed by the human supervisor; + +10) Allow for a second instance of itself to operate in observer mode to +be ready to take over in case of primary machine failure; + +11) Support display of games in progress for the benefit of the human +supervisor and for the general viewing audience. + +In its usual configuration, Argus runs on an IP network that connects it +with all of the participating machines. It acts like a Unix style +server using TCP/IP; the chessplaying programs connect to Argus as +TCP/IP clients. Unlike a typical Unix style server, it runs in the +foreground instead of the background when operated by a human +supervisor. + +One variant mode of operation allows for Argus to be started by the host +system and run in the background. This use is intended for events where +human supervision is not required. Any operating information usually +provided manually may instead be supplied by configuration files. + +Another variant mode of operation allows for Argus to mediate +communication between a single pair of chessplaying programs using +regular (unstructured) bidirectional asynchronous serial communication +instead of IP. 
While less reliable than IP operation, unstructured +serial communication can be used on common inexpensive hardware +platforms that lack IP support. An example would be to use common PC +machines with each chessplaying program running on a separate machine +and a third machine running Argus in serial mode. Each of the two +machines with chessplaying programs connects to the Argus machine via a +null modem cable. Note that the Argus machine needs two free serial +ports while each of the chessplaying machines needs only a single free +serial port. +The Argus program will be made freely available to all chess software +authors without charge and can be used in both research and commercial +applications. As with EPD itself, the only requirement for use is that +any proposed extensions be coordinated through the technical contact +given at the start of this document. + +3.3: Gastric, an EPD based report generator + +Work is in progress on Gastric, an application that reads EPD files and +produces statistical reports. The main use of Gastric is to assist in +the process of benchmarking chessplaying program performance on EPD test +suites. The resulting reports contain summaries of raw performance, +identification of solved/missed problems, distribution information for +node count, time consumption, and other items. Advanced functions of +Gastric may be used to produce comparative analysis of different +programs or different versions of the same program. Some work is also +planned to allow Gastric output to be used as feedback into +self-adjusting chessplaying programs. + +The Gastric program will be made freely available to all chess software +authors without charge and can be used in both research and commercial +applications. As with EPD itself, the only requirement for use is that +any proposed extensions be coordinated through the technical contact +given at the start of this document.
+ +4: The four EPD data fields + +Each EPD record contains four data fields that describe the current +position. From left to right starting at the beginning of the record, +these are the piece placement, the active color, the castling +availability, and the en passant target square of a position. These can +all fit on a single text line in an easily read format. The length of +an EPD position description varies somewhat according to the position +and any associated operations. In some cases, the description could be +eighty or more characters in length and so may not fit conveniently on +some displays. However, most EPD records pass among programs only and +so are not usually seen by program users. + +Note: due to the likelihood of future expansion of EPD, implementors are +encouraged to have their programs handle EPD text lines up to 4096 +characters long including the traditional ASCII NUL character as a +terminator. This is an increase from the earlier suggestion of a +maximum length of 1024 characters. Depending on the host operating +system, the external representation of EPD records will include one or +more bytes to indicate the end of a line. These do not count against +the length limit as the internal representation of an EPD text record is +stripped of end of line bytes and instead is terminated by the +traditional ASCII NUL character. + +Each of the four EPD data fields is composed only of non-blank printing +ASCII characters. Adjacent data fields are separated by a single ASCII +space character. + +4.1: Piece placement data + +The first field represents the placement of the pieces on the board. +The board contents are specified starting with the eighth rank and +ending with the first rank. For each rank, the squares are specified +from file a to file h. White pieces are identified by uppercase SAN +(Standard Algebraic Notation) piece letters ("PNBRQK") and black pieces +are identified by lowercase SAN piece letters ("pnbrqk").
Empty squares +are represented by the digits one through eight; the digit used +represents the count of contiguous empty squares along a rank. The +contents of all eight squares on each rank must be specified; therefore, +the count of piece letters plus the sum of the vacant square counts must +always equal eight. The solidus character "/" (forward slash) is used +to separate data of adjacent ranks. There is no leading or trailing +solidus in the piece placement data; hence there are exactly seven +solidus characters in the placement field. + +The piece placement data for the starting array is: + +rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR + +4.2: Active color + +The second field represents the active color. A lower case "w" is used +if White is to move; a lower case "b" is used if Black is the active +player. + +The piece placement and active color data for the starting array is: + +rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w + +4.3: Castling availability + +The third field represents castling availability. This indicates +potential future castling that may or may not be possible at the moment +due to blocking pieces or enemy attacks. If there is no castling +availability for either side, the single character symbol "-" is used. +Otherwise, a combination of one to four characters is present. If +White has kingside castling availability, the uppercase letter "K" +appears. If White has queenside castling availability, the uppercase +letter "Q" appears. If Black has kingside castling availability, the +lowercase letter "k" appears. If Black has queenside castling +availability, then the lowercase letter "q" appears. Those letters +which appear will be ordered first uppercase before lowercase and second +kingside before queenside. There is no white space between the letters.
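As an illustration of the piece placement and castling availability rules described above, here is a minimal Python sketch. The function names check_placement and check_castling are hypothetical; they are not part of EPD or of the EPD Kit, and the sketch checks only the syntax of the two fields, not whether the position is legal.

```python
def check_placement(field):
    """Return True if `field` is syntactically valid piece placement data."""
    ranks = field.split("/")
    # Seven solidus characters separate exactly eight ranks.
    if len(ranks) != 8:
        return False
    for rank in ranks:
        squares = 0
        for ch in rank:
            if ch in "12345678":
                squares += int(ch)      # run of empty squares
            elif ch in "PNBRQKpnbrqk":
                squares += 1            # one white or black piece
            else:
                return False
        # Piece letters plus vacant square counts must total eight.
        if squares != 8:
            return False
    return True


def check_castling(field):
    """Return True if `field` is "-" or an ordered subset of "KQkq"."""
    if field == "-":
        return True
    # Rebuild the field in the required order (uppercase before lowercase,
    # kingside before queenside) and compare with what was given.
    canonical = "".join(c for c in "KQkq" if c in field)
    return field != "" and field == canonical
```

For example, check_placement accepts the starting array given above, and check_castling accepts "KQkq" and "Kq" but rejects "QK" (wrong order) and "KK" (repeated letter).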
+ +The piece placement, active color, and castling availability data for +the starting array is: + +rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq + +4.4: En passant target square + +The fourth field is the en passant target square. If there is no en +passant target square then the single character symbol "-" appears. If +there is an en passant target square then it is represented by a lowercase +file character (one of "abcdefgh") immediately followed by a rank digit. +Obviously, the rank digit will be "3" following a white pawn double +advance (Black is the active color) or else be the digit "6" after a +black pawn double advance (White being the active color). + +An en passant target square is given if and only if the last move was a +pawn advance of two squares. Therefore, an en passant target square +field may have a square name even if there is no pawn of the opposing +side that may immediately execute the en passant capture. + +The piece placement, active color, castling availability, and en passant +target square data for the starting array is: + +rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - + +5: Operations + +An EPD operation is composed of an opcode followed by zero or more +operands and is concluded by a semicolon. + +Multiple operations are separated by a single space character. If there +is at least one operation present in an EPD line, it is separated from +the last (fourth) data field by a single space character. + +5.1: General format of opcodes and operands + +An opcode is an identifier that starts with a letter character and may +be followed by up to fourteen more characters. Each additional +character may be a letter or a digit or the underscore character. +Traditionally, no uppercase letters are used in opcode names that are to +be used by more than one program. + +An operand is either a set of contiguous non-white space printing +characters or a string.
A string is a set of contiguous printing +characters delimited by a quote (ASCII code: 34 decimal, 0x22 +hexadecimal) character at each end. A string value must have fewer than +256 bytes of data. This count does not include the traditional ASCII +NUL character terminator. + +If at least one operand is present in an operation, there is a single +space between the opcode and the first operand. If more than one +operand is present in an operation, there is a single blank character +between every two adjacent operands. If there are no operands, a +semicolon character is appended to the opcode to mark the end of the +operation. If any operands appear, the last operand has an appended +semicolon that marks the end of the operation. + +Any given opcode appears at most once per EPD record. Multiple +operations in a single EPD record should appear in ASCII order of their +opcode names (mnemonics). However, a program reading EPD records may +allow for operations not in ASCII order by opcode mnemonics; the +semantics are the same in either case. + +Some opcodes that allow for more than one operand may have special +ordering requirements for the operands. For example, the "pv" +(predicted variation) opcode requires its operands (moves) to appear in +the order in which they would be played. Most other opcodes that allow +for more than one operand should have operands appearing in ASCII order. +An example of the latter set is the "bm" (best move[s]) opcode; its +operands are moves that are all immediately playable from the current +position. + +5.2: Operand basetypes + +Operand values are represented using a variety of basetypes. + +5.2.1: Identifier basetype + +Some opcodes require one or more operands that are identifiers. An +identifier is an unquoted sequence of one to fifteen characters. The +characters are selected from the upper and lower case letters, the ten +digits, and the underscore character.
Most identifiers that may appear +in EPD are taken from predefined sets as explained in the sections +covering opcode semantics. + +Identifiers are most often used to select one value from a list of +possible values for a general attribute. They are also used to +represent PGN tag attributes. + +5.2.2: Chess move basetype + +Some opcodes require one or more operands that are chess moves. These +moves should be represented using SAN (Standard Algebraic Notation). If +a different representation is used, there is no guarantee that the EPD +will be read correctly during subsequent processing. In particular, EDN +(English Descriptive Notation), CCN (Computer Coordinate Notation), and +LAN (Long Algebraic Notation) are explicitly not supported. + +Chess moves are used most often in single operand operations to select +one move from the available moves. They are also used in multiple +operand operations to define a set of moves (all taken from available +moves) and in multiple operand operations to express a sequence of moves +(taken from moves available at each point in a forward sequence of +play). + +Note that some chess moves also qualify as identifiers. However, the +semantics of a particular opcode dictate the exact basetype +interpretation of its operands, so there is no ambiguity. + +5.2.3: Integer basetype + +Some opcodes require one or more operands that are integers. Some +opcodes may require that an integer operand must be within a given +range; the details are described in the opcode list given below. A +negative integer is formed with a hyphen (minus sign) preceding the +integer digit sequence. An optional plus sign may be used for +indicating a non-negative value, but such use is not required and is +discouraged. Support for integers in the range -2147483648 to +2147483647 (32 bit two's complement signed extrema) is required. + +Integers are used to represent centipawn scores and also for various +counts, limits, and totals. 
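The integer basetype rules above can be sketched in Python as follows. The name parse_epd_integer is hypothetical. Rejecting values outside the 32 bit range is a choice made for this sketch only; the specification states the range that must be supported, not an upper bound on what a program may accept.

```python
import re

# Optional sign followed by at least one ASCII digit.
INT_RE = re.compile(r"[+-]?[0-9]+")

# 32 bit two's complement signed extrema, the range EPD requires.
INT_MIN = -2147483648
INT_MAX = 2147483647


def parse_epd_integer(text):
    """Parse an EPD integer operand, raising ValueError if it is malformed."""
    if INT_RE.fullmatch(text) is None:
        raise ValueError("not an EPD integer: %r" % text)
    value = int(text)
    # Assumption of this sketch: treat out-of-range values as errors.
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError("integer out of 32 bit range: %r" % text)
    return value
```

A leading plus sign is accepted on input because the specification allows it, although writers are discouraged from emitting one.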
+ +5.2.4: Floating basetype + +Some opcodes require one or more operands that are floating point +numbers. Some opcodes may require that a floating point operand must be +within a given range; the details are described in the opcode list given +below. A floating point operand is constructed from an optional sign +character ("+" or "-"), a digit sequence (with at least one digit), a +radix point (always "."), and a final digit sequence (with at least one +digit). There is currently no provision for scientific representation +of numeric values. + +The floating basetype is not in current use. + +5.2.5: Date basetype + +Some opcodes require one or more operands that represent dates. These +are given in a special date format composed of ten characters. The +first four characters are digits that give the year (0001-9999), the +fifth character is a period, the sixth and seventh characters are digits +that give the month number (01-12), the eighth character is a period, +and the ninth and tenth characters are digits that give the day number +in the month (01-31). + +The date basetype is used to specify date values in timestamps. + +5.2.6: Time of day basetype + +Some opcodes require one or more operands that represent a time of day. +These are given in a special time of day format composed of eight +characters. The first two characters are digits that give the hour +(00-23), the third character is a colon, the fourth and fifth characters +are digits that give the minute (00-59), the sixth character is a colon, +and the seventh and eighth characters are digits that give the second +(00-59). + +The time of day basetype is used to specify time of day values in +timestamps. + +5.2.7: Clock basetype + +Some opcodes require one or more operands that represent a total amount +of time as would be measured by a traditional digital clock. These are +given in a special clock format composed of 12 characters.
The first +three characters are digits giving a count of days (000-999), the fourth +character is a colon, the fifth and sixth characters are digits giving a +count of hours (00-23), the seventh character is a colon, the eighth and +ninth characters are digits giving a count of minutes (00-59), the tenth +character is a colon, and the eleventh and twelfth characters are digits +giving a count of seconds (00-59). + +The clock basetype is used to specify clock values for chess clock +information. It is not used to measure time consumption for a search; +an integer count of seconds is used instead. + +5.3: Opcode mnemonics + +An opcode mnemonic used for archival storage and for interprogram +communication starts with a lower case letter and is composed of only +lower case letters, digits, and the underscore character (i.e., no upper +case letters). Mnemonics are all at least two characters long. + +Opcode mnemonics used only by a single program or an experimental suite +of programs should start with an upper case letter. This is so they may +be easily distinguished should they inadvertently be encountered by +other programs. When such a "private" opcode is demonstrated to be +widely useful, it should be brought into the official list (appearing +below) in a lower case form. + +If a given program does not recognize a particular opcode, that +operation is simply ignored; it is not signaled as an error. + +6: Opcode list + +The opcodes are listed here in ASCII order of their mnemonics. +Suggestions for new opcodes should be sent to the technical contact +listed near the start of this document. + +6.1: Opcode "acn": analysis count: nodes + +The opcode "acn" takes a single non-negative integer operand. It is +used to represent the number of nodes examined in an analysis or search. +Note that the value may be quite large for some extended searches and so +use of a long (four byte) representation is suggested.
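To contrast the plain integer counts used by "acn" and "acs" with the clock basetype of section 5.2.7, the following Python sketch converts a twelve character clock value into a count of seconds. The name parse_clock is hypothetical and the conversion to seconds is only an illustration; EPD itself defines the text format, not an internal representation.

```python
import re

# ddd:hh:mm:ss using ASCII digits only (12 characters in total).
CLOCK_RE = re.compile(r"([0-9]{3}):([0-9]{2}):([0-9]{2}):([0-9]{2})")


def parse_clock(text):
    """Convert an EPD clock value to a total number of seconds."""
    match = CLOCK_RE.fullmatch(text)
    if match is None:
        raise ValueError("not an EPD clock value: %r" % text)
    days, hours, minutes, seconds = (int(g) for g in match.groups())
    # Enforce the field ranges given in the specification.
    if hours > 23 or minutes > 59 or seconds > 59:
        raise ValueError("clock field out of range: %r" % text)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

For example, parse_clock("000:01:30:00") yields 5400, the number of seconds in ninety minutes.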
+ +6.2: Opcode "acs": analysis count: seconds + +The opcode "acs" takes a single non-negative integer operand. It is +used to represent the number of seconds used for an analysis or search. +Note that the value may be quite large for some extended searches and so +use of a long (four byte) representation is suggested. Also note that +the special clock format is not used for this operand. Some systems can +distinguish between elapsed time and processor time; in such cases, the +processor time should be used as its value is usually more indicative of +search effort than wall clock time. + +6.3: Opcode "am": avoid move(s) + +The opcode "am" indicates a set of zero or more moves, all immediately +playable from the current position, that are to be avoided as a search +result. Each operand is a SAN move; they appear in ASCII order. + +6.4: Opcode "bm": best move(s) + +The opcode "bm" indicates a set of zero or more moves, all immediately +playable from the current position, that are judged to be the best +available by the EPD writer and so each is allowable as a search result. +Each operand is a SAN move; they appear in ASCII order. + +6.5: Opcode "c0": comment (primary, also "c1" through "c9") + +The opcode "c0" (lower case letter "c", digit character zero) indicates +a top level comment that applies to the given position. It is the first +of ten ranked comments, each of which has a mnemonic formed from the +lower case letter "c" followed by a single decimal digit. Each of these +opcodes takes either a single string operand or no operand at all. + +This ten member comment family of opcodes is intended for use as +descriptive commentary for a complete game or game fragment. The usual +processing of these opcodes is as follows: + +1) At the beginning of a game (or game fragment), a move sequence +scanning program initializes each element of its set of ten comment +string registers to be null.
+ +2) As the EPD record for each position in the game is processed, the +comment operations are interpreted from left to right. (Actually, all +operations in an EPD record are interpreted from left to right.) +Because operations appear in ASCII order according to their opcode +mnemonics, opcode "c0" (if present) will be handled prior to all other +comment opcodes, then opcode "c1" (if present), and so forth until opcode "c9" +(if present). + +3) The processing of opcode "cN" (0 <= N <= 9) involves two steps. +First, all comment string registers with an index equal to or greater +than N are set to null. (This is the set "cN" through "c9".) Second, +and only if a string operand is present, the value of the corresponding +comment string register is set equal to the string operand. + +6.6: Opcode "cc": chess clock values + +The opcode "cc" is used to indicate the amount of time used for each +side at the time of the writing of the opcode to the EPD record. This +opcode always takes two values. Both values are in clock format. The +first is the amount of time consumed by White and the second is the +amount of time consumed by Black. Note that these values are not simple +integers. Also, there is no provision for recording at a resolution of +less than one second. + +This opcode is most commonly used by a mediation program as a source of +impartial time information for a pair of opposing players. + +6.7: Opcode "ce": centipawn evaluation + +The opcode "ce" indicates the evaluation of the indicated position in +centipawn units. It takes a single operand, an optionally signed +integer that gives an evaluation of the position from the viewpoint of +the active player; i.e., the player with the move. Positive values +indicate a position favorable to the moving player while negative values +indicate a position favorable to the passive player; i.e., the player +without the move. A centipawn evaluation value close to zero indicates +a neutral positional evaluation.
+ +Values are restricted to integers that are equal to or greater than +-32768 and +are less than or equal to 32766. + +A value greater than 32000 indicates the availability of a forced mate +to the active player. The number of plies until mate is given by +subtracting the evaluation from the value 32767. Thus, a winning mate +in N fullmoves is a mate in ((2 * N) - 1) halfmoves (or ply) and has a +corresponding centipawn evaluation of (32767 - ((2 * N) - 1)). For +example, a mate on the move (mate in one) has a centipawn evaluation of +32766 while a mate in five has a centipawn evaluation of 32758. + +A value less than -32000 indicates the availability of a forced mate to +the passive player. The number of plies until mate is given by +subtracting the evaluation from the value -32767 and then negating the +result. Thus, a losing mate in N fullmoves is a mate in (2 * N) +halfmoves (or ply) and has a corresponding centipawn evaluation of +(-32767 + (2 * N)). For example, a mate after the move (losing mate in +one) has a centipawn evaluation of -32765 while a losing mate in five +has a centipawn evaluation of -32757. + +A value of -32767 indicates that the side to move is checkmated. A +value of -32768 indicates an illegal position. A stalemate position has +a centipawn evaluation of zero as does a position drawn due to +insufficient mating material. Any other position known to be a certain +forced draw also has a centipawn evaluation of zero. + +6.8: Opcode "dm": direct mate fullmove count + +The "dm" opcode is used to indicate the number of fullmoves until +checkmate is to be delivered by the active color for the indicated +position. It always takes a single operand which is a positive integer +giving the fullmove count. For example, a position known to be a "mate +in three" would have an operation of "dm 3;" to indicate this. + +This opcode is intended for use with problem sets composed of positions +requiring direct mate answers as solutions. 
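The mate scoring arithmetic given for the "ce" opcode in section 6.7 can be sketched in Python as follows. The function names are hypothetical and exist only for this illustration. A winning "dm N;" position corresponds to the value returned by ce_for_winning_mate(N).

```python
MATE_BASE = 32767  # mate scores are measured from this value


def ce_for_winning_mate(fullmoves):
    """ce value when the active player mates in `fullmoves` fullmoves."""
    plies = 2 * fullmoves - 1
    return MATE_BASE - plies


def ce_for_losing_mate(fullmoves):
    """ce value when the active player is mated in `fullmoves` fullmoves."""
    plies = 2 * fullmoves
    return -MATE_BASE + plies


def plies_to_mate(ce):
    """Return the ply count for a mate score, or None for an ordinary score.

    A result of 0 means the side to move is already checkmated.
    """
    # -32768 marks an illegal position; anything else outside the
    # permitted range is malformed. Both are rejected by this sketch.
    if not -32767 <= ce <= 32766:
        raise ValueError("illegal position or value out of range: %d" % ce)
    if ce > 32000:
        return MATE_BASE - ce
    if ce < -32000:
        return -(-MATE_BASE - ce)
    return None
```

These functions reproduce the worked values in the text: a mate in one scores 32766, a mate in five scores 32758, a losing mate in one scores -32765, and a losing mate in five scores -32757.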
+ +6.9: Opcode "draw_accept": accept a draw offer + +The opcode "draw_accept" is used to indicate that a draw offer made +after the move that led to the indicated position is accepted by the +active player. This opcode takes no operands. + +The "draw_accept" opcode should not appear on the same EPD record as a +"draw_reject" opcode. + +6.10: Opcode "draw_claim": claim a draw + +The opcode "draw_claim" is used to indicate a claim by the active player +that a draw exists. The draw is claimed because of a third time +repetition or because of the fifty move rule or because of insufficient +mating material. A supplied move (see the opcode "sm") is also required +to appear as part of the same EPD record. The "draw_claim" opcode takes +no operands. + +The "draw_claim" opcode should not appear on the same EPD record as a +"draw_offer" opcode. + +6.11: Opcode "draw_offer": offer a draw + +The opcode "draw_offer" is used to indicate that a draw is offered by +the active player. A supplied move (see the opcode "sm") is also +required to appear as part of the same EPD record; this move is +considered played from the indicated position. The "draw_offer" opcode +takes no operands. + +The "draw_offer" opcode should not appear on the same EPD record as a +"draw_claim" opcode. + +6.12: Opcode "draw_reject": reject a draw offer + +The opcode "draw_reject" is used to indicate that a draw offer made +after the move that led to the indicated position is rejected by the +active player. This opcode takes no operands. + +The "draw_reject" opcode should not appear on the same EPD record as a +"draw_accept" opcode. + +6.13: Opcode "eco": _Encyclopedia of Chess Openings_ opening code + +The opcode "eco" is used to associate an opening designation from the +_Encyclopedia of Chess Openings_ taxonomy with the indicated position. +The opcode takes either a single string operand (the ECO opening name) +or no operand at all.
If an operand is present, its value is associated +with an "ECO" string register of the scanning program. If there is no +operand, the ECO string register of the scanning program is set to null. + +The usage is similar to that of the "ECO" tag pair of the PGN standard. + +6.14: Opcode "fmvn": fullmove number + +The opcode "fmvn" represents the fullmove number associated with the +position. It always takes a single operand that is the positive integer +value of the move number. The value of the fullmove number for the +starting array is one. + +This opcode is used to explicitly represent the fullmove number in EPD +that is present by default in FEN as the sixth field. Fullmove number +information is usually omitted from EPD because it does not affect move +generation (commonly needed for EPD-using tasks) but it does affect game +notation (commonly needed for FEN-using tasks). Because of the desire +for space optimization for large EPD files, fullmove numbers were +dropped from EPD's parent FEN. The halfmove clock information was +similarly dropped. + +6.15: Opcode "hmvc": halfmove clock + +The opcode "hmvc" represents the halfmove clock associated with the +position. The halfmove clock of a position is equal to the number of +plies since the last pawn move or capture. This information is used to +implement the fifty move draw rule. It always takes a single operand +that is the non-negative integer value of the halfmove clock. The value +of the halfmove clock for the starting array is zero. + +This opcode is used to explicitly represent the halfmove clock in EPD +that is present by default in FEN as the fifth field. Halfmove clock +information is usually omitted from EPD because it does not affect move +generation (commonly needed for EPD-using tasks) but it does affect game +termination issues (commonly needed for FEN-using tasks). Because of +the desire for space optimization for large EPD files, halfmove clock +values were dropped from EPD's parent FEN. 
The fullmove number +information was similarly dropped. + +6.16: Opcode "id": position identification + +The opcode "id" is used to provide a simple identification label for the +indicated position. It takes a single string operand. + +This opcode is intended for use with test suites used for measuring +chessplaying program strength. An example "id" operand for the seven +hundred fifty seventh position of the one thousand one problems in +Reinfeld's _1001 Winning Chess Sacrifices and Combinations_ would be +"WCSAC.0757" while the fifteenth position in the twenty four problem +Bratko-Kopec test suite would have an "id" operand of "BK.15". + +6.17: Opcode "nic": _New In Chess_ opening code + +The opcode "nic" is used to associate an opening designation from the +_New In Chess_ taxonomy with the indicated position. The opcode takes +either a single string operand (the NIC code for the opening) or no +operand at all. If an operand is present, its value is associated with +an "NIC" string register of the scanning program. If there is no +operand, the NIC string register of the scanning program is set to null. + +The usage is similar to that of the "NIC" tag pair of the PGN standard. + +6.18: Opcode "noop": no operation + +The "noop" opcode is used to indicate no operation. It takes zero or +more operands, each of which may be of any type. The operation involves +no processing. It is intended for use by developers for program testing +purposes. + +6.19: Opcode "pm": predicted move + +The "pm" opcode is used to provide a single predicted move for the +indicated position. It has exactly one operand, a move playable from +the position. This move is judged by the EPD writer to represent the +best move available to the active player. + +If a non-empty "pv" (predicted variation) line of play is also present +in the same EPD record, the first move of the predicted variation is the +same as the predicted move. 
+ +The "pm" opcode is intended for use as a general "display hint" +mechanism. + +6.20: Opcode "ptp": PGN tag pair + +The "ptp" opcode is used to record a PGN tag pair. It always takes an +even number of operands. For each pair of operands (from left to +right), the first operand in the pair is always an identifier and is +interpreted as the name of a PGN tag; the second operand in the pair is +always a string and is the value associated with the tag given by the +first operand. + +Any given PGN tag name should only appear once as a tag identifier +operand in a "ptp" operation. + +6.21: Opcode "pv": predicted variation + +The "pv" opcode is used to provide a predicted variation for the +indicated position. It has zero or more operands which represent a +sequence of moves playable from the position. This sequence is judged +by the EPD writer to represent the best play available. + +If a "pm" (predicted move) operation is also present in the same EPD +record, the predicted move is the same as the first move of the +predicted variation. + +6.22: Opcode "rc": repetition count + +The "rc" opcode is used to indicate the number of occurrences of the +indicated position. It takes a single, positive integer operand. Any +position, including the initial starting position, is considered to have +an "rc" value of at least one. A value of three indicates a candidate +for a draw claim by the position repetition rule. + +6.23: Opcode "refcom": referee command + +The "refcom" opcode is used to represent a command from a referee +program to a client program during automated competition. It takes a +single identifier operand which is to be interpreted as a command by the +receiving program. Note that as the operand is an identifier and not a +string value, it is not enclosed in quote characters. + +There are seven available operand values: conclude, disconnect, execute, +fault, inform, reset, and respond. 
+ +Further details of "refcom" usage are given in the section on referee +semantics later in this document. + +6.24: Opcode "refreq": referee request + +The "refreq" opcode is used to represent a request from a client program +to the referee program during automated competition. It takes a single +identifier operand which is to be interpreted as a request to the +referee from a client program. Note that as the operand is an +identifier and not a string value, it is not enclosed in quote +characters. + +There are four available operand values: fault, reply, sign_off, and +sign_on. + +Further details of "refreq" usage are given in the section on referee +semantics later in this document. + +6.25: Opcode "resign": game resignation + +The opcode "resign" is used to indicate that the active player has +resigned the game. This opcode takes no operands. + +The "resign" opcode should not appear on the same EPD record with any of +the following opcodes: "draw_accept", "draw_claim", "draw_decline", and +"draw_offer". + +6.26: Opcode "sm": supplied move + +The "sm" opcode is used to provide a single supplied move for the +indicated position. It has exactly one operand, a move playable from +the position. This move is the move to be played from the position. + +If an "sv" (supplied variation) operation is present on the same record +and has at least one operand, then its first operand must match the +single operand of the "sm" opcode. + +The "sm" opcode is intended for use to communicate the most recently +played move in an active game. It is used to communicate moves between +programs in automatic play via a network. This includes correspondence +play using e-mail and also programs acting as network front ends to +human players. + +6.27: Opcode "sv": supplied variation + +The "sv" opcode is used to provide zero or more supplied moves for the +indicated position. The operands are a move sequence playable from the +position. 
+ +If an "sm" (supplied move) operation is also present on the same record +and the "sv" operation has at least one operand, then the "sm" operand +must match the first operand of the "sv" operation. + +6.28: Opcode "tcgs": telecommunication: game selector + +The "tcgs" opcode is one of the telecommunication family of opcodes used +for games conducted via e-mail and similar means. This opcode takes a +single operand that is a positive integer. It is used to select among +various games in progress between the same sender and receiver. + +Details of e-mail implementation await further development. + +6.29: Opcode "tcri": telecommunication: receiver identification + +The "tcri" opcode is one of the telecommunication family of opcodes used +for games conducted via e-mail and similar means. This opcode takes two +order dependent string operands. The first operand is the e-mail +address of the receiver of the EPD record. The second operand is the +name of the player (program or human) at the address who is the actual +receiver of the EPD record. + +Details of e-mail implementation await further development. + +6.30: Opcode "tcsi": telecommunication: sender identification + +The "tcsi" opcode is one of the telecommunication family of opcodes used +for games conducted via e-mail and similar means. This opcode takes two +order dependent string operands. The first operand is the e-mail +address of the sender of the EPD record. The second operand is the name +of the player (program or human) at the address who is the actual sender +of the EPD record. + +Details of e-mail implementation await further development. + +6.31: Opcode "ts": timestamp + +The "ts" opcode is used to record a timestamp value. It takes two +operands. The first operand is in date format and the second operand is +in time of day format. The interpretation of the combined operand values +gives the time of the last modification of the EPD record. 
The +timestamp is interpreted to be in UTC (Coordinated Universal Time, +formerly known as GMT). + +6.32: Opcode "v0": variation name (primary, also "v1" through "v9") + +The opcode "v0" (lower case letter "v", digit character zero) indicates +a top level variation name that applies to the given position. It is +the first of ten ranked variation names, each of which has a mnemonic +formed from the lower case letter "v" followed by a single decimal +digit. Each of these opcodes takes either a single string operand or no +operand at all. + +This ten member variation name family of opcodes is intended for use as +traditional variation names for a complete game or game fragment. The +usual processing of these opcodes is as follows: + +1) At the beginning of a game (or game fragment), a move sequence +scanning program initializes each element of its set of ten variation +name string registers to be null. + +2) As the EPD record for each position in the game is processed, the +variation name operations are interpreted from left to right. +(Actually, all operations in an EPD record are interpreted from left to +right.) Because operations appear in ASCII order according to their +opcode mnemonics, opcode "v0" (if present) will be handled prior to all +other variation name opcodes, then opcode "v1" (if present), and so +forth until opcode "v9" (if present). + +3) The processing of opcode "vN" (0 <= N <= 9) involves two steps. +First, all variation name string registers with an index equal to or +greater than N are set to null. (This is the set "vN" through "v9".) +Second, and only if a string operand is present, the value of the +corresponding variation name string register is set equal to the string +operand. + +7: EPD processing verbs + +An EPD processing verb is a command to an EPD capable program used to +direct processing of one or more EPD files. 
Standardization of verb +semantics among EPD capable programs is important for reducing +confusion among program users and for better ensuring overall +interoperability. + +Each EPD processing verb that requires the reading of EPD records has a +specific set of required opcodes that must be on each input record. +Each EPD processing verb that requires the writing of EPD records has a +specific set of required opcodes that must be on each output record. +Some EPD processing verbs imply both reading and writing EPD records; +these will have requirements for both input and output opcode sets. + +The names of the EPD processing verbs in this section are for +specification purposes only. Program authors are free to select +different names as appropriate for the needs of a program's user +interface. + +7.1: EPD verb: pfdn (process file: data normalization) + +The "pfdn" (process file: data normalization) verb reads an EPD input +file and produces a normalized copy of the data as the EPD output +file. The output file retains the record ordering of the input file. +The normalization is used to produce a canonical representation of the +EPD. The input records are also checked for legality. There is no +minimum set of operations required on the input records. For each input +record, all of the operations present are reproduced in the +corresponding output record. + +The normalization of each EPD record consists of the following actions: + +1) Any leading whitespace characters are removed. + +2) Any trailing whitespace characters are removed. + +3) Any unneeded whitespace characters used as data separators are +removed; a single blank is used to separate adjacent fields, adjacent +operations, and adjacent operands. Also, a single blank character is +used to separate the fourth position data field (the en passant target +square indication) from the first operation (if present). + +4) Operations are reordered in increasing ASCII order by opcode +mnemonic. 
+ +5) Operands for each opcode that does not require a special order of +interpretation are reordered in increasing ASCII order by external +representation. + +Data normalization is useful for making a canonical version from data +produced by programs or other sources that do not completely conform to +the lexicographical and ordering rules of the EPD standard. It also helps +when comparing two EPD files from different sources on a line by line +basis; the non-semantic differences are removed so that different text +lines indicate true semantic difference. + +7.2: EPD verb: pfga (process file: general analysis) + +The "pfga" (process file: general analysis) verb is used to instruct a +chessplaying program to perform an analysis for each EPD input record +and produce an EPD output file containing this analysis. The output +file retains the record ordering of the input file. The current +position given by each input record is not changed; it is copied to the +output. + +Each input EPD record receives the same analysis effort. The level of +effort is indicated as a command (separate from EPD) to the analysis +program prior to the start of the EPD processing. Usually, the level is +given as a time limit or depth limit per position. The limit can +be either a hard limit or a soft limit. A hard limit represents an +absolute maximum effort per position, while a soft limit allows the +program to spend more or less effort per position. The hard limit +interpretation is preferred for comparing programs. The soft limit +interpretation is used to help test time allocation strategy where a +program can choose to take more or less time depending on the complexity +of a position. + +Each EPD output record is a copy of the corresponding EPD input record +with new analysis added as a result of the verb processing. + +There is no minimum set of operations required for the EPD input +records. + +Each output EPD record must contain: + +1) A "pv" (predicted variation) operation. 
The operands of this operation form a +sequence of chess moves to be played from the given position. The +length of this sequence may vary from record to record due to the level of +analysis effort and the complexity of each position. However, unless the +current position represents a checkmate or stalemate for the side to +move, the pv operation must include at least one move. If the current +position represents a checkmate or stalemate for the side to move, then +the pv operation still appears, but has no operands. + +2) A "ce" (centipawn evaluation) operation. The value of its operand is +the value in hundredths of a pawn of the current position. Note that +the evaluation is assigned to the position before the predicted move (or +any other move) is made. Thus, a positive centipawn score indicates an +advantage for the side to move in the current position while a negative +score indicates a disadvantage for the side to move. + +Each output EPD record may also contain: + +1) A "pm" (predicted move) operation, unless the current position +represents a checkmate or stalemate for the side to move. (If the side +to move has no moves, then the "pm" operation will not appear.) The +single operand of the "pm" opcode must be the same as the first operand +of the "pv" sequence. + +2) An "sm" (supplied move) operation, unless the current position +represents a checkmate or stalemate for the side to move. (If the side +to move has no moves, then the "sm" operation will not appear.) The +single operand of the "sm" opcode must be the same as the first operand +of the "pv" sequence. + +3) An "acn" (analysis count: nodes) operation. The single operand is +the number of nodes visited in the analysis search for the position. + +4) An "acs" (analysis count: seconds) operation. The single operand is +the number of seconds used for the analysis search for the position. + +7.3: EPD verb: pfms (process file: mate search) + +The "pfms" verb is used to conduct searches for forced checkmating +sequences. 
The length of the forced mate sequence is provided (outside +of EPD) to the program prior to the beginning of "pfms" processing. The +length is specified using a fullmove count. For example, a fullmove +mate length of three would instruct the program to search for all mates +in three. An analysis program reads an input EPD file and looks for +forced mates in each position where no forced mate of equal or lesser +length has been recorded. The output file retains the record ordering +of the input file. + +The action of the "pfms" command on each record is governed by the +pre-specified fullmove count and, if present on the record, the value of +the "dm" (direct mate fullmove count) operand. A particular record will +be subject to a search for a forced mate if either: + +1) There is no "dm" operation on the input record, or + +2) The value of the "dm" operand on the input record is greater than the +value of the pre-specified fullmove analysis length. + +If the analysis program finds a forced mate, it produces two additional +operations on the corresponding output EPD record: + +1) A "dm" operation with an operand equal to the pre-specified fullmove +mate length. + +2) A "pm" operation with the first move of the mating sequence as its +operand. If two or more such moves exist, the program selects the first +one it located to appear as the "pm" operand. + +The idea is that a set of positions can be repeatedly scanned by a mate +finding program with the fullmove analysis depth starting with a value +of one and being increased by one with each pass. For any given pass, +the positions solved by an earlier pass are skipped. + +The output EPD records may also contain other (optional) information +such as "acn", "acs", and "pv" operations. + +7.4: EPD verb: pfop (process file: operation purge) + +The "pfop" verb is used to purge a particular operation from each of the +records in an EPD file that contain the operation. 
The output file +retains the record ordering of the input file. Prior to processing, the +opcode of the operation to be purged is specified. + +The records of the input file are copied to the output file. If the +pre-specified operation is present on a record, the operation is removed +prior to copying the record to the output. + +7.5: EPD verb: pfts (process file: target search) + +The "pfts" (process file: target search) verb is similar to the "pfga" +(process file: general analysis) verb in that each position on the EPD +input file is subject to a general analysis. The difference is that +each input record contains a set of target moves and a set of avoidance +moves. Either of these two sets, but not both, may be empty. The set +of avoidance moves is given by the operands of an "am" opcode (if +present). The set of target moves is given by the operands of a "bm" +opcode (if present). + +Prior to processing the target search, the program is given a search +effort limit such as a limit on the amount of search time or search +nodes per position. The "pfts" verb causes each input EPD record to be +read, subjected to analysis, and then written to the output file with the +predicted move attached with the "pm" opcode. (No "pm" operation is +added if the current position is a checkmate or stalemate of the side to +play.) + +The output EPD records may also contain other (optional) information +such as "acn", "acs", and "pv" operations. + +8: EPD referee semantics + +Communication between a chessplaying program and a referee program is +performed by exchanging EPD records. Each EPD record emitted by a +chessplaying program to be received by the referee has a "refreq" EPD +opcode with an operand that describes the request. Each EPD record +emitted by a referee to be received by a chessplaying program has a +"refcom" EPD opcode with an operand that describes the command. 
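The split between "refcom" and "refreq" records can be sketched as a small classifier. This is a minimal illustration, not part of the specification: it assumes the operations of a record are already parsed into a dictionary mapping opcode to a list of operand strings (a hypothetical representation), and it assumes that a record carrying both opcodes, or neither, should be rejected. The operand sets are the ones listed for the two opcodes in this document.

```python
# Operand values listed in this document for the two referee opcodes.
REFCOM_OPERANDS = frozenset(
    ["conclude", "disconnect", "execute", "fault", "inform", "reset", "respond"]
)
REFREQ_OPERANDS = frozenset(["fault", "reply", "sign_off", "sign_on"])


def classify_record(operations):
    """Return (sender, operand) for a referee protocol record.

    `operations` is a hypothetical parsed form: dict of opcode -> operand list.
    """
    has_com = "refcom" in operations
    has_req = "refreq" in operations
    # Assumption of this sketch: exactly one of the two opcodes is present.
    if has_com == has_req:
        raise ValueError("expected exactly one of refcom / refreq")
    opcode, allowed, sender = (
        ("refcom", REFCOM_OPERANDS, "referee")
        if has_com
        else ("refreq", REFREQ_OPERANDS, "client")
    )
    operands = operations[opcode]
    if len(operands) != 1 or operands[0] not in allowed:
        raise ValueError(f"bad {opcode} operand list: {operands!r}")
    return sender, operands[0]
```

Note that "fault" is valid in both directions, so the opcode, not the operand, identifies the sender.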
+ +The usual operation sequence in a referee mediated event is as follows: + +1) The referee server program is started and the human event supervisor +provides it with any necessary tournament information including the +names of the chessplaying programs, the name of the event, and various +other data. + +2) The referee program completes its initialization by performing +pairing operations as required. + +3) Once the server has its initial data, it then opens a socket and +binds it to the appropriate port. It then starts listening for input +from clients. For a serial implementation, an analogous function is +performed. + +4) The competing chessplaying programs (clients) are started (if not +already running) and are given the name of the referee host machine +along with the port number. For a serial implementation, an analogous +function is performed. + +5) Each client program transmits an EPD record to the referee requesting +registration. This causes each client to be signed on to the referee. + +6) The referee program replies to each client signing on with an EPD +record commanding a reset operation to set up for a new game. + +7) The referee program sends an EPD record to each client informing each +client about the values for each of the tag values for the PGN Seven Tag +Format. + +8) For each client on the move, the referee will send an EPD record +commanding a response. This causes each receiving client to calculate a +move. If there has been a prior move, it along with the position from +which the move is played is sent. If there has been no prior move, the +current position is sent but no move is included. + +9) For each client receiving a command to respond, the current position +indicated by the record is set as the current position in the receiving +program. (It should already be the current position in the receiver.) +If a supplied move was given, it is executed on the current position. +Finally, the receiving program calculates a move. 
+ +10) As each program on the move completes its calculation, it sends a +reply to the referee which includes the result of the calculation. The +position sent back on the reply is the result of applying the move +received on the referee record to the position on the same received +record. If a move was produced as the result of the calculation, it is +also sent. (A move will not be produced or sent if the receiving client +was checkmated, or if it was stalemated, or if it resigns, or claims a +draw due to insufficient material.) + +11) As the referee receives a reply from a client, it produces a respond +command record to the client's opponent. (This step will be skipped if +an end of game condition is detected and no further moves need to be +communicated.) + +12) The referee continues with the respond/reply cycle for each pair of +opponent clients until the game concludes for that pair. + +13) For each game conclusion, the referee sends a conclude command to +each of the clients involved. + +14) When a client is to be removed from competition, it sends a sign off +request. This eliminates that program from being paired until it +re-registers with a sign on request. + +15) When the referee server is to be removed from network operations, +it will send a disconnect command to each client that is currently +signed on to the referee. + +8.1: Referee commands (client directives) + +The referee communicates the command of interest as the single operand +of the "refcom" opcode. The refcom opcode will be on each record sent +by the referee. Each possible refcom operand is sent as an identifier +(and not as a string). + +EPD records sent by the referee will include chess clock data as +appropriate. Whenever a client program receives a record with the "cc" +(chess clock) opcode, the client should set the values of its internal +clocks to the values specified by the cc operands. Note that the clock +values for both White and Black are present in a cc operation. 
+ +All EPD records carry the four data fields describing the current +position. In most cases, this position should also be the current +position of the receiving client. If the position sent by the referee +matches the client's current position, then the client can assume that +all of the game history leading to the current position is valid. Thus, +every client keeps track of the game history internally and uses this to +detect repetition draws and so there is no need for each EPD record to +contain a complete copy of the game history. + +If the position sent by the referee does not match the receiving +program's current position, then the receiving program must set its +current position to be the same as the one it received. Unless an +explicit game history move sequence is also sent on the same EPD record, +the receiving program is to assume that the new (different) position +received has no game history. In this case the receiving program cannot +check for repetition of positions prior to the new position as there +aren't any previous positions in the game. + +Each client is expected to maintain its own copy of the halfmove clock +(plies since last irreversible move; starts at zero for the initial +position) and the fullmove number (which has a value of one for the +initial position). If the referee sends a halfmove clock value or a +fullmove number which is different from that kept by the program, then +the receiving program is to treat it as a new position and clear any +game history. As noted above, a halfmove clock is sent using the "hmvc" +opcode and a fullmove number is sent using a "fmvn" opcode. + +If a supplied move (always using the "sm" opcode) is sent by the +referee, the receiving program must execute this move on the current +position. This is done after the program's current position is set to +the position sent by the referee (remember that the two will usually +match). The resulting position becomes the new current position. 
This +new current position is used for all further calculations. The new +current position is also the position to be sent to the referee if a +move response is commanded. When a client program produces a move to be +played, it uses the sm opcode with its operand being the supplied move. +The position sent is always the position from which the supplied move is +to be played. Thus, the semantics of the current position and the +supplied move are symmetric with respect to the client and the server. + +8.1.1: Referee command: conclude + +The "conclude" refcom operand instructs the client to conclude the +current game in progress. The position sent is the final position of +the game. There is no supplied move sent. No further EPD records +concerning the game will be sent by the referee. The client should +perform any end of game activity required for its normal operation. No +response from the client is made. + +To allow for client game conclusion processing time, the referee will +avoid sending any more EPD records to a client concluding a game for a +time period set by the human supervisor. The default delay will be five +seconds. + +8.1.2: Referee command: disconnect + +The "disconnect" refcom operand instructs the client that the referee is +terminating service operations. The client should close its +communication channel with the server. This command is sent at the end +of an event or whenever the referee is to be brought down for some +reason. No further EPD records will be sent until the server is cycled. +It provides an opportunity for a client to gracefully disconnect from +network operations with the server. No supplied move is sent. The +position sent is irrelevant. No response from the client is made. + +8.1.3: Referee command: execute + +The "execute" refcom operand instructs the client to set up a position. +If a move is supplied (it usually is), then that move is executed from +the position. 
The sent position will usually be the receiver's current +position. This command is used only to play through the initial +sequence of moves from a game to support a restart capability. No +response is made by the receiver. + +8.1.4: Referee command: fault + +The "fault" refcom operand is used to indicate that the referee has +detected an unrecoverable fault. The receiver should signal for human +intervention to assist with corrective action. The human supervisor +will be notified by the referee regarding the nature of the fault. No +response is made by the receiver. + +A future version of the referee protocol will support some form of +automated fault recovery. + +8.1.5: Referee command: inform + +The "inform" refcom operand is used to convey PGN tag pair data to the +receiver. The "ptp" opcode will carry the PGN tag data to be set on the +receiving client. This command may be sent at any time. It will +usually be sent prior to the first move of a game. It will also be sent +after the last move of a game to communicate the result of the game via +the PGN "Result" tag pair. No response is made by the receiver. + +The main purpose for the inform referee command is to be able to +communicate tag pair data to a client without having to send a move or +other command. Note that the ptp opcode may also appear on EPD records +from the referee that are not inform commands; its operands are +processed in the same way. + +The usual information sent includes the values for the Seven Tag Roster. +The PGN tag names are "Event", "Site", "Date", "Round", "White", +"Black", and "Result". + +Future versions of the referee will likely send more than just the Seven +Tag Roster of PGN tag pairs. One probable addition will be to send the +"TimeControl" tag pair prior to the start of a game; this will allow a +receiving program to have its time control parameters set automatically +rather than manually. 
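The handling of "ptp" operands on an inform command can be sketched as follows. This is an illustration under stated assumptions: the operand list is taken to be already tokenized into plain strings with the quote characters removed, and the function names are hypothetical. The even-count and no-duplicate checks follow the "ptp" opcode description given earlier in this document.

```python
SEVEN_TAG_ROSTER = ("Event", "Site", "Date", "Round", "White", "Black", "Result")


def ptp_to_tags(operands):
    """Convert flat "ptp" operands [name, value, name, value, ...] to a dict."""
    if len(operands) % 2 != 0:
        raise ValueError("ptp takes an even number of operands")
    tags = {}
    # Operands are paired from left to right: tag name, then tag value.
    for name, value in zip(operands[0::2], operands[1::2]):
        if name in tags:
            raise ValueError(f"tag name appears more than once: {name}")
        tags[name] = value
    return tags


def missing_roster_tags(tags):
    """List the Seven Tag Roster names not yet present, in roster order."""
    return [name for name in SEVEN_TAG_ROSTER if name not in tags]
```

For example, `ptp_to_tags(["White", "Zarkov", "Black", "Spector"])` yields a two-entry dictionary, and `missing_roster_tags` on that result reports the five roster tags still to be supplied.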
+ +8.1.6: Referee command: reset + +The "reset" refcom operand is used to command the receiving client to +set up for a new game. Any previous information about a game in +progress is deleted. This command will be sent to mark the beginning of +a game. It will also be sent if there is a need to abort the game +currently in progress. No response is made by the receiver. + +To allow for client reset processing time, the referee will avoid +sending any more EPD records to a resetting client for a time period set +by the human supervisor. The default delay will be five seconds. + +8.1.7: Referee command: respond + +The "respond" refcom operand is used to command the receiving client to +respond to the move (if any) played by its opponent. The position to +use for calculation is the position sent which is modified by a supplied +move (if present; uses the "sm" opcode). The client program calculates +a response and sends it to the referee using the "reply" operand of the +"refreq" opcode. + +8.2: Referee requests (server directives) + +A client communicates the request of interest as the single operand +of the "refreq" opcode. The refreq opcode will be on each record sent +by a client to the referee. Each possible refreq operand is sent as an +identifier (and not as a string). + +8.2.1: Referee request: fault + +The "fault" refreq operand is used to indicate that the client has +detected an unrecoverable fault. The receiver should signal for human +intervention to assist with corrective action. The human supervisor +will be notified by the referee regarding the nature of the fault. No +response is made by the referee. + +A future version of the referee protocol will support some form of +automated fault recovery. + +8.2.2: Referee request: reply + +The "reply" refreq operand is used to carry a reply by the client +program. Usually, a move (the client's reply) is included as the +operand of the "sm" opcode. 
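A reply record can be sketched as the four position data fields followed by a "refreq reply" operation and, when a move was produced, an "sm" operation. This is a rough illustration only: the function name is hypothetical, the position is passed in as an already formatted string of the four data fields, and each operation is written as an opcode, its operands, and a terminating semicolon, as in the operation syntax defined earlier in the specification.

```python
def format_reply_record(position, move=None):
    """Build the text of a client reply record.

    position: the four EPD data fields as one string (the position from
              which the supplied move is to be played).
    move:     the client's move in SAN, or None when no move is produced
              (for example after checkmate or stalemate).
    """
    # "refreq" sorts before "sm" in ASCII order, matching normalized EPD.
    operations = ["refreq reply;"]
    if move is not None:
        operations.append(f"sm {move};")
    return " ".join([position] + operations)
```

Calling `format_reply_record(position, "e4")` appends `refreq reply; sm e4;` to the position fields, while omitting the move leaves only the `refreq reply;` operation.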
+ +8.2.3: Referee request: sign_off + +The "sign_off" refreq operand is used to indicate that the client +program is signing off from the referee connection and no further +operations will be made on the communication channel. The channel in +use is then closed by both the referee and the client. + +A new connection must be established and a new "sign_on" referee request +needs to be made for further referee operations with the client. + +8.2.4: Referee request: sign_on + +The "sign_on" refreq operand is used to indicate that the client program +is signing on to the referee connection. This request is required +before any further operations can be made on the communication channel. +The channel in use remains open until it is closed by either side. + +9: EPD report generation semantics + +[TBD] + +EPD_Spec: EOF |
