gdb/doc/agentexpr.texi

@c \input texinfo
@c %**start of header
@c @setfilename agentexpr.info
@c @settitle GDB Agent Expressions
@c @setchapternewpage off
@c %**end of header

@c This file is part of the GDB manual.
@c
@c Copyright (C) 2003-2017 Free Software Foundation, Inc.
@c
@c See the file gdb.texinfo for copying conditions.

@node Agent Expressions
@appendix The GDB Agent Expression Mechanism

In some applications, it is not feasible for the debugger to interrupt
the program's execution long enough for the developer to learn anything
helpful about its behavior.  If the program's correctness depends on its
real-time behavior, delays introduced by a debugger might cause the
program to fail, even when the code itself is correct.  It is useful to
be able to observe the program's behavior without interrupting it.

Using GDB's @code{trace} and @code{collect} commands, the user can
specify locations in the program, and arbitrary expressions to evaluate
when those locations are reached.  Later, using the @code{tfind}
command, she can examine the values those expressions had when the
program hit the tracepoints.  The expressions may also denote objects
in memory --- structures or arrays, for example --- whose values GDB
should record; while visiting a particular tracepoint, the user may
inspect those objects as if they were in memory at that moment.
However, because GDB records these values without interacting with the
user, it can do so quickly and unobtrusively, hopefully not disturbing
the program's behavior.

When GDB is debugging a remote target, the GDB @dfn{agent} code running
on the target computes the values of the expressions itself.  To avoid
having a full symbolic expression evaluator on the agent, GDB translates
expressions in the source language into a simpler bytecode language, and
then sends the bytecode to the agent; the agent then executes the
bytecode, and records the values for GDB to retrieve later.

The bytecode language is simple; there are forty-odd opcodes, the bulk
of which are the usual vocabulary of C operators (addition, subtraction,
shifts, and so on) and various sizes of literals and memory reference
operations.  The bytecode interpreter operates strictly on machine-level
values --- various sizes of integers and floating point numbers --- and
requires no information about types or symbols; thus, the interpreter's
internal data structures are simple, and each bytecode requires only a
few native machine instructions to implement it.  The interpreter is
small, and strict limits on the memory and time required to evaluate an
expression are easy to determine, making it suitable for use by the
debugging agent in real-time applications.

@menu
* General Bytecode Design::     Overview of the interpreter.
* Bytecode Descriptions::       What each one does.
* Using Agent Expressions::     How agent expressions fit into the big picture.
* Varying Target Capabilities:: How to discover what the target can do.
* Rationale::                   Why we did it this way.
@end menu


@c @node Rationale
@c @section Rationale


@node General Bytecode Design
@section General Bytecode Design

The agent represents bytecode expressions as an array of bytes.  Each
instruction is one byte long (thus the term @dfn{bytecode}).  Some
instructions are followed by operand bytes; for example, the @code{goto}
instruction is followed by a destination for the jump.

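As a rough illustration of how operand bytes can be decoded, the sketch
below reads a multi-byte operand one byte at a time.  It relies on the
encoding given in the descriptions of @code{if_goto},
@code{const}@var{n} and @code{reg}: operands are stored most significant
byte first and are not necessarily aligned.  The function names
@code{ax_read_operand} and @code{ax_read_u16} are hypothetical; they are
not taken from GDB.

```c
#include <stddef.h>
#include <stdint.h>

/* Read an NBYTES-wide unsigned operand stored most significant byte
   first at P.  Reading byte by byte avoids unaligned multi-byte loads.
   Assumption of this sketch: NBYTES is at most 8.  */
uint64_t ax_read_operand(const unsigned char *p, size_t nbytes)
{
  uint64_t v = 0;
  for (size_t i = 0; i < nbytes; i++)
    v = (v << 8) | p[i];
  return v;
}

/* Read the 16-bit operand used by goto, if_goto, reg, getv and setv.  */
unsigned ax_read_u16(const unsigned char *p)
{
  return (unsigned) ax_read_operand(p, 2);
}
```

For example, given the bytes @code{0x21 0x01 0x02}, calling
@code{ax_read_u16} on the two bytes after the @code{goto} opcode yields
the offset 0x0102.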
The bytecode interpreter is a stack-based machine; most instructions pop
their operands off the stack, perform some operation, and push the
result back on the stack for the next instruction to consume.  Each
element of the stack may contain either an integer or a floating point
value; these values are as many bits wide as the largest integer that
can be directly manipulated in the source language.  Stack elements
carry no record of their type; bytecode could push a value as an
integer, then pop it as a floating point value.  However, GDB will not
generate code which does this.  In C, one might define the type of a
stack element as follows:
@example
union agent_val @{
  LONGEST l;
  DOUBLEST d;
@};
@end example
@noindent
where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for
the largest integer and floating point types on the machine.

By the time the bytecode interpreter reaches the end of the expression,
the value of the expression should be the only value left on the stack.
For tracing applications, @code{trace} bytecodes in the expression will
have recorded the necessary data, and the value on the stack may be
discarded.  For other applications, like conditional breakpoints, the
value may be useful.

Separate from the stack, the interpreter has two registers:
@table @code
@item pc
The address of the next bytecode to execute.

@item start
The address of the start of the bytecode expression, necessary for
interpreting the @code{goto} and @code{if_goto} instructions.

@end table
@noindent
Neither of these registers is directly visible to the bytecode language
itself, but they are useful for defining the meanings of the bytecode
operations.

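The stack and the two registers described above might be held together
as in the following minimal sketch.  The @code{typedef}s stand in for
GDB's @code{LONGEST} and @code{DOUBLEST}, and the names
@code{ax_state}, @code{ax_push}, @code{ax_pop}, @code{ax_branch} and the
limit @code{AX_STACK_MAX} are assumptions made for illustration, not
part of GDB.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-ins for GDB's LONGEST and DOUBLEST typedefs.  */
typedef int64_t LONGEST;
typedef long double DOUBLEST;

union agent_val {
  LONGEST l;
  DOUBLEST d;
};

#define AX_STACK_MAX 32  /* hypothetical limit; agents choose their own */

/* Hypothetical interpreter state: the value stack plus two registers.  */
struct ax_state {
  union agent_val stack[AX_STACK_MAX];
  size_t depth;                /* number of live stack elements */
  const unsigned char *start;  /* first byte of the expression */
  const unsigned char *pc;     /* next bytecode to execute */
};

/* Prepare to evaluate the expression beginning at CODE.  */
void ax_init(struct ax_state *s, const unsigned char *code)
{
  s->depth = 0;
  s->start = code;
  s->pc = code;
}

/* Push an integer; return 0 on success, -1 on stack overflow.  */
int ax_push(struct ax_state *s, LONGEST v)
{
  if (s->depth >= AX_STACK_MAX)
    return -1;
  s->stack[s->depth++].l = v;
  return 0;
}

/* Pop an integer into *V; return 0 on success, -1 on underflow.  */
int ax_pop(struct ax_state *s, LONGEST *v)
{
  if (s->depth == 0)
    return -1;
  *v = s->stack[--s->depth].l;
  return 0;
}

/* Branch as goto does: the offset is relative to start, not to pc.  */
void ax_branch(struct ax_state *s, unsigned offset)
{
  s->pc = s->start + offset;
}
```

Keeping @code{start} alongside @code{pc} is what lets a branch be
expressed as an offset from the beginning of the expression.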
There are no instructions to perform side effects on the running
program, or call the program's functions; we assume that these
expressions are only used for unobtrusive debugging, not for patching
the running code.

Most bytecode instructions do not distinguish between the various sizes
of values, and operate on full-width values; the upper bits of the
values are simply ignored, since they do not usually make a difference
to the value computed.  The exceptions to this rule are:
@table @asis

@item memory reference instructions (@code{ref}@var{n})
There are distinct instructions to fetch different word sizes from
memory.  Once on the stack, however, the values are treated as full-size
integers.  They may need to be sign-extended; the @code{ext} instruction
exists for this purpose.

@item the sign-extension instruction (@code{ext} @var{n})
This instruction needs to know which portion of its operand is to be
extended to occupy the full length of the word.

@end table

If the interpreter is unable to evaluate an expression completely for
some reason (a memory location is inaccessible, or a divisor is zero,
for example), we say that interpretation ``terminates with an error''.
This means that the problem is reported back to the interpreter's caller
in some helpful way.  In general, code using agent expressions should
assume that they may attempt to divide by zero, fetch arbitrary memory
locations, and misbehave in other ways.

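One way to ``terminate with an error'' is for each operation to return a
status code that the interpreter passes back to its caller.  The sketch
below shows this for signed division.  The @code{ax_status} values and
the function name are hypothetical, and the handling of the one
unrepresentable quotient is a choice made by this sketch, since the text
above does not specify it.

```c
#include <stdint.h>

typedef int64_t LONGEST;  /* assumption: stand-in for GDB's LONGEST */

/* Hypothetical status codes reported to the interpreter's caller.  */
enum ax_status {
  AX_OK = 0,
  AX_ERR_DIV_BY_ZERO,
  AX_ERR_BAD_MEMORY
};

/* Divide A (the next-to-top value) by B (the top value) and store the
   quotient in *RESULT.  A zero divisor is reported as an error rather
   than being allowed to trap inside the agent.  */
enum ax_status ax_div_signed(LONGEST a, LONGEST b, LONGEST *result)
{
  if (b == 0)
    return AX_ERR_DIV_BY_ZERO;
  if (a == INT64_MIN && b == -1) {
    /* The true quotient does not fit; wrapping is this sketch's choice.  */
    *result = INT64_MIN;
    return AX_OK;
  }
  *result = a / b;
  return AX_OK;
}
```

A memory fetch could report @code{AX_ERR_BAD_MEMORY} in the same way
when the target cannot read the requested address.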
Even complicated C expressions compile to a few bytecode instructions;
for example, the expression @code{x + y * z} would typically produce
code like the following, assuming that @code{x} and @code{y} live in
registers, and @code{z} is a global variable holding a 32-bit
@code{int}:
@example
reg 1
reg 2
const32 @i{address of z}
ref32
ext 32
mul
add
end
@end example

In detail, these mean:
@table @code

@item reg 1
Push the value of register 1 (presumably holding @code{x}) onto the
stack.

@item reg 2
Push the value of register 2 (holding @code{y}).

@item const32 @i{address of z}
Push the address of @code{z} onto the stack.

@item ref32
Fetch a 32-bit word from the address at the top of the stack; replace
the address on the stack with the value.  Thus, we replace the address
of @code{z} with @code{z}'s value.

@item ext 32
Sign-extend the value on the top of the stack from 32 bits to full
length.  This is necessary because @code{z} is a signed integer.

@item mul
Pop the top two numbers on the stack, multiply them, and push their
product.  Now the top of the stack contains the value of the expression
@code{y * z}.

@item add
Pop the top two numbers, add them, and push the sum.  Now the top of the
stack contains the value of @code{x + y * z}.

@item end
Stop executing; the value left on the stack top is the value to be
recorded.

@end table


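The walk-through above can be reproduced with a very small interpreter.
The following sketch handles only the seven bytecodes used in the
example, with the opcode values and operand encodings given in the
bytecode descriptions.  Everything else is an assumption made for
illustration: registers and target memory are simulated with arrays,
the ``address of @code{z}'' is an index into the simulated memory, that
memory is little-endian, stack elements are 64 bits wide, and the name
@code{ax_eval} is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum {
  OP_ADD = 0x02, OP_MUL = 0x04, OP_EXT = 0x16, OP_REF32 = 0x19,
  OP_CONST32 = 0x24, OP_REG = 0x26, OP_END = 0x27
};

/* Evaluate CODE (LEN bytes) against simulated registers REGS and
   simulated little-endian memory MEM.  Store the final top of stack in
   *RESULT.  Return 0 on success, -1 if evaluation ends with an error.  */
int ax_eval(const unsigned char *code, size_t len,
            const int64_t *regs, size_t nregs,
            const unsigned char *mem, size_t memlen,
            int64_t *result)
{
  uint64_t stack[16];
  size_t sp = 0;  /* number of values on the stack */
  size_t pc = 0;  /* index of the next bytecode; start is index 0 */

  while (pc < len) {
    unsigned char op = code[pc++];
    switch (op) {
    case OP_REG: {  /* operand: 16-bit register number, MSB first */
      if (pc + 2 > len || sp >= 16) return -1;
      unsigned n = ((unsigned) code[pc] << 8) | code[pc + 1];
      pc += 2;
      if (n >= nregs) return -1;
      stack[sp++] = (uint64_t) regs[n];
      break;
    }
    case OP_CONST32: {  /* operand: 32-bit constant, MSB first */
      if (pc + 4 > len || sp >= 16) return -1;
      uint64_t v = 0;
      for (int i = 0; i < 4; i++) v = (v << 8) | code[pc + i];
      pc += 4;
      stack[sp++] = v;
      break;
    }
    case OP_REF32: {  /* replace an address with the 32 bits it names */
      if (sp < 1) return -1;
      uint64_t addr = stack[sp - 1];
      if (addr > memlen || memlen - addr < 4) return -1;
      uint64_t v = 0;
      for (int i = 3; i >= 0; i--) v = (v << 8) | mem[addr + i];
      stack[sp - 1] = v;
      break;
    }
    case OP_EXT: {  /* operand: one byte, the number of source bits */
      if (pc + 1 > len || sp < 1) return -1;
      unsigned n = code[pc++];
      if (n == 0) return -1;  /* unspecified case; rejected in this sketch */
      if (n < 64) {
        uint64_t sign = (uint64_t) 1 << (n - 1);
        uint64_t low = stack[sp - 1] & ((sign << 1) - 1);
        stack[sp - 1] = (low ^ sign) - sign;
      }
      break;
    }
    case OP_MUL:
    case OP_ADD: {
      if (sp < 2) return -1;
      uint64_t b = stack[--sp];   /* top of stack */
      uint64_t a = stack[sp - 1]; /* next-to-top */
      stack[sp - 1] = (op == OP_MUL) ? a * b : a + b;
      break;
    }
    case OP_END:
      if (sp < 1) return -1;
      *result = (int64_t) stack[sp - 1];
      return 0;
    default:
      return -1;  /* bytecode not handled by this sketch */
    }
  }
  return -1;  /* ran off the end without reaching end */
}
```

With register 1 holding 5, register 2 holding 3, and @code{z} equal to
@minus{}4, the example sequence leaves @minus{}7 on the stack.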
@node Bytecode Descriptions
@section Bytecode Descriptions

Each bytecode description has the following form:

@table @asis

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}

Pop the top two stack items, @var{a} and @var{b}, as integers; push
their sum, as an integer.

@end table

In this example, @code{add} is the name of the bytecode, and
@code{(0x02)} is the one-byte value used to encode the bytecode, in
hexadecimal.  The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows
the stack before and after the bytecode executes.  Beforehand, the stack
must contain at least two values, @var{a} and @var{b}; since the top of
the stack is to the right, @var{b} is on the top of the stack, and
@var{a} is underneath it.  After execution, the bytecode will have
popped @var{a} and @var{b} from the stack, and replaced them with a
single value, @var{a+b}.  There may be other values on the stack below
those shown, but the bytecode affects only those shown.

Here is another example:

@table @asis

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
Push the 8-bit integer constant @var{n} on the stack, without sign
extension.

@end table

In this example, the bytecode @code{const8} takes an operand @var{n}
directly from the bytecode stream; the operand follows the @code{const8}
bytecode itself.  We write any such operands immediately after the name
of the bytecode, before the colon, and describe the exact encoding of
the operand in the bytecode stream in the body of the bytecode
description.

For the @code{const8} bytecode, there are no stack items given before
the @result{}; this simply means that the bytecode consumes no values
from the stack.  If a bytecode consumes no values, or produces no
values, the list on either side of the @result{} may be empty.

If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode
treats it as an integer.  If a value is written as @var{addr}, then the
bytecode treats it as an address.

We do not fully describe the floating point operations here; although
this design can be extended in a clean way to handle floating point
values, they are not of immediate interest to the customer, so we avoid
describing them, to save time.


@table @asis

@item @code{float} (0x01): @result{}

Prefix for floating-point bytecodes.  Not implemented yet.

@item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
Pop two integers from the stack, and push their sum, as an integer.

@item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b}
Pop two integers from the stack, subtract the top value from the
next-to-top value, and push the difference.

@item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b}
Pop two integers from the stack, multiply them, and push the product on
the stack.  Note that, when one multiplies two @var{n}-bit numbers
yielding another @var{n}-bit number, it is irrelevant whether the
numbers are signed or not; the results are the same.

@item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the quotient.  If the divisor is zero, terminate
with an error.

@item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the quotient.  If the divisor is zero,
terminate with an error.

@item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b}
Pop two signed integers from the stack; divide the next-to-top value by
the top value, and push the remainder.  If the divisor is zero,
terminate with an error.

@item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b}
Pop two unsigned integers from the stack; divide the next-to-top value
by the top value, and push the remainder.  If the divisor is zero,
terminate with an error.

@item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value.  Shift @var{a} left by @var{b} bits, and
push the result.

@item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value.  Shift @var{a} right by @var{b} bits,
inserting copies of the top bit at the high end, and push the result.

@item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b}
Pop two integers from the stack; let @var{a} be the next-to-top value,
and @var{b} be the top value.  Shift @var{a} right by @var{b} bits,
inserting zero bits at the high end, and push the result.

@item @code{log_not} (0x0e): @var{a} @result{} @var{!a}
Pop an integer from the stack; if it is zero, push the value one;
otherwise, push the value zero.

@item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b}
Pop two integers from the stack, and push their bitwise @code{and}.

@item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b}
Pop two integers from the stack, and push their bitwise @code{or}.

@item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b}
Pop two integers from the stack, and push their bitwise
exclusive-@code{or}.

@item @code{bit_not} (0x12): @var{a} @result{} @var{~a}
Pop an integer from the stack, and push its bitwise complement.

@item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b}
Pop two integers from the stack; if they are equal, push the value one;
otherwise, push the value zero.

@item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b}
Pop two signed integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b}
Pop two unsigned integers from the stack; if the next-to-top value is less
than the top value, push the value one; otherwise, push the value zero.

@item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits
Pop an unsigned value from the stack; treating it as an @var{n}-bit
twos-complement value, extend it to full length.  This means that all
bits to the left of bit @var{n-1} (where the least significant bit is bit
0) are set to the value of bit @var{n-1}.  Note that @var{n} may be
larger than or equal to the width of the stack elements of the bytecode
engine; in this case, the bytecode should have no effect.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{ext} bytecode.

@item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits
Pop an unsigned value from the stack; zero all but the bottom @var{n}
bits.

The number of source bits to preserve, @var{n}, is encoded as a single
byte unsigned integer following the @code{zero_ext} bytecode.

@item @code{ref8} (0x17): @var{addr} @result{} @var{a}
@itemx @code{ref16} (0x18): @var{addr} @result{} @var{a}
@itemx @code{ref32} (0x19): @var{addr} @result{} @var{a}
@itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a}
Pop an address @var{addr} from the stack.  For bytecode
@code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the
natural target endianness.  Push the fetched value as an unsigned
integer.

Note that @var{addr} may not be aligned in any particular way; the
@code{ref@var{n}} bytecodes should operate correctly for any address.

If attempting to access memory at @var{addr} would cause a processor
exception of some sort, terminate with an error.

@item @code{ref_float} (0x1b): @var{addr} @result{} @var{d}
@itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d}
@itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d}
@itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d}
@itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a}
Not implemented yet.

@item @code{dup} (0x28): @var{a} @result{} @var{a} @var{a}
Push another copy of the stack's top element.

@item @code{swap} (0x2b): @var{a} @var{b} @result{} @var{b} @var{a}
Exchange the top two items on the stack.

@item @code{pop} (0x29): @var{a} @result{}
Discard the top value on the stack.

@item @code{pick} (0x32) @var{n}: @var{a} @dots{} @var{b} @result{} @var{a} @dots{} @var{b} @var{a}
Duplicate an item from the stack and push it on the top of the stack.
@var{n}, a single byte, indicates the stack item to copy.  If @var{n}
is zero, this is the same as @code{dup}; if @var{n} is one, it copies
the item under the top item, etc.  If @var{n} exceeds the number of
items on the stack, terminate with an error.

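@item @code{rot} (0x33): @var{a} @var{b} @var{c} @result{} @var{c} @var{a} @var{b}
Rotate the top three items on the stack.  The top item (@var{c}) becomes
the third item, the next-to-top item (@var{b}) becomes the top item, and
the third item (@var{a}) becomes the next-to-top item.
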
@item @code{if_goto} (0x20) @var{offset}: @var{a} @result{}
Pop an integer off the stack; if it is non-zero, branch to the given
offset in the bytecode string.  Otherwise, continue to the next
instruction in the bytecode stream.  In other words, if @var{a} is
non-zero, set the @code{pc} register to @code{start} + @var{offset}.
Thus, an offset of zero denotes the beginning of the expression.

The @var{offset} is a sixteen-bit unsigned value, stored immediately
following the @code{if_goto} bytecode.  It is always stored
most significant byte first, regardless of the target's normal
endianness.  The offset is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the offset one byte at a time.

@item @code{goto} (0x21) @var{offset}: @result{}
Branch unconditionally to @var{offset}; in other words, set the
@code{pc} register to @code{start} + @var{offset}.

The offset is stored in the same way as for the @code{if_goto} bytecode.

@item @code{const8} (0x22) @var{n}: @result{} @var{n}
@itemx @code{const16} (0x23) @var{n}: @result{} @var{n}
@itemx @code{const32} (0x24) @var{n}: @result{} @var{n}
@itemx @code{const64} (0x25) @var{n}: @result{} @var{n}
Push the integer constant @var{n} on the stack, without sign extension.
To produce a small negative value, push a small twos-complement value,
and then sign-extend it using the @code{ext} bytecode.

The constant @var{n} is stored in the appropriate number of bytes
following the @code{const}@var{b} bytecode.  The constant @var{n} is
always stored most significant byte first, regardless of the target's
normal endianness.  The constant is not guaranteed to fall at any
particular alignment within the bytecode stream; thus, on machines where
fetching a multi-byte value at an unaligned address raises an exception,
you should fetch @var{n} one byte at a time.

@item @code{reg} (0x26) @var{n}: @result{} @var{a}
Push the value of register number @var{n}, without sign extension.  The
registers are numbered following GDB's conventions.

The register number @var{n} is encoded as a 16-bit unsigned integer
immediately following the @code{reg} bytecode.  It is always stored most
significant byte first, regardless of the target's normal endianness.
The register number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the register number one byte at a time.

@item @code{getv} (0x2c) @var{n}: @result{} @var{v}
Push the value of trace state variable number @var{n}, without sign
extension.

The variable number @var{n} is encoded as a 16-bit unsigned integer
immediately following the @code{getv} bytecode.  It is always stored most
significant byte first, regardless of the target's normal endianness.
The variable number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value at an unaligned address raises an exception, you should
fetch the variable number one byte at a time.

@item @code{setv} (0x2d) @var{n}: @var{v} @result{} @var{v}
Set trace state variable number @var{n} to the value found on the top
of the stack.  The stack is unchanged, so that the value is readily
available if the assignment is part of a larger expression.  The
handling of @var{n} is as described for @code{getv}.

@item @code{trace} (0x0c): @var{addr} @var{size} @result{}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB.

@item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr}
Record the contents of the @var{size} bytes at @var{addr} in a trace
buffer, for later retrieval by GDB.  @var{size} is a single byte
unsigned integer following the @code{trace_quick} opcode.

This bytecode is equivalent to the sequence @code{dup const8 @var{size}
trace}, but we provide it anyway to save space in bytecode strings.

@item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr}
Identical to @code{trace_quick}, except that @var{size} is a 16-bit
big-endian unsigned integer, not a single byte.  This should probably
have been named @code{trace_quick16}, for consistency.

@item @code{tracev} (0x2e) @var{n}: @result{} @var{a}
Record the value of trace state variable number @var{n} in the trace
buffer.  The handling of @var{n} is as described for @code{getv}.

@item @code{tracenz} (0x2f): @var{addr} @var{size} @result{}
Record the bytes at @var{addr} in a trace buffer, for later retrieval
by GDB.  Stop at either the first zero byte, or when @var{size} bytes
have been recorded, whichever occurs first.

@item @code{printf} (0x34) @var{numargs} @var{string}: @result{}
Do a formatted print, in the style of the C function @code{printf}.
The value of @var{numargs} is the number of arguments to expect on the
stack, while @var{string} is the format string, prefixed with a
two-byte length.  The last byte of the string must be zero, and is
included in the length.  The format string includes escaped sequences
just as it appears in C source, so for instance the format string
@code{"\t%d\n"} is six characters long, and the output will consist of
a tab character, a decimal number, and a newline.  At the top of the
stack, above the values to be printed, this bytecode will pop a
``function'' and ``channel''.  If the function is nonzero, then the
target may treat it as a function and call it, passing the channel as
a first argument, as with the C function @code{fprintf}.  If the
function is zero, then the target may simply call a standard formatted
print function of its choice.  In all, this bytecode pops 2 +
@var{numargs} stack elements, and pushes nothing.

@item @code{end} (0x27): @result{}
Stop executing bytecode; the result should be the top element of the
stack.  If the purpose of the expression was to compute an lvalue or a
range of memory, then the next-to-top of the stack is the lvalue's
address, and the top of the stack is the lvalue's size, in bytes.

@end table


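The @code{ext} and @code{zero_ext} descriptions above can be sketched in
C as follows.  This is only an illustration: it assumes 64-bit stack
elements (@code{AX_WIDTH}), the function names are hypothetical, and
leaving the value unchanged for @code{ext 0} is a choice made here
because the description does not cover that case.

```c
#include <stdint.h>

#define AX_WIDTH 64  /* assumption: stack elements are 64 bits wide */

/* ext n: treat the low N bits of A as a twos-complement value and
   extend it to full width.  When N is at least the element width the
   bytecode has no effect.  */
uint64_t ax_ext(uint64_t a, unsigned n)
{
  if (n == 0 || n >= AX_WIDTH)
    return a;  /* n == 0 is unspecified; left unchanged in this sketch */
  uint64_t sign = (uint64_t) 1 << (n - 1);   /* bit n-1 */
  uint64_t low = a & ((sign << 1) - 1);      /* keep the bottom n bits */
  return (low ^ sign) - sign;                /* copy bit n-1 upward */
}

/* zero_ext n: zero all but the bottom N bits of A.  */
uint64_t ax_zero_ext(uint64_t a, unsigned n)
{
  if (n >= AX_WIDTH)
    return a;
  return a & (((uint64_t) 1 << n) - 1);
}
```

The early returns also keep both functions from shifting by the full
width of the type, which C leaves undefined.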
@node Using Agent Expressions
@section Using Agent Expressions

Agent expressions can be used in several different ways by @value{GDBN},
and the debugger can generate different bytecode sequences as appropriate.

One possibility is to do expression evaluation on the target rather
than the host, such as for the conditional of a conditional
tracepoint.  In such a case, @value{GDBN} compiles the source
expression into a bytecode sequence that simply gets values from
registers or memory, does arithmetic, and returns a result.

Another way to use agent expressions is for tracepoint data
collection.  @value{GDBN} generates a different bytecode sequence for
collection; in addition to bytecodes that do the calculation,
@value{GDBN} adds @code{trace} bytecodes to save the pieces of
memory that were used.

In outline, tracing with agent expressions works as follows:

@itemize @bullet

@item
The user selects tracepoints in the program's code at which GDB should
collect data.

@item
The user specifies expressions to evaluate at each tracepoint.  These
expressions may denote objects in memory, in which case those objects'
contents are recorded as the program runs, or computed values, in which
case the values themselves are recorded.

@item
GDB transmits the tracepoints and their associated expressions to the
GDB agent, running on the debugging target.

@item
The agent arranges to be notified when a tracepoint is hit.

@item
When execution on the target reaches a tracepoint, the agent evaluates
the expressions associated with that tracepoint, and records the
resulting values and memory ranges.

@item
Later, when the user selects a given trace event and inspects the
objects and expression values recorded, GDB talks to the agent to
retrieve recorded data as necessary to meet the user's requests.  If the
user asks to see an object whose contents have not been recorded, GDB
reports an error.

@end itemize


@node Varying Target Capabilities
@section Varying Target Capabilities

Some targets don't support floating-point, and some would rather not
have to deal with @code{long long} operations.  Also, different targets
will have different stack sizes, and different bytecode buffer lengths.

Thus, GDB needs a way to ask the target about itself.  We haven't worked
out the details yet, but in general, GDB should be able to send the
target a packet asking it to describe itself.  The reply should be a
packet whose length is explicit, so we can add new information to the
packet in future revisions of the agent, without confusing old versions
of GDB, and it should contain a version number.  It should contain at
least the following information:

@itemize @bullet

@item
whether floating point is supported

@item
whether @code{long long} is supported

@item
maximum acceptable size of bytecode stack

@item
maximum acceptable length of bytecode expressions

@item
which registers are actually available for collection

@item
whether the target supports disabled tracepoints

@end itemize

@node Rationale
@section Rationale

Some of the design decisions apparent above are arguable.

@table @b

@item What about stack overflow/underflow?
GDB should be able to query the target to discover its stack size.
Given that information, GDB can determine at translation time whether a
given expression will overflow the stack.  But this spec isn't about
what kinds of error-checking GDB ought to do.

@item Why are you doing everything in LONGEST?

Speed isn't important, but agent code size is; using LONGEST brings in a
bunch of support code to do things like division, etc.  So this is a
serious concern.

First, note that you don't need different bytecodes for different
operand sizes.  You can generate code without @emph{knowing} how big the
stack elements actually are on the target.  If the target only supports
32-bit ints, and you don't send any 64-bit bytecodes, everything just
works.  The observation here is that the MIPS and the Alpha have only
fixed-size registers, and you can still get C's semantics even though
most instructions only operate on full-sized words.  You just need to
make sure everything is properly sign-extended at the right times.  So
there is no need for 32- and 64-bit variants of the bytecodes.  Just
implement everything using the largest size you support.

GDB should certainly check to see what sizes the target supports, so the
user can get an error earlier, rather than later.  But this information
is not necessary for correctness.


@item Why don't you have @code{>} or @code{<=} operators?
I want to keep the interpreter small, and we don't need them.  We can
combine the @code{less_} opcodes with @code{log_not}, and swap the order
of the operands, yielding all four asymmetrical comparison operators.
For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y <
x)}.

@item Why do you have @code{log_not}?
@itemx Why do you have @code{ext}?
@itemx Why do you have @code{zero_ext}?
These are all easily synthesized from other instructions, but I expect
them to be used frequently, and they're simple, so I include them to
keep bytecode strings short.

@code{log_not} is equivalent to @code{const8 0 equal}; it's used in half
the relational operators.

@code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8
@var{s-n} rsh_signed}, where @var{s} is the size of the stack elements;
it follows @code{ref@var{m}} and @code{reg} bytecodes when the value
should be signed.  See the next item.

@code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask}
bit_and}; it's used whenever we push the value of a register, because we
can't assume the upper bits of the register aren't garbage.

@item Why not have sign-extending variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators, and we
need the @code{ext} bytecode anyway for accessing bitfields.

@item Why not have constant-address variants of the @code{ref} operators?
Because that would double the number of @code{ref} operators again, and
@code{const32 @var{address} ref32} is only one byte longer.

@item Why do the @code{ref@var{n}} operators have to support unaligned fetches?
GDB will generate bytecode that fetches multi-byte values at unaligned
addresses whenever the executable's debugging information tells it to.
Furthermore, GDB does not know the value the pointer will have when GDB
generates the bytecode, so it cannot determine whether a particular
fetch will be aligned or not.

In particular, structure bitfields may be several bytes long, but follow
no alignment rules; members of packed structures are not necessarily
aligned either.

In general, there are many cases where unaligned references occur in
correct C code, either at the programmer's explicit request, or at the
compiler's discretion.  Thus, it is simpler to make the GDB agent
bytecodes work correctly in all circumstances than to make GDB guess in
each case whether the compiler did the usual thing.

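
One way an agent could honor this requirement is to assemble each value
one byte at a time, so that no wide load is ever issued.  The Python
sketch below shows the idea; the function name, the @code{big_endian}
flag, and the use of a @code{bytes} object as target memory are all
assumptions of the sketch.

@example
```python
def ref(memory, address, size_bits, big_endian=False):
    # Build the value from single-byte reads, so alignment never matters.
    value = 0
    nbytes = size_bits // 8
    for i in range(nbytes):
        byte = memory[address + i]
        if big_endian:
            value = (value << 8) | byte
        else:
            value |= byte << (8 * i)
    return value

# Stand-in for target memory; address 1 is not 4-byte aligned.
memory = bytes([0x00, 0x11, 0x22, 0x33, 0x44, 0x55])
unaligned_le = ref(memory, 1, 32)
unaligned_be = ref(memory, 1, 32, big_endian=True)
```
@end example

A real agent might use a faster path when it can prove the address is
aligned, but the byte-wise form is always correct.
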
@item Why are there no side-effecting operators?
Because our current client doesn't want them?  That's a cheap answer.  I
think the real answer is that I'm afraid of implementing function
calls.  We should revisit this issue after the present contract is
delivered.

@item Why aren't the @code{goto} ops PC-relative?
The interpreter has the base address around anyway for PC bounds
checking, and it seemed simpler.

@item Why is there only one offset size for the @code{goto} ops?
Offsets are currently sixteen bits.  I'm not happy with this situation
either:

Suppose we have multiple branch ops with different offset sizes.  As I
generate code left-to-right, all my jumps are forward jumps (there are
no loops in expressions), so I never know the target when I emit the
jump opcode.  Thus, I have to either always assume the largest offset
size, or do jump relaxation on the code after I generate it, which seems
like a big waste of time.

I can imagine a reasonable expression being longer than 256 bytes.  I
can't imagine one being longer than 64k.  Thus, we need 16-bit offsets.
This kind of reasoning is so bogus, but relaxation is pathetic.

The other approach would be to generate code right-to-left.  Then I'd
always know my offset size.  That might be fun.

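
With a single fixed offset size, a left-to-right code generator can
emit a placeholder for each forward jump and fill it in once the target
is known, without ever moving bytes.  The Python sketch below shows
that backpatching step.  The opcode value, the helper names, and the
choice to store the absolute offset most significant byte first are
assumptions made for illustration.

@example
```python
IF_GOTO = 0x20   # opcode value assumed for illustration only

def emit_forward_jump(code):
    # Emit the opcode plus a two-byte placeholder; return its index.
    code.append(IF_GOTO)
    patch_at = len(code)
    code.extend(b"\x00\x00")
    return patch_at

def patch_jump(code, patch_at, target):
    # Fill in the absolute 16-bit target, most significant byte first.
    if not 0 <= target <= 0xFFFF:
        raise ValueError("target does not fit in 16 bits")
    code[patch_at] = target >> 8
    code[patch_at + 1] = target & 0xFF

code = bytearray()
patch_at = emit_forward_jump(code)      # target not known yet
code.extend(b"\x01\x02\x03")            # stand-in for the skipped bytes
patch_jump(code, patch_at, len(code))   # now the target is known
```
@end example

With variable-sized offsets, the placeholder width itself would depend
on the target, which is what forces either the pessimistic choice or a
relaxation pass.
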
@item Where is the function call bytecode?
When we add side-effects, we should add this.

@item Why does the @code{reg} bytecode take a 16-bit register number?
Intel's IA-64 architecture has 128 general-purpose registers and 128
floating-point registers, and I'm sure it has some random control
registers.  Together, that is more than the 256 registers an 8-bit
number can name.

@item Why do we need @code{trace} and @code{trace_quick}?
Because GDB needs to record all the memory contents and registers an
expression touches.  If the user wants to evaluate an expression
@code{x->y->z}, the agent must record the values of @code{x} and
@code{x->y} as well as the value of @code{x->y->z}.

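
To make the @code{x->y->z} case concrete, here is a small Python sketch
in which every fetch also logs the block it reads.  The memory layout,
the 4-byte little-endian pointers, the field offsets, and the function
names are all invented for the example.

@example
```python
def fetch(memory, address, size, trace_log):
    # Record the block before reading it, then read it.
    trace_log.append((address, size))
    return int.from_bytes(memory[address:address + size], "little")

def eval_x_y_z(memory, x_addr, y_offset, z_offset, trace_log):
    x = fetch(memory, x_addr, 4, trace_log)            # value of x
    y = fetch(memory, x + y_offset, 4, trace_log)      # value of x->y
    return fetch(memory, y + z_offset, 4, trace_log)   # value of x->y->z

memory = bytearray(64)
memory[0:4] = (16).to_bytes(4, "little")       # x points at address 16
memory[20:24] = (32).to_bytes(4, "little")     # x->y points at address 32
memory[40:44] = (1234).to_bytes(4, "little")   # x->y->z holds 1234
log = []
result = eval_x_y_z(memory, 0, 4, 8, log)
```
@end example

The log ends up with three blocks, one per dereference, which is what
would later let the user inspect @code{x} and @code{x->y} as well as
the final value.
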
@item Don't the @code{trace} bytecodes make the interpreter less general?
They do mean that the interpreter contains special-purpose code, but
that doesn't mean the interpreter can only be used for that purpose.  If
an expression doesn't use the @code{trace} bytecodes, they don't get in
its way.

@item Why doesn't @code{trace_quick} consume its arguments the way everything else does?
In general, you do want your operators to consume their arguments; it's
consistent, and generally reduces the amount of stack rearrangement
necessary.  However, @code{trace_quick} is a kludge to save space; it
only exists so we needn't write @code{dup const8 @var{size} trace}
before every memory reference.  Therefore, it's okay for it not to
consume its arguments; it's meant for a specific context in which we
know exactly what it should do with the stack.  If we're going to have a
kludge, it should be an effective kludge.

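
The Python sketch below models the stack effect of both forms, to show
that @code{trace_quick} leaves the address where the following memory
reference expects it.  Representing the stack as a Python list and the
trace buffer as a list of (address, size) pairs is an assumption of the
sketch.

@example
```python
def op_trace(stack, log):
    # trace: pop the size, then the address, and record the block.
    size = stack.pop()
    address = stack.pop()
    log.append((address, size))

def op_trace_quick(stack, size, log):
    # trace_quick SIZE: record the block whose address is on top of
    # the stack, leaving the stack unchanged.
    log.append((stack[-1], size))

def long_form(stack, size, log):
    # The sequence trace_quick replaces: dup const8 SIZE trace
    stack.append(stack[-1])   # dup
    stack.append(size)        # const8 SIZE
    op_trace(stack, log)      # trace

quick_stack, quick_log = [0x1000], []
op_trace_quick(quick_stack, 4, quick_log)

long_stack, long_log = [0x1000], []
long_form(long_stack, 4, long_log)
```
@end example

Both paths leave the same stack and the same trace record; the short
form simply takes fewer bytes of bytecode.
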
@item Why does @code{trace16} exist?
That opcode was added by the customer who contracted Cygnus for the
data tracing work.  I personally think it is unnecessary; objects that
large will be quite rare, so it is okay to use @code{dup const16
@var{size} trace} in those cases.

Whatever we decide to do with @code{trace16}, we should at least leave
opcode 0x30 reserved, to remain compatible with the customer who added
it.

@end table