gdb/doc/agentexpr.texi

   1 @c \input texinfo
   2 @c %**start of header
   3 @c @setfilename agentexpr.info
   4 @c @settitle GDB Agent Expressions
   5 @c @setchapternewpage off
   6 @c %**end of header
   7
   8 @c This file is part of the GDB manual.
   9 @c
  10 @c Copyright (C) 2003, 2004, 2005, 2006
  11 @c               Free Software Foundation, Inc.
  12 @c
  13 @c See the file gdb.texinfo for copying conditions.
  14
  15 @c Revision: $Id$
  16
  17 @node Agent Expressions
  18 @appendix The GDB Agent Expression Mechanism
  19
  20 In some applications, it is not feasible for the debugger to interrupt
  21 the program's execution long enough for the developer to learn anything
  22 helpful about its behavior.  If the program's correctness depends on its
  23 real-time behavior, delays introduced by a debugger might cause the
  24 program to fail, even when the code itself is correct.  It is useful to
  25 be able to observe the program's behavior without interrupting it.
  26
  27 Using GDB's @code{trace} and @code{collect} commands, the user can
  28 specify locations in the program, and arbitrary expressions to evaluate
  29 when those locations are reached.  Later, using the @code{tfind}
  30 command, she can examine the values those expressions had when the
  31 program hit the trace points.  The expressions may also denote objects
  32 in memory --- structures or arrays, for example --- whose values GDB
  33 should record; while visiting a particular tracepoint, the user may
  34 inspect those objects as if they were in memory at that moment.
  35 However, because GDB records these values without interacting with the
  36 user, it can do so quickly and unobtrusively, hopefully not disturbing
  37 the program's behavior.
  38
  39 When GDB is debugging a remote target, the GDB @dfn{agent} code running
  40 on the target computes the values of the expressions itself.  To avoid
  41 having a full symbolic expression evaluator on the agent, GDB translates
  42 expressions in the source language into a simpler bytecode language, and
  43 then sends the bytecode to the agent; the agent then executes the
  44 bytecode, and records the values for GDB to retrieve later.
  45
  46 The bytecode language is simple; there are forty-odd opcodes, the bulk
  47 of which are the usual vocabulary of C operators (addition, subtraction,
  48 shifts, and so on) and various sizes of literals and memory reference
  49 operations.  The bytecode interpreter operates strictly on machine-level
  50 values --- various sizes of integers and floating point numbers --- and
  51 requires no information about types or symbols; thus, the interpreter's
  52 internal data structures are simple, and each bytecode requires only a
  53 few native machine instructions to implement it.  The interpreter is
  54 small, and strict limits on the memory and time required to evaluate an
  55 expression are easy to determine, making it suitable for use by the
  56 debugging agent in real-time applications.
  57
  58 @menu
  59 * General Bytecode Design::     Overview of the interpreter.
  60 * Bytecode Descriptions::       What each one does.
  61 * Using Agent Expressions::     How agent expressions fit into the big picture.
  62 * Varying Target Capabilities:: How to discover what the target can do.
  63 * Tracing on Symmetrix::        Special info for implementation on EMC's
  64                                 boxes.
  65 * Rationale::                   Why we did it this way.
  66 @end menu
  67
  68
  69 @c @node Rationale
  70 @c @section Rationale
  71
  72
  73 @node General Bytecode Design
  74 @section General Bytecode Design
  75
  76 The agent represents bytecode expressions as an array of bytes.  Each
  77 instruction is one byte long (thus the term @dfn{bytecode}).  Some
  78 instructions are followed by operand bytes; for example, the @code{goto}
  79 instruction is followed by a destination for the jump.
  80
  81 The bytecode interpreter is a stack-based machine; most instructions pop
  82 their operands off the stack, perform some operation, and push the
  83 result back on the stack for the next instruction to consume.  Each
  84 element of the stack may contain either an integer or a floating point
  85 value; these values are as many bits wide as the largest integer that
  86 can be directly manipulated in the source language.  Stack elements
  87 carry no record of their type; bytecode could push a value as an
  88 integer, then pop it as a floating point value.  However, GDB will not
  89 generate code which does this.  In C, one might define the type of a
  90 stack element as follows:
  91 @example
  92 union agent_val @{
  93   LONGEST l;
  94   DOUBLEST d;
  95 @};
  96 @end example
  97 @noindent
  98 where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for
  99 the largest integer and floating point types on the machine.
 100
 101 By the time the bytecode interpreter reaches the end of the expression,
 102 the value of the expression should be the only value left on the stack.
 103 For tracing applications, @code{trace} bytecodes in the expression will
 104 have recorded the necessary data, and the value on the stack may be
 105 discarded.  For other applications, like conditional breakpoints, the
 106 value may be useful.
 107
 108 Separate from the stack, the interpreter has two registers:
 109 @table @code
 110 @item pc
 111 The address of the next bytecode to execute.
 112
 113 @item start
 114 The address of the start of the bytecode expression, necessary for
 115 interpreting the @code{goto} and @code{if_goto} instructions.
 116
 117 @end table
 118 @noindent
 119 Neither of these registers is directly visible to the bytecode language
 120 itself, but they are useful for defining the meanings of the bytecode
 121 operations.
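
As a minimal sketch of how these two registers might be held in a C
implementation, consider the following.  The names @code{struct
agent_state} and @code{agent_branch}, and the @code{length} field, are
assumptions made for illustration; they do not come from GDB's sources.
The sketch relies on the operand encoding given for @code{goto} and
@code{if_goto} in the bytecode descriptions: a 16-bit offset, most
significant byte first, relative to @code{start}.

```c
#include <stddef.h>

/* Hypothetical interpreter state.  Only pc and start are named in the
   text; length is an assumption added so that jumps can be checked.  */
struct agent_state
{
  const unsigned char *start;  /* first byte of the bytecode expression */
  const unsigned char *pc;     /* next byte to execute or decode */
  size_t length;               /* assumed: size of the expression in bytes */
};

/* Take a branch whose 16-bit operand begins at s->pc.  The operand is
   read one byte at a time, most significant byte first, and pc becomes
   start + offset.  Return 0 on success, or -1 (leaving pc untouched) if
   the operand or the destination lies outside the expression.  */
int
agent_branch (struct agent_state *s)
{
  size_t used = (size_t) (s->pc - s->start);
  size_t offset;

  if (used > s->length || s->length - used < 2)
    return -1;
  offset = ((size_t) s->pc[0] << 8) | s->pc[1];
  if (offset >= s->length)
    return -1;
  s->pc = s->start + offset;
  return 0;
}
```

Checking the destination against @code{length} is a design choice of
this sketch, not something the text requires; it lets a malformed
expression terminate with an error instead of running off the buffer.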
 122
 123 There are no instructions to perform side effects on the running
 124 program, or call the program's functions; we assume that these
 125 expressions are only used for unobtrusive debugging, not for patching
 126 the running code.
 127
 128 Most bytecode instructions do not distinguish between the various sizes
 129 of values, and operate on full-width values; the upper bits of the
 130 values are simply ignored, since they do not usually make a difference
 131 to the value computed.  The exceptions to this rule are:
 132 @table @asis
 133
 134 @item memory reference instructions (@code{ref}@var{n})
 135 There are distinct instructions to fetch different word sizes from
 136 memory.  Once on the stack, however, the values are treated as full-size
 137 integers.  They may need to be sign-extended; the @code{ext} instruction
 138 exists for this purpose.
 139
 140 @item the sign-extension instruction (@code{ext} @var{n})
 141 This instruction needs to know which portion of its operand is to be
 142 extended to occupy the full length of the word.
 143
 144 @end table
 145
 146 If the interpreter is unable to evaluate an expression completely for
 147 some reason (a memory location is inaccessible, or a divisor is zero,
 148 for example), we say that interpretation ``terminates with an error''.
 149 This means that the problem is reported back to the interpreter's caller
 150 in some helpful way.  In general, code using agent expressions should
 151 assume that they may attempt to divide by zero, fetch arbitrary memory
 152 locations, and misbehave in other ways.
 153
 154 Even complicated C expressions compile to a few bytecode instructions;
 155 for example, the expression @code{x + y * z} would typically produce
 156 code like the following, assuming that @code{x} and @code{y} live in
 157 registers, and @code{z} is a global variable holding a 32-bit
 158 @code{int}:
 159 @example
 160 reg 1
 161 reg 2
 162 const32 @i{address of z}
 163 ref32
 164 ext 32
 165 mul
 166 add
 167 end
 168 @end example
 169
 170 In detail, these mean:
 171 @table @code
 172
 173 @item reg 1
 174 Push the value of register 1 (presumably holding @code{x}) onto the
 175 stack.
 176
 177 @item reg 2
 178 Push the value of register 2 (holding @code{y}).
 179
 180 @item const32 @i{address of z}
 181 Push the address of @code{z} onto the stack.
 182
 183 @item ref32
 184 Fetch a 32-bit word from the address at the top of the stack; replace
 185 the address on the stack with the value.  Thus, we replace the address
 186 of @code{z} with @code{z}'s value.
 187
 188 @item ext 32
 189 Sign-extend the value on the top of the stack from 32 bits to full
 190 length.  This is necessary because @code{z} is a signed integer.
 191
 192 @item mul
 193 Pop the top two numbers on the stack, multiply them, and push their
 194 product.  Now the top of the stack contains the value of the expression
 195 @code{y * z}.
 196
 197 @item add
 198 Pop the top two numbers, add them, and push the sum.  Now the top of the
 199 stack contains the value of @code{x + y * z}.
 200
 201 @item end
 202 Stop executing; the value left on the stack top is the value to be
 203 recorded.
 204
 205 @end table
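
The walk-through above can be replayed with a small interpreter.  The
following is only a sketch under stated assumptions: it handles just the
seven bytecodes used by @code{x + y * z}, it models the target as a
register array plus one block of memory (@code{struct target} is
hypothetical), it assumes a little-endian target for @code{ref32}, and
it treats @code{ext 0} as malformed.  The opcode values are the ones
listed in the bytecode descriptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as listed in the bytecode descriptions.  */
enum { OP_ADD = 0x02, OP_MUL = 0x04, OP_EXT = 0x16, OP_REF32 = 0x19,
       OP_CONST32 = 0x24, OP_REG = 0x26, OP_END = 0x27 };
enum { STACK_MAX = 16 };        /* assumed stack limit */

/* Hypothetical target model: a register file, plus a block of memory
   whose first byte has target address mem_base.  */
struct target
{
  const int64_t *regs;
  size_t nregs;
  const unsigned char *mem;
  uint64_t mem_base;
  size_t mem_size;
};

/* Evaluate CODE.  Return 0 and store the top of stack in *RESULT when
   end is reached; return -1 if evaluation terminates with an error.  */
int
agent_eval (const unsigned char *code, size_t len,
            const struct target *t, int64_t *result)
{
  uint64_t stack[STACK_MAX];
  size_t sp = 0, pc = 0;

  while (pc < len)
    {
      unsigned char op = code[pc++];

      if (op == OP_REG)
        {
          if (len - pc < 2 || sp == STACK_MAX)
            return -1;
          size_t n = ((size_t) code[pc] << 8) | code[pc + 1];
          pc += 2;
          if (n >= t->nregs)
            return -1;
          stack[sp++] = (uint64_t) t->regs[n];
        }
      else if (op == OP_CONST32)
        {
          if (len - pc < 4 || sp == STACK_MAX)
            return -1;
          uint64_t v = 0;
          for (int i = 0; i < 4; i++)   /* most significant byte first */
            v = (v << 8) | code[pc++];
          stack[sp++] = v;
        }
      else if (op == OP_REF32)
        {
          if (sp < 1 || stack[sp - 1] < t->mem_base)
            return -1;
          uint64_t off = stack[sp - 1] - t->mem_base;
          if (off > t->mem_size || t->mem_size - off < 4)
            return -1;                  /* inaccessible memory */
          uint64_t v = 0;
          for (int i = 3; i >= 0; i--)  /* assumed little-endian target */
            v = (v << 8) | t->mem[off + i];
          stack[sp - 1] = v;
        }
      else if (op == OP_EXT)
        {
          if (len - pc < 1 || sp < 1 || code[pc] == 0)
            return -1;                  /* ext 0 treated as malformed */
          unsigned int n = code[pc++];
          if (n < 64)
            {
              uint64_t sign = (uint64_t) 1 << (n - 1);
              uint64_t v = stack[sp - 1] & ((sign << 1) - 1);
              stack[sp - 1] = (v ^ sign) - sign;
            }
        }
      else if (op == OP_ADD || op == OP_MUL)
        {
          if (sp < 2)
            return -1;
          sp--;
          stack[sp - 1] = (op == OP_ADD ? stack[sp - 1] + stack[sp]
                           : stack[sp - 1] * stack[sp]);
        }
      else if (op == OP_END && sp >= 1)
        {
          *result = (int64_t) stack[sp - 1];
          return 0;
        }
      else
        return -1;                      /* unknown opcode or empty stack */
    }
  return -1;                            /* ran off the end without end */
}
```

With register 1 holding 3, register 2 holding 4, and @code{z} equal to
@minus{}5, the sequence @code{reg 1}, @code{reg 2}, @code{const32},
@code{ref32}, @code{ext 32}, @code{mul}, @code{add}, @code{end} leaves
@minus{}17 on the stack.  An address outside the modelled memory makes
the sketch return an error, which is one way to read ``terminates with
an error''.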
 206
 207
 208 @node Bytecode Descriptions
 209 @section Bytecode Descriptions
 210
 211 Each bytecode description has the following form:
 212
 213 @table @asis
 214
 215 @item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
 216
 217 Pop the top two stack items, @var{a} and @var{b}, as integers; push
 218 their sum, as an integer.
 219
 220 @end table
 221
 222 In this example, @code{add} is the name of the bytecode, and
 223 @code{(0x02)} is the one-byte value used to encode the bytecode, in
 224 hexadecimal.  The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows
 225 the stack before and after the bytecode executes.  Beforehand, the stack
 226 must contain at least two values, @var{a} and @var{b}; since the top of
 227 the stack is to the right, @var{b} is on the top of the stack, and
 228 @var{a} is underneath it.  After execution, the bytecode will have
 229 popped @var{a} and @var{b} from the stack, and replaced them with a
 230 single value, @var{a+b}.  There may be other values on the stack below
 231 those shown, but the bytecode affects only those shown.
 232
 233 Here is another example:
 234
 235 @table @asis
 236
 237 @item @code{const8} (0x22) @var{n}: @result{} @var{n}
 238 Push the 8-bit integer constant @var{n} on the stack, without sign
 239 extension.
 240
 241 @end table
 242
 243 In this example, the bytecode @code{const8} takes an operand @var{n}
 244 directly from the bytecode stream; the operand follows the @code{const8}
 245 bytecode itself.  We write any such operands immediately after the name
 246 of the bytecode, before the colon, and describe the exact encoding of
 247 the operand in the bytecode stream in the body of the bytecode
 248 description.
 249
 250 For the @code{const8} bytecode, there are no stack items given before
 251 the @result{}; this simply means that the bytecode consumes no values
 252 from the stack.  If a bytecode consumes no values, or produces no
 253 values, the list on either side of the @result{} may be empty.
 254
 255 If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode
 256 treats it as an integer.  If a value is written as @var{addr}, then the
 257 bytecode treats it as an address.
 258
 259 We do not fully describe the floating point operations here; although
 260 this design can be extended in a clean way to handle floating point
 261 values, they are not of immediate interest to the customer, so we avoid
 262 describing them, to save time.
 263
 264
 265 @table @asis
 266
 267 @item @code{float} (0x01): @result{}
 268
 269 Prefix for floating-point bytecodes.  Not implemented yet.
 270
 271 @item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
 272 Pop two integers from the stack, and push their sum, as an integer.
 273
 274 @item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b}
 275 Pop two integers from the stack, subtract the top value from the
 276 next-to-top value, and push the difference.
 277
 278 @item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b}
 279 Pop two integers from the stack, multiply them, and push the product on
 280 the stack.  Note that, when one multiplies two @var{n}-bit numbers
 281 yielding another @var{n}-bit number, it is irrelevant whether the
 282 numbers are signed or not; the results are the same.
 283
 284 @item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b}
 285 Pop two signed integers from the stack; divide the next-to-top value by
 286 the top value, and push the quotient.  If the divisor is zero, terminate
 287 with an error.
 288
 289 @item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b}
 290 Pop two unsigned integers from the stack; divide the next-to-top value
 291 by the top value, and push the quotient.  If the divisor is zero,
 292 terminate with an error.
 293
 294 @item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b}
 295 Pop two signed integers from the stack; divide the next-to-top value by
 296 the top value, and push the remainder.  If the divisor is zero,
 297 terminate with an error.
 298
 299 @item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b}
 300 Pop two unsigned integers from the stack; divide the next-to-top value
 301 by the top value, and push the remainder.  If the divisor is zero,
 302 terminate with an error.
 303
 304 @item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b}
 305 Pop two integers from the stack; let @var{a} be the next-to-top value,
 306 and @var{b} be the top value.  Shift @var{a} left by @var{b} bits, and
 307 push the result.
 308
 309 @item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b}
 310 Pop two integers from the stack; let @var{a} be the next-to-top value,
 311 and @var{b} be the top value.  Shift @var{a} right by @var{b} bits,
 312 inserting copies of the top bit at the high end, and push the result.
 313
 314 @item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b}
 315 Pop two integers from the stack; let @var{a} be the next-to-top value,
 316 and @var{b} be the top value.  Shift @var{a} right by @var{b} bits,
 317 inserting zero bits at the high end, and push the result.
 318
 319 @item @code{log_not} (0x0e): @var{a} @result{} @var{!a}
 320 Pop an integer from the stack; if it is zero, push the value one;
 321 otherwise, push the value zero.
 322
 323 @item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b}
 324 Pop two integers from the stack, and push their bitwise @code{and}.
 325
 326 @item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b}
 327 Pop two integers from the stack, and push their bitwise @code{or}.
 328
 329 @item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b}
 330 Pop two integers from the stack, and push their bitwise
 331 exclusive-@code{or}.
 332
 333 @item @code{bit_not} (0x12): @var{a} @result{} @var{~a}
 334 Pop an integer from the stack, and push its bitwise complement.
 335
 336 @item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b}
 337 Pop two integers from the stack; if they are equal, push the value one;
 338 otherwise, push the value zero.
 339
 340 @item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b}
 341 Pop two signed integers from the stack; if the next-to-top value is less
 342 than the top value, push the value one; otherwise, push the value zero.
 343
 344 @item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b}
 345 Pop two unsigned integers from the stack; if the next-to-top value is less
 346 than the top value, push the value one; otherwise, push the value zero.
 347
 348 @item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits
 349 Pop an unsigned value from the stack; treating it as an @var{n}-bit
 350 twos-complement value, extend it to full length.  This means that all
 351 bits to the left of bit @var{n-1} (where the least significant bit is bit
 352 0) are set to the value of bit @var{n-1}.  Note that @var{n} may be
 353 larger than or equal to the width of the stack elements of the bytecode
 354 engine; in this case, the bytecode should have no effect.
 355
 356 The number of source bits to preserve, @var{n}, is encoded as a single
 357 byte unsigned integer following the @code{ext} bytecode.
 358
 359 @item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits
 360 Pop an unsigned value from the stack; zero all but the bottom @var{n}
 361 bits.  This means that all bits to the left of bit @var{n-1} (where the
 362 least significant bit is bit 0) are set to zero.
 363
 364 The number of source bits to preserve, @var{n}, is encoded as a single
 365 byte unsigned integer following the @code{zero_ext} bytecode.
 366
 367 @item @code{ref8} (0x17): @var{addr} @result{} @var{a}
 368 @itemx @code{ref16} (0x18): @var{addr} @result{} @var{a}
 369 @itemx @code{ref32} (0x19): @var{addr} @result{} @var{a}
 370 @itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a}
 371 Pop an address @var{addr} from the stack.  For bytecode
 372 @code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the
 373 natural target endianness.  Push the fetched value as an unsigned
 374 integer.
 375
 376 Note that @var{addr} may not be aligned in any particular way; the
 377 @code{ref@var{n}} bytecodes should operate correctly for any address.
 378
 379 If attempting to access memory at @var{addr} would cause a processor
 380 exception of some sort, terminate with an error.
 381
 382 @item @code{ref_float} (0x1b): @var{addr} @result{} @var{d}
 383 @itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d}
 384 @itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d}
 385 @itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d}
 386 @itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a}
 387 Not implemented yet.
 388
 389 @item @code{dup} (0x28): @var{a} @result{} @var{a} @var{a}
 390 Push another copy of the stack's top element.
 391
 392 @item @code{swap} (0x2b): @var{a} @var{b} @result{} @var{b} @var{a}
 393 Exchange the top two items on the stack.
 394
 395 @item @code{pop} (0x29): @var{a} @result{}
 396 Discard the top value on the stack.
 397
 398 @item @code{if_goto} (0x20) @var{offset}: @var{a} @result{}
 399 Pop an integer off the stack; if it is non-zero, branch to the given
 400 offset in the bytecode string.  Otherwise, continue to the next
 401 instruction in the bytecode stream.  In other words, if @var{a} is
 402 non-zero, set the @code{pc} register to @code{start} + @var{offset}.
 403 Thus, an offset of zero denotes the beginning of the expression.
 404
 405 The @var{offset} is a sixteen-bit unsigned value, stored
 406 immediately following the @code{if_goto} bytecode.  It is always stored
 407 most significant byte first, regardless of the target's normal
 408 endianness.  The offset is not guaranteed to fall at any particular
 409 alignment within the bytecode stream; thus, on machines where fetching a
 410 16-bit value from an unaligned address raises an exception, you should
 411 fetch the offset one byte at a time.
 412
 413 @item @code{goto} (0x21) @var{offset}: @result{}
 414 Branch unconditionally to @var{offset}; in other words, set the
 415 @code{pc} register to @code{start} + @var{offset}.
 416
 417 The offset is stored in the same way as for the @code{if_goto} bytecode.
 418
 419 @item @code{const8} (0x22) @var{n}: @result{} @var{n}
 420 @itemx @code{const16} (0x23) @var{n}: @result{} @var{n}
 421 @itemx @code{const32} (0x24) @var{n}: @result{} @var{n}
 422 @itemx @code{const64} (0x25) @var{n}: @result{} @var{n}
 423 Push the integer constant @var{n} on the stack, without sign extension.
 424 To produce a small negative value, push a small twos-complement value,
 425 and then sign-extend it using the @code{ext} bytecode.
 426
 427 The constant @var{n} is stored in the appropriate number of bytes
 428 following the @code{const}@var{b} bytecode.  The constant @var{n} is
 429 always stored most significant byte first, regardless of the target's
 430 normal endianness.  The constant is not guaranteed to fall at any
 431 particular alignment within the bytecode stream; thus, on machines where
 432 fetching a multi-byte value from an unaligned address raises an
 433 exception, you should fetch @var{n} one byte at a time.
 434
 435 @item @code{reg} (0x26) @var{n}: @result{} @var{a}
 436 Push the value of register number @var{n}, without sign extension.  The
 437 registers are numbered following GDB's conventions.
 438
 439 The register number @var{n} is encoded as a 16-bit unsigned integer
 440 immediately following the @code{reg} bytecode.  It is always stored most
 441 significant byte first, regardless of the target's normal endianness.
 442 The register number is not guaranteed to fall at any particular
 443 alignment within the bytecode stream; thus, on machines where fetching a
 444 16-bit value from an unaligned address raises an exception, you should
 445 fetch the register number one byte at a time.
 446
 447 @item @code{trace} (0x0c): @var{addr} @var{size} @result{}
 448 Record the contents of the @var{size} bytes at @var{addr} in a trace
 449 buffer, for later retrieval by GDB.
 450
 451 @item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr}
 452 Record the contents of the @var{size} bytes at @var{addr} in a trace
 453 buffer, for later retrieval by GDB.  @var{size} is a single byte
 454 unsigned integer following the @code{trace_quick} opcode.
 455
 456 This bytecode is equivalent to the sequence @code{dup const8 @var{size}
 457 trace}, but we provide it anyway to save space in bytecode strings.
 458
 459 @item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr}
 460 Identical to @code{trace_quick}, except that @var{size} is a 16-bit big-endian
 461 unsigned integer, not a single byte.  This should probably have been
 462 named @code{trace_quick16}, for consistency.
 463
 464 @item @code{end} (0x27): @result{}
 465 Stop executing bytecode; the result should be the top element of the
 466 stack.  If the purpose of the expression was to compute an lvalue or a
 467 range of memory, then the next-to-top of the stack is the lvalue's
 468 address, and the top of the stack is the lvalue's size, in bytes.
 469
 470 @end table
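
Several of the descriptions above come down to a few lines of C.  The
helpers below are a sketch, not GDB's implementation: the function names
are invented for illustration, stack elements are assumed to be 64 bits
wide, and @code{agent_ext} returns its argument unchanged for @var{n}
equal to zero, a case the text does not define.  They cover @code{ext},
@code{zero_ext}, and the byte-at-a-time fetch recommended for operands
stored most significant byte first.

```c
#include <stdint.h>

/* ext n: treat the low n bits of v as a two's-complement value and
   extend it to 64 bits.  No effect when n is 64 or more; n == 0 is
   returned unchanged (an assumption of this sketch).  */
uint64_t
agent_ext (uint64_t v, unsigned int n)
{
  if (n == 0 || n >= 64)
    return v;
  uint64_t sign = (uint64_t) 1 << (n - 1);
  v &= (sign << 1) - 1;         /* keep only the low n bits */
  return (v ^ sign) - sign;     /* copy bit n-1 into the upper bits */
}

/* zero_ext n: clear every bit above the low n bits.  */
uint64_t
agent_zero_ext (uint64_t v, unsigned int n)
{
  if (n >= 64)
    return v;
  return v & (((uint64_t) 1 << n) - 1);
}

/* Fetch an operand of nbytes bytes (at most 8) stored most significant
   byte first.  Reading one byte at a time avoids any alignment
   requirement on p.  */
uint64_t
agent_fetch_operand (const unsigned char *p, unsigned int nbytes)
{
  uint64_t v = 0;
  for (unsigned int i = 0; i < nbytes; i++)
    v = (v << 8) | p[i];
  return v;
}
```

For instance, the advice under @code{const8} for producing a small
negative value corresponds to fetching the byte @code{0xfe} and then
applying @code{agent_ext} with @var{n} equal to 8, which yields
@minus{}2 as a full-width value.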
 471
 472
 473 @node Using Agent Expressions
 474 @section Using Agent Expressions
 475
 476 Here is a sketch of a full non-stop debugging cycle, showing how agent
 477 expressions fit into the process.
 478
 479 @itemize @bullet
 480
 481 @item
 482 The user selects trace points in the program's code at which GDB should
 483 collect data.
 484
 485 @item
 486 The user specifies expressions to evaluate at each trace point.  These
 487 expressions may denote objects in memory, in which case those objects'
 488 contents are recorded as the program runs, or computed values, in which
 489 case the values themselves are recorded.
 490
 491 @item
 492 GDB transmits the tracepoints and their associated expressions to the
 493 GDB agent, running on the debugging target.
 494
 495 @item
 496 The agent arranges to be notified when a trace point is hit.  Note that,
 497 on some systems, the target operating system is completely responsible
 498 for collecting the data; see @ref{Tracing on Symmetrix}.
 499
 500 @item
 501 When execution on the target reaches a trace point, the agent evaluates
 502 the expressions associated with that trace point, and records the
 503 resulting values and memory ranges.
 504
 505 @item
 506 Later, when the user selects a given trace event and inspects the
 507 objects and expression values recorded, GDB talks to the agent to
 508 retrieve recorded data as necessary to meet the user's requests.  If the
 509 user asks to see an object whose contents have not been recorded, GDB
 510 reports an error.
 511
 512 @end itemize
 513
 514
 515 @node Varying Target Capabilities
 516 @section Varying Target Capabilities
 517
 518 Some targets don't support floating-point, and some would rather not
 519 have to deal with @code{long long} operations.  Also, different targets
 520 will have different stack sizes, and different bytecode buffer lengths.
 521
 522 Thus, GDB needs a way to ask the target about itself.  We haven't worked
 523 out the details yet, but in general, GDB should be able to send the
 524 target a packet asking it to describe itself.  The reply should be a
 525 packet whose length is explicit, so we can add new information to the
 526 packet in future revisions of the agent, without confusing old versions
 527 of GDB, and it should contain a version number.  It should contain at
 528 least the following information:
 529
 530 @itemize @bullet
 531
 532 @item
 533 whether floating point is supported
 534
 535 @item
 536 whether @code{long long} is supported
 537
 538 @item
 539 maximum acceptable size of bytecode stack
 540
 541 @item
 542 maximum acceptable length of bytecode expressions
 543
 544 @item
 545 which registers are actually available for collection
 546
 547 @item
 548 whether the target supports disabled tracepoints
 549
 550 @end itemize
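
Since the packet format has not been worked out, the following is purely
a hypothetical sketch of how GDB might hold a decoded description and
use it to reject an expression before sending it.  Every name here
(@code{struct agent_caps}, @code{struct agent_expr_needs},
@code{agent_caps_allow}) and the use of a 64-bit register mask are
assumptions; none of this is an existing GDB interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of a target self-description.  Field names
   and layout are illustrative; they are not a GDB packet format.  */
struct agent_caps
{
  unsigned int version;         /* version number of the description */
  int has_float;                /* floating point supported? */
  int has_long_long;            /* long long supported? */
  size_t max_stack;             /* maximum bytecode stack depth */
  size_t max_expr_len;          /* maximum expression length, in bytes */
  uint64_t collectable_regs;    /* assumed: bit n set if register n
                                   can be collected */
  int disabled_tracepoints;     /* disabled tracepoints supported? */
};

/* What one compiled expression requires (also hypothetical).  */
struct agent_expr_needs
{
  size_t length;                /* bytes of bytecode */
  size_t stack_depth;           /* deepest stack use */
  int uses_float;
  int uses_long_long;
  uint64_t regs_used;           /* bit n set if register n is read */
};

/* Return 1 if the target described by CAPS can evaluate an expression
   with the requirements in NEEDS, and 0 otherwise.  */
int
agent_caps_allow (const struct agent_caps *caps,
                  const struct agent_expr_needs *needs)
{
  if (needs->length > caps->max_expr_len)
    return 0;
  if (needs->stack_depth > caps->max_stack)
    return 0;
  if (needs->uses_float && !caps->has_float)
    return 0;
  if (needs->uses_long_long && !caps->has_long_long)
    return 0;
  if (needs->regs_used & ~caps->collectable_regs)
    return 0;
  return 1;
}
```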
 551
 552
 553
 554 @node Tracing on Symmetrix
 555 @section Tracing on Symmetrix
 556
 557 This section documents the API used by the GDB agent to collect data on
 558 Symmetrix systems.
 559
 560 Cygnus originally implemented these tracing features to help EMC
 561 Corporation debug their Symmetrix high-availability disk drives.  The
 562 Symmetrix application code already includes substantial tracing
 563 facilities; the GDB agent for the Symmetrix system uses those facilities
 564 for its own data collection, via the API described here.
 565
 566 @deftypefn Function DTC_RESPONSE adbg_find_memory_in_frame (FRAME_DEF *@var{frame}, char *@var{address}, char **@var{buffer}, unsigned int *@var{size})
 567 Search the trace frame @var{frame} for memory saved from @var{address}.
 568 If the memory is available, provide the address of the buffer holding
 569 it; otherwise, provide the address of the next saved area.
 570
 571 @itemize @bullet
 572
 573 @item
 574 If the memory at @var{address} was saved in @var{frame}, set
 575 @code{*@var{buffer}} to point to the buffer in which that memory was
 576 saved, set @code{*@var{size}} to the number of bytes from @var{address}
 577 that are saved at @code{*@var{buffer}}, and return
 578 @code{OK_TARGET_RESPONSE}.  (Clearly, in this case, the function will
 579 always set @code{*@var{size}} to a value greater than zero.)
 580
 581 @item
 582 If @var{frame} does not record any memory at @var{address}, set
 583 @code{*@var{size}} to the distance from @var{address} to the start of
 584 the saved region with the lowest address higher than @var{address}.  If
 585 there is no memory saved from any higher address, set @code{*@var{size}}
 586 to zero.  Return @code{NOT_FOUND_TARGET_RESPONSE}.
 587 @end itemize
 588
 589 These two possibilities allow the caller to either retrieve the data, or
 590 walk the address space to the next saved area.
 591 @end deftypefn
 592
 593 This function allows the GDB agent to map the regions of memory saved in
 594 a particular frame, and retrieve their contents efficiently.
 595
 596 This function also provides a clean interface between the GDB agent and
 597 the Symmetrix tracing structures, making it easier to adapt the GDB
 598 agent to future versions of the Symmetrix system, and vice versa.  This
 599 function searches all data saved in @var{frame}, whether the data is
 600 there at the request of a bytecode expression, or because it falls in
 601 one of the format's memory ranges, or because it was saved from the top
 602 of the stack.  EMC can arbitrarily change and enhance the tracing
 603 mechanism, but as long as this function works properly, all collected
 604 memory is visible to GDB.
 605
 606 The function itself is straightforward to implement.  A single pass over
 607 the trace frame's stack area, memory ranges, and expression blocks can
 608 yield the address of the buffer (if the requested address was saved),
 609 and also note the address of the next higher range of memory, to be
 610 returned when the search fails.
 611
 612 As an example, suppose the trace frame @code{f} has saved sixteen bytes
 613 from address @code{0x8000} in a buffer at @code{0x1000}, and thirty-two
 614 bytes from address @code{0xc000} in a buffer at @code{0x1010}.  Here are
 615 some sample calls, and the effect each would have:
 616
 617 @table @code
 618
 619 @item adbg_find_memory_in_frame (f, (char *) 0x8000, &buffer, &size)
 620 This would set @code{buffer} to @code{0x1000}, set @code{size} to
 621 sixteen, and return @code{OK_TARGET_RESPONSE}, since @code{f} saves
 622 sixteen bytes from @code{0x8000} at @code{0x1000}.
 623
 624 @item adbg_find_memory_in_frame (f, (char *) 0x8004, &buffer, &size)
 625 This would set @code{buffer} to @code{0x1004}, set @code{size} to
 626 twelve, and return @code{OK_TARGET_RESPONSE}, since @code{f} saves the
 627 twelve bytes from @code{0x8004} starting four bytes into the buffer at
 628 @code{0x1000}.  This shows that request addresses may fall in the middle
 629 of saved areas; the function should return the address and size of the
 630 remainder of the buffer.
 631
 632 @item adbg_find_memory_in_frame (f, (char *) 0x8100, &buffer, &size)
 633 This would set @code{size} to @code{0x3f00} and return
 634 @code{NOT_FOUND_TARGET_RESPONSE}, since there is no memory saved in
 635 @code{f} from the address @code{0x8100}, and the next memory available
 636 is at @code{0x8100 + 0x3f00}, or @code{0xc000}.  This shows that request
 637 addresses may fall outside of all saved memory ranges; the function
 638 should indicate the next saved area, if any.
 639
 640 @item adbg_find_memory_in_frame (f, (char *) 0x7000, &buffer, &size)
 641 This would set @code{size} to @code{0x1000} and return
 642 @code{NOT_FOUND_TARGET_RESPONSE}, since the next saved memory is at
 643 @code{0x7000 + 0x1000}, or @code{0x8000}.
 644
 645 @item adbg_find_memory_in_frame (f, (char *) 0xf000, &buffer, &size)
 646 This would set @code{size} to zero, and return
 647 @code{NOT_FOUND_TARGET_RESPONSE}.  This shows how the function tells the
 648 caller that no further memory ranges have been saved.
 649
 650 @end table
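
The sample calls above can be checked against a small model of the
search.  The layout of @code{FRAME_DEF} is not described here, so this
sketch does not implement @code{adbg_find_memory_in_frame} itself; it
assumes the saved regions are available as a flat, non-overlapping array
of a hypothetical @code{struct saved_range}, and uses stand-in response
codes.  Only the lookup rule is taken from the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for OK_TARGET_RESPONSE and NOT_FOUND_TARGET_RESPONSE.  */
enum response { OK_RESPONSE, NOT_FOUND_RESPONSE };

/* Hypothetical record of one saved region of target memory.  */
struct saved_range
{
  uintptr_t address;            /* target address the bytes came from */
  unsigned int size;            /* number of bytes saved */
  char *buffer;                 /* where the saved bytes are kept */
};

/* Look up ADDRESS in RANGES, which may be in any order.  On a hit, set
   *BUFFER and *SIZE to the rest of the saved region starting at
   ADDRESS.  On a miss, set *SIZE to the distance to the next higher
   saved region, or to zero if there is none.  */
enum response
find_memory_in_ranges (const struct saved_range *ranges, size_t count,
                       uintptr_t address, char **buffer,
                       unsigned int *size)
{
  uintptr_t next = 0;
  int have_next = 0;

  for (size_t i = 0; i < count; i++)
    {
      const struct saved_range *r = &ranges[i];

      if (address >= r->address && address - r->address < r->size)
        {
          uintptr_t skip = address - r->address;
          *buffer = r->buffer + skip;
          *size = r->size - (unsigned int) skip;
          return OK_RESPONSE;
        }
      /* Remember the lowest region that starts above ADDRESS.  */
      if (r->address > address && (!have_next || r->address < next))
        {
          next = r->address;
          have_next = 1;
        }
    }
  *size = have_next ? (unsigned int) (next - address) : 0;
  return NOT_FOUND_RESPONSE;
}
```

Because the loop tracks the lowest region above the requested address,
the result does not depend on the order in which the regions are stored.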
 651
 652 As another example, here is a function which will print out the
 653 addresses of all memory saved in the trace frame @code{frame} on the
 654 Symmetrix INLINES console:
 655 @example
 656 void
 657 print_frame_addresses (FRAME_DEF *frame)
 658 @{
 659   char *addr;
 660   char *buffer;
 661   unsigned int size;
 662
 663   addr = 0;
 664   for (;;)
 665     @{
 666       /* Either find out how much memory we have here, or discover
 667          where the next saved region is.  */
 668       if (adbg_find_memory_in_frame (frame, addr, &buffer, &size)
 669           == OK_TARGET_RESPONSE)
 670         printp ("saved %x to %x\n", addr, addr + size);
 671       if (size == 0)
 672         break;
 673       addr += size;
 674     @}
 675 @}
 676 @end example
 677
 678 Note that there is not necessarily any connection between the order in
 679 which the data is saved in the trace frame, and the order in which
 680 @code{adbg_find_memory_in_frame} will return those memory ranges.  The
 681 code above will always print the saved memory regions in order of
 682 increasing address, while the underlying frame structure might store the
 683 data in a random order.
 684
 685 [[This section should cover the rest of the Symmetrix functions the stub
 686 relies upon, too.]]
 687
 688 @node Rationale
 689 @section Rationale
 690
 691 Some of the design decisions apparent above are arguable.
 692
 693 @table @b
 694
 695 @item What about stack overflow/underflow?
 696 GDB should be able to query the target to discover its stack size.
 697 Given that information, GDB can determine at translation time whether a
 698 given expression will overflow the stack.  But this spec isn't about
 699 what kinds of error-checking GDB ought to do.
 700
 701 @item Why are you doing everything in LONGEST?
 702
 703 Speed isn't important, but agent code size is; using @code{LONGEST}
 704 brings in a bunch of support code for operations such as division.  So
 705 this is a serious concern.
 706
 707 First, note that you don't need different bytecodes for different
 708 operand sizes.  You can generate code without @emph{knowing} how big the
 709 stack elements actually are on the target.  If the target only supports
 710 32-bit ints, and you don't send any 64-bit bytecodes, everything just
 711 works.  The observation here is that the MIPS and the Alpha have only
 712 fixed-size registers, and you can still get C's semantics even though
 713 most instructions only operate on full-sized words.  You just need to
 714 make sure everything is properly sign-extended at the right times.  So
 715 there is no need for 32- and 64-bit variants of the bytecodes.  Just
 716 implement everything using the largest size you support.
 717
 718 GDB should certainly check to see what sizes the target supports, so the
 719 user can get an error earlier, rather than later.  But this information
 720 is not necessary for correctness.
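
As a rough illustration of the sign-extension argument, the sketch below
performs a 32-bit signed addition using only 64-bit words.  The helper names
@code{ext32} and @code{add_as_int32} are made up for this example, and a
64-bit stack element is an assumption; nothing here is taken from the agent's
actual source.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of V, which is what ext 32 would do on
   a stack of 64-bit elements (an assumed element size).  Written with
   unsigned arithmetic so it does not depend on how the C
   implementation shifts negative values.  */
static int64_t
ext32 (uint64_t v)
{
  uint64_t low = v & UINT64_C (0xffffffff);

  return (int64_t) (low ^ UINT64_C (0x80000000)) - INT64_C (0x80000000);
}

/* A 32-bit signed add done entirely at full width: add the 64-bit
   words, then sign-extend once at the end.  */
static int64_t
add_as_int32 (int64_t a, int64_t b)
{
  uint64_t sum = (uint64_t) a + (uint64_t) b;   /* wraps modulo 2^64 */

  return ext32 (sum);
}
```

The result matches the wraparound a typical 32-bit two's-complement target
would produce, without the interpreter needing a separate 32-bit add.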
 721
 722
 723 @item Why don't you have @code{>} or @code{<=} operators?
 724 I want to keep the interpreter small, and we don't need them.  We can
 725 combine the @code{less_} opcodes with @code{log_not}, and swap the order
 726 of the operands, yielding all four asymmetrical comparison operators.
 727 For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y <
 728 x)}.
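
The identities are easy to check in C.  In the sketch below,
@code{less_signed} and @code{log_not} are plain C stand-ins for the two
primitives, and the @code{cmp_} names are hypothetical, chosen only for this
example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the two primitives the interpreter does provide.  */
static int64_t less_signed (int64_t a, int64_t b) { return a < b; }
static int64_t log_not (int64_t a) { return a == 0; }

/* x < y: the primitive itself.  */
static int64_t
cmp_lt (int64_t x, int64_t y)
{
  return less_signed (x, y);
}

/* x > y: swap the operands.  */
static int64_t
cmp_gt (int64_t x, int64_t y)
{
  return less_signed (y, x);
}

/* x <= y: swap the operands, then negate.  */
static int64_t
cmp_le (int64_t x, int64_t y)
{
  return log_not (less_signed (y, x));
}

/* x >= y: negate the primitive.  */
static int64_t
cmp_ge (int64_t x, int64_t y)
{
  return log_not (less_signed (x, y));
}
```

The same construction applies to the unsigned comparison.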
 729
 730 @item Why do you have @code{log_not}?
 731 @itemx Why do you have @code{ext}?
 732 @itemx Why do you have @code{zero_ext}?
 733 These are all easily synthesized from other instructions, but I expect
 734 them to be used frequently, and they're simple, so I include them to
 735 keep bytecode strings short.
 736
 737 @code{log_not} is equivalent to @code{const8 0 equal}; it's used in half
 738 the relational operators.
 739
 740 @code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8
 741 @var{s-n} rsh_signed}, where @var{s} is the size of the stack elements
 742 in bits; it follows @code{ref@var{m}} and @code{reg} bytecodes when the
 743 value should be signed.  See the next item in this table.
 744
 745 @code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask}
 746 bit_and}; it's used whenever we push the value of a register, because we
 747 can't assume the upper bits of the register aren't garbage.
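
The two equivalences can be sketched in C as follows, assuming 64-bit stack
elements (so @var{s} is 64).  The function names are invented for this
example, and @code{rsh_signed} is written out by hand because C leaves a
right shift of a negative value implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift on a 64-bit pattern, 0 <= K <= 63.  */
static uint64_t
rsh_signed (uint64_t v, unsigned k)
{
  uint64_t r = v >> k;

  assert (k <= 63);
  if ((v >> 63) != 0 && k > 0)
    r |= ~UINT64_C (0) << (64 - k);   /* replicate the sign bit */
  return r;
}

/* ext N as "shift left by s-n, arithmetic shift right by s-n".  */
static uint64_t
ext_by_shifts (uint64_t v, unsigned n)
{
  unsigned k;

  assert (n >= 1 && n <= 64);
  k = 64 - n;
  return rsh_signed (v << k, k);
}

/* zero_ext N as "push a mask of N one bits, then bitwise AND".  */
static uint64_t
zero_ext_by_mask (uint64_t v, unsigned n)
{
  uint64_t mask;

  assert (n >= 1 && n <= 64);
  mask = (n == 64) ? ~UINT64_C (0) : (UINT64_C (1) << n) - 1;
  return v & mask;
}
```

Having dedicated bytecodes simply saves emitting the constants and the
extra operations each time.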
 748
 749 @item Why not have sign-extending variants of the @code{ref} operators?
 750 Because that would double the number of @code{ref} operators, and we
 751 need the @code{ext} bytecode anyway for accessing bitfields.
 752
 753 @item Why not have constant-address variants of the @code{ref} operators?
 754 Because that would double the number of @code{ref} operators again, and
 755 @code{const32 @var{address} ref32} is only one byte longer.
 756
 757 @item Why do the @code{ref@var{n}} operators have to support unaligned fetches?
 758 GDB will generate bytecode that fetches multi-byte values at unaligned
 759 addresses whenever the executable's debugging information tells it to.
 760 Furthermore, GDB does not know the value the pointer will have when GDB
 761 generates the bytecode, so it cannot determine whether a particular
 762 fetch will be aligned or not.
 763
 764 In particular, structure bitfields may be several bytes long, but follow
 765 no alignment rules; members of packed structures are not necessarily
 766 aligned either.
 767
 768 In general, there are many cases where unaligned references occur in
 769 correct C code, either at the programmer's explicit request, or at the
 770 compiler's discretion.  Thus, it is simpler to make the GDB agent
 771 bytecodes work correctly in all circumstances than to make GDB guess in
 772 each case whether the compiler did the usual thing.
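
One way an agent might honor this requirement is to copy the bytes into a
local instead of dereferencing a wider pointer.  The sketch below is only an
illustration; @code{ref32_unaligned} is a hypothetical name, and the value is
read in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fetch a 32-bit value from ADDR whether or not ADDR is aligned.
   memcpy copies byte by byte as far as the language is concerned, so
   no alignment requirement applies; compilers usually turn it into a
   single load where the hardware allows that.  */
static uint32_t
ref32_unaligned (const void *addr)
{
  uint32_t v;

  assert (addr != NULL);
  memcpy (&v, addr, sizeof v);
  return v;
}
```

A direct @code{*(uint32_t *) addr} would be shorter, but it can trap on
targets that enforce alignment.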
 773
 774 @item Why are there no side-effecting operators?
 775 Because our current client doesn't want them?  That's a cheap answer.  I
 776 think the real answer is that I'm afraid of implementing function
 777 calls.  We should re-visit this issue after the present contract is
 778 delivered.
 779
 780 @item Why aren't the @code{goto} ops PC-relative?
 781 The interpreter has the base address around anyway for PC bounds
 782 checking, and it seemed simpler.
 783
 784 @item Why is there only one offset size for the @code{goto} ops?
 785 Offsets are currently sixteen bits.  I'm not happy with this situation
 786 either:
 787
 788 Suppose we have multiple branch ops with different offset sizes.  As I
 789 generate code left-to-right, all my jumps are forward jumps (there are
 790 no loops in expressions), so I never know the target when I emit the
 791 jump opcode.  Thus, I have to either always assume the largest offset
 792 size, or do jump relaxation on the code after I generate it, which seems
 793 like a big waste of time.
 794
 795 I can imagine a reasonable expression being longer than 256 bytes.  I
 796 can't imagine one being longer than 64k.  Thus, we need 16-bit offsets.
 797 This kind of reasoning is so bogus, but relaxation is pathetic.
 798
 799 The other approach would be to generate code right-to-left.  Then I'd
 800 always know my offset size.  That might be fun.
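
With a single fixed offset size, left-to-right generation only needs a
placeholder and a later patch.  The sketch below shows that pattern; the
opcode value, the buffer type, and the helper names are assumptions made for
illustration, as is storing the offset most significant byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode value assumed for illustration only.  */
enum { OP_IF_GOTO = 0x20 };

struct codebuf
{
  unsigned char bytes[1024];
  size_t len;
};

static void
emit_byte (struct codebuf *c, unsigned char b)
{
  assert (c->len < sizeof c->bytes);
  c->bytes[c->len++] = b;
}

/* Emit a jump whose target is not known yet.  Returns the position of
   its two-byte offset slot so the caller can fill it in later.  */
static size_t
emit_jump (struct codebuf *c, unsigned char opcode)
{
  size_t slot;

  emit_byte (c, opcode);
  slot = c->len;
  emit_byte (c, 0);   /* placeholder, high byte */
  emit_byte (c, 0);   /* placeholder, low byte */
  return slot;
}

/* Make the jump at SLOT land at the current end of the code.  The
   offset is absolute, counted from the start of the expression.  */
static void
patch_jump_to_here (struct codebuf *c, size_t slot)
{
  assert (slot + 1 < c->len);
  assert (c->len <= 0xffff);   /* must fit in sixteen bits */
  c->bytes[slot] = (unsigned char) ((c->len >> 8) & 0xff);
  c->bytes[slot + 1] = (unsigned char) (c->len & 0xff);
}
```

Because the slot is always two bytes, patching never changes the length of
the code, so no relaxation pass is needed.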
 801
 802 @item Where is the function call bytecode?
 803
 804 When we add side-effects, we should add this.
 805
 806 @item Why does the @code{reg} bytecode take a 16-bit register number?
 807
 808 Intel's IA-64 architecture has 128 general-purpose registers and 128
 809 floating-point registers, and I'm sure it has some random control
 810 registers.  That is more than an 8-bit register number can address.
 811
 812 @item Why do we need @code{trace} and @code{trace_quick}?
 813 Because GDB needs to record all the memory contents and registers an
 814 expression touches.  If the user wants to evaluate an expression
 815 @code{x->y->z}, the agent must record the values of @code{x} and
 816 @code{x->y} as well as the value of @code{x->y->z}.
 817
 818 @item Don't the @code{trace} bytecodes make the interpreter less general?
 819 They do mean that the interpreter contains special-purpose code, but
 820 that doesn't mean the interpreter can only be used for that purpose.  If
 821 an expression doesn't use the @code{trace} bytecodes, they don't get in
 822 its way.
 823
 824 @item Why doesn't @code{trace_quick} consume its arguments the way everything else does?
 825 In general, you do want your operators to consume their arguments; it's
 826 consistent, and generally reduces the amount of stack rearrangement
 827 necessary.  However, @code{trace_quick} is a kludge to save space; it
 828 only exists so we needn't write @code{dup const8 @var{SIZE} trace}
 829 before every memory reference.  Therefore, it's okay for it not to
 830 consume its arguments; it's meant for a specific context in which we
 831 know exactly what it should do with the stack.  If we're going to have a
 832 kludge, it should be an effective kludge.
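
The equivalence can be checked with a toy stack machine.  Everything in the
sketch below is hypothetical: @code{struct mini_agent} and the @code{op_}
functions are not the agent's real data structures, and the stack effects
assumed are that @code{trace} pops a size and an address while
@code{trace_quick} leaves the address in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy agent state: a value stack plus a log of traced ranges.  */
struct mini_agent
{
  uint64_t stack[16];
  size_t depth;
  struct { uint64_t addr, size; } traced[16];
  size_t ntraced;
};

static void
push (struct mini_agent *a, uint64_t v)
{
  assert (a->depth < 16);
  a->stack[a->depth++] = v;
}

static uint64_t
pop (struct mini_agent *a)
{
  assert (a->depth > 0);
  return a->stack[--a->depth];
}

static void
record (struct mini_agent *a, uint64_t addr, uint64_t size)
{
  assert (a->ntraced < 16);
  a->traced[a->ntraced].addr = addr;
  a->traced[a->ntraced].size = size;
  a->ntraced++;
}

static void
op_dup (struct mini_agent *a)
{
  uint64_t v = pop (a);

  push (a, v);
  push (a, v);
}

static void
op_const8 (struct mini_agent *a, uint8_t n)
{
  push (a, n);
}

/* trace: consume a size and an address, record the range.  */
static void
op_trace (struct mini_agent *a)
{
  uint64_t size = pop (a);
  uint64_t addr = pop (a);

  record (a, addr, size);
}

/* trace_quick: record SIZE bytes at the address on top of the stack,
   leaving that address where it is.  */
static void
op_trace_quick (struct mini_agent *a, uint8_t size)
{
  assert (a->depth > 0);
  record (a, a->stack[a->depth - 1], size);
}
```

Running @code{dup const8 4 trace} on one agent and @code{trace_quick 4} on
another, each starting with the same address on the stack, leaves both in
the same state.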
 833
 834 @item Why does @code{trace16} exist?
 835 That opcode was added by the customer who contracted Cygnus for the
 836 data tracing work.  I personally think it is unnecessary; objects that
 837 large will be quite rare, so it is okay to use @code{dup const16
 838 @var{size} trace} in those cases.
 839
 840 Whatever we decide to do with @code{trace16}, we should at least leave
 841 opcode 0x30 reserved, to remain compatible with the customer who added
 842 it.
 843
 844 @end table