Just since this is bumped, I'll add my further two cents. In the past, I've only ever needed to use a kind of 'virtual update' technique on the Beeb, e.g.
Code:
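.Update
LDA NumMonsters
BEQ NoMonsters ; nothing to update
LDX #0
.UpdateLoop
LDY MonsterType,X
LDA MonsterUpdateLo,Y ; look up this type's update routine...
STA UpdateCall + 1 ; ...and patch it into the JSR operand below
LDA MonsterUpdateHi,Y
STA UpdateCall + 2
.UpdateCall
JSR &FFFF ; operand is overwritten above; monster index in X, monster type in Y
INX ; the update routine must preserve X
CPX NumMonsters
BNE UpdateLoop
.NoMonsters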
....
....
MaxMonsters = 16 ; maximum which can exist at once
MaxMonsterTypes = 8 ; how many different types there are
.MonsterType SKIP(MaxMonsters) ; reserve MaxMonsters bytes
.MonsterX SKIP(MaxMonsters)
.MonsterY SKIP(MaxMonsters)
.MonsterState1 SKIP(MaxMonsters)
.MonsterState2 SKIP(MaxMonsters)
...
.MonsterState5 SKIP(MaxMonsters)
.MonsterUpdateLo SKIP(MaxMonsterTypes)
.MonsterUpdateHi SKIP(MaxMonsterTypes)
.MonsterSpriteLo SKIP(MaxMonsterTypes)
.MonsterSpriteHi SKIP(MaxMonsterTypes)
.MonsterWidth SKIP(MaxMonsterTypes)
.MonsterHeight SKIP(MaxMonsterTypes)
Essentially, it's the old state machine type approach, where the update routine updates a single frame's worth of behaviour (with the index of the monster being processed in X, and the type in Y, in case you want two different monsters to share largely the same update, with some very subtle differences).
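As a rough illustration, here is a minimal sketch of what one such update routine might look like. The routine name, its local labels, and the use of MonsterState1 as a frame counter and MonsterState2 as a direction flag are all assumptions made for the example, not something the tables above prescribe.

```asm
; Sketch of an update routine for the table-driven scheme: 30 frames right,
; then 30 frames left, repeating. Called once per frame with X = monster index.
; Assumptions (not from the code above): MonsterState1 is a frame counter and
; MonsterState2 a direction flag (0 = right, 1 = left), both zero on creation.
.UpdateRightThenLeftSM
LDA MonsterState2,X
BNE SMMoveLeft
INC MonsterX,X
INC MonsterState1,X
LDA MonsterState1,X
CMP #30
BNE SMDone
INC MonsterState2,X ; 30 steps taken: flag becomes 1, move left next frame
.SMDone
RTS ; X is unchanged, as the update loop requires
.SMMoveLeft
DEC MonsterX,X
DEC MonsterState1,X
BNE SMDone ; still counting back down to 0
LDA #0
STA MonsterState2,X ; back at the start: move right next frame
RTS
```

The address of a routine like this would go into MonsterUpdateLo/Hi for the relevant type, for instance with EQUB LO(...) and EQUB HI(...) if the tables are assembled as constants. All the state that has to survive between frames lives in the per-monster tables, which is the defining feature of this approach.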
For me, this is surprisingly clear for 6502 code, and has always served me sufficiently well. I never coded any kind of behaviour on the Beeb for which a fibre system would be advantageous, but I can see how more complex behaviour could benefit, even in 6502, for readability, and perhaps even speed.
I suppose a very minimal implementation might move the UpdateLo/Hi address to be an attribute of the monster instance, rather than the monster type. The Yield subroutine would then store its return address there instead of returning to it, and JMP back into the update loop. But then you would also have to save and restore the A, Y and P registers (X holds the monster index, so it needs no saving), requiring another 3 bytes per monster instance. And as I said earlier, you'd have to be really careful with the stack. Here's a possible implementation:
Code:
.Update
LDA NumMonsters
BEQ NoMonsters
LDX #0
.UpdateLoop
LDY MonsterSaveY,X
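LDA MonsterUpdateHi,X ; push the resume address, high byte first
PHA
LDA MonsterUpdateLo,X
PHA
LDA MonsterSaveP,X ; push the saved flags so PLP can restore them last
PHA
LDA MonsterSaveA,X
PLP
RTS ; resumes the fibre at the pushed address + 1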
.BackFromUpdate
INX
CPX NumMonsters
BNE UpdateLoop
.NoMonsters
RTS
....
....
.CreateMonster ; Y = type of monster
LDX NumMonsters
CPX #MaxMonsters
BEQ NoMonstersLeft
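LDA MonsterInitUpdateLo,Y ; entry address - 1, since the update loop enters via RTS
STA MonsterUpdateLo,X
LDA MonsterInitUpdateHi,Y
STA MonsterUpdateHi,X
TYA
STA MonsterType,X
PHP ; start the fibre with the current flags rather than whatever
PLA ; the reserved byte happens to hold (D and I in particular)
STA MonsterSaveP,X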
LDA initxpos
STA MonsterX,X
LDA initypos
STA MonsterY,X
INX
STX NumMonsters
.NoMonstersLeft
RTS
....
....
.Yield ; have to call with X = monster index (hence no need to save X)
PHP ; remember the flags as they are RIGHT NOW!
STA MonsterSaveA,X
PLA ; only way we can access P directly
STA MonsterSaveP,X
TYA ; no STY abs,X
STA MonsterSaveY,X
PLA ; return address, i.e. next instruction of the update
STA MonsterUpdateLo,X
PLA
STA MonsterUpdateHi,X
JMP BackFromUpdate
....
....
MaxMonsters = 16 ; maximum which can exist at once
MaxMonsterTypes = 8 ; how many different types there are
; tables, one entry per monster instance
.MonsterType SKIP(MaxMonsters) ; reserve MaxMonsters bytes
.MonsterX SKIP(MaxMonsters)
.MonsterY SKIP(MaxMonsters)
.MonsterUpdateLo SKIP(MaxMonsters)
.MonsterUpdateHi SKIP(MaxMonsters)
.MonsterSaveA SKIP(MaxMonsters)
.MonsterSaveY SKIP(MaxMonsters)
.MonsterSaveP SKIP(MaxMonsters)
.MonsterState1 SKIP(MaxMonsters)
.MonsterState2 SKIP(MaxMonsters)
...
.MonsterState5 SKIP(MaxMonsters)
; tables, one entry per monster type
.MonsterInitUpdateLo SKIP(MaxMonsterTypes)
.MonsterInitUpdateHi SKIP(MaxMonsterTypes)
.MonsterSpriteLo SKIP(MaxMonsterTypes)
.MonsterSpriteHi SKIP(MaxMonsterTypes)
.MonsterWidth SKIP(MaxMonsterTypes)
.MonsterHeight SKIP(MaxMonsterTypes)
Example of use:
Code:
.UpdateRightThenLeft ; entered with X = monster index
LDY #0 ; Y counts steps; Yield saves and restores it
.MoveRightLoop
INC MonsterX,X
JSR Yield ; stop processing this monster, will continue where we left off next update
INY
CPY #30
BNE MoveRightLoop
.MoveLeftLoop
DEC MonsterX,X
JSR Yield
DEY
BNE MoveLeftLoop
BEQ MoveRightLoop ; always taken: Y is 0 here, so start over
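One detail worth spelling out is how such a routine gets registered. The update loop enters a fibre with RTS, which adds one to the address it pulls, so the initial address stored for a type has to be the entry point minus one. A hedged sketch, assuming type 0 uses UpdateRightThenLeft, that the init tables are assembled as constants rather than reserved with SKIP, and that initxpos and initypos are plain one-byte variables (SpawnExample and the coordinates are made up for illustration):

```asm
; Sketch only: these lines would replace the SKIP reservations for the two
; init tables, with one EQUB per monster type (only type 0 is shown here).
.MonsterInitUpdateLo EQUB LO(UpdateRightThenLeft - 1)
.MonsterInitUpdateHi EQUB HI(UpdateRightThenLeft - 1)

; Spawning one monster of type 0 at (10, 20); the coordinates are arbitrary.
.SpawnExample
LDA #10
STA initxpos
LDA #20
STA initypos
LDY #0 ; type 0
JSR CreateMonster
RTS
```

Yield needs no such adjustment: the return address that JSR leaves on the stack is already one less than the address of the next instruction, so it can be stored and later pushed back as it is.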
It's admittedly quite neat to be able to focus on the isolated logic of one entity like this, but it doesn't come without its own gotchas:
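There is a small overhead in Yield-ing, from having to save and restore the registers: a memory cost (five extra bytes per monster instance, for the resume address plus A, Y and P) and a speed cost (roughly 90 cycles per monster per update for the JSR Yield and the resume, against about 32 for the patched-JSR dispatch and its RTS, ignoring page-crossing penalties). Simple logic might not benefit enough to justify this, particularly if there are many cheap entities being processed.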
A big gotcha is to do with the stack: this routine will break if the Update routine leaves something on the stack before calling Yield - this also includes calling Yield inside a JSR called from the Update routine. That's because we're not creating separate stack frames for each fibre, and so one fibre's PHA could later be paired with another fibre's PLA, with disastrous results! To fix this, we would have to remember the stack pointer for each fibre too, and initially allocate each one 8 bytes or so of the stack. Now, the 6502 stack is tiny, so this is already sounding like a bad idea. An alternative is to copy the stack frame elsewhere upon Yielding, and to copy it back upon Restoring, but this too is starting to get a little involved and probably not really worthwhile.
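Short of giving each fibre its own stack, the stack rule can at least be checked while debugging. The following is only a sketch: UpdateSP and YieldTempX are hypothetical one-byte variables, and CheckedYield is a made-up wrapper that a debug build would call in place of Yield. It compares the stack pointer at the moment of yielding with the level recorded on entry to Update, and hangs if anything extra has been left on the stack.

```asm
; Debug-only sketch. UpdateSP and YieldTempX are hypothetical one-byte
; variables (ideally in zero page), not part of the code above.

; Two extra instructions at the top of Update, before LDX #0:
;   TSX
;   STX UpdateSP ; stack level every fibre is expected to run at

.CheckedYield ; JSR here instead of Yield; X = monster index
PHP ; keep the fibre's flags intact for Yield
STX YieldTempX ; TSX below overwrites X
PHA ; keep the fibre's A
CLD ; make sure ADC is binary; PLP restores D later
TSX
TXA ; A = SP after return address (2), P (1) and A (1)
CLC
ADC #4 ; undo those four bytes
CMP UpdateSP
BNE StackImbalance ; something extra was left on the stack
LDX YieldTempX
PLA
PLP
JMP Yield ; stack now holds only the JSR return address
.StackImbalance
JMP StackImbalance ; hang here so a debugger or emulator can catch it
```

This catches both cases described above, a stray PHA and a Yield from inside a nested JSR, because each leaves the stack pointer lower than expected. It adds to the per-Yield overhead, so it is something to assemble in only while testing.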
So, with these limitations in mind, the fibre system could work OK on the Beeb - but for me, I'm likely to stick with my simple implementation for the moment.
(Multiple edits: to correct bugs in my hastily-written code)