All times are UTC [ DST ]




 Post subject: Turning off OS routines
Posted: Thu Jul 24, 2008 8:58 am
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
I seem to remember you can wring slightly more speed out of the Beeb for games by turning off some of the routine tasks undertaken by the OS, for example keyboard scanning. I think I'm already doing this with the code RTW gave me for vertical scrolling, but are there any other areas that can be turned off to improve speed a little? And if so, how?


Posted: Thu Jul 24, 2008 10:00 am
Joined: Thu Apr 03, 2008 2:49 pm
Posts: 277
Location: Antarctica
Can't you just turn off everything?

I just enable timer1 and vsync IRQ


Posted: Thu Jul 24, 2008 10:16 am
Joined: Mon Jan 07, 2008 7:02 pm
Posts: 273
What Dave said.

The easiest way to do this is to intercept IRQ1V and never return to the old handler. This means you have to handle every interrupt yourself - and remember what you've enabled!
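
As a rough sketch of that approach (not anyone's actual game code): the routine below points IRQ1V at a private handler and leaves only vsync and Timer 1 enabled on the System VIA. The labels install, irq and vsync_count are made up for illustration, LO()/HI() are BeebAsm-style functions (use whatever your assembler provides), and the hardware addresses are the standard model B ones. It assumes Timer 1 has already been set up and that nothing else (disc, serial, etc.) is needed while the game runs.

```asm
; Sketch only - labels are hypothetical, addresses are for a standard model B
.install
SEI                     ; no interrupts while we change things
LDA #&7F
STA &FE4E               ; System VIA IER: disable every source
STA &FE6E               ; User VIA IER: disable every source
LDA #&C2
STA &FE4E               ; re-enable just vsync (bit 1) and Timer 1 (bit 6)
LDA #LO(irq):STA &204   ; IRQ1V low byte
LDA #HI(irq):STA &205   ; IRQ1V high byte
CLI
RTS

.irq                    ; the OS has already saved A in &FC
LDA &FE4D               ; System VIA interrupt flags
AND #&02                ; vsync?
BEQ notvsync
STA &FE4D               ; A = &02 here, writing it back clears the vsync flag
INC vsync_count         ; hypothetical counter for the main loop to poll
.notvsync
LDA &FE4D
AND #&40                ; Timer 1?
BEQ done
STA &FE4D               ; A = &40 here, acknowledge Timer 1
; ... timer-driven work goes here (save X/Y first if it uses them) ...
.done
LDA &FC                 ; restore A
RTI                     ; never chain to the old handler
```

Once this is installed, anything the OS normally does on interrupts (keyboard buffering, sound queues, clocks) stops, so the game has to read the keys and drive the sound chip itself.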


Posted: Thu Jul 24, 2008 10:20 am
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
Doing that already: two timers and vsync. Damn, I'd hoped I'd missed something. Need more speed!!!!


Posted: Thu Jul 24, 2008 11:59 am
Joined: Mon Jan 07, 2008 6:46 pm
Posts: 380
Location: Málaga, Spain
I guess all you can do now is try to optimise what you have already a little bit more.

One way I use to see where the bottlenecks are in the main loop is to set the background palette colour to different colours at various points in the main loop. You'll see various bands of colour across your screen, the 'thickness' of each corresponding to the time spent in each one. Then, you can look at the ones which take the most time, and try to streamline them a bit more.
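
For anyone who hasn't tried it, here is a sketch of the timing-bar trick, assuming Mode 2 and direct writes to the Video ULA palette register at &FE21 (value written = logical colour * 16 + (physical colour EOR 7)). The routine names are placeholders, not real routines from any game.

```asm
; Sketch only - wait_vsync, plot_sprites and move_things are placeholder names
.main_loop
JSR wait_vsync          ; hypothetical: wait for the start of the frame
LDA #&06:STA &FE21      ; logical colour 0 shows as red   (1 EOR 7 = 6)
JSR plot_sprites
LDA #&05:STA &FE21      ; logical colour 0 shows as green (2 EOR 7 = 5)
JSR move_things
LDA #&07:STA &FE21      ; logical colour 0 back to black  (0 EOR 7 = 7)
JMP main_loop
```

In Mode 2 a single write is enough per colour change, so the marker itself only costs 6 cycles (LDA # plus STA absolute) and barely disturbs the timings being measured.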

Some ways you can do this might be, but are not limited to :) :
* unrolling frequently called loops, e.g. in the background map plotter, sprite routines etc
* removing CLC/SEC inside frequently called code, when you already know what the C flag state will be
* Pay special attention to things like this:

worst way to do it:
(8 bit value) * 64 = (16 bit value)
Code:
; enter with A = value to multiply
LDX #0:STX val+1
ASL A:ROL val+1
ASL A:ROL val+1
ASL A:ROL val+1
ASL A:ROL val+1
ASL A:ROL val+1
ASL A:ROL val+1
STA val


better way to do it:
Code:
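; enter with A = value to multiply
LDX #0:STX val     ; low byte starts at zero
LSR A:ROR val      ; treat A as the high byte (A*256), then halve it...
LSR A:ROR val      ; ...twice, which leaves A*64 in val+1/val
STA val+1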


this is also a good example of a place you can save a CLC:

Code:
; calculate sprite address: &2800+(sprite num)*64
LDX #0:STX val
LSR A:ROR val
LSR A:ROR val
ADC #&28   ; no CLC needed as ROR will clear C
STA val+1



If still nothing will budge, then unfortunately you only have two options:
* cut some content
* reduce your frame rate :(

This is what happened with Blurp :(


Last edited by RichTW on Tue Oct 21, 2008 1:07 pm, edited 1 time in total.

Posted: Thu Jul 24, 2008 12:24 pm
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
Thanks Rich, I'd noticed your background palette thing on the stars demo I think.

As usual, you're very helpful.

I haven't really gone through my routines yet to really have a go at optimising but that's my next task.

It doesn't help that I've got full masks on the sprites with them being able to float "effortlessly" over the background and one another. It's a real cycle eater.

Looks sweet with just a couple of 100Byte+ sprites but starts to lose it a little if I push the routines harder. I should get some improvement when I review all the code (hopes). It still includes a lot of "quick and dirty" stuff I wrote for the show.

If I can't get enough then there are areas I could cut back on :(


Posted: Thu Jul 24, 2008 12:31 pm
Joined: Mon Jan 07, 2008 6:46 pm
Posts: 380
Location: Málaga, Spain
Just thought I'd add a few more thoughts about optimisation.

Sometimes lookup tables can save you some time, if you're prepared to afford the memory.

e.g. the multiply by 64 can just become:

Code:
; X = number to be multiplied by 64
LDA mult64lo,X:STA val
LDA mult64hi,X:STA val+1


This is much quicker again than doing the maths by hand.
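
The two tables cost 512 bytes if all 256 entries are kept. A sketch of building them once at start-up, rather than loading them from disc, assuming mult64lo and mult64hi are two 256-byte areas reserved elsewhere:

```asm
; Sketch only - builds both tables once at start-up
LDX #0
.mktab
TXA
LSR A:LSR A             ; X DIV 4 = high byte of X*64
STA mult64hi,X
TXA
ASL A:ASL A:ASL A
ASL A:ASL A:ASL A       ; (X*64) AND &FF = low byte of X*64
STA mult64lo,X
INX
BNE mktab               ; loop over all 256 values of X
```

If the tables are only ever used for sprite addresses, the &28 base from the earlier example could be folded into mult64hi here (CLC:ADC #&28 before the STA), and the tables only need as many entries as there are sprites.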

In your sprite routine, if you are using colour 0 (edited from colour 8) to represent a transparent pixel, you can speed up the masking using a table, if you dare to allocate 256 bytes (gasp) as a mask table lookup:

Code:
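LDA (sprite),Y          ; fetch the sprite byte
STA pokeme+1            ; poke it into the low byte of the AND address
LDA screen,X            ; fetch the background byte
.pokeme AND masktable   ; must be 256-byte aligned
ORA pokeme+1            ; merge in the sprite byte
STA screen,X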


...or that kind of thing, where masktable(n) just contains &00, &55, &AA or &FF to mask neither, both or one of the pixels in a Mode 2 byte.

Even better of course is to allocate a separate mask for each sprite, but I don't think it's so feasible to double the size of your sprite data in memory just for this. In general, the balance to be struck is between speed and memory usage. Tricky one.

Always remember about X and Y. If they're free, it's much quicker to preserve A by doing TAX .... TXA than by doing STA zp .... LDA zp. The stack is the slowest - avoid PHA .... PLA in critical code (unless you really need to use the stack due to reentrancy), because you can do just as good a job with STA ... LDA. If I don't want to consider which zp location is free, I sometimes use:

Code:
STA preserveme+1
  ...
  ...
.preserveme
LDA #0

which is just as fast as using the zero page.
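
To put numbers on that, here are the four ways of saving and restoring A side by side, using the standard 6502 cycle counts. temp is a hypothetical zero page location and restore a made-up label; this is a listing for comparison, not a routine to run.

```asm
; Cycle counts for saving and restoring A (standard 6502 timings)
TAX                     ; 2 cycles
TXA                     ; 2 cycles  -> 4 in total, needs X free

STA temp                ; 3 cycles (zero page)
LDA temp                ; 3 cycles  -> 6 in total

STA restore+1           ; 4 cycles (absolute store into the operand below)
.restore
LDA #0                  ; 2 cycles  -> 6 in total, no zero page used

PHA                     ; 3 cycles
PLA                     ; 4 cycles  -> 7 in total
```

The self-modifying version only works for code running from RAM, which is the normal case for a game.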

Remember JSR/RTS have an overhead (6 cycles each, so 12 per call). If you have a very small subroutine in a critical section of code, consider moving its code into the body of the calling code - e.g. don't JSR mult64, but repeat the code from my last post everywhere you need it (it's only 12 bytes long with val in zero page).

I'll add more as I think of them....


Posted: Thu Jul 24, 2008 12:39 pm
Joined: Mon Jan 07, 2008 6:46 pm
Posts: 380
Location: Málaga, Spain
Ha, just seen your message about your sprite routine (must've sent it while I was writing mine!)...

I guess it's worth saying that the sprite routine is always the bottleneck in any game, particularly a nice, fully-masked, non-EORed one. I would say that this should probably be the first thing you try to optimise.

How does your sprite routine look at the moment? (or what is it doing, more-or-less?)


Posted: Thu Jul 24, 2008 12:43 pm
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
Thanks Rich, the 256-byte mask table is a possible option. Short on memory but hey ho!

Loving all your suggestions so far :) :) :)


Posted: Thu Jul 24, 2008 12:55 pm
Joined: Mon Jan 07, 2008 6:46 pm
Posts: 380
Location: Málaga, Spain
Just another thing worth mentioning:

If you are using colour 0 as transparent for your sprites, and you don't require "opaque" black anywhere (so you only ever use logical colours 0-7), bits 7 and 6 of every sprite data byte are always clear. The values for sprite data bytes will then only ever be between 0 and 63, which means just a 64 byte mask table, which is maybe more affordable...


Posted: Thu Jul 24, 2008 1:00 pm
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
Yes, logical colour 0 is the transparency. Unfortunately I want to remain as faithful as I can to the game, so I need opaque black as well.

:) In my head yesterday I was designing a hardware sprite add on :) But that would be cheating !


Posted: Thu Jul 24, 2008 1:02 pm
Joined: Mon Jan 07, 2008 6:46 pm
Posts: 380
Location: Málaga, Spain
What Blurp does is define logical colour 0 as blue (i.e. the background colour), with black as some other logical colour, so I can have the outlines around everything. It just means I can't use the background colour in my sprites.

Maybe something similar would work for you?

Hardware sprite addon - that would be perfect :) proper sideways scrolling while we're about it too. And how about 'dark' versions of the standard colours instead of flashing? Then we'd have maroon, dark green, brown, dark blue, purple, 'teal' and grey too...


Posted: Thu Jul 24, 2008 1:12 pm
Joined: Thu Apr 03, 2008 2:49 pm
Posts: 277
Location: Antarctica
May also sound obvious but I have separate sprite routines for each spritesize I have instead of faffing with width/height arguments. So I have for eg:

Code:
plot16x16Tile()
plot8x8Masked()
plot16x16Masked()
plotLetter()


... and so on


Posted: Thu Jul 24, 2008 1:19 pm
Joined: Mon Jan 07, 2008 6:46 pm
Posts: 380
Location: Málaga, Spain
yeah, good point Dave, I have a few different routines for different sizes/alignments too.

On the same theme, you might even find that calling a different routine when the sprite is character row-aligned is worthwhile too (because you can unroll the loop 8 times). It's probably only worthwhile if the character is walking on platforms a lot of the time, but you never know.
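
A sketch of what the row-aligned case might look like, assuming the sprite data is stored in screen order (8 consecutive bytes per character cell) and leaving out the masking to keep it short. sprite and screen are zero page pointers; the names are illustrative only.

```asm
; Sketch only - copies one 8-row cell column with the loop fully unrolled
LDY #7
LDA (sprite),Y:STA (screen),Y:DEY   ; row 7
LDA (sprite),Y:STA (screen),Y:DEY   ; row 6
LDA (sprite),Y:STA (screen),Y:DEY   ; row 5
LDA (sprite),Y:STA (screen),Y:DEY   ; row 4
LDA (sprite),Y:STA (screen),Y:DEY   ; row 3
LDA (sprite),Y:STA (screen),Y:DEY   ; row 2
LDA (sprite),Y:STA (screen),Y:DEY   ; row 1
LDA (sprite),Y:STA (screen),Y       ; row 0
```

Compared with a DEY:BPL loop this removes the branch (3 cycles every time it is taken), at the cost of roughly 40 bytes per unrolled column.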


Posted: Thu Jul 24, 2008 2:14 pm
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
All good stuff.

RichTW wrote:
Hardware sprite addon - that would be perfect :) proper sideways scrolling while we're about it too. And how about 'dark' versions of the standard colours instead of flashing? Then we'd have maroon, dark green, brown, dark blue, purple, 'teal' and grey too...


Heh heh, actually I'd planned it all out to include full pixel-level scrolling in all directions, support for umpteen sprites and possibly filled vectors as well. However, extra colour depth could not be supported in the original design I envisaged, as it was still making use of most of the Beeb's existing video circuitry.

All pipe dream stuff really, but feasible. Can't help thinking it would be a real cheat though :lol:

On subject of sprite routines. I have optimised plotters for background. In fact the background graphics are 8 bytes high by 2 wide, so these are stored in the natural flow of the screen memory. For the background there are 3 separate plotters depending on the type of background cell to be plotted (i.e. solid colour, normal and flipped horizontally).
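
For reference, one way the flipped case could be handled: each Mode 2 byte needs its two pixels swapped (as well as the two columns being taken in reverse order). The sketch below shows just the pixel swap, assuming the usual Mode 2 layout where the left pixel uses bits 7,5,3,1 and the right pixel bits 6,4,2,0. temp is a hypothetical zero page location, and this is not necessarily how the flipped plotter here does it.

```asm
; Sketch only - enter with A = Mode 2 byte, exit with its two pixels swapped
TAX                     ; keep a copy of the original byte
AND #&55                ; isolate the right-hand pixel (bits 6,4,2,0)
ASL A                   ; move it to the left-hand position
STA temp
TXA
AND #&AA                ; isolate the left-hand pixel (bits 7,5,3,1)
LSR A                   ; move it to the right-hand position
ORA temp                ; combine the two
```

A 256-byte lookup table indexed by the original byte would be quicker still, if the memory can be spared.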

For my main sprites I've not analysed the ones in the actual game to see if they follow fixed sizes. Perhaps they do. I'll investigate. End of level bosses are obviously exceptions. I suspect the game graphics are quite varied though :(


Posted: Tue Jun 16, 2009 3:16 pm
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
RichTW wrote:
In your sprite routine, if you are using colour 0 (edited from colour 8) to represent a transparent pixel, you can speed up the masking using a table, if you dare to allocate 256 bytes (gasp) as a mask table lookup:

Code:
LDA (sprite),Y
STA pokeme+1
LDA screen,X
.pokeme AND masktable   ; must be 256-byte aligned
ORA pokeme+1
STA screen,X


...or that kind of thing, where masktable(n) just contains &00, &55, &AA or &FF to mask neither, both or one of the pixels in a Mode 2 byte


Just now getting back into looking at this (it's been a long time) and I've decided this mask technique is the way for me to go (although it uses up a valuable 256 bytes of memory). Anyhow, for anyone else wanting to do this, I've generated the mask table data with a bit of code, so you can just 'lift' the numbers from here to save a little time (not that it's a big effort!). The 256 entries run left to right, top to bottom, 8 per row, starting at index 0.

Code:
FF AA 55 00 AA AA 00 00
55 00 55 00 00 00 00 00
AA AA 00 00 AA AA 00 00
00 00 00 00 00 00 00 00
55 00 55 00 00 00 00 00
55 00 55 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
AA AA 00 00 AA AA 00 00
00 00 00 00 00 00 00 00
AA AA 00 00 AA AA 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
55 00 55 00 00 00 00 00
55 00 55 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
55 00 55 00 00 00 00 00
55 00 55 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
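
The same 256 bytes can also be built at start-up instead of being loaded. This is a sketch only, and not the code that produced the table above. It assumes masktable is a page-aligned 256-byte area, and it uses the Mode 2 layout where the left pixel is bits 7,5,3,1 (&AA) and the right pixel bits 6,4,2,0 (&55).

```asm
; Sketch only - fills masktable with &FF, &AA, &55 or &00 for each sprite byte
LDX #0
.mkmask
LDY #&FF                ; start by assuming both pixels are transparent
TXA:AND #&AA            ; any left-pixel bits set in this sprite byte?
BEQ leftclear
LDY #&55                ; left pixel opaque: keep only the screen's right pixel
.leftclear
TXA:AND #&55            ; any right-pixel bits set?
BEQ rightclear
TYA:AND #&AA:TAY        ; right pixel opaque: drop the screen's right pixel too
.rightclear
TYA:STA masktable,X
INX
BNE mkmask              ; loop over all 256 possible sprite bytes
```

The first four entries come out as FF AA 55 00, matching the start of the listing above.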


Posted: Tue Jun 16, 2009 7:47 pm
Joined: Thu Apr 03, 2008 2:49 pm
Posts: 277
Location: Antarctica
Welcome back Steve :D


Posted: Tue Jun 16, 2009 8:30 pm
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
lol, well it has been just short of a year since I last worked on it seriously !

anyway I forgot to attach a file version of the data for people to directly use (oops), here it is.

Note: although the file extension is .zip, it is NOT a zip file; it is the pure 256 bytes of data. Using .zip as the extension was the only way I could get the data uploaded to the forum.


Attachments:
Mask.zip [256 Bytes]

Posted: Thu Jul 30, 2009 2:56 pm
Joined: Thu Jun 04, 2009 10:12 am
Posts: 11
Compiled sprites? You're trading memory for speed, but hey - it's a classic!

CSprites are just about the fastest way to draw. You literally turn your graphics into code. No masking necessary as you only draw the pixels you need to draw, and no lookups/loops.

I suppose there's a complication in that the beeb's screen memory isn't linear - is that right? Hmm, that just might kill this idea off but I'll leave it here to float and see what happens.
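
To make the idea concrete, here is a sketch of what a compiled sprite might look like in Mode 2. It is a made-up 2-pixel-wide, 4-row sprite, and it assumes (scr) is a zero page pointer to the top byte of a character cell so that the rows are consecutive bytes. Since each Mode 2 byte holds two pixels, a byte with only one opaque pixel still needs an AND/ORA, but with constants rather than table lookups.

```asm
; Sketch only - a hypothetical 2-pixel-wide, 4-row compiled sprite
.sprite0
LDY #0
LDA #&3F:STA (scr),Y    ; row 0: both pixels opaque, plain store
INY
LDA (scr),Y             ; row 1: left pixel opaque, right pixel transparent
AND #&55                ; keep the background's right pixel
ORA #&2A                ; drop in the sprite's left pixel
STA (scr),Y
INY                     ; row 2: nothing to draw, so no code at all
INY
LDA #&03:STA (scr),Y    ; row 3: both pixels opaque
RTS
```

A sprite that crosses into the next character row would have to step the pointer on by a whole row (&280 bytes in Mode 2) at that point, which is where the non-linear layout starts to complicate the generated code.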

C


Posted: Thu Jul 30, 2009 3:19 pm
Joined: Wed Jan 09, 2008 7:30 am
Posts: 406
Oh yes, the Beeb's screen memory is an 'interesting' layout to say the least :lol: Have a look in the Advanced User Guide.

